On Debugging a C++ Function with Float Arguments
I ran into a problem while debugging some C++ code that uses VTK. It ultimately came down to GDB not understanding that some arguments were being passed in registers instead of on the stack. I worked around it using the GDB convenience function $_caller_is.
Code Example
/**
*
*/
// stdlib
#include <cstdio>
// VTK
#include <vtkSetGet.h>
#include <vtkObject.h>
#include <vtkNew.h>
#include <vtkObjectFactory.h>
//--- Define a simple vtkObject subclass with the offending method.
struct Foo : public vtkObject {
static Foo *New();
vtkSetVector6Macro(Bounds, float);
float Bounds[6];
};
vtkStandardNewMacro(Foo);
//--- Demonstrate the bug.
int main() {
vtkNew<Foo> foo;
float bounds[6];
bounds[0] = 0.373737f;
bounds[1] = 1.373737f;
bounds[2] = 2.373737f;
bounds[3] = 3.373737f;
bounds[4] = 4.373737f;
bounds[5] = 5.373737f;
std::fprintf(stderr, "main.bounds = { %+0.2f, %+0.2f, %+0.2f, %+0.2f, %+0.2f, %+0.2f };\n",
bounds[0], bounds[1], bounds[2], bounds[3], bounds[4], bounds[5]);
foo->SetBounds(bounds);
std::fprintf(stderr, "foo.Bounds = { %+0.2f, %+0.2f, %+0.2f, %+0.2f, %+0.2f, %+0.2f };\n",
foo->Bounds[0], foo->Bounds[1], foo->Bounds[2], foo->Bounds[3], foo->Bounds[4], foo->Bounds[5]);
return 0;
}
My code looked very similar to this. I attached a debugger to it and ran the following GDB batch file.
#!/usr/bin/env -S gdb -x
start
break -function Foo::SetBounds
commands
printf "---8<---\n"
info args
printf "--->8---\n"
continue
end
continue
When I ran this, the first SetBounds call received the correct argument: a pointer to a float array with the correct contents. In the second SetBounds call, however, info args showed garbage floating-point values with extreme exponents (e-41, e21, etc.).
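A value like e-41 is what you tend to get when bits that were never meant to be a float are read as one. The sketch below is only an illustration under an assumption: that the debugger was reading unrelated memory, for example the upper half of a typical x86-64 Linux stack address (0x00007fff). I did not confirm which bytes GDB actually read, and float_from_bits is a hypothetical helper name.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a raw 32-bit pattern as an IEEE-754
// single-precision float. std::memcpy avoids undefined behaviour from
// pointer-based type punning.
float float_from_bits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Assumption for illustration: 0x00007fff stands in for the upper 32 bits
// of a stack address. As a float it is a subnormal, about 4.59e-41.
// float garbage = float_from_bits(0x00007fffu);
```

Printing float_from_bits(0x00007fffu) with %e from a small main() shows the same kind of tiny exponent as the values above.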
To elaborate: VTK's vtkSetGet.h header defines vtkSetVector6Macro, roughly, as:
#define vtkSetVector6Macro(Name, Type) \
void Set##Name(const Type *_arg) { \
this->Set##Name(_arg[0], _arg[1], _arg[2], _arg[3], _arg[4], _arg[5]); \
} \
void Set##Name(Type _arg1, Type _arg2, Type _arg3, Type _arg4, Type _arg5, Type _arg6) { \
this->Name[0] = _arg1; \
this->Name[1] = _arg2; \
this->Name[2] = _arg3; \
this->Name[3] = _arg4; \
this->Name[4] = _arg5; \
this->Name[5] = _arg6; \
}
In other words, one function takes a pointer to an array of a type, and the other takes each argument individually. The first function defers to the second to actually update the member variable.
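To make the overload pair concrete without pulling in VTK, here is a minimal stand-in written by hand. BoundsHolder is a hypothetical name and this is not VTK's code. It only mirrors the shape described above: a pointer overload that forwards to a six-scalar overload.

```cpp
// Hypothetical stand-in for the macro-generated setters; not VTK code.
struct BoundsHolder {
    float Bounds[6] = {};

    // Array overload: receives one pointer argument.
    void SetBounds(const float *arg) {
        this->SetBounds(arg[0], arg[1], arg[2], arg[3], arg[4], arg[5]);
    }

    // Scalar overload: receives six floats by value. Under the x86-64
    // System V calling convention these are passed in vector registers
    // rather than on the stack.
    void SetBounds(float a1, float a2, float a3,
                   float a4, float a5, float a6) {
        this->Bounds[0] = a1;
        this->Bounds[1] = a2;
        this->Bounds[2] = a3;
        this->Bounds[3] = a4;
        this->Bounds[4] = a5;
        this->Bounds[5] = a6;
    }
};
```

Both overloads share the name SetBounds, so a breakpoint set by function name alone lands in both of them. That is why the breakpoint commands fire twice for a single call from main.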
Diagnosing the Problem
I already suspected a “register vs stack” problem, so I checked the disassembly of each SetBounds function. The first SetBounds function was writing to the %xmm1, %xmm2, etc. registers, and the second SetBounds function was reading from the %ymm1, %ymm2, etc. registers.
Aside: I was a little confused by the switch between %xmm1 and %ymm1. From what I read online, these are SIMD registers that hold multiple floats, chars, etc. within a single register. The %xmm1 register is 128 bits wide (SSE) and holds 4 floats, while the %ymm1 register is 256 bits wide (AVX) and holds 8 floats, with %xmm1 aliasing its lower 4.
To verify this, I set a breakpoint in the second SetBounds function, confirmed that info args showed the garbage values, and then confirmed that the %ymm1 register printed by info all-registers held the correct values.
Working Around the Problem
Ideally, in GDB, I'd have a way to set a breakpoint on only one of the SetBounds functions. Barring that, I can check whether the function that called us is also SetBounds (i.e. whether we're in the second function or the first). In a GDB script, that looks like:
#!/usr/bin/env -S gdb -x
start
break -function Foo::SetBounds if ! $_caller_is("Foo::SetBounds")
commands
printf "Foo::SetBounds((float[6])"
output *(float *)_arg@6
printf ")\n"
continue
end
continue