Friday, October 17, 2014

Reversing C++ binaries 3: Virtual members

Now that we have a better understanding of how classes are compiled, we can analyze polymorphism. We can expect virtual members to be handled differently, because the object behind a pointer may belong to a derived class with its own implementation: in general, the compiler can't know at compile time which function to call.
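To make the problem concrete, here is a minimal sketch. The Shape, Triangle and Square classes and the CountSides function are hypothetical names of mine, not part of the test program analyzed in this post. The point is only that a single call site can end up in different functions depending on the object it receives:

```cpp
#include <cassert>

// Hypothetical hierarchy, used only to illustrate dynamic dispatch.
struct Shape {
  virtual ~Shape() {}
  virtual int Sides() const { return 0; }
};

struct Triangle : Shape {
  int Sides() const override { return 3; }
};

struct Square : Shape {
  int Sides() const override { return 4; }
};

// When compiled on its own, this function cannot know which Sides()
// it will reach: that depends on the object passed in at run time.
int CountSides(const Shape* s) {
  return s->Sides();
}

// Usage sketch: the same call site yields different results.
void Demo() {
  Triangle t;
  Square q;
  assert(CountSides(&t) == 3);
  assert(CountSides(&q) == 4);
}
```

An optimizer may still resolve such a call statically when it can see the concrete type, but in the general case the decision has to be made at run time.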

I modified the old TestClass to include a virtual method:

class TestClass
{
  public:
  // _ZN9TestClassC1Ev
  TestClass()
  {
    stuff = 1;
  }

  // _ZN9TestClassD1Ev
  virtual ~TestClass()
  {
    stuff = 0;
  }

  // _ZN9TestClass8GetStuffEv
  virtual int GetStuff()
  {
    return stuff;
  }

  private:
  int stuff;
};

// _Z4DoItP9TestClass
int DoIt(TestClass* t1)
{
  return t1->GetStuff();
}

int main()
{
  TestClass* t1 = new TestClass();
  int r = DoIt(t1);
  delete t1;
  return r;
}

Let's look at the disassembly of the function DoIt:

0804869d <_Z4DoItP9TestClass>:
 804869d: push   %ebp
 804869e: mov    %esp,%ebp
 80486a0: lea    -0x1028(%esp),%esp
 80486a7: orl    $0x0,(%esp)
 80486ab: lea    0x1010(%esp),%esp
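 80486b2: mov    0x8(%ebp),%eax ; eax = t1 (first argument, at ebp + 8)
 80486b5: mov    (%eax),%eax    ; eax = *t1 (first field of the object)
 80486b7: add    $0x8,%eax      ; eax = address of the entry at offset 8
 80486ba: mov    (%eax),%eax    ; eax = entry at offset 8 = GetStuff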
 80486bc: mov    0x8(%ebp),%edx
 80486bf: mov    %edx,(%esp)
 80486c2: call   *%eax          ; GetStuff(this)
 80486c4: leave
 80486c5: ret

Unlike the earlier examples, the DoIt disassembly doesn't contain an explicit call to GetStuff. Instead, there is an indirect call through register eax, which is loaded by dereferencing the first field of the TestClass instance and then reading the entry at offset 8 from the address stored there. This field is not present in the C++ code, so we must look at the disassembled constructor:

080487a6 <_ZN9TestClassC1Ev>:
 80487a6: push   %ebp
 80487a7: mov    %esp,%ebp
 80487a9: lea    -0x1010(%esp),%esp
 80487b0: orl    $0x0,(%esp)
 80487b4: lea    0x1010(%esp),%esp
 80487bb: mov    0x8(%ebp),%eax
 80487be: movl   $0x8048910,(%eax) ; this[0] = 0x8048910
 80487c4: mov    0x8(%ebp),%eax
 80487c7: movl   $0x1,0x4(%eax)    ; this->stuff = 1;
 80487ce: pop    %ebp
 80487cf: ret

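The address 0x8048910 resides in the .rodata section and points to the virtual table (vtable) of TestClass. Strictly speaking, it is the address of the first function slot: under the Itanium ABI, the two preceding words (here 0 and 0x8048928) hold the offset-to-top and the RTTI pointer. The vtable contains references to all virtual methods of TestClass: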

Contents of section .rodata:
 8048900 03000000 01000200 00000000 28890408  ............(...
 8048910 d0870408 0e880408 3c880408 39546573  ........<...9Tes
 8048920 74436c61 73730000 28a00408 1c890408  tClass..(.......

// TestClass virtual table:

 0x8048910 + 0x00: 080487d0 ; _ZN9TestClassD1Ev (Complete Object destructor)
 0x8048910 + 0x04: 0804880e ; _ZN9TestClassD0Ev (Deleting destructor)
 0x8048910 + 0x08: 0804883c ; _ZN9TestClass8GetStuffEv (GetStuff)
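The layout above can be imitated by hand, which may help to read the disassembly. The following is only a sketch with hypothetical names (FakeTestClass, FakeVTable and so on are mine). It uses plain function pointers rather than real virtual functions, and it does not claim to be byte-for-byte what GCC emits. On the 32-bit target used in this post, the three slots would sit at offsets 0, 4 and 8.

```cpp
#include <cassert>

struct FakeTestClass;  // hypothetical stand-in for TestClass

// Hand-written table with the same slot order as the listing above.
struct FakeVTable {
  void (*complete_dtor)(FakeTestClass*);  // slot 0, plays the role of D1
  void (*deleting_dtor)(FakeTestClass*);  // slot 1, plays the role of D0
  int (*get_stuff)(FakeTestClass*);       // slot 2, plays the role of GetStuff
};

struct FakeTestClass {
  const FakeVTable* vptr;  // hidden first field, written by the constructor
  int stuff;
};

void FakeCompleteDtor(FakeTestClass* self) { self->stuff = 0; }

void FakeDeletingDtor(FakeTestClass* self) {
  FakeCompleteDtor(self);  // run the destructor body first
  delete self;             // then release the storage
}

int FakeGetStuff(FakeTestClass* self) { return self->stuff; }

// Plays the role of the table stored in .rodata.
const FakeVTable kFakeVTable = {FakeCompleteDtor, FakeDeletingDtor,
                                FakeGetStuff};

// Mirrors `new TestClass()`: allocate, store the table address, run the body.
FakeTestClass* FakeConstruct() {
  FakeTestClass* self = new FakeTestClass;
  self->vptr = &kFakeVTable;
  self->stuff = 1;
  return self;
}

// Mirrors DoIt: load the table, load the GetStuff slot, call it with `this`.
int FakeDoIt(FakeTestClass* t1) {
  assert(t1 != nullptr);
  return t1->vptr->get_stuff(t1);
}
```

Calling FakeDoIt(FakeConstruct()) follows the same three steps as the disassembly: read the first field of the object, read one entry of the table, then make an indirect call.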

To explain the difference between the two destructors, I quote the Itanium C++ ABI reference:

  1. Base object destructor of a class T: A function that runs the destructors for non-static data members of T and non-virtual direct base classes of T. Mangled with suffix D2.
  2. Complete object destructor of a class T: a function that, in addition to the actions required of a base object destructor, runs the destructors for the virtual base classes of T. Mangled with suffix D1.
  3. Deleting destructor of a class T: a function that, in addition to the actions required of a complete object destructor, calls the appropriate deallocation function (i.e., operator delete) for T. Mangled with suffix D0.
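The practical difference between the complete object destructor and the deleting destructor can be observed from plain C++. In the sketch below, the Tracked class, the two counters and the helper functions are hypothetical names of mine. With GCC following the Itanium ABI, I would expect the explicit destructor call to go through the D1 slot and the delete expression through the D0 slot, but that mapping is an assumption: I have not disassembled this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static int g_dtor_runs = 0;      // how many times the destructor body ran
static int g_dealloc_calls = 0;  // how many times operator delete ran

// Hypothetical class, not part of the original test program.
struct Tracked {
  virtual ~Tracked() { ++g_dtor_runs; }

  // Class-specific allocation functions, so deallocation can be observed.
  static void* operator new(std::size_t size) {
    void* p = std::malloc(size);
    if (p == nullptr) throw std::bad_alloc();
    return p;
  }
  static void operator delete(void* p) {
    ++g_dealloc_calls;
    std::free(p);
  }
};

// Destroys the object in place: the destructor runs, the storage is untouched.
void DestroyOnly(Tracked* t) { t->~Tracked(); }

// Destroys the object and then hands the storage to operator delete.
void DestroyAndFree(Tracked* t) { delete t; }

void RunDemo() {
  // Object built in a local buffer with the global placement new.
  alignas(Tracked) unsigned char storage[sizeof(Tracked)];
  Tracked* in_place = ::new (storage) Tracked;
  DestroyOnly(in_place);
  assert(g_dtor_runs == 1 && g_dealloc_calls == 0);

  // Object built on the heap and released with a delete expression.
  DestroyAndFree(new Tracked);
  assert(g_dtor_runs == 2 && g_dealloc_calls == 1);
}
```

Both paths run the destructor body, but only the delete expression reaches operator delete, which matches the "in addition ... calls the appropriate deallocation function" wording quoted above.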

