Bruce Eckel's Thinking in C++, 2nd Ed

Inheritance and the VTABLE

You can imagine what happens when you perform inheritance and redefine some of the virtual functions. The compiler creates a new VTABLE for your new class, and it inserts your new function addresses, using the base-class function addresses for any virtual functions you don’t redefine. One way or another, there’s always a full set of function addresses in the VTABLE, so you’ll never be able to make a call to an address that isn’t there (which would be disastrous).

But what happens when you inherit and add new virtual functions in the derived class? Here’s a simple example:

//: C15:Addv.cpp
// Adding virtuals in derivation
#include <iostream>
using namespace std;

class Base {
  int i;
public:
  Base(int ii) : i(ii) {}
  virtual int value() const { return i; }
};

class Derived : public Base {
public:
  Derived(int ii) : Base(ii) {}
  int value() const {
    return Base::value() * 2;
  }
  // New virtual function in the Derived class:
  virtual int shift(int x) const {
    return Base::value() << x;
  }
};

int main() {
  Base* B[] = { new Base(7), new Derived(7) };
  cout << "B[0]->value() = "
       << B[0]->value() << endl;
  cout << "B[1]->value() = "
       << B[1]->value() << endl;
//! cout << "B[1]->shift(3) = "
//!      << B[1]->shift(3) << endl; // Illegal
} ///:~

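The class Base contains a single virtual function value( ), and Derived adds a second one called shift( ), as well as redefining the meaning of value( ). It helps to visualize the VTABLEs the compiler creates for Base and Derived. The Base VTABLE holds a single entry, the address of Base::value( ). The Derived VTABLE holds two entries: the address of Derived::value( ) in the first slot, followed by the address of Derived::shift( ).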

Notice the compiler maps the location of the value address into exactly the same spot in the Derived VTABLE as it is in the Base VTABLE. Similarly, if a class is inherited from Derived, its version of shift would be placed in its VTABLE in exactly the same spot as it is in Derived. This is because (as you saw with the assembly-language example) the compiler generates code that uses a simple numerical offset into the VTABLE to select the virtual function. Regardless of what specific subtype the object belongs to, its VTABLE is laid out the same way, so calls to the virtual functions will always be made the same way.

In this case, however, the compiler is working only with a pointer to a base-class object. The base class has only the value( ) function, so that is the only function the compiler will allow you to call. How could it possibly know that you are working with a Derived object, if it has only a pointer to a base-class object? That pointer might point to some other type, which doesn’t have a shift function. It may or may not have some other function address at that point in the VTABLE, but in either case, making a virtual call to that VTABLE address is not what you want to do. So it’s fortunate and logical that the compiler protects you from making virtual calls to functions that exist only in derived classes.

There are some less-common cases where you may know that the pointer actually points to an object of a specific subclass. If you want to call a function that only exists in that subclass, then you must cast the pointer. You can remove the error message produced by the previous program like this:

((Derived*)B[1])->shift(3)

Here, you happen to know that B[1] points to a Derived object, but generally you don’t know that. If your problem is set up so that you must know the exact types of all objects, you should rethink it, because you’re probably not using virtual functions properly. However, there are some situations where the design works best (or you have no choice) if you know the exact type of all objects kept in a generic container. This is the problem of run-time type identification (RTTI).

Run-time type identification is all about casting base-class pointers down to derived-class pointers (“up” and “down” are relative to a typical class diagram, with the base class at the top). Casting up happens automatically, with no coercion, because it’s completely safe. Casting down is unsafe because there’s no compile-time information about the actual types, so you must know exactly what type the object really is. If you cast it into the wrong type, you’ll be in trouble.

Chapter XX describes the way C++ provides run-time type information.

Object slicing

There is a distinct difference between passing addresses and passing values when treating objects polymorphically. All the examples you’ve seen here, and virtually all the examples you should see, pass addresses and not values. This is because addresses all have the same size, [42] so passing the address of an object of a derived type (which is usually bigger) is the same as passing the address of an object of the base type (which is usually smaller). As explained before, this is the goal when using polymorphism – code that manipulates objects of a base type can transparently manipulate derived-type objects as well.

If you use an object instead of a pointer or reference as the recipient of your upcast, something will happen that may surprise you: the object is “sliced” until all that remains is the subobject that corresponds to your destination. In the following example you can see what’s left after slicing by examining the result of a virtual function call:

//: C15:Slice.cpp
// Object slicing
#include <iostream>
using namespace std;

class Base {
  int i;
public:
  Base(int ii = 0) : i(ii) {}
  virtual int sum() const { return i; }
};

class Derived : public Base {
  int j;
public:
  Derived(int ii = 0, int jj = 0)
    : Base(ii), j(jj) {}
  int sum() const { return Base::sum() + j; }
};

void call(Base b) {
  cout << "sum = " << b.sum() << endl;
}

int main() {
  Base b(10);
  Derived d(10, 47);
  call(b);
  call(d);
} ///:~

The function call( ) is passed an object of type Base by value. It then calls the virtual function sum( ) for the Base object. In main( ), you might expect the first call to produce 10, and the second to produce 57. In fact, both calls produce 10.

Two things are happening in this program. First, call( ) accepts only a Base object, so all the code inside the function body will manipulate only members associated with Base. Any calls to call( ) will cause an object the size of Base to be pushed on the stack and cleaned up after the call. This means that if an object of a class inherited from Base is passed to call( ), the compiler accepts it, but it copies only the Base portion of the object. It slices the derived portion off the object. Before the slice, the Derived object holds a VPTR to the Derived VTABLE along with i and j; after the slice, what remains is a Base object holding a VPTR to the Base VTABLE and i alone.

Now you may wonder about the virtual function call. Here, Derived::sum( ) makes use of portions of both Base (which still exists) and Derived (which no longer exists because it was sliced off). So what happens when the virtual function is called?

You’re saved from disaster precisely because the object is being passed by value. Because of this, the compiler thinks it knows the precise type of the object (and it does, here, because any information that contributed extra features to the objects has been lost). In addition, when passing by value, it uses the copy-constructor for a Base object, which initializes the VPTR to the Base VTABLE and copies only the Base parts of the object. There’s no explicit copy-constructor here, so the compiler synthesizes one. Under all interpretations, the object truly becomes a Base during slicing.

Object slicing actually removes part of the object rather than simply changing the meaning of an address as when using a pointer or reference. Because of this, upcasting into an object is not often done; in fact, it’s usually something to watch out for and prevent. You can explicitly prevent object slicing by putting pure virtual functions in the base class. An object of the base type can then no longer be created, so an attempt to pass one by value, as call( ) does, causes a compile-time error message.


[42] Actually, not all pointers are the same size on all machines. In the context of this discussion, however, they can be considered to be the same.
