CodeGuru : Thinking in C++

Factories: encapsulating object creation

When you discover that you need to add new types to a system, the most sensible first step to take is to use polymorphism to create a common interface to those new types, thus separating the rest of the code in your system from the knowledge of the specific types that you are adding. This way, new types may be added without disturbing exising code ... or so it seems. At first it would appear that the only place you need to change the code in such a design is the place where you inherit a new type, but this is not quite true. You must still create an object of your new type, and at the point of creation you must specify the exact constructor to use. Thus, if the code that creates objects is distributed throughout your application, you have the same problem when adding new types – you must still chase down all the points of your code where it type matters. It happens to be the creation of the type that matters in this case rather than the use of the type (which is taken care of by polymorphism), but the effect is the same: adding a new type can cause problems.

The solution is to force the creation of objects to occur through a common factory rather than to allow the creational code to be spread throughout your system. If all the code in your program must go through this factory whenever it needs to create one of your objects, then all you must do when you add a new object is to modify the factory.

As an example, let’s revisit the Shape system. One approach is to make the factory a static method of the base class:

//: C25:ShapeFactory1.cpp
#include "../purge.h"
#include <iostream>
#include <string>
#include <exception>
#include <vector>
using namespace std;

class Shape {
public:
  virtual void draw() = 0;
  virtual void erase() = 0;
  virtual ~Shape() {}
  class BadShapeCreation : public exception {
    string reason;
  public:
    BadShapeCreation(string type) {
      reason = "Cannot create type " + type;
    }
    const char *what() const { 
      return reason.c_str(); 
    }
  };
  static Shape* factory(string type) 
    throw(BadShapeCreation);
};

class Circle : public Shape {
  Circle() {} // Private constructor
  friend class Shape;
public:
  void draw() { cout << "Circle::draw\n"; }
  void erase() { cout << "Circle::erase\n"; }
  ~Circle() { cout << "Circle::~Circle\n"; }
};

class Square : public Shape {
  Square() {}
  friend class Shape;
public:
  void draw() { cout << "Square::draw\n"; }
  void erase() { cout << "Square::erase\n"; }
  ~Square() { cout << "Square::~Square\n"; }
};

Shape* Shape::factory(string type) 
  throw(Shape::BadShapeCreation) {
  if(type == "Circle") return new Circle();
  if(type == "Square") return new Square();
  throw BadShapeCreation(type);
}

char* shlist[] = { "Circle", "Square", "Square",
  "Circle", "Circle", "Circle", "Square", "" };

int main() {
  vector<Shape*> shapes;
  try {
    for(char** cp = shlist; **cp; cp++)
      shapes.push_back(Shape::factory(*cp));
  } catch(Shape::BadShapeCreation e) {
    cout << e.what() << endl;
    return 1;
  }
  for(int i = 0; i < shapes.size(); i++) {
    shapes[i]->draw();
    shapes[i]->erase();
  }
  purge(shapes);

} ///:~

The factory( ) takes an argument that allows it to determine what type of Shape to create; it happens to be a string in this case but it could be any set of data. The factory( ) is now the only other code in the system that needs to be changed when a new type of Shape is added (the initialization data for the objects will presumably come from somewhere outside the system, and not be a hard-coded array as in the above example).

To ensure that the creation can only happen in the factory( ), the constructors for the specific types of Shape are made private, and Shape is declared a friend so that factory( ) has access to the constructors (you could also declare only Shape::factory( ) to be a friend, but it seems reasonably harmless to declare the entire base class as a friend).

Polymorphic factories

The static factory( ) method in the previous example forces all the creation operations to be focused in one spot, to that’s the only place you need to change the code. This is certainly a reasonable solution, as it throws a box around the process of creating objects. However, the Design Patterns book emphasizes that the reason for the Factory Method pattern is so that different types of factories can be subclassed from the basic factory (the above design is mentioned as a special case). However, the book does not provide an example, but instead just repeats the example used for the Abstract Factory . Here is ShapeFactory1.cpp modified so the factory methods are in a separate class as virtual functions:

//: C25:ShapeFactory2.cpp
// Polymorphic factory methods
#include "../purge.h"
#include <iostream>
#include <string>
#include <exception>
#include <vector>
#include <map>
using namespace std;

class Shape {
public:
  virtual void draw() = 0;
  virtual void erase() = 0;
  virtual ~Shape() {}
};

class ShapeFactory {
  virtual Shape* create() = 0;
  static map<string, ShapeFactory*> factories;
public:
  virtual ~ShapeFactory() {}
  friend class ShapeFactoryInizializer;
  class BadShapeCreation : public exception {
    string reason;
  public:
    BadShapeCreation(string type) {
      reason = "Cannot create type " + type;
    }
    const char *what() const { 
      return reason.c_str(); 
    }
  };
  static Shape* 
  createShape(string id) throw(BadShapeCreation){
    if(factories.find(id) != factories.end())
      return factories[id]->create();
    else
      throw BadShapeCreation(id);
  }
};

// Define the static object:
map<string, ShapeFactory*> 
  ShapeFactory::factories;

class Circle : public Shape {
  Circle() {} // Private constructor
public:
  void draw() { cout << "Circle::draw\n"; }
  void erase() { cout << "Circle::erase\n"; }
  ~Circle() { cout << "Circle::~Circle\n"; }
  class Factory;
  friend class Factory;
  class Factory : public ShapeFactory {
  public:
    Shape* create() { return new Circle; }
  };
};

class Square : public Shape {
  Square() {}
public:
  void draw() { cout << "Square::draw\n"; }
  void erase() { cout << "Square::erase\n"; }
  ~Square() { cout << "Square::~Square\n"; }
  class Factory;
  friend class Factory;
  class Factory : public ShapeFactory {
  public:
    Shape* create() { return new Square; }
  };
};

// Singleton to initialize the ShapeFactory:
class ShapeFactoryInizializer {
  static ShapeFactoryInizializer si;
  ShapeFactoryInizializer() {
    ShapeFactory::factories["Circle"] =
      new Circle::Factory;
    ShapeFactory::factories["Square"] =
      new Square::Factory;
  }
};

// Static member definition:
ShapeFactoryInizializer
  ShapeFactoryInizializer::si;

char* shlist[] = { "Circle", "Square", "Square",
  "Circle", "Circle", "Circle", "Square", "" };

int main() {
  vector<Shape*> shapes;
  try {
    for(char** cp = shlist; **cp; cp++)
      shapes.push_back(
        ShapeFactory::createShape(*cp));
  } catch(ShapeFactory::BadShapeCreation e) {
    cout << e.what() << endl;
    return 1;
  }
  for(int i = 0; i < shapes.size(); i++) {
    shapes[i]->draw();
    shapes[i]->erase();
  }
  purge(shapes);

} ///:~

Now the factory method appears in its own class, ShapeFactory, as the virtual create( ) . This is a private method which means it cannot be called directly, but it can be overridden. The subclasses of Shape must each create their own subclasses of ShapeFactory and override the shape( ) method to create an object of their own type. The actual creation of shapes is performed by calling ShapeFactory::createShape( ), which is a static method that uses the map in ShapeFactory to find the appropriate factory object based on an identifier that you pass it. The factory is immediately used to create the shape object, but you could imagine a more complex problem where the appropriate factory object is returned and then used by the caller to create an object in a more sophisticated way. However, it seems that much of the time you don’t need the intricacies of the polymorphic factory method, and a single static method in the base class (as shown in ShapeFactory1.cpp) will work fine.

Notice that the ShapeFactory must be initialized by loading its map with factory objects, which takes place in the singleton ShapeFactoryInizializer. So to add a new type to this design you must inherit the type, create a factory, and modify ShapeFactoryInizializer so that an instance of your factory is inserted in the map. This extra complexity again suggests the use of a static factory method if you don’t need to create individual factory objects.

Abstract factories

The Abstract Factory pattern looks like the factory objects we’ve seen previously, with not one but several factory methods. Each of the factory methods creates a different kind of object. The idea is that at the point of creation of the factory object, you decide how all the objects created by that factory will be used. The example given in Design Patterns implements portability across various graphical user interfaces (GUIs): you create a factory object appropriate to the GUI that you’re working with, and from then on when you ask it for a menu, button, slider, etc. it will automatically create the appropriate version of that item for the GUI. Thus you’re able to isolate, in one place, the effect of changing from one GUI to another.

As another example suppose you are creating a general-purpose gaming environment and you want to be able to support different types of games. Here’s how it might look using an abstract factory:

//: C25:AbstractFactory.cpp
// A gaming environment
#include <iostream>
using namespace std;

class Obstacle {
public:
  virtual void action() = 0;
};

class Player {
public:
  virtual void interactWith(Obstacle*) = 0;
};

class Kitty: public Player {
  virtual void interactWith(Obstacle* ob) {
    cout << "Kitty has encountered a ";
    ob->action();
  }
};

class KungFuGuy: public Player {
  virtual void interactWith(Obstacle* ob) {
    cout << "KungFuGuy now battles against a ";
    ob->action();
  }
};

class Puzzle: public Obstacle {
public:
  void action() { cout << "Puzzle\n"; }
};

class NastyWeapon: public Obstacle {
public:
  void action() { cout << "NastyWeapon\n"; }
};

// The abstract factory:
class GameElementFactory {
public:
  virtual Player* makePlayer() = 0;
  virtual Obstacle* makeObstacle() = 0;
};

// Concrete factories:
class KittiesAndPuzzles : 
  public GameElementFactory {
public:
  virtual Player* makePlayer() { 
    return new Kitty;
  }
  virtual Obstacle* makeObstacle() {
    return new Puzzle;
  }
};

class KillAndDismember : 
  public GameElementFactory {
public:
  virtual Player* makePlayer() { 
    return new KungFuGuy;
  }
  virtual Obstacle* makeObstacle() {
    return new NastyWeapon;
  }
};

class GameEnvironment {
  GameElementFactory* gef;
  Player* p;
  Obstacle* ob;
public:
  GameEnvironment(GameElementFactory* factory) :
    gef(factory), p(factory->makePlayer()), 
    ob(factory->makeObstacle()) {}
  void play() {
    p->interactWith(ob);
  }
  ~GameEnvironment() {
    delete p;
    delete ob;
    delete gef;
  }
};

int main() {
  GameEnvironment 
    g1(new KittiesAndPuzzles),
    g2(new KillAndDismember);
  g1.play();
  g2.play();

} ///:~

In this environment, Player objects interact with Obstacle objects, but there are different types of players and obstacles depending on what kind of game you’re playing. You determine the kind of game by choosing a particular GameElementFactory, and then the GameEnvironment controls the setup and play of the game. In this example, the setup and play is very simple, but those activities (the initial conditions and the state change ) can determine much of the game’s outcome. Here, GameEnvironment is not designed to be inherited, although it could very possibly make sense to do that.

This also contains examples of Double Dispatching and the Factory Method , both of which will be explained later.

Virtual constructors

Show simpler version of virtual constructor scheme, letting the user create the object with new. Probably make constructor for objects private and use a maker function to force all objects on the heap.

One of the primary goals of using a factory is so that you can organize your code so you don’t have to select an exact type of constructor when creating an object. That is, you can say, “I don’t know precisely what type of object you are, but here’s the information: Create yourself.”

In addition, during a constructor call the virtual mechanism does not operate (early binding occurs). Sometimes this is awkward. For example, in the Shape program it seems logical that inside the constructor for a Shape object, you would want to set everything up and then draw( ) the Shape. draw( ) should be a virtual function, a message to the Shape that it should draw itself appropriately, depending on whether it is a circle, square, line, and so on. However, this doesn’t work inside the constructor, for the reasons given in Chapter XX: Virtual functions resolve to the “local” function bodies when called in constructors.

If you want to be able to call a virtual function inside the constructor and have it do the right thing, you must use a technique to simulate a virtual constructor (which is a variation of the Factory Method ). This is a conundrum. Remember the idea of a virtual function is that you send a message to an object and let the object figure out the right thing to do. But a constructor builds an object. So a virtual constructor would be like saying, “I don’t know exactly what type of object you are, but build yourself anyway.” In an ordinary constructor, the compiler must know which VTABLE address to bind to the VPTR, and if it existed, a virtual constructor couldn’t do this because it doesn’t know all the type information at compile-time. It makes sense that a constructor can’t be virtual because it is the one function that absolutely must know everything about the type of the object.

And yet there are times when you want something approximating the behavior of a virtual constructor.

In the Shape example, it would be nice to hand the Shape constructor some specific information in the argument list and let the constructor create a specific type of Shape (a Circle, Square) with no further intervention. Ordinarily, you’d have to make an explicit call to the Circle, Square constructor yourself.

Coplien[72] calls his solution to this problem “envelope and letter classes.” The “envelope” class is the base class, a shell that contains a pointer to an object of the base class. The constructor for the “envelope” determines (at runtime, when the constructor is called, not at compile-time, when the type checking is normally done) what specific type to make, then creates an object of that specific type (on the heap) and assigns the object to its pointer. All the function calls are then handled by the base class through its pointer. So the base class is acting as a proxy for the derived class:

//: C25:VirtualConstructor.cpp
#include <iostream>
#include <string>
#include <exception>
#include <vector>
using namespace std;

class Shape {
  Shape* s;
  // Prevent copy-construction & operator=
  Shape(Shape&);
  Shape operator=(Shape&);
protected:
  Shape() { s = 0; };
public:
  virtual void draw() { s->draw(); }
  virtual void erase() { s->erase(); }
  virtual void test() { s->test(); };
  virtual ~Shape() {
    cout << "~Shape\n";
    if(s) {
      cout << "Making virtual call: ";
      s->erase(); // Virtual call
    }
    cout << "delete s: ";
    delete s; // The polymorphic deletion
  }
  class BadShapeCreation : public exception {
    string reason;
  public:
    BadShapeCreation(string type) {
      reason = "Cannot create type " + type;
    }
    const char *what() const { 
      return reason.c_str(); 
    }
  };
  Shape(string type) throw(BadShapeCreation);
};

class Circle : public Shape {
  Circle(Circle&);
  Circle operator=(Circle&);
  Circle() {} // Private constructor
  friend class Shape;
public:
  void draw() { cout << "Circle::draw\n"; }
  void erase() { cout << "Circle::erase\n"; }
  void test() { draw(); }
  ~Circle() { cout << "Circle::~Circle\n"; }
};

class Square : public Shape {
  Square(Square&);
  Square operator=(Square&);
  Square() {}
  friend class Shape;
public:
  void draw() { cout << "Square::draw\n"; }
  void erase() { cout << "Square::erase\n"; }
  void test() { draw(); }
  ~Square() { cout << "Square::~Square\n"; }
};

Shape::Shape(string type) 
  throw(Shape::BadShapeCreation) {
  if(type == "Circle") 
    s = new Circle();
  else if(type == "Square")
    s = new Square();
  else throw BadShapeCreation(type);
  draw();  // Virtual call in the constructor
}

char* shlist[] = { "Circle", "Square", "Square",
  "Circle", "Circle", "Circle", "Square", "" };

int main() {
  vector<Shape*> shapes;
  cout << "virtual constructor calls:" << endl;
  try {
    for(char** cp = shlist; **cp; cp++)
      shapes.push_back(new Shape(*cp));
  } catch(Shape::BadShapeCreation e) {
    cout << e.what() << endl;
    return 1;
  }
  for(int i = 0; i < shapes.size(); i++) {
    shapes[i]->draw();
    cout << "test\n";
    shapes[i]->test();
    cout << "end test\n";
    shapes[i]->erase();
  }
  Shape c("Circle"); // Create on the stack
  cout << "destructor calls:" << endl;
  for(int j = 0; j < shapes.size(); j++) {
    delete shapes[j];
    cout << "\n------------\n";
  }

} ///:~

The base class Shape contains a pointer to an object of type Shape as its only data member. When you build a “virtual constructor” scheme, you must exercise special care to ensure this pointer is always initialized to a live object.

Each time you derive a new subtype from Shape, you must go back and add the creation for that type in one place, inside the “virtual constructor” in the Shape base class. This is not too onerous a task, but the disadvantage is you now have a dependency between the Shape class and all classes derived from it (a reasonable trade-off, it seems). Also, because it is a proxy, the base-class interface is truly the only thing the user sees.

In this example, the information you must hand the virtual constructor about what type to create is very explicit: It’s a string that names the type. However, your scheme may use other information – for example, in a parser the output of the scanner may be handed to the virtual constructor, which then uses that information to determine which token to create.

The virtual constructor Shape(type) can only be declared inside the class; it cannot be defined until after all the derived classes have been declared. However, the default constructor can be defined inside class Shape , but it should be made protected so temporary Shape objects cannot be created. This default constructor is only called by the constructors of derived-class objects. You are forced to explicitly create a default constructor because the compiler will create one for you automatically only if there are no constructors defined. Because you must define Shape(type), you must also define Shape( ).

The default constructor in this scheme has at least one very important chore – it must set the value of the s pointer to zero. This sounds strange at first, but remember that the default constructor will be called as part of the construction of the actual object – in Coplien’s terms, the “letter,” not the “envelope.” However, the “letter” is derived from the “envelope,” so it also inherits the data member s. In the “envelope,” s is important because it points to the actual object, but in the “letter,” s is simply excess baggage. Even excess baggage should be initialized, however, and if s is not set to zero by the default constructor called for the “letter,” bad things happen (as you’ll see later).

The virtual constructor takes as its argument information that completely determines the type of the object. Notice, though, that this type information isn’t read and acted upon until runtime, whereas normally the compiler must know the exact type at compile-time (one other reason this system effectively imitates virtual constructors).

Inside the virtual constructor there’s a switch statement that uses the argument to construct the actual (“letter”) object, which is then assigned to the pointer inside the “envelope.” At that point, the construction of the “letter” has been completed, so any virtual calls will be properly directed.

As an example, consider the call to draw( ) inside the virtual constructor. If you trace this call (either by hand or with a debugger), you can see that it starts in the draw( ) function in the base class, Shape. This function calls draw( ) for the “envelope” s pointer to its “letter.” All types derived from Shape share the same interface, so this virtual call is properly executed, even though it seems to be in the constructor. (Actually, the constructor for the “letter” has already completed.) As long as all virtual calls in the base class simply make calls to identical virtual function through the pointer to the “letter,” the system operates properly.

To understand how it works, consider the code in main( ). To fill the vector shapes , “virtual constructor” calls are made to Shape. Ordinarily in a situation like this, you would call the constructor for the actual type, and the VPTR for that type would be installed in the object. Here, however, the VPTR used in each case is the one for Shape, not the one for the specific Circle, Square, or Triangle.

In the for loop where the draw( ) and erase( ) functions are called for each Shape, the virtual function call resolves, through the VPTR, to the corresponding type. However, this is Shape in each case. In fact, you might wonder why draw( ) and erase( ) were made virtual at all. The reason shows up in the next step: The base-class version of draw( ) makes a call, through the “letter” pointer s, to the virtual function draw( ) for the “letter.” This time the call resolves to the actual type of the object, not just the base class Shape. Thus the runtime cost of using virtual constructors is one more virtual call every time you make a virtual function call.

In order to create any function that is overridden, such as draw( ), erase( ) or test( ), you must proxy all calls to the s pointer in the base class implementation, as shown above. This is because, when the call is made, the call to the envelope’s member function will resolve as being to Shape, and not to a derived type of Shape. Only when you make the proxy call to s will the virtual behavior take place. In main( ), you can see that everything works correctly, even when calls are made inside constructors and destructors.

Destructor operation

The activities of destruction in this scheme are also tricky. To understand, let’s verbally walk through what happens when you call delete for a pointer to a Shape object – specifically, a Square – created on the heap. (This is more complicated than an object created on the stack.) This will be a delete through the polymorphic interface, as in the statement delete shapes[i] in main( ).

The type of the pointer shapes[i] is of the base class Shape, so the compiler makes the call through Shape. Normally, you might say that it’s a virtual call, so Square’s destructor will be called. But with the virtual constructor scheme, the compiler is creating actual Shape objects, even though the constructor initializes the letter pointer to a specific type of Shape. The virtual mechanism is used, but the VPTR inside the Shape object is Shape’s VPTR, not Square’s. This resolves to Shape’s destructor, which calls delete for the letter pointer s, which actually points to a Square object. This is again a virtual call, but this time it resolves to Square’s destructor.

With a destructor, however, C++ guarantees, via the compiler, that all destructors in the hierarchy are called. Square’s destructor is called first, followed by any intermediate destructors, in order, until finally the base-class destructor is called. This base-class destructor has code that says delete s . When this destructor was called originally, it was for the “envelope” s, but now it’s for the “letter” s, which is there because the “letter” was inherited from the “envelope,” and not because it contains anything. So this call to delete should do nothing.

The solution to the problem is to make the “letter” s pointer zero. Then when the “letter” base-class destructor is called, you get delete 0 , which by definition does nothing. Because the default constructor is protected, it will be called only during the construction of a “letter,” so that’s the only situation where s is set to zero.

Your most common tool for hiding construction will probably be ordinary factory methods rather than the more complex approaches. The idea of adding new types with minimal effect on the rest of the system will be further explored later in this chapter.

[72]James O. Coplien, Advanced C++ Programming Styles and Idioms , Addison-Wesley, 1992.

Contents | Prev | Next

Contact: webmaster@codeguru.com
CodeGuru - the website for developers. [an error occurred while processing this directive]