CodeGuru : Thinking in C++

The primary idea in the STL is the container (also known as a collection), which is just what it sounds like: a place to hold things. You need containers because objects are constantly marching in and out of your program and there must be someplace to put them while they’re around. You can’t make named local objects because in a typical program you don’t know how many, or what type, or the lifetime of the objects you’re working with. So you need a container that will expand whenever necessary to fill your needs.

All the containers in the STL hold objects and expand themselves. In addition, they hold your objects in a particular way. The difference between one container and another is the way the objects are held and how the sequence is created. Let’s start by looking at the simplest containers.

A vector is a linear sequence that allows rapid random access to its elements. However, it’s expensive to insert an element in the middle of the sequence, and it’s also expensive when it allocates additional storage. A deque is also a linear sequence, and it allows random access that’s nearly as fast as vector, but it’s significantly faster when it needs to allocate new storage, and you can cheaply add new elements at either end (vector adds elements efficiently only at its tail). A list is the third type of basic linear sequence, but it’s expensive to move around randomly and cheap to insert an element in the middle. Thus list, deque and vector are very similar in their basic functionality (they all hold linear sequences), but different in the cost of their activities. So for your first shot at a program, you could choose any one, and only experiment with the others if you’re tuning for efficiency.

Many of the problems you set out to solve will only require a simple linear sequence like a vector, deque or list. All three have a member function push_back( ) which you use to insert a new element at the back of the sequence (deque and list also have push_front( )).
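As a minimal sketch of these two member functions, the helpers below fill sequences from one end or from both ends. The names fillVector and fillBothEnds are hypothetical and exist only for this illustration.

```cpp
#include <deque>
#include <list>
#include <vector>

// Hypothetical helper: builds the sequence 0, 1, ..., n-1 with push_back().
std::vector<int> fillVector(int n) {
  std::vector<int> v;
  for (int i = 0; i < n; i++)
    v.push_back(i);  // Append at the tail; the vector grows as needed
  return v;
}

// Hypothetical helper: works for deque and list, which have both
// push_back() and push_front(). It would not compile for vector.
template<class Seq>
Seq fillBothEnds(int n) {
  Seq s;
  for (int i = 1; i <= n; i++) {
    s.push_back(i);    // Positive values accumulate at the back
    s.push_front(-i);  // Negative values accumulate at the front
  }
  return s;
}
```

Calling fillBothEnds<std::deque<int> >(2) yields the sequence -2, -1, 1, 2; substituting std::list<int> gives the same contents.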

But now how do you retrieve those elements? With a vector or deque, it is possible to use the indexing operator[ ], but that doesn’t work with list. Since it would be nicest to learn a single interface, we’ll often use the one defined for all STL containers: the iterator.

An iterator is a class that abstracts the process of moving through a sequence. It allows you to select each element of a sequence without knowing the underlying structure of that sequence. This is a powerful feature, partly because it allows us to learn a single interface that works with all containers, and partly because it allows containers to be used interchangeably.
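To make the contrast concrete, here is a small sketch: one function that relies on indexing, and one that relies only on iterators and therefore accepts any of the three sequences. The names sumByIndex and sumByIterator are assumptions made for this example.

```cpp
#include <deque>
#include <list>
#include <vector>

// Indexing works for vector (and deque), but list has no operator[].
int sumByIndex(const std::vector<int>& v) {
  int total = 0;
  for (std::vector<int>::size_type i = 0; i < v.size(); i++)
    total += v[i];
  return total;
}

// The iterator version never asks how the container is laid out,
// so the same code serves vector, deque and list.
template<class Container>
int sumByIterator(const Container& c) {
  int total = 0;
  for (typename Container::const_iterator it = c.begin();
       it != c.end(); ++it)
    total += *it;  // Dereference to reach the current element
  return total;
}
```

The template parameter is deduced from the argument, so sumByIterator(myList) and sumByIterator(myVector) both compile without any change to the function.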

One more observation and you’re ready for another example. Even though the STL containers hold objects by value (that is, they hold the whole object inside themselves) that’s probably not the way you’ll generally use them if you’re doing object-oriented programming. That’s because in OOP, most of the time you’ll create objects on the heap with new and then upcast the address to the base-class type, later manipulating it as a pointer to the base class. The beauty of this is that you don’t worry about the specific type of object you’re dealing with, which greatly reduces the complexity of your code and increases the maintainability of your program. This process of upcasting is what you try to do in OOP with polymorphism, so you’ll usually be using containers of pointers.

Consider the classic “shape” example where shapes have a set of common operations, and you have different types of shapes. Here’s what it looks like using the STL vector to hold pointers to various types of Shape created on the heap:

The creation of Shape, Circle, Square and Triangle should be fairly familiar. Shape is a pure abstract base class (because of the pure specifier =0) that defines the interface for all types of shapes. The derived classes redefine the virtual function draw( ) to perform the appropriate operation. Now we’d like to create a bunch of different types of Shape object, but where to put them? In an STL container, of course. For convenience, one typedef creates an alias for vector<Shape*>, and a second uses that alias to create another one, for vector<Shape*>::iterator. Notice that the container type name must be used to produce the appropriate iterator, which is defined as a nested type.

Although there are different types of iterators (forward, bidirectional, reverse, etc., which will be explained later) they all have the same basic interface: you can increment them with ++, you can dereference them to produce the object they’re currently selecting, and you can test them to see if they’re at the end of the sequence. That’s what you’ll want to do 90% of the time. And that’s what is done in the example: after creating a container, it’s filled with different types of Shape*. Notice that the upcast happens as the Circle, Square or Triangle pointer is added to the shapes container, which doesn’t know about those specific types but instead holds only Shape*. So as soon as the pointer is added to the container it loses its specific identity and becomes an anonymous Shape*. This is exactly what we want: toss them all in and let polymorphism sort it out.

The first for loop creates an iterator and sets it to the beginning of the sequence by calling the begin( ) member function for the container. All containers have begin( ) and end( ) member functions that produce an iterator selecting, respectively, the beginning of the sequence and one past the end of the sequence. To test to see if you’re done, you make sure you’re != to the iterator produced by end( ). Not < or <=. The only test that works with every container’s iterators is != (list iterators, for instance, have no <). So it’s very common to write a loop whose header initializes an iterator from begin( ), tests it with != against end( ), and increments it with ++.
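The loop idiom can be sketched as follows. A list is used on purpose, because its iterators support != but not <. The function name joinWords is hypothetical.

```cpp
#include <list>
#include <string>

// Hypothetical helper: concatenates every string in the list, front to back.
std::string joinWords(const std::list<std::string>& words) {
  std::string result;
  for (std::list<std::string>::const_iterator it = words.begin();
       it != words.end();  // Replacing != with < would not compile for list
       ++it)
    result += *it;
  return result;
}
```

Because end( ) selects one past the last element, the loop body never dereferences it, and an empty container simply skips the loop.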

What do you do with the iterator to produce the element it’s selecting? You dereference it using (what else) the ‘*’ (which is actually an overloaded operator). What you get back is whatever the container is holding. This container holds Shape*, so that’s what *i produces. If you want to send a message to the Shape, you must select that message with ->, so you write (*i)->draw( ).

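This calls the draw( ) function for the Shape* the iterator is currently selecting. The parentheses are ugly but necessary, because -> binds more tightly than *. Iterators also define operator->, but it selects a member of the element the iterator refers to, so the shorter form i->member works only when the container holds objects directly. Here the elements are Shape*, so i->draw( ) will not compile and the (*i)-> form is still needed.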
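The difference between a container of objects and a container of pointers shows up in a short sketch. The Label type and both function names are hypothetical, and both functions assume a non-empty container.

```cpp
#include <string>
#include <vector>

// Hypothetical value type used only for this illustration.
struct Label {
  std::string text;
  std::string show() const { return "[" + text + "]"; }
};

// Container of objects: the iterator's operator-> reaches members directly.
std::string showFirst(const std::vector<Label>& labels) {
  std::vector<Label>::const_iterator it = labels.begin();
  return it->show();  // Equivalent to (*it).show()
}

// Container of pointers: *it is a Label*, so a second step is required.
std::string showFirstPtr(const std::vector<Label*>& labels) {
  std::vector<Label*>::const_iterator it = labels.begin();
  return (*it)->show();  // it->show() would not compile here
}
```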

As they are destroyed or in other cases where the pointers are removed, the STL containers do not call delete for the pointers they contain. If you create an object on the heap with new and place its pointer in a container, the container can’t tell if that pointer is also placed inside another container. So the STL just doesn’t do anything about it, and puts the responsibility squarely in your lap. The last lines in the program move through and delete every object in the container so proper cleanup occurs.
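A small sketch makes the ownership rule visible. The Tracked class, its alive counter, and the purge function are assumptions introduced for this example; they are not part of the STL.

```cpp
#include <vector>

// Hypothetical class that counts how many heap instances currently exist.
class Tracked {
public:
  static int alive;
  Tracked() { ++alive; }
  ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Deletes every object the container points to, then empties the container.
void purge(std::vector<Tracked*>& v) {
  for (std::vector<Tracked*>::iterator it = v.begin(); it != v.end(); ++it)
    delete *it;  // The container never does this on its own
  v.clear();     // Drop the now-dangling pointers
}
```

If two containers hold the same pointers, clearing one of them leaves Tracked::alive unchanged; only an explicit call such as purge( ) on one container destroys the objects, and after that the other container’s pointers must not be used.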

It’s very interesting to note that you can change the type of container that this program uses with two lines. Instead of including <vector>, you include <list>, and in the first typedef you use list<Shape*> instead of vector<Shape*>. Everything else goes untouched. This is possible not because of an interface enforced by inheritance (there isn’t any inheritance in the STL, which comes as a surprise when you first see it), but because the interface is enforced by a convention adopted by the designers of the STL, precisely so you could perform this kind of interchange. Now you can easily switch between vector and list and see which one works fastest for your needs.

The basic concepts