Bruce Eckel's Thinking in C++, 2nd Ed Contents | Prev | Next

Freely-available

STL extensions

Although the STL containers may provide all the functionality you’ll ever need, they are not complete. For example, the standard implementations of set and map use trees, and although these are reasonably fast they may not be fast enough for your needs. In the C++ Standards Committee it was generally agreed that hashed implementations of set and map should have been included in Standard C++, however there was not considered to be enough time to add these components, and thus they were left out.

Fortunately, there are freely-available alternatives. One of the nice things about the STL is that it establishes a basic model for creating STL-like classes, so anything built using the same model is easy to understand if you are already familiar with the STL.

The SGI STL (freely available at http://www.sgi.com/Technology/STL/) is one of the most robust implementations of the STL, and can be used to replace your compiler’s STL if that is found wanting. In addition they’ve added a number of extensions including hash_set, hash_multiset, hash_map, hash_multimap, slist (a singly-linked list) and rope (a variant of string optimized for very large strings and fast concatenation and substring operations).

Let’s consider a performance comparison between a tree-based map and the SGI hash_map. To keep things simple, the mappings will be from int to int:

//: C20:MapVsHashMap.cpp
// The hash_map header is not part of the 
// Standard C++ STL. It is an extension that 
// is only available as part of the SGI STL:
#include <hash_map> 
#include <iostream>
#include <map>
#include <ctime>
using namespace std;

int main(){
  hash_map<int, int> hm;
  map<int, int> m;
  clock_t ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m.insert(make_pair(j,j));
  cout << "map insertions: " 
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm.insert(make_pair(j,j));
  cout << "hash_map insertions: " 
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m[j];
  cout << "map::operator[] lookups: " 
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm[j];
  cout << "hash_map::operator[] lookups: "
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m.find(j);
  cout << "map::find() lookups: "
    << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm.find(j);
  cout << "hash_map::find() lookups: " 
    << clock() - ticks << endl;
} ///:~

The performance test I ran showed a speed improvement of roughly 4:1 for the hash_map over the map in all operations (and as expected, find( ) is slightly faster than operator[ ] for lookups for both types of map). If a profiler shows a bottleneck in your map, you should consider a hash_map.

Contents | Prev | Next


Contact: webmaster@codeguru.com
CodeGuru - the website for developers.
[an error occurred while processing this directive]