Associative
containers
The
set,
map,
multiset
and 
multimap
are called 
associative
containers
because they associate 
keys
with 
values.
Well, at least 
maps
and 
multimaps
associate keys to values, but you can look at a 
set
as a 
map
that has no values, only keys (and they can in fact be implemented this way),
and the same for the relationship between 
multiset
and 
multimap.
So, because of the structural similarity 
sets
and 
multisets
are lumped in with associative containers.
 The
most important basic operations with associative containers are putting things
in, and in the case of a 
set,
seeing if something is in the set. In the case of a 
map,
you want to first see if a key is in the 
map,
and if it exists you want the associated value for that key to be returned. Of
course, there are many variations on this theme but that’s the
fundamental concept. The following example shows these basics:
 
//: C20:AssociativeBasics.cpp
// Basic operations with sets and maps
#include "Noisy.h"
#include <iostream>
#include <set>
#include <map>
using namespace std;
int main() {
  Noisy na[] = { Noisy(), Noisy(), Noisy(), 
    Noisy(), Noisy(), Noisy(), Noisy() };
  // Add elements via constructor:
  set<Noisy> ns(na, na+ sizeof na/sizeof(Noisy));
  // Ordinary insertion:
  Noisy n;
  ns.insert(n);
  cout << endl;
  // Check for set membership:
  cout << "ns.count(n)= " << ns.count(n) << endl;
  if(ns.find(n) != ns.end())
    cout << "n(" << n << ") found in ns" << endl;
  // Print elements:
  copy(ns.begin(), ns.end(), 
    ostream_iterator<Noisy>(cout, " "));
  cout << endl;
  cout << "\n-----------\n";
  map<int, Noisy> nm;
  for(int i = 0; i < 10; i++)
    nm[i]; // Automatically makes pairs
  cout << "\n-----------\n";
  for(int j = 0; j < nm.size(); j++)
    cout << "nm[" << j <<"] = " << nm[j] << endl;
  cout << "\n-----------\n";
  nm[10] = n;
  cout << "\n-----------\n";
  nm.insert(make_pair(47, n));
  cout << "\n-----------\n";
  cout << "\n nm.count(10)= " 
    << nm.count(10) << endl;
  cout << "nm.count(11)= " 
    << nm.count(11) << endl;
  map<int, Noisy>::iterator it = nm.find(6);
  if(it != nm.end())
    cout << "value:" << (*it).second
      << " found in nm at location 6" << endl;
  for(it = nm.begin(); it != nm.end(); it++)
    cout << (*it).first << ":" 
      << (*it).second << ", ";
  cout << "\n-----------\n";The
set<Noisy>
object 
ns
is created using two iterators into an array of 
Noisy
objects, but there is also a default constructor and a copy-constructor, and
you can pass in an object that provides an alternate scheme for doing
comparisons. Both 
sets
and 
maps
have an 
insert( )
member function to put things in, and there are a couple of different ways to
check to see if an object is already in an associative container: 
count( ),
when given a key, will tell you how many times that key occurs (this can only
be zero or one in a 
set
or 
map,
but it can be more than one with a 
multiset
or 
multimap).
The 
find( )
member function will produce an iterator indicating the first occurrence (with 
set
and 
map,
the 
only
occurrence) of the key that you give it, or the past-the-end iterator if it
can’t find the key. The 
count( )
and 
find( )
member functions exist for all the associative containers, which makes sense.
The associative containers also have member functions 
lower_bound( ),
upper_bound( )
and 
equal_range( ),
which actually only make sense for 
multiset
and 
multimap,
as you shall see (but don’t try to figure out how they would be useful for 
set
and 
map,
since they are designed for dealing with a range of duplicate keys, which those
containers don’t allow).
 Designing
an 
operator[
]
always produces a little bit of a dilemma because it’s intended to be
treated as an array-indexing operation, so people don’t tend to think
about performing a test before they use it. But what happens if you decide to
index out of the bounds of the array? One option, of course, is to throw an
exception, but with a 
map
“indexing out of the array” could mean that you want an entry
there, and that’s the way the STL 
map
treats it. The first 
for
loop after the creation of the 
map<int,
Noisy> nm
just “looks up” objects using the 
operator[
]
,
but this is actually creating new 
Noisy
objects! The 
map
creates a new key-value pair (using the default constructor for the value) if
you look up a value with 
operator[
]
and it isn’t there. This means that if you really just want to look
something up and not create a new entry, you must use 
count( )
(to see if it’s there) or 
find( )
(to get an iterator to it).
 The
for
loop that prints out the values of the container using 
operator[
]
has a number of problems. First, it requires integral keys (which we happen to
have in this case). Next and worse, if all the keys are not sequential,
you’ll end up counting from 0 to the size of the container, and if there
are some spots which don’t have key-value pairs you’ll
automatically create them, and miss some of the higher values of the keys.
Finally, if you look at the output from the 
for
loop you’ll see that things are 
very
busy, and it’s quite puzzling at first why there are so many
constructions and destructions for what appears to be a simple lookup. The
answer only becomes clear when you look at the code in the 
map
template for 
operator[
]
,
which will be something like this:
 
mapped_type& operator[] (const key_type& k) {
  value_type tmp(k,T()); 
  return (*((insert(tmp)).first)).second;Following
the trail, you’ll find that 
map::value_type
is:
 typedef
pair<const Key, T> value_type;
 Now
you need to know what a 
pair
is, which can be found in 
<utility>: 
template <class T1, class T2>
struct pair { 
  typedef T1 first_type; 
  typedef T2 second_type; 
  T1 first; 
  T2 second; 
  pair(); 
  pair(const T1& x, const T2& y) 
    : first(x), second(y) {}
  // Templatized copy-constructor:
  template<class U, class V> 
   pair(const pair<U, V> &p);It
turns out this is a very important (albeit simple) 
struct
which is used quite a bit in the STL. All it really does it package together
two objects, but it’s very useful, especially when you want to return two
objects from a function (since a 
return
statement only takes one object). There’s even a shorthand for creating a
pair called 
make_pair( ),
which is used in 
AssociativeBasics.cpp. So
to retrace the steps, 
map::value_type
is a 
pair
of the key and the value of the map – actually, it’s a single entry
for the map. But notice that 
pair
packages its objects by value, which means that copy-constructions are
necessary to get the objects into the 
pair.
Thus, the creation of 
tmp
in 
map::operator[
]
will involve at least a copy-constructor call and destructor call for each
object in the 
pair.
Here, we’re getting off easy because the key is an 
int.
But if you want to really see what kind of activity can result from 
map::operator[
]
,
try running this:
 
//: C20:NoisyMap.cpp
// Mapping Noisy to Noisy
#include "Noisy.h"
#include <map>
using namespace std;
int main() {
  map<Noisy, Noisy> mnn;
  Noisy n1, n2;
  cout << "\n--------\n";
  mnn[n1] = n2;
  cout << "\n--------\n";
  cout << mnn[n1] << endl;
  cout << "\n--------\n";You’ll
see that both the insertion and lookup generate a lot of extra objects, and
that’s because of the creation of the 
tmp
object. If you look back up at 
map::operator[
]
you’ll see that the second line calls 
insert( )
passing it 
tmp
– that is, 
operator[
]
does an insertion every time. The return value of 
insert( )
is a different kind of 
pair,
where 
first
is an iterator pointing to the key-value 
pair
that was just inserted, and 
second
is a 
bool
indicating whether the insertion took place. You can see that 
operator[
]
grabs 
first
(the iterator), dereferences it to produce the 
pair,
and then returns the 
second
which is the value at that location. 
 So
on the upside, 
map
has this fancy “make a new entry if one isn’t there”
behavior, but the downside is that you 
always
get a lot of extra object creations and destructions when you use 
map::operator[
]
.
Fortunately, 
AssociativeBasics.cpp
also demonstrates how to reduce the overhead of insertions and deletions, by
not using 
operator[
]
if you don’t have to. The 
insert( )
member function is slightly more efficient than 
operator[
]
.
With a 
set
you only hold one object, but with a 
map
you hold key-value pairs, so 
insert( )
requires a 
pair
as its argument. Here’s where 
make_pair( )
comes in handy, as you can see.
 For
looking objects up in a 
map,
you can use 
count( )
to see whether a key is in the map, or you can use 
find( )
to produce an iterator pointing directly at the key-value pair. Again, since the 
map
contains 
pairs
that’s what the iterator produces when you dereference it, so you have to
select 
first
and 
second.
When you run 
AssociativeBasics.cpp
you’ll notice that the iterator approach involves no extra object
creations or destructions at all. It’s not as easy to write or read,
though.
 If
you use a 
map
with large, complex objects and discover there’s too much overhead when
doing lookups and insertions (don’t assume this from the beginning
– take the easy approach first and use a profiler to discover
bottlenecks), then you can use the counted-handle approach shown in Chapter XX
so that you are only passing around small, lightweight objects.
 Of
course, you can also iterate through a 
set
or 
map
and operate on each of its objects. This will be demonstrated in later examples.
 
Generators
and fillers 
for
associative containers
You’ve
seen how useful the 
fill( ),
fill_n( ),
generate( )
and 
generate_n( )
function templates in 
<algorithm>
have been for filling the sequential containers (
vector,
list
and 
deque)
with data. However, these are implemented by using 
operator=
to
assign values into the sequential containers, and the way that you add objects
to associative containers is with their respective 
insert( )
member functions. Thus the default “assignment” behavior causes a
problem when trying to use the “fill” and “generate”
functions with associative containers.
 One
solution is to duplicate the “fill” and “generate”
functions, creating new ones that can be used with associative containers. It
turns out that only the 
fill_n( )
and 
generate_n( )
functions can be duplicated (
fill( )
and
generate( )
copy
in between two iterators, which doesn’t make sense with associative
containers), but the job is fairly easy, since you have the 
<algorithm>
header file to work from (and since it contains templates, all the source code
is there):
 
//: C20:assocGen.h
// The fill_n() and generate_n() equivalents 
// for associative containers.
#ifndef ASSOCGEN_H
#define ASSOCGEN_H
template<class Assoc, class Count, class T>
void 
assocFill_n(Assoc& a, Count n, const T& val) {
  for (; 0 < n; --n)
    a.insert(val);
}
template<class Assoc, class Count, class Gen>
void assocGen_n(Assoc& a, Count n, Gen g) {
  for (; 0 < n; --n)
    a.insert(g());
}#endif
// ASSOCGEN_H ///:~
 You
can see that instead of using iterators, the container class itself is passed
(by reference, of course, since you wouldn’t want to make a local copy,
fill it, and then have it discarded at the end of the scope).
 This
code demonstrates two valuable lessons. The first lesson is that if the
algorithms don’t do what you want, copy the nearest thing and modify it.
You have the example at hand in the STL header, so most of the work has already
been done.
 The
second lesson is more pointed: if you look long enough, there’s probably
a way to do it in the STL 
without
inventing anything new. The present problem can instead be solved by using an 
insert_iterator
(produced by a call to 
inserter( )),
which calls 
insert( )
to place items in the container instead of 
operator=.
This is 
not
simply a variation of 
front_insert_iterator
(produced by a call to 
front_inserter( ))
or 
back_insert_iterator
(produced by a call to 
back_inserter( )),
since those iterators use 
push_front( )
and 
push_back( ),
respectively. Each of the insert iterators is different by virtue of the member
function it uses for insertion, and 
insert( )
is the one we need. Here’s a demonstration that shows filling and
generating both a 
map
and a 
set
(of course, it can also be used with 
multimap
and 
multiset).
First, some templatized, simple generators are created (this may seem like
overkill, but you never know when you’ll need them; for that reason
they’re placed in a header file):
 
//: C20:SimpleGenerators.h
// Generic generators, including
// one that creates pairs
#include <iostream>
#include <utility>
// A generator that increments its value:
template<typename T>
class IncrGen {
  T i;
public:
  IncrGen(T ii) : i (ii) {}
  T operator()() { return i++; }
};
// A generator that produces an STL pair<>:
template<typename T1, typename T2>
class PairGen {
  T1 i;
  T2 j;
public:
  PairGen(T1 ii, T2 jj) : i(ii), j(jj) {}
  std::pair<T1,T2> operator()() { 
    return std::pair<T1,T2>(i++, j++); 
  }
};
// A generic global operator<< 
// for printing any STL pair<>:
template<typename Pair> std::ostream&
operator<<(std::ostream& os, const Pair& p) {
  return os << p.first << "\t" 
    << p.second << std::endl;Both
generators expect that 
T
can be incremented, and they simply use 
operator++
to generate new values from whatever you used for initialization. 
PairGen
creates an STL 
pair
object as its return value, and that’s what can be placed into a 
map
or 
multimap
using 
insert( ). The
last function is a generalization of 
operator<<
for 
ostreams,
so that any 
pair
can be printed, assuming each element of the 
pair
supports
a stream 
operator<<.
As you can see below, this allows the use of 
copy( )
to output the 
map: 
//: C20:AssocInserter.cpp
// Using an insert_iterator so fill_n() and
// generate_n() can be used with associative 
// containers
#include "SimpleGenerators.h"
#include <iterator>
#include <iostream>
#include <algorithm>
#include <set>
#include <map>
using namespace std;
int main() {
  set<int> s;
  fill_n(inserter(s, s.begin()), 10, 47);
  generate_n(inserter(s, s.begin()), 10, 
    IncrGen<int>(12));
  copy(s.begin(), s.end(), 
    ostream_iterator<int>(cout, "\n"));
  
  map<int, int> m;
  fill_n(inserter(m, m.begin()), 10, 
    make_pair(90,120));
  generate_n(inserter(m, m.begin()), 10, 
    PairGen<int, int>(3, 9));
  copy(m.begin(), m.end(), 
    ostream_iterator<pair<int,int> >(cout,"\n"));The
second argument to 
inserter
is an interator, which actually isn’t used in the case of associative
containers since they maintain their order internally, rather than allowing you
to tell them where the element should be inserted. However, an 
insert_iterator
can be used with many different types of containers so you must provide the
iterator.
 Note
how the 
ostream_iterator
is created to output a 
pair;
this wouldn’t have worked if the 
operator<<
hadn’t been created, and since it’s a template it is automatically
instantiated for 
pair<int,
int>
. 
The
magic of maps
An
ordinary array uses an integral value to index into a sequential set of
elements of some type. A 
map
is an 
associative
array
,
which means you associate one object with another in an array-like fashion, but
instead of selecting an array element with a number as you do with an ordinary
array, you look it up with an object! The example which follows counts the
words in a text file, so the index is the 
string
object representing the word, and the value being looked up is the object that
keeps count of the strings.
 In
a single-item container like a 
vector
or 
list,
there’s only one thing being held. But in a 
map,
you’ve got two things: the 
key
(what you look up by, as in 
mapname[key])
and the 
value
that results from the lookup with the key. If you simply want to move through
the entire 
map
and list each key-value pair, you use an iterator, which when dereferenced
produces a 
pair
object containing both the key and the value. You access the members of a 
pair
by selecting 
first
or 
second. This
same philosophy of packaging two items together is also used to insert elements
into the map, but the 
pair
is created as part of the instantiated 
map
and is called 
value_type,
containing the key and the value. So one option for inserting a new element is
to create a 
value_type
object, loading it with the appropriate objects and then calling the 
insert( )
member function for the 
map.
Instead, the following example makes use of the aforementioned special feature
of 
map:
if you’re trying to find an object by passing in a key to 
operator[
]
and that object doesn’t exist, 
operator[
]
will automatically insert a new key-value pair for you, using the default
constructor for the value object. With that in mind, consider an implementation
of a word counting program:
 
//: C20:WordCount.cpp
//{L} StreamTokenizer
// Count occurrences of words using a map
#include "StreamTokenizer.h"
#include "../require.h"
#include <string>
#include <map>
#include <iostream>
#include <fstream>
using namespace std;
class Count {
  int i;
public:
  Count() : i(0) {}
  void operator++(int) { i++; } // Post-increment  
  int& val() { return i; }
};
typedef map<string, Count> WordMap;
typedef WordMap::iterator WMIter;
int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  StreamTokenizer words(in);
  WordMap wordmap;
  string word;
  while((word = words.next()).size() != 0)
    wordmap[word]++;
  for(WMIter w = wordmap.begin(); 
      w != wordmap.end(); w++)
    cout << (*w).first << ": "
      << (*w).second.val() << endl;The
need for the 
Count
class is to contain an 
int
that’s automatically initialized to zero. This is necessary because of
the crucial line:
 This
finds the word that has been produced by 
StreamTokenizer
and
increments the 
Count
object associated with that word, which is fine as long as there 
is
a key-value pair for that 
string.
If there isn’t, the 
map
automatically inserts a key for the word you’re looking up, and a 
Count
object, which is initialized to zero by the default constructor. Thus, when
it’s incremented the 
Count
becomes 1.
 Printing
the entire list requires traversing it with an iterator (there’s no 
copy( )
shortcut for a 
map
unless you want to write an 
operator<<
for
the 
pair
in the map). As previously mentioned, dereferencing this iterator produces a 
pair
object, with the 
first
member the key and the 
second
member the value. In this case 
second
is a 
Count
object, so its 
val( )
member must be called to produce the actual word count.
 If
you want to find the count for a particular word, you can use the array index
operator, like this:
 cout
<< "the: " << wordmap["the"].val() << endl;
 You
can see that one of the great advantages of the 
map
is the clarity of the syntax; an associative array makes intuitive sense to the
reader (note, however, that if “the” isn’t already in the 
wordmap
a new entry will be created!).
 
A
command-line argument tool
A
problem that often comes up in programming is the management of program
arguments that you can specify on the command line. Usually you’d like to
have a set of defaults that can be changed via the command line. The following
tool expects the command line arguments to be in the form 
flag1=value1
with no spaces around the ‘
=‘
(so it will be treated as a single argument). The 
ProgVal
class simply inherits from 
map<string,
string>
: 
//: C20:ProgVals.h
// Program values can be changed by command line
#ifndef PROGVALS_H
#define PROGVALS_H
#include <map>
#include <iostream>
#include <string>
class ProgVals 
  : public std::map<std::string, std::string> {
public:
  ProgVals(std::string defaults[][2], int sz);
  void parse(int argc, char* argv[],
    std::string usage, int offset = 1);
  void print(std::ostream& out = std::cout);
};#endif
// PROGVALS_H ///:~
 The
constructor expects an array of 
string
pairs (as you’ll see, this allows you to initialize it with an array of 
char*)
and the size of that array. The 
parse( )
member function is handed the command-line arguments along with a
“usage” string to print if the command line is given incorrectly,
and the “offset” which tells it which command-line argument to
start with (so you can have non-flag arguments at the beginning of the command
line). Finally, 
print( )
displays the values. Here is the implementation:
 
//: C20:ProgVals.cpp {O}
#include "ProgVals.h"
using namespace std;
ProgVals::ProgVals(
  std::string defaults[][2], int sz) {
  for(int i = 0; i < sz; i++)
    insert(make_pair(
      defaults[i][0], defaults[i][1]));
}
void ProgVals::parse(int argc, char* argv[],
  string usage, int offset) {
  // Parse and apply additional
  // command-line arguments:
  for(int i = offset; i < argc; i++) {
    string flag(argv[i]);
    int equal = flag.find('=');
    if(equal == string::npos) {
      cerr << "Command line error: " <<
        argv[i] << endl << usage << endl;
      continue; // Next argument
    }
    string name = flag.substr(0, equal);
    string value = flag.substr(equal + 1);
    if(find(name) == end()) {
      cerr << name << endl << usage << endl;
      continue; // Next argument
    }
    operator[](name) = value;
  }
}
void ProgVals::print(ostream& out) {
  out << "Program values:" << endl;
  for(iterator it = begin(); it != end(); it++)
    out << (*it).first << " = "
        << (*it).second << endl;The
constructor uses the STL 
make_pair( )
helper function to convert each pair of 
char*
into a 
pair
object that can be inserted into the 
map.
In 
parse( ),
each command-line argument is checked for the existence of the telltale ‘
=‘
sign (reporting an error if it isn’t there), and then is broken into two
strings, the 
name
which appears before the ‘
=‘,
and the 
value
which appears after. The 
operator[ ]
is then used to change the existing value to the new one.
 Here’s
an example to test the tool:
 
//: C20:ProgValTest.cpp
//{L} ProgVals
#include "ProgVals.h"
using namespace std;
string defaults[][2] = {
  { "color", "red" },
  { "size", "medium" },
  { "shape", "rectangular" },
  { "action", "hopping"},
};
const char* usage = "usage:\n"
"ProgValTest [flag1=val1 flag2=val2 ...]\n"
"(Note no space around '=')\n"
"Where the flags can be any of: \n"
"color, size, shape, action \n";
// So it can be used globally:
ProgVals pvals(defaults, 
  sizeof defaults / sizeof *defaults);
class Animal {
  string color, size, shape, action;
public:
  Animal(string col, string sz, 
    string shp, string act) 
    :color(col),size(sz),shape(shp),action(act){}
  // Default constructor uses program default
  // values, possibly change on command line:
  Animal() : color(pvals["color"]), 
    size(pvals["size"]), shape(pvals["shape"]),
    action(pvals["action"]) {}
  void print() {
    cout << "color = " << color << endl
      << "size = " << size << endl
      << "shape = " << shape << endl
      << "action = " << action << endl;
  }
  // And of course pvals can be used anywhere
  // else you'd like.
};
int main(int argc, char* argv[]) {
  // Initialize and parse command line values
  // before any code that uses pvals is called:
  pvals.parse(argc, argv, usage);
  pvals.print();
  Animal a;
  cout << "Animal a values:" << endl;
  a.print();This
program can create 
Animal
objects with different characteristics, and those characteristics can be
established with the command line. The default characteristics are given in the
two-dimensional array of 
char*
called 
defaults
and, after the 
usage
string you can see a global instance of 
ProgVals
called 
pvals
is
created; this is important because it allows the rest of the code in the
program to access the values.
 Note
that 
Animal’s
default constructor uses the values in 
pvals
inside its constructor initializer list. When you run the program you can try
creating different animal characteristics.
 Many
command-line programs also use a style of beginning a flag with a hyphen, and
sometimes they use single-character flags.
 The
STL 
map
is used in numerous places throughout the rest of this book.
 
Multimaps
and duplicate keys
A
multimap
is a 
map
that can contain duplicate keys. At first this may seem like a strange idea,
but it can occur surprisingly often. A phone book, for example, can have many
entries with the same name. 
 Suppose
you are monitoring wildlife, and you want to keep track of where and when each
type of animal is spotted. Thus, you may see many animals of the same kind, all
in different locations and at different times. So if the type of animal is the
key, you’ll need a 
multimap.
Here’s what it looks like:
 
//: C20:WildLifeMonitor.cpp
#include <vector>
#include <map>
#include <string>
#include <algorithm>
#include <iostream>
#include <sstream>
#include <ctime>
using namespace std;
class DataPoint {
  int x, y; // Location coordinates
  time_t time; // Time of Sighting
public:
  DataPoint() : x(0), y(0), time(0) {}
  DataPoint(int xx, int yy, time_t tm) :
    x(xx), y(yy), time(tm) {}
  // Synthesized operator=, copy-constructor OK
  int getX() { return x; }
  int getY() { return y; }
  time_t* getTime() { return &time; }
};
string animal[] = { 
  "chipmunk", "beaver", "marmot", "weasel",
  "squirrel", "ptarmigan", "bear", "eagle",
  "hawk", "vole", "deer", "otter", "hummingbird",
};
const int asz = sizeof animal/sizeof *animal;
vector<string> animals(animal, animal + asz);
// All the information is contained in a 
// "Sighting," which can be sent to an ostream:
typedef pair<string, DataPoint> Sighting;
ostream& 
operator<<(ostream& os, const Sighting& s) {
  return os << s.first << " sighted at x= " << 
    s.second.getX() << ", y= " << s.second.getY()
    << ", time = " << ctime(s.second.getTime());
}
// A generator for Sightings:
class SightingGen {
  vector<string>& animals;
  static const int d = 100;
public:
  SightingGen(vector<string>& an) :
    animals(an) { srand(time(0)); }
  Sighting operator()() {
    Sighting result;
    int select = rand() % animals.size();
    result.first = animals[select];
    result.second = DataPoint(
      rand() % d, rand() % d, time(0));
    return result;
  }
};
typedef multimap<string, DataPoint> DataMap;
typedef DataMap::iterator DMIter;
int main() {
  DataMap sightings;
  generate_n(
    inserter(sightings, sightings.begin()),
    50, SightingGen(animals));
  // Print everything:
  copy(sightings.begin(), sightings.end(),
    ostream_iterator<Sighting>(cout, ""));
  // Print sightings for selected animal:
  while(true) {
    cout << "select an animal or 'q' to quit: ";
    for(int i = 0; i < animals.size(); i++)
      cout <<'['<< i <<']'<< animals[i] << ' ';
    cout << endl;
    string reply;
    cin >> reply;
    if(reply.at(0) == 'q') return 0;
    istringstream r(reply);
    int i;
    r >> i; // Converts to int
    i %= animals.size();
    // Iterators in "range" denote begin, one 
    // past end of matching range:
    pair<DMIter, DMIter> range = 
      sightings.equal_range(animals[i]);
    copy(range.first, range.second,
      ostream_iterator<Sighting>(cout, ""));
  }All
the data about a sighting is encapsulated into the class 
DataPoint,
which is simple enough that it can rely on the synthesized assignment and
copy-constructor. It uses the Standard C library time functions to record the
time of the sighting.
 In
the array of 
string
animal,
notice that the 
char*
constructor is automatically used during initialization, which makes
initializing an array of 
string
quite convenient. Since it’s easier to use the animal names in a 
vector,
the length of the array is calculated and a 
vector<string>
is initialized using the 
vector(iterator,
iterator)
constructor.
 The
key-value pairs that make up a 
Sighting
are the 
string
which names the type of animal, and the 
DataPoint
that says where and when it was sighted. The standard 
pair
template combines these two types and is typedefed to produce the 
Sighting
type. Then an 
ostream
operator<<
is created for 
Sighting;
this will allow you to iterate through a map or multimap of 
Sightings
and
print it out.
 SightingGen
generates random sightings at random data points to use for testing. It has the
usual 
operator( )
necessary for a function object, but it also has a constructor to capture and
store a reference to a 
vector<string>,
which is where the aforementioned animal names are stored.
 A
DataMap
is a 
multimap
of 
string-DataPoint
pairs, which means it stores 
Sightings.
It is filled with 50 
Sightings
using 
generate_n( ),
and printed out (notice that because there is an 
operator<<
that takes a 
Sighting,
an 
ostream_iterator
can be created). At this point the user is asked to select the animal that they
want to see all the sightings for. If you press 
‘q’
the program will quit, but if you select an animal number, then the 
equal_range( )
member function is invoked. This returns an iterator (
DMIter)
to the beginning of the set of matching pairs, and one indicating past-the-end
of the set. Since only one object can be returned from a function, 
equal_range( )
makes use of 
pair.
Since the 
range
pair has the beginning and ending iterators of the matching set, those
iterators can be used in 
copy( )
to print out all the sightings for a particular type of animal.
 
Multisets
You’ve
seen the 
set,
which only allows one object of each value to be inserted. The 
multiset
is odd by comparison since it allows more than one object of each value to be
inserted. This seems to go against the whole idea of “setness,”
where you can ask “is ‘it’ in this set?” If there can
be more than one of ‘it’, then what does that question mean?
 With
some thought, you can see that it makes no sense to have more than one object
of the same value in a set if those duplicate objects are 
exactly
the same (with the possible exception of counting occurrences of objects, but
as seen earlier in this chapter that can be handled in an alternative, more
elegant fashion). Thus each duplicate object will have something that makes it
unique from the other duplicates – most likely different state
information that is not used in the calculation of the value during the
comparison. That is, to the comparison operation, the objects look the same but
they actually contain some differing internal state.
 Like
any STL container that must order its elements, the 
multiset
template uses the 
less
template by default to determine element ordering. This uses the contained
classes’ 
operator<,
but you may of course substitute your own comparison function.
 Consider
a simple class that contains one element that is used in the comparison, and
another that is not:
 
//: C20:MultiSet1.cpp
// Demonstration of multiset behavior
#include <iostream>
#include <set>
#include <algorithm>
#include <ctime>
using namespace std;
class X {
  char c; // Used in comparison
  int i; // Not used in comparison
  // Don't need default constructor and operator=
  X();
  X& operator=(const X&);
  // Usually need a copy-constructor (but the
  // synthesized version works here)
public:
  X(char cc, int ii) : c(cc), i(ii) {}
  // Notice no operator== is required
  friend bool operator<(const X& x, const X& y) {
    return x.c < y.c;
  }
  friend ostream& operator<<(ostream& os, X x) {
    return os << x.c << ":" << x.i;
  }
};
class Xgen {
  static int i;
  // Number of characters to select from:
  static const int span = 6;
public:
  Xgen() { srand(time(0)); }
  X operator()() {
    char c = 'A' + rand() % span;    
    return X(c, i++);
  }
};
int Xgen::i = 0;
typedef multiset<X> Xmset;
typedef Xmset::const_iterator Xmit;
int main() {
  Xmset mset;
  // Fill it with X's:
  generate_n(inserter(mset, mset.begin()), 
    25, Xgen());
  // Initialize a regular set from mset:
  set<X> unique(mset.begin(), mset.end());
  copy(unique.begin(), unique.end(), 
    ostream_iterator<X>(cout, " "));
  cout << "\n----\n";
  // Iterate over the unique values:
  for(set<X>::iterator i = unique.begin();
      i != unique.end(); i++) {
    pair<Xmit, Xmit> p = mset.equal_range(*i);
    copy(p.first, p.second, 
      ostream_iterator<X>(cout, " "));
    cout << endl;
  }In
X,
all the comparisons are made with the 
char
c
.
The comparison is performed with 
operator<,
which is all that is necessary for the 
multiset,
since in this example the default 
less
comparison object is used. The class 
Xgen
is used to randomly generate 
X
objects, but the comparison value is restricted to the span from 
‘A’
to ‘
E’.
In 
main( ),
a 
multiset<X>
is created and filled with 25 
X
objects using 
Xgen,
guaranteeing that there will be duplicate keys. So that we know what the unique
values are, a regular 
set<X>
is created from the 
multiset
(using the 
iterator,
iterator
constructor). These values are displayed, then each one is used to produce the 
equal_range( )
in the 
multiset
(
equal_range( )
has the same meaning here as it does with 
multimap:
all the elements with matching keys). Each set of matching keys is then printed.
 As
a second example, a (possibly) more elegant version of 
WordCount.cpp
can be created using 
multiset: 
//: C20:MultiSetWordCount.cpp
//{L} StreamTokenizer
// Count occurrences of words using a multiset
#include "StreamTokenizer.h"
#include "../require.h"
#include <string>
#include <set>
#include <fstream>
#include <iterator>
using namespace std;
int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  StreamTokenizer words(in);
  multiset<string> wordmset;
  string word;
  while((word = words.next()).size() != 0)
    wordmset.insert(word);
  typedef multiset<string>::iterator MSit;
  MSit it = wordmset.begin();
  while(it != wordmset.end()) {
    pair<MSit, MSit> p=wordmset.equal_range(*it);
    int count = distance(p.first, p.second);
    cout << *it << ": " << count << endl;
    it = p.second; // Move to the next word
  }The
setup in 
main( )
is identical to 
WordCount.cpp,
but then each word is simply inserted into the 
multiset<string>.
An iterator is created and initialized to the beginning of the 
multiset;
dereferencing this iterator produces the current word. 
equal_range( )
produces the starting and ending iterators of the word that’s currently
selected, and the STL algorithm 
distance( )
(which is in 
<iterator>)
is used to count the number of elements in that range. Then the iterator 
it
is
moved forward to the end of the range, which puts it at the next word. Although
if you’re unfamiliar with the 
multiset
this code can seem more complex, the density of it and the lack of need for
supporting classes like 
Count
has a lot of appeal.
 In
the end, is this really a “set,” or should it be called something
else? An alternative is the generic “bag” that has been defined in
some container libraries, since a bag holds anything at all without
discrimination – including duplicate objects. This is close, but it
doesn’t quite fit since a bag has no specification about how elements
should be ordered, while a 
multiset
(which requires that all duplicate elements be adjacent to each other) is even
more restrictive than the concept of a set, which could use a hashing function
to order its elements, in which case they would not be in sorted order.
Besides, if you wanted to store a bunch of objects without any special
criterions, you’d probably just use a 
vector,
deque
or 
list.
 
  
	
		
		  Contact: webmaster@codeguru.com
		  
		  CodeGuru - the website for developers.