The basic object
Step one is exactly that. C++ functions can be placed inside structs as “member functions.” Here’s what it looks like after converting the C version of CStash to the C++ Stash:
//: C04:CppLib.h
// C-like library converted to C++
struct Stash {
  int size;      // Size of each space
  int quantity;  // Number of storage spaces
  int next;      // Next empty space
  // Dynamically allocated array of bytes:
  unsigned char* storage;
  // Functions!
  void initialize(int size);
  void cleanup();
  int add(const void* element);
  void* fetch(int index);
  int count();
  void inflate(int increase);
};
First, notice there is no typedef. Instead of requiring you to create a typedef, the C++ compiler turns the name of the structure into a new type name for the program (just as int, char, float and double are type names).
All the data members are exactly the same as before, but now the functions are inside the body of the struct. In addition, notice that the first argument from the C version of the library has been removed. In C++, instead of forcing you to pass the address of the structure as the first argument to all the functions that operate on that structure, the compiler secretly does this for you. Now the only arguments for the functions are concerned with what the function does, not the mechanism of the function’s operation.
It’s important to realize that the function code is effectively the same as it was with the C version of the library. The number of arguments is the same (even though you don’t see the structure address being passed in, it’s still there), and there’s only one function body for each function. That is, just because you say
Stash A, B, C;
doesn’t mean you get a different add( ) function for each variable.
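
A small sketch can make the “one function body” point concrete. The Counter and PlainData types below are hypothetical and not part of the Stash library; the sketch assumes a typical implementation, where ordinary member functions add nothing to the size of each variable.

```cpp
#include <cassert>

// Hypothetical illustration types (not from the Stash library).
struct PlainData {
  int n;
};

struct Counter {
  int n;
  void reset() { n = 0; }
  void bump() { n++; }
};

// Each variable carries its own data; bump() exists once and is
// handed the address of whichever variable it was called for.
int bumpFirstOnly() {
  Counter a, b;
  a.reset();
  b.reset();
  a.bump();
  a.bump();
  return a.n * 10 + b.n;  // Combines a.n and b.n into one number
}
```

Calling bumpFirstOnly( ) changes only a, even though a and b use the same bump( ) code. On common compilers sizeof(Counter) also equals sizeof(PlainData), which suggests that the functions are not stored per variable.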
So the code that’s generated is almost identical to what you would have written for the C version of the library. Interestingly enough, this includes the “name decoration” you probably would have done to produce Stash_initialize( ), Stash_cleanup( ), and so on. When the function name is inside the struct, the compiler effectively does the same thing. Therefore, initialize( ) inside the structure Stash will not collide with a function named initialize( ) inside any other structure, or even a global function named initialize( ).
Most of the time you don’t have to worry about the function name decoration; you use the undecorated name. But sometimes you do need to be able to specify that this initialize( ) belongs to the struct Stash, and not to any other struct. In particular, when you’re defining the function you need to fully specify which one it is. To accomplish this full specification, C++ has an operator (::) called the scope resolution operator (named so because names can now be in different scopes: at global scope, or within the scope of a struct). For example, if you want to specify initialize( ) which belongs to Stash, you say Stash::initialize(int size). You can see how the scope resolution operator is used in the function definitions:
//: C04:CppLib.cpp {O}
// C library converted to C++
// Declare structure and functions:
#include "CppLib.h"
#include <iostream>
#include <cassert>
using namespace std;
// Quantity of elements to add
// when increasing storage:
const int increment = 100;

void Stash::initialize(int sz) {
  size = sz;
  quantity = 0;
  storage = 0;
  next = 0;
}

int Stash::add(const void* element) {
  if(next >= quantity) // Enough space left?
    inflate(increment);
  // Copy element into storage,
  // starting at next empty space:
  int startBytes = next * size;
  unsigned char* e = (unsigned char*)element;
  for(int i = 0; i < size; i++)
    storage[startBytes + i] = e[i];
  next++;
  return(next - 1); // Index number
}

void* Stash::fetch(int index) {
  // Check index boundaries:
  assert(0 <= index);
  if(index >= next)
    return 0; // Past the last element: signal the end
  // Produce pointer to desired element:
  return &(storage[index * size]);
}

int Stash::count() {
  return next; // Number of elements in Stash
}

void Stash::inflate(int increase) {
  assert(increase > 0);
  int newQuantity = quantity + increase;
  int newBytes = newQuantity * size;
  int oldBytes = quantity * size;
  unsigned char* b = new unsigned char[newBytes];
  for(int i = 0; i < oldBytes; i++)
    b[i] = storage[i]; // Copy old to new
  delete []storage; // Old storage
  storage = b; // Point to new memory
  quantity = newQuantity;
}

void Stash::cleanup() {
  if(storage != 0) {
    cout << "freeing storage" << endl;
    delete []storage;
  }
}
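
To see the scope resolution operator keep same-named functions apart, here is a minimal sketch. Alpha, Beta and globalValue are hypothetical names invented for this illustration; they do not appear in the Stash library.

```cpp
#include <cassert>

// Hypothetical names, chosen only to show three functions
// called initialize() living in three different scopes.
struct Alpha {
  int value;
  void initialize();
};

struct Beta {
  int value;
  void initialize();
};

int globalValue = 0;
void initialize() { globalValue = 1; }  // Global scope

// The :: operator says which initialize() is being defined:
void Alpha::initialize() { value = 10; }
void Beta::initialize() { value = 20; }
```

Given an Alpha a and a Beta b, the calls a.initialize( ), b.initialize( ) and the plain initialize( ) each reach a different function body, with no name collision.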
There are several other things that are different between C and C++. First, the declarations in the header files are required by the compiler. In C++ you cannot call a function without declaring it first. The compiler will issue an error message otherwise. This is an important way to ensure that function calls are consistent between the point where they are called and the point where they are defined. By forcing you to declare the function before you call it, the C++ compiler virtually ensures you will perform this declaration by including the header file. If you also include the same header file in the place where the functions are defined, then the compiler checks to make sure that the declaration in the header and the function definition match up. This means that the header file becomes a validated repository for function declarations and ensures that functions are used consistently throughout all translation units in the project.
Of course, global functions can still be declared by hand every place where they are defined and used. (This is so tedious that it becomes very unlikely.) However, structures must always be declared before they are defined or used, and the most convenient place to put a structure definition is in a header file, except for those you intentionally hide in a file.
You can see that all the member functions look almost the same as when they were C functions, except for the scope resolution and the fact that the first argument from the C version of the library is no longer explicit. It’s still there, of course, because the function has to be able to work on a particular struct variable. But notice, inside the member function, that the member selection is also gone! Thus, instead of saying s->size = sz; you say size = sz; and eliminate the tedious s->, which didn’t really add anything to the meaning of what you were doing anyway. The C++ compiler is apparently doing this for you. Indeed, it is taking the “secret” first argument (the address of the structure that we were previously passing in by hand) and applying the member selector whenever you refer to one of the data members of a struct.
This means that whenever you are inside the member function of another struct, you can refer to any member (including another member function) by simply giving its name. The compiler will search through the local structure’s names before looking for a global version of that name. You’ll find that this feature means that not only is your code easier to write, it’s a lot easier to read.
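
The lookup order described above can be sketched as follows. Box and tally( ) are hypothetical names used only for illustration. The leading :: in globalTally( ) is the scope resolution operator with nothing on its left, which asks for the global name.

```cpp
#include <cassert>

// Hypothetical example: a global function and a member function
// that share the name tally().
int tally() { return -1; }

struct Box {
  int items;
  int tally() { return items; }
  // Unqualified tally() finds Box::tally() first:
  int memberTally() { return tally(); }
  // A leading :: asks for the global version explicitly:
  int globalTally() { return ::tally(); }
};
```

With items set to 3, memberTally( ) yields the member’s result, while globalTally( ) reaches past the struct to the global function.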
But what if, for some reason, you want to be able to get your hands on the address of the structure? In the C version of the library it was easy because each function’s first argument was a CStash* called s. In C++, things are even more consistent. There’s a special keyword, called this, which produces the address of the struct. It’s the equivalent of the ‘s’ in the C version of the library. So we can revert to the C style of things by saying
this->size = sz;
The code generated by the compiler is exactly the same, so you don’t need to use this in such a fashion. Occasionally, you’ll see code where people explicitly use this-> everywhere, but it doesn’t add anything to the meaning of the code and often indicates an inexperienced programmer. Usually, you don’t use this very often, but when you need it, it’s there (some of the examples later in the book will use this).
There’s one last item to mention. In C, you could assign a void* to any other pointer like this:
int i = 10;
void* vp = &i; // OK in both C and C++
int* ip = vp; // Only acceptable in C
and there was no complaint from the compiler. But in C++, this statement is not allowed. Why? Because C is not so particular about type information, so it allows you to assign a pointer with an unspecified type to a pointer with a specified type. Not so with C++. Type is critical in C++, and the compiler stamps its foot when there are any violations of type information. This has always been important, but it is especially important in C++ because you have member functions in structs. If you could pass pointers to structs around with impunity in C++, then you could end up calling a member function for a struct that doesn’t even logically exist for that struct! A real recipe for disaster. Therefore, while C++ allows the assignment of any type of pointer to a void* (this was the original intent of void*, which is required to be large enough to hold a pointer to any type), it will not allow you to assign a void pointer to any other type of pointer. A cast is always required, to tell the reader and the compiler that you really do want to treat it as the destination type.
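
Here is a minimal sketch of the required cast. The function name readThroughVoid( ) is hypothetical. The sketch uses static_cast as one way of spelling the conversion; the listings in this chapter use the older C-style form, (int*)vp, which has the same effect here.

```cpp
#include <cassert>

// Round-trips an int's address through void* and back.
int readThroughVoid(int value) {
  int i = value;
  void* vp = &i;                    // Implicit: any object pointer -> void*
  // int* ip = vp;                  // Error in C++: no implicit void* -> int*
  int* ip = static_cast<int*>(vp);  // The cast states the intent
  return *ip;
}
```

Uncommenting the middle line makes a C++ compiler reject the program, while a C compiler would accept the equivalent statement.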
This brings up an interesting issue. One of the important goals for C++ is to compile as much existing C code as possible to allow for an easy transition to the new language. However, this doesn’t mean any code that C allows will automatically be allowed in C++. There are a number of things the C compiler lets you get away with that are dangerous and error-prone. (We’ll look at them as the book progresses.) The C++ compiler generates warnings and errors for these situations. This is often much more of an advantage than a hindrance. In fact, there are many situations where you are trying to run down an error in C and just can’t find it, but as soon as you recompile the program in C++, the compiler points out the problem! In C, you’ll often find that you can get the program to compile, but then you have to get it to work. In C++, when the program compiles correctly, it often works, too! This is because the language is a lot stricter about type.
You can see a number of new things in the way the C++ version of Stash is used, in the following test program:
//: C04:CppLibTest.cpp
//{L} CppLib
// Test of C++ library
#include "CppLib.h"
#include "../require.h"
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main() {
  Stash intStash;
  intStash.initialize(sizeof(int));
  for(int i = 0; i < 100; i++)
    intStash.add(&i);
  for(int j = 0; j < intStash.count(); j++)
    cout << "intStash.fetch(" << j << ") = "
         << *(int*)intStash.fetch(j)
         << endl;
  // Holds 80-character strings:
  Stash stringStash;
  const int bufsize = 80;
  stringStash.initialize(sizeof(char) * bufsize);
  ifstream in("CppLibTest.cpp");
  assure(in, "CppLibTest.cpp");
  string line;
  while(getline(in, line)) {
    // add() always copies bufsize bytes, so stage each line in a
    // zero-filled buffer of that size (longer lines are truncated):
    char buf[bufsize] = {0};
    line.copy(buf, bufsize - 1);
    stringStash.add(buf);
  }
  int k = 0;
  char* cp;
  while((cp =(char*)stringStash.fetch(k++)) != 0)
    cout << "stringStash.fetch(" << k << ") = "
         << cp << endl;
  intStash.cleanup();
  stringStash.cleanup();
}
One thing you’ll notice is that the variables are all defined “on the fly” (as introduced in the previous chapter). That is, they are defined at any point in the scope, rather than being restricted, as in C, to the beginning of the scope.
The code is quite similar to CLibTest.cpp, but when a member function is called, the call occurs using the member selection operator ‘.’ preceded by the name of the variable. This is a convenient syntax because it mimics the selection of a data member of the structure. The difference is that this is a function member, so it has an argument list.
Of course, the call that the compiler actually generates looks much more like the original C library function. Thus, considering name decoration and the passing of this, the C++ function call intStash.initialize(sizeof(int)) becomes something like Stash_initialize(&intStash, sizeof(int)). If you ever wonder what’s going on underneath the covers, remember that the original C++ compiler cfront from AT&T produced C code as its output, which was then compiled by the underlying C compiler. This approach meant that cfront could be quickly ported to any machine that had a C compiler, and it helped to rapidly disseminate C++ compiler technology. But because the C++ compiler had to generate C, you know that there must be some way to represent C++ syntax in C.
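
The translation can be sketched by hand. MiniStash is a hypothetical, cut-down struct, and MiniStash_initialize( ) is an illustrative C-style counterpart written for this sketch; real compilers choose their own decorated names and need not generate anything that looks like this.

```cpp
#include <cassert>

// Hypothetical cut-down struct; only the fields needed here.
struct MiniStash {
  int size;
  int next;
  void initialize(int sz) {
    this->size = sz;  // Same as writing: size = sz;
    next = 0;
  }
};

// Hand-written C-style equivalent. The name and signature are
// illustrative only; real compilers pick their own decorated names.
void MiniStash_initialize(MiniStash* s, int sz) {
  s->size = sz;
  s->next = 0;
}
```

Calling a.initialize(sizeof(int)) on one variable and MiniStash_initialize(&b, sizeof(int)) on another leaves both in the same state. The only difference is whether the address is passed implicitly as this or explicitly as s.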
You’ll also notice an additional cast in
while((cp = (char*)stringStash.fetch(k++)) != 0)
This is due again to the stricter type checking in C++.
There’s one other change from CLibTest.cpp, which is the introduction of the require.h header file. This is a header file which I created for this book to perform more sophisticated error checking than that provided by assert( ). It contains several functions, including the one used here called assure( ), which is used for files. This function checks to see if the file has successfully been opened, and if not it reports to standard error that the file could not be opened (thus it needs the name of the file as the second argument) and exits the program. The require.h functions will be used throughout the book, in particular to ensure that there are the right number of command-line arguments and that files are opened properly. The require.h functions replace repetitive and distracting error-checking code, and yet they provide essentially useful error messages. These functions will be fully explained later in the book.
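
Until then, the behavior described above can be approximated with a short stand-in. The function below, assureSketch( ), is a hypothetical name and a guess based only on the description in this section; it is not the code in require.h, and its message text and exit status are assumptions.

```cpp
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical stand-in, NOT the book's require.h code: it only
// follows the behavior described in the text.
inline void assureSketch(std::ifstream& in,
                         const std::string& filename = "") {
  if(!in) {  // The stream is in a failed state if the open failed
    std::cerr << "Could not open file " << filename << std::endl;
    std::exit(1);
  }
}
```

A call such as assureSketch(in, "CppLibTest.cpp") returns quietly when the file opened, and otherwise reports the file name to standard error and ends the program.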