Elimination of the definition block
In C (prior to the C99 standard), you must always define all the variables at the beginning of a block, after the
opening brace. This is not an uncommon requirement in programming languages,
and the reason given has often been that it’s “good programming
style.” On this point, I have my suspicions. It has always seemed
inconvenient to me, as a programmer, to pop back to the beginning of a block
every time I need a new variable. I also find code more readable when the
variable definition is close to its point of use. Perhaps
these arguments are stylistic. In C++, however, there’s a significant
problem in being forced to define all objects at the beginning of a scope. If a
constructor exists, it must be called when the object is created. However, if
the constructor takes one or more initialization arguments, how do you know you
will have that initialization information at the beginning of a scope? In the
general programming situation, you won’t. Because C has no concept of private,
this separation of definition and initialization is no problem. However, C++
guarantees that when an object is created, it is simultaneously initialized.
This ensures you will have no uninitialized objects running around in your
system. C doesn’t care; in fact, C encourages this practice by requiring you
to define variables at the beginning of a block
before you necessarily have the initialization information.
Generally, C++ will not allow you to create an object before you have the initialization
information for the constructor. As a result, you can’t be forced to
define variables at the beginning of a scope. In fact, the style of the
language would seem to encourage the definition of an object as close to its
point of use as possible. In C++, any rule that applies to an
“object” automatically refers to an object of a built-in type, as
well. This means that any class object or variable of a built-in type can also
be defined at any point in a scope. It also means that you can wait until you
have the information for a variable before defining it, so you can always
define and initialize at the same time:
//: C06:DefineInitialize.cpp
// Defining variables anywhere
#include "../require.h"
#include <iostream>
#include <string>
using namespace std;

class G {
  int i;
public:
  G(int ii);
};

G::G(int ii) { i = ii; }

int main() {
  cout << "initialization value? ";
  int retval = 0;
  cin >> retval;
  require(retval != 0);
  int y = retval + 3;
  G g(y);
} ///:~
You can see that some code is executed, then retval is defined, initialized
and used to capture user input, then y and g are defined. C (before C99), on
the other hand, would not allow a variable to be defined anywhere except at
the beginning of the scope.
Generally, you should define variables as close to their point of use as possible, and
always initialize them when they are defined. (This is a stylistic suggestion
for built-in types, where initialization is optional.) Defining variables late is a safety issue.
By reducing the duration of the variable’s availability within the scope,
you are reducing the chance it will be misused in some other part of the scope.
In addition, readability is improved because the reader doesn’t have to
jump back and forth to the beginning of the scope to know the type of a variable.
for loops
In C++, you will often see a for loop counter defined right inside the for expression:
for(int j = 0; j < 100; j++) {
  cout << "j = " << j << endl;
}
for(int i = 0; i < 100; i++)
  cout << "i = " << i << endl;
The above statements are important special cases, which cause confusion to new C++
programmers.
The variables i and j are defined directly inside the for expression (which you
cannot do in C before C99). They are then available for use in the for
loop. It’s a very convenient syntax because the context removes all
question about the purpose of i and j, so you don’t need to use such ungainly
names as i_loop_counter for clarity.
However, some confusion may result if you expect the lifetime of the variables
i and j to extend beyond the scope of the for loop; they do not [27]. Chapter
3 points out that while and switch statements also allow the definition of
objects in their control expressions, although this usage seems far less
important than with the for loop.
Watch out for local variables that hide variables in the enclosing scope. In general,
using the same name for a nested variable as a variable global to that scope is
confusing and error prone [28]. I
find small scopes an indicator of good design. If you have several pages for a
single function, perhaps you’re trying to do too much with that function.
More granular functions are not only more useful, they also make it easier to
find bugs.
Storage allocation
A variable can now be defined at any point in a scope, so it might seem that the
storage for a variable may not be allocated until its point of definition.
It’s actually more likely that the compiler will follow the practice in C
of allocating all the storage for a scope at the opening brace of that scope.
It doesn’t matter because, as a programmer, you can’t access the
storage (a.k.a. the object) until it has been defined [29].
Although the storage is allocated
at the beginning of the block, the constructor call doesn’t happen until
the sequence point where the object is defined because the identifier
isn’t available until then. The compiler even checks to make sure you
don’t put the object definition (and thus the constructor call) where
execution only conditionally passes through it, such as in a switch
statement or somewhere a goto can
jump past it. Uncommenting the statements in the following code will generate a
warning or an error:
//: C06:Nojump.cpp
// Can't jump past constructors

class X {
public:
  X();
};

X::X() {}

void f(int i) {
  if(i < 10) {
    //! goto jump1; // Error: goto bypasses init
  }
  X x1;  // Constructor called here
 jump1:
  switch(i) {
    case 1 :
      X x2;  // Constructor called here
      break;
  //! case 2 : // Error: case bypasses init
      X x3;  // Constructor called here
      break;
  }
}

int main() {
  f(9);
  f(11);
} ///:~
In the above code, both the goto and the switch
can potentially jump past the sequence point where a constructor is called.
That object will then be in scope even if the constructor hasn’t been
called, so the compiler gives an error message. This once again guarantees that
an object cannot be created unless it is also initialized.
All the storage allocation discussed here happens, of course, on the stack.
The storage is allocated by the compiler by moving the stack pointer
“down” (a relative term, which may indicate an increase or decrease
of the actual stack pointer value,
depending on your machine). Objects can also be allocated on the heap using
new, which is something we’ll explore further in Chapter XX.
[27] An earlier iteration of the C++ draft standard said the variable lifetime
extended to the end of the scope that enclosed the for
loop. Some compilers still implement that, but it is not correct so your code
will only be portable if you limit the scope to the for loop.
[28] The Java language considered this such a bad idea that it flags such code as an
error.
[29] OK, you probably could by fooling around with pointers, but you’d be
very, very bad.