Bruce Eckel's Thinking in C++, 2nd Ed Contents | Prev | Next

Elimination of the definition block

In C, you must always define all the variables at the beginning of a block, after the opening brace. This is not an uncommon requirement in programming languages, and the reason given has often been that it’s “good programming style.” On this point, I have my suspicions. It has always seemed inconvenient to me, as a programmer, to pop back to the beginning of a block every time I need a new variable. I also find code more readable when the variable definition is close to its point of use.

Perhaps these arguments are stylistic. In C++, however, there’s a significant problem in being forced to define all objects at the beginning of a scope. If a constructor exists, it must be called when the object is created. However, if the constructor takes one or more initialization arguments, how do you know you will have that initialization information at the beginning of a scope? In the general programming situation, you won’t. Because C has no concept of private, this separation of definition and initialization is no problem. However, C++ guarantees that when an object is created, it is simultaneously initialized. This ensures you will have no uninitialized objects running around in your system. C doesn’t care; in fact, C encourages this practice by requiring you to define variables at the beginning of a block before you necessarily have the initialization information.

Generally C++ will not allow you to create an object before you have the initialization information for the constructor. As a result, you can’t be forced to define variables at the beginning of a scope. In fact, the style of the language would seem to encourage the definition of an object as close to its point of use as possible. In C++, any rule that applies to an “object” automatically refers to an object of a built-in type, as well. This means that any class object or variable of a built-in type can also be defined at any point in a scope. It also means that you can wait until you have the information for a variable before defining it, so you can always define and initialize at the same time:

//: C06:DefineInitialize.cpp
// Defining variables anywhere
#include "../require.h"
#include <iostream>
#include <string>
using namespace std;

class G {
  int i;
public:
  G(int ii);
};

G::G(int ii) { i = ii; }

int main() {
  cout << "initialization value? ";
  int retval = 0;
  cin >> retval;
  require(retval != 0);
  int y = retval + 3;
  G g(y);
} ///:~

You can see that some code is executed, then retval is defined, initialized and used to capture user input, then y and g are defined. C, on the other hand, would never allow a variable to be defined anywhere except at the beginning of the scope.

Generally, you should define variables as close to their point of use as possible, and always initialize them when they are defined. (This is a stylistic suggestion for built-in types, where initialization is optional.) This is a safety issue. By reducing the duration of the variable’s availability within the scope, you are reducing the chance it will be misused in some other part of the scope. In addition, readability is improved because the reader doesn’t have to jump back and forth to the beginning of the scope to know the type of a variable.

for loops

In C++, you will often see a for loop counter defined right inside the for expression:

for(int j = 0; j < 100; j++) {
    cout << "j = " << j << endl;
}
for(int i = 0; i < 100; i++)
cout << "i = " << i << endl;

The above statements are important special cases, which cause confusion to new C++ programmers.

The variables i and j are defined directly inside the for expression (which you cannot do in C). They are then available for use in the for loop. It’s a very convenient syntax because the context removes all question about the purpose of i and j, so you don’t need to use such ungainly names as i_loop_counter for clarity.

However, some confusion may result if you expect lifetime of the variables i and j to extend beyond the scope of the for loop – they do not [27].

Chapter 3 points out that while and switch statements also allow the definition of objects in their control expressions, although this usage seems far less important than with the for loop.

Watch out for local variables that hide variables in the enclosing scope. In general, using the same name for a nested variable as a varable global to that scope is confusing and error prone [28].

I find small scopes an indicator of good design. If you have several pages for a single function, perhaps you’re trying to do too much with that function. More granular functions are not only more useful, but it’s also easier to find bugs.

Storage allocation

A variable can now be defined at any point in a scope, so it might seem that the storage for a variable may not be defined until its point of definition. It’s actually more likely that the compiler will follow the practice in C of allocating all the storage for a scope at the opening brace of that scope. It doesn’t matter because, as a programmer, you can’t access the storage (a.k.a. the object) until it has been defined [29]. Although the storage is allocated at the beginning of the block, the constructor call doesn’t happen until the sequence point where the object is defined because the identifier isn’t available until then. The compiler even checks to make sure you don’t put the object definition (and thus the constructor call) where the sequence point only conditionally passes through it, such as in a switch statement or somewhere a goto can jump past it. Uncommenting the statements in the following code will generate a warning or an error:

//: C06:Nojump.cpp
// Can't jump past constructors

class X {
public:
  X();
};

X::X() {}

void f(int i) {
  if(i < 10) {
   //! goto jump1; // Error: goto bypasses init
  }
  X x1;  // Constructor called here
 jump1:
  switch(i) {
    case 1 :
      X x2;  // Constructor called here
      break;
  //! case 2 : // Error: case bypasses init
      X x3;  // Constructor called here
      break;
  }
} 

int main() {
  f(9);
  f(11);
}///:~

In the above code, both the goto and the switch can potentially jump past the sequence point where a constructor is called. That object will then be in scope even if the constructor hasn’t been called, so the compiler gives an error message. This once again guarantees that an object cannot be created unless it is also initialized.

All the storage allocation discussed here happens, of course, on the stack. The storage is allocated by the compiler by moving the stack pointer “down” (a relative term, which may indicate an increase or decrease of the actual stack pointer value, depending on your machine). Objects can also be allocated on the heap using new, which is something we’ll explore further in Chapter XX.


[27] An earlier iteration of the C++ draft standard said the variable lifetime extended to the end of the scope that enclosed the for loop. Some compilers still implement that, but it is not correct so your code will only be portable if you limit the scope to the for loop.

[28] The Java language considered this such a bad idea that it flags such code as an error.

[29] OK, you probably could by fooling around with pointers, but you’d be very, very bad.

Contents | Prev | Next


Contact: webmaster@codeguru.com
CodeGuru - the website for developers.
[an error occurred while processing this directive]