CodeGuru : Thinking in C++

Overloading new & delete

When you create a new-expression, two things occur: First, storage is allocated using the operator new , then the constructor is called. In a delete-expression, the destructor is called, then storage is deallocated using the operator delete . The constructor and destructor calls are never under your control (otherwise you might accidentally subvert them), but you can change the storage allocation functions operator new and operator delete .

The memory allocation system used by new and delete is designed for general-purpose use. In special situations, however, it doesn’t serve your needs. The most common reason to change the allocator is efficiency: You might be creating and destroying so many objects of a particular class that it has become a speed bottleneck. C++ allows you to overload new and delete to implement your own storage allocation scheme, so you can handle problems like this.

Another issue is heap fragmentation: By allocating objects of different sizes it’s possible to break up the heap so that you effectively run out of storage. That is, the storage might be available, but because of fragmentation no piece is big enough to satisfy your needs. By creating your own allocator for a particular class, you can ensure this never happens.

In embedded and real-time systems, a program may have to run for a very long time with restricted resources. Such a system may also require that memory allocation always take the same amount of time, and there’s no allowance for heap exhaustion or fragmentation. A custom memory allocator is the solution; otherwise programmers will avoid using new and delete altogether in such cases and miss out on a valuable C++ asset.

When you overload operator new and operator delete , it’s important to remember that you’re changing only the way raw storage is allocated . The compiler will simply call your new instead of the default version to allocate storage, then call the constructor for that storage. So, although the compiler allocates storage and calls the constructor when it sees new, all you can change when you overload new is the storage allocation portion. ( delete has a similar limitation.)

When you overload operator new, you also replace the behavior when it runs out of memory, so you must decide what to do in your operator new : return zero, write a loop to call the new-handler and retry allocation, or (typically) throw a bad_alloc exception (discussed in Chapter XX).

Overloading new and delete is like overloading any other operator. However, you have a choice of overloading the global allocator or using a different allocator for a particular class.

Overloading global new & delete

This is the drastic approach, when the global versions of new and delete are unsatisfactory for the whole system. If you overload the global versions, you make the defaults completely inaccessible – you can’t even call them from inside your redefinitions.

The overloaded new must take an argument of size_t (the Standard C standard type for sizes). This argument is generated and passed to you by the compiler and is the size of the object you’re responsible for allocating. You must return a pointer either to an object of that size (or bigger, if you have some reason to do so), or to zero if you can’t find the memory (in which case the constructor is not called!). However, if you can’t find the memory, you should probably do something more drastic than just returning zero, like calling the new-handler or throwing an exception, to signal that there’s a problem.

The return value of operator new is a void*, not a pointer to any particular type. All you’ve done is produce memory, not a finished object – that doesn’t happen until the constructor is called, an act the compiler guarantees and which is out of your control.

The operator delete takes a void* to memory that was allocated by operator new . It’s a void* because you get that pointer after the destructor is called, which removes the object-ness from the piece of storage. The return type is void.

Here’s a very simple example showing how to overload the global new and delete:

//: C13:GlobalNew.cpp
// Overload global new/delete
#include <cstdio>
#include <cstdlib>
using namespace std;

void* operator new(size_t sz) {
  printf("operator new: %d Bytes\n", sz);
  void* m = malloc(sz);
  if(!m) puts("out of memory");
  return m;
}

void operator delete(void* m) {
  puts("operator delete");
  free(m);
}

class S {
  int i[100];
public:
  S() { puts("S::S()"); }
  ~S() { puts("S::~S()"); }
};

int main() {
  puts("creating & destroying an int");
  int* p = new int(47);
  delete p;
  puts("creating & destroying an s");
  S* s = new S;
  delete s;
  puts("creating & destroying S[3]");
  S* sa = new S[3];
  delete []sa;

} ///:~

Here you can see the general form for overloading new and delete. These use the Standard C library functions malloc( ) and free( ) for the allocators (which is probably what the default new and delete use, as well!). However, they also print out messages about what they are doing. Notice that printf( ) and puts( ) are used rather than iostreams. Thus, when an iostream object is created (like the global cin, cout, and cerr), they call new to allocate memory. With printf( ), you don’t get into a deadlock because it doesn’t call new to initialize itself.

In main( ), objects of built-in types are created to prove that the overloaded new and delete are also called in that case. Then a single object of type s is created, followed by an array. For the array, you’ll see that extra memory is requested to put information about the number of objects in the array. In all cases, the global overloaded versions of new and delete are used.

Overloading new & delete for a class

Although you don’t have to explicitly say static, when you overload new and delete for a class, you’re creating static member functions. Again, the syntax is the same as overloading any other operator. When the compiler sees you use new to create an object of your class, it chooses the member operator new over the global version. However, the global versions of new and delete are used for all other types of objects (unless they have their own new and delete).

In the following example, a very primitive storage allocation system is created for the class Framis. A chunk of memory is set aside in the static data area at program start-up, and that memory is used to allocate space for objects of type Framis. To determine which blocks have been allocated, a simple array of bytes is used, one byte for each block:

//: C13:Framis.cpp
// Local overloaded new & delete
#include <cstddef> // Size_t
#include <fstream>
#include <new>
using namespace std;
ofstream out("Framis.out");

class Framis {
  static const int sz = 10;
  char c[sz]; // To take up space, not used
  static unsigned char pool[];
  static unsigned char alloc_map[];
public:
  static const int psize = 100;  // frami allowed
  Framis() { out << "Framis()\n"; }
  ~Framis() { out << "~Framis() ... "; }
  void* operator new(size_t) throw(bad_alloc);
  void operator delete(void*);
};
unsigned char Framis::pool[psize * sizeof(Framis)];
unsigned char Framis::alloc_map[psize] = {0};

// Size is ignored -- assume a Framis object
void* Framis::operator new(size_t) 
  throw(bad_alloc) {
  for(int i = 0; i < psize; i++)
    if(!alloc_map[i]) {
      out << "using block " << i << " ... ";
      alloc_map[i] = 1; // Mark it used
      return pool + (i * sizeof(Framis));
    }
  out << "out of memory" << endl;
  throw bad_alloc();
}

void Framis::operator delete(void* m) {
  if(!m) return; // Check for null pointer
  // Assume it was created in the pool
  // Calculate which block number it is:
  unsigned long block = (unsigned long)m
    - (unsigned long)pool;
  block /= sizeof(Framis);
  out << "freeing block " << block << endl;
  // Mark it free:
  alloc_map[block] = 0;
}

int main() {
  Framis* f[Framis::psize];
  for(int i = 0; i < Framis::psize; i++)
    f[i] = new Framis;
  new Framis; // Out of memory
  delete f[10];
  f[10] = 0;
  // Use released memory:
  Framis* x = new Framis;
  delete x;
  for(int j = 0; j < Framis::psize; j++)
    delete f[j]; // Delete f[10] OK

} ///:~

The pool of memory for the Framis heap is created by allocating an array of bytes large enough to hold psize Framis objects. The allocation map is psize bytes long, so there’s one byte for every block. All the bytes in the allocation map are initialized to zero using the aggregate initialization trick of setting the first element to zero so the compiler automatically initializes all the rest.

The local operator new has the same form as the global one. All it does is search through the allocation map looking for a zero byte, then sets that byte to one to indicate it’s been allocated and returns the address of that particular block. If it can’t find any memory, it issues a message and returns zero (Notice that the new-handler is not called and no exceptions are thrown because the behavior when you run out of memory is now under your control.) In this example, it’s OK to use iostreams because the global operator new and delete are untouched.

The operator delete assumes the Framis address was created in the pool. This is a fair assumption, because the local operator new will be called whenever you create a single Framis object on the heap – but not an array. Global new is used in that case. So the user might accidentally have called operator delete without using the empty bracket syntax to indicate array destruction. This would cause a problem. Also, the user might be deleting a pointer to an object created on the stack. If you think these things could occur, you might want to add a line to make sure the address is within the pool and on a correct boundary.

operator delete calculates which block in the pool this pointer represents, and then sets the allocation map’s flag for that block to zero to indicate the block has been released.

In main( ), enough Framis objects are dynamically allocated to run out of memory; this checks the out-of-memory behavior. Then one of the objects is freed, and another one is created to show that the released memory is reused.

Because this allocation scheme is specific to Framis objects, it’s probably much faster than the general-purpose memory allocation scheme used for the default new and delete.

Overloading new & delete for arrays

If you overload operator new and delete for a class, those operators are called whenever you create an object of that class. However, if you create an array of those class objects, the global operator new is called to allocate enough storage for the array all at once, and the global operator delete is called to release that storage. You can control the allocation of arrays of objects by overloading the special array versions of operator new[ ] and operator delete[ ] for the class. Here’s an example that shows when the two different versions are called:

//: C13:ArrayNew.cpp
// Operator new for arrays
#include <new> // Size_t definition
#include <fstream>
using namespace std;
ofstream trace("ArrayNew.out");

class Widget {
  static const int sz = 10;
  int i[sz];
public:
  Widget() { trace << "*"; }
  ~Widget() { trace << "~"; }
  void* operator new(size_t sz) {
    trace << "Widget::new: "
         << sz << " bytes" << endl;
    return ::new char[sz];
  }
  void operator delete(void* p) {
    trace << "Widget::delete" << endl;
    ::delete []p;
  }
  void* operator new[](size_t sz) {
    trace << "Widget::new[]: "
         << sz << " bytes" << endl;
    return ::new char[sz];
  }
  void operator delete[](void* p) {
    trace << "Widget::delete[]" << endl;
    ::delete []p;
  }
};

int main() {
  trace << "new Widget" << endl;
  Widget* w = new Widget;
  trace << "\ndelete Widget" << endl;
  delete w;
  trace << "\nnew Widget[25]" << endl;
  Widget* wa = new Widget[25];
  trace << "\ndelete []Widget" << endl;
  delete []wa;

} ///:~

Here, the global versions of new and delete are called so the effect is the same as having no overloaded versions of new and delete except that trace information is added. Of course, you can use any memory allocation scheme you want in the overloaded new and delete.

You can see that the array versions of new and delete are the same as the individual-object versions with the addition of the brackets. In both cases you’re handed the size of the memory you must allocate. The size handed to the array version will be the size of the entire array. It’s worth keeping in mind that the only thing the overloaded operator new is required to do is hand back a pointer to a large enough memory block. Although you may perform initialization on that memory, normally that’s the job of the constructor that will automatically be called for your memory by the compiler.

The constructor and destructor simply print out characters so you can see when they’ve been called. Here’s what the trace file looks like for one compiler:

new Widget
Widget::new: 20 bytes
*
delete Widget
~Widget::delete

new Widget[25]
Widget::new[]: 504 bytes
*************************
delete []Widget
~~~~~~~~~~~~~~~~~~~~~~~~~Widget::delete[]

Creating an individual object requires 20 bytes, as you might expect. (This machine uses two bytes for an int). The operator new is called, then the constructor (indicated by the *). In a complementary fashion, calling delete causes the destructor to be called, then the operator delete .

When an array of Widget objects is created, the array version of operator new is used, as promised. But notice that the size requested is four more bytes than expected. This extra four bytes is where the system keeps information about the array, in particular, the number of objects in the array. That way, when you say

delete []Widget;

the brackets tell the compiler it’s an array of objects, so the compiler generates code to look for the number of objects in the array and to call the destructor that many times.

You can see that, even though the array operator new and operator delete are only called once for the entire array chunk, the default constructor and destructor are called for each object in the array.

Constructor calls

Considering that

MyType* f = new MyType;

calls new to allocate a MyType-sized piece of storage, then invokes the MyType constructor on that storage, what happens if all the safeguards fail and the value returned by operator new is zero? The constructor is not called in that case, so although you still have an unsuccessfully created object, at least you haven’t invoked the constructor and handed it a zero pointer. Here’s an example to prove it:

//: C13:NoMemory.cpp
// Constructor isn't called
// If new returns 0
#include <iostream>
#include <new> // size_t definition
using namespace std;

void my_new_handler() {
  cout << "new handler called" << endl;
}

class NoMemory {
public:
  NoMemory() {
    cout << "NoMemory::NoMemory()" << endl;
  }
  void* operator new(size_t sz) throw(bad_alloc){
    cout << "NoMemory::operator new" << endl;
    throw bad_alloc(); // "Out of memory"
  }
};

int main() {
  set_new_handler(my_new_handler);
  NoMemory* nm = new NoMemory;
  cout << "nm = " << nm << endl;

} ///:~

When the program runs, it prints only the message from operator new . Because new returns zero, the constructor is never called so its message is not printed.

Object placement

There are two other, less common, uses for overloading operator new .

You may want to place an object in a specific location in memory. This is especially important with hardware-oriented embedded systems

where an object may be synonymous with a particular piece of hardware.

You may want to be able to choose from different allocators when calling new.

Both of these situations are solved with the same mechanism: The overloaded operator new can take more than one argument. As you’ve seen before, the first argument is always the size of the object, which is secretly calculated and passed by the compiler. But the other arguments can be anything you want: the address you want the object placed at, a reference to a memory allocation function or object, or anything else that is convenient for you.

The way you pass the extra arguments to operator new during a call may seem slightly curious at first: You put the argument list ( without the size_t argument, which is handled by the compiler) after the keyword new and before the class name of the object you’re creating. For example,

X* xp = new(a) X;

will pass a as the second argument to operator new . Of course, this can work only if such an operator new has been declared.

Here’s an example showing how you can place an object at a particular location:

//: C13:PlacementNew.cpp
// Placement with operator new
#include <cstddef> // Size_t
#include <iostream>
using namespace std;

class X {
  int i;
public:
  X(int ii = 0) : i(ii) {}
  ~X() {
    cout << "X::~X()" << endl;
  }
  void* operator new(size_t, void* loc) {
    return loc;
  }
};

int main() {
  int l[10];
  X* xp = new(l) X(47); // X at location l
  xp->X::~X(); // Explicit destructor call
  // ONLY use with placement!

} ///:~

Notice that operator new only returns the pointer that’s passed to it. Thus, the caller decides where the object is going to sit, and the constructor is called for that memory as part of the new-expression.

A dilemma occurs when you want to destroy the object. There’s only one version of operator delete , so there’s no way to say, “Use my special deallocator for this object.” You want to call the destructor, but you don’t want the memory to be released by the dynamic memory mechanism because it wasn’t allocated on the heap.

The answer is a very special syntax: You can explicitly call the destructor, as in

xp->X::~X(); // Explicit destructor call

A stern warning is in order here. Some people see this as a way to destroy objects at some time before the end of the scope, rather than either adjusting the scope or (more correctly) using dynamic object creation if they want the object’s lifetime to be determined at runtime. You will have serious problems if you call the destructor this way for an object created on the stack because the destructor will be called again at the end of the scope. If you call the destructor this way for an object that was created on the heap, the destructor will execute, but the memory won’t be released, which probably isn’t what you want. The only reason that the destructor can be called explicitly this way is to support the placement syntax for operator new .

Although this example shows only one additional argument, there’s nothing to prevent you from adding more if you need them for other purposes.

Contents | Prev | Next

Contact: webmaster@codeguru.com
CodeGuru - the website for developers. [an error occurred while processing this directive]