CodeGuru : Thinking in C++

To understand when inlining is effective, it’s helpful to understand what the compiler does when it encounters an inline. As with any function, the compiler holds the function type (that is, the function prototype including the name and argument types, in combination with the function return value) in its symbol table. In addition, when the compiler sees the inline function body and the function body parses without error, the code for the function body is also brought into the symbol table. Whether the code is stored in source form or as compiled assembly instructions is up to the compiler.

When you make a call to an inline function, the compiler first ensures that the call can be correctly made; that is, all the argument types must be the proper types, or the compiler must be able to make a type conversion to the proper types, and the return value must be the correct type (or convertible to the correct type) in the destination expression. This, of course, is exactly what the compiler does for any function and is markedly different from what the preprocessor does because the preprocessor cannot check types or make conversions.

If all the function type information fits the context of the call, then the inline code is substituted directly for the function call, eliminating the call overhead. Also, if the inline is a member function, the address of the object ( this) is put in the appropriate place(s), which of course is another thing the preprocessor is unable to perform.

Limitations

There are two situations when the compiler cannot perform inlining. In these cases, it simply reverts to the ordinary form of a function by taking the inline definition and creating storage for the function just as it does for a non-inline. If it must do this in multiple translation units (which would normally cause a multiple definition error), the linker is told to ignore the multiple definitions.

The compiler cannot perform inlining if the function is too complicated. This depends upon the particular compiler, but at the point most compilers give up, the inline probably wouldn’t gain you any efficiency. Generally, any sort of looping is considered too complicated to expand as an inline, and if you think about it, looping probably entails much more time inside the function than embodied in the calling overhead. If the function is just a collection of simple statements, the compiler probably won’t have any trouble inlining it, but if there are a lot of statements, the overhead of the function call will be much less than the cost of executing the body. And remember, every time you call a big inline function, the entire function body is inserted in place of each call, so you can easily get code bloat without any noticeable performance improvement. Some of the examples in this book may exceed reasonable inline sizes in favor of conserving screen real estate.

The compiler also cannot perform inlining if the address of the function is taken, implicitly or explicitly. If the compiler must produce an address, then it will allocate storage for the function code and use the resulting address. However, where an address is not required, the compiler will probably still inline the code.

It is important to understand that an inline is just a suggestion to the compiler; the compiler is not forced to inline anything at all. A good compiler will inline small, simple functions while intelligently ignoring inlines that are too complicated. This will give you the results you want – the true semantics of a function call with the efficiency of a macro.

Order of evaluation

If you’re imagining what the compiler is doing to implement inlines, you can confuse yourself into thinking there are more limitations than actually exist. In particular, if an inline makes a forward reference to a function that hasn’t yet been declared in the class, it can seem like the compiler won’t be able to handle it:

In f( ), a call is made to g( ), although g( ) has not yet been declared. This works because the language definition states that no inline functions in a class shall be evaluated until the closing brace of the class declaration.

Of course, if g( ) in turn called f( ), you’d end up with a set of recursive calls, which are too complicated for the compiler to inline. (Also, you’d have to perform some test in f( ) or g( ) to force one of them to “bottom out,” or the recursion would be infinite.)

Hidden activities in constructors & destructors

Constructors and destructors are two places where you can be fooled into thinking that an inline is more efficient than it actually is. Both constructors and destructors may have hidden activities, because the class can contain subobjects whose constructors and destructors must be called. These sub-objects may be member objects, or they may exist because of inheritance (which hasn’t been introduced yet). As an example of a class with member objects

In class WithMembers , the inline constructor and destructor look straightforward and simple enough, but there’s more going on than meets the eye. The constructors and destructors for the member objects q, r, and s are being called automatically, and those constructors and destructors are also inline, so the difference is significant from normal member functions. This doesn’t necessarily mean that you should always make constructor and destructor definitions out-of-line. When you’re making an initial “sketch” of a program by quickly writing code, it’s often more convenient to use inlines. However, if you’re concerned about efficiency, it’s a place to look.

Inlines & the compiler

Limitations

Order of evaluation

Hidden activities in constructors & destructors