CodeGuru : Thinking in C++

All these issues make it clear that one of the first standard class libraries for C++ should handle I/O. Because “hello, world” is the first program just about everyone writes in a new language, and because I/O is part of virtually every program, the I/O library in C++ must be particularly easy to use. It also has the much greater challenge that it can never know all the classes it must accommodate, but it must nevertheless be adaptable to use any new class. Thus its constraints required that this first class be a truly inspired design.

This chapter won’t look at the details of the design and how to add iostream functionality to your own classes (you’ll learn that in a later chapter). First, you need to learn to use iostreams. In addition to gaining a great deal of leverage and clarity in your dealings with I/O and formatting, you’ll also see how a really powerful C++ library can work.

Sneak preview of operator overloading

Before you can use the iostreams library, you must understand one new feature of the language that won’t be covered in detail until a later chapter. To use iostreams, you need to know that in C++ all the operators can take on different meanings. In this chapter, we’re particularly interested in << and >>. The statement “operators can take on different meanings” deserves some extra insight.

In Chapter XX, you learned how function overloading allows you to use the same function name with different argument lists. Now imagine that when the compiler sees an expression consisting of an argument followed by an operator followed by an argument, it simply calls a function. That is, an operator is simply a function call with a different syntax.

Of course, this is C++, which is very particular about data types. So there must be a previously declared function to match that operator and those particular argument types, or the compiler will not accept the expression.

What most people find immediately disturbing about operator overloading is the thought that maybe everything they know about operators in C is suddenly wrong. This is absolutely false. Here are two of the sacred design goals of C++:

Keeping these goals in mind will help answer a lot of questions; knowing there are no capricious changes to C when moving to C++ helps make the transition easy. In particular, operators for built-in types won’t suddenly start working differently – you cannot change their meaning. Overloaded operators can be created only where new data types are involved. So you can create a new overloaded operator for a new class, but the expression

Inserters and extractors

In the iostreams library, two operators have been overloaded to make the use of iostreams easy. The operator << is often referred to as an inserter for iostreams, and the operator >> is often referred to as an extractor.

A stream is an object that formats and holds bytes. You can have an input stream ( istream) or an output stream ( ostream). There are different types of istreams and ostreams: ifstreams and ofstreams for files, istrstreams , and ostrstreams for char* memory (in-core formatting), and istringstreams & ostringstreams for interfacing with the Standard C++ string class . All these stream objects have the same interface, regardless of whether you’re working with a file, standard I/O, a piece of memory or a string object. The single interface you learn also works for extensions added to support new classes.

If a stream is capable of producing bytes (an istream), you can get information from the stream using an extractor. The extractor produces and formats the type of information that’s expected by the destination object. To see an example of this, you can use the cin object, which is the iostream equivalent of stdin in C, that is, redirectable standard input. This object is pre-defined whenever you include the iostream.h header file. (Thus, the iostream library is automatically linked with most compilers.)

There’s an overloaded operator >> for every data type you can use as the right-hand argument of >> in an iostream statement. (You can also overload your own, which you’ll see in a later chapter.)

To find out what you have in the various variables, you can use the cout object (corresponding to standard output; there’s also a cerr object corresponding to standard error) with the inserter <<:

This is notably tedious, and doesn’t seem like much of an improvement over printf( ), type checking or no. Fortunately, the overloaded inserters and extractors in iostreams are designed to be chained together into a complex expression that is much easier to write:

Manipulators

One new element has been added here: a manipulator called endl. A manipulator acts on the stream itself; in this case it inserts a newline and flushes the stream (puts out all pending characters that have been stored in the internal stream buffer but not yet output). You can also just flush the stream:

and a manipulator called ends, which is like endl, only for strstreams (covered in a while). These are all the manipulators in <iostream>, but there are more in <iomanip> you’ll see later in the chapter.

Common usage

Although cin and the extractor >> provide a nice balance to cout and the inserter <<, in practice using formatted input routines, especially with standard input, has the same problems you run into with scanf( ). If the input produces an unexpected value, the process is skewed, and it’s very difficult to recover. In addition, formatted input defaults to whitespace delimiters. So if you collect the above code fragments into a program

Notice that buf got only the first word because the input routine looked for a space to delimit the input, which it saw after “this.” In addition, if the continuous input string is longer than the storage allocated for buf, you’ll overrun the buffer.

It seems cin and the extractor are provided only for completeness, and this is probably a good way to look at it. In practice, you’ll usually want to get your input a line at a time as a sequence of characters and then scan them and perform conversions once they’re safely in a buffer. This way you don’t have to worry about the input routine choking on unexpected data.

Another thing to consider is the whole concept of a command-line interface . This has made sense in the past when the console was little more than a glass typewriter, but the world is rapidly changing to one where the graphical user interface (GUI) dominates. What is the meaning of console I/O in such a world? It makes much more sense to ignore cin altogether other than for very simple examples or tests, and take the following approaches:

Line-oriented input

To grab input a line at a time, you have two choices: the member functions get( ) and getline( ). Both functions take three arguments: a pointer to a character buffer in which to store the result, the size of that buffer (so they don’t overrun it), and the terminating character, to know when to stop reading input. The terminating character has a default value of ‘\n’, which is what you’ll usually use. Both functions store a zero in the result buffer when they encounter the terminating character in the input.

So what’s the difference? Subtle, but important: get( ) stops when it sees the delimiter in the input stream, but it doesn’t extract it from the input stream. Thus, if you did another get( ) using the same delimiter it would immediately return with no fetched input. (Presumably, you either use a different delimiter in the next get( ) statement or a different input function.) getline( ), on the other hand, extracts the delimiter from the input stream, but still doesn’t store it in the result buffer.

Overloaded versions of get( )

get( ) also comes in three other overloaded versions: one with no arguments that returns the next character, using an int return value; one that stuffs a character into its char argument, using a reference (You’ll have to jump forward to Chapter XX if you want to understand it right this minute . . . .); and one that stores directly into the underlying buffer structure of another iostream object. That is explored later in the chapter.

Reading raw bytes

If you know exactly what you’re dealing with and want to move the bytes directly into a variable, array, or structure in memory, you can use read( ). The first argument is a pointer to the destination memory, and the second is the number of bytes to read. This is especially useful if you’ve previously stored the information to a file, for example, in binary form using the complementary write( ) member function for an output stream. You’ll see examples of all these functions later.

Error handling

All the versions of get( ) and getline( ) return the input stream from which the characters came except for get( ) with no arguments, which returns the next character or EOF. If you get the input stream object back, you can ask it if it’s still OK. In fact, you can ask any iostream object if it’s OK using the member functions good( ), eof( ), fail( ), and bad( ). These return state information based on the eofbit (indicates the buffer is at the end of sequence), the failbit (indicates some operation has failed because of formatting issues or some other problem that does not affect the buffer) and the badbit (indicates something has gone wrong with the buffer).

However, as mentioned earlier, the state of an input stream generally gets corrupted in weird ways only when you’re trying to do input to specific types and the type read from the input is inconsistent with what is expected. Then of course you have the problem of what to do with the input stream to correct the problem. If you follow my advice and read input a line at a time or as a big glob of characters (with read( )) and don’t attempt to use the input formatting functions except in simple cases, then all you’re concerned with is whether you’re at the end of the input (EOF). Fortunately, testing for this turns out to be simple and can be done inside of conditionals, such as while(cin) or if(cin). For now you’ll have to accept that when you use an input stream object in this context, the right value is safely, correctly and magically produced to indicate whether the object has reached the end of the input. You can also use the Boolean NOT operator !, as in if(!cin), to indicate the stream is not OK; that is, you’ve probably reached the end of input and should quit trying to read the stream.

There are times when the stream becomes not-OK, but you understand this condition and want to go on using it. For example, if you reach the end of an input file, the eofbit and failbit are set, so a conditional on that stream object will indicate the stream is no longer good. However, you may want to continue using the file, by seeking to an earlier position and reading more data. To correct the condition, simply call the clear( ) member function. [51]

Iostreams to the rescue