Saturday, March 14, 2009

type casting

Converting an expression of a given type into another type is known as type-casting. We have already seen some ways to type cast:

Implicit conversion
Implicit conversions do not require any operator. They are automatically performed when a value is copied to a compatible type. For example:
short a=2000;
int b;
b=a;


Here, the value of a has been promoted from short to int and we have not had to specify any type-casting operator. This is known as a standard conversion. Standard conversions affect fundamental data types, and allow conversions such as the conversions between numerical types (short to int, int to float, double to int...), to or from bool, and some pointer conversions. Some of these conversions may imply a loss of precision, which the compiler can signal with a warning. This can be avoided with an explicit conversion.

Implicit conversions also include constructor or operator conversions, which affect classes that include specific constructors or operator functions to perform conversions. For example:

class A {};
class B { public: B (A a) {} };

A a;
B b=a;


Here, a implicit conversion happened between objects of class A and class B, because B has a constructor that takes an object of class A as parameter. Therefore implicit conversions from A to B are allowed.


Explicit conversion
C++ is a strong-typed language. Many conversions, specially those that imply a different interpretation of the value, require an explicit conversion. We have already seen two notations for explicit type conversion: functional and c-like casting:
short a=2000;
int b;
b = (int) a; // c-like cast notation
b = int (a); // functional notation


The functionality of these explicit conversion operators is enough for most needs with fundamental data types. However, these operators can be applied indiscriminately on classes and pointers to classes, which can lead to code that while being syntactically correct can cause runtime errors. For example, the following code is syntactically correct:

// class type-casting
#include
using namespace std;

class CDummy {
float i,j;
};

class CAddition {
int x,y;
public:
CAddition (int a, int b) { x=a; y=b; }
int result() { return x+y;}
};

int main () {
CDummy d;
CAddition * padd;
padd = (CAddition*) &d;
cout << padd->result();
return 0;
}



The program declares a pointer to CAddition, but then it assigns to it a reference to an object of another incompatible type using explicit type-casting:

padd = (CAddition*) &d;


Traditional explicit type-casting allows to convert any pointer into any other pointer type, independently of the types they point to. The subsequent call to member result will produce either a run-time error or a unexpected result.

In order to control these types of conversions between classes, we have four specific casting operators: dynamic_cast, reinterpret_cast, static_cast and const_cast. Their format is to follow the new type enclosed between angle-brackets (<>) and immediately after, the expression to be converted between parentheses.

dynamic_cast (expression)
reinterpret_cast (expression)
static_cast (expression)
const_cast (expression)


The traditional type-casting equivalents to these expressions would be:

(new_type) expression
new_type (expression)


but each one with its own special characteristics:


dynamic_cast
dynamic_cast can be used only with pointers and references to objects. Its purpose is to ensure that the result of the type conversion is a valid complete object of the requested class.

Therefore, dynamic_cast is always successful when we cast a class to one of its base classes:

class CBase { };
class CDerived: public CBase { };

CBase b; CBase* pb;
CDerived d; CDerived* pd;

pb = dynamic_cast(&d); // ok: derived-to-base
pd = dynamic_cast(&b); // wrong: base-to-derived


The second conversion in this piece of code would produce a compilation error since base-to-derived conversions are not allowed with dynamic_cast unless the base class is polymorphic.

When a class is polymorphic, dynamic_cast performs a special checking during runtime to ensure that the expression yields a valid complete object of the requested class:

// dynamic_cast
#include
#include
using namespace std;

class CBase { virtual void dummy() {} };
class CDerived: public CBase { int a; };

int main () {
try {
CBase * pba = new CDerived;
CBase * pbb = new CBase;
CDerived * pd;

pd = dynamic_cast(pba);
if (pd==0) cout << "Null pointer on first type-cast" << endl;

pd = dynamic_cast(pbb);
if (pd==0) cout << "Null pointer on second type-cast" << endl;

} catch (exception& e) {cout << "Exception: " << e.what();}
return 0;
}
Null pointer on second type-cast


Compatibility note: dynamic_cast requires the Run-Time Type Information (RTTI) to keep track of dynamic types. Some compilers support this feature as an option which is disabled by default. This must be enabled for runtime type checking using dynamic_cast to work properly.



The code tries to perform two dynamic casts from pointer objects of type CBase* (pba and pbb) to a pointer object of type CDerived*, but only the first one is successful. Notice their respective initializations:

CBase * pba = new CDerived;
CBase * pbb = new CBase;


Even though both are pointers of type CBase*, pba points to an object of type CDerived, while pbb points to an object of type CBase. Thus, when their respective type-castings are performed using dynamic_cast, pba is pointing to a full object of class CDerived, whereas pbb is pointing to an object of class CBase, which is an incomplete object of class CDerived.

When dynamic_cast cannot cast a pointer because it is not a complete object of the required class -as in the second conversion in the previous example- it returns a null pointer to indicate the failure. If dynamic_cast is used to convert to a reference type and the conversion is not possible, an exception of type bad_cast is thrown instead.

dynamic_cast can also cast null pointers even between pointers to unrelated classes, and can also cast pointers of any type to void pointers (void*).


static_cast
static_cast can perform conversions between pointers to related classes, not only from the derived class to its base, but also from a base class to its derived. This ensures that at least the classes are compatible if the proper object is converted, but no safety check is performed during runtime to check if the object being converted is in fact a full object of the destination type. Therefore, it is up to the programmer to ensure that the conversion is safe. On the other side, the overhead of the type-safety checks of dynamic_cast is avoided.
class CBase {};
class CDerived: public CBase {};
CBase * a = new CBase;
CDerived * b = static_cast(a);


This would be valid, although b would point to an incomplete object of the class and could lead to runtime errors if dereferenced.

static_cast can also be used to perform any other non-pointer conversion that could also be performed implicitly, like for example standard conversion between fundamental types:

double d=3.14159265;
int i = static_cast(d);


Or any conversion between classes with explicit constructors or operator functions as described in "implicit conversions" above.


reinterpret_cast
reinterpret_cast converts any pointer type to any other pointer type, even of unrelated classes. The operation result is a simple binary copy of the value from one pointer to the other. All pointer conversions are allowed: neither the content pointed nor the pointer type itself is checked.
It can also cast pointers to or from integer types. The format in which this integer value represents a pointer is platform-specific. The only guarantee is that a pointer cast to an integer type large enough to fully contain it, is granted to be able to be cast back to a valid pointer.

The conversions that can be performed by reinterpret_cast but not by static_cast have no specific uses in C++ are low-level operations, whose interpretation results in code which is generally system-specific, and thus non-portable. For example:

class A {};
class B {};
A * a = new A;
B * b = reinterpret_cast(a);


This is valid C++ code, although it does not make much sense, since now we have a pointer that points to an object of an incompatible class, and thus dereferencing it is unsafe.


const_cast
This type of casting manipulates the constness of an object, either to be set or to be removed. For example, in order to pass a const argument to a function that expects a non-constant parameter:
// const_cast
#include
using namespace std;

void print (char * str)
{
cout << str << endl;
}

int main () {
const char * c = "sample text";
print ( const_cast (c) );
return 0;
}
sample text



typeid
typeid allows to check the type of an expression:
typeid (expression)


This operator returns a reference to a constant object of type type_info that is defined in the standard header file . This returned value can be compared with another one using operators == and != or can serve to obtain a null-terminated character sequence representing the data type or class name by using its name() member.

// typeid
#include
#include
using namespace std;

int main () {
int * a,b;
a=0; b=0;
if (typeid(a) != typeid(b))
{
cout << "a and b are of different types:\n";
cout << "a is: " << typeid(a).name() << '\n';
cout << "b is: " << typeid(b).name() << '\n';
}
return 0;
}
a and b are of different types:
a is: int *
b is: int


When typeid is applied to classes typeid uses the RTTI to keep track of the type of dynamic objects. When typeid is applied to an expression whose type is a polymorphic class, the result is the type of the most derived complete object:

// typeid, polymorphic class
#include
#include
#include
using namespace std;

class CBase { virtual void f(){} };
class CDerived : public CBase {};

int main () {
try {
CBase* a = new CBase;
CBase* b = new CDerived;
cout << "a is: " << typeid(a).name() << '\n';
cout << "b is: " << typeid(b).name() << '\n';
cout << "*a is: " << typeid(*a).name() << '\n';
cout << "*b is: " << typeid(*b).name() << '\n';
} catch (exception& e) { cout << "Exception: " << e.what() << endl; }
return 0;
}
a is: class CBase *
b is: class CBase *
*a is: class CBase
*b is: class CDerived


Notice how the type that typeid considers for pointers is the pointer type itself (both a and b are of type class CBase *). However, when typeid is applied to objects (like *a and *b) typeid yields their dynamic type (i.e. the type of their most derived complete object).

If the type typeid evaluates is a pointer preceded by the dereference operator (*), and this pointer has a null value, typeid throws a bad_typeid exception.

structure of program

Probably the best way to start learning a programming language is by writing a program. Therefore, here is our first program:
// my first program in C++

#include
using namespace std;

int main ()
{
cout << "Hello World!";
return 0;
}
Hello World!


The first panel shows the source code for our first program. The second one shows the result of the program once compiled and executed. The way to edit and compile a program depends on the compiler you are using. Depending on whether it has a Development Interface or not and on its version. Consult the compilers section and the manual or help included with your compiler if you have doubts on how to compile a C++ console program.

The previous program is the typical program that programmer apprentices write for the first time, and its result is the printing on screen of the "Hello World!" sentence. It is one of the simplest programs that can be written in C++, but it already contains the fundamental components that every C++ program has. We are going to look line by line at the code we have just written:


// my first program in C++
This is a comment line. All lines beginning with two slash signs (//) are considered comments and do not have any effect on the behavior of the program. The programmer can use them to include short explanations or observations within the source code itself. In this case, the line is a brief description of what our program is.

#include
Lines beginning with a hash sign (#) are directives for the preprocessor. They are not regular code lines with expressions but indications for the compiler's preprocessor. In this case the directive #include tells the preprocessor to include the iostream standard file. This specific file (iostream) includes the declarations of the basic standard input-output library in C++, and it is included because its functionality is going to be used later in the program.

using namespace std;
All the elements of the standard C++ library are declared within what is called a namespace, the namespace with the name std. So in order to access its functionality we declare with this expression that we will be using these entities. This line is very frequent in C++ programs that use the standard library, and in fact it will be included in most of the source codes included in these tutorials.

int main ()
This line corresponds to the beginning of the definition of the main function. The main function is the point by where all C++ programs start their execution, independently of its location within the source code. It does not matter whether there are other functions with other names defined before or after it - the instructions contained within this function's definition will always be the first ones to be executed in any C++ program. For that same reason, it is essential that all C++ programs have a main function.
The word main is followed in the code by a pair of parentheses (()). That is because it is a function declaration: In C++, what differentiates a function declaration from other types of expressions are these parentheses that follow its name. Optionally, these parentheses may enclose a list of parameters within them.

Right after these parentheses we can find the body of the main function enclosed in braces ({}). What is contained within these braces is what the function does when it is executed.


cout << "Hello World!";
This line is a C++ statement. A statement is a simple or compound expression that can actually produce some effect. In fact, this statement performs the only action that generates a visible effect in our first program.
cout represents the standard output stream in C++, and the meaning of the entire statement is to insert a sequence of characters (in this case the Hello World sequence of characters) into the standard output stream (which usually is the screen).

cout is declared in the iostream standard file within the std namespace, so that's why we needed to include that specific file and to declare that we were going to use this specific namespace earlier in our code.

Notice that the statement ends with a semicolon character (;). This character is used to mark the end of the statement and in fact it must be included at the end of all expression statements in all C++ programs (one of the most common syntax errors is indeed to forget to include some semicolon after a statement).


return 0;
The return statement causes the main function to finish. return may be followed by a return code (in our example is followed by the return code 0). A return code of 0 for the main function is generally interpreted as the program worked as expected without any errors during its execution. This is the most usual way to end a C++ console program.

You may have noticed that not all the lines of this program perform actions when the code is executed. There were lines containing only comments (those beginning by //). There were lines with directives for the compiler's preprocessor (those beginning by #). Then there were lines that began the declaration of a function (in this case, the main function) and, finally lines with statements (like the insertion into cout), which were all included within the block delimited by the braces ({}) of the main function.

The program has been structured in different lines in order to be more readable, but in C++, we do not have strict rules on how to separate instructions in different lines. For example, instead of

int main ()
{
cout << " Hello World!";
return 0;
}


We could have written:

int main () { cout << "Hello World!"; return 0; }


All in just one line and this would have had exactly the same meaning as the previous code.

In C++, the separation between statements is specified with an ending semicolon (;) at the end of each one, so the separation in different code lines does not matter at all for this purpose. We can write many statements per line or write a single statement that takes many code lines. The division of code in different lines serves only to make it more legible and schematic for the humans that may read it.

Let us add an additional instruction to our first program:

// my second program in C++

#include

using namespace std;

int main ()
{
cout << "Hello World! ";
cout << "I'm a C++ program";
return 0;
}
Hello World! I'm a C++ program


In this case, we performed two insertions into cout in two different statements. Once again, the separation in different lines of code has been done just to give greater readability to the program, since main could have been perfectly valid defined this way:

int main () { cout << " Hello World! "; cout << " I'm a C++ program "; return 0; }


We were also free to divide the code into more lines if we considered it more convenient:

int main ()
{
cout <<
"Hello World!";
cout
<< "I'm a C++ program";
return 0;
}


And the result would again have been exactly the same as in the previous examples.

Preprocessor directives (those that begin by #) are out of this general rule since they are not statements. They are lines read and processed by the preprocessor and do not produce any code by themselves. Preprocessor directives must be specified in their own line and do not have to end with a semicolon (;).


Comments
Comments are parts of the source code disregarded by the compiler. They simply do nothing. Their purpose is only to allow the programmer to insert notes or descriptions embedded within the source code.

C++ supports two ways to insert comments:

// line comment
/* block comment */


The first of them, known as line comment, discards everything from where the pair of slash signs (//) is found up to the end of that same line. The second one, known as block comment, discards everything between the /* characters and the first appearance of the */ characters, with the possibility of including more than one line.
We are going to add comments to our second program:

/* my second program in C++
with more comments */

#include
using namespace std;

int main ()
{
cout << "Hello World! "; // prints Hello World!
cout << "I'm a C++ program"; // prints I'm a C++ program
return 0;
}
Hello World! I'm a C++ program


If you include comments within the source code of your programs without using the comment characters combinations //, /* or */, the compiler will take them as if they were C++ expressions, most likely causing one or several error messages when you compile it.

Variable & data types

The usefulness of the "Hello World" programs shown in the previous section is quite questionable. We had to write several lines of code, compile them, and then execute the resulting program just to obtain a simple sentence written on the screen as result. It certainly would have been much faster to type the output sentence by ourselves. However, programming is not limited only to printing simple texts on the screen. In order to go a little further on and to become able to write programs that perform useful tasks that really save us work we need to introduce the concept of variable.
Let us think that I ask you to retain the number 5 in your mental memory, and then I ask you to memorize also the number 2 at the same time. You have just stored two different values in your memory. Now, if I ask you to add 1 to the first number I said, you should be retaining the numbers 6 (that is 5+1) and 2 in your memory. Values that we could now for example subtract and obtain 4 as result.

The whole process that you have just done with your mental memory is a simile of what a computer can do with two variables. The same process can be expressed in C++ with the following instruction set:

a = 5;
b = 2;
a = a + 1;
result = a - b;


Obviously, this is a very simple example since we have only used two small integer values, but consider that your computer can store millions of numbers like these at the same time and conduct sophisticated mathematical operations with them.

Therefore, we can define a variable as a portion of memory to store a determined value.

Each variable needs an identifier that distinguishes it from the others, for example, in the previous code the variable identifiers were a, b and result, but we could have called the variables any names we wanted to invent, as long as they were valid identifiers.


Identifiers
A valid identifier is a sequence of one or more letters, digits or underscore characters (_). Neither spaces nor punctuation marks or symbols can be part of an identifier. Only letters, digits and single underscore characters are valid. In addition, variable identifiers always have to begin with a letter. They can also begin with an underline character (_ ), but in some cases these may be reserved for compiler specific keywords or external identifiers, as well as identifiers containing two successive underscore characters anywhere. In no case they can begin with a digit.
Another rule that you have to consider when inventing your own identifiers is that they cannot match any keyword of the C++ language nor your compiler's specific ones, which are reserved keywords. The standard reserved keywords are:

asm, auto, bool, break, case, catch, char, class, const, const_cast, continue, default, delete, do, double, dynamic_cast, else, enum, explicit, export, extern, false, float, for, friend, goto, if, inline, int, long, mutable, namespace, new, operator, private, protected, public, register, reinterpret_cast, return, short, signed, sizeof, static, static_cast, struct, switch, template, this, throw, true, try, typedef, typeid, typename, union, unsigned, using, virtual, void, volatile, wchar_t, while


Additionally, alternative representations for some operators cannot be used as identifiers since they are reserved words under some circumstances:

and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor, xor_eq


Your compiler may also include some additional specific reserved keywords.

Very important: The C++ language is a "case sensitive" language. That means that an identifier written in capital letters is not equivalent to another one with the same name but written in small letters. Thus, for example, the RESULT variable is not the same as the result variable or the Result variable. These are three different variable identifiers.


Fundamental data types
When programming, we store the variables in our computer's memory, but the computer has to know what kind of data we want to store in them, since it is not going to occupy the same amount of memory to store a simple number than to store a single letter or a large number, and they are not going to be interpreted the same way.
The memory in our computers is organized in bytes. A byte is the minimum amount of memory that we can manage in C++. A byte can store a relatively small amount of data: one single character or a small integer (generally an integer between 0 and 255). In addition, the computer can manipulate more complex data types that come from grouping several bytes, such as long numbers or non-integer numbers.

Next you have a summary of the basic fundamental data types in C++, as well as the range of values that can be represented with each one:

Name Description Size* Range*
char Character or small integer. 1byte signed: -128 to 127
unsigned: 0 to 255
short int (short) Short Integer. 2bytes signed: -32768 to 32767
unsigned: 0 to 65535
int Integer. 4bytes signed: -2147483648 to 2147483647
unsigned: 0 to 4294967295
long int (long) Long integer. 4bytes signed: -2147483648 to 2147483647
unsigned: 0 to 4294967295
bool Boolean value. It can take one of two values: true or false. 1byte true or false
float Floating point number. 4bytes +/- 3.4e +/- 38 (~7 digits)
double Double precision floating point number. 8bytes +/- 1.7e +/- 308 (~15 digits)
long double Long double precision floating point number. 8bytes +/- 1.7e +/- 308 (~15 digits)
wchar_t Wide character. 2 or 4 bytes 1 wide character


* The values of the columns Size and Range depend on the system the program is compiled for. The values shown above are those found on most 32-bit systems. But for other systems, the general specification is that int has the natural size suggested by the system architecture (one "word") and the four integer types char, short, int and long must each one be at least as large as the one preceding it, with char being always 1 byte in size. The same applies to the floating point types float, double and long double, where each one must provide at least as much precision as the preceding one.


Declaration of variables
In order to use a variable in C++, we must first declare it specifying which data type we want it to be. The syntax to declare a new variable is to write the specifier of the desired data type (like int, bool, float...) followed by a valid variable identifier. For example:
int a;
float mynumber;


These are two valid declarations of variables. The first one declares a variable of type int with the identifier a. The second one declares a variable of type float with the identifier mynumber. Once declared, the variables a and mynumber can be used within the rest of their scope in the program.

If you are going to declare more than one variable of the same type, you can declare all of them in a single statement by separating their identifiers with commas. For example:

int a, b, c;


This declares three variables (a, b and c), all of them of type int, and has exactly the same meaning as:

int a;
int b;
int c;


The integer data types char, short, long and int can be either signed or unsigned depending on the range of numbers needed to be represented. Signed types can represent both positive and negative values, whereas unsigned types can only represent positive values (and zero). This can be specified by using either the specifier signed or the specifier unsigned before the type name. For example:

unsigned short int NumberOfSisters;
signed int MyAccountBalance;


By default, if we do not specify either signed or unsigned most compiler settings will assume the type to be signed, therefore instead of the second declaration above we could have written:

int MyAccountBalance;


with exactly the same meaning (with or without the keyword signed)

An exception to this general rule is the char type, which exists by itself and is considered a different fundamental data type from signed char and unsigned char, thought to store characters. You should use either signed or unsigned if you intend to store numerical values in a char-sized variable.

short and long can be used alone as type specifiers. In this case, they refer to their respective integer fundamental types: short is equivalent to short int and long is equivalent to long int. The following two variable declarations are equivalent:

short Year;
short int Year;


Finally, signed and unsigned may also be used as standalone type specifiers, meaning the same as signed int and unsigned int respectively. The following two declarations are equivalent:

unsigned NextYear;
unsigned int NextYear;


To see what variable declarations look like in action within a program, we are going to see the C++ code of the example about your mental memory proposed at the beginning of this section:

// operating with variables

#include
using namespace std;

int main ()
{
// declaring variables:
int a, b;
int result;

// process:
a = 5;
b = 2;
a = a + 1;
result = a - b;

// print out the result:
cout << result;

// terminate the program:
return 0;
}
4


Do not worry if something else than the variable declarations themselves looks a bit strange to you. You will see the rest in detail in coming sections.


Scope of variables
All the variables that we intend to use in a program must have been declared with its type specifier in an earlier point in the code, like we did in the previous code at the beginning of the body of the function main when we declared that a, b, and result were of type int.
A variable can be either of global or local scope. A global variable is a variable declared in the main body of the source code, outside all functions, while a local variable is one declared within the body of a function or a block.



Global variables can be referred from anywhere in the code, even inside functions, whenever it is after its declaration.

The scope of local variables is limited to the block enclosed in braces ({}) where they are declared. For example, if they are declared at the beginning of the body of a function (like in function main) their scope is between its declaration point and the end of that function. In the example above, this means that if another function existed in addition to main, the local variables declared in main could not be accessed from the other function and vice versa.



Initialization of variables
When declaring a regular local variable, its value is by default undetermined. But you may want a variable to store a concrete value at the same moment that it is declared. In order to do that, you can initialize the variable. There are two ways to do this in C++:
The first one, known as c-like, is done by appending an equal sign followed by the value to which the variable will be initialized:

type identifier = initial_value ;

For example, if we want to declare an int variable called a initialized with a value of 0 at the moment in which it is declared, we could write:

int a = 0;


The other way to initialize variables, known as constructor initialization, is done by enclosing the initial value between parentheses (()):

type identifier (initial_value) ;

For example:

int a (0);


Both ways of initializing variables are valid and equivalent in C++.

// initialization of variables

#include
using namespace std;

int main ()
{
int a=5; // initial value = 5
int b(2); // initial value = 2
int result; // initial value undetermined

a = a + 3;
result = a - b;
cout << result;

return 0;
}
6



Introduction to strings
Variables that can store non-numerical values that are longer than one single character are known as strings.
The C++ language library provides support for strings through the standard string class. This is not a fundamental type, but it behaves in a similar way as fundamental types do in its most basic usage.

A first difference with fundamental data types is that in order to declare and use objects (variables) of this type we need to include an additional header file in our source code: and have access to the std namespace (which we already had in all our previous programs thanks to the using namespace statement).

// my first string
#include
#include
using namespace std;

int main ()
{
string mystring = "This is a string";
cout << mystring;
return 0;
}
This is a string


As you may see in the previous example, strings can be initialized with any valid string literal just like numerical type variables can be initialized to any valid numerical literal. Both initialization formats are valid with strings:

string mystring = "This is a string";
string mystring ("This is a string");


Strings can also perform all the other basic operations that fundamental data types can, like being declared without an initial value and being assigned values during execution:

// my first string
#include
#include
using namespace std;

int main ()
{
string mystring;
mystring = "This is the initial string content";
cout << mystring << endl;
mystring = "This is a different string content";
cout << mystring << endl;
return 0;
}

This is the initial string content
This is a different string content


For more details on C++ strings, you can have a look at the string class reference.

Template concept

Function templates
Function templates are special functions that can operate with generic types. This allows us to create a function template whose functionality can be adapted to more than one type or class without repeating the entire code for each type.
In C++ this can be achieved using template parameters. A template parameter is a special kind of parameter that can be used to pass a type as argument: just like regular function parameters can be used to pass values to a function, template parameters allow to pass also types to a function. These function templates can use these parameters as if they were any other regular type.

The format for declaring function templates with type parameters is:

template function_declaration;
template function_declaration;


The only difference between both prototypes is the use of either the keyword class or the keyword typename. Its use is indistinct, since both expressions have exactly the same meaning and behave exactly the same way.

For example, to create a template function that returns the greater one of two objects we could use:

template
myType GetMax (myType a, myType b) {
return (a>b?a:b);
}


Here we have created a template function with myType as its template parameter. This template parameter represents a type that has not yet been specified, but that can be used in the template function as if it were a regular type. As you can see, the function template GetMax returns the greater of two parameters of this still-undefined type.

To use this function template we use the following format for the function call:

function_name (parameters);


For example, to call GetMax to compare two integer values of type int we can write:

int x,y;
GetMax (x,y);


When the compiler encounters this call to a template function, it uses the template to automatically generate a function replacing each appearance of myType by the type passed as the actual template parameter (int in this case) and then calls it. This process is automatically performed by the compiler and is invisible to the programmer.

Here is the entire example:

// function template
#include
using namespace std;

template
T GetMax (T a, T b) {
T result;
result = (a>b)? a : b;
return (result);
}

int main () {
int i=5, j=6, k;
long l=10, m=5, n;
k=GetMax(i,j);
n=GetMax(l,m);
cout << k << endl;
cout << n << endl;
return 0;
}
6
10


In this case, we have used T as the template parameter name instead of myType because it is shorter and in fact is a very common template parameter name. But you can use any identifier you like.

In the example above we used the function template GetMax() twice. The first time with arguments of type int and the second one with arguments of type long. The compiler has instantiated and then called each time the appropriate version of the function.

As you can see, the type T is used within the GetMax() template function even to declare new objects of that type:

T result;


Therefore, result will be an object of the same type as the parameters a and b when the function template is instantiated with a specific type.

In this specific case where the generic type T is used as a parameter for GetMax the compiler can find out automatically which data type has to instantiate without having to explicitly specify it within angle brackets (like we have done before specifying and ). So we could have written instead:

int i,j;
GetMax (i,j);


Since both i and j are of type int, and the compiler can automatically find out that the template parameter can only be int. This implicit method produces exactly the same result:

// function template II
#include
using namespace std;

template
T GetMax (T a, T b) {
return (a>b?a:b);
}

int main () {
int i=5, j=6, k;
long l=10, m=5, n;
k=GetMax(i,j);
n=GetMax(l,m);
cout << k << endl;
cout << n << endl;
return 0;
}
6
10


Notice how in this case, we called our function template GetMax() without explicitly specifying the type between angle-brackets <>. The compiler automatically determines what type is needed on each call.

Because our template function includes only one template parameter (class T) and the function template itself accepts two parameters, both of this T type, we cannot call our function template with two objects of different types as arguments:

int i;
long l;
k = GetMax (i,l);


This would not be correct, since our GetMax function template expects two arguments of the same type, and in this call to it we use objects of two different types.

We can also define function templates that accept more than one type parameter, simply by specifying more template parameters between the angle brackets. For example:

template
T GetMin (T a, U b) {
return (a}


In this case, our function template GetMin() accepts two parameters of different types and returns an object of the same type as the first parameter (T) that is passed. For example, after that declaration we could call GetMin() with:

int i,j;
long l;
i = GetMin (j,l);


or simply:

i = GetMin (j,l);


even though j and l have different types, since the compiler can determine the appropriate instantiation anyway.


Class templates
We also have the possibility to write class templates, so that a class can have members that use template parameters as types. For example:
template
class mypair {
T values [2];
public:
mypair (T first, T second)
{
values[0]=first; values[1]=second;
}
};


The class that we have just defined serves to store two elements of any valid type. For example, if we wanted to declare an object of this class to store two integer values of type int with the values 115 and 36 we would write:

mypair myobject (115, 36);


this same class would also be used to create an object to store any other type:

mypair myfloats (3.0, 2.18);


The only member function in the previous class template has been defined inline within the class declaration itself. In case that we define a function member outside the declaration of the class template, we must always precede that definition with the template <...> prefix:

// class templates
#include
using namespace std;

template
class mypair {
T a, b;
public:
mypair (T first, T second)
{a=first; b=second;}
T getmax ();
};

template
T mypair::getmax ()
{
T retval;
retval = a>b? a : b;
return retval;
}

int main () {
mypair myobject (100, 75);
cout << myobject.getmax();
return 0;
}
100


Notice the syntax of the definition of member function getmax:

template
T mypair::getmax ()


Confused by so many T's? There are three T's in this declaration: The first one is the template parameter. The second T refers to the type returned by the function. And the third T (the one between angle brackets) is also a requirement: It specifies that this function's template parameter is also the class template parameter.


Template specialization
If we want to define a different implementation for a template when a specific type is passed as template parameter, we can declare a specialization of that template.
For example, let's suppose that we have a very simple class called mycontainer that can store one element of any type and that it has just one member function called increase, which increases its value. But we find that when it stores an element of type char it would be more convenient to have a completely different implementation with a function member uppercase, so we decide to declare a class template specialization for that type:

// template specialization
#include
using namespace std;

// class template:
template
class mycontainer {
T element;
public:
mycontainer (T arg) {element=arg;}
T increase () {return ++element;}
};

// class template specialization:
template <>
class mycontainer {
char element;
public:
mycontainer (char arg) {element=arg;}
char uppercase ()
{
if ((element>='a')&&(element<='z'))
element+='A'-'a';
return element;
}
};

int main () {
mycontainer myint (7);
mycontainer mychar ('j');
cout << myint.increase() << endl;
cout << mychar.uppercase() << endl;
return 0;
}
8
J


This is the syntax used in the class template specialization:

template <> class mycontainer { ... };


First of all, notice that we precede the class template name with an emptytemplate<> parameter list. This is to explicitly declare it as a template specialization.

But more important than this prefix, is the specialization parameter after the class template name. This specialization parameter itself identifies the type for which we are going to declare a template class specialization (char). Notice the differences between the generic class template and the specialization:

template class mycontainer { ... };
template <> class mycontainer { ... };


The first line is the generic template, and the second one is the specialization.

When we declare specializations for a template class, we must also define all its members, even those exactly equal to the generic template class, because there is no "inheritance" of members from the generic template to the specialization.


Non-type parameters for templates
Besides the template arguments that are preceded by the class or typename keywords , which represent types, templates can also have regular typed parameters, similar to those found in functions. As an example, have a look at this class template that is used to contain sequences of elements:
// sequence template
#include
using namespace std;

template
class mysequence {
T memblock [N];
public:
void setmember (int x, T value);
T getmember (int x);
};

template
void mysequence::setmember (int x, T value) {
memblock[x]=value;
}

template
T mysequence::getmember (int x) {
return memblock[x];
}

int main () {
mysequence myints;
mysequence myfloats;
myints.setmember (0,100);
myfloats.setmember (3,3.1416);
cout << myints.getmember(0) << '\n';
cout << myfloats.getmember(3) << '\n';
return 0;
}
100
3.1416


It is also possible to set default values or types for class template parameters. For example, if the previous class template definition had been:

template class mysequence {..};


We could create objects using the default template parameters by declaring:

mysequence<> myseq;


Which would be equivalent to:

mysequence myseq;



Templates and multiple-file projects
From the point of view of the compiler, templates are not normal functions or classes. They are compiled on demand, meaning that the code of a template function is not compiled until an instantiation with specific template arguments is required. At that moment, when an instantiation is required, the compiler generates a function specifically for those arguments from the template.
When projects grow it is usual to split the code of a program in different source code files. In these cases, the interface and implementation are generally separated. Taking a library of functions as example, the interface generally consists of declarations of the prototypes of all the functions that can be called. These are generally declared in a "header file" with a .h extension, and the implementation (the definition of these functions) is in an independent file with c++ code.

Because templates are compiled when required, this forces a restriction for multi-file projects: the implementation (definition) of a template class or function must be in the same file as its declaration. That means that we cannot separate the interface in a separate header file, and that we must include both interface and implementation in any file that uses the templates.

Since no code is generated until a template is instantiated when required, compilers are prepared to allow the inclusion more than once of the same template file with both declarations and definitions in a project without generating linkage errors.


preprocessar directories

Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a hash sign (#). The preprocessor is executed before the actual compilation of code begins, therefore the preprocessor digests all these directives before any code is generated by the statements.
These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. No semicolon (;) is expected at the end of a preprocessor directive. The only way a preprocessor directive can extend through more than one line is by preceding the newline character at the end of the line by a backslash (\).


macro definitions (#define, #undef)
To define preprocessor macros we can use #define. Its format is:
#define identifier replacement


When the preprocessor encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement. This replacement can be an expression, a statement, a block or simply anything. The preprocessor does not understand C++, it simply replaces any occurrence of identifier by replacement.

#define TABLE_SIZE 100
int table1[TABLE_SIZE];
int table2[TABLE_SIZE];


After the preprocessor has replaced TABLE_SIZE, the code becomes equivalent to:

int table1[100];
int table2[100];


This use of #define as constant definer is already known by us from previous tutorials, but #define can work also with parameters to define function macros:

#define getmax(a,b) a>b?a:b


This would replace any occurrence of getmax followed by two arguments by the replacement expression, but also replacing each argument by its identifier, exactly as you would expect if it was a function:

// function macro
#include
using namespace std;

#define getmax(a,b) ((a)>(b)?(a):(b))

int main()
{
int x=5, y;
y= getmax(x,2);
cout << y << endl;
cout << getmax(7,x) << endl;
return 0;
}
5
7


Defined macros are not affected by block structure. A macro lasts until it is undefined with the #undef preprocessor directive:

#define TABLE_SIZE 100
int table1[TABLE_SIZE];
#undef TABLE_SIZE
#define TABLE_SIZE 200
int table2[TABLE_SIZE];


This would generate the same code as:

int table1[100];
int table2[200];


Function macro definitions accept two special operators (# and ##) in the replacement sequence:
If the operator # is used before a parameter is used in the replacement sequence, that parameter is replaced by a string literal (as if it were enclosed between double quotes)

#define str(x) #x
cout << str(test);


This would be translated into:

cout << "test";


The operator ## concatenates two arguments leaving no blank spaces between them:

#define glue(a,b) a ## b
glue(c,out) << "test";


This would also be translated into:

cout << "test";


Because preprocessor replacements happen before any C++ syntax check, macro definitions can be a tricky feature, but be careful: code that relies heavily on complicated macros may result obscure to other programmers, since the syntax they expect is on many occasions different from the regular expressions programmers expect in C++.


Conditional inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)
These directives allow to include or discard part of the code of a program if a certain condition is met.

#ifdef allows a section of a program to be compiled only if the macro that is specified as the parameter has been defined, no matter which its value is. For example:

#ifdef TABLE_SIZE
int table[TABLE_SIZE];
#endif


In this case, the line of code int table[TABLE_SIZE]; is only compiled if TABLE_SIZE was previously defined with #define, independently of its value. If it was not defined, that line will not be included in the program compilation.

#ifndef serves for the exact opposite: the code between #ifndef and #endif directives is only compiled if the specified identifier has not been previously defined. For example:

#ifndef TABLE_SIZE
#define TABLE_SIZE 100
#endif
int table[TABLE_SIZE];


In this case, if when arriving at this piece of code, the TABLE_SIZE macro has not been defined yet, it would be defined to a value of 100. If it already existed it would keep its previous value since the #define directive would not be executed.

The #if, #else and #elif (i.e., "else if") directives serve to specify some condition to be met in order for the portion of code they surround to be compiled. The condition that follows #if or #elif can only evaluate constant expressions, including macro expressions. For example:

#if TABLE_SIZE>200
#undef TABLE_SIZE
#define TABLE_SIZE 200

#elif TABLE_SIZE<50
#undef TABLE_SIZE
#define TABLE_SIZE 50

#else
#undef TABLE_SIZE
#define TABLE_SIZE 100
#endif

int table[TABLE_SIZE];


Notice how the whole structure of #if, #elif and #else chained directives ends with #endif.

The behavior of #ifdef and #ifndef can also be achieved by using the special operators defined and !defined respectively in any #if or #elif directive:

#if !defined TABLE_SIZE
#define TABLE_SIZE 100
#elif defined ARRAY_SIZE
#define TABLE_SIZE ARRAY_SIZE
int table[TABLE_SIZE];



Line control (#line)
When we compile a program and some error happen during the compiling process, the compiler shows an error message with references to the name of the file where the error happened and a line number, so it is easier to find the code generating the error.
The #line directive allows us to control both things, the line numbers within the code files as well as the file name that we want that appears when an error takes place. Its format is:

#line number "filename"


Where number is the new line number that will be assigned to the next code line. The line numbers of successive lines will be increased one by one from this point on.

"filename" is an optional parameter that allows to redefine the file name that will be shown. For example:

#line 20 "assigning variable"
int a?;


This code will generate an error that will be shown as error in file "assigning variable", line 20.


Error directive (#error)
This directive aborts the compilation process when it is found, generating a compilation the error that can be specified as its parameter:
#ifndef __cplusplus
#error A C++ compiler is required!
#endif


This example aborts the compilation process if the macro name __cplusplus is not defined (this macro name is defined by default in all C++ compilers).


Source file inclusion (#include)
This directive has also been used assiduously in other sections of this tutorial. When the preprocessor finds an #include directive it replaces it by the entire content of the specified file. There are two ways to specify a file to be included:
#include "file"
#include


The only difference between both expressions is the places (directories) where the compiler is going to look for the file. In the first case where the file name is specified between double-quotes, the file is searched first in the same directory that includes the file containing the directive. In case that it is not there, the compiler searches the file in the default directories where it is configured to look for the standard header files.
If the file name is enclosed between angle-brackets <> the file is searched directly where the compiler is configured to look for the standard header files. Therefore, standard header files are usually included in angle-brackets, while other specific header files are included using quotes.


Pragma directive (#pragma)
This directive is used to specify diverse options to the compiler. These options are specific for the platform and the compiler you use. Consult the manual or the reference of your compiler for more information on the possible parameters that you can define with #pragma.
If the compiler does not support a specific argument for #pragma, it is ignored - no error is generated.


Predefined macro names
The following macro names are defined at any time:
macro value
__LINE__ Integer value representing the current line in the source code file being compiled.
__FILE__ A string literal containing the presumed name of the source file being compiled.
__DATE__ A string literal in the form "Mmm dd yyyy" containing the date in which the compilation process began.
__TIME__ A string literal in the form "hh:mm:ss" containing the time at which the compilation process began.
__cplusplus An integer value. All C++ compilers have this constant defined to some value. If the compiler is fully compliant with the C++ standard its value is equal or greater than 199711L depending on the version of the standard they comply.


For example:

// standard macro names
#include
using namespace std;

int main()
{
cout << "This is the line number " << __LINE__;
cout << " of file " << __FILE__ << ".\n";
cout << "Its compilation began " << __DATE__;
cout << " at " << __TIME__ << ".\n";
cout << "The compiler gives a __cplusplus value of " << __cplusplus;
return 0;
}
This is the line number 7 of file /home/jay/stdmacronames.cpp.
Its compilation began Nov 1 2005 at 10:12:29.
The compiler gives a __cplusplus value of 1

Additional data type

Defined data types (typedef)
C++ allows the definition of our own types based on other existing data types. We can do this using the keyword typedef, whose format is:
typedef existing_type new_type_name ;


where existing_type is a C++ fundamental or compound type and new_type_name is the name for the new type we are defining. For example:

typedef char C;
typedef unsigned int WORD;
typedef char * pChar;
typedef char field [50];


In this case we have defined four data types: C, WORD, pChar and field as char, unsigned int, char* and char[50] respectively, that we could perfectly use in declarations later as any other valid type:

C mychar, anotherchar, *ptc1;
WORD myword;
pChar ptc2;
field name;


typedef does not create different types. It only creates synonyms of existing types. That means that the type of myword can be considered to be either WORD or unsigned int, since both are in fact the same type.

typedef can be useful to define an alias for a type that is frequently used within a program. It is also useful to define types when it is possible that we will need to change the type in later versions of our program, or if a type you want to use has a name that is too long or confusing.


Unions
Unions allow one same portion of memory to be accessed as different data types, since all of them are in fact the same location in memory. Its declaration and use is similar to the one of structures but its functionality is totally different:

union union_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;
All the elements of the union declaration occupy the same physical space in memory. Its size is the one of the greatest element of the declaration. For example:

union mytypes_t {
char c;
int i;
float f;
} mytypes;


defines three elements:

mytypes.c
mytypes.i
mytypes.f


each one with a different data type. Since all of them are referring to the same location in memory, the modification of one of the elements will affect the value of all of them. We cannot store different values in them independent of each other.

One of the uses a union may have is to unite an elementary type with an array or structures of smaller elements. For example:

union mix_t {
long l;
struct {
short hi;
short lo;
} s;
char c[4];
} mix;


defines three names that allow us to access the same group of 4 bytes: mix.l, mix.s and mix.c and which we can use according to how we want to access these bytes, as if they were a single long-type data, as if they were two short elements or as an array of char elements, respectively. I have mixed types, arrays and structures in the union so that you can see the different ways that we can access the data. For a little-endian system (most PC platforms), this union could be represented as:



The exact alignment and order of the members of a union in memory is platform dependant. Therefore be aware of possible portability issues with this type of use.


Anonymous unions
In C++ we have the option to declare anonymous unions. If we declare a union without any name, the union will be anonymous and we will be able to access its members directly by their member names. For example, look at the difference between these two structure declarations:
structure with regular union structure with anonymous union
struct {
char title[50];
char author[50];
union {
float dollars;
int yens;
} price;
} book;
struct {
char title[50];
char author[50];
union {
float dollars;
int yens;
};
} book;



The only difference between the two pieces of code is that in the first one we have given a name to the union (price) and in the second one we have not. The difference is seen when we access the members dollars and yens of an object of this type. For an object of the first type, it would be:

book.price.dollars
book.price.yens


whereas for an object of the second type, it would be:

book.dollars
book.yens


Once again I remind you that because it is a union and not a struct, the members dollars and yens occupy the same physical space in the memory so they cannot be used to store two different values simultaneously. You can set a value for price in dollars or in yens, but not in both.


Enumerations (enum)
Enumerations create new data types to contain something different that is not limited to the values fundamental data types may take. Its form is the following:

enum enumeration_name {
value1,
value2,
value3,
.
.
} object_names;
For example, we could create a new type of variable called colors_t to store colors with the following declaration:

enum colors_t {black, blue, green, cyan, red, purple, yellow, white};


Notice that we do not include any fundamental data type in the declaration. To say it somehow, we have created a whole new data type from scratch without basing it on any other existing type. The possible values that variables of this new type color_t may take are the new constant values included within braces. For example, once the colors_t enumeration is declared the following expressions will be valid:

colors_t mycolor;

mycolor = blue;
if (mycolor == green) mycolor = red;


Enumerations are type compatible with numeric variables, so their constants are always assigned an integer numerical value internally. If it is not specified, the integer value equivalent to the first possible value is equivalent to 0 and the following ones follow a +1 progression. Thus, in our data type colors_t that we have defined above, black would be equivalent to 0, blue would be equivalent to 1, green to 2, and so on.

We can explicitly specify an integer value for any of the constant values that our enumerated type can take. If the constant value that follows it is not given an integer value, it is automatically assumed the same value as the previous one plus one. For example:

enum months_t { january=1, february, march, april,
may, june, july, august,
september, october, november, december} y2k;


In this case, variable y2k of enumerated type months_t can contain any of the 12 possible values that go from january to december and that are equivalent to values between 1 and 12 (not between 0 and 11, since we have made january equal to 1).

Polymorphism concept

Before getting into this section, it is recommended that you have a proper understanding of pointers and class inheritance. If any of the following statements seem strange to you, you should review the indicated sections:
Statement: Explained in:
int a::b(c) {}; Classes
a->b Data Structures
class a: public b; Friendship and inheritance



Pointers to base class
One of the key features of derived classes is that a pointer to a derived class is type-compatible with a pointer to its base class. Polymorphism is the art of taking advantage of this simple but powerful and versatile feature, that brings Object Oriented Methodologies to its full potential.
We are going to start by rewriting our program about the rectangle and the triangle of the previous section taking into consideration this pointer compatibility property:

// pointers to base class
#include
using namespace std;

class CPolygon {
protected:
int width, height;
public:
void set_values (int a, int b)
{ width=a; height=b; }
};

class CRectangle: public CPolygon {
public:
int area ()
{ return (width * height); }
};

class CTriangle: public CPolygon {
public:
int area ()
{ return (width * height / 2); }
};

int main () {
CRectangle rect;
CTriangle trgl;
CPolygon * ppoly1 = ▭
CPolygon * ppoly2 = &trgl;
ppoly1->set_values (4,5);
ppoly2->set_values (4,5);
cout << rect.area() << endl;
cout << trgl.area() << endl;
return 0;
}
20
10


In function main, we create two pointers that point to objects of class CPolygon (ppoly1 and ppoly2). Then we assign references to rect and trgl to these pointers, and because both are objects of classes derived from CPolygon, both are valid assignment operations.

The only limitation in using *ppoly1 and *ppoly2 instead of rect and trgl is that both *ppoly1 and *ppoly2 are of type CPolygon* and therefore we can only use these pointers to refer to the members that CRectangle and CTriangle inherit from CPolygon. For that reason when we call the area() members at the end of the program we have had to use directly the objects rect and trgl instead of the pointers *ppoly1 and *ppoly2.

In order to use area() with the pointers to class CPolygon, this member should also have been declared in the class CPolygon, and not only in its derived classes, but the problem is that CRectangle and CTriangle implement different versions of area, therefore we cannot implement it in the base class. This is when virtual members become handy:


Virtual members
A member of a class that can be redefined in its derived classes is known as a virtual member. In order to declare a member of a class as virtual, we must precede its declaration with the keyword virtual:
// virtual members
#include
using namespace std;

class CPolygon {
protected:
int width, height;
public:
void set_values (int a, int b)
{ width=a; height=b; }
virtual int area ()
{ return (0); }
};

class CRectangle: public CPolygon {
public:
int area ()
{ return (width * height); }
};

class CTriangle: public CPolygon {
public:
int area ()
{ return (width * height / 2); }
};

int main () {
CRectangle rect;
CTriangle trgl;
CPolygon poly;
CPolygon * ppoly1 = ▭
CPolygon * ppoly2 = &trgl;
CPolygon * ppoly3 = &poly;
ppoly1->set_values (4,5);
ppoly2->set_values (4,5);
ppoly3->set_values (4,5);
cout << ppoly1->area() << endl;
cout << ppoly2->area() << endl;
cout << ppoly3->area() << endl;
return 0;
}
20
10
0


Now the three classes (CPolygon, CRectangle and CTriangle) have all the same members: width, height, set_values() and area().

The member function area() has been declared as virtual in the base class because it is later redefined in each derived class. You can verify if you want that if you remove this virtual keyword from the declaration of area() within CPolygon, and then you run the program the result will be 0 for the three polygons instead of 20, 10 and 0. That is because instead of calling the corresponding area() function for each object (CRectangle::area(), CTriangle::area() and CPolygon::area(), respectively), CPolygon::area() will be called in all cases since the calls are via a pointer whose type is CPolygon*.

Therefore, what the virtual keyword does is to allow a member of a derived class with the same name as one in the base class to be appropriately called from a pointer, and more precisely when the type of the pointer is a pointer to the base class but is pointing to an object of the derived class, as in the above example.

A class that declares or inherits a virtual function is called a polymorphic class.

Note that despite of its virtuality, we have also been able to declare an object of type CPolygon and to call its own area() function, which always returns 0.


Abstract base classes
Abstract base classes are something very similar to our CPolygon class of our previous example. The only difference is that in our previous example we have defined a valid area() function with a minimal functionality for objects that were of class CPolygon (like the object poly), whereas in an abstract base classes we could leave that area() member function without implementation at all. This is done by appending =0 (equal to zero) to the function declaration.
An abstract base CPolygon class could look like this:

// abstract class CPolygon
class CPolygon {
protected:
int width, height;
public:
void set_values (int a, int b)
{ width=a; height=b; }
virtual int area () =0;
};


Notice how we appended =0 to virtual int area () instead of specifying an implementation for the function. This type of function is called a pure virtual function, and all classes that contain at least one pure virtual function are abstract base classes.

The main difference between an abstract base class and a regular polymorphic class is that because in abstract base classes at least one of its members lacks implementation we cannot create instances (objects) of it.

But a class that cannot instantiate objects is not totally useless. We can create pointers to it and take advantage of all its polymorphic abilities. Therefore a declaration like:

CPolygon poly;


would not be valid for the abstract base class we have just declared, because tries to instantiate an object. Nevertheless, the following pointers:

CPolygon * ppoly1;
CPolygon * ppoly2;


would be perfectly valid.

This is so for as long as CPolygon includes a pure virtual function and therefore it's an abstract base class. However, pointers to this abstract base class can be used to point to objects of derived classes.

Here you have the complete example:

// abstract base class
#include
using namespace std;

class CPolygon {
protected:
int width, height;
public:
void set_values (int a, int b)
{ width=a; height=b; }
virtual int area (void) =0;
};

class CRectangle: public CPolygon {
public:
int area (void)
{ return (width * height); }
};

class CTriangle: public CPolygon {
public:
int area (void)
{ return (width * height / 2); }
};

int main () {
CRectangle rect;
CTriangle trgl;
CPolygon * ppoly1 = ▭
CPolygon * ppoly2 = &trgl;
ppoly1->set_values (4,5);
ppoly2->set_values (4,5);
cout << ppoly1->area() << endl;
cout << ppoly2->area() << endl;
return 0;
}
20
10


If you review the program you will notice that we refer to objects of different but related classes using a unique type of pointer (CPolygon*). This can be tremendously useful. For example, now we can create a function member of the abstract base class CPolygon that is able to print on screen the result of the area() function even though CPolygon itself has no implementation for this function:

// pure virtual members can be called
// from the abstract base class
#include
using namespace std;

class CPolygon {
protected:
int width, height;
public:
void set_values (int a, int b)
{ width=a; height=b; }
virtual int area (void) =0;
void printarea (void)
{ cout << this->area() << endl; }
};

class CRectangle: public CPolygon {
public:
int area (void)
{ return (width * height); }
};

class CTriangle: public CPolygon {
public:
int area (void)
{ return (width * height / 2); }
};

int main () {
CRectangle rect;
CTriangle trgl;
CPolygon * ppoly1 = ▭
CPolygon * ppoly2 = &trgl;
ppoly1->set_values (4,5);
ppoly2->set_values (4,5);
ppoly1->printarea();
ppoly2->printarea();
return 0;
}
20
10


Virtual members and abstract classes grant C++ the polymorphic characteristics that make object-oriented programming such a useful instrument in big projects. Of course, we have seen very simple uses of these features, but these features can be applied to arrays of objects or dynamically allocated objects.

Let's end with the same example again, but this time with objects that are dynamically allocated:

// dynamic allocation and polymorphism
#include
using namespace std;

class CPolygon {
protected:
int width, height;
public:
void set_values (int a, int b)
{ width=a; height=b; }
virtual int area (void) =0;
void printarea (void)
{ cout << this->area() << endl; }
};

class CRectangle: public CPolygon {
public:
int area (void)
{ return (width * height); }
};

class CTriangle: public CPolygon {
public:
int area (void)
{ return (width * height / 2); }
};

int main () {
CPolygon * ppoly1 = new CRectangle;
CPolygon * ppoly2 = new CTriangle;
ppoly1->set_values (4,5);
ppoly2->set_values (4,5);
ppoly1->printarea();
ppoly2->printarea();
delete ppoly1;
delete ppoly2;
return 0;
}
20
10


Notice that the ppoly pointers:

CPolygon * ppoly1 = new CRectangle;
CPolygon * ppoly2 = new CTriangle;


are declared being of type pointer to CPolygon but the objects dynamically allocated have been declared having the derived class type directly.