Templates vs. Virtual functions

You have probably heard about polymorphism before. Well, there are at least two different kinds of polymorphism in C++. Today, we’re going to talk about run time polymorphism via inheritance and compile time polymorphism via templates. Most people are familiar with polymorphism via inheritance. Typically we encounter it early in our programming career, often via some strange example about animals making various noises. Template polymorphism is not so well loved, but in this blog, we are going to give it a chance!

What is polymorphism anyway?

Polymorphism is type abstraction. We are doing polymorphism, when, we ignore some of the specifics of a given type, and treat it as different type, for example through an interface. The best way to understand it is via examples.

Polymorphism via inheritance

Let’s try our hand at a little subclass polymorphism. We are going to create a very simple base class with a virtual method. We will also create a subclass that inherits form this base class, and overrides our virtual base class method.

class SimpleBaseClass
{
public:
    virtual void DoSomethingFun() 
    {
        // put some fun stuff in here
    }
}

class SimpleSubClass : public SimpleBaseClass
{
public:
    void DoSomethingFun()
    {
        // put some different fun stuff here
    }
}

Now we can provide many different implementation of the DoSomethingFun method in different subclasses, and we can use them like so:

int main()
{
    SimpleBaseClass* polyObject = new SimpleSubClass();
    polyObject->DoSomethingFun();
    return 0;
}

When this all compiles, the compiler will create a special lookup table called a vtable. Then, because this is a virtual method, at run time, when the DoSomethingFun method is called on the polyObject pointer, the vtable will be used to find the implementation in the SimpleSubClass. That was polymorphism!

Polymorphism via templates

We have covered the bread and butter of polymorphism. However, we can achieve the same effect with templates! To do this, we are going to define a templated class, and two different implementation classes:

class FunImplementationClass
{
public:
    FunImplementation()
    {
        // put some fun stuff here
    }
}

class AnotherFunImplementationClass
{
public:
    FunImplementation()
    {
        // put some different fun stuff here
    }
}

template <typename T>
public TemplateBaseClass<T>
{
private:
    T implementation;
public:
    TemplateBaseClass(T implementation) : implementation(implementation)
    {
    }

    void DoSomethingFun()
    {
        implementation.FunImplementation();
    }
}

Now we have created a sort of base class like before. But, instead of defining different implementations in subclasses that inherit from this base class, we define them in classes that we pass in via template type parameters.

And we can use this in a very similar (and slightly more verbose) way to the traditional object oriented approach:

int main()
{
    TemplateBaseClass<FunImplementationClass> polyObject = TemplateBaseClass<FunImplementationClass>(FunImplementation());
    polyObject.DoSomethingFun()
    return 0;
}

This time, the compiler will use our templating to resolve which implementation to use, and it will do this at compile time, not run time.

Why would we even want to use templates like this?

There are some disadvantages to the template approach. The most obvious of which is that template programming is not as well known. If you use it, the other people who read, maintain and extend your code, are less likely to be familiar with this technique. So it will be a lot harder for them!

The other, is that templated code is a lot harder to debug. In particular, if you have ever had a compiler error from templated code, you will recognise the error message gore it produces.

I have also heard that templated code is liable to create very large binaries, as so much extra code is generated by the compiler. However I have never seen this being a problem in real life.

With all that said, why might you sill choose to use the templated version? The main reason is performance. There is a performance penalty associated with run time polymorphism. When the correct method is resolved at run time, it requires a look up in the vtable which adds a small but non-zero performance penalty.

In addition to this, when we use compile time polymorphism, the compiler knows exactly what code path we are going to follow. When we use run time polymorphism, that will only be resolved at run time. This means that the compiler has much greater flexibility to optimise our templated code. This can give us a significant performance boost!

C++ Template Meta Programming – Part 1

You may have heard of template meta programming before. It is a technique that uses the features of C++ templates to shift program execution from runtime to compile time. Basically, we are going to trick the compiler into running our code at compile time and then putting the output into our executable.

Let’s look at an example:

#include <iostream>

template<int n>
struct Factorial
{
    enum { value = n * Factorial<n - 1>::value };
};

template<>
struct Factorial<0>
{
    enum {value = 1};
};

int main() {
    std::cout << Factorial<5>::value << std::endl;
    return 0;
}

What does this code do? Well, first we define a templated struct that takes an integer as a parameter.

template<int n>
struct Factorial
{
    enum { value = n * Factorial<n - 1>::value };
};

Notice, that inside this struct, we define a field named value. This field is defined to be n times the value field of Factorial<n-1>. We are also using total template specialisation:

template<>
struct Factorial<0>
{
    enum {value = 1};
};

This means that if we ever specialise the Factorial template with the parameter 0 it will use this version of Factorial that defines the value field to be 1.

So, when we specialise the Factorial template with the parameter 5, the compiler creates the classes Factorial<5>, Factorial<4>, Factorial<3>, Factorial<2>, Factorial<1> and Factorial<0>. Then it computes the value field for each of these classes. The value field for Factorial<0> is 1. The compiler calculates the value field for each successive Factorial<n> by multiply n times the value field of the previous Factorial specialisation. So we have defined a recursion that calculates the factorial of an integer. What is weird is that this calculation happens entirely at compile time. When we run this code, the first six values of the factorial series have already been calculated, and we will just be reading the last value from the code the compiler has created.

We can do exactly the same thing with the Fibonacci numbers:

#include <iostream>

template<int n>
struct Fibonnaci
{
    enum { value =  Fibonnaci<n - 1>::value + Fibonnaci<n - 2>::value };
};

template<>
struct Fibonnaci<0>
{
    enum {value = 1};
};

template<>
struct Fibonnaci<1>
{
    enum {value = 1};
};

int main() {
    std::cout << Fibonnaci<5>::value << std::endl;
    return 0;
}

So we have seen how to do recursions via template meta programming. Next we will see how to write conditional statements that are evaluated at compile time.

Generic Programming, C++ vs C#

At first glance, C++ and C# are two very similar languages. However the more you look, the more you realise that they are two very different beasts. Today we are going to talk about how these two languages handle generic programming.

The Implementation

First lets look at how the two languages implement generic programming. In C++ we can define a template for a class like this:

template<typename T>
class Holder
{
public:
   T value;
};

We can then use that template in our code like so:

Holder<int> intHolder;

When the compiler sees this usage, it goes back to the definition of the Holder class and generates a new class. In this new class it replaces the type variable T with the type int. This new class generated by the compiler is what the name Holder<int> refers to. This class gets compiled as normal. It would be completely equivalent if we had instead defined a Holder class specialised to ints ourselves:

class Holder_int
{
public:
   int value;
};

If we also specialise our Holder class with the type double, the compiler will generate another version of Holder, this time with type variable T replaced by the type double.

Now, in C# the equivalent to the above example is:

class Holder<T>
{
    public T value;
}

This looks very similar, however the underlying implementation is very different. In C# generics are resolved at runtime not at compile time. The Holder class we defined will remain generic when it is complied to intermediate language. When we run code that uses the Holder class, the just in time compiler generates the different machine code implementations for whatever specialisation of Holder we have used. The JIT compiler will generate separate versions of the Holder class for each value type used. However, it generates a single implementation that is shared for all reference types. This works because reference types are implemented as pointers, which are always the same size, regardless of the type they are pointing to.

Constraints

Usually we need more than just a typename in our generic code. We want to do things with the values of that type. So, suppose we wanted to add a method to our generic Holder class that calls a length method on our inner object of type T. In C# we do it like this:

public interface IHasLength
{
   int length();
};

public class Holder<T> where T : IHasLength
{
   public T value;
   int GetLength()
   {
      return T.length();
   }
}

The syntax, class Holder where T : IHasLength specifies that whatever type this generic is specialised with must implement the interface IHasLength. With that constraint, the compiler knows that whatever type we specialise this class with will have a Length method, so we can call it freely in our generic code. There are five different kinds of constraint we can impose. We have just seen the interface constraint. We can also constrain a type parameter to be a subclass of a given class, to have a default constructor, to be a value type or to be a reference type.

In C++ we don’t need explicit type constraints. Instead there are implicit type constraints. For example, suppose we have the following code:

template<typename T>
class Holder
{
public:
   T value;
   int GetLength()
   {
      return value.length();
   }

}

Here, we have just written our code assuming that an object of type T has a Length method that returns a value that can be cast to an integer. We do not need to specify this constraint ourselves. We can instantiate this class and specify a specific type, for example via:

Holder<string> stringHolder();

The compiler then generates the version of Holder specialised to the type string. This new class is then compiled as normal. As the string class has a length method, this will compile successfully. However, suppose we instantiated our class like so:

Holder<int> intHolder();

The int type does not have a length method, so the class generated by this code will not compile.

Whereas the constraints on C# generic types are quite limited, in C++ we can use any constraint at all. For example, it is impossible to impose a constraint on a generic type parameter in C# that it must have an addition operator. This is because there is no interface that all types that have an addition operator implement. However, in C++, we just use the addition operator in our template code and the compiler will make sure that the specific types we use have an addition operator. This all works in C++ because the types are resolved at compile time, so the compiler can check that types satisfy the implicit interface then. In C# the type substitution does not actually happen until runtime, so it would be impossible for the C# compiler to do a check like this.

Beyond Type Parameters

Another cool feature of C++ templates is that we can pass values to our templates as well as type parameters. For example, in C++ the following code is valid:

tempate<int n>
bool LessThan(int x)
{
   return x < n;
}

This defines a template function with a parameter of type int. We can use this template like so:

bool result = LessThan<10>(5);

The code LessThan<10> causes the compiler to generate a version of the LessThan method with 10 substituted for the parameter n. You can’t do anything like this in C#.

Overall, C++ templates are a lot more powerful and flexible than C# generics. Of course this means that it is a lot easier to trip yourself up with C++ templates. And when things do go wrong the compiler output is probably going to be completely incomprehensible. As C++ generates separate code for every different version of a template that gets used this has the potential to create very large binaries. However, on a modern system this probably won’t really be a problem.