RAII in C#

We’ve already discussed RAII in C++, now it’s time to see how we implement the same pattern in C#. We don’t manage our memory in C#, however we still need to manage other resources, like network and database conections, files and mutexes. In C++, whenever control leaves a block, the local objects defined in that block are destroyed. In C# this is not the case because the runtime uses garbage collection to manage memory. This means that the runtime periodically looks for objects that are no long referenced and deletes those that it finds. So, unfortunately, we cannot use the same trick manage resources.

In order to guarantee that a resource is released in C# we can use a finally block like so:

// acquire a resource
try
{
   // use resource and do other stuff as well
}
finally
{
   // release resource
}

A finally block will be executed whenever control leaves the try block, either because it reaches the end of the block, it reaches a return statement of an exception is thrown.

We can also encapsulate resource management with a class like in C++. In C# we normally do this by implementing the IDisposable interface. The IDisposable interface looks like this:

interface IDisposable
{
    void Dispose();
}

We acquire the resource in the constructor and we release the resource in the Dispose method.

class ResourceHolder : IDisposable
{
   public ResourceHolder()
   {
       // acquire resource here
   }

   public void Dispose()
   {
       // release resource
   }
}

The Dispose method of our ResourceHolder must be called. It does not happen automatically like a destructor in C++. We can combine a class implementing IDisposable with the finally block like this:

var holder = ResourceHolder()
try
{
   // use resource and do other stuff as well
}
finally
{
   holder.Dispose();
}

In fact, this pattern is so useful that some syntactic sugar exists for it, the using statement.

using(var holder = ResourceHolder())
{
   // use resource and do other stuff as well
}

The above code is exactly equivalent to our previous code example with an explicit finally block. Whenever control exits the using block, the Dispose method of the object declared inside the using statement is called.

RAII is a great example of something that ends up being more complicated in C# than it is in C++. Usually, things are easier in C#!

Don’t Get Caught out by Covariance!

Downcasting is bad, you shouldn’t do it, it’s code smell and it’s an anti-pattern. Unfortunately, in real world code you will see plenty of it. So you need to be aware of the pottential pitfalls of using downcasting. One that caught me out recently is covariance.

In C++, you can quite freely cast between types. You can take a pointer to some memory and cast it to any type you like, and read out what you get. Generally speaking you won’t get anything useful. If you’re not careful you’ll get “undefined behaviour”. If you have data stored in memory, the assumption is that you know that type that data is, and you are responsible for using it appropriately.

In languages like Java and C#, things are different. Here, the runtime checks our casts and will throw an exception if it thinks you have gone wrong. The consequences of this difference can sometimes be surprising.

Let’s look at an example. Suppose we have a base class Security defined like so:

public class Security
{
    int Id;
    public Security(int id)
    {
        Id = id;
    }
}

and two subclasses, Stock:

public class Stock : Security
{
    string Name;
    public Stock(int id, string name) : base(id)
    {
        Name = name;
    }
}

and Bond:

public class Bond : Security
{
    double Rate;
    public Bond(string name, double rate) : base(id)
    {
        Rate = rate;
    }
}

Now if we have a reference of type Security, that actually points to a Stock, we can happily downcast it to a reference of type Stock, like this:

Security s1 = new Stock(1, "GOOG");
Stock AsStock = (Stock)s1;
Console.WriteLine(s1.Name);

Because, Stock is a subtype of Security we can cast a reference to a Stock to a reference to a Security. This works because every Stock will have all the fields of a Security, in the same relative locations in memory.

However, if you have a reference of type Security that points to a Security or a Bond, and try and cast it to a Stock, you’ll have trouble. If we run the following code

Security s1 = new Bond(1, 0.02);
Stock AsStock = (Stock)s1;
Console.WriteLine(s1.Name);

we will see a run time exception of the form:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Casting.Bond' to type 'Casting.Stock'.

This makes sense, the runtime knows that the object we have a reference to is not a Stock. It knows that we can’t cast this object to a stock in a sensible way. So the runtime stops us by throwing an exception.

Let’s look at a slightly more complicated example. Suppose, instead of a single object, we had a whole array of them. The following code:

Security[] securities = new Stock[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
Stock[] stocks = (Stock[]) securities; 
Console.WriteLine(stocks[0].Name);

will run happily. We can cast a reference of type Security[] that points to a Stock[], to a reference of type Stock[]. However if we try

Security[] securities = new Security[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
Stock[] stocks = (Stock[]) securities; 
Console.WriteLine(stocks[0].Name);

we will get an InvalidCastException:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Casting.Security[]' to type 'Casting.Stock[]'.

This might seem a little surprising. The objects in our array are still actually Stocks. We know that we can cast a reference of type Security that points to a Stock to a reference of type Security. Why can’t we cast the Security[] reference to a Stock[] reference?

It’s a subtle one. When we cast an array reference, we are not casting the objects in the array, we are casting the array itself. So in the first array example, we are casting a reference of type Security[] to a reference of type Stock[]. The runtime knows that the reference actually points to an object of type Stock[], so this is fine. There will only ever be Stock objects in this array. Even though we have a reference of type Security[] pointing to this array, we can’t do something like:

Security[] securities = new Stock[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
securities[0] = new Bond(2, 0.02);

the runtime knows that our array is of type Stock[], to it throws an exception:

Unhandled exception. System.ArrayTypeMismatchException: Attempted to access an element as a type incompatible with the array.

However, in the second example, we have a reference of type Security[] that points to an array of type Security[]. Although this array only contains stocks, the runtime cannot in general say whether that is true or not. Suppose we had done something like this instead:

Security[] securities = new Security[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
securities[0] = new Bond(2, 0.02);
Stock[] stocks = (Stock[]) securities; 

We can of course add a Bond to an array of type Security[], the runtime doesn’t keep track of all this, which is why it has no way of knowing if the securities array really does contain objects of type Stock or something else.

The name of this type of casting is Covariance. Eric Lippert, one of the original designers of C#, has a pretty good blog about it. The really important point is that when we have an array of some base type, even if we are able to downcast the individual members of that Array, the runtime might stop us from downcasting the entire array.

This tripped me up in my day job last week. I was performing a data load that returned an array of type Security[], I knew this array contained only objects of type Stock and so I cast it to Stock[]. I merged this into master, but then our regression tests failed. Thankfully my mistake was caught before making it into prod.

What is Single Inheritance, and Why is it Bad?

In a previous post we talked about how there are two different notions of inheritance, Interface Inheritance and Implementation Inheritance. In this post we’ll be talking primarily about the latter. That means, we’ll be talking about using inheritance to share actual logic among classes.

Let’s say we defined two base classes. The first is DatabaseAccessor:

class DatabaseAccessor {
   protected:
   bool WriteToDatabase(string query) {
   // Implementation here
   }

   string ReadFromDatabase(string query) {
   // Implementation here
   }
}

which encapsulates our logic for reading from and writing to some database. The second is Logger:

class Logger {
   protected:
   void LogWarning(string message) {
   // Implementation here
   }

   void LogError(string message) {
   // Implementation here
   }
}

this encapsulates our logic for logging errors and warnings. Now, suppose also that we are using a language the supports multiple inheritance, such as C++. If we want to define a class that shares functionality with both of these classes, we can subclass both of them at once:

class SomeUsefulClass : private DatabaseAccessor, private Logger {
   // Some useful code goes here
}

So, we have a subclass named SomeUsefulClass that can use both the DatabaseAccessor and Logger functionality, great!

What if we wanted to do the same thing in a language like C# or Java? Well then we’re out of luck, in C# and Java you cannot inherit from more than one base class. This is called single inheritance. How do we achieve the same effect if we are using either of these languages? Well, one solution would be to chain our subclasses together. We would choose which of DatabaseAccessor and Logger is more fundamental and make it the subclass of the other. Suppose we decided that everything has to log, then the Logger class remains the same and DatabaseAccessor becomes:

class DatabaseAccessor : private Logger {
   protected:
   bool WriteToDatabase(string query) {
   // Implementation here
   }

   string ReadFromDatabase(string query) {
   // Implementation here
   }
}

So now we can subclass DatabaseAccessor and get the Logger functionality as well. Although, what if we really did just want the DatabaseAccessor logic, and we didn’t actually need the Logger implementation? Well tough luck. Now everything that inherits from DatabaseAccessor inherits from Logger as well.

This might not seem like much of a problem with something as trivial as Logger, but in big enterprise applications, it can spiral out of control. You end up in a situation where all the basic re-usable code you might need is locked inside a lengthy chain of subclasses. What if you only need something from the bottom ring of this chain? Unfortunately, you have to pick up all of it’s base classes as well. This makes our code unnecessarily complicated and harder to read and understand. If one of those unwanted base classes does all kinds of heavy weight initialisation on construction, then it will have performance implications as well.

One consolation that is often offered is that both languages allow multiple inheritance of interfaces. This doesn’t really help us though. Implementation inheritance and Interface Inheritance are two completely different things. We can’t convert one of DatabaseAccessor or Logger into an interface. The entire reason we want to inherit from them is to get their implementation!

We could also use composition instead of inheritance. In this case we would inherit from one of the classes, and keep a reference to the other. But, in that case, why even use single inheritance at all? Why not just let our classes keep references to a Logger and a DatabaseAccessor? The language designers have struck a bizarre compromise here, we can use inheritance, but only a little bit. If C# and Java are Object Oriented, then they should allow us to use the full features of object orientation, rather than just a flavour.

The good news is that the people behind C#, Microsoft, have realised the error in their ways. They have released two features that ameliorate the problem of single inheritance, Extension Methods and Default Implementations in Interfaces. More on these in future blog posts.

The Two Different Types of Inheritance

If, like me, you started life as a C++ developer, you will be surprised to learn that there are actually two completely different notions of Inheritance. In C++, these two different notions get squashed together, but they’re still there.

Imagine we are defining a base class to represent a financial security. In C++ it might look something like this:

class Security {
   public:
   virtual double GetPrice(double interestRate) = 0;
}

Any class that inherits from this base class has to implement the GetPrice method. Apart form that, they may have nothing else in common.

Now, suppose we are writing a base class that encapsulates our database access logic. Again, in C++, it might look a little like this:

class DatabaseAccessor {
   private:
   bool WriteToDatabase(string query) {
   // Implementation here
   }

   string ReadFromDatabase(string query) {
   // Implementation here
   }
}

These two examples illustrate the two different notions of inheritance, interface inheritance and implementation inheritance.

Interface inheritance, also known as type inheritance, is when we use inheritance to specify a common api across different classes. In our first example, every class that implements the Security class will have a public GetPrice method that takes a double argument and returns a double. There can be lots of different Security subclasses, that all calculate their price differently. But they all share the same interface for this functionality. This is interface inheritance.

Now, let’s look at our second example. Any class that inherits from DatabaseAccessor will have a shared implementation for reading from and writing to databases. We are using inheritance here to make our code easier to maintain and re-use. This is implementation inheritance.

When we write C++ we don’t normally distinguish between these two things. Indeed we can mix them freely. For example, suppose we had another class:

class RateCurve {
public:
   virtual double GetRateAt(time_t date);
   double Discount(double value, time_t date) {
      return value * GetRateAt(date);
   }
}

The method GetRateAt defines an interface as it’s a virtual function with no implementation. The method Discount comes with an implementation that will be shared between subclasses. So whenever we inherit from RateCure we will be using interface and implementation inheritance at the same time!

In a language like C# things are quite different. Interface and Implementation inheritance are handled explicitly and separately. If you would like to share implementation you inherit from a class. If you would like to use a common API you implement an interface. Interfaces are defined like so:

public interface ISecurity {
   public double GetPrice(double rate);
}

An interface cannot contain any fields, and (at least up to C# 8.0) interface methods cannot have implementations. This all means that it is a little bit easier to reason about our code. The disadvantage is that we cannot use both interface and implementation inheritance at the same time.

Book Review – Functional Programming in C#

I read Enrico Buonanno’s Functional Programming in C# directly after Jon Skeet’s C# in depth. Unlike Skeet’s book, this is not a book about C#, it is a book about functional programming which uses C# is the medium of instruction.

What makes this particularly interesting is that C# isn’t really a functional language. Usually functional programming is discussed in terms of niche explicitly functional languages like Haskell. By using C#, Buonanno makes the principles and practices of functional programming a lot more accessible to the average programmer. He does advocate mixing the more traditional object oriented style of C# with functional programming, but this is not something that really comes through in the code samples.

This book covers the basic concepts of functional programming really well. He explains how and why to avoid state mutation, the concept of functions as first class citizens and higher order functions, function purity and side effects, Partial application and currying and lazy computation. One thing that interested me particularly is the good case the author makes that pure functions should not throw exceptions.

He does not spend a long time justifying functional programming. The two main benefits he repeatedly highlights are easier testing and better support for concurrency. It is clear that a functional approach is more suited to concurrent programming. However the claim that functional code is easier to test seems somewhat dubious to me, and is not really backed up with credible examples. He also claims that functional programming leads to cleaner code. Again, I am somewhat skeptical.

I really enjoyed the discussion of user defined types. In particular the pattern of creating new types that wrap low level types and add some semantic meaning. A great example he uses is an age type. This has only one member, an integer. It’s constructor will only accept valid values for human ages. This means that we have added an extra layer of static type checking to this type: when we want an age, we use the age class, and not just an integer. It also means we don’t have to perform extra checks when using an age type, as we know it’s value was checked when it was initialised.

The material on LINQ was also quite good. LINQ is strongly functional in style, it emphasizes data flow over mutation, composability of functions and pureness. This is all well explained, the author even shows how to integrate user defined monads into LINQ. However LINQ is not dealt with in a single place in a systematic way, which is disappointing. Another gripe I have is that the author uses LINQ query syntax. I do not like query syntax. Not only is it ugly, it is usually incomprehensible.

Buonanno uses Map, Bind and Apply to introduce functors, monads and applicatives in a very practical way. He also goes into detail explaining how and why to use the classic monads Option and Either. We even see variations on the Either monad that can be used for error handling and validation. Both the applicative and monadic versions of Traverse appear near the end of the book as well. In my opinion, Monad stacking is one of the worst anti-patterns of functional programming. So it is disappointing that it only gets very limited coverage. I would also have liked if he had used his excellent examples as a jumping off point to dig deeper into category theory. Perhaps however, that would have been too abstract for what is quite a practical book.

Handling state in a functional way is covered well, but it feels a bit academic. It is hard to imagine applying the patterns he covers in a real world code base. Sometimes you just have to use state! The final few chapters cover IObservables, the agent model and the actor model. These sections were quite interesting but felt a little out of place. They really merit a much deeper dive.

When I finished this book I had a much greater appreciation for functional programming. In particular it gave me lots of ideas of how I could practically apply it in my real life work. The code samples were all very good, occasionally though they were a little convoluted. Indeed they sometimes seemed like evidence against a functional style. But, as someone relatively new to functional programming I appreciated how grounded it was in real world examples. Overall, this book is an excellent resource for C# programmers who want to add a little functional flourish to their code. I highly recommend it.

C# In Depth – Book Review

Jon Skeet is a bit of a legend. He has the highest reputation score on Stack overflow. He got there because of his consistently patient, helpful and correct answers. He is probably the most prominent C# developer there is. So, when I started a new job as a C# developer, I decided to read Skeet’s book, C# in Depth.

It’s a very good book. There is one big problem however, the structure. This book is divided into five parts, each dealing with a successive major numbered release of C#. This chronological structure is quite strange. The overriding assumption of the author is that the reader is familiar with C# 1. Given that C# 2 was released 13 years ago, this is a pretty strange angle. It’s hard to imagine there are many programmers today who are familiar with C# 1 but need a detailed walk through of the new additions to the language in C# version 2 to 5. The book ends up a sort of mix between a history of C# and an intermediate user’s guide.

One example of the problem with this structure is how delegates are covered. They are first introduced briefly in chapter 1. Improvements to the delegate syntax in C# 2 are then covered in detail in chapter 5. In neither of these chapters is there a clear explanation of what delegates actually are or why they are part of C#. Indeed when we reach chapter 10 Skeet covers lambda expressions, which, in reality, make delegates redundant for most use cases.

Another victim of the unorthodox structure is the coverage of class properties. Modern C# syntax allows us to define properties in a very quick intuitive manner. In this book, first we learn about properties as they originally appeared in C# 1. Then, in chapter 7, we see how C# 2 allowed a mix of public getters with private setters. Finally in chapter 8 we see how properties are actually implemented in modern C#.

There is of course a benefit to covering older versions of the language in detail. C# is a language designed for enterprise development. So, if you code in it, you are likely to be working with a large legacy code base. This means that understanding what the language looked like in it’s various iterations is useful. However, these topics would be a lot better served if they were covered all at once, rather than being split over multiple chapters.

Skeet spends a lot of time covering Linq, which is great. Linq is a really cool feature of C#, and he covers cool details, like how to use extension methods and iterators to integrate your own code into LINQ. He also covers the query expression Linq syntax. This is the syntax that lets your write a linq expression in the style of a SQL query. Frankly I think Linq expression syntax is a monstrosity and should never be used, but it is probably useful to cover it, and explain how it works (it’s really just syntactic sugar for the normal linq syntax). There is also a useful section on async code, that gets into a lot of really useful detail.

Overall, Skeet has an ability to make some quite obscure topics interesting and accessible. He always presents new ideas with realistic and useful code snippets. Most important of all, he writes in a fun conversational style, that makes reading his book a lot more fun than a typical intermediate language guide.