Wednesday, December 28, 2011

Concepts and Axioms in C++

C++11, the most recently published standard for the world's third most popular programming language, is admittedly a little bit of a disappointment to me. For years, one of the big features that were supposed to be added to the standard was concepts. Suppose I'm writing a function in C++ for a binary search tree.

class bst_node
   int value;

But what if I want to have a binary tree which can hold a bool instead? Need I re-code the whole class? Need I make some complicated hierarchy of inheritance? Need I use the dreaded C void* type? Templates to the rescue!

Just as the object-oriented programming paradigm allows one to instantiate an object from a class, the generic programming paradigm, as implemented by C++, abstracts one step further by allowing one to instantiate a class (or function) from a templateObserve:

template <typename T>
class bst_node
   T value;

If I want a binary search tree integer node, I simply create an object of type bst_node<int>, and if I want one which holds a bool, I use bst_node<bool>. For each different type parameter which I pass to the bst_node class template, the compiler will, at compile time, instantiate an instance of the bst_node class which replaces all instances of T with whatever type I pass in. Not bad, eh? Now, I should point out that templates, when abused, can cause massive code bloat, so it's always good to keep in mind what the compiler is doing behind the scenes...But that's a blog post for another day.

There is, however, one key problem with the template mechanism: There is a complete and total lack of type-safety enforced by it. Suppose I create a function template as follows:

template <typename T>
int Compare (const T& obj1, const T& obj2)
   if (obj1 < obj2) return -1;
   else return (obj1 > obj2);

Now, if the meaning of the operators '<' and '>' are defined for the type T, then all is well and good. But suppose we make a function call similar to what follows:

Compare<bst_node<int> >(node1, node2);

If we have not defined a function operator< for the template class bst_node (and in particular, the instantiated class bst_node<int>) - assume we have not - then we will be treated to a whole host of horrendous compiler errors. After all, the operator '<' has no meaning for the class; would you expect anything different?

The notion of concepts was supposed to solve this. Concepts were supposed to provide a lexical mechanism for specifying requirements which restrict what types can be passed into a type parameter when instantiating a template. On the most basic level, concepts would allow you to say "Whatever type is passed in here, it must implement the '<' operator!" And voila! C++ takes care of the rest. As Wikipedia astutely points out, the implications on formal software verification (and general type safety) are tremendous. Perhaps even more incredible is the notion of axioms. Were axioms implemented, concepts could be taken a step further to specify that a type must have operations which satisfy certain "axioms", for example associativity. Anybody who has studied any abstract mathematics will barely be able to contain his or her excitement at the prospect of having such power available at the lexical level. Algebra libraries would never be the same.

Sadly, neither concepts nor axioms made it into the C++11 standard (instead we got lambda expressions and a redefined auto keyword), and we are currently still stuck with templates such as they are...not that that's so bad. Templates are, of course, an extraordinarily powerful tool such as they are, and my trivial example here did not even scratch the surface of their usefulness. I have been playing around with template metaprogramming for the past week or so; perhaps sometime in the near future I will write a post on that.

Finally, for those interested in the "concept" idea, I understand that there are other languages out there that have implemented similar functionality. Although I have not gone deeply into it myself, the functional programming language Haskell boasts an extremely sophisticated type system which supports, among many other things, the notion of type classes: Haskell's answer to C++'s concepts (or lack thereof). Haskell is a language heavily grounded in mathematics (Category Theory in particular) and I'd definitely encourage anybody with some time to take a look at it. It's a very impressive language, and even if you don't understand all that's going on (I don't!), it's still good to explore different ways of programming, just to know what's out there.