Inferring too much

21 02 2011

Towards the end of last millennium I went to my first job interview and was asked the following question:

Given that a, b, c and d are of type string, what does the following line of code do?

string x = a + b + c + d;

I answered that (a + b) will return a temporary string object which will be concatenated to c creating a second temporary object which in turn will be concatenated to d. The final temporary object will then be used to initialize x. This means that there will be at least 3 memory allocations.
The follow-up question was how can I make this same line perform no more than one memory allocation[1].

That was my introduction to expression templates. If you are familiar with expression templates you can skip to the point (yes there actually is a point for a change).
The idea behind expression templates is that instead of performing an expensive operation eagerly you can return a proxy object which will be able to compute the required value in a cheaper way than naïvely performing the operation.

Examples for this can be:

  1. Matrix and vector arithmetics, e.g. given vectors A and B the expression (A+B)[2] can compute only one cell of the resulting vector by adding A[2] to B[2] rather than calculating the value of all the cells.
  2. In the string example above[2] do not create temporary string objects. Instead calculate the length of the resulting string and perform one allocation to create it when it is used.

Lets pretend that we’re writing a string class (the full code is available here). In order to facilitate the expression template I gave the string class a copy_to(buff) method which copies the string to a char* buffer and a way to construct a string by giving it a buffer so it doesn’t need to allocate anything.

Our helper class (concat) will be returned from operator+, it is temlated on its first parameter[3] in order to support arbitrary long concatenations. concat exposes a similar static interface to that of string (it has functions with the same names) so that when we say lhs.length() it will work whether lhs is a string, a concat<string> or a concat<concat<concat<string>>>.

    template <class T>
    class concat {
        const T& lhs;
        const string& rhs;
        concat(const T& left, const string& right)
            : lhs(left)
            , rhs(right)

        size_t length() const
            return lhs.length() + rhs.length();

        void copy_to(char* dest) const
            helper::copy(lhs, dest);
            helper::copy(rhs, dest + lhs.length());

        operator string() const
            char* buf = new char[length() + 1];
            return helper::create(buf, length());

        concat<concat<T> > operator+(const string& str) const
            return concat<concat<T> >(*this, str);
    concat<string> operator+(const string& str) const
        return concat<string>(*this, str);

Since concat has a cast operator to string it will not break existing code that uses string concatenation.

string a("hello"), b(""), c("world"), d("!");

string s = a + b + c + d; // OK
std::cout << a + b + c + d; // Also OK

If we place a breakpoint in concat::operator string() we will see the following picture:

Visualization of the type in the debug view

We can see that the type of this encodes the expression which created it (hence the name expression templates).

Since concat is a private class inside string we know that a variable of its type will never be created making it safe to hold on to references to temporary objects. We’re guaranteed that an expression template is only used within an expression (typically up to the next semicolon) and any temporaries it references will be valid as long as it exists.

string::concat x = a + b + c; // compilation error

Enter Type Inference

Wait, what was that last part again? Hasn’t C++0x added type inference? What happens if I write auto x = a + b + c;?

Now I’m screwed, not only isn’t this a compilation error, when building with no optimizations (in both VC10 and g++4.5) it seems to behave correctly even thought this is the mother of undefined behaviour, calling an object after it has been destructed is rarely never a Good Thing™.

When compiling it with optimizations in VS10 the output was rather ironic (who knew the optimizer had a sense of humour)


☹ would have been more appropriate

In order to prove to myself that this is really what’s happening (also in debug mode) I added a valid boolean member to concat which is set to true in the constructor and to false in the destructor, I then added the following lines to the beginning of copy_to

if (!valid)
    throw "Here there be dragons^H^H^H undefined behaviour";

Sure enough now when I use auto instead of string (and only then) the program terminates with an unhanded exception.

What does this mean?

This means that library writers can no longer depend on the fact that their private classes are never stored for longer than the expression in which they were created. This problem isn’t confined to outlandish optimizations. The standard defines that std::vector<bool> is a special case which packs the elements into bits and not real bools, this means that operator[] of vector<bool> must return a proxy type.

The following program:

#include <vector>
#include <iostream>
#include <limits>
std::vector<bool> to_bits(unsigned int n) {
    const int bits = std::numeric_limits<unsigned int>::digits;
    std::vector<bool> ret(bits);
    for(int i = 0, mask = 1; i < bits; ++i, mask *= 2)
        ret[i] = (n &  mask) != 0;
    return ret;

int main()
    bool b = to_bits(42)[3];
    auto a = to_bits(42)[3];
    std::cout << std::boolalpha << b << std::endl;
    std::cout << std::boolalpha << a << std::endl;

Produces the same undefined behaviour, when tested in g++ and VC10 it outputs:


And when it’s compiled in debug mode in VC10 an assert is fired from debug iterators

Unfortunately (in this context) if C#’s var keyword is any indicator people will start using auto everywhere, I know people who use var for loop variables rather than int. The fact that tools like ReSharper recommend this line of action will add to the trend.

What can we do?

I don’t think there’s much we can do, it seems too late in the standardization process to actually change anything. Perhaps future versions of C++ will allow specifying how a type behaves in a type inference (auto/decltype) situation.

string operator auto() const { return *this; }


After posting a link to this on reddit I got a lot of comments and I would like to address some of them.

  1. Had I made concat non-copy constructable the auto line would fail to compile (just like the C++03 code) and the problem wouldn’t appear. I hadn’t thought of this and it certainly alleviates  the severity of the “problem” I describe.
    1. Actually this is not true, as Geoff points out if concat has a deleted copy constructor it can’t be returned from operator+.
  2. Even if though the problem was is as bad as I originally thought it was it’s still just one more gotcha in the language.  All languages have these pitfalls and I don’t think that one more is a reason to invalidate the entire language (as some people seem to think).


  1. Assuming RVO which is out of scope for this post.
  2. This optimization can’t be used on std::string since the C++ standard mandates that the return value of operator+ will be an std::string and not some other type.
  3. Ignoring the fact that a + (b+c) will have a concat as a second parameter allows me to significantly simplify the code in this post.
Geoff RomerGeoffGeoff