Inferring too much

21 02 2011

Towards the end of last millennium I went to my first job interview and was asked the following question:

Given that a, b, c and d are of type string, what does the following line of code do?

string x = a + b + c + d;

I answered that (a + b) will return a temporary string object which will be concatenated to c creating a second temporary object which in turn will be concatenated to d. The final temporary object will then be used to initialize x. This means that there will be at least 3 memory allocations.
The follow-up question was how can I make this same line perform no more than one memory allocation[1].

That was my introduction to expression templates. If you are familiar with expression templates you can skip to the point (yes there actually is a point for a change).
The idea behind expression templates is that instead of performing an expensive operation eagerly you can return a proxy object which will be able to compute the required value in a cheaper way than naïvely performing the operation.

Examples for this can be:

  1. Matrix and vector arithmetics, e.g. given vectors A and B the expression (A+B)[2] can compute only one cell of the resulting vector by adding A[2] to B[2] rather than calculating the value of all the cells.
  2. In the string example above[2] do not create temporary string objects. Instead calculate the length of the resulting string and perform one allocation to create it when it is used.

Lets pretend that we’re writing a string class (the full code is available here). In order to facilitate the expression template I gave the string class a copy_to(buff) method which copies the string to a char* buffer and a way to construct a string by giving it a buffer so it doesn’t need to allocate anything.

Our helper class (concat) will be returned from operator+, it is temlated on its first parameter[3] in order to support arbitrary long concatenations. concat exposes a similar static interface to that of string (it has functions with the same names) so that when we say lhs.length() it will work whether lhs is a string, a concat<string> or a concat<concat<concat<string>>>.

private:
    template <class T>
    class concat {
        const T& lhs;
        const string& rhs;
public:
        concat(const T& left, const string& right)
            : lhs(left)
            , rhs(right)
        {}

        size_t length() const
        {
            return lhs.length() + rhs.length();
        }

        void copy_to(char* dest) const
        {
            helper::copy(lhs, dest);
            helper::copy(rhs, dest + lhs.length());
        }

        operator string() const
        {
            char* buf = new char[length() + 1];
            copy_to(buf);
            return helper::create(buf, length());
        }

        concat<concat<T> > operator+(const string& str) const
        {
            return concat<concat<T> >(*this, str);
        }
    };
public:
    concat<string> operator+(const string& str) const
    {
        return concat<string>(*this, str);
    }

Since concat has a cast operator to string it will not break existing code that uses string concatenation.

string a("hello"), b(""), c("world"), d("!");

string s = a + b + c + d; // OK
std::cout << a + b + c + d; // Also OK

If we place a breakpoint in concat::operator string() we will see the following picture:

Visualization of the type in the debug view

We can see that the type of this encodes the expression which created it (hence the name expression templates).

Since concat is a private class inside string we know that a variable of its type will never be created making it safe to hold on to references to temporary objects. We’re guaranteed that an expression template is only used within an expression (typically up to the next semicolon) and any temporaries it references will be valid as long as it exists.

string::concat x = a + b + c; // compilation error

Enter Type Inference

Wait, what was that last part again? Hasn’t C++0x added type inference? What happens if I write auto x = a + b + c;?

Now I’m screwed, not only isn’t this a compilation error, when building with no optimizations (in both VC10 and g++4.5) it seems to behave correctly even thought this is the mother of undefined behaviour, calling an object after it has been destructed is rarely never a Good Thing™.

When compiling it with optimizations in VS10 the output was rather ironic (who knew the optimizer had a sense of humour)

☹

☹ would have been more appropriate

In order to prove to myself that this is really what’s happening (also in debug mode) I added a valid boolean member to concat which is set to true in the constructor and to false in the destructor, I then added the following lines to the beginning of copy_to

if (!valid)
    throw "Here there be dragons^H^H^H undefined behaviour";

Sure enough now when I use auto instead of string (and only then) the program terminates with an unhanded exception.

What does this mean?

This means that library writers can no longer depend on the fact that their private classes are never stored for longer than the expression in which they were created. This problem isn’t confined to outlandish optimizations. The standard defines that std::vector<bool> is a special case which packs the elements into bits and not real bools, this means that operator[] of vector<bool> must return a proxy type.

The following program:

#include <vector>
#include <iostream>
#include <limits>
std::vector<bool> to_bits(unsigned int n) {
    const int bits = std::numeric_limits<unsigned int>::digits;
    std::vector<bool> ret(bits);
    for(int i = 0, mask = 1; i < bits; ++i, mask *= 2)
        ret[i] = (n &  mask) != 0;
    return ret;
}

int main()
{
    bool b = to_bits(42)[3];
    auto a = to_bits(42)[3];
    std::cout << std::boolalpha << b << std::endl;
    std::cout << std::boolalpha << a << std::endl;
}

Produces the same undefined behaviour, when tested in g++ and VC10 it outputs:

true
false

And when it’s compiled in debug mode in VC10 an assert is fired from debug iterators

Unfortunately (in this context) if C#’s var keyword is any indicator people will start using auto everywhere, I know people who use var for loop variables rather than int. The fact that tools like ReSharper recommend this line of action will add to the trend.

What can we do?

I don’t think there’s much we can do, it seems too late in the standardization process to actually change anything. Perhaps future versions of C++ will allow specifying how a type behaves in a type inference (auto/decltype) situation.

string operator auto() const { return *this; }

Edit

After posting a link to this on reddit I got a lot of comments and I would like to address some of them.

  1. Had I made concat non-copy constructable the auto line would fail to compile (just like the C++03 code) and the problem wouldn’t appear. I hadn’t thought of this and it certainly alleviates  the severity of the “problem” I describe.
    1. Actually this is not true, as Geoff points out if concat has a deleted copy constructor it can’t be returned from operator+.
  2. Even if though the problem was is as bad as I originally thought it was it’s still just one more gotcha in the language.  All languages have these pitfalls and I don’t think that one more is a reason to invalidate the entire language (as some people seem to think).

Footnotes

  1. Assuming RVO which is out of scope for this post.
  2. This optimization can’t be used on std::string since the C++ standard mandates that the return value of operator+ will be an std::string and not some other type.
  3. Ignoring the fact that a + (b+c) will have a concat as a second parameter allows me to significantly simplify the code in this post.
Geoff RomerGeoffGeoff
About these ads

Actions

Information

38 responses

23 02 2011
Achilleas Margaritis

I think that the compilers are wrong if they accept the instantiation of a private class through type inference. Auto or not, if a class is private, it should not be available outside of its scope.

24 02 2011
Motti Lanzkron

As vtj explained on reddit :

Because what is being inferred is a type, rather than a particular name of a type. Access restrictions apply to names, not types or values. A single type may have different names with different access restrictions. Some expressions (like lambdas) have types that have no name at all, yet it must be possible to infer their type.

24 02 2011
Achilleas Margaritis

But the inferred type has a name; and a name has a scope and an access.

The inferred type’s name is already used in other cases, where the auto variable is used.

For example, if I do the following:

auto x = a + b + c;
x.bar();

Then the compiler will say to me that type Foo (assuming the type inference results to Foo type) does not have member bar().

23 02 2011
Mike

Have you brought up this problem with anyone on the standards committee?

23 02 2011
Motti Lanzkron

Nope, I saw that someone raised this in 2008 and nobody paid much attention and was too meek to go to bother the committee.

23 02 2011
Charles

Can’t you patch it up by making concat non-copyable and non-assignable? (Can’t fill in the missing details so it actually works.) One thing is the eagerness of auto, another thing is the responsibility of the author of brittle code. (You don’t need auto to store instances of concat, because template deduction works the same way.)

23 02 2011
Motti Lanzkron

Good point, library writers should be more careful.
Actually it’s not such a good point as pointed out by Geoff.

As for the point about template deduction working the same way, this is true but it’s much harder to get yourself into trouble using templates to deduce returned types than with auto. If you had a function accepting a template type and passed `a+b+c` all the temporary objects would still be in scope for the duration of the function call.

24 02 2011
Geoff Romer

I don’t think this will work. First of all, the type needs to be copyable (or at least movable), or you can’t return it from operator+. You might be able to fix this by making the copy/move constructor private, but that’s getting pretty subtle.

The bigger problem is that even if you could prevent copying, assignment, and moving, you still have the following problem:

const auto& x = a + b + c;

Since x is a reference, there’s no copying, assigning, or moving going on. Since it’s a const reference, the lifetime of the temporary returned by the expression is extended to be the lifetime of the reference, but I don’t think the lifetime of the temporary returned by (a + b) is similarly extended, so you get the same dangling-reference issue.

24 02 2011
Motti Lanzkron

Excellent point! I’ve edited my post accordingly.

23 02 2011
thomas

After reading this post I am motivated to go back and redouble my efforts at learning OCaml – a language which had type-inferrence from the beginning. It also seems like a good candidate for use in place of C++ for a lot of the projects I’m going to be working on (has the required speed, etc.)

23 02 2011
Rob

While §7.1.6.4 of the FCD does not constrain the deduced type as if it had appeared in the original source, §11/1 still applies so that a private nested type’s “name can be used only by members and friends.” IOW, I think you found a compiler bug rather than a defect in the standard.

23 02 2011
pizer

Funny. I had the exact same train of thought one or two years ago and I came even up with “operator auto” in the search for a fix. I started a thread to that effect in one of the c++-related usenet groups if I remember correctly. But nobody seemed to care/understand. Just recently I wrote a little library that employs expression templates. But this time I used reference-counted copy-on-write to avoid unnecessary copying and to guarantee that data was kept alive long enough. So, writing “auto expr = …;” would be at least save.

23 02 2011
Motti Lanzkron

Are you slavkop? In (or around) July 2009 was the first I thought of this and was surprised to find a posting in comp.lang.c++.moderated with my thoughts stolen and taken back to December 2008…

23 02 2011
pizer

No, but I was the first one to answer slavkop in that thread :-) Seems like there is more awareness of this issue than I expected/remembered. Cheers! – p

23 02 2011
Amelek

if you have function returning private object, in c++0x you can instance that object on your own:

http://codepad.org/KjpFj5rA

As for your string code, it could be fixed using simple private copy constructor and befriending string class from concat. Then auto s = a + b; will no longer be valid.

23 02 2011
pizer

In addition to techniques like reference-counted copy-on-write one might want to try to make use of move semantics where it makes sense. std::string is a good example for that. Alternativly, one could make expression templates explicit. By that I mean that the user would have to explicitly enable them using function templates named “expr” and “eval” (or something like that). Example: auto x = eval(expr(a)+b+c+d);

23 02 2011
Paul

This problem actually exist before the auto keyword. When you pass the expression template to a template function it deduces the type in the same way, unfortunately. I actually had the same idea of having an operator auto() for developing proxy types. This will make it capable of doing a psuedo-dot overloading also.
I use proxy types to separate getting and setting in maps(because they could be referencing memory from disk or locking memory for concurrent access). In this situation overloading the dot operator helps when you want to chain methods. Now since i cant do that i have a get and set property on each element. So I access an element like this:
auto x = map[i].get;
map[i+1].set = x;

This still lets you set values without having to use a storm of parenthesis. The same can be done for expression templates using an eval function like what pizer said.

The proxy object has a private constructor so it cant be pass through auto type deduction, but it still can be referenced in a template function. And this can still be a potential problem. One idea i’m working on around this is kinda of an operator auto() hack. A template function goes through two calls and uses a metafuntion to adjust the deduction types:

template
void foo(T& x)
{
foo_deduced<typename adjust_deduction::type>(x);
}

template
void foo_deduced(T& x)
{
//Put my logic here
}

So then you would just override the adjust_deduction metafunction, or maybe provide some kind of tagging mechanism for it. I’m trying to think of some kind of macro that would help in creating these functions.
Anyways it seems that if several people are coming up with this idea independently for an operator auto() overload, then it would make sense for the c++ committee to have a serious look at it.

23 02 2011
Paul

Actually the line should be:
foo_deduced<typename adjust_deduction::type>(x);

24 02 2011
Witek

cords to the rescue: https://github.com/baryluk/cords

Unfortunately I do not updated them for some time, so probably not compile on current 2,x compiler.

OH. Sorry, I somehow missed it is about C++, not D. Any way there is a similar library for C++, even in STL.

24 02 2011
petke

I admit not taking the time to read the implementation of your example in detail. auto though, is just supposed to be syntactic sugar. Its not supposed to change the semantics. Are you sure you could not write out the same type (that auto evaluates to) in the C++03 example? If that is possible (even if it is cumbersome) then one could argue that the example has always had this latent bug in it.

24 05 2011
Alfonse

It seems to me that move semantics would probably allow you to make this work without the use of a special type. If the string’s operator+ had a version that took a std::string&& on the left-hand, then it could simply build the string in the std::string&& and therefore avoid creating extra temporaries.

I haven’t studied the deep magic around move semantics, but I think that this should work.

24 05 2011
pizer

@Alfonse: The use of rvalue references as a function’s return type to save some temporary object creations is generally frowned upon. See http://stackoverflow.com/questions/6006527/

25 05 2011
Alfonse

I wasn’t thinking about returning an r-value reference. The regular operator+ for std::string looks something like this:

std::string operator+(const std::string &lhs, const std::string &rhs)
{
std::string result(lhs);
result += rhs;
return result;
}

That first command is a copy operation. It allocates heap memory will it will then potentially reallocate thanks to rhs’s data. Instead, one could have:

std::string operator+(std::string &&lhs, const std::string &rhs)
{
std::string result(std::move(lhs));
result += rhs;
return result;
}

Yes, you have created a temporary by doing this. But thanks to decent return value optimizations in compilers, returning a std::string won’t create a new copy. You will have saved one memory allocation and a copy operation.

It ultimately won’t be as efficient as your code, as it cannot calculate the final string length. But it is still an improvement.

25 05 2011
Rob

@Alfonse (since I can’t reply directly to his reply to pizer’s post)

The proper C++03 implementation of the operator is:


std::string
operator +(std::string _lhs, std::string const & _rhs)
{
_lhs += _rhs;
return _lhs;
}

See http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/.

27 05 2011
Motti Lanzkron

While it’s true that using an rvalue for the left hand side would give you better performance than the naive implementation it will still be less optimal than the expression templates implementation. The implementation I show computes the size in advance and allocates memory only once, if we do the lhs rvalue then we reallocate the buffer for lhs every time the capacity is used up.

In any case the point I tried to make is general and the string example is just that, an example. How would you solve matrix multiplication?

25 05 2011
pizer

I thought by “avoiding temporaries” you guys meant returning an rvalue reference to an older temporary instead of creating a new one. For a decent compiler with NRVO and other copy elisions one could apply “swaptimization”:

~~~~~~~~~~
string operator+(string x, string const& y)
{
string result; // default ctor is expected to be fast
result.swap(x); // swapping is expected to be fast
result += y;
return result; // NRVO aplicable
}
~~~~~~~~~~

Of course, I’m aware of Dave’s “want speed? pass by value” article.

13 03 2012
Clinton

I think this is a bit of a flaw in the standard, which unfortunately now will be hard to fix.

I think the best way would have been is to disallow:

const auto x& = f();

AND

auto x&& = f();

Unless f() had a valid move or copy constructor.

That would mean that they would only be valid when

auto x = f() was valid.

Expression template classes could have then had private copy/move constructors.

Perhaps now, the best way forward to prevent issues with future code is the operator auto() like you suggested, or simply allowing classes to be tagged as “temporary”, which means they can not be copied or moved to anything which outlives their own lifetime.

13 03 2012
Clinton

Keep in mind, with move constructors now, I don’t really see the need to extend the lifetime of temporaries bound to rvalue references.

Why write:

auto x&& = f();

When you can write

auto x = f();

Which is probably clearer, and doesn’t need a move or a copy due to RVO.

I understand how const blah& = f() was useful for non-copyable objects, but I can’t really think of an object which is not copyable or movable, and even if it was, why would you want to assign it to an rvalue reference (which strongly suggests it’s moveable)?!

At the very least, blah&& x = f() should require a move or copy constructor (even if it’s not used in the end). Implementing rvalue references like that in the first place wouldn’t have broke backward compatibility, and would have allowed expression templates to be type safe in the following manner:

(1) Private move/copy constructors.
(2) Conversion operator that only takes an rvalue reference.

Then

auto x = f() will fail (no copy constructor).
auto x = eval(f()) will succeed.

auto x&& = f() would fail.

const auto& x = f() would succeed but
auto x2 = eval(f()) would fail (eval only takes an rvalue reference).

And

auto x2 = eval(std::move(f()) would fail (std::move on a const reference is still const).

However, sadly with the current standard, you can subvert the type system in the following manner.

auto x&& = f()
auto x2 = eval(std::move(f()));

And then you’ve got references to expired temporary issues. I can’t work out a way to avoid this problem.

13 03 2012
Motti Lanzkron

Thanks for your detailed comments.
First thing to keep in mind is that move constructors aren’t a magic bullet, in some cases a move constructor is just as expensive as a copy constructor (e.g. std::array).
In any case I’m not sure that your examples capturing rvalue references are correct, I think that using rvalue references in places other than function parameters is usually not what one would want.

14 03 2012
Clinton

The standard actually dictates assigning to rvalue references in some places, see here:

http://stackoverflow.com/questions/9657708/c11-the-range-based-for-statement-range-init-lifetime

Personally, I don’t think assigning to rvalue references should be allowed, as assigning to a value is just as efficient due to RVO (no move or copy required).

blah x = f(), while not requiring a move or copy due to RVO, at least requires a move or copy constructor to exist (even though it will not be used).

However blah&& x = f() can be applied if f() returns a non-movable or non-copyable type, so the trick of making expression templates non-movable and non-copyable doesn’t work here.

14 03 2012
Motti Lanzkron

I didn’t know you could bind a temporary to any kind of reference, I thought it was only to a const reference. You learn something new every day!

4 04 2012
Reader Q&A: « Sutter’s Mill

[…] Yes, and even exactly that spelling has been suggested. I’ll take that as a +1 for discoverability if we name it that! (For people reading this comment, if it doesn’t make any sense I wrote about it last year in my blog http://lanzkron.wordpress.com/2011/02/21/inferring-too-much/) […]

4 04 2012
Gotchas of type inference | Andrzej's C++ blog

[…] Lanzkron, “Inferring too much.” Rate this: Like this:LikeBe the first to like this post. This entry was posted in […]

21 04 2012
DK

I think there’s an additional issue besides the references to objects that have expired. Consider

string prefix("Mr.");
string first("Michael");
string last("Stipe");
string separator(" ");
auto fullName = prefix + separator + first + separator + last;
string n1 = fullName;
last = string("Mills");
string n2 = fullName;

Lets assume that string::concat’s references lifespan’s are extended for the duration of fullName’s. Since fullName has references to the strings, modifying ‘last’ would lead to n1 and n2 being different, not what we’d expect. The change to last would have the surprising effect of changing the value of n2 in the following line.

So this leads me to a question. Which constructor is used for fullName? is it concat<concat >(concat<concat > oldVal) (i.e. copy constructor)? or is it just assigned the last concat<concat > returned from the last operator+(conact, string) in the expression? If it’s a constructor with a single argument, the string concatenation could take place in that constructor and store that in concat instead of setting up the references to the building pieces. (same for operator=() ) But I might be missing something.

22 04 2012
Motti Lanzkron

Good point DK, this will also generate different results that writing string fullName =....

As for which constructor is used of fullName, I don’t understand the question, fullName will be initialized according to the regular rules of C++, i.e. it will be either copy constructed from the last concat object returned by the + expression or the copy may be elided. In any case since we didn’t specify a copy constructor the default implementation would just copy references and not perform a deep copy.

Even if we did define a copy constructor, to perform deep copy, I don’t think we could depend on it since the compiler is allowed to elide copy constructor/destructor pairs.

24 04 2012
DK

Sorry for being less than clear with my question. (Didn’t help that the ‘<T>’s got treated as html). Yes, I was wondering if copy constructors could be used to enforce the string concatenation as fullName was being created, or copy elision could still happen.

24 04 2012
Motti Lanzkron

To my understanding the compiler can assume that a copy constructor/destructor pair do nothing and can therefore elide both (RVO). I don’t think that C++11 has made any changes in this aspect. In fact I think I saw people say that a compiler can elide a move-constructor for RVO since such elision is more performant even than move construction.

23 07 2013

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

%d bloggers like this: