01 Mar 2022
In this post we’ll explore how to use modern C++ (C++11 onwards) to implement move semantics, that is, how to move data between variables so as to avoid unnecessary copying.
We’ll first understand rvalue references and then see how they can be leveraged to implement move semantics.
In a simplistic way, lvalues are expressions that can be assigned to, i.e. they can appear on the left side of an assignment, while rvalues are the other type of expressions. Examples:
c are lvalues, while
f() are rvalues, because they cannot be assigned to.
Another key property of rvalues is that they cannot be reused. In the code below we appear to be reusing f(), but the second call of f() is not the same as the first.
In the next example, x is an lvalue that is reused.
One way to put this is that rvalues cannot be read from again, analogous to how
const values cannot be written to again.
A reference type is denoted by adding & to the type. In the example below, y is a reference to x:
Regular (non-const) reference types can only refer to lvalues (like x above), not rvalues. In C++11, the rvalue reference was introduced, which is indicated by &&. To disambiguate, the regular reference can also be called an lvalue reference.
So far we only discussed the syntax. Let’s now discuss the semantics of lvalues and rvalues.
References are essentially aliases, so in the example below when we mutate
x after initializing
y, the latter will also be changed and vice-versa:
We can use references to avoid copies, for example:
When we do A y = x, the copy constructor is invoked, while A &z = x doesn’t invoke any constructors. We can think of z as an alias to x. The same thing applies to function calls:
The same behavior applies to rvalue references, except that now we can pass an rvalue:
Note that it prints
"new" from the
A() call, not from the
As we know, C++ allows overloading, which means there can be multiple functions with the same name but different type signatures. The function that ends up being called depends on the types of the arguments.
Since & can only receive lvalues and && only rvalues, having overloads for both & and && is not ambiguous:
A copy constructor of a class A is a constructor that takes an lvalue reference to A (conventionally a const one). For example:
A move constructor of a class A is a constructor that takes an rvalue reference to A. For example:
The idea behind the move terminology is that rvalues have the guarantee they won’t be read again, so we can assume we can mess with the input as we see fit, including destroying/emptying it. One more realistic example is a class that holds onto some memory:
In a copy constructor we need to clone the input’s memory because whatever &x is referring to might be used again, but in a move constructor we can simply move that memory, since we know the reference is to an rvalue, which won’t be read from again.
We can cast an lvalue reference (T&) to an rvalue one (T&&) via static_cast<T&&>() or via std::move().
This is essentially telling the compiler: “I know this is an lvalue, but trust me, I won’t try to read from it later, so treat it as an rvalue”.
One example where it’s useful is in defining
We can see the copy constructor is called 3 times. We can use
std::move() to force the move constructor to be called:
So here we’re saying it’s fine to treat x and y as rvalues since we’ll overwrite them before we read from them again. For t we know it won’t be read from since it’s local.
std::move() doesn’t do anything special with regard to moving. It simply adds a semantic layer on top of this cast. Similarly, nothing in the syntax of rvalue references is specific to moving data; it’s all about the meaning we add on top of it (a design pattern of sorts), hence the “semantics” in move semantics.
Since move semantics is not something the compiler understands, this “assume lvalue is rvalue” is a contract that must be honored by the code. The compiler won’t prevent us from doing:
Copy elision (also called constructor elision) is an optimization the compiler performs to avoid a copy when it knows a value won’t be reused. For example:
In theory the variable
f() would be copied to a temporary place before being assigned to
y, but most compilers will special case this and skip the copy constructor, so we don’t have to worry about moving here.
It’s possible to turn off this behavior by compiling with the
-fno-elide-constructors flag. Re-running the same code should now print
In this post we learned about rvalue references and move semantics and how they’re connected. Thomas Becker’s C++ Rvalue References Explained is an excellent resource that goes into details while being very accessible through step-by-step progression.
One key observation that made me internalize why rvalue references are useful for move semantics is that rvalues cannot be read from. This constraint enables more efficient operations such as moving data instead of copying. I haven’t seen it explicitly called out in the articles about move semantics I read.
Rust Memory Management. Move semantics is a first-class citizen in Rust via the ownership model, so the compiler can enforce it. Here’s an example from that post where the move is enforced by the compiler: