07 Apr 2022
In this post we’ll explore the different mechanisms for passing data around in C++ and build a set of heuristics to help us choose between them for different scenarios.
A lot of these will boil down to personal taste, so keep the salt shaker handy.
Implicit in pass-by-value is a call to the copy constructor. We can see it being called by printing a message in the copy constructor.
If we don’t provide one, there’s usually a default copy constructor that is used, which recursively calls the copy constructor of its internal variables. If C is a complex class it can be a very expensive operation.
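As a minimal sketch of the idea, the snippet below uses a hypothetical class C whose copy constructor prints a message and bumps a counter (the counter and the function names are my own additions for illustration). Passing by value triggers the copy, passing by reference does not.

```cpp
#include <iostream>

int copies = 0;  // hypothetical counter, only here to observe copies

// Hypothetical class C with a user-defined copy constructor.
struct C {
    C() = default;
    C(const C&) {
        ++copies;
        std::cout << "copy constructor called\n";
    }
};

void by_value(C c) { (void)c; }             // the argument is copied
void by_reference(const C& c) { (void)c; }  // the argument is not copied

// Returns how many copies the two calls triggered in total.
int count_copies() {
    copies = 0;
    C c;
    by_value(c);      // one copy
    by_reference(c);  // no copy
    return copies;
}
```

Calling count_copies() should report a single copy, coming from the by_value call.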
Do we ever want to pass data by value? Meyers (Item 20) suggests that we should always pass by reference, except for built-in types (note this does not include std::string) and STL iterators.
Heuristic 1. Always pass by reference, except for built-in types and STL iterators.
When should we use pointers vs references? There’s no general consensus, but I personally am in the camp of avoiding pointers as much as possible. Let’s consider a few scenarios.
Pointers can be null, so it might be desirable to use them to encode optional values:
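A small sketch of what this could look like, assuming a hypothetical function greet_ptr (not from any library) where a null pointer means "no name given":

```cpp
#include <string>

// Hypothetical helper: `name` may be null to signal "no value provided".
std::string greet_ptr(const std::string* name) {
    if (name == nullptr) {
        return "Hello, stranger";
    }
    return "Hello, " + *name;  // reading through the pointer does not copy *name
}
```

The caller passes &name when it has a value and nullptr otherwise.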
The alternative is to use std::optional. The major problem for this contrived example is that std::optional cannot store a reference, so to avoid copies when wrapping the value in a std::optional we need to wrap the reference in a helper type.
It unfortunately looks quite verbose. Interestingly, boost::optional does allow optional references.  provides some insight on why optional references did not make it into the standard.
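To give a feel for the verbosity, here is one possible way to get an optional reference with the standard library, assuming std::reference_wrapper as the helper type. The alias OptionalName and the function greet_opt are hypothetical names for this sketch.

```cpp
#include <functional>
#include <optional>
#include <string>

// Assumed spelling of an "optional reference" using standard types only.
using OptionalName = std::optional<std::reference_wrapper<const std::string>>;

// Hypothetical helper: an empty optional means "no name given".
std::string greet_opt(OptionalName name) {
    if (!name.has_value()) {
        return "Hello, stranger";
    }
    // name->get() yields the wrapped const std::string& without copying it.
    return "Hello, " + name->get();
}
```

A caller would write greet_opt(std::cref(name)) or greet_opt(std::nullopt), which is noticeably noisier than passing a pointer.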
Another scenario is when we receive a pointer from a function and might want to pass it to an internal function. In this case dereferencing the pointer and passing it as a reference does not involve copies, so it’s fine.
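The sketch below illustrates this with two hypothetical functions: forward receives a pointer and hands the pointee to address_seen_by, which takes a reference. Comparing addresses is just a convenient way to show that both sides see the same object.

```cpp
#include <string>

// Hypothetical internal function taking its input by reference.
// It returns the address of the object it received.
const std::string* address_seen_by(const std::string& s) { return &s; }

// Hypothetical outer function that only has a pointer at hand.
// The caller is assumed to pass a non-null pointer.
const std::string* forward(const std::string* p) {
    return address_seen_by(*p);  // *p binds to the reference, no copy is made
}
```

If a copy had been made along the way, the address returned by forward would differ from the one passed in.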
How about a setter method? We will likely assign the argument to a member variable, which may outlive the object that was passed in. Example:
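A minimal sketch of such a setter, with hypothetical classes A and B. The assignment counter and the value() accessor are only there to make the copy observable.

```cpp
int b_copy_assignments = 0;  // hypothetical counter to observe copies

struct B {
    int value = 0;
    B() = default;
    B(const B&) = default;
    B& operator=(const B& other) {
        value = other.value;
        ++b_copy_assignments;
        return *this;
    }
};

class A {
public:
    // Taking a const reference avoids a copy at the call site...
    void set_b(const B& b) { b_ = b; }  // ...but this assignment copies b
    int value() const { return b_.value; }
private:
    B b_;  // A owns its own copy of the B it was given
};
```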
In this case a copy will be triggered on b_ = b so it’s ok. We cannot store a reference since its referred object could fall out of scope, so if we do want to avoid a copy we then need to use a pointer:
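Here is a sketch of the pointer-based variant, again with hypothetical classes A and B. Note that A does not own the object in this version, so the caller is responsible for keeping it alive.

```cpp
struct B {
    int value = 0;
};

class A {
public:
    // Stores only the address; no B is copied.
    void set_b(const B* b) { b_ = b; }
    const B* get_b() const { return b_; }
private:
    const B* b_ = nullptr;  // non-owning: the pointee must outlive this A
};
```

Since A sees the caller's object directly, later changes made by the caller are visible through get_b().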
Alternatively we could transfer the ownership to the class by expecting an rvalue reference, so the caller is forced to explicitly move the object in. The problem with this approach is that get_b() needs to make a copy.
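A sketch of the ownership-transfer variant follows. The classes are hypothetical, and I'm assuming get_b() returns by value, which is where the copy mentioned above comes from.

```cpp
#include <string>
#include <utility>

struct B {
    std::string data;
};

class A {
public:
    // Only accepts rvalues: callers write a.set_b(std::move(b)).
    // A plain a.set_b(b) with an lvalue b would not compile.
    void set_b(B&& b) { b_ = std::move(b); }

    // Assumed signature: returning by value copies the stored B.
    B get_b() const { return b_; }
private:
    B b_;
};
```

After the call, the caller's b is in a moved-from state and should not be relied upon.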
With the caveat of potential boilerplate and some care to avoid copying when modeling optionals, we can state the following:
Heuristic 2. Always prefer references over pointers for input arguments, except when the argument will be stored internally and we want to avoid copies.
One other scenario where we can’t use references is when returning an object on the heap (created via new), so we have to return a pointer (Item 21 in ). The problem with returning a raw pointer is that someone needs to delete the object on the heap, otherwise we incur a memory leak.
Smart pointers aim to solve this sort of pitfall. A smart pointer knows when an object cannot be referenced anymore, and thus handles the deletion for us.
For this case it seems strictly better to return smart pointers. For function arguments,  argues we should use raw pointers if we don’t care about the ownership model of the pointer.
However, in light of Heuristic 2, we either will not use pointers or we should care about ownership (i.e. when storing it internally) and thus can define another heuristic:
Heuristic 3. Always prefer returning a smart pointer over a raw pointer.
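As an illustration of Heuristic 3, the sketch below returns a std::unique_ptr from a hypothetical factory make_c. The live_objects counter is my own addition to observe that the object is destroyed without an explicit delete.

```cpp
#include <memory>

int live_objects = 0;  // hypothetical counter to observe construction/destruction

struct C {
    C() { ++live_objects; }
    ~C() { --live_objects; }
};

// Hypothetical factory: the caller receives ownership of the heap object.
std::unique_ptr<C> make_c() { return std::make_unique<C>(); }

// Returns the number of live C objects after the smart pointer left scope.
int live_after_scope() {
    {
        std::unique_ptr<C> c = make_c();  // one live object inside this scope
    }  // c goes out of scope here and deletes the object
    return live_objects;
}
```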
Another thing to keep in mind is the constructor elision we mentioned in . The compiler usually optimizes cases where we return objects by value to avoid calling the copy constructor, so we don’t need to use pointers in this case.
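A quick way to check this is to count copies when returning by value. The sketch below uses a hypothetical class C and counter; for this particular form (returning a temporary and initializing a variable from it) the elision is guaranteed since C++17, and compilers commonly performed it before that too.

```cpp
int copies = 0;  // hypothetical counter to observe copies

struct C {
    C() = default;
    C(const C&) { ++copies; }
};

// Returns a temporary by value.
C make_c() { return C(); }

// Returns the number of copies made while receiving the returned object.
int copies_when_returning_by_value() {
    copies = 0;
    C c = make_c();  // constructed directly in c, no copy constructor call
    (void)c;
    return copies;
}
```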
shared_ptr and unique_ptr are smart pointers in the sense that they delete the object once they know it cannot be referenced anymore. The difference between them is that we can assign a shared_ptr to multiple variables whereas a unique_ptr can only belong to one, and re-assignments must “move” the data.
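The difference can be sketched as follows, with a hypothetical class C and two helper functions whose names are my own.

```cpp
#include <memory>
#include <utility>

struct C {
    int value = 42;
};

// Two shared_ptr variables can own the same object.
long shared_owner_count() {
    std::shared_ptr<C> a = std::make_shared<C>();
    std::shared_ptr<C> b = a;  // copy: a and b both own the object
    return a.use_count();      // number of owners
}

// A unique_ptr cannot be copied, only moved.
bool source_is_empty_after_move() {
    std::unique_ptr<C> a = std::make_unique<C>();
    // std::unique_ptr<C> b = a;          // would not compile
    std::unique_ptr<C> b = std::move(a);  // ownership is transferred to b
    return a == nullptr && b != nullptr;
}
```

After the move, the original unique_ptr is left null, so there is never more than one owner.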
One potential pitfall of unique_ptr compared to shared_ptr is that the former doesn’t keep information about the type used to create it. This means when the pointer gets out of scope, the destructor of the pointer’s declared type is used.
For example, suppose we have class D deriving from C, and that when we create a smart pointer we return it as a pointer to the base class.
We can see that when the std::unique_ptr<C> falls out of scope, only C’s destructor was called even though we constructed an instance of D. The solution is to make C’s destructor virtual:
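Below is a sketch of the fixed version, with hypothetical classes C and D that append to a log string in their destructors. With the virtual destructor in place, destroying through std::unique_ptr<C> runs D’s destructor and then C’s. Worth noting: without virtual, deleting a D through a pointer to C is undefined behavior according to the standard, so seeing only C’s destructor run is just one possible outcome.

```cpp
#include <memory>
#include <string>

std::string destructor_log;  // hypothetical log to observe destructor calls

struct C {
    virtual ~C() { destructor_log += "~C "; }
};

struct D : C {
    ~D() override { destructor_log += "~D "; }
};

// Creates a D but hands it out as a pointer to the base class.
std::unique_ptr<C> make_d() { return std::make_unique<D>(); }

// Returns the destructors that ran when the pointer went out of scope.
std::string destroy_through_base() {
    destructor_log.clear();
    {
        std::unique_ptr<C> p = make_d();
    }  // p is destroyed here
    return destructor_log;
}
```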
In general I prefer to start with the most restrictive mode possible (e.g. const modifier), since it simplifies reasoning about the code. When the need arises, it’s possible to relax the constraints. Thus, unless we explicitly expect our pointer to be shared by multiple owners, I’d default to unique_ptr.
Heuristic 4. Everything being equal, prefer unique_ptr over shared_ptr.
In this post we came up with 4 heuristics to help decide between different memory syntax and semantics.
The idea is that these heuristics are very general with small exceptions, so they can be remembered more easily. The problem with them is that they leave out a lot of nuance and can be overly prescriptive.
As we gain more experience with C++ we get a better “feel” for when to use what. As with a lot of subjective recommendations, I value consistency more than strong opinions, so I’d rather stick to existing patterns in a codebase than push my own heuristics.