Review: Effective Modern C++
25 Oct 2022
In this post I’ll share my notes on the book Effective Modern C++ by Scott Meyers.
As with Effective C++, Meyers’ book is organized around items. Each item title describes a specific recommendation (e.g. “Prefer nullptr to 0 and NULL”) and then delves into the rationale, while also explaining details about the C++ language.
The post lists each item with a summary and my thoughts when applicable. The goal is that it can be an index to find more details on the book itself.
The Modern in the title refers to C++11 and C++14 features. This book is a complement to Effective C++, not an updated edition.
The book is divided into 8 chapters and 42 items. Each chapter serves as a theme into which the items are organized.
To make look up easier, I’ve included a table of contents:
This item explains how a template type parameter T
is deduced. Let’s analyze some cases. First assume the function parameter is declared as T&
:
As const T&
:
As T&&
(universal reference, see Item 24):
There are corner cases for arrays or function pointers which we won’t cover.
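As a rough sketch of the three cases, we can record the deduced T in a tag type and check it at compile time. The names Deduced, byRef, byConstRef and byUniRef are made up for illustration, as are the variables x, cx and rx.

```cpp
#include <type_traits>

// Hypothetical tag type that records the deduced T so decltype can inspect it.
template <typename T> struct Deduced {};

// Declarations only: they are used solely inside decltype (unevaluated).
template <typename T> Deduced<T> byRef(T& param);
template <typename T> Deduced<T> byConstRef(const T& param);
template <typename T> Deduced<T> byUniRef(T&& param);

int x = 27;
const int cx = x;
const int& rx = x;

// T&: the reference part of the argument is ignored, constness stays in T.
static_assert(std::is_same<decltype(byRef(x)),  Deduced<int>>::value, "");
static_assert(std::is_same<decltype(byRef(cx)), Deduced<const int>>::value, "");
static_assert(std::is_same<decltype(byRef(rx)), Deduced<const int>>::value, "");

// const T&: const is already part of the parameter, so T is plain int.
static_assert(std::is_same<decltype(byConstRef(x)),  Deduced<int>>::value, "");
static_assert(std::is_same<decltype(byConstRef(cx)), Deduced<int>>::value, "");
static_assert(std::is_same<decltype(byConstRef(rx)), Deduced<int>>::value, "");

// T&&: lvalues deduce T as an lvalue reference, rvalues as a non-reference.
static_assert(std::is_same<decltype(byUniRef(x)),  Deduced<int&>>::value, "");
static_assert(std::is_same<decltype(byUniRef(cx)), Deduced<const int&>>::value, "");
static_assert(std::is_same<decltype(byUniRef(27)), Deduced<int>>::value, "");
```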
The gist is that auto
is resolved the same way templates are for the most part. The only difference is when an auto
variable is initialized using curly braces. auto
resolves to std::initializer_list<>
, template doesn’t compile:
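A minimal sketch of that difference, using two hypothetical templates (takeByValue and countItems) that are not from the book:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>

// Hypothetical template with a plain T parameter.
template <typename T>
void takeByValue(T) {}

// Hypothetical template that spells out the initializer list type.
template <typename T>
std::size_t countItems(std::initializer_list<T> items) { return items.size(); }

std::size_t autoBracesDemo() {
    auto viaAssign = 27;        // deduced as int
    auto viaBraces = {27, 28};  // deduced as std::initializer_list<int>
    static_assert(std::is_same<decltype(viaAssign), int>::value, "");
    static_assert(std::is_same<decltype(viaBraces),
                               std::initializer_list<int>>::value, "");

    takeByValue(viaAssign);       // fine: T = int
    // takeByValue({27, 28});     // error: T cannot be deduced from a braced list
    return countItems({27, 28});  // fine: only the element type T is deduced
}
```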
decltype
is used to get the type of a variable. It can be useful to bridge the gap between auto
and templates. auto
doesn’t have an explicit type and templates might require one. Item 18 provides one such example:
We won’t go over the details, but type-wise, this version of std::unique_ptr<T, D>
has two template parameters, the type of the underlying object T
and that of the deleter function D
. We don’t know the type of deleter
so we can use decltype(deleter)
. Item 33 has a similar use case.
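Here is a small sketch of the idea, assuming the deleter is a lambda (whose type we cannot spell out). Widget, deletedWidgets and makeWidget are illustrative names, not necessarily the ones used in the book.

```cpp
#include <cassert>
#include <memory>

struct Widget {};        // hypothetical type
int deletedWidgets = 0;  // hypothetical counter, only here to observe the deleter

// The closure type has no name we can write, so we ask the compiler for it.
auto deleter = [](Widget* w) {
    ++deletedWidgets;
    delete w;
};

using WidgetPtr = std::unique_ptr<Widget, decltype(deleter)>;

WidgetPtr makeWidget() {
    // The deleter object must be passed in: closure types are not
    // default-constructible in C++11/14.
    return WidgetPtr(new Widget, deleter);
}
```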
Another use of decltype
is combining with auto
as decltype(auto)
. One problem with auto
is that it drops the reference modifier when resolving. For example:
Here auto
resolves to int
. If we wish to preserve the &
from getRef()
we need decltype
:
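A sketch of both declarations side by side. getRef() is the function mentioned in the text; the global storage and the function decltypeAutoDemo are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <type_traits>

int storage = 1;                   // hypothetical object returned by reference
int& getRef() { return storage; }

int decltypeAutoDemo() {
    auto copy = getRef();             // int: the & is dropped, this is a copy
    decltype(auto) alias = getRef();  // int&: the & is preserved
    static_assert(std::is_same<decltype(copy), int>::value, "");
    static_assert(std::is_same<decltype(alias), int&>::value, "");

    copy = 10;   // only changes the local copy
    alias = 20;  // writes through to storage
    return storage;
}
```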
A third use of decltype
is a technique to display the type of a given variable at compile time, as shown in Item 4.
There are several ways to inspect the deduced type of variables declared with auto:
One interesting technique is to have a compilation error tell us that. We can use the following code:
This will fail to compile and display the type in the error message. In clang
(v14) I get:
error: implicit instantiation of undefined template ‘TD<int *>’
At runtime, we can instead use typeid().name()
:
It mangles the name but compilers have tools for prettifying it. For clang
, we can use the llvm-cxxfilt
CLI:
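Both techniques can be sketched as follows. TD is the name that shows up in the error message above; the function mangledTypeName is a made-up wrapper, and the exact string returned by name() is implementation-defined.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Declared but never defined: instantiating it forces a compile error
// whose message contains the template argument.
template <typename T>
class TD;

std::string mangledTypeName() {
    int value = 0;
    auto ptr = &value;           // deduced as int*
    // TD<decltype(ptr)> probe;  // uncomment to have the compiler report the type
    return typeid(ptr).name();   // implementation-defined, usually a mangled name
}
```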
The reasons provided include: easier to type and refactor. It also avoids subtle type mismatches which are hard to catch because the compiler tries to convert/cast types when possible. Examples are provided in the book.
Item 6 discusses cases in which auto
doesn’t work well.
One example where auto
doesn’t infer the “expected” type is when accessing an element of a vector of booleans. This is because vector<bool>
is optimized to pack its elements so each one only uses 1 bit instead of a whole byte.
However, this means that when accessing a specific element, it needs to return a special proxy structure, std::__bit_reference<std::vector<bool>> (exposed as std::vector<bool>::reference)
, which can be implicitly converted to bool
:
However if we use auto
:
b
holds a reference to an object that doesn’t exist anymore (i.e. the temporary object created to hold f()
’s return value), so its value is undefined.
More generally, in any case we use proxy classes, i.e. types that are not actually the type one would expect but can be implicitly converted to it, we might have such risk. The author suggests using static_cast<T>
to solve this issue:
but then I’m not sure about the advantage of using auto
.
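A sketch of the problem and of the cast, assuming a hypothetical function features() that returns a temporary vector<bool>:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical function returning a temporary vector<bool>.
std::vector<bool> features() { return {true, false, true}; }

// operator[] returns the proxy type, not bool.
static_assert(std::is_same<decltype(features()[0]),
                           std::vector<bool>::reference>::value, "");
static_assert(!std::is_same<decltype(features()[0]), bool>::value, "");

bool firstFeature() {
    // auto risky = features()[0];  // proxy may point into the destroyed temporary
    auto safe = static_cast<bool>(features()[0]);  // converted while it is alive
    return safe;
}
```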
Variables can be initialized via assignment, parentheses or curly braces:
The advantage of the curly braces is that it prevents narrowing conversion, which is when a broader type (e.g. double
) gets converted to a narrower one (e.g. int
), possibly causing information loss:
With curly braces, the narrowing conversion won’t compile. Another issue curly braces avoid is the most vexing parse, in which the initialization syntax is the same as a function declaration. For example:
The last expression might seem like it’s creating an instance of C
by calling the default constructor but it’s actually declaring a function. Using curly braces does the intuitive thing:
Another scenario in which parentheses and curly braces behave differently is passing two arguments to an int
vector:
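The three points of this item can be sketched together. The struct C and the function names are illustrative, and the lines that do not compile (or that declare a function) are left commented out.

```cpp
#include <cassert>
#include <vector>

struct C { int value = 7; };  // hypothetical class with a default constructor

int narrowingDemo() {
    double d = 3.9;
    int viaAssign = d;    // compiles, silently truncates to 3
    int viaParens(d);     // same
    // int viaBraces{d};  // error: narrowing conversion from double to int
    return viaAssign + viaParens;
}

int vexingParseDemo() {
    // C c();  // declares a function named c that returns C
    C c{};     // default-constructs an object
    return c.value;
}

// Parentheses pick the (count, value) constructor: ten elements equal to 20.
std::vector<int> withParens() { return std::vector<int>(10, 20); }
// Braces pick the initializer_list constructor: two elements, 10 and 20.
std::vector<int> withBraces() { return std::vector<int>{10, 20}; }
```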
The item suggests nullptr
is more readable when representing null pointers than either 0
or NULL
.
There are also some cases when using templates where passing either 0
or NULL
won’t compile as pointers. For example:
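A sketch of such a case, with made-up names (isNull, callWith). The template deduces the argument type before the call is forwarded, so 0 stops being a null pointer constant along the way.

```cpp
#include <cassert>
#include <cstddef>

bool isNull(int* p) { return p == nullptr; }  // hypothetical function taking a pointer

// Hypothetical wrapper that forwards its argument to func.
template <typename Func, typename Arg>
bool callWith(Func func, Arg arg) { return func(arg); }

bool nullptrDemo() {
    // callWith(isNull, 0);     // error: Arg is deduced as int, not convertible to int*
    // callWith(isNull, NULL);  // error on typical implementations: Arg is an integral type
    return callWith(isNull, nullptr);  // Arg is std::nullptr_t, which converts to int*
}
```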
It boils down to templates. typedef
cannot be templatized. An example using alias declaration:
If we want to achieve the same using typedefs
we need to use a struct
:
Whenever we use this new type we need to do typename MyVec<T>::type
as opposed to MyVec<T>
for alias declaration.
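To compare the two side by side in one file, the sketch below names the typedef-based version MyVecOld (in the text both are called MyVec). Holder is a hypothetical class template that uses both.

```cpp
#include <type_traits>
#include <vector>

// Alias declaration: can be a template directly.
template <typename T>
using MyVec = std::vector<T>;

// typedef: has to be nested inside a struct template.
template <typename T>
struct MyVecOld {
    typedef std::vector<T> type;
};

template <typename T>
struct Holder {
    MyVec<T> viaAlias;                      // reads like a regular type
    typename MyVecOld<T>::type viaTypedef;  // dependent name: needs typename and ::type
};

// Both members end up with the same type.
static_assert(std::is_same<decltype(Holder<int>::viaAlias),
                           decltype(Holder<int>::viaTypedef)>::value, "");
```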
The C++98 enums are known as unscoped enums:
The C++11 enums are called scoped enums (note the class
modifier):
To refer to a scoped enum value we do RGB::red
as opposed to red
previously. The need for qualifying the enum value is the origin of scoped. This prevents scope pollution (e.g. another enum including red
would fail to compile).
One case where unscoped enums work better is to implement named tuple access for readability:
The alternative using scoped enums would require an explicit cast to std::size_t
because scoped enums don’t implicitly convert to integral types (their default underlying type is int)
.
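A sketch of the tuple access with both kinds of enum. The UserInfo tuple and the enumerator names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical tuple: name, email, reputation.
using UserInfo = std::tuple<std::string, std::string, int>;

enum UserInfoFields { uiName, uiEmail, uiReputation };   // unscoped
enum class UserInfoScoped { name, email, reputation };   // scoped

std::string emailUnscoped(const UserInfo& info) {
    return std::get<uiEmail>(info);  // implicit conversion to std::size_t
}

std::string emailScoped(const UserInfo& info) {
    // No implicit conversion: the cast has to be written out.
    return std::get<static_cast<std::size_t>(UserInfoScoped::email)>(info);
}
```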
Suppose we’re inheriting from a class and we want to “hide” some of the methods from the parent class to all callers. One way to achieve this is by making the methods private. However, member methods or friend classes would still be able to call them by accident, so we can also not define them.
The problem is that if the method is invoked from a different compilation unit, this would only fail at link time, which is harder to understand. We can instead delete the method:
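A minimal sketch, assuming a parent with two methods where the child wants to hide g(). The CanCallG helper is not part of the technique; it is only there so the example can check at compile time that the call is rejected.

```cpp
#include <type_traits>
#include <utility>

struct Parent {
    void f() {}
    void g() {}
};

struct Child : Parent {
    void g() = delete;  // any call to g() on a Child fails to compile
};

// Hypothetical detection helper: true when t.g() is a valid expression.
template <typename T, typename = void>
struct CanCallG : std::false_type {};

template <typename T>
struct CanCallG<T, decltype(std::declval<T&>().g())> : std::true_type {};

static_assert(CanCallG<Parent>::value, "");
static_assert(!CanCallG<Child>::value, "");
```

Since the failure happens at the call site during compilation, the error message points directly at the offending line.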
To recap, suppose class Child
inherits from Parent
. Overriding functions allows us to call the method from the instance’s Child
type even when the type on the signature is of Parent
type. Example:
It’s easy to get this wrong. If we forget to add virtual
to the parent method or make a mistake when defining the signature of the child the override won’t take place. Example:
Here we forgot to add const
to the Child::f()
so it’s not overriding. The override
keyword will cause a compilation error if that happens.
clang
reports this error:
hidden overloaded virtual function ‘C::f’ declared here: different qualifiers (‘const’ vs unqualified)
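The sketch below shows a correct override next to the mistake described above. Mistaken and callThroughParent are made-up names; some compilers warn about the hidden function in Mistaken.

```cpp
#include <cassert>

struct Parent {
    virtual ~Parent() = default;
    virtual int f() const { return 1; }
};

struct Child : Parent {
    int f() const override { return 2; }  // signature matches: overrides
};

struct Mistaken : Parent {
    // Missing const: this hides Parent::f instead of overriding it.
    // Adding the override keyword here turns the mistake into a compile error.
    int f() { return 3; }
};

int callThroughParent(const Parent& p) { return p.f(); }
```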
Use cbegin()
and cend()
from STL
containers whenever possible as opposed to begin()
and end()
.
We can annotate a function with noexcept
to indicate it doesn’t throw exceptions:
noexcept
functions can be better optimized by the compilers. However, there’s no compile time check to enforce that a noexcept
function doesn’t actually throw exceptions or call functions that do. If an exception does escape, std::terminate is called.
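A small sketch using the noexcept operator to query the annotation at compile time. The functions add and parsePositive are illustrative.

```cpp
#include <cassert>
#include <stdexcept>

int add(int a, int b) noexcept { return a + b; }

int parsePositive(int v) {  // not annotated: may throw
    if (v < 0) throw std::invalid_argument("negative");
    return v;
}

// The noexcept operator only looks at the declarations, not at the bodies.
static_assert(noexcept(add(1, 2)), "");
static_assert(!noexcept(parsePositive(1)), "");
```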
My take on this item is that it’s not very broadly applicable.
The constexpr
can be used when declaring variables or functions. For variables, its value is resolved at compile time:
constexpr
variables can be a function of other constexpr
variables:
For functions, the behavior depends on whether all the arguments are known at compile time (e.g. literals or constexpr variables)
. If yes, then its result can also be used as a constexpr
, else it’s a regular function. The body of a constexpr
function can only depend on other constexpr
functions.
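A sketch of both uses of the same function, with made-up names (square, side, area, cells):

```cpp
#include <array>
#include <cassert>

constexpr int square(int x) { return x * x; }

constexpr int side = 4;
constexpr int area = square(side);  // evaluated at compile time
static_assert(area == 16, "");

std::array<int, square(3)> cells{};  // usable where a constant is required

// With a runtime argument it behaves like a regular function.
int squareAtRuntime(int v) { return square(v); }
```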
Downside: it’s hard to debug or profile constexpr
functions because printf()
is considered a side effect.
The gist of this item is that there’s a backdoor to mutate variables in const
methods: declaring them as mutable
, for example:
Hence const methods are not necessarily thread-safe.
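One way to make such a method safe again is to guard the mutable state with a mutex, itself declared mutable. The class below is a sketch loosely modeled on a cached computation; all names are assumptions.

```cpp
#include <cassert>
#include <mutex>

class Polynomial {  // hypothetical class caching an expensive result
public:
    int rootCount() const {
        std::lock_guard<std::mutex> guard(mutex_);  // protects the cache
        if (!cached_) {
            value_ = expensiveCompute();
            cached_ = true;
        }
        return value_;
    }

    int computations() const { return computations_; }

private:
    int expensiveCompute() const {
        ++computations_;
        return 3;
    }

    mutable std::mutex mutex_;  // mutable so it can be locked in const methods
    mutable bool cached_ = false;
    mutable int value_ = 0;
    mutable int computations_ = 0;
};
```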
This item discusses the conditions in which the default constructor, destructor and assignment operators are auto-generated.
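Let’s abbreviate: copy constructor (CC), copy assignment (CA), move constructor (MC), move assignment (MA) and destructor (D).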
We can build a table to encode the rules for when a member function is auto-generated. To read the table: a member function corresponding to a row is only auto-generated if none of the columns in which an ✓ exists is user defined.
| | CC | CA | MC | MA | D |
|---|---|---|---|---|---|
| CC | ✓ | | ✓ | ✓ | |
| CA | | ✓ | ✓ | ✓ | |
| MC | ✓ | ✓ | ✓ | ✓ | ✓ |
| MA | ✓ | ✓ | ✓ | ✓ | ✓ |
It also mentions the Rule of three:
Rule of Three: if you declare any of copy constructor, copy assignment or destructor, you should declare all three.
Templated operations do not count towards special member functions. For example:
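A sketch with a templated constructor that could, in principle, be instantiated with the class itself. The counter templateCalls is only there to observe which constructor runs.

```cpp
#include <cassert>
#include <type_traits>

struct C {
    static int templateCalls;

    C() = default;

    // Not a copy constructor, even when T happens to be C.
    template <typename T>
    C(const T&) { ++templateCalls; }
};

int C::templateCalls = 0;

// The compiler still generates the (trivial) copy constructor.
static_assert(std::is_trivially_copy_constructible<C>::value, "");
```

Copying a C therefore goes through the auto-generated copy constructor, while constructing from another type uses the template.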
We’ve discussed unique pointers in Smart Pointers in C++. This section describes other things I’ve learned from the book.
std::unique_ptr
supports custom deleters which are made part of the type:
The size of std::unique_ptr
is the same as raw pointers unless custom deleters are used.
We’ve discussed shared pointers in Smart Pointers in C++. This section describes other things I’ve learned from the book.
Moving shared pointers (as opposed to copying) avoids reference count changes (which can be expensive since it’s atomic).
Shared pointers allocate memory for a control block which among other things stores the reference count. Since it holds a pointer to the object and a pointer to the control block, std::shared_ptr
is typically twice the size of a raw pointer, and thus of a default std::unique_ptr
.
Differently from std::unique_ptr
, the custom deleter is not part of the type of a std::shared_ptr
because it can be stored in the control block.
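A sketch of two consequences: shared pointers with different deleters have the same type (so they fit in the same container), and moving leaves the reference count untouched. The function names are made up.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

std::vector<std::shared_ptr<int>> mixDeleters() {
    auto d1 = [](int* p) { delete p; };
    auto d2 = [](int* p) { delete p; };  // a different closure type than d1
    std::shared_ptr<int> a(new int(1), d1);
    std::shared_ptr<int> b(new int(2), d2);
    return {a, b};  // same type despite the different deleters
}

long countAfterCopy() {
    auto a = std::make_shared<int>(1);
    auto b = a;  // atomic increment of the reference count
    return a.use_count();
}

long countAfterMove() {
    auto a = std::make_shared<int>(1);
    auto b = std::move(a);  // ownership transferred, no count change
    return b.use_count();
}
```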
A std::weak_ptr
can be obtained from std::shared_ptr
but does not increase the reference count. This is useful to break cyclic dependencies, in which case reference counting alone never frees the objects, but this is very uncommon.
std::weak_ptr
shares the control block of the std::shared_ptr
it was created from, which keeps a separate weak reference count.
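A short sketch of the life cycle, wrapped in a hypothetical function weakPtrDemo:

```cpp
#include <cassert>
#include <memory>

bool weakPtrDemo() {
    auto owner = std::make_shared<int>(42);
    std::weak_ptr<int> observer = owner;  // does not bump the strong count
    bool countUnchanged = owner.use_count() == 1;

    bool readable = false;
    if (auto locked = observer.lock()) {  // temporary shared_ptr, if still alive
        readable = (*locked == 42);
    }

    owner.reset();  // last owner gone: the int is destroyed
    return countUnchanged && readable && observer.expired();
}
```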
We’ve discussed the merits of std::make_unique
and std::make_shared
in Smart Pointers in C++. This section describes other things I’ve learned from the book.
One interesting bit is that when using std::make_shared
, it allocates the object being created and the control block in the same chunk of memory.
The Pimpl idiom is a technique used to reduce build times. The idea is to move heavy dependencies from the .h
file to the .cpp
one. For example, suppose we have some dependency a.h
:
Our main class B
depends on A
, so its header includes it:
And here’s the implementation:
If we want to not depend on header a.h
in b.h
, a technique is to define a struct Impl
which depends on A
but we only forward declare in b.h
and create a single member variable as a unique pointer to it (hence the Pimpl name: pointer + implementation):
Then in the b.cpp
we actually define the struct Impl
and have the dependency on a.h
there:
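Note that we don’t need a destructor that frees impl
by hand because the unique pointer calls delete
when it falls out of scope. The book explains that the auto-generated destructor can fail to compile when it is instantiated where Impl is still an incomplete type (the fix being to declare the destructor in b.h and define it in b.cpp), but I didn’t get errors for that.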
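The whole arrangement can be sketched in a single block, with comments marking which file each part would live in. The contents of A and the method value() are assumptions, and this version follows the book in declaring the destructor in the header and defining it in the .cpp file.

```cpp
#include <cassert>
#include <memory>

// ---- b.h: no #include "a.h" needed here ----
class B {
public:
    B();
    ~B();  // defined in b.cpp, where Impl is a complete type
    int value() const;

private:
    struct Impl;  // forward declaration only
    std::unique_ptr<Impl> impl;
};

// ---- a.h: the heavy dependency (hypothetical contents) ----
struct A {
    int x = 3;
};

// ---- b.cpp: includes a.h and b.h ----
struct B::Impl {
    A a;
};

B::B() : impl(new Impl) {}
B::~B() = default;  // Impl is complete here, so unique_ptr can delete it
int B::value() const { return impl->a.x; }
```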
We’ve discussed std::move
in Move Semantics in C++. This section describes other things I’ve learned from the book.
One important observation is that function parameters are always lvalues, even if their type is an rvalue reference. For example:
In [2], we’ve seen that std::move()
is a static cast that converts any reference to a rvalue reference. std::forward<T>()
converts to an rvalue reference conditionally, only if the type T
is not an lvalue reference, i.e. the original argument was an rvalue.
This is clearer from an example:
Since T&&
is a universal reference, the way it is resolved depends on whether the passed value is a rvalue or lvalue reference. In f(s)
, the type T&&
in f()
resolves to std::string &
. In f(std::move(s))
, T&&
resolves to std::string &&
.
At f()
, p
is a lvalue, so if passed to g()
as is, we’d always call g(std::string &s)
regardless of whether f()
was initially called with a rvalue. We’d like to preserve the rvalue information as if g()
was being called directly. This is what std::forward<T>
does.
Simplistically std::forward<T>
is basically a static_cast<T&&>
but there are nuances we won’t discuss here.
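A sketch contrasting the two behaviors. f() and g() follow the names in the text; fPlain is a made-up variant that passes p along as is, and the overloads of g return a number so we can tell which one ran.

```cpp
#include <cassert>
#include <string>
#include <utility>

int g(std::string&)  { return 1; }  // lvalue overload
int g(std::string&&) { return 2; }  // rvalue overload

// p is a named parameter, hence an lvalue: always picks overload 1.
template <typename T>
int fPlain(T&& p) { return g(p); }

// std::forward restores the value category of the original argument.
template <typename T>
int f(T&& p) { return g(std::forward<T>(p)); }
```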
Worth noting that std::move
and std::forward
are both static casts.
Universal references are references written with the rvalue reference syntax (&&) where type deduction happens, either via auto
or templates. Examples:
It’s called a universal reference because it properly handles both rvalue and lvalue references.
This item basically says that if we got p
as a universal reference we should pass it along using std::forward<T>
:
But if we got it as a rvalue reference we should use std::move
:
This item basically says that if we have a single parameter function:
Do not add an overload using universal references like:
This might make the overload resolution difficult to reason about. For example, if we pass short
to f()
it actually calls the universal reference overload.
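A sketch of that surprise, where each overload returns a number identifying itself:

```cpp
#include <cassert>

int f(int) { return 1; }  // regular overload

template <typename T>
int f(T&&) { return 2; }  // universal reference overload
```

Calling f() with a short variable instantiates the template with T = short&, which is an exact match, whereas the int overload needs a promotion. With an int literal both are exact matches and the non-template overload wins.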
This is even worse if we use universal references in the constructor, because it mixes up with copy and move constructors.
The book delves into the details on why the last line calls the universal reference constructor.
This item provides several ways to avoid the universal references overloading. One of the most interesting is tag dispatch. It basically leverages static checks to make sure the right overload is used. So if we have:
We can turn f
into a dispatcher function and have the int and non-int logic as fImpl
:
When we call f()
with an integer type, std::is_integral<std::remove_reference_t<T>>
will resolve to std::true_type
and call fImpl(int x, std::true_type)
. Otherwise it resolves to std::false_type
and calls fImpl(T&& x, std::false_type)
.
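A sketch of the dispatcher using C++11 syntax for the type trait. The return values are an assumption, added so we can observe which fImpl ran.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Non-integral case.
template <typename T>
int fImpl(T&&, std::false_type) { return 0; }

// Integral case.
int fImpl(int, std::true_type) { return 1; }

// Dispatcher: the second argument is a tag selected at compile time.
template <typename T>
int f(T&& x) {
    return fImpl(std::forward<T>(x),
                 std::is_integral<typename std::remove_reference<T>::type>());
}
```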
You can’t directly write a reference to a reference:
But compilers might produce one as an intermediate step when deducing types, for example:
Since y
is int &
, auto&
resolves to int& &
, but gets collapsed to int&
in the end.
The rule for collapsing is simple: if both references are rvalue references, the result is an rvalue reference, otherwise it’s an lvalue reference. This explains the behavior of universal references:
For a
, auto &&
resolves to int& &&
and thus int &
. For b
, auto &&
resolves to int&& &&
and thus int &&
.
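The collapsing rule can be checked at compile time through type aliases, which are one of the contexts where a reference to a reference is allowed to form. LRef, RRef and collapseDemo are illustrative names.

```cpp
#include <cassert>
#include <type_traits>

using LRef = int&;
using RRef = int&&;

static_assert(std::is_same<LRef&,  int&>::value,  "& + & collapses to &");
static_assert(std::is_same<LRef&&, int&>::value,  "& + && collapses to &");
static_assert(std::is_same<RRef&,  int&>::value,  "&& + & collapses to &");
static_assert(std::is_same<RRef&&, int&&>::value, "&& + && collapses to &&");

int collapseDemo() {
    int x = 1;
    auto&& a = x;  // initialized with an lvalue: int&
    auto&& b = 2;  // initialized with an rvalue: int&&
    static_assert(std::is_same<decltype(a), int&>::value, "");
    static_assert(std::is_same<decltype(b), int&&>::value, "");
    a = 5;         // writes to x
    return x + b;
}
```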
One case where moving is not cheaper than copying is when small string optimization (SSO) is used. In this case the content is stored in a buffer inside the std::string
and not dynamically allocated, so we can’t simply do a pointer swap.
For cases where it’s not used, some STL code only makes use of move operations if they are noexcept
, for backward-compatibility reasons (to keep the strong exception safety guarantee of the C++98 versions).
This item discusses cases in which using:
doesn’t work. The failure scenarios described are due to universal references not to std::forward
in particular.
Case 1: Braced initializers.
This is explained in Item 2, the reason being that template type deduction for T
fails on a braced initializer: it is never deduced as std::initializer_list
.
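A sketch of Case 1, assuming a target function f() that takes a vector and a forwarding wrapper fwd() (both names are illustrative). The workaround relies on auto, which does deduce std::initializer_list.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <utility>
#include <vector>

std::size_t f(const std::vector<int>& v) { return v.size(); }

// Hypothetical perfect-forwarding wrapper around f.
template <typename... Ts>
std::size_t fwd(Ts&&... params) {
    return f(std::forward<Ts>(params)...);
}

std::size_t bracedDemo() {
    std::size_t direct = f({1, 2, 3});  // fine: converted to a vector
    // fwd({1, 2, 3});                  // error: cannot deduce a type for the braces
    auto il = {1, 2, 3};                // auto deduces std::initializer_list<int>
    return direct + fwd(il);            // fine: il has a proper type
}
```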
Case 2: 0 or NULL as null pointers
This is explained in Item 8 and is also related to template type deduction.
Case 3: Declaration-only integral static const and constexpr data members
This is a super specific scenario when we have:
The explanation is that C::k
doesn’t have an address in memory and since references are often implemented as pointers, this might fail in some compilers.
One natural question to ask is why rvalues are allowed to have references then? That’s because the compiler will create a temporary object for the rvalue which in turn has some address.
This temporary object creation doesn’t happen for static constexpr
, at least for some compilers. For clang
it works.
Case 4: Overloaded function names and template names.
This also happens if callback
is a template function.
Case 5: Bitfields.
Bitfields allow splitting a single type into multiple variables, for example:
Here field1
uses 10 bits and field2
uses 22 bits from the 32 bits of std::uint32_t
.
This also doesn’t work with universal references:
Things I learned from the chapter introduction:
The compiler creates classes for lambdas behind the scenes, called closure class. The lambda logic goes in the ()
operator, which is const
by default. The mutable
keyword in the lambda changes that.
Assigning a lambda to a variable creates an instance of the closure class, called a closure. Closures can be copied.
The item advises against using default capture by value ([=]
):
or default capture by reference ([&]):
Because they can cause dangling references.
For move-only objects like std::unique_ptr
we can use this syntax to move the object into the closure:
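A minimal sketch of the C++14 init capture, wrapped in a made-up function so the result can be observed:

```cpp
#include <cassert>
#include <memory>
#include <utility>

int moveCaptureDemo() {
    auto ptr = std::make_unique<int>(5);

    // Init capture: the closure member p is initialized from std::move(ptr).
    auto readValue = [p = std::move(ptr)] { return *p; };

    // ptr is now null; the closure owns the int and is itself move-only.
    return ptr == nullptr ? readValue() : -1;
}
```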
Generic lambdas are those having auto
in their argument list:
The underlying closure class is implemented using templates, possibly as:
If we want the closure to take a universal reference (auto&&
) and forward that argument, we don’t have the template T
available, so we can use decltype
:
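A sketch of a forwarding generic lambda. The overloads of g() return a number so we can tell which one was picked; forwarder is a made-up name.

```cpp
#include <cassert>
#include <string>
#include <utility>

int g(std::string&)  { return 1; }  // lvalue overload
int g(std::string&&) { return 2; }  // rvalue overload

// decltype(x) is std::string& for lvalues and std::string&& for rvalues,
// which is enough for std::forward to do the right thing.
auto forwarder = [](auto&& x) {
    return g(std::forward<decltype(x)>(x));
};
```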
According to this item, there’s never a reason to use std::bind
after C++14. It claims that lambdas are more readable, expressive and can be more efficient than std::bind
.
In other words, prefer std::async
to std::thread
. Example using threads:
And async:
The item suggests thread is a lower level abstraction than async, so async handles a lot of the details for you.
Another advantage of async is that you can get the result from the async function more easily than in a thread:
In line with Item 35’s claim that async handles a lot of the details for you, one thing you can’t assume is that it will always run the callback in a separate thread. It might actually wait and run the function in the current thread.
To force it to run as a separate thread we must use std::launch::async
:
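The three variants can be sketched next to each other, assuming a hypothetical task compute() that returns an int. Depending on the platform, linking may require a threading flag such as -pthread.

```cpp
#include <cassert>
#include <future>
#include <thread>

int compute() { return 42; }  // hypothetical task

// With a raw thread we need our own channel to get the result back.
int viaThread() {
    int result = 0;
    std::thread t([&result] { result = compute(); });
    t.join();
    return result;
}

// Default launch policy: may run in a new thread or be deferred until get().
int viaAsync() {
    std::future<int> fut = std::async(compute);
    return fut.get();
}

// std::launch::async forces asynchronous execution.
int viaAsyncForced() {
    std::future<int> fut = std::async(std::launch::async, compute);
    return fut.get();
}
```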
An unjoinable thread is one on which .join()
cannot be called. One example is when a thread has already been joined:
Or when the thread has been moved:
Or detached:
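The three situations can be sketched with joinable(), which reports whether .join() may still be called. The function name is made up.

```cpp
#include <cassert>
#include <thread>
#include <utility>

bool joinableStates() {
    std::thread joined([] {});
    joined.join();  // already joined

    std::thread source([] {});
    std::thread target = std::move(source);  // source was moved from

    std::thread detached([] {});
    detached.detach();  // detached

    bool ok = !joined.joinable() && !source.joinable() &&
              !detached.joinable() && target.joinable();
    target.join();  // target must be made unjoinable before it is destroyed
    return ok;
}
```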
If a thread is still joinable by the time it’s destructed, std::terminate is called and the program aborts, for example:
The recommendation of this item is to make sure threads are made unjoinable before they get destroyed.
As one way to achieve this, the author proposes a RAII-wrapper around std::thread
called ThreadRAII
, that enables configuring whether to call .join()
or .detach()
on the underlying thread at the ThreadRAII
’s destructor, effectively guaranteeing a thread is never left joinable on destruction.
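A simplified sketch of such a wrapper, written from the description above rather than copied from the book (which, for instance, also discusses the move operations). The helper runAndJoin is a made-up usage example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

class ThreadRAII {
public:
    enum class DtorAction { join, detach };

    ThreadRAII(std::thread&& t, DtorAction a)
        : action_(a), thread_(std::move(t)) {}

    ~ThreadRAII() {
        if (thread_.joinable()) {  // never destroy a joinable thread
            if (action_ == DtorAction::join) {
                thread_.join();
            } else {
                thread_.detach();
            }
        }
    }

    std::thread& get() { return thread_; }

private:
    DtorAction action_;
    std::thread thread_;  // declared last, so it is initialized after action_
};

bool runAndJoin() {
    std::atomic<bool> ran{false};
    {
        ThreadRAII worker(std::thread([&ran] { ran = true; }),
                          ThreadRAII::DtorAction::join);
    }  // the destructor joins here
    return ran.load();
}
```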
This item discusses the behavior of the destructor of std::future
(the type returned by std::async
). If it’s executed asynchronously either explicitly via the flag std::launch::async
or implicitly (Item 36), the destructor behavior changes: the destructor of the last future referring to the result blocks until the task completes, as if it called .join()
on the underlying thread.
Note this behavior is different from when a raw std::thread
is destroyed (Item 37).
This item discusses a scenario where we have 2 threads, t1
and t2
and we’d like t2
to wait for t1
until it signals it. One way to do this is using std::condition_variable
+ std::unique_lock
+ std::mutex
:
The item argues this is hacky and suffers from issues like cv.notify_one()
running before cv.wait(lk)
, which causes the latter to hang. It proposes an alternative using std::promise
+ std::future
:
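A sketch of the promise-based signal, with the calling thread playing the role of t1. The variables payload and observed are assumptions, added to show that the work done before the signal is visible after it.

```cpp
#include <cassert>
#include <future>
#include <thread>

int signalOnce() {
    std::promise<void> ready;
    std::future<void> readyFuture = ready.get_future();
    int payload = 0;
    int observed = 0;

    // t2 blocks until the promise is fulfilled, then reads the payload.
    std::thread t2([&] {
        readyFuture.wait();
        observed = payload;
    });

    payload = 7;        // work done by t1 before signaling
    ready.set_value();  // one-shot signal: cannot be set again
    t2.join();
    return observed;
}
```

Because the future is already satisfied if set_value() runs first, the waiting thread does not hang when the signal arrives before the wait.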
The major downside of this approach is that it can only be used once.
Independent assignments like:
Can be re-ordered either by the compiler or by the underlying hardware to improve efficiency. This poses a problem for concurrent programming because we might use an independent variable to indicate some computation has taken place:
If another thread relies on isReady
to determine compute()
has been run, we can’t let the compiler re-order the last two statements.
std::atomic
prevents that by telling the compiler: if an expression appears before a write to a std::atomic
variable in the source code, then it has to be executed before such write at runtime.
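A sketch of the flag pattern with isReady declared as std::atomic<bool>. compute() and isReady are the names from the text; the rest is made up, with the calling thread acting as the writer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int compute() { return 42; }  // hypothetical computation

int publishWithAtomic() {
    int value = 0;
    int seen = 0;
    std::atomic<bool> isReady{false};

    // Reader: spins until the flag is set, then reads the value.
    std::thread reader([&] {
        while (!isReady) {
            std::this_thread::yield();
        }
        seen = value;
    });

    value = compute();  // cannot be moved past the atomic write below
    isReady = true;     // atomic write (sequentially consistent by default)
    reader.join();
    return seen;
}
```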
There’s another optimization compilers can do, regarding redundant reads and writes. In the code below, the initial assignment of y
is never used and is later overwritten:
The compiler might want to re-write this as:
However, it’s possible that y
is mapped to special memory (e.g. an external device) instead of RAM and some other system might depend on that side-effect. volatile
prevents this optimization from happening.
Note that this is still subject to re-ordering, so we could combine volatile
and std::atomic
:
Suppose we have a function that takes a reference and makes a copy of it internally:
We’d want to also support a rvalue reference overload to avoid making additional copies for rvalues:
If C
is cheap to move, we can simplify things and just take r
by value:
Let’s first compare this new form with the set(C& r)
case. When we call set(C r)
with an lvalue, we’ll copy-construct it when calling set()
, but avoid the copy when move-assigning to c_
. Whereas for set(C& r)
, we avoid a copy when calling set()
but make a copy when assigning to c_
. Assuming moving is cheap, they incur roughly the same cost.
Now compare with the set(C&& r)
case. When we call set(C r)
with an rvalue we’ll move-construct it when calling set()
and we’ll move-assign to c_
. For set(C&& r)
we’ll only do one move-assign, so passing by value costs one extra move. Again, assuming moving is cheap, they incur roughly the same cost.
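To make the costs concrete, here is a sketch where C counts its own copies and moves. set(), r and c_ follow the text; Holder and the counters are assumptions.

```cpp
#include <cassert>
#include <utility>

struct C {
    static int copies;
    static int moves;

    C() = default;
    C(const C&) { ++copies; }
    C(C&&) noexcept { ++moves; }
    C& operator=(const C&) { ++copies; return *this; }
    C& operator=(C&&) noexcept { ++moves; return *this; }
};

int C::copies = 0;
int C::moves = 0;

class Holder {  // hypothetical owner of a C
public:
    // By value: r is copy- or move-constructed by the caller,
    // then move-assigned into c_.
    void set(C r) { c_ = std::move(r); }

private:
    C c_;
};
```

With an lvalue argument this performs one copy and one move; with std::move(c) it performs two moves.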
Many STL containers support emplacement instead of insertion. For example, std::vector
has emplace_back()
. Emplace methods take the constructor arguments instead of an object, so they avoid creating a temporary object when the argument has a different type but there is a constructor that accepts it.
For example, suppose we have a class C
that can be constructed from int
. If we call push_back(1)
, first we’ll create a temporary object via tmp = C(1)
then copy (or move) it when doing push_back(tmp)
:
If we use emplace_back(1)
, we’ll only call C(1)
directly inside the storage of std::vector
:
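A sketch that counts the extra objects created in each case. C is the class from the text, instrumented with hypothetical counters; reserve() is called first so that no reallocation interferes with the counts.

```cpp
#include <cassert>
#include <vector>

struct C {
    static int fromInt;
    static int copies;
    static int moves;

    C(int) { ++fromInt; }
    C(const C&) { ++copies; }
    C(C&&) noexcept { ++moves; }
};

int C::fromInt = 0;
int C::copies = 0;
int C::moves = 0;

// Returns how many copies or moves happened on top of the C(int) call.
int extraObjectsWithPushBack() {
    std::vector<C> v;
    v.reserve(4);
    C::copies = 0; C::moves = 0;
    v.push_back(1);  // builds a temporary C(1), then moves it into the vector
    return C::copies + C::moves;
}

int extraObjectsWithEmplaceBack() {
    std::vector<C> v;
    v.reserve(4);
    C::copies = 0; C::moves = 0;
    v.emplace_back(1);  // constructs C(1) in place
    return C::copies + C::moves;
}
```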
I really liked reading this book cover-to-cover and learned a lot! The book is rather verbose but it has a very fluid narrative and thus is smooth to read.
Despite the verbosity, the book has a lot of content and I had trouble summarizing, even leaving out the rationale for the recommendation. The markdown text for the post has over 1k lines (usually it’s fewer than 200).
The book also manages to make the content accessible while staying detailed and technically precise. One downside is that at times the author spends a lot of time discussing what seems like an extreme corner case, for example, in Item 27 (on avoiding overloaded universal references), the section called “Constraining templates that take universal references”.