RAII

29 Apr 2023

group of miniature humans mopping the floor

One of my favorite features from C++ is the RAII idiom, which stands for Resource acquisition is initialization [1], a term coined by Bjarne Stroustrup.

The general idea is that resource acquisition (memory, locks, file descriptors) is done during the constructor and released during the destructor.

In this post we wish to discuss RAII in a broader context. We’ll cover applications of RAII in C++, in Python and as a UX paradigm.

Recap

Let’s revisit some concepts from C++ which will be needed to discuss RAII in more depth.

Constructors and Destructors

Many object oriented languages have the concept of a constructor, a function that is called to initialize the object upon creation. C++ classes also have a destructor, a function that is called before the object is destroyed.

The destructor of an object is called when it falls out of scope:

struct MyClass {
    MyClass() { /* constructor */ }
    ~MyClass() { /* destructor */ }
};
{
    MyClass x;
    x.f();
} // object is destroyed, ~MyClass is called

Let’s now try to understand how the destruction of objects happen in C++.

Lifetime and Destruction

The lifetime of an object starts when it’s created and ends when it’s not reacheable in any way. In C++, because of its “pass-by-value” behavior, the object’s lifetime starts when it’s assigned to a variable and ends when such variable falls out of scope. For example:

void g(std::string);

std::string y;

{
    std::string x = "hello";
    g(x);
    y = x;
} // x falls out of scope

In here the object “hello” is assigned to x. When x falls out of scope, the object’s lifetime ends. Note that when we “pass” x to g() or assign to y, we’re in reality making a copy (i.e. creating a new object), so it doesn’t affect the lifetime of the object “hello” first assigned to x. This makes it easy for the C++ runtime to know when to delete an object.

There are some exceptions though. The first is due to an optimization known as constructor elision (more in Moving Semantics in C++) in which it will avoid a copy when returning a value from a function.

std::string f() {
    std::string x = "hello";
    return x;
}

std::string y = f();

The object is not deleted when x falls out of scope, but rather “transfered” to y. This is an optimization which not all compilers might perform, though. In this case the lifetime of the object is prolonged to end when y falls out of scope, but it’s still feasible for the C++ runtime to track.

Another case is when the variable we assign x to is a reference:

void g(const std::string &);

void f() {
    std::string x = "hello";
    g(x);
}

Here x is still the “owner” of the object because when we pass x to g(x), the latter is simply holding a reference. If g() were to keep such object around and access it after x falls out of scope, we’d have undefined behavior, so again, the object’s lifetime ends when x falls out of scope.

In all the cases mentioned so far, the C++ runtime has visibility on the lifetime of the object and thus is able to delete them. Now let’s consider a third case, when we use raw pointers:

void g(const std::string *);

void f() {
    MyClass* x = new MyClass();
    g(x);
}

Now the object belongs to all variables in scope that have a pointer to it. Theoretically, the lifetime of this object only ends when all such variables fall out of scope. In this case, the C++ runtime doesn’t have global visibility on which variables have pointers to an object, so it will not try to delete the object. It’s up to the user to do so.

C++11 introduced smart pointers (std::unique_ptr and std::shared_ptr) which do have visibility on the lifetime of the object, at a cost, so they can delete it - note this is done at the STD library level, not by the C++ runtime. There are exceptions in the presence of cyclic references, which we discuss later in Finalizers.

Applications

Memory. Dynamic memory allocation is one of the main usages for RAII. One example is std::vector, which allocates a certain amount of memory upon construction, possibly some more during its lifetime via methods like .push_back() and when it falls out of scope, it releases all that memory. For example:

MyClass c;
{
    std::vector<int> v(10, c); // allocate memory
} // releases memory. Calls ~MyClass 10 times

Pointers. One problem with raw pointers is that we don’t know when they fall out of scope (see Lifetime and Destruction above), so the developer has to make sure the memory the pointer points to is deallocated appropriately. Smart pointers help with this problem using RAII that deallocates the resouces when it determines the object is not being used anymore.

{
    std::unique_ptr<MyClass> p;
    {
        auto p1 = std::make_shared<MyClass>();
        p1->doSomething();
        p = std::move(p1);
    } // p1 fell out of scope but p lives on
    p->doSomethingElse();
} // p fell out of scope and owns the object. Calls ~MyClass

Files. Another application of RAII is opening files or network sockets and making sure they will get closed. Example using std::ofstream:

{
    std::ofstream ofs ("test.txt"); // open file
    // Read file's content
} // closes the file upon destruction

Locks. A final example is making sure locks are released, avoiding potential deadlocks. Example using std::lock_guard:

std::mutex m;
{
    std::lock_guard<std::mutex> lock(m); // acquire lock
    // access exclusive resource
} // lock is released

Python

Python has a simplified version of RAII, known as context manager. One of the most common scenarios where I use context manager is for opening files:

with open(filename, "r") as file:
    content = file.read()

In this example, the file is closed as soon as the block with since we know file fell out of scope. Context managers have analogous to constructors and destructors, namely __enter__() and __exit__().

The limitation is that this only works within a single scope (likely within a function’s body), i.e. you can’t have one function open a file and have another close it.

Finalizers

Python classes do have destructors, the __del__() method, that gets called when the object is garbage collected:

class A():
    def __del__(self):
        print("destroying")


def f():
    a = A()

f()
gc.collect()
print("ending program")

The code above prints "destroying" since a falls out of scope in f() and we force garbage collection, and then "ending program". Python’s garbage collection is based on reference counting [2] so at first sight it looks like it behaves exactly like C++ with std::shared_ptr. There are two major differences, however.

1. Cyclic references. Reference counting algorithms have trouble with cycles. C++’s stance is that it’s up to the user to make sure cycles don’t happen or at least that they use weak pointers (std::weak_ptr). Python’s GC algorithm on the other hand is able to detect them.

To be able to handle cycles Python’s GC has to keep additional data structures in memory, mainly a doubly-linked list with every container object (i.e. objects that can contain a reference to one or more objects). Determining whether it’s safe to delete objects inside cyclic references requires doing a pass on all these objects, which would be prohibitive do to every time some reference count goes to 0, which takes us to our next point.

2. Determinism. With std::shared_ptr C++ will invoke the destructor as soon the object reference count goes to 0. Python’s GC doesn’t offer such guarantees. It might delete the object right away or it might decide to batch the deletion for efficiency and even do so in a separate thread, so __del()__ is called, but eventually.

So finalizers are not as widely useful as C++ destructors.

RAII as UX

Since RAII is associated with C++, a systems language, I never thought of it as a user experience (UX) paradigm, until I saw some news about credit card scams in San Fracisco [3].

ATMs

The scam works as follows: Criminals jam the physical card reader in the ATM so users are forced to use authentication by card tapping. The problem is that the ATM UI requires either explicit action from users to terminate the current session or a timeout period.

Most people forget to terminate the session, so the scammer exploits it by waiting behind the victim and once they’re done, they quickly go to the ATM and take over the previous user’ session.

This scam doesn’t work with physical cards because the moment the user removes their card, the session is over. Of course, there’s the risk of users forgetting the card inside the slot, but ATMs are designed to be very noisy to reduce this risk. There are other mechanisms to alleviate problem:

Flow 1. Users can only take the cash after they take out the card. This has the (albeit lower) risk of an absent-minded user forgetting to take the cash. Also the operation being performed might not involve withdrawing cash.

Flow 2. For every operation the user is required to insert and then remove the card immediately before the operation starts. After the operation ends, the session ends, preventing criminals from using the account. The downside is the inconvenience of having to repeat this authentication possibly multiple times.

Flow 2 is not exclusive to “card insertion” and it would work with tapping as well. The ATM’s tapping flow from the news is likely newer and less robust, so that’s probably why it’s being used by scammers.

I found this interesting because the card insertion is basically a “real world” RAII! I initially thought it to be a more robust mechanism than tapping because it adds a boundary for the end of operation, but I think that’s not correct if Flow 2 is implemented.

What are other RAII mechanism ATMs could use? One I can think of is a weight sensor (i.e. a scale): when the user leaves the platform, the weight change would cause the session to terminate.

Temporary use of resources

Another real world application of RAII is when you exchange some valuable while using a resource. For example, in the Israel Museum in Jerusalem, you can leave your ID in the reception to get an audio guide device.

This system was bad for me because I managed to return the device but forgot to get my ID back and only noticed it missing when I was back in Tel Aviv! Luckily I was on a business trip, so a kind co-worker who lived nearby rescued it and returned it to me on the day after.

See caption. — Figure 1: Figurines from the Yarmukian culture. The Ancient Levant was home to very old civilizations and the Israel Museum has a lot of their unique artifacts.

A more recent instance was when using the lockers of the National Palace Museum in Taiwan, we had to put in a coin to be able to lock it, getting the coin back upon unlocking it.

Conclusion

I enjoy writing about topics I think I already know about because it forces me to learn in more depth and I often end up learning new things, for example about Python finalizers [2].

In this post we also got more insights on why RAII works in C++ but not in Python. Mainly for two reasons:

C++ pass-by-value behavior makes lifetime tracking cheap.
When we use pass-by-reference (pointers), its version of GC (smart pointers) makes different trade-offs vs. Python’s, in particular letting go of detecting cycles but preserving determinism of when destructors are called.

[Book] Designing Data Intensive Applications - The book Designing Data Intensive Applications describes eventual consistency and strong consistency. This reminds me a lot the behavior difference between C++ destructors and Python finalizers.

If we imagine that “unreacheable objects get destroyed” is a consistent state, then C++ destructors provides strong consistency and Python finalizers eventual consistency.

References

[1] Wikipedia - RAII
[2] Garbage Collector Design, Pablo Galindo Salgado
[3] ABC News: Chase Bank ATM victim goes undercover to prove he was scammed by glue and ‘tap’ thief.

| Tags: c++, python, memoir, travel