Rust Memory Management

kuniga.me > NP-Incompleteness > Rust Memory Management

Rust Memory Management

13 Mar 2019

Graydon Hoare is a Software Developer who created the Rust programming language while working at Mozilla Research [1]. He has an interesting presence on the internet, for example responding to this question and on Twitter.

In this post we’ll talk about one of the key features of Rust, which I find the hardest to wrap my head around, which is its memory management.

Motivation

In most programming languages we either have to manage memory allocation ourselves or rely on a complex garbage collector to which we have limited control and can lead to unpredictable performance bottlenecks.

Rust has a set of constraints around memory allocation that results in a deterministic automatic memory management.

We’ll now delve into more details on what these constraints are, how they enforce the necessary guarantees to enable an efficient memory management by the runtime.

Ownership Rules

Rust has the following basic rules around ownership:

• Each value in Rust has a variable that’s called its owner
• There can only be one owner at a time
• When the owner goes out of scope, the value will be dropped They’re very basic but have deep implications in how we think about programs. Let’s analyze some of them.

Assignment transfers ownership

To conform the second rule, whenever we assign a variable to another, we are transferring the ownership from one variable to another. For example:

In the example above, vec1 transfer ownership of its data to vec2. This means that we can neither read nor write to vec1 anymore. It’s as if it was out of scope (unless it is assigned ownership to some other data later).

By having a single owner we don’t have to worry about keeping track of references to a given object as garbage collectors do to know when we are allowed to free the memory. For example, if we had:

Because vec2 owns the vector allocated initially and it goes out of scope after line 4, the runtime can free the memory safely, since we know vec1 cannot access that data after line 3.

Similarly, when we use a variable as argument to a function, its data is transferred to the parameter. In the example below, vec1’s data is transferred to vec in mutate_vec(), so we cannot access it in line 9.

One way to return the ownership back to vec1 is for the function to return the argument.

References & Borrowing

To avoid transferring data to a variable on assignment, we can use references. In the example below, vec2 “borrows” the data from vec1 but there’s no ownership transfer. We have read access to the data via vec2, which we can do by dereferencing vec2 with &vec2.

If we want write access, we need to make vec1 mutable and also obtain a mutable reference to vec1 via the &mut operator, like in the example below:

However, if we try to access the data via vec1 while vec2 has a mutable reference to it, we’ll get an error.

We can take as many (non-mutable) references as we want:

But once we try to obtain a mutable reference, we get an error:

Read and write locks. Rust enforces these constraints to prevent race condition bugs in multi-thread applications. In fact the borrowing mechanism via references is very similar to read locks and write locks.

To recall, if some data has a read lock, we can acquire as many other reads locks as we want but we cannot acquire a write lock. This way we prevent data inconsistency between multiple reads since we prevent data mutations by not allowing writes. Conversely, if some data has a write lock, we cannot acquire neither read or write locks.

We can see that the regular borrowing implements a read lock while a mutable borrowing implements a write lock.

Dangling references. One potential issue with references is that if we return a reference to some variable that went out of scope.

Luckily, the compiler will prevent this case by making a compile error. It’s now always the case that we cannot return a reference. If the reference we’re returning was sent as argument, it’s still a valid one, for example:

However, if we try to pass two parameters, we’ll run into a compile error:

To see why it’s not possible to guarantee this memory-safe, we can consider the following code calling get_largest:

Here we’re sending references for both vec1 and vec2, and returning back either of them. If vec2 happened to be larger, we’d return it to result and try to access the data after vec2 went out of scope.

However, if result is used while both vec1 and vec2 are in scope, it should be theoretically safe to allow calling get_largest:

In fact, it’s possible and we’ll see how next, but we’ll need to introduce a new terminology.

The lifetime of a variable represents the duration in which the variable is valid. In the example below, the lifetime of variable a is from line 2 to line 7. The lifetimes for b is from 4 to 5, for c is line 5 and for d is line 7. Note that a’s lifetime contains all other lifetimes and b’s lifetime contains c’s lifetime.

We can annotate a function using a syntax similar to generics to parametrize the function arguments by their lifetimes. For example, if we want to include the lifetimes of the arguments in get_largest(), we can do:

This is essentially binding the each variable to the lifetime specified by 'a. Note it’s forcing that both variables have the same lifetime and that the return type has the same lifetime as the input parameters.

Now, if we replace get_largest() with get_largest_with_lifetime(), we won’t get compiler errors. In the example below, result has the same lifetime was the common lifetimes of vec1 and vec2, which is vec2’s lifetime. This means we’re fine using result with the inner block.

Conclusion

Rust documentation is very detailed and well written. All the concepts presented in this post are explained in length there. In here I’m presenting them in my own words and different examples (vec instead of string).

I also tried to group topics around memory management, which is different from the documentation. In there “lifetimes” are not covered under ownership and I skipped the generics discussion.

References

• [1] Wikipedia - Rust (programming language)
• [2] Rust Documentation: What is Ownership?
• [3] Rust Documentation: References & Borrowing
• [4] Validating References with Lifetimes