18 Nov 2016
In this short post we’ll discuss lazy evaluation in OCaml and study a data structure called Stream. It’s based mainly on chapter 4 of Purely Functional Data Structures and it’s part of a series of study notes on that book.
Lazy evaluation is a property in which an expression is not evaluated immediately (suspended) and when it’s evaluated the first time, the subsequent calls are cached (memoized). Functional languages like Haskell are lazy evaluated, but not OCaml, which is eagerly evaluated. Because the results are memoized, expressions that are lazily evaluated must always return the same value given the same inputs. In Haskell it’s easy to enforce because functions are pure, that is, they do not rely on side effects.
In the book the author defines a notation for lazy evaluation:
datatype a susp = $ of a
In OCaml, we can work with lazily evaluated expressions through the Lazy module. The definition of a suspension is similar:
type 'a t = 'a lazy_t
and we can use the lazy construct. Let’s define a simple expensive function, a naive Fibonacci, which runs at
We can create a lazy evaluated version of it:
We can see that by assigning it to a variable, it doesn’t cause the function to be executed:
The author defines a matching operator (
$) that causes a lazy expression to be evaluated, but I couldn’t find a corresponding operator in OCaml. Nevertheless, the Lazy module has the
force() function, which does exactly that:
Note that if we execute the same expression again, the second time it returns much faster, because of the memoization.
We are now ready to introduce the stream data structure.
A stream is a lazy version of a linked list. Recall that a linked list is composed of nodes which point to the next node, or equivalently, to the remaining of the list. The usual definition of a linked list is:
If we want to be explicit that a node is actually pointing to a sublist, we could use an intermediate type, listType:
list are mutually recursive (they depend on each other), so we have to define them together by using the and construct.
In a stream the pointer to the remaining of the list is lazily evaluated, so the type is:
With this basic structure we can implement many of the list functions for streams.
Let’s start with the concat operator (++):
Note that it never evaluates
streamB and it only evaluates the first cell of
To help us testing, we can define function to convert from a list:
and a function that forces the evaluation of the entire stream, essentially converting it back to a list:
take(n) function returns the first n elements from a stream. Like the concat function, only the first node of the stream is evaluated. The recursive call is suspended.
drop(n) function removes the first n elements from a stream and returns the result. In this case, we need to evaluate all the n recursive calls:
drop look very similar but one is lazy while the other is not. That’s because the head of the stream is not suspended, but the tail is. In the drop case we need to find the (n+1)-th element that will be the new head of the stream. In the take case, we’re not changing the head, and since the tail is suspended, it can wait.
The reverse function reverts the order the elements in a stream. In this case it’s more obvious that since we’re changing the location of the head, it must be eagerly evaluated.
In this post we saw that OCaml is not lazy evaluated but we can rely on the Lazy module to accomplish that. We also learned a new data structure, stream, which is recursively lazily evaluated and operations like
take play well with laziness, while other like
reverse do not.
The full implementation with comments is available on github.