Functors

kuniga.me > NP-Incompleteness > Functors

Functors

09 Sep 2012

In Chapter 10 of Real World Haskell, the authors explain how to write a parser for a PGM file.

They start with a very direct and verbose solution. Along the chapter they perform abstractions that allows code reuse. Then, they end up with a much concise and legible solution.

Part of the abstraction makes use of functors, and it’s about them that we’ll talk briefly in this post. Since functors also exists in C++, we’ll talk about them too, even though they represent different concepts in each of these languages.

Functors in Haskell

You’re probably acquainted with maps. Nevertheless, let’s see an example in Haskell:

map (+1) [1,2,3]

A map receives a function that transform an element of type a in a element in a type b and a list of elements of type a. We can see this by checking the its type on ghci:

>:t map
map :: (a -> b) -> [a] -> [b]

What if we want to apply a function to the elements of other structures beside lists?

Let’s start with the Maybe datatype already defined on Prelude:

data Maybe a = Nothing | Just a

In this case, the example map (+1) (Just 1) will not work. So, let’s write a function to apply the function only to the “internal” type of a Maybe:

maybeMap:: (a -> b) -> Maybe a -> Maybe b
maybeMap f (Just x) = Just (f x)
maybeMap _ Nothing  = Nothing

A classical example is performing a map over the elements a tree. Let’s define a datatype Tree representing a binary tree:

data Tree a =   Node (Tree a) (Tree a)
              | Leaf a
                deriving (Show)

To apply a function to its internal elements we can do:

treeMap :: (a -> b) -> Tree a -> Tree b
treeMap f (Leaf a)   = Leaf (f a)
treeMap f (Node l r) = Node (treeMap f l) (treeMap f r)

We can test it by doing:

> let t = Node (Leaf 1) (Node (Leaf 2) (Leaf 3))
> treeMap (+1) t
Node (Leaf 2) (Node (Leaf 3) (Leaf 4))

The Prelude allow us to implement a common interface using the typeclass Functor:

class Functor f where
    fmap :: (a -> b) -> f a -> f b

Thus, we can specify an implementation for Maybe:

instance Functor Maybe where
    fmap = maybeMap

… for our Tree datatype:

instance Functor Maybe where
    fmap = treeMap

… and for lists:

instance Functor [] where
    fmap = map

We can then use now fmap to perform mapping on lists, Maybe or Tree:

fmap (+1) [1, 2, 3] // [2, 3, 4]
fmap (+1) (Just 10) // Just 11
fmap (+1) (Leaf 0) // (Leaf 1)

Note that we can also apply this function recursively to perform mappings into inner types, for example:

fmap (fmap (+1)) [[1,2], [3]] // [[2,3], [4]]
fmap (fmap (*2)) [Just 4] // [Just 8]

The idea of functors is to perform mappings over the internal elements of datatype, but without affecting its structure.

For instance, a function that receives a (Just 1) and returns Nothing, is not preserving the Maybe structure. The same can be said about any function that changes the topology of a Tree.

So, functors are expected to satisfy two properties: identity and composability.

The identity property says that (fmap id) is equivalent to id. Since we’re not changing the internal elements neither the structure of a datatype, we are not changing the datatype at all.

The composability means that (fmap f) . (fmap g) is equivalent to fmap (f . g)

For further details, this subject is pretty well explained in [2] and [3].

Functors in C++

In C++ we also have the concept of functors, but it has a different meaning. Here a functor is an object that acts as a function.

Basically, C++ functors are classes that define the operator(). Here’s an example:

struct Functor {
    void operator() (){
        cout << "Functor\n";
    }
};
...
Functor f;
f();

What’s the difference between overloading the operator() and using any other method instead? For instance, one could define a method apply() and use it like this:

struct Functor {
    void apply(){
        cout << "Functor\n";
    }
};
...
Functor f;
f.apply();

The answer to the question is that, besides the syntax sugar (i.e. calling f() instead of f.apply()), overloading the operator() allows functors and function pointers to be used alike with templates. Consider the following example,

struct Functor {
    void operator() (){
        cout << "Functor\n";
    }
};
void f(){
    cout << "Function\n";
}

template<typename F>
void callFunction(F func){
    func();
}

int main(){
    callFunction(f); // Function
    callFunction(Functor()); // Functor
    return 0;
}

Here, both f() and an instance of Functor can be passed as parameter to callFunction(). This is pretty much like the STL sort function.

The advantage of functors over regular functions is that they can have internal state, like the example below.

struct Functor {
    int x;

    Functor(int _x){
        x = _x;
    }
    int operator() (int y){
        return x + y;
    }
};

int main (){
    Functor addTwo(2);
    cout << addTwo(13) << endl;
    return 0;
}

Sure, one could achieve the same behavior with functions using global variables, but with functors we get more encapsulation.

References