24 Days of GHC Extensions: Deriving

Today I’m happy to announce another guest post for 24 Days of GHC Extensions. In this post, Andraz Bajt is going to show us how various extensions available to GHC allow us to write less code. Take it away, Andraz!

In Haskell we use type classes to overload functions for multiple types where there are several sensible implementations. However, many of these functions are almost mechanical to write, making them tedious to write out by hand. Today, we’ll focus on extensions to the deriving mechanism. These allows us to automatically derive some useful common instances for our own datatypes and thus leave us with less code to maintain.

Functor

Functors are a very useful beast. There is the usual notion of the container like List or Maybe that you fmap over element-wise but functors can also enable more powerful abstractions.

The first motivating example I would like to show are free monads. You can get a monad for your “language” for free, given that the parameter you give to Free is a functor in itself and herein lies the problem - as we have to write out an Functor instance, the approach is hardly free for the programmer.

data MiniIoF a = Terminate
               | PrintLine String a
               | ReadLine (String -> a)

type MiniIoM = Free MiniIoF

This describes a mini language for doing IO operations. The important part to notice here is the type parameter a describing the type of the “remaining” or continuation if you will. If we want to do something with this remaining, say append an action to the program, we need to be able to fmap over MiniIoF to access it. This is why we need the Functor instance.

instance Functor MiniIoF where
    _ `fmap` Terminate = Terminate
    f `fmap` PrintLine s a = PrintLine s (f a)
    f `fmap` ReadLine g = ReadLine (f . g)

We are just mechanically taking it apart, putting f at the right places, and putting it back together. Considering the functor laws this is the only valid instance (that is not throwing errors or doing something funny), so why can’t the compiler write it for us?

It turns out that GHC can write it for us. For this to work we just need to enable the DeriveFunctor extension.

{-# LANGUAGE DeriveFunctor #-}
data MiniIoF a = Terminate
              | PrintLine String a
              | ReadLine (String -> a)
              deriving Functor

As if by magic we have our functor. GHC quite literally writes our code, you can see the generated code with the -ddump-deriv flag.

Foldable, Traversable

I already mentioned containers when talking about functors. Apart from mapping, two other common general operations over containers are traversal and folding, so if we are writing custom containers (or some different data structure) it is nice to provide instances for Traversable and Foldable. Let’s take a look at a reimplementation of lists and their instances

data List a = Nil | Cons a (List a) deriving (Eq, Show, Functor)

instance Foldable List where
  foldr _ z Nil = z
  foldr f z (Cons x xs) = f x $ foldr f z xs

instance Traversable List where
  traverse _ Nil = pure Nil
  traverse f (Cons x xs) = Cons <$> f x <*> traverse f xs

Prelude> foldr (++) "" $ Cons "1" $ Cons "2" $ Cons "3" $ Nil
"123"

But looking at these instances you might notice they are pretty mechanical definitions. If you look at the types they pretty much write themselves and indeed they can write themselves using some GHC trickery. Namely extensions DeriveFoldable and DeriveTraversable respectively

{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}

data List a = Nil | Cons a (List a)
              deriving (Eq, Show, Functor, Foldable, Traversable)

DataTypeable

There are some type classes that aren’t just boring to implement but are even impossible to implement instances manually. The culprit here is Typeable and partner Data. Data is a type class for abstracting over “data shape” allowing you to write generic code for any user-defined ADT without template Haskell. This is something you really don’t want to write by hand since it’s very tedious. The Data type class is also a subclass of Typeable so the two are often defined in pairs. Typeable defines a way for generating a value representative of a type and relies on some data types that do not expose constructors so you cannot implement it by hand. This type class allows type-casting and other “evil tricks” at runtime but it is sometimes necessary for writing efficient or generic code. It generates 128 bit fingerprint for a given type and nests fingerprints for type parameters if there are any. Since these fingerprints are supposed to be unique you must be familiar with implementation details of other instances (e.g. those provided in base) in order not to clash. It might be a good idea to let generation of this code to GHC using the extension DeriveDataTypeable

{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE DeriveDataTypeable #-}

data List a = Nil | Cons a (List a)
              deriving ( Eq, Show
                       , Functor, Foldable, Traversable
                       , Typeable, Data)

Newtypes

Sometimes you want to treat some values of a given type differently and this is where newtypes come in. Just wrap it up. It doesn’t cost anything at runtime but it might cost you something at code-writing-time. Let’s say you have some monad transformer stack for your application and you wrap it up in a newtype.

newtype App a = App { unApp :: ReaderT Config (StateT AppState IO) a }

But App is not a monad anymore. Nor is it MonadReader, MonadState or MonadIO if you are using mtl. You have to code directly against the specific type. You lost the ability to run programs polymorphic in the monad type against your application monad stack. In order to get it back you have to write all the instances but they will be extremely boring since all you have to do is lift the operations. And it gets worse! Now you are creating overhead and hoping that compiler is smart enough to optimize it away (newtypes are guaranteed to disappear). The alternative is to just use a type alias but this can mean giving up some precision in our types.

Or you can just ask the compiler to generate all the instances for you. The GHC doesn’t even have to play by its own rules. It can just take the existing instances and insert some (free at runtime) coercions to make them work with newtypes. This is the GeneralizedNewtypeDeriving extension.

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype App a = App { unApp :: ReaderT Config (StateT AppState IO) a }
                deriving (Monad, MonadReader Config,
                          MonadState AppState, MonadIO)

Now we can happily use out newtypes and still have all the polymorphism we want.

Conclusion

We seen quite some automatically generated code. I would argue that using these extensions to generate instances is a good idea and you should do it as writing out mechanical boilerplate violates the DRY principle in a way. You are repeating yourself, not in exact code but in patterns. On a more pragmatic side: leveraging modern GHC features will leave you with less code to maintain and thus more productive.

Some extensions are quite controversial but in my opinion the extensions featured above are quite innocent. It is even simple to get rid of them: just dump the generated code and paste it into the source. Now go write some interesting code instead of writing boring instances!