Today’s guest post comes from Tom Ellis. If you haven’t heard, Tom released a novel library for interacting with relational databases this year - Opaleye. While similar to HaskellDB in some respects, Opaleye is distinct in its extensive use of arrows in order to guarantee safety of queries. In this post, Tom’s going to guide us through GHC’s special syntax support for the `Arrow`

type class.

```
{-# LANGUAGE Arrows #-}
import Control.Monad (guard)
import Control.Monad.Identity (Identity, runIdentity)
import Control.Arrow (returnA, Kleisli(Kleisli), runKleisli)
```

In Haskell we use the `Monad`

typeclass to provide an interface for encoding certain forms of computation. The `Arrow`

typeclass provides an interface that is in many ways similar. Everything that can be made an instance of `Monad`

(in a law-abiding way) can also be adapted to be an `Arrow`

(in a law-abiding way). This means that the `Arrow`

interface is less powerful than `Monad`

, but that more things can be made instances of `Arrow`

.

Working directly with `Monad`

combinators like `>>=`

and `>>`

can sometimes be awkward so Haskell provides “`do`

notation” as an alternative way of writing monad expressions in a way that can often be clearer. Likewise there is an “arrow notation” for writing arrow expressions. It is not enabled in GHC by default but can be turned on with the `Arrows`

language extension.

The subject of this blog post is how to understand and use arrow notation. It will also serve as an introduction to the `Arrow`

typeclass whilst keeping mention of the arrow combinators to a minimum. We will use develop our intuition for arrow computation rather than learn any technical and formal definitions!

Let’s kick off by refreshing our memory of `do`

notation. A very basic way to think of `do`

notation is that it is similar to a sequence of let bindings. For example, a basic Haskell expression to perform a numerical calculation might be

```
f :: Int -> (Int, Int)
= \x ->
f let y = 2 * x
= y + 3
z1 = y - 5
z2 in (z1, z2)
-- ghci> f 10
-- (23, 15)
```

`do`

notation supports expressing the exact same computation inside the `Identity`

monad, that is, a monad that has no “side effects”.

```
fM :: Int -> Identity (Int, Int)
= \x -> do
fM <- return (2 * x)
y <- return (y + 3)
z1 <- return (y - 5)
z2 return (z1, z2)
-- ghci> runIdentity (fM 10)
-- (23,15)
```

The `let`

bindings in `f`

become `<-`

under `do`

notation. (For technical reasons we have to wrap every intermediate expression in `return`

to lift them into the `Identity`

monad.) Arrow notation supports a similar translation:

```
fA :: Int -> (Int, Int)
= proc x -> do
fA <- (2 *) -< x
y <- (+ 3) -< y
z1 <- (subtract 5) -< y
z2 -< (z1, z2)
returnA
-- ghci> fA 10
-- (23,15)
```

In arrow notation `proc`

plays the part of “lambda”, i.e. backslash, `\`

, `<-`

plays the part of `=`

and `-<`

feeds an argument into a “function”. We use `returnA`

instead of `return`

.

The benefit of `do`

notation comes when we want to encode a computation that can’t be written using pure `let`

bindings alone. Here’s an example that uses the list monad to generate all coordinates within a given radius of the origin:

```
range :: Int -> [Int]
range r = [-r..r]
cM :: Int -> [(Int, Int)]
= \r -> do
cM <- range 5
x <- range 5
y *x + y*y <= r*r)
guard (xreturn (x, y)
-- ghci> take 10 (cM 5)
-- [(-5,0),(-4,-3),(-4,-2),(-4,-1),(-4,0),(-4,1),(-4,2),(-4,3),(-3,-4),(-3,-3)]
```

We read this as

- for each
`x`

in -5 to 5 - for each
`y`

in -5 to 5 - where
`x*x + y*y <= r*r`

- return the pair of
`x`

and`y`

Now let’s see how to use arrow notation to express the same computation. For trivial technical reasons we need to wrap the list monad to make it suitable for use with arrow notation. The wrapping and unwrapping don’t actually do anything except shuffle some type parameters around. In arrow computations we will use `K [] a b`

where instead of `a -> [b]`

. We’ll use abbreviated versions of the relevant wrapping and unwrapping functions:

```
type K = Kleisli
k :: (a -> m b) -> Kleisli m a b
= Kleisli
k
runK :: Kleisli m a b -> (a -> m b)
= runKleisli runK
```

Then we can use arrow notation to implement the radius list computation as follows:

```
cA :: Kleisli [] Int (Int, Int)
= proc r -> do
cA <- k range -< 5
x <- k range -< 5
y -< (x*x + y*y <= r*r)
k guard -< (x, y)
returnA
-- ghci> take 10 (runK cA 5)
-- [(-5,0),(-4,-3),(-4,-2),(-4,-1),(-4,0),(-4,1),(-4,2),(-4,3),(-3,-4),(-3,-3)]
```

What’s the point of arrow notation? So far we have only seen that it is able to replicate some examples in `do`

notation. Well, the point is that arrow notation forbids some computations that `do`

notation allows. In particular all “arrow actions” must be “statically” known”. That sentence was a mouthful! What does it mean? I am calling the expression that comes between `<-`

and `-<`

in a row of arrow notation the “arrow action”. “Statically known” means that if we have a couple of rows of arrow notation

```
-- y <- action1 -< x
-- z <- action2 -< y
```

then the expression `action2`

cannot depend on `x`

or indeed anything bound on the left hand side of an arrow notation row.

This restriction has important practical consequences. For example, our Haskell IO system might be based on the following primitives

```
getLineM :: String -> IO String
= do
getLineM prompt print prompt
getLine
printM :: String -> IO ()
= print
printM
writeFileM :: FilePath -> String -> IO ()
= writeFile filePath string writeFileM (filePath, string)
```

then we could use `do`

notation to write a procedure which reads a line of user input and either prints something out or writes to a file based on that input.

```
procedureM :: String -> IO ()
= \prompt -> do
procedureM <- getLineM prompt
input if input == "Hello"
then printM "You said 'Hello'"
else writeFileM ("/tmp/output", "The user said '" ++ input ++ "'")
-- ghci> procedureM "Say something"
-- "Say something"
-- Hello
-- "You said 'Hello'"
-- ghci> procedureM "Say something"
-- "Say something"
-- Bye bye
-- (Writes to /tmp/output)
```

However, there is no way to express this in arrow notation using only the same primitives.

```
-- procedureA :: K IO String ()
-- procedureA = proc prompt -> do
-- input <- k getLineM -< prompt
-- if input == "Hello"
-- then printM
-- else writeFileM "/tmp/output" ...
-- Oh no! This won't work because we were trying to refer to a <-
-- bound variable on the left hand side of a -<
```

Why do we want to use arrows when they have these restrictions? Going into details would take a whole other blog post of its own, but I will mention briefly a few places where the full generality of monads is too much.

Firstly, an arrow-only interface can often allow you to take advantage of opmizations that a monadic interface could not. For example parsers written using parser combinators can be made more memory efficient if we know statically the parsing action they are going to perform. Similarly it can help to reduce the chance of memory leaks in functional reactive programming (e.g. with `netwire`

) if actions cannot depend in an unrestrained way on the result of previous actions.

In embedded domain specific languages this forcing non-dependence can make code generation more easily match a target language. For example in the SQL-generating relational query language Opaleye queries are built up from arrows called `QueryArr`

. Using an arrow rather than a monad allows the semantics of the domain specific language to more closely match the semantics of the underlying SQL language.

So in summary, arrows are a Haskell abstraction, similar to monads, and arrow notation is a way of writing arrow expressions which is similar to to `do`

notation. However, there are some restrictions on what you can do with arrows that are not shared by monads. The benefit of the restriction is you can often receive a performance benefit or use your more specific knowledge about the structure of an arrow computation to your advantage.

*This post is part of 24 Days of GHC Extensions - for more posts like this, check out the calendar*.