I conjecture that the majority of bugs that programmers encounter occur at the boundaries between 2 different systems. In Haskell, we do everything we can to encode our assumptions of behaviour via the type system, and perform strict validation when bringing outside information into our cosy type-safe bubble. However, this doesn’t help if you’ve requested the wrong data in the first place - all you can do is throw a runtime exception.
For me, this is a frequent problem with my choice of database access -
postgresql-simple. Too often I have queried for
foo, bar in
some_table, but elsewhere assumed I was querying for
bar, foo - the order doesn’t agree. In today’s post, we will look at
haskelldb, which is a high level abstract interface to databases.
haskelldb takes an encoding of your database, and then uses this information to construct SQL for you, and deals with mapping data too and from the format the database. Kicking off today’s example, again using the Helping Santa project from earlier posts, we begin by defining our database interface. First of all, we need to define the columns:
data PresentName = PresentName instance FieldTag PresentName where = const "name" fieldName presentName :: Attr PresentName String = mkAttr PresentName presentName data ChildName = ChildName instance FieldTag ChildName where = const "name" fieldName childName :: Attr ChildName String = mkAttr ChildName childName data ChildLocationLat = ChildLocationLat instance FieldTag ChildLocationLat where = const "loc_lat" fieldName childLocLat :: Attr ChildLocationLat Double = mkAttr ChildLocationLat childLocLat data ChildLocationLong = ChildLocationLong instance FieldTag ChildLocationLong where = const "loc_long" fieldName childLocLong :: Attr ChildLocationLong Double = mkAttr ChildLocationLongchildLocLong
For each column we introduce a unique type, we introduce an
Attr for it (with its underlying
Expr type), and also add
FieldTag instances to map to and from the database column name. Now that we are armed with these column definitions, we can define our tables:
child :: Table (RecCons ChildName (Expr String) RecCons ChildLocationLat (Expr Double) (RecCons ChildLocationLong (Expr Double) (RecNil))) = baseTable "child" child $ hdbMakeEntry ChildName # hdbMakeEntry ChildLocationLat # hdbMakeEntry ChildLocationLong present :: Table (RecCons PresentName (Expr String) RecCons ChildName (Expr String) (RecNil)) = baseTable "present" $ hdbMakeEntry PresentName present # hdbMakeEntry ChildName
baseTable to define the “base” tables in the database - the tables that actually have data in, and use
hdbMakeEntry to add columns to each table.
It’s quite a bit of up front typing I’ll admit, but you could automate this typing with a custom step in your build process (easy with Shake!), or use Template Haskell. I’ve opted to write it all by hand as I find this helps further develop my understanding, as I only have to understand one thing at a time.
Now that we’ve represented the schema in code, let’s have a look at leveraging this to write some queries.
allPresents :: Query (Rel (RecCons PresentName (Expr String) RecNil)) = do allPresents <- table present allPresents $ presentName << allPresents ! presentName project presentsFor :: String -> Query (Rel (RecCons PresentName (Expr String) RecNil)) = do presentsFor name <- table child children <- table present presents $ children ! childName .==. presents ! childName restrict $ children ! childName .==. constant name restrict $ presentName << presents ! presentName project
If you’ve only written SQL before this might look a little alien -
haskelldb provides an API that is very close to relational algebra. In the first example we’ve taken the output of the entire
present table and returned all rows. In the second example we’ve selected from two tables - a Cartesian join - and filtered this join on a predicate to find just the rows about a specific child (the child’s name is given by input).
The beauty of this interface is that
haskelldb is free to rewrite your query in a more optimal form. You can have a look at the generated SQL by using
> ppSql (presentsFor "Little Bobby Tables") SELECT name2 as name FROM (SELECT name as name2, name as name2FROM present as T1) as T1, SELECT name as name1 (FROM child as T1) as T2 WHERE name1 = 'Little Bobby Tables' AND name1 = name2
Coupled with PostgreSQL’s stunning query planer, the end result is query with the same speed as one I’d write by hand!
haskelldb gives us a extremely expressive API for querying databases, and doesn’t sacrifice performance to do so. This allows you to consume and compose the API in a way that best suits the needs of your application. If you frequently select from the result of a join you can simply store part of this query separately, and compose it later.
One downside of
haskelldb is that documentation can be a little bit terse, and the API is quite large, so it can be hard to get going sometimes. I highly recommend Chris Done’s series of blog posts on haskelldb if you’re interested in learning more.
You can contact me via email at email@example.com or tweet to me @acid2. I share almost all of my work at GitHub. This post is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.