Alexis King’s Blog

An introduction to typeclass metaprogramming

2021-03-25T00:00:00Z

Typeclass metaprogramming is a powerful technique available to Haskell programmers to automatically generate term-level code from static type information. It has been used to great effect in several popular Haskell libraries (such as the servant ecosystem), and it is the core mechanism used to implement generic programming via GHC generics. Despite this, remarkably little material exists that explains the technique, relegating it to folk knowledge known only to advanced Haskell programmers.

This blog post attempts to remedy that by providing an overview of the foundational concepts behind typeclass metaprogramming. It does not attempt to be a complete guide to type-level programming in Haskell—such a task could easily fill a book—but it does provide explanations and illustrations of the most essential components. This is also not a blog post for Haskell beginners—familiarity with the essentials of the Haskell type system and several common GHC extensions is assumed—but it does not assume any prior knowledge of type-level programming.

Part 1: Basic building blocks

Typeclass metaprogramming is a big subject, which makes covering it in a blog post tricky. To break it into more manageable chunks, this post is divided into several parts, each of which introduces new type system features or type-level programming techniques, then presents an example of how they can be applied.

To start, we’ll cover the absolute foundations of typeclass metaprogramming.

Typeclasses as functions from types to terms

As its name implies, typeclass metaprogramming (henceforth TMP¹) centers around Haskell’s typeclass construct. Traditionally, typeclasses are viewed as a mechanism for principled operator overloading; for example, they underpin Haskell’s polymorphic == operator via the Eq class. Though that is often the most useful way to think about typeclasses, TMP encourages a different perspective: typeclasses are functions from types to (runtime) terms.

What does that mean? Let’s illustrate with an example. Suppose we define a typeclass called TypeOf:

class TypeOf a where
  typeOf :: a -> String

The idea is that this typeclass will accept some value and return the name of its type as a string. To illustrate, here are a couple of potential instances:

instance TypeOf Bool where
  typeOf _ = "Bool"

instance TypeOf Char where
  typeOf _ = "Char"

instance (TypeOf a, TypeOf b) => TypeOf (a, b) where
  typeOf (a, b) = "(" ++ typeOf a ++ ", " ++ typeOf b ++ ")"

Given these instances, we can observe that they do what we expect in GHCi:

ghci> typeOf (True, 'a')
"(Bool, Char)"

Note that both the TypeOf Bool and TypeOf Char instances ignore the argument to typeOf altogether. This makes sense, as the whole point of the TypeOf class is to get access to type information, which is the same regardless of which value is provided. To make this more explicit, we can take advantage of some GHC extensions to eliminate the value-level argument altogether:

{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}

class TypeOf a where
  typeOf :: String

This typeclass definition is a little unusual, as the type parameter a doesn’t appear anywhere in the body. To understand what it means, recall that the type of each method of a typeclass is implicitly extended with the typeclass’s constraint. For example, in the definition

class Show a where
  show :: a -> String

the full type of the show method is implicitly extended with a Show a constraint to yield:

show :: Show a => a -> String

Furthermore, if we write foralls explicitly, each typeclass method is also implicitly quantified over the class’s type parameters, which makes the following the full type of show:

show :: forall a. Show a => a -> String

In the same vein, we can write out the full type of typeOf, as given by our new definition of TypeOf:

typeOf :: forall a. TypeOf a => String

This type is still unusual, as the a type parameter doesn’t appear anywhere to the right of the => arrow. This makes the type parameter trivially ambiguous, which is to say it’s impossible for GHC to infer what a should be at any call site. Fortunately, we can use TypeApplications to pass a type for a directly, as we can see in the updated definition of TypeOf (a, b):

instance TypeOf Bool where
  typeOf = "Bool"

instance TypeOf Char where
  typeOf = "Char"

instance (TypeOf a, TypeOf b) => TypeOf (a, b) where
  typeOf = "(" ++ typeOf @a ++ ", " ++ typeOf @b ++ ")"

Once again, we can test out our new definitions in GHCi:

ghci> typeOf @Bool
"Bool"
ghci> typeOf @(Bool, Char)
"(Bool, Char)"

This illustrates very succinctly how typeclasses can be seen as functions from types to terms. Our typeOf function is, quite literally, a function that accepts a single type as an argument and returns a term-level String. Of course, the TypeOf typeclass is not a particularly useful example of such a function, but it demonstrates how easy it is to construct.

Type-level interpreters

One important consequence of eliminating the value-level argument of typeOf is that there is no need for its argument type to actually be inhabited. For example, consider the TypeOf instance on Void from Data.Void:

instance TypeOf Void where
  typeOf = "Void"

The instance above is no different from the ones on Bool and Char even though Void is a completely uninhabited type. This is a key point: as we delve into type-level programming, keep in mind that the language of types is mostly blind to the term-level meaning of those types. Although we usually write typeclasses that operate on values, this is not at all essential. This turns out to matter in practice, even in something as simple as the definition of TypeOf on lists:

instance TypeOf a => TypeOf [a] where
  typeOf = "[" ++ typeOf @a ++ "]"

If typeOf required a value-level argument, not just a type, our instance above would be in a pickle when given the empty list, since it would have no value of type a to recursively apply typeOf to. But since typeOf only accepts a type-level argument, the term-level meaning of the list type poses no obstacle.
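The value-taking version of typeOf from the start of this section is not lost, either. As a minimal sketch (typeOfValue is a hypothetical helper name, not something defined elsewhere in this post), it can be recovered by ignoring the argument and forwarding its type with a type application, which works even for the empty list:

```haskell
{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}

class TypeOf a where
  typeOf :: String

instance TypeOf Bool where
  typeOf = "Bool"

instance TypeOf Char where
  typeOf = "Char"

instance TypeOf a => TypeOf [a] where
  typeOf = "[" ++ typeOf @a ++ "]"

instance (TypeOf a, TypeOf b) => TypeOf (a, b) where
  typeOf = "(" ++ typeOf @a ++ ", " ++ typeOf @b ++ ")"

-- Hypothetical helper: the value is never inspected, only its type is used.
typeOfValue :: forall a. TypeOf a => a -> String
typeOfValue _ = typeOf @a

-- No element of type Bool exists here, yet the element type is still reported.
emptyListType :: String
emptyListType = typeOfValue ([] :: [Bool])  -- "[Bool]"
```

Since the argument is discarded, typeOfValue is only a convenience for call sites that already have a value in hand; all of the real work still happens in the type-level argument.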

A perhaps unintuitive consequence of this property is that we can use typeclasses to write interesting functions on types even if none of the types are inhabited at all. For example, consider the following pair of type definitions:

data Z
data S a

It is impossible to construct any values of these types, but we can nevertheless use them to construct natural numbers at the type level:

Z is a type that represents 0.
S Z is a type that represents 1.
S (S Z) is a type that represents 2.

And so on. These types might not seem very useful, since they aren’t inhabited by any values, but remarkably, we can still use a typeclass to distinguish them and convert them to term-level values:

import Numeric.Natural

class ReifyNat a where
  reifyNat :: Natural

instance ReifyNat Z where
  reifyNat = 0

instance ReifyNat a => ReifyNat (S a) where
  reifyNat = 1 + reifyNat @a

As its name implies, reifyNat reifies a type-level natural number encoded using our datatypes above into a term-level Natural value:

ghci> reifyNat @Z
0
ghci> reifyNat @(S Z)
1
ghci> reifyNat @(S (S Z))
2

One way to think about reifyNat is as an interpreter of a type-level language. In this case, the type-level language is very simple, only capturing natural numbers, but in general, it could be arbitrarily complex—and typeclasses can be used to give it a useful meaning, even if it has no term-level representation.
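To make the “interpreter” framing a little more concrete, here is a hedged sketch of a slightly richer type-level language. The Plus and Times types and the Eval class are hypothetical names invented for this illustration; they follow the same pattern as ReifyNat, with one instance per form of the language:

```haskell
{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}

import Numeric.Natural (Natural)

-- Uninhabited types used purely as type-level syntax.
data Z
data S a
data Plus a b   -- hypothetical: type-level addition node
data Times a b  -- hypothetical: type-level multiplication node

-- Each instance interprets one form of the type-level language.
class Eval a where
  eval :: Natural

instance Eval Z where
  eval = 0

instance Eval a => Eval (S a) where
  eval = 1 + eval @a

instance (Eval a, Eval b) => Eval (Plus a b) where
  eval = eval @a + eval @b

instance (Eval a, Eval b) => Eval (Times a b) where
  eval = eval @a * eval @b

-- 2 * (1 + 2)
example :: Natural
example = eval @(Times (S (S Z)) (Plus (S Z) (S (S Z))))  -- 6
```

None of these types has any values, yet the instances assign each type-level expression a term-level meaning.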

Overlapping instances

Generally, typeclass instances aren’t supposed to overlap. That is, if you write an instance for Show (Maybe a), you aren’t supposed to also write an instance for Show (Maybe Bool), since it isn’t clear whether show (Just True) should use the first instance or the second. For that reason, by default, GHC rejects any form of instance overlap as soon as it detects it.

Usually, this is the right behavior. Due to the way Haskell’s typeclass system is designed to preserve coherency—that is, the same combination of type arguments always selects the same instance—overlapping instances can be unintuitive or even cause nonsensical behavior if orphan instances are defined. However, when doing TMP, it’s useful to make exceptions to that rule of thumb, so GHC provides the option to explicitly opt-in to overlapping instances.

As a simple example, suppose we wanted to write a typeclass that checks whether a given type is () or not:

class IsUnit a where
  isUnit :: Bool

If we were to write an ordinary, value-level function, we could write something like this pseudo-Haskell:

-- not actually valid Haskell, just an example
isUnit :: * -> Bool
isUnit () = True
isUnit _  = False

But if we try to translate this to typeclass instances, we’ll get a problem:

instance IsUnit () where
  isUnit = True

instance IsUnit a where
  isUnit = False

The problem is that a function definition has a closed set of clauses matched from top to bottom, but typeclass instances are open and unordered.² This means GHC will complain about instance overlap if we try to evaluate isUnit @():

ghci> isUnit @()

error:
    • Overlapping instances for IsUnit ()
        arising from a use of ‘isUnit’
      Matching instances:
        instance IsUnit a
        instance IsUnit ()

To fix this, we have to explicitly mark IsUnit () as overlapping:

instance {-# OVERLAPPING #-} IsUnit () where
  isUnit = True

Now GHC accepts the expression without complaint:

ghci> isUnit @()
True

What does the {-# OVERLAPPING #-} pragma do, exactly? The gory details are spelled out in the GHC User’s Guide, but the simple explanation is that {-# OVERLAPPING #-} relaxes the overlap checker as long as the instance is strictly more specific than the instance(s) it overlaps with. In this case, that is true: IsUnit () is trivially more specific than IsUnit a, since the former only matches () while the latter matches anything at all. That means our overlap is well-formed, and instance resolution should behave the way we’d like.

Overlapping instances are a useful tool when performing TMP, as they make it possible to write piecewise functions on types in the same way it’s possible to write piecewise functions on terms. However, they must still be used with care, as without understanding how they work, they can produce unintuitive results. For an example of how things can go wrong, consider the following definition:

guardUnit :: forall a. a -> Either String a
guardUnit x = case isUnit @a of
  True  -> Left "unit is not allowed"
  False -> Right x

The intent of guardUnit is to use isUnit to detect if its argument is of type (), and if it is, to return an error. However, even though we marked IsUnit () overlapping, we still get an overlapping instance error:

error:
    • Overlapping instances for IsUnit a arising from a use of ‘isUnit’
      Matching instances:
        instance IsUnit a
        instance [overlapping] IsUnit ()
    • In the expression: isUnit @a

What gives? The problem is that GHC simply doesn’t know what type a is when compiling guardUnit. It could be instantiated to () where it’s called, but it might not be. Therefore, GHC doesn’t know which instance to pick, and an overlapping instance error is still reported.

This behavior is actually a very, very good thing. If GHC were to blindly pick the IsUnit a instance in this case, then guardUnit would always take the False branch, even when passed a value of type ()! That would certainly not be what was intended, so it’s better to reject this program than to silently do the wrong thing. However, in more complicated situations, it can be quite surprising that GHC is complaining about instance overlap even when {-# OVERLAPPING #-} annotations are used, so it’s important to keep their limitations in mind.

As it happens, in this particular case, the error is easily remedied. We simply have to add an IsUnit constraint to the type signature of guardUnit:

guardUnit :: forall a. IsUnit a => a -> Either String a
guardUnit x = case isUnit @a of
  True  -> Left "unit is not allowed"
  False -> Right x

Now picking the right IsUnit instance is deferred to the place where guardUnit is used, and the definition is accepted.³
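Putting the pieces of this section together, the following is a minimal, self-contained sketch of the final definitions. The exact set of LANGUAGE pragmas is an assumption about what a standalone module needs; everything else is taken from the snippets above:

```haskell
{-# LANGUAGE AllowAmbiguousTypes, FlexibleInstances, ScopedTypeVariables, TypeApplications #-}

class IsUnit a where
  isUnit :: Bool

-- The more specific instance is explicitly marked as overlapping.
instance {-# OVERLAPPING #-} IsUnit () where
  isUnit = True

-- The catch-all instance matches every other type.
instance IsUnit a where
  isUnit = False

-- The IsUnit constraint defers instance selection to each call site,
-- where the concrete type of a is known.
guardUnit :: forall a. IsUnit a => a -> Either String a
guardUnit x = case isUnit @a of
  True  -> Left "unit is not allowed"
  False -> Right x

rejected :: Either String ()
rejected = guardUnit ()     -- Left "unit is not allowed"

accepted :: Either String Char
accepted = guardUnit 'x'    -- Right 'x'
```

At each of the two call sites, GHC knows the concrete type of a, so it can commit to exactly one instance.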

Type families are functions from types to types

In the previous section, we discussed how typeclasses are functions from types to terms, but what about functions from types to types? For example, suppose we wanted to sum two type-level natural numbers and get a new type-level natural number as a result? For that, we can use a type family:

{-# LANGUAGE TypeFamilies #-}

type family Sum a b where
  Sum Z     b = b
  Sum (S a) b = S (Sum a b)

The above is a closed type family, which works quite a lot like an ordinary Haskell function definition, just at the type level instead of at the value level. For comparison, the equivalent value-level definition of Sum would look like this:

data Nat = Z | S Nat

sum :: Nat -> Nat -> Nat
sum Z     b = b
sum (S a) b = S (sum a b)

As you can see, the two are quite similar. Both are defined via a pair of pattern-matching clauses, and though it doesn’t matter here, both closed type families and ordinary functions evaluate their clauses top to bottom.

To test our definition of Sum in GHCi, we can use the :kind! command, which prints out a type and its kind after reducing it as much as possible:

ghci> :kind! Sum (S Z) (S (S Z))
Sum (S Z) (S (S Z)) :: *
= S (S (S Z))

We can also combine Sum with our ReifyNat class from earlier:

ghci> reifyNat @(Sum (S Z) (S (S Z)))
3

Type families are a useful complement to typeclasses when performing type-level programming. They allow computation to occur entirely at the type level, which is necessarily computation that occurs entirely at compile time, and the result can then be passed to a typeclass method to produce a term-level value.
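As a further sketch of that division of labor, we can layer a second type family on top of Sum. Mul is a hypothetical addition, not something used later in this post, and because its second equation nests one type family application inside another, this sketch assumes UndecidableInstances is enabled:

```haskell
{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}
{-# LANGUAGE TypeFamilies, UndecidableInstances #-}

import Numeric.Natural (Natural)

data Z
data S a

class ReifyNat a where
  reifyNat :: Natural

instance ReifyNat Z where
  reifyNat = 0

instance ReifyNat a => ReifyNat (S a) where
  reifyNat = 1 + reifyNat @a

type family Sum a b where
  Sum Z     b = b
  Sum (S a) b = S (Sum a b)

-- Hypothetical: multiplication as repeated addition.
type family Mul a b where
  Mul Z     b = Z
  Mul (S a) b = Sum b (Mul a b)

type Two   = S (S Z)
type Three = S Two

-- Mul Two Three is reduced by the typechecker; reifyNat only sees the result.
six :: Natural
six = reifyNat @(Mul Two Three)  -- 6
```

The multiplication itself happens while typechecking; the ReifyNat instances only ever see the fully reduced type S (S (S (S (S (S Z))))).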

Example 1: Generalized `concat`

Finally, using what we’ve discussed so far, we can do our first bit of practical TMP. Specifically, we’re going to define a flatten function similar to like-named functions provided by many dynamically-typed languages. In those languages, flatten is like concat, but it works on a list of arbitrary depth. For example, we might use it like this:

> flatten [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
[1, 2, 3, 4, 5, 6, 7, 8]

In Haskell, lists of different depths have different types, so multiple levels of concat have to be applied explicitly. But using TMP, we can write a generic flatten function that operates on lists of any depth!

Since this is typeclass metaprogramming, we’ll unsurprisingly begin with a typeclass:

class Flatten a where
  flatten :: a -> [???]

Our first challenge is writing the return type of flatten. Since the argument could be a list of any depth, there’s no direct way to obtain its element type. Fortunately, we can define a type family that does precisely that:

type family ElementOf a where
  ElementOf [[a]] = ElementOf [a]
  ElementOf [a]   = a

class Flatten a where
  flatten :: a -> [ElementOf a]

Now we can write our Flatten instances. The base case is when the type is a list of depth 1, in which case we don’t have any flattening to do:

instance Flatten [a] where
  flatten x = x

The inductive case is when the type is a nested list, in which case we want to apply concat and recur:

instance {-# OVERLAPPING #-} Flatten [a] => Flatten [[a]] where
  flatten x = flatten (concat x)

Sadly, if we try to compile these definitions, GHC will reject our Flatten [a] instance:

error:
    • Couldn't match type ‘a’ with ‘ElementOf [a]’
      ‘a’ is a rigid type variable bound by
        the instance declaration
      Expected type: [ElementOf [a]]
        Actual type: [a]
    • In the expression: x
      In an equation for ‘flatten’: flatten x = x
      In the instance declaration for ‘Flatten [a]’
   |
   |   flatten x = x
   |               ^

At first blush, this error looks very confusing. Why doesn’t GHC think a and ElementOf [a] are the same type? Well, consider what would happen if we picked a type like [Int] for a. Then [a] would be [[Int]], a nested list, so the first case of ElementOf would apply. Therefore, GHC refuses to pick the second equation of ElementOf so hastily.

In this particular case, we might think that’s rather silly. After all, if a were [Int], then GHC wouldn’t have picked the Flatten [a] instance to begin with; it would have picked the more specific Flatten [[a]] instance defined below. Therefore, the hypothetical situation above could never happen. Unfortunately, GHC does not realize this, so we find ourselves at an impasse.

Fortunately, we can soothe GHC’s anxiety by adding an extra constraint to our Flatten [a] instance:

instance (ElementOf [a] ~ a) => Flatten [a] where
  flatten x = x

This is a type equality constraint. Type equality constraints are written with the syntax a ~ b, and they state that a must be the same type as b. Type equality constraints are mostly useful when type families are involved, since they can be used (as in this case) to require that a type family reduce to a certain type. In this case, we’re asserting that ElementOf [a] must always be a, which allows the instance to typecheck.

Note that this doesn’t let us completely wriggle out of our obligation, as the type equality constraint must eventually be checked when the instance is actually used, so initially this might seem like we’ve only deferred the problem to later. But in this case, that’s exactly what we need: by the time the Flatten [a] instance is selected, GHC will know that a is not a list type, and it will be able to reduce ElementOf [a] to a without difficulty. Indeed, we can see this for ourselves by using flatten in GHCi:

ghci> flatten [[[1 :: Integer, 2], [3, 4]], [[5, 6], [7, 8]]]
[1,2,3,4,5,6,7,8]

It works! But why do we need the type annotation on 1? If we leave it out, we get a rather hairy type error:

error:
    • Couldn't match type ‘ElementOf [a0]’ with ‘ElementOf [a]’
      Expected type: [ElementOf [a]]
        Actual type: [ElementOf [a0]]
      NB: ‘ElementOf’ is a non-injective type family
      The type variable ‘a0’ is ambiguous

The issue here stems from the polymorphic nature of Haskell number literals. Theoretically, someone could define a Num [a] instance, in which case 1 could actually have a list type, and either case of ElementOf could match depending on the choice of Num instance. Of course, no such Num instance exists, nor should it, but the possibility of it being defined means GHC can’t be certain of the depth of the argument list.

This issue happens to come up a lot in simple examples of TMP, since polymorphic number literals introduce a level of ambiguity. In real programs, this is much less of an issue, since there’s no reason to call flatten on a completely hardcoded list! However, it’s still important to understand what these type errors mean and why they occur.

That wrinkle aside, flatten is a functioning example of what useful TMP can look like. We’ve written a single, generic definition that flattens lists of any depth, taking advantage of static type information to choose what to do at runtime.
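For reference, here is a sketch of the whole example assembled into a single module. The LANGUAGE pragmas are an assumption about what a standalone file needs, and nestedInts, flatInts, and flatChars are hypothetical names used only to exercise the definitions:

```haskell
{-# LANGUAGE FlexibleContexts, FlexibleInstances, TypeFamilies #-}

type family ElementOf a where
  ElementOf [[a]] = ElementOf [a]
  ElementOf [a]   = a

class Flatten a where
  flatten :: a -> [ElementOf a]

-- Base case: a list of depth 1 is already flat.
instance (ElementOf [a] ~ a) => Flatten [a] where
  flatten x = x

-- Inductive case: remove one level of nesting, then keep going.
instance {-# OVERLAPPING #-} Flatten [a] => Flatten [[a]] where
  flatten x = flatten (concat x)

-- Giving the argument a concrete type avoids the literal ambiguity.
nestedInts :: [[[Int]]]
nestedInts = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

flatInts :: [Int]
flatInts = flatten nestedInts  -- [1,2,3,4,5,6,7,8]

-- A String is itself a list, so flattening continues down to Char.
flatChars :: String
flatChars = flatten [["ab", "cd"], ["ef"]]  -- "abcdef"
```

The second usage is worth a moment of thought: because String is [Char], ElementOf treats a list of strings as one level deeper than it might first appear, and flatten concatenates the strings themselves.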

Typeclasses as compile-time code generation

Presented with the above definition of Flatten, it might not be immediately obvious how to think about Flatten as a function from types to terms. After all, it looks a lot more like an “ordinary” typeclass (like, say, Eq or Show) than the TypeOf and ReifyNat classes we defined above.

One useful way to shift our perspective is to consider equivalent Flatten instances written using point-free style:

instance (ElementOf [a] ~ a) => Flatten [a] where
  flatten = id

instance {-# OVERLAPPING #-} Flatten [a] => Flatten [[a]] where
  flatten = flatten . concat

These definitions of flatten no longer (syntactically) depend on term-level arguments, just like our definitions of typeOf and reifyNat didn’t accept any term-level arguments above. This allows us to consider what flatten might “expand to” given a type argument alone:

flatten @[Int] is just id, since the Flatten [a] instance is selected.
flatten @[[Int]] is flatten @[Int] . concat, since the Flatten [[a]] instance is selected. That then becomes id . concat, which can be further simplified to just concat.
flatten @[[[Int]]] is flatten @[[Int]] . concat, which simplifies to concat . concat by the same reasoning above.
flatten @[[[[Int]]]] is then concat . concat . concat, and so on.

This meshes quite naturally with our intuition of typeclasses as functions from types to terms. Each application of flatten takes a type as an argument and produces some number of composed concats as a result. From this perspective, Flatten is performing a kind of compile-time code generation, synthesizing an expression to do the concatenation on the fly by inspecting the type information.

This framing is one of the key ideas that makes TMP so powerful, and indeed, it explains how it’s worthy of the name metaprogramming. As we continue to more sophisticated examples of TMP, try to keep this perspective in mind.
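One way to see the “code generation” in isolation is to write a class with the same instance structure as Flatten that merely counts how many concats would be composed for a given type. This is a hypothetical sketch: ConcatCount and concatCount are names made up for this illustration, not part of the example above.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, FlexibleContexts, FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables, TypeApplications #-}

import Numeric.Natural (Natural)

-- Mirrors the two Flatten instances, but produces a number instead of a function.
class ConcatCount a where
  concatCount :: Natural

-- Depth 1: flatten is id, so no concat is needed.
instance ConcatCount [a] where
  concatCount = 0

-- Each extra level of nesting contributes exactly one concat.
instance {-# OVERLAPPING #-} ConcatCount [a] => ConcatCount [[a]] where
  concatCount = 1 + concatCount @[a]

depthThree :: Natural
depthThree = concatCount @[[[Int]]]  -- 2, matching concat . concat
```

Instance resolution walks the type exactly as it does for flatten; the only difference is what each instance contributes to the final term.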

Part 2: Generic programming

Part 1 of this blog post established the foundational techniques used in TMP, all of which are useful on their own. If you’ve read up to this point, you now know enough to start applying TMP yourself, and the remainder of this blog post will simply continue to build upon what you already know.

In the previous section, we discussed how to use TMP to write a generic flatten operation. In this section, we’ll aim a bit higher: totally generic functions that operate on arbitrary datatypes.

Open type families and associated types

Before we can dive into examples, we need to revisit type families. In the previous sections, we discussed closed type families, but we did not cover their counterpart, open type families. Like closed type families, open type families are effectively functions from types to types, but unlike closed type families, they are not defined with a predefined set of equations. Instead, new equations are added separately using type instance declarations. For example, we could define our Sum family from above like this:

type family Sum a b
type instance Sum Z b = b
type instance Sum (S a) b = S (Sum a b)

In the case of Sum, this would not be very useful, and indeed, Sum is much better expressed as a closed type family than an open one. But the advantage of open type families is similar to the advantage of typeclasses: new equations can be added at any time, even in modules other than the one that declares the open type family.

This extensibility means open type families are used less for type-level computation and more for type-level maps that associate types with other types. For example, one might define a Key open type family that relates types to the types used to index them:

type family Key a
type instance Key (Vector a) = Int
type instance Key (Map k v) = k
type instance Key (Trie a) = ByteString

This can be combined with a typeclass to provide a generic way to see if a data structure contains a given key:

class HasKey a where
  hasKey :: Key a -> a -> Bool

instance HasKey (Vector a) where
  hasKey i vec = i >= 0 && i < Data.Vector.length vec

-- Data.Map.member needs an Ord instance for the key type
instance Ord k => HasKey (Map k v) where
  hasKey = Data.Map.member

instance HasKey (Trie a) where
  hasKey = Data.Trie.member

In this case, anyone could define their own data structure, define instances of Key and HasKey for their data structure, and use hasKey to see if it contains a given key, regardless of the structure of those keys. In fact, it’s so common for open type families and typeclasses to cooperate in this way that GHC provides the option to make the connection explicit by defining them together:

class HasKey a where
  type Key a
  hasKey :: Key a -> a -> Bool

instance HasKey (Vector a) where
  type Key (Vector a) = Int
  hasKey i vec = i >= 0 && i < Data.Vector.length vec

-- Data.Map.member needs an Ord instance for the key type
instance Ord k => HasKey (Map k v) where
  type Key (Map k v) = k
  hasKey = Data.Map.member

instance HasKey (Trie a) where
  type Key (Trie a) = ByteString
  hasKey = Data.Trie.member

An open family declared inside a typeclass like this is called an associated type. It works exactly the same way as the separate definitions of Key and HasKey; it just uses a different syntax. Note that although the family and instance keywords have disappeared from the declarations, that is only an abbreviation; the keywords are simply implicitly added (and explicitly writing them is still allowed, though most people do not).
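To illustrate the extensibility, here is a hedged sketch that uses only types from base, so it can be compiled without any extra packages. The list instance and the Assoc newtype are hypothetical stand-ins for the Vector, Map, and Trie instances above:

```haskell
{-# LANGUAGE TypeFamilies #-}

class HasKey a where
  type Key a
  hasKey :: Key a -> a -> Bool

-- Lists are indexed by position.
instance HasKey [a] where
  type Key [a] = Int
  hasKey i xs = i >= 0 && i < length xs

-- Hypothetical user-defined structure: a plain association list.
newtype Assoc k v = Assoc [(k, v)]

-- Association lists are indexed by their own key type.
instance Eq k => HasKey (Assoc k v) where
  type Key (Assoc k v) = k
  hasKey k (Assoc kvs) = any ((== k) . fst) kvs

ages :: Assoc String Int
ages = Assoc [("alyssa", 30), ("ben", 25)]

hasAlyssa :: Bool
hasAlyssa = hasKey "alyssa" ages  -- True

hasThirdChar :: Bool
hasThirdChar = hasKey 2 "ab"      -- False: valid indices are 0 and 1
```

The same hasKey name accepts an Int for one structure and a String for the other, because Key computes the key type from the container type.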

Open type families and associated types are extremely useful for abstracting over similar types with slightly different structure, and libraries like mono-traversable show how they can be used to full effect. However, those use cases can’t really be classified as TMP; they just use typeclasses for their traditional purpose of operator overloading.

However, that doesn’t mean open type families aren’t useful for TMP. In fact, one use case of TMP makes heavy use of open type families: datatype-generic programming.

Example 2: Datatype-generic programming

Datatype-generic programming refers to a class of techniques for writing generic functions that operate on arbitrary data structures. Some useful applications of datatype-generic programming include

equality, comparison, and hashing,
recursive traversal of self-similar data structures, and
serialization and deserialization,

among other things. The idea is that by exploiting the structure of datatype definitions themselves, it’s possible for a datatype-generic function to provide implementations of functionality like the above on any datatype.

In Haskell, the most popular approach to datatype-generic programming leverages GHC generics, which is quite sophisticated. The module documentation for GHC.Generics already includes a fairly lengthy explanation of how it works, so I will not regurgitate it here (that could fill a blog post of its own!), but I will show how to construct a simplified version of the system that highlights the key role of TMP.

Generic datatype representations

At the heart of the Generic class is a simple concept: all non-GADT Haskell datatypes can be represented as sums of products. For example, if we have

data Authentication
  = AuthBasic Username Password
  | AuthSSH PublicKey

then we have a type that is essentially equivalent to this one:

type Authentication = Either (Username, Password) PublicKey
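One way to make “essentially equivalent” precise is to write conversions in both directions and observe that they are inverses. In this sketch, the three String synonyms, the AuthenticationRep name, and the toRep and fromRep functions are all assumptions made for illustration:

```haskell
-- Hypothetical field types; the post leaves them unspecified.
type Username  = String
type Password  = String
type PublicKey = String

data Authentication
  = AuthBasic Username Password
  | AuthSSH PublicKey
  deriving (Eq, Show)

-- The sums-of-products shape of Authentication.
type AuthenticationRep = Either (Username, Password) PublicKey

toRep :: Authentication -> AuthenticationRep
toRep (AuthBasic user pass) = Left (user, pass)
toRep (AuthSSH key)         = Right key

fromRep :: AuthenticationRep -> Authentication
fromRep (Left (user, pass)) = AuthBasic user pass
fromRep (Right key)         = AuthSSH key

-- Converting to the representation and back returns the original value.
roundTrips :: Authentication -> Bool
roundTrips x = fromRep (toRep x) == x
```

Nothing is lost in either direction, which is what lets a function written against Either and pairs stand in for one written against Authentication.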

If we know how to define a function on a nested tree built out of Eithers and pairs, then we know how to define it on any such datatype! This is where TMP comes in: recall the way we viewed Flatten as a mechanism for compile-time code generation based on type information. Could we use the same technique to generate implementations of equality, comparison, hashing, etc. from statically-known information about the structure of a datatype?

The answer to that question is yes. To start, let’s consider a particularly simple example: suppose we want to write a generic function that counts the number of fields stored in an arbitrary constructor. For example, numFields (AuthBasic "alyssa" "pass1234") would return 2, while numFields (AuthSSH "<key>") would return 1. Not a very useful function, admittedly, but it’s a simple example of what generic programming can do.

We’ll start by using TMP to implement a “generic” version of numFields that operates on trees of Eithers and pairs as described above:

class GNumFields a where
  gnumFields :: a -> Natural

-- base case: leaf value
instance GNumFields a where
  gnumFields _ = 1

instance {-# OVERLAPPING #-} (GNumFields a, GNumFields b) => GNumFields (a, b) where
  gnumFields (a, b) = gnumFields a + gnumFields b

instance {-# OVERLAPPING #-} (GNumFields a, GNumFields b) => GNumFields (Either a b) where
  gnumFields (Left a)  = gnumFields a
  gnumFields (Right b) = gnumFields b

Just like our Flatten class from earlier, GNumFields uses the type-level structure of its argument to choose what to do:

If we find a pair, that corresponds to a product, so we recur into both sides and sum the results.
If we find Left or Right, that corresponds to the “spine” differentiating different constructors, so we simply recur into the contained value.
In the case of any other value, we’re at a “leaf” in the tree of Eithers and pairs, which corresponds to a single field, so we just return 1.

Now if we call gnumFields (Left ("alyssa", "pass1234")), we’ll get 2, and if we call gnumFields (Right "<key>"), we’ll get 1. All that’s left to do is write a bit of code that converts our Authentication type to a tree of Eithers and pairs:

genericizeAuthentication :: Authentication -> Either (Username, Password) PublicKey
genericizeAuthentication (AuthBasic user pass) = Left (user, pass)
genericizeAuthentication (AuthSSH key)         = Right key

numFieldsAuthentication :: Authentication -> Natural
numFieldsAuthentication = gnumFields . genericizeAuthentication

Now we get the results we want on our Authentication type using numFieldsAuthentication, but we’re not done yet, since it only works on Authentication values. Is there a way to define a generic numFields function that works on arbitrary datatypes that implement this conversion to sums-of-products? Yes, with another typeclass:

class Generic a where
  type Rep a
  genericize :: a -> Rep a

instance Generic Authentication where
  type Rep Authentication = Either (Username, Password) PublicKey
  genericize (AuthBasic user pass) = Left (user, pass)
  genericize (AuthSSH key)         = Right key

numFields :: (Generic a, GNumFields (Rep a)) => a -> Natural
numFields = gnumFields . genericize

Now numFields (AuthBasic "alyssa" "pass1234") returns 2, as desired, and it will also work with any datatype that provides a Generic instance. If the above code makes your head spin, don’t worry: this is by far the most complicated piece of code in this blog post up to this point. Let’s break down how it works piece by piece:

First, we define the Generic class, which consists of two parts:
1. The Rep a associated type maps a type a onto its generic, sums-of-products representation, i.e. one built out of combinations of Either and pairs.
2. The genericize method converts an actual value of type a to the equivalent value using the sums-of-products representation.
Next, we define a Generic instance for Authentication. Rep Authentication is the sums-of-products representation we described above, and genericize is likewise genericizeAuthentication from above.
Finally, we define numFields as a function with a GNumFields (Rep a) constraint. This is where all the magic happens:
- When we apply numFields to a datatype, Rep retrieves its generic, sums-of-products representation type.
- The GNumFields class then uses the TMP techniques described earlier in this blog post to generate a numFields implementation on the fly from the structure of Rep a.
- Finally, that generated numFields implementation is applied to the genericized term-level value, and the result is produced.

After all that, I suspect you might think this seems like a very convoluted way to define the (rather unhelpful) numFields operation. Surely just defining numFields on each type directly would be far easier? Indeed, if we were just considering numFields, you’d be right, but in fact we get much more than that. Using the same machinery, we can continue to define other generic operations—equality, comparison, etc.—the same way we defined numFields, and all of them would automatically work on Authentication because they all leverage the same Generic instance!

This is the basic value proposition of generic programming: we can do a little work up front to normalize our datatype to a generic representation once, then get a whole buffet of generic operations on it for free. In Haskell, the code generation capabilities of TMP are a key piece of that puzzle.
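To see the “for free” part in action, here is a sketch in which a second datatype opts in by providing only a Generic instance. The Shape type is hypothetical, and the LANGUAGE pragmas are an assumption about what a standalone module needs; the classes are the ones defined above:

```haskell
{-# LANGUAGE FlexibleContexts, FlexibleInstances, TypeFamilies #-}

import Numeric.Natural (Natural)

class GNumFields a where
  gnumFields :: a -> Natural

-- base case: leaf value
instance GNumFields a where
  gnumFields _ = 1

instance {-# OVERLAPPING #-} (GNumFields a, GNumFields b) => GNumFields (a, b) where
  gnumFields (a, b) = gnumFields a + gnumFields b

instance {-# OVERLAPPING #-} (GNumFields a, GNumFields b) => GNumFields (Either a b) where
  gnumFields (Left a)  = gnumFields a
  gnumFields (Right b) = gnumFields b

class Generic a where
  type Rep a
  genericize :: a -> Rep a

numFields :: (Generic a, GNumFields (Rep a)) => a -> Natural
numFields = gnumFields . genericize

-- Hypothetical datatype with three constructors of different arity.
data Shape
  = Circle Double
  | Rect Double Double
  | Triangle Double Double Double

-- More than two constructors or fields are encoded by nesting to the right.
instance Generic Shape where
  type Rep Shape = Either Double (Either (Double, Double) (Double, (Double, Double)))
  genericize (Circle r)       = Left r
  genericize (Rect w h)       = Right (Left (w, h))
  genericize (Triangle a b c) = Right (Right (a, (b, c)))
```

With this instance in place, numFields (Triangle 3 4 5) evaluates to 3 without any Shape-specific counting code.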

Improving our definition of `Generic`

You may note that the definition of Generic provided above does not match the one in GHC.Generics. Indeed, our naïve approach suffers from several flaws that the real version does not. This is not a GHC.Generics tutorial, so I will not discuss every detail of the full implementation, but I will highlight a few improvements relevant to the broader theme of TMP.

Distinguishing leaves from the spine

One problem with our version of Generic is that it provides no way to distinguish an Either or pair that should be considered a “leaf”, as in a type like this:

data Foo = A (Either Int String) | B (Char, Bool)

Given this type, Rep Foo should be Either (Either Int String) (Char, Bool), and numFields (B ('a', True)) will erroneously return 2 rather than 1. To fix this, we can introduce a simple wrapper newtype that distinguishes leaves specifically:

newtype Leaf a = Leaf { getLeaf :: a }

Now our Generic instances look like this:

instance Generic Authentication where
  type Rep Authentication = Either (Leaf Username, Leaf Password) (Leaf PublicKey)
  genericize (AuthBasic user pass) = Left (Leaf user, Leaf pass)
  genericize (AuthSSH key)         = Right (Leaf key)

instance Generic Foo where
  type Rep Foo = Either (Leaf (Either Int String)) (Leaf (Char, Bool))
  genericize (A x) = Left (Leaf x)
  genericize (B x) = Right (Leaf x)

Since the Leaf constructor now distinguishes a leaf, rather than the absence of an Either or (,) constructor, we’ll have to update our GNumFields instances as well. However, this has the additional pleasant effect of eliminating the need for overlapping instances:

instance GNumFields (Leaf a) where
  gnumFields _ = 1

instance (GNumFields a, GNumFields b) => GNumFields (a, b) where
  gnumFields (a, b) = gnumFields a + gnumFields b

instance (GNumFields a, GNumFields b) => GNumFields (Either a b) where
  gnumFields (Left a)  = gnumFields a
  gnumFields (Right b) = gnumFields b

This is a good example of why overlapping instances can be so seductive, but they often have unintended consequences. Even when doing TMP, explicit tags are almost always preferable.
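To make the "whole buffet of generic operations" claim a little more concrete, here is a minimal sketch of a second generic operation built on the same Leaf-based representation. The GEq class, geq, and genericEq are hypothetical names introduced only for this illustration, and the String synonyms for Username, Password, and PublicKey are an assumption standing in for whatever those types really are.

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}

-- Assumed stand-ins for the field types of Authentication.
type Username  = String
type Password  = String
type PublicKey = String

data Authentication
  = AuthBasic Username Password
  | AuthSSH PublicKey

newtype Leaf a = Leaf { getLeaf :: a }

class Generic a where
  type Rep a
  genericize :: a -> Rep a

instance Generic Authentication where
  type Rep Authentication = Either (Leaf Username, Leaf Password) (Leaf PublicKey)
  genericize (AuthBasic user pass) = Left (Leaf user, Leaf pass)
  genericize (AuthSSH key)         = Right (Leaf key)

-- Hypothetical generic operation: structural equality on representations.
class GEq a where
  geq :: a -> a -> Bool

-- Leaves are compared with the ordinary Eq instance of the field type.
instance Eq a => GEq (Leaf a) where
  geq (Leaf x) (Leaf y) = x == y

-- Products are equal when both components are equal.
instance (GEq a, GEq b) => GEq (a, b) where
  geq (a1, b1) (a2, b2) = geq a1 a2 && geq b1 b2

-- Sums are equal only when both values use the same side.
instance (GEq a, GEq b) => GEq (Either a b) where
  geq (Left a1)  (Left a2)  = geq a1 a2
  geq (Right b1) (Right b2) = geq b1 b2
  geq _          _          = False

-- Works for any type with a Generic instance whose Rep supports GEq.
genericEq :: (Generic a, GEq (Rep a)) => a -> a -> Bool
genericEq x y = geq (genericize x) (genericize y)
```

Note that nothing here mentions Authentication except its Generic instance, so genericEq should work for any other type that gets such an instance.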

Handling empty constructors

Suppose we have a type with nullary data constructors, like the standard Bool type:

data Bool = False | True

How do we write a Generic instance for Bool? Using just Either, (,), and Leaf, we can’t, but if we are willing to add a case for (), we can use it to denote nullary constructors:

instance GNumFields () where
  gnumFields _ = 0

instance Generic Bool where
  type Rep Bool = Either () ()
  genericize False = Left ()
  genericize True  = Right ()

In a similar vein, we could use Void to represent datatypes that don’t have any constructors at all.
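As a rough sketch of what the Void case might look like, the following module handles both nullary constructors and a constructor-less type. The Empty datatype is a hypothetical example, and the Natural result type of gnumFields is an assumption about the earlier definitions.

```haskell
{-# LANGUAGE EmptyCase #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Void (Void, absurd)
import Numeric.Natural (Natural)

class Generic a where
  type Rep a
  genericize :: a -> Rep a

class GNumFields a where
  gnumFields :: a -> Natural

-- A nullary constructor has no fields.
instance GNumFields () where
  gnumFields _ = 0

-- There are no values of type Void, so this case can never actually run.
instance GNumFields Void where
  gnumFields = absurd

instance (GNumFields a, GNumFields b) => GNumFields (Either a b) where
  gnumFields (Left a)  = gnumFields a
  gnumFields (Right b) = gnumFields b

numFields :: (Generic a, GNumFields (Rep a)) => a -> Natural
numFields = gnumFields . genericize

instance Generic Bool where
  type Rep Bool = Either () ()
  genericize False = Left ()
  genericize True  = Right ()

-- Hypothetical datatype with no constructors at all.
data Empty

instance Generic Empty where
  type Rep Empty = Void
  -- An empty case expression: there is no constructor to match on.
  genericize x = case x of {}
```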

Continuing from here

The full version of Generic has a variety of further improvements useful for generic programming, including:

- Support for converting from Rep a to a.
- Special indication of self-recursive datatypes, making generic tree traversals possible.
- Type-level information about datatype constructor and record accessor names, allowing them to be used in serialization.
- Fully automatic generation of Generic instances via the DeriveGeneric extension, which reduces the per-type boilerplate to essentially nothing.

The module documentation for GHC.Generics discusses the full system in detail, and it provides an additional example that uses the same essential TMP techniques discussed here.
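For comparison, here is a sketch of how numFields might be written against the real GHC.Generics representation types. This is only one plausible encoding, not code from the GHC.Generics documentation; the class name GNumFields is reused from this post, and the String fields of Authentication are an assumption.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics
import Numeric.Natural (Natural)

-- Representation types in GHC.Generics take an extra parameter p.
class GNumFields f where
  gnumFields :: f p -> Natural

-- V1: datatypes without constructors.
instance GNumFields V1 where
  gnumFields _ = 0

-- U1: constructors without fields (plays the role of () above).
instance GNumFields U1 where
  gnumFields _ = 0

-- K1: a single field (plays the role of Leaf above).
instance GNumFields (K1 i c) where
  gnumFields _ = 1

-- (:+:): a choice between constructors (plays the role of Either).
instance (GNumFields f, GNumFields g) => GNumFields (f :+: g) where
  gnumFields (L1 x) = gnumFields x
  gnumFields (R1 y) = gnumFields y

-- (:*:): several fields in one constructor (plays the role of pairs).
instance (GNumFields f, GNumFields g) => GNumFields (f :*: g) where
  gnumFields (x :*: y) = gnumFields x + gnumFields y

-- M1: metadata wrappers (datatype, constructor, and selector names).
instance GNumFields f => GNumFields (M1 i c f) where
  gnumFields (M1 x) = gnumFields x

numFields :: (Generic a, GNumFields (Rep a)) => a -> Natural
numFields = gnumFields . from

-- The Generic instance is derived rather than written by hand.
data Authentication
  = AuthBasic String String
  | AuthSSH String
  deriving Generic
```

The overall shape is the same as our handwritten version: one instance per representation type, plus a wrapper that converts to the representation first.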

Part 3: Dependent typing

It’s time for the third and final part of this blog post: an introduction to dependently typed programming in Haskell. A full treatment of dependently typed programming is far, far too vast to be contained in a single blog post, so I will not attempt to do so here. Rather, I will cover some basic idioms for doing dependent programming and highlight how TMP can be valuable when doing so.

Datatype promotion

In part 1, we used uninhabited datatypes like Z and S a to define new type-level constants. This works, but it is awkward. Imagine for a moment that we wanted to work with type-level booleans. Using our previous approach, we could define two empty datatypes, True and False:

data True
data False

Now we could define type families to provide operations on these types, such as Not:

type family Not a where
  Not True  = False
  Not False = True

However, this has some frustrating downsides:

First, it’s simply inconvenient that we have to define these new True and False “dummy” types, which are completely distinct from the Bool type provided by the prelude.
More significantly, it means Not has a very unhelpful kind:
```
ghci> :kind Not
Not :: * -> *
```
Even though Not is only supposed to be applied to True or False, its kind allows it to be applied to any type at all. You can see this in practice if you try to evaluate something like Not Char:
```
ghci> :kind! Not Char
Not Char :: *
= Not Char
```
Rather than getting an error, GHC simply spits Not Char back at us. This is a somewhat unintuitive property of closed type families: if none of the clauses match, the type family just gets “stuck,” not reducing any further. This can lead to very confusing type errors later in the typechecking process.

One way to think about Not is that it is largely dynamically kinded in the same way some languages are dynamically typed. That isn’t entirely true, as we technically will get a kind error if we try to apply Not to a type constructor rather than a type, such as Maybe:

ghci> :kind! Not Maybe

<interactive>:1:5: error:
    • Expecting one more argument to ‘Maybe’
      Expected a type, but ‘Maybe’ has kind ‘* -> *’

…but * is still a very big kind, much bigger than we would like to permit for Not.

To help with both these problems, GHC provides datatype promotion via the DataKinds language extension. The idea is that for each normal, non-GADT type definition like

data Bool = False | True

then in addition to the normal type constructor and value constructors, GHC also defines several promoted constructors:

- Bool is allowed as both a type and a kind.
- 'True and 'False are defined as new types of kind Bool.

We can see this in action if we remove our data True and data False declarations and adjust our definition of Not to use promoted constructors:

{-# LANGUAGE DataKinds #-}

type family Not a where
  Not 'True  = 'False
  Not 'False = 'True

Now the inferred kind of Not is no longer * -> *:

ghci> :kind Not
Not :: Bool -> Bool

Consequently, we will now get a kind error if we attempt to apply Not to anything other than 'True or 'False:

ghci> :kind! Not Char

<interactive>:1:5: error:
    • Expected kind ‘Bool’, but ‘Char’ has kind ‘*’

This is a nice improvement. We can make a similar change to our definitions involving type-level natural numbers:

data Nat = Z | S Nat

class ReifyNat (a :: Nat) where
  reifyNat :: Natural

instance ReifyNat 'Z where
  reifyNat = 0

instance ReifyNat a => ReifyNat ('S a) where
  reifyNat = 1 + reifyNat @a

Note that we need to add an explicit kind signature on the definition of the ReifyNat typeclass: nothing in the types of the typeclass methods suggests otherwise, so without it GHC will assume a has kind *. In addition to making it clearer that Z and S are related, this prevents someone from coming along and defining a nonsensical instance like ReifyNat Char, which previously would have been allowed but will now be rejected with a kind error.

Datatype promotion is not strictly required to do TMP, but it makes the process significantly less painful. It makes Haskell’s kind language extensible in the same way its type language is, which allows type-level programming to enjoy static typechecking (or more accurately, static kindchecking) in the same way term-level programming does.
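As a small sketch of how kind signatures pay off, promoted naturals can be combined with a kinded type family. The Add family and the Two, Three, and five names are hypothetical, introduced only for this example.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}

import Numeric.Natural (Natural)

data Nat = Z | S Nat

class ReifyNat (a :: Nat) where
  reifyNat :: Natural

instance ReifyNat 'Z where
  reifyNat = 0

instance ReifyNat a => ReifyNat ('S a) where
  reifyNat = 1 + reifyNat @a

-- Hypothetical type-level addition. The kind annotations mean that
-- something like Add Char 'Z is rejected with a kind error.
type family Add (a :: Nat) (b :: Nat) :: Nat where
  Add 'Z     b = b
  Add ('S a) b = 'S (Add a b)

type Two   = 'S ('S 'Z)
type Three = 'S Two

-- GHC reduces Add Two Three to a concrete Nat, then ReifyNat turns
-- that type into a term-level number.
five :: Natural
five = reifyNat @(Add Two Three)
```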

GADTs and proof terms

So far in this blog post, we have discussed several different function-like things:

- Ordinary Haskell functions are functions from terms to terms.
- Type families are functions from types to types.
- Typeclasses are functions from types to terms.

A curious reader may wonder about the existence of a fourth class of function:

- ??? are functions from terms to types.

To reason about what could go in the ??? above, we must consider what “a function from terms to types” would even mean. Functions from terms to terms and types to types are straightforward enough. Functions from types to terms are a little trickier, but they make intuitive sense: we use information known at compile-time to generate runtime behavior. But how could information possibly flow in the other direction? How could we possibly turn runtime information into compile-time information without being able to predict the future?

In general, we cannot. However, one feature of Haskell allows a restricted form of seemingly doing the impossible—turning runtime information into compile-time information—and that’s GADTs.

GADTs⁴ are described in detail in the GHC User’s Guide, but the key idea for our purposes is that pattern-matching on a GADT constructor can refine type information. Here’s a simple, silly example:

data WhatIsIt a where
  ABool :: WhatIsIt Bool
  AnInt :: WhatIsIt Int

doSomething :: WhatIsIt a -> a -> a
doSomething ABool x = not x
doSomething AnInt x = x + 1

Here, WhatIsIt is a datatype with two nullary constructors, ABool and AnInt, similar to a normal, non-GADT datatype like this one:

data WhatIsIt a = ABool | AnInt

What’s special about GADTs is that each constructor is given an explicit type signature. With the plain ADT definition above, ABool and AnInt would both have the type forall a. WhatIsIt a, but in the GADT definition, we explicitly fix a to Bool in the type of ABool and to Int in the type of AnInt.

This simple feature allows us to do very interesting things. The doSomething function is polymorphic in a, but on the right-hand side of the first equation, x has type Bool, while on the right-hand side of the second equation, x has type Int. This is because the WhatIsIt a argument effectively constrains the type of a, as we can see by experimenting with doSomething in GHCi:

ghci> doSomething ABool True
False
ghci> doSomething AnInt 10
11
ghci> doSomething AnInt True

error:
    • Couldn't match expected type ‘Int’ with actual type ‘Bool’
    • In the second argument of ‘doSomething’, namely ‘True’
      In the expression: doSomething AnInt True
      In an equation for ‘it’: it = doSomething AnInt True

One way to think about GADTs is as “proofs” or “witnesses” of type equalities. The ABool constructor is a proof of a ~ Bool, while the AnInt constructor is a proof of a ~ Int. When you construct ABool or AnInt, you must be able to satisfy the equality, and it is in a sense “packed into” the constructor value. When code pattern-matches on the constructor, the equality is “unpacked from” the value, and the equality becomes available on the right-hand side of the pattern match.
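One way to see the "packed equality" reading directly is to write WhatIsIt in ordinary constructor syntax with explicit equality constraints. This is a sketch of an alternative spelling that should behave the same as the GADT-syntax declaration above.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}

-- Each constructor stores a proof of an equality about a.
data WhatIsIt a
  = (a ~ Bool) => ABool
  | (a ~ Int)  => AnInt

doSomething :: WhatIsIt a -> a -> a
-- Matching on ABool brings a ~ Bool into scope, so x can be negated.
doSomething ABool x = not x
-- Matching on AnInt brings a ~ Int into scope, so x can be incremented.
doSomething AnInt x = x + 1
```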

GADTs can be much more sophisticated than our simple WhatIsIt type above. Just like normal ADTs, GADT constructors can have parameters, which makes it possible to write inductive datatypes that carry type equality proofs with them:

infixr 5 `HCons`

data HList as where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)

This type is a heterogeneous list, a list that can contain elements of different types:

ghci> :t True `HCons` "hello" `HCons` 42 `HCons` HNil
True `HCons` "hello" `HCons` 42 `HCons` HNil
  :: Num a => HList '[Bool, [Char], a]

An HList is parameterized by a type-level list that keeps track of the types of its elements, which allows us to highlight another interesting property of GADTs: if we restrict that type information, the GHC pattern exhaustiveness checker will take the restriction into account. For example, we can write a completely total head function on HLists like this:

head :: HList (a ': as) -> a
head (x `HCons` _) = x

Remarkably, GHC does not complain that this definition of head is non-exhaustive. Since we specified that the argument must be of type HList (a ': as) in the type signature for head, GHC knows that the argument cannot be HNil (which would have the type HList '[]), so it doesn’t ask us to handle that case.
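The same idea extends to other list operations. The following sketch uses the hypothetical names hhead, htail, and hlength (chosen to avoid clashing with the Prelude), plus an explicit kind annotation on HList that the definition above leaves to inference.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)

infixr 5 `HCons`

data HList (as :: [Type]) where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)

-- Total: the type rules out HNil, so only HCons needs a case.
hhead :: HList (a ': as) -> a
hhead (x `HCons` _) = x

-- Also total, for the same reason.
htail :: HList (a ': as) -> HList as
htail (_ `HCons` xs) = xs

-- No restriction on the index here, so both constructors are handled.
hlength :: HList as -> Int
hlength HNil           = 0
hlength (_ `HCons` xs) = 1 + hlength xs
```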

These examples illustrate the way GADTs serve as a general-purpose construct for relating type- and term-level information. Information flows bidirectionally: type information refines the set of data constructors that can be matched on, and matching on data constructors exposes new type equalities.

Proofs that work together

This interplay is wonderfully compositional. Suppose we wanted to write a function that accepts an HList of exactly 1, 2, or 3 elements. There’s no easy way to express that in the type signature the way we did with head, so it might seem like all we can do is write an entirely new container datatype that has three constructors, one for each case.

However, a more interesting solution exists that takes advantage of the bidirectional nature of GADTs. We can start by writing a proof term that contains no values; it just encapsulates type equalities on a type-level list:

data OneToThree a b c as where
  One   :: OneToThree a b c '[a]
  Two   :: OneToThree a b c '[a, b]
  Three :: OneToThree a b c '[a, b, c]

We call it a proof term because a value of type OneToThree a b c as constitutes a proof that as has exactly 1, 2, or 3 elements. Using OneToThree, we can write a function that accepts an HList accompanied by a proof term:

sumUpToThree :: OneToThree Int Int Int as -> HList as -> Int
sumUpToThree One   (x `HCons` HNil)                     = x
sumUpToThree Two   (x `HCons` y `HCons` HNil)           = x + y
sumUpToThree Three (x `HCons` y `HCons` z `HCons` HNil) = x + y + z

As with head, this function is completely exhaustive, in this case because we take full advantage of the bidirectional nature of GADTs:

- When we match on the OneToThree proof term, information flows from the term level to the type level, refining the type of as in that branch.
- The refined type of as then flows back down to the term level, restricting the shape the HList can take and refining the set of patterns we have to match.

Of course, this example is not especially useful, but in general proof terms can encode any number of useful properties. For example, we can write a proof term that ensures an HList has an even number of elements:

data Even as where
  EvenNil  :: Even '[]
  EvenCons :: Even as -> Even (a ': b ': as)

This is a proof which itself has inductive structure: EvenCons takes a proof that as has an even number of elements and produces a proof that adding two more elements preserves the evenness. We can combine this with a type family to write a function that “pairs up” elements in an HList:

type family PairUp as where
  PairUp '[]            = '[]
  PairUp (a ': b ': as) = (a, b) ': PairUp as

pairUp :: Even as -> HList as -> HList (PairUp as)
pairUp EvenNil         HNil                     = HNil
pairUp (EvenCons even) (x `HCons` y `HCons` xs) = (x, y) `HCons` pairUp even xs

Once again, this definition is completely exhaustive, and we can show that it works in GHCi:

ghci> pairUp (EvenCons $ EvenCons EvenNil)
             (True `HCons` 'a' `HCons` () `HCons` "foo" `HCons` HNil)
(True,'a') `HCons` ((),"foo") `HCons` HNil

This ability to capture properties of a type using auxiliary proof terms, rather than having to define an entirely new type, is one of the things that makes dependently typed programming so powerful.

Proof inference

While our definition of pairUp is interesting, you may be skeptical of its practical utility. It’s fiddly and inconvenient to have to pass the Even proof term explicitly, since it must be updated every time the length of the list changes. Fortunately, this is where TMP comes in.

Remember that typeclasses are functions from types to terms. As it happens, a value of type Even as can be mechanically produced from the structure of the type as. This suggests that we could use TMP to automatically generate Even proofs, and indeed, we can. In fact, it’s not at all complicated:

class IsEven as where
  evenProof :: Even as

instance IsEven '[] where
  evenProof = EvenNil

instance IsEven as => IsEven (a ': b ': as) where
  evenProof = EvenCons evenProof

We can now adjust our pairUp function to use IsEven instead of an explicit Even argument:

pairUp :: IsEven as => HList as -> HList (PairUp as)
pairUp = go evenProof where
  go :: Even as -> HList as -> HList (PairUp as)
  go EvenNil         HNil                     = HNil
  go (EvenCons even) (x `HCons` y `HCons` xs) = (x, y) `HCons` go even xs

This is essentially identical to its old definition, but by acquiring the proof via IsEven rather than passing it explicitly, we can call pairUp without having to construct a proof manually:

ghci> pairUp (True `HCons` 'a' `HCons` () `HCons` "foo" `HCons` HNil)
(True,'a') `HCons` ((),"foo") `HCons` HNil

This is rather remarkable. Using TMP, we are able to get GHC to automatically construct a proof that a list has an even number of elements, with no programmer guidance beyond writing the IsEven typeclass. This relies once more on the perspective that typeclasses are functions that accept types and generate term-level code: IsEven is a function that accepts a type-level list and generates an Even proof term.

From this perspective, typeclasses are a way of specifying a proof search algorithm to the compiler. In the case of IsEven, the proofs being generated are rather simple, so the proof search algorithm is quite mechanical. But in general, typeclasses can be used to perform proof search of significant complexity, given a sufficiently clever encoding into the type system.
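The same recipe can be applied to the OneToThree proof term from earlier. In this sketch, the IsOneToThree class, its oneToThree method, and sumUpToThreeAuto are hypothetical names; everything else repeats the earlier definitions.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}

infixr 5 `HCons`

data HList as where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)

data OneToThree a b c as where
  One   :: OneToThree a b c '[a]
  Two   :: OneToThree a b c '[a, b]
  Three :: OneToThree a b c '[a, b, c]

sumUpToThree :: OneToThree Int Int Int as -> HList as -> Int
sumUpToThree One   (x `HCons` HNil)                     = x
sumUpToThree Two   (x `HCons` y `HCons` HNil)           = x + y
sumUpToThree Three (x `HCons` y `HCons` z `HCons` HNil) = x + y + z

-- Hypothetical proof inference class: one instance per proof constructor.
class IsOneToThree a b c as where
  oneToThree :: OneToThree a b c as

instance IsOneToThree a b c '[a] where
  oneToThree = One

instance IsOneToThree a b c '[a, b] where
  oneToThree = Two

instance IsOneToThree a b c '[a, b, c] where
  oneToThree = Three

-- The proof argument is now supplied by instance resolution.
sumUpToThreeAuto :: IsOneToThree Int Int Int as => HList as -> Int
sumUpToThreeAuto = sumUpToThree oneToThree
```

With these particular instances, callers need the element types to be known to be Int (for example via annotations on numeric literals) before GHC will pick an instance.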

Aside: GADTs versus type families

Before moving on, I want to explicitly call attention to the relationship between GADTs and type families. Though at first glance they may seem markedly different, there are some similarities between the two, and sometimes they may be used to accomplish similar things.

Consider again the type of the pairUp function above (without the typeclass for simplicity):

pairUp :: Even as -> HList as -> HList (PairUp as)

We used both a GADT, Even, and a type family, PairUp. But we could have, in theory, used only a GADT and eliminated the type family altogether. Consider this variation on the Even proof term:

data EvenPairs as bs where
  EvenNil  :: EvenPairs '[] '[]
  EvenCons :: EvenPairs as bs -> EvenPairs (a ': b ': as) ((a, b) ': bs)

This type has two type parameters rather than one, and though there’s no distinction between the two from GHC’s point of view, it can be useful to think of as as an “input” parameter and bs as an “output” parameter. The idea is that any EvenPairs proof relates both an even-length list type and its paired up equivalent:

- EvenNil has type EvenPairs '[] '[],
- EvenCons EvenNil has type EvenPairs '[a, b] '[(a, b)],
- EvenCons (EvenCons EvenNil) has type EvenPairs '[a, b, c, d] '[(a, b), (c, d)],
- …and so on.

This allows us to reformulate our pairUp type signature this way:

pairUp :: EvenPairs as bs -> HList as -> HList bs

The definition is otherwise unchanged. The PairUp type family is completely gone, because now EvenPairs itself defines the relation. In this way, GADTs can be used like type-level functions!
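For completeness, here is a self-contained sketch of the EvenPairs version of pairUp. The only deliberate change from the earlier definition is that the recursive proof is bound to the name rest, to avoid shadowing the Prelude's even.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}

infixr 5 `HCons`

data HList as where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)

-- Relates an even-length list type (as) to its paired-up form (bs).
data EvenPairs as bs where
  EvenNil  :: EvenPairs '[] '[]
  EvenCons :: EvenPairs as bs -> EvenPairs (a ': b ': as) ((a, b) ': bs)

-- Matching on the proof refines both as and bs at once, so no type
-- family is needed to compute the result type.
pairUp :: EvenPairs as bs -> HList as -> HList bs
pairUp EvenNil         HNil                     = HNil
pairUp (EvenCons rest) (x `HCons` y `HCons` xs) = (x, y) `HCons` pairUp rest xs
```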

The inverse, however, is not true, at least not directly: we cannot eliminate the GADT altogether and exclusively use type families. One way to attempt doing so would be to define a type family that returns a constraint rather than a type:

import Data.Kind (Constraint)

type family IsEvenTF as :: Constraint where
  IsEvenTF '[]            = ()
  IsEvenTF (_ ': _ ': as) = IsEvenTF as

The idea here is that IsEvenTF as produces a constraint that can only be satisfied if as has an even number of elements, since that’s the only way it will eventually reduce to (), which in this case means the empty set of constraints, not the unit type (yes, the syntax for that is confusing). And in fact, it’s true that putting IsEvenTF as => in a type signature successfully restricts as to be an even-length list, but it doesn’t allow us to write pairUp. To see why, we can try the following definition:

pairUp :: IsEvenTF as => HList as -> HList (PairUp as)
pairUp HNil                     = HNil
pairUp (x `HCons` y `HCons` xs) = (x, y) `HCons` pairUp xs

Unlike the version using the GADT, this version of pairUp is not considered exhaustive:

warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In an equation for ‘pairUp’: Patterns not matched: HCons _ HNil

This is because type families don’t provide the same bidirectional flow of information that GADTs do; they’re only type-level functions. The constraint generated by IsEvenTF provides no term-level evidence about the shape of as, so we can’t branch on it the way we can branch on the Even GADT.⁵ (In a sense, IsEvenTF is doing validation, not parsing.)

For this reason, I caution against overuse of type families. Their simplicity is seductive, but all too often you pay for that simplicity with inflexibility. GADTs combined with TMP for proof inference can provide the best of both worlds: complete control over the term-level proof that gets generated while still letting the compiler do most of the work for you.

Guiding type inference

So far, this blog post has given relatively little attention to type inference. That is in some part a testament to the robustness of GHC’s type inference algorithm: even when fairly sophisticated TMP is involved, GHC often manages to propagate enough type information that type annotations are rarely needed.

However, when doing TMP, it would be irresponsible to not at least consider the type inference properties of programs. Type inference is what drives the whole typeclass resolution process to begin with, so poor type inference can easily make your fancy TMP construction next to useless. To take advantage of GHC to the fullest extent, programs should proactively guide the typechecker to help it infer as much as possible as often as possible.

To illustrate what that can look like, suppose we want to use TMP to generate an HList full of () values of an arbitrary length:

class UnitList as where
  unitList :: HList as

instance UnitList '[] where
  unitList = HNil

instance UnitList as => UnitList (() ': as) where
  unitList = () `HCons` unitList

Testing in GHCi, we can see it behaves as desired:

ghci> unitList :: HList '[(), (), ()]
() `HCons` () `HCons` () `HCons` HNil

Now suppose we write a function that accepts a list containing exactly one element and returns it:

unsingleton :: HList '[a] -> a
unsingleton (x `HCons` HNil) = x

Naturally, we would expect these to compose without a hitch. If we write unsingleton unitList, our TMP should generate a list of length 1, and we should get back (). However, it may surprise you to learn that isn’t, in fact, what happens:⁶

ghci> unsingleton unitList

error:
    • Ambiguous type variable ‘a0’ arising from a use of ‘unitList’
      prevents the constraint ‘(UnitList '[a0])’ from being solved.
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      These potential instances exist:
        instance UnitList as => UnitList (() : as)

What went wrong? The type error says that a0 is ambiguous, but it only lists a single matching UnitList instance—the one we want—so how can it be ambiguous which one to select?

The problem stems from the way we defined UnitList. When we wrote the instance

instance UnitList as => UnitList (() ': as) where

we said the first element of the type-level list must be (), but there’s nothing stopping someone from coming along and defining another instance:

instance UnitList as => UnitList (Int ': as) where
  unitList = 0 `HCons` unitList

In that case, GHC would have no way to know which instance to pick. Nothing in the type of unsingleton forces the element in the list to have type (), so both instances are equally valid. To hedge against this future possibility, GHC rejects the program as ambiguous from the start.

Of course, this isn’t what we want. The UnitList class is supposed to always return a list of () values, so how can we force GHC to pick our instance anyway? The answer is to play a trick:

instance (a ~ (), UnitList as) => UnitList (a ': as) where
  unitList = () `HCons` unitList

Here we’ve changed the instance so that it has the shape UnitList (a ': as), with a type variable in place of the (), but we also added an equality constraint that forces a to be (). Intuitively, you might think these two instances are completely identical, but in fact they are not! As proof, our example now typechecks:

ghci> unsingleton unitList
()

To understand why, it’s important to understand how GHC’s typeclass resolution algorithm works. Let’s start by establishing some terminology. Note that every instance declaration has the following shape:

instance <constraints> => C <types>

The part to the left of the => is known as the instance context, while the part to the right is known as the instance head. Now for the important bit: when GHC attempts to pick which typeclass instance to use to solve a typeclass constraint, only the instance head matters, and the instance context is completely ignored. Once GHC picks an instance, it commits to its choice, and only then does it consider the instance context.

This explains why our two UnitList instances behave differently:

- Given the instance head UnitList (() ': as), GHC won’t select the instance unless it knows the first element of the list is ().
- But given the instance head UnitList (a ': as), GHC will pick the instance regardless of the type of the first element. All that matters is that the list is at least one element long.

After the UnitList (a ': as) instance is selected, GHC attempts to solve the constraints in the instance context, including the a ~ () constraint. This forces a to be (), resolving the ambiguity and allowing type inference to proceed.

This distinction might seem excessively subtle, but in practice it is enormously useful. It means you, the programmer, have direct control over the type inference process:

- If you put a type in the instance head, you’re asking GHC to figure out how to make the types match up by some other means. Sometimes that’s very useful, since perhaps you want that type to inform which instance to pick.
- But if you put an equality constraint in the instance context, the roles are reversed: you’re saying to the compiler “you don’t tell me, I’ll tell you what type this is,” effectively giving you a role in type inference itself.

From this perspective, typeclass instances with equality constraints make GHC’s type inference algorithm extensible. You get to pick which decisions are made and when, and crucially, you can use knowledge of your own program structure to expose more information to the typechecker.
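Putting the pieces together, a self-contained sketch of the UnitList trick might look like the following. The example binding is hypothetical; it is written so that nothing other than instance resolution determines the element type.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}

infixr 5 `HCons`

data HList as where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)

class UnitList as where
  unitList :: HList as

instance UnitList '[] where
  unitList = HNil

-- The head matches any non-empty list; the context then tells GHC
-- that the first element must be ().
instance (a ~ (), UnitList as) => UnitList (a ': as) where
  unitList = () `HCons` unitList

unsingleton :: HList '[a] -> a
unsingleton (x `HCons` HNil) = x

-- Neither show nor unsingleton pins down the element type, so this
-- relies on the equality constraint above to resolve it to ().
example :: String
example = show (unsingleton unitList)
```

If the instance head is changed back to UnitList (() ': as), the example binding should be rejected with the same kind of ambiguity error shown earlier.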

Given all of the above, consider again the definition of IsEven from earlier:

class IsEven as where
  evenProof :: Even as

instance IsEven '[] where
  evenProof = EvenNil

instance IsEven as => IsEven (a ': b ': as) where
  evenProof = EvenCons evenProof

Though it didn’t cause any problems in the examples we tried, this definition isn’t optimized for type inference. If GHC needed to solve an IsEven (a ': b0) constraint, where b0 is an ambiguous type variable, it would get stuck, since it doesn’t know that someone won’t come along and define an IsEven '[a] instance in the future.

To fix this, we can apply the same trick we used for UnitList, just in a slightly different way:

instance (as ~ (b ': bs), IsEven bs) => IsEven (a ': as) where
  evenProof = EvenCons evenProof

Again, the idea is to move the type information we learn from picking this instance into the instance context, allowing it to guide type inference rather than making type inference figure it out from some other source. Consistently applying this transformation can dramatically improve type inference in programs that make heavy use of TMP.
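Here is a self-contained sketch of pairUp using the inference-friendly IsEven instance. UndecidableInstances is enabled as a precaution: bs appears only in the instance context, which is the sort of thing GHC's instance termination check tends to object to. That pragma is my assumption about what is needed, not something stated above.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

infixr 5 `HCons`

data HList as where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)

data Even as where
  EvenNil  :: Even '[]
  EvenCons :: Even as -> Even (a ': b ': as)

type family PairUp as where
  PairUp '[]            = '[]
  PairUp (a ': b ': as) = (a, b) ': PairUp as

class IsEven as where
  evenProof :: Even as

instance IsEven '[] where
  evenProof = EvenNil

-- The head only demands a non-empty list; the context then forces
-- the tail to be non-empty as well.
instance (as ~ (b ': bs), IsEven bs) => IsEven (a ': as) where
  evenProof = EvenCons evenProof

pairUp :: IsEven as => HList as -> HList (PairUp as)
pairUp = go evenProof where
  go :: Even as -> HList as -> HList (PairUp as)
  go EvenNil         HNil                     = HNil
  go (EvenCons rest) (x `HCons` y `HCons` xs) = (x, y) `HCons` go rest xs
```

For fully known list types, this should behave the same way as the original IsEven instance.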

Example 3: Subtyping constraints

At last, we have reached the final example of this blog post. For this one, I have the pleasure of providing a real-world example from a production Haskell codebase: while I was working at Hasura, I had the opportunity to design an internal parser combinator library that captures aspects of the GraphQL type system. One such aspect of that type system is a form of subtyping; GraphQL essentially has two “kinds” of types—input types and output types—but some types can be used as both.

Haskell has no built-in support for subtyping, so most Haskell programs do their best to get away with parametric polymorphism instead. However, in our case, we actually need to distinguish (at runtime) types in the “both” category from those that are exclusively input or exclusively output types. Consequently, our GQLKind datatype has three cases:

data GQLKind
  = Both
  | Input
  | Output

We use DataKinds-promoted versions of this GQLKind type as a parameter to a GQLType GADT:

data GQLType k where
  TScalar      :: GQLType 'Both
  TInputObject :: InputObjectInfo -> GQLType 'Input
  TIObject     :: ObjectInfo -> GQLType 'Output
  -- ...and so on...

This allows us to write functions that only accept input types or only accept output types, which is a wonderful property to be able to guarantee at compile-time! But there’s a problem: if we write a function that only accepts values of type GQLType 'Input, we can’t pass a GQLType 'Both, even though we really ought to be able to.

To fix this, we can use a little dependently typed programming. First, we’ll define a type to represent proof terms that witness a subkinding relationship:

data SubKind k1 k2 where
  KRefl :: SubKind k k
  KBoth :: SubKind 'Both k

The first case, KRefl, states that every kind is trivially a subkind of itself. The second case, KBoth, states that Both is a subkind of any kind at all. (This is a particularly literal example of using a type to define axioms.) The next step is to use TMP to implement proof inference:

class IsSubKind k1 k2 where
  subKindProof :: SubKind k1 k2

instance IsSubKind 'Both k where
  subKindProof = KBoth

instance (k ~ 'Input) => IsSubKind 'Input k where
  subKindProof = KRefl

instance (k ~ 'Output) => IsSubKind 'Output k where
  subKindProof = KRefl

These instances use the type equality trick described in the previous section to guide type inference, ensuring that if we ever need to prove that k is a superkind of 'Input or 'Output, type inference will force them to be equal.

Using IsSubKind, we can easily resolve the problem described above. Rather than write a function with a type like this:

nullable :: GQLParser 'Input a -> GQLParser 'Input (Maybe a)

…we simply use an IsSubKind constraint, instead:

nullable :: IsSubKind k 'Input => GQLParser k a -> GQLParser k (Maybe a)

Now both 'Input and 'Both kinds are accepted. In my experience, this caused no trouble at all for callers of these functions; everything worked completely automatically. Consuming the SubKind proofs was slightly more involved, but only ever so slightly. For example, we have a type family that looks like this:

type family ParserInput k where
  ParserInput 'Both   = InputValue
  ParserInput 'Input  = InputValue
  ParserInput 'Output = SelectionSet

This type family is used to determine what a GQLParser k a actually consumes as input, based on the kind of the GraphQL type it corresponds to. In some functions, we need to prove to GHC that IsSubKind k 'Input implies ParserInput k ~ InputValue.

Fortunately, that is very easy to do using the (:~:) type from Data.Type.Equality in base to capture a term-level witness of a type equality. It’s an ordinary Haskell GADT that happens to have an infix type constructor, and this is its definition:

data a :~: b where
  Refl :: a :~: a

Just as with any other GADT, (:~:) can be used to pack up type equalities and unpack them later; a :~: b just happens to be the GADT that corresponds precisely to the equality a ~ b. Using (:~:), we can write a reusable proof that IsSubKind k 'Input implies ParserInput k ~ InputValue:

inputParserInput :: forall k. IsSubKind k 'Input => ParserInput k :~: InputValue
inputParserInput = case subKindProof @k @'Input of
  KRefl -> Refl
  KBoth -> Refl

This function is a very simple proof by cases, where Refl can be read as “Q.E.D.”:

In the first case, matching on KRefl refines k to 'Input, and ParserInput 'Input is InputValue by definition of ParserInput.
Likewise, in the second case, matching on KBoth refines k to 'Both, and ParserInput 'Both is also InputValue by definition of ParserInput.
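To check the proof in isolation, here is a minimal, self-contained sketch. GQLKind, SubKind, and IsSubKind are restated as I understand them from earlier in this post; InputValue and SelectionSet are hypothetical stand-ins for the real types, and Parser, runParser, and rawInput are invented purely for illustration (they are not the real GQLParser API).

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Type.Equality ((:~:) (..))

data GQLKind = Both | Input | Output

data SubKind k1 k2 where
  KRefl :: SubKind k k
  KBoth :: SubKind 'Both k

class IsSubKind k1 k2 where
  subKindProof :: SubKind k1 k2

instance IsSubKind 'Both k where
  subKindProof = KBoth

instance (k ~ 'Input) => IsSubKind 'Input k where
  subKindProof = KRefl

instance (k ~ 'Output) => IsSubKind 'Output k where
  subKindProof = KRefl

-- Hypothetical stand-ins for the real GraphQL input types.
newtype InputValue = InputValue String deriving (Eq, Show)
newtype SelectionSet = SelectionSet [String] deriving (Eq, Show)

type family ParserInput (k :: GQLKind) where
  ParserInput 'Both   = InputValue
  ParserInput 'Input  = InputValue
  ParserInput 'Output = SelectionSet

inputParserInput :: forall k. IsSubKind k 'Input => ParserInput k :~: InputValue
inputParserInput = case subKindProof @k @'Input of
  KRefl -> Refl
  KBoth -> Refl

-- Hypothetical toy parser: a function from the kind-dependent input.
newtype Parser (k :: GQLKind) a = Parser (ParserInput k -> a)

runParser :: Parser k a -> ParserInput k -> a
runParser (Parser f) = f

-- Matching on Refl brings ParserInput k ~ InputValue into scope,
-- which is what lets `id` typecheck at ParserInput k -> InputValue.
rawInput :: forall k. IsSubKind k 'Input => Parser k InputValue
rawInput = case inputParserInput @k of
  Refl -> Parser id
```

Both rawInput @'Input and rawInput @'Both typecheck, while rawInput @'Output is rejected, because the 'Output instance would require 'Input ~ 'Output.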

This inputParserInput helper allows functions like nullable, which internally need ParserInput k ~ InputValue, to take the form

nullable :: forall k a. IsSubKind k 'Input => GQLParser k a -> GQLParser k (Maybe a)
nullable parser = case inputParserInput @k of
  Refl -> {- ...implementation goes here... -}

Overall, this burden is quite minimal, so the additional type safety is more than worth the effort. The same could not be said without IsSubKind doing work to infer the proofs at each use site, so in this case, TMP has certainly pulled its weight!

Wrapping up and closing thoughts

So concludes my introduction to Haskell TMP. As seems to happen all too often with my blog posts, this one has grown rather long, so allow me to provide a summary of the most important points:

Typeclass metaprogramming is a powerful technique for performing type-directed code generation, making it a form of “value inference” that infers values from types.
Unlike most other metaprogramming mechanisms, TMP has a wonderful synergy with type inference, which allows it to take advantage of information the programmer may not have even written explicitly.
Though I’ve called the technique “typeclass metaprogramming,” TMP really leverages the entirety of the modern GHC type system. Type families, GADTs, promoted types, and more all have their place in usefully applying type-level programming.
Finally, since TMP relies so heavily on type inference to do its job, it’s crucial to be thoughtful about how you design type-level code to give the typechecker as many opportunities to succeed as you possibly can.

The individual applications of TMP covered in this blog post—type-level computation, generic programming, and dependent typing—are all useful in their own right, and this post does not linger on any of them long enough to do any of them justice. That is, perhaps, the cost one pays when trying to discuss such an abstract, general technique. However, I hope that readers can see the forest for the trees and understand how TMP can be a set of techniques in their own right, applicable to the topics described above and more.

Readers may note that this blog post targets a slightly different audience than my other recent writing has. That is a conscious choice: there is an unfortunate dearth of resources to help intermediate Haskell programmers become advanced Haskell programmers, in part because it’s hard to write them. The lack of resources makes tackling topics like this rather difficult, as too often it feels as though an entire web of concepts must be explained all at once, with no obvious incremental path that provides sufficient motivation every step of the way.

It remains to be seen whether my stab at the problem will be successful. But on the chance that it is, I suspect some readers will be curious about where to go next. Here are some ideas:

As mentioned earlier in this blog post, the GHC.Generics module documentation is a great resource if you want to explore generic programming further, and generic programming is a great way to put TMP to practical use.
I have long believed that the GHC User’s Guide is a criminally under-read and underappreciated piece of documentation. It is a treasure trove of knowledge, and I highly recommend reading through the sections on type-related language extensions if you want to get a better grasp of the mechanics of the Haskell type system.
Finally, if dependently typed programming in Haskell intrigues you, and you don’t mind staring into the sun, the singletons library provides abstractions and design patterns that can considerably cut down on the boilerplate. (Also, the accompanying paper is definitely worth a read if you’d like to go down that route.)

Even if you don’t decide to pursue type-level programming in Haskell, I hope this blog post helps make some of the concepts involved less mystical and intimidating. I, for one, think this stuff is worth the effort involved in understanding. After all, you never know when it might come in handy.

¹ Not to be confused with C++’s template metaprogramming, though there are significant similarities between the two techniques. ↩
² There have been proposals to introduce ordered instances, known in the literature as instance chains, but as of this writing, GHC does not implement them. ↩
³ Note that this also preserves an important property of the Haskell type system, parametricity. A function like id :: a -> a shouldn’t be allowed to do different things depending on which type is chosen for a, which our first version of guardUnit tried to violate. Typeclasses, being functions on types, can naturally do different things given different types, so a typeclass constraint is precisely what gives us the power to violate parametricity. ↩
⁴ Short for generalized algebraic datatypes, which is a rather unhelpful name for actually understanding what they are or what they’re for. ↩
⁵ If GHC allowed lightweight existential quantification, we could make that term-level evidence available with a sufficiently clever definition for IsEvenTF:
```
type family IsEvenTF as :: Constraint where
  IsEvenTF '[]       = ()
  IsEvenTF (a ': as) = exists b as'. (as ~ (b ': as'), IsEvenTF as')
```
The type refinement provided by matching on HCons would be enough for the second case of IsEvenTF to be selected, which would provide an equality proof that as has at least two elements. Sadly, GHC does not support anything of this sort, and it’s unclear if it would be tractable to implement at all. ↩
⁶ Actually, I’ve cheated a little bit here, because unsingleton unitList really does typecheck in GHCi under normal circumstances. That’s because the ExtendedDefaultRules extension is enabled in GHCi by default, which defaults ambiguous type variables to (), which happens to be exactly what’s needed to make this contrived example typecheck. However, that doesn’t say anything very useful, since the same expression really would fail to typecheck inside a Haskell module, so I’ve turned ExtendedDefaultRules off to illustrate the problem. ↩

Names are not type safety

2020-11-01T00:00:00Z

Haskell programmers spend a lot of time talking about type safety. The Haskell school of program construction advocates “capturing invariants in the type system” and “making illegal states unrepresentable,” both of which sound like compelling goals, but are rather vague on the techniques used to achieve them. Almost exactly one year ago, I published Parse, Don’t Validate as an initial stab towards bridging that gap.

The ensuing discussions were largely productive and right-minded, but one particular source of confusion quickly became clear: Haskell’s newtype construct. The idea is simple enough—the newtype keyword declares a wrapper type, nominally distinct from but representationally equivalent to the type it wraps—and on the surface this sounds like a simple and straightforward path to type safety. For example, one might consider using a newtype declaration to define a type for an email address:

newtype EmailAddress = EmailAddress Text

This technique can provide some value, and when coupled with a smart constructor and an encapsulation boundary, it can even provide some safety. But it is a meaningfully distinct kind of type safety from the one I highlighted a year ago, one that is far weaker. On its own, a newtype is just a name.

And names are not type safety.

Intrinsic and extrinsic safety

To illustrate the difference between constructive data modeling (discussed at length in my previous blog post) and newtype wrappers, let’s consider an example. Suppose we want a type for “an integer between 1 and 5, inclusive.” The natural constructive modeling would be an enumeration with five cases:

data OneToFive
  = One
  | Two
  | Three
  | Four
  | Five

We could then write some functions to convert between Int and our OneToFive type:

toOneToFive :: Int -> Maybe OneToFive
toOneToFive 1 = Just One
toOneToFive 2 = Just Two
toOneToFive 3 = Just Three
toOneToFive 4 = Just Four
toOneToFive 5 = Just Five
toOneToFive _ = Nothing

fromOneToFive :: OneToFive -> Int
fromOneToFive One   = 1
fromOneToFive Two   = 2
fromOneToFive Three = 3
fromOneToFive Four  = 4
fromOneToFive Five  = 5

This would be perfectly sufficient for achieving our stated goal, but you’d be forgiven for finding it odd: it would be rather awkward to work with in practice. Because we’ve invented an entirely new type, we can’t reuse any of the usual numeric functions Haskell provides. Consequently, many programmers would gravitate towards a newtype wrapper, instead:

newtype OneToFive = OneToFive Int

Just as before, we can provide toOneToFive and fromOneToFive functions, with identical types:

toOneToFive :: Int -> Maybe OneToFive
toOneToFive n
  | n >= 1 && n <= 5 = Just $ OneToFive n
  | otherwise        = Nothing

fromOneToFive :: OneToFive -> Int
fromOneToFive (OneToFive n) = n

If we put these declarations in their own module and choose not to export the OneToFive constructor, these APIs might appear entirely interchangeable. Naïvely, it seems that the newtype version is both simpler and equally type-safe. However—perhaps surprisingly—this is not actually true.

To see why, suppose we write a function that consumes a OneToFive value as an argument. Under the constructive modeling, such a function need only pattern-match against each of the five constructors, and GHC will accept the definition as exhaustive:

ordinal :: OneToFive -> Text
ordinal One   = "first"
ordinal Two   = "second"
ordinal Three = "third"
ordinal Four  = "fourth"
ordinal Five  = "fifth"

The same is not true given the newtype encoding. The newtype is opaque, so the only way to observe it is to convert it back to an Int—after all, it is an Int. An Int can of course contain many other values besides 1 through 5, so we are forced to add an error case to satisfy the exhaustiveness checker:

ordinal :: OneToFive -> Text
ordinal n = case fromOneToFive n of
  1 -> "first"
  2 -> "second"
  3 -> "third"
  4 -> "fourth"
  5 -> "fifth"
  _ -> error "impossible: bad OneToFive value"

In this highly contrived example, this may not seem like much of a problem to you. But it nonetheless illustrates a key difference in the guarantees afforded by the two approaches:

The constructive datatype captures its invariants in such a way that they are accessible to downstream consumers. This frees our ordinal function from worrying about handling illegal values, as they have been made unutterable.
The newtype wrapper provides a smart constructor that validates the value, but the boolean result of that check is used only for control flow; it is not preserved in the function’s result. Accordingly, downstream consumers cannot take advantage of the restricted domain; they are functionally accepting Ints.

Losing exhaustiveness checking might seem like small potatoes, but it absolutely is not: our use of error has punched a hole right through our type system. If we were to add another constructor to our OneToFive datatype,¹ the version of ordinal that consumes a constructive datatype would immediately be flagged as non-exhaustive at compile time, while the version that consumes a newtype wrapper would continue to compile yet fail at runtime, dropping through to the “impossible” case.

All of this is a consequence of the fact that the constructive modeling is intrinsically type-safe; that is, the safety properties are enforced by the type declaration itself. Illegal values truly are unrepresentable: there is simply no way to represent 6 using any of the five constructors. The same is not true of the newtype declaration, which has no intrinsic semantic distinction from that of an Int; its meaning is specified extrinsically via the toOneToFive smart constructor. Any semantic distinction intended by a newtype is thoroughly invisible to the type system; it exists only in the programmer’s mind.
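A small sketch can make the extrinsic nature of the newtype’s invariant concrete. The helper next below is hypothetical (it does not appear anywhere above), and ordinal is changed to return Either instead of calling error, purely so the hole shows up as a value rather than a crash. Text is swapped for String to keep the sketch dependency-free.

```haskell
newtype OneToFive = OneToFive Int
  deriving (Eq, Show)

toOneToFive :: Int -> Maybe OneToFive
toOneToFive n
  | n >= 1 && n <= 5 = Just (OneToFive n)
  | otherwise        = Nothing

fromOneToFive :: OneToFive -> Int
fromOneToFive (OneToFive n) = n

-- Hypothetical helper added to the home module. It typechecks without
-- complaint, but it forgets to re-check the upper bound.
next :: OneToFive -> OneToFive
next (OneToFive n) = OneToFive (n + 1)

-- Same shape as the newtype-based ordinal above, except the "impossible"
-- branch returns Left instead of calling error.
ordinal :: OneToFive -> Either String String
ordinal n = case fromOneToFive n of
  1 -> Right "first"
  2 -> Right "second"
  3 -> Right "third"
  4 -> Right "fourth"
  5 -> Right "fifth"
  other -> Left ("impossible: bad OneToFive value " ++ show other)
```

Here fmap (ordinal . next) (toOneToFive 5) lands in the Left branch even though every function involved is well-typed. With the five-constructor datatype, a next function would have to say explicitly what comes after Five, so the question could not be skipped by accident.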

Revisiting non-empty lists

Our OneToFive datatype is rather artificial, but identical reasoning applies to other datatypes that are significantly more practical. Consider the NonEmpty datatype I’ve repeatedly highlighted in recent blog posts:

data NonEmpty a = a :| [a]

It may be illustrative to imagine a version of NonEmpty represented as a newtype over ordinary lists. We can use the usual smart constructor strategy to enforce the desired non-emptiness property:

newtype NonEmpty a = NonEmpty [a]

nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just $ NonEmpty xs

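instance Foldable NonEmpty where
  -- foldr completes the instance; with neither foldr nor foldMap defined, the default methods loop
  foldr f z (NonEmpty xs) = foldr f z xs
  toList (NonEmpty xs) = xs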

Just as with OneToFive, we quickly discover the consequences of failing to preserve this information in the type system. Our motivating use case for NonEmpty was the ability to write a safe version of head, but the newtype version requires another assertion:

head :: NonEmpty a -> a
head xs = case toList xs of
  x:_ -> x
  []  -> error "impossible: empty NonEmpty value"

This might not seem like a big deal, since it seems unlikely such a case would ever happen. But that reasoning hinges entirely on trusting the correctness of the module that defines NonEmpty, while the constructive definition only requires trusting the GHC typechecker. As we generally trust that the typechecker works correctly, the latter is a much more compelling proof.
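For comparison, here is a minimal sketch of the same API over the constructive definition. The fixity declaration and the derived instances are additions for convenience, and Prelude’s head is hidden so the name can be reused.

```haskell
import Prelude hiding (head)

infixr 5 :|

data NonEmpty a = a :| [a]
  deriving (Eq, Show)

-- The check happens once, and its result is preserved in the type.
nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty []       = Nothing
nonEmpty (x : xs) = Just (x :| xs)

-- Total: there is no empty case to handle, so no call to error.
head :: NonEmpty a -> a
head (x :| _) = x
```

GHC accepts this head as exhaustive with a single equation, because the datatype has exactly one constructor and that constructor always carries a first element.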

Newtypes as tokens

If you are fond of newtypes, this whole argument may seem a bit troubling. It may seem like I’m implying newtypes are scarcely better than comments, albeit comments that happen to be meaningful to the typechecker. Fortunately, the situation is not quite that grim—newtypes can provide a sort of safety, just a weaker one.

The primary safety benefit of newtypes is derived from abstraction boundaries. If a newtype’s constructor is not exported, it becomes opaque to other modules. The module that defines the newtype—its “home module”—can take advantage of this to create a trust boundary where internal invariants are enforced by restricting clients to a safe API.

We can use the NonEmpty example from above to illustrate how this works. We refrain from exporting the NonEmpty constructor, and we provide head and tail operations that we trust to never actually fail:

module Data.List.NonEmpty.Newtype
  ( NonEmpty
  , cons
  , nonEmpty
  , head
  , tail
  ) where

newtype NonEmpty a = NonEmpty [a]

cons :: a -> [a] -> NonEmpty a
cons x xs = NonEmpty (x:xs)

nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just $ NonEmpty xs

head :: NonEmpty a -> a
head (NonEmpty (x:_)) = x
head (NonEmpty [])    = error "impossible: empty NonEmpty value"

tail :: NonEmpty a -> [a]
tail (NonEmpty (_:xs)) = xs
tail (NonEmpty [])     = error "impossible: empty NonEmpty value"

Since the only way to construct or consume NonEmpty values is to use the functions in Data.List.NonEmpty.Newtype’s exported API, the above implementation makes it impossible for clients to violate the non-emptiness invariant. In a sense, values of opaque newtypes are like tokens: the implementing module issues tokens via its constructor functions, and those tokens have no intrinsic value. The only way to do anything useful with them is to “redeem” them to the issuing module’s accessor functions, in this case head and tail, to obtain the values contained within.

This approach is significantly weaker than using a constructive datatype, since it is theoretically possible to screw up and accidentally provide a means to construct an invalid NonEmpty [] value. For this reason, the newtype approach to type safety does not on its own constitute a proof that a desired invariant holds. However, it restricts the “surface area” where an invariant violation can occur to the defining module, so reasonable confidence the invariant really does hold can be achieved by thoroughly testing the module’s API using fuzzing or property-based testing techniques.²
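As a rough illustration of what such testing might look like, the sketch below checks two round-trip properties of the exported API over a small, exhaustively enumerated input space. It is a hand-rolled stand-in for a real property-based testing library such as QuickCheck; the names smallLists, prop_nonEmptyRoundTrip, prop_consRoundTrip, and allPropertiesHold are all hypothetical, and the module contents are inlined so the example is self-contained.

```haskell
import Control.Monad (replicateM)
import Prelude hiding (head, tail)

newtype NonEmpty a = NonEmpty [a]

cons :: a -> [a] -> NonEmpty a
cons x xs = NonEmpty (x : xs)

nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)

head :: NonEmpty a -> a
head (NonEmpty (x : _)) = x
head (NonEmpty [])      = error "impossible: empty NonEmpty value"

tail :: NonEmpty a -> [a]
tail (NonEmpty (_ : xs)) = xs
tail (NonEmpty [])       = error "impossible: empty NonEmpty value"

-- Every list over {0, 1} of length 0 through 3.
smallLists :: [[Int]]
smallLists = concatMap (\n -> replicateM n [0, 1]) [0 .. 3]

-- Whatever nonEmpty accepts can be redeemed back into the original list,
-- and it rejects exactly the empty list.
prop_nonEmptyRoundTrip :: [Int] -> Bool
prop_nonEmptyRoundTrip xs = case nonEmpty xs of
  Nothing -> null xs
  Just ne -> head ne : tail ne == xs

-- cons followed by head and tail gives back both arguments.
prop_consRoundTrip :: Int -> [Int] -> Bool
prop_consRoundTrip x xs = head ne == x && tail ne == xs
  where
    ne = cons x xs

-- If any API function ever produced NonEmpty [], evaluating this would
-- hit one of the "impossible" errors instead of returning True.
allPropertiesHold :: Bool
allPropertiesHold =
  all prop_nonEmptyRoundTrip smallLists
    && and [prop_consRoundTrip x xs | x <- [0, 1], xs <- smallLists]
```

Passing checks like these raise confidence but prove nothing: they only exercise the inputs that were enumerated, which is exactly the gap between this approach and a constructive datatype.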

This tradeoff may not seem all that bad, and indeed, it is often a very good one! Guaranteeing invariants using constructive data modeling can, in general, be quite difficult, which often makes it impractical. However, it is easy to dramatically underestimate the care needed to avoid accidentally providing a mechanism that permits violating the invariant. For example, the programmer may choose to take advantage of GHC’s convenient typeclass deriving to derive a Generic instance for NonEmpty:

{-# LANGUAGE DeriveGeneric #-}

import GHC.Generics (Generic)

newtype NonEmpty a = NonEmpty [a]
  deriving (Generic)

However, this innocuous line provides a trivial mechanism to circumvent the abstraction boundary:

ghci> GHC.Generics.to @(NonEmpty ()) (M1 $ M1 $ M1 $ K1 [])
NonEmpty []

This is a particularly extreme example, since derived Generic instances are fundamentally abstraction-breaking, but this problem can crop up in less obvious ways, too. The same problem occurs with a derived Read instance:

ghci> read @(NonEmpty ()) "NonEmpty []"
NonEmpty []
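One way to close this particular hole, sketched here under the assumption that a Read instance is actually wanted, is to write the instance by hand and route it through the smart constructor. The parser below accepts the same textual format as the derived instance but fails on an empty list. This is an illustration, not a recommendation over simply omitting the instance.

```haskell
import Text.ParserCombinators.ReadPrec (pfail, prec, step)
import Text.Read (Lexeme (Ident), Read (readPrec), lexP, parens, readMaybe)

newtype NonEmpty a = NonEmpty [a]
  deriving (Eq, Show)

nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)

-- Hand-written instead of derived: parse "NonEmpty <list>", then validate
-- the list with the smart constructor before returning it.
instance Read a => Read (NonEmpty a) where
  readPrec = parens $ prec 10 $ do
    Ident "NonEmpty" <- lexP
    xs <- step readPrec
    maybe pfail pure (nonEmpty xs)
```

With this instance, readMaybe "NonEmpty []" yields Nothing rather than an invalid value, while non-empty inputs parse as before.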

To some readers, these pitfalls may seem obvious, but safety holes of this sort are remarkably common in practice. This is especially true for datatypes with more sophisticated invariants, as it may not be easy to determine whether the invariants are actually upheld by the module’s implementation. Proper use of this technique demands caution and care:

All invariants must be made clear to maintainers of the trusted module. For simple types, such as NonEmpty, the invariant is self-evident, but for more sophisticated types, comments are not optional.
Every change to the trusted module must be carefully audited to ensure it does not somehow weaken the desired invariants.
Discipline is needed to resist the temptation to add unsafe trapdoors that allow compromising the invariants if used incorrectly.
Periodic refactoring may be needed to ensure the trusted surface area remains small. It is all too easy for the responsibility of the trusted module to accumulate over time, dramatically increasing the likelihood of some subtle interaction causing an invariant violation.

In contrast, datatypes that are correct by construction suffer none of these problems. The invariant cannot be violated without changing the datatype definition itself, which has rippling effects throughout the rest of the program to make the consequences immediately clear. Discipline on the part of the programmer is unnecessary, as the typechecker enforces the invariants automatically. There is no “trusted code” for such datatypes, since all parts of the program are equally beholden to the datatype-mandated constraints.

In libraries, the newtype-afforded notion of safety via encapsulation is useful, as libraries often provide the building blocks used to construct more complicated data structures. Such libraries generally receive more scrutiny and care than application code does, especially given they change far less frequently. In application code, these techniques are still useful, but the churn of a production codebase tends to weaken encapsulation boundaries over time, so correctness by construction should be preferred whenever practical.

Other newtype use, abuse, and misuse

The previous section covers the primary means by which newtypes are useful. However, in practice, newtypes are routinely used in ways that do not fit the above pattern. Some such uses are reasonable:

Haskell’s notion of typeclass coherency limits each type to a single instance of any given class. For types that permit more than one useful instance, newtypes are the traditional solution, and this can be used to good effect. For example, the Sum and Product newtypes from Data.Monoid provide useful Monoid instances for numeric types.
In a similar vein, newtypes can be useful for introducing or rearranging type parameters. The Flip newtype from Data.Bifunctor.Flip is a simple example, flipping the arguments of a Bifunctor so the Functor instance may operate on the other side:
```
newtype Flip p a b = Flip { runFlip :: p b a }
```
Newtypes are needed to do this sort of juggling, as Haskell does not (yet) support type-level lambdas.
More simply, transparent newtypes can be useful to discourage misuse when the value needs to be passed between distant parts of the program and the intermediate code has no reason to inspect the value. For example, a ByteString containing a secret key may be wrapped in a newtype (with a Show instance omitted) to discourage code from accidentally logging or otherwise exposing it.
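The first two uses can be sketched briefly. Sum and Product come from Data.Monoid in base; Flip is redefined locally here (the real one lives in the bifunctors package), and the Functor instance shown is a plausible one written for this example rather than a copy of that library’s code. The names total, factorial4, and bumped are hypothetical.

```haskell
import Data.Bifunctor (Bifunctor (first))
import Data.Monoid (Product (..), Sum (..))

-- Local copy of Flip, for illustration only.
newtype Flip p a b = Flip { runFlip :: p b a }

-- Mapping over Flip's last parameter maps over p's *first* parameter.
instance Bifunctor p => Functor (Flip p a) where
  fmap f (Flip x) = Flip (first f x)

-- The same list, combined under two different Monoid instances.
total :: Integer
total = getSum (foldMap Sum [1, 2, 3, 4])

factorial4 :: Integer
factorial4 = getProduct (foldMap Product [1, 2, 3, 4])

-- fmap on a plain pair would touch the second component; via Flip it
-- touches the first.
bumped :: (Integer, String)
bumped = runFlip (fmap (+ 1) (Flip (1, "x")))
```

In each case the newtype selects behavior (which instance applies) and is wrapped and unwrapped immediately; nothing about it restricts which values can exist.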

All of these applications are good ones, but they have little to do with type safety. The last bullet in particular is often confused for safety, and to be fair, it does in fact take advantage of the type system to help avoid logical mistakes. However, it would be a mischaracterization to claim such usage actually prevents misuse; any part of the program may inspect the value at any time.

Too often, this illusion of safety leads to outright newtype abuse. For example, here’s a definition from the very codebase I work on for a living:

newtype ArgumentName = ArgumentName { unArgumentName :: GraphQL.Name }
  deriving ( Show, Eq, FromJSON, ToJSON, FromJSONKey, ToJSONKey
           , Hashable, ToTxt, Lift, Generic, NFData, Cacheable )

This newtype is useless noise. Functionally, it is completely interchangeable with its underlying Name type, so much so that it derives a dozen typeclasses! In every location it’s used, it’s immediately unwrapped the instant it’s extracted from its enclosing record, so there is no type safety benefit whatsoever. Worse, there isn’t even any clarity added by labeling it an ArgumentName, since the enclosing field name already makes its role clear.

Newtypes like these seem to arise from a desire to use the type system as a taxonomy of the external world. An “argument name” is a more specific concept than a generic “name,” so surely it ought to have its own type. This makes some intuitive sense, but it’s rather misguided: taxonomies are useful for documenting a domain of interest, but not necessarily helpful for modeling it. When programming, we use types for a different end:

Primarily, types distinguish functional differences between values. A value of type NonEmpty a is functionally distinct from a value of type [a], since it is fundamentally structurally different and permits additional operations. In this sense, types are structural; they describe what values are in the internal world of the programming language.
Secondarily, we sometimes use types to help ourselves avoid making logical mistakes. We might use separate Distance and Duration types to avoid accidentally doing something nonsensical like adding them together, even though they’re both representationally real numbers.
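A minimal sketch of that second use, with hypothetical names and assumed units (metres and seconds), might look like this:

```haskell
-- Both wrappers are representationally Doubles; the units are an assumption
-- made for this example.
newtype Distance = Distance Double deriving (Eq, Ord, Show) -- metres
newtype Duration = Duration Double deriving (Eq, Ord, Show) -- seconds
newtype Speed    = Speed Double    deriving (Eq, Ord, Show) -- metres per second

addDistance :: Distance -> Distance -> Distance
addDistance (Distance a) (Distance b) = Distance (a + b)

averageSpeed :: Distance -> Duration -> Speed
averageSpeed (Distance d) (Duration t) = Speed (d / t)

-- Rejected by the typechecker, since a Duration is not a Distance:
--   addDistance (Distance 5) (Duration 3)
```

The protection is real but shallow: anyone holding the constructors can unwrap both values and add the Doubles anyway, which is why this counts as help avoiding mistakes rather than as an enforced invariant.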

Note that both these uses are pragmatic; they look at the type system as a tool. This is a rather natural perspective to take, seeing as a static type system is a tool in a literal sense. Nevertheless, that perspective seems surprisingly unusual, even though the use of types to classify the world routinely yields unhelpful noise like ArgumentName.

If a newtype is completely transparent, and it is routinely wrapped and unwrapped at will, it is likely not very helpful. In this particular case, I would eliminate the distinction altogether and use Name, but in situations where the different label adds genuine clarity, one can always use a type alias:³

type ArgumentName = GraphQL.Name

Newtypes like these are security blankets. Forcing programmers to jump through a few hoops is not type safety—trust me when I say they will happily jump through them without a second thought.

Final thoughts and related reading

I’ve been wanting to write this blog post for a long time. Ostensibly, it’s a very specific critique of Haskell newtypes, and I’ve chosen to frame things this way because I write Haskell for a living and this is the way I encounter this problem in practice. Really, though, the core idea is much bigger than that.

Newtypes are one particular mechanism of defining wrapper types, a concept that exists in almost any language, even those that are dynamically typed. Even if you don’t write Haskell, much of the reasoning in this blog post is likely still relevant in your language of choice. More broadly, this is a continuation of a theme I’ve been trying to convey from different angles over the past year: type systems are tools, and we should be more conscious and intentional about what they actually do and how to use them effectively.

The catalyst that got me to finally sit down and write this was the recently-published Tagged is not a Newtype. It’s a good blog post, and I wholeheartedly agree with its general thrust, but I thought it was a missed opportunity to make a larger point. Indeed, Tagged is a newtype, definitionally, so the title of the blog post is something of a misdirection. The real problem is a little deeper.

Newtypes are useful when carefully applied, but their safety is not intrinsic, no more than the safety of a traffic cone is somehow contained within the plastic it’s made of. What matters is being placed in the right context—without that, newtypes are just a labeling scheme, a way of giving something a name.

And a name is not type safety.

¹ Admittedly rather unlikely given its name, but bear with me through the contrived example. ↩
² In theory, it is still possible to thoroughly prove the invariant holds using external verification techniques, such as by writing a pen-and-paper proof or by using program extraction in combination with a proof assistant/theorem prover. However, these techniques are extremely uncommon in general programming practice. ↩
³ As it happens, I think type aliases are often also more harmful than helpful, so I would caution against overusing them, too, but that is outside the scope of this blog post. ↩

Types as axioms, or: playing god with static types

2020-08-13T00:00:00Z

Just what exactly is a type?

A common perspective is that types are restrictions. Static types restrict the set of values a variable may contain, capturing some subset of the space of “all possible values.” Under this worldview, a typechecker is sort of like an oracle, predicting which values will end up where when the program runs and making sure they satisfy the constraints the programmer wrote down in the type annotations. Of course, the typechecker can’t really predict the future, so when the typechecker gets it wrong—it can’t “figure out” what a value will be—static types can feel like self-inflicted shackles.

But that is not the only perspective. There is another way—a way that puts you, the programmer, back in the driver’s seat. You make the rules, you call the shots, you set the objectives. You need not be limited any longer by what the designers of your programming language decided the typechecker can and cannot prove. You do not serve the typechecker; the typechecker serves you.

…no, I’m not trying to sell you a dubious self-help book for programmers who feel like they’ve lost control of their lives. If the above sounds too good to be true, well… I won’t pretend it’s all actually as easy as I make it sound. Nevertheless, it’s well within the reach of the working programmer, and most remarkably, all it takes is a change in perspective.

Seeing the types half-empty

Let’s talk a little about TypeScript.

TypeScript is a gradually-typed language, which means it’s possible to mix statically- and dynamically-typed code. The original intended use case of gradual typing was to gradually add static types to an existing dynamically-typed codebase, which imposes some interesting design constraints. For one, a valid JavaScript program must also be a valid TypeScript program; for another, TypeScript must be accommodating of traditional JavaScript idioms.

Gradually typed languages like TypeScript are particularly good illustrations of the way type annotations can be viewed as constraints. A function with no explicit type declarations¹ can accept any JavaScript value, so adding a type annotation fundamentally restricts the set of legal values.

Furthermore, languages like TypeScript tend to have subtyping. This makes it easy to classify certain types as “more restrictive” than others. For example, a type like string | number clearly includes more values than just number, so number is a more restrictive type—a subtype.

An exceptionally concrete way to illustrate this “types are restrictions” mentality is to write a function with an unnecessarily specific type. Here’s a TypeScript function that returns the first element in an array of numbers:

function getFirst(arr: number[]): number | undefined {
  return arr[0];
}

If we ignore the type annotations and consider only the dynamic semantics of JavaScript, this function would work perfectly well given a list of strings. However, if we write getFirst(["hello", "world"]), the typechecker will complain. In this example, the restriction is thoroughly self-imposed—it would be easy to give this function a more generic type—but it’s not always that easy. For example, suppose we wrote a function where the return type depends upon the type of the argument:

function emptyLike(val: number | string): number | string {
  if (typeof val === "number") {
    return 0;
  } else {
    return "";
  }
}

Now if we write emptyLike(42) * 10, the typechecker will once again complain, claiming the result might be a string—it can’t “figure out” that when we pass a number, we always get a number back.

When type systems are approached from this perspective, the result is often frustration. The programmer knows that the equivalent untyped JavaScript is perfectly well-behaved, so the typechecker comes off as being the highly unfortunate combination of stubborn yet dim-witted. What’s more, the programmer likely has little mental model of the typechecker’s internal operation, so when types like the above are inferred (not explicitly written), it can be unclear what solutions exist to make the error go away.

At this point, the programmer may give up. “Stupid typechecker,” they grumble, changing the return type of emptyLike to any. “If it can’t even figure this out, can it really be all that useful?”

Sadly, this relationship with the typechecker is all too common, and gradually-typed languages in particular tend to create a vicious cycle of frustration:

Gradual type systems are intentionally designed to “just work” on idiomatic code as much as possible, so programmers may not think much about the types except when they get type errors.
Furthermore, many programmers using gradually-typed languages are already adept at programming in the underlying dynamically-typed language, so they have working mental models of program operation in terms of the dynamic semantics alone. They are much less likely to develop a rich mental model of the static semantics of the type system because they are used to reasoning without one.
Gradually typed languages must support idioms from their dynamically-typed heritage, so they often include ad-hoc special cases (such as, for example, special treatment of typeof checks) that obscure the rules the typechecker follows and make them seem semi-magical.
Builtin types are deeply blessed in the type system, strongly encouraging programmers to embrace their full flexibility, but leaving little recourse when they run up against their limits.
All this frustration breeds a readiness to override the typechecker using casts or any, which ultimately creates a self-fulfilling prophecy in which the typechecker rarely catches any interesting mistakes because it has been so routinely disabled.

The end result of all of this is a defeatist attitude that views the typechecker as a minor tooling convenience at best (i.e. a fancy autocomplete provider) or an active impediment at worst. Who can really blame programmers who end up here? The type system has (unintentionally, of course) been designed in a way that leads them into this dead end. The public perception of type systems settles into that of a strikingly literal nitpicker we endure rather than a tool we actively leverage.

Taking back types

After everything I said above, it may be hard to imagine seeing types any other way. Indeed, through the lens of TypeScript, the “types are restrictions” mentality is incredibly natural, so much so that it seems self-evident. But let’s move away from TypeScript for a moment and focus on a different language, Haskell, which encourages a somewhat different perspective. If you aren’t familiar with Haskell, that’s alright—I’m going to try to keep the examples in this blog post as accessible as possible whether you’ve written any Haskell or not.

Though Haskell and TypeScript are both statically-typed—and both of their type systems are fairly sophisticated—Haskell’s type system is almost completely different philosophically:

Haskell does not have subtyping,² which means that every value belongs to exactly one type.
While JavaScript is built around a small handful of flexible builtin datatypes (booleans, numbers, strings, arrays, and objects), Haskell has essentially no blessed, built-in datatypes other than numbers. Key types such as booleans, lists, and tuples are ordinary datatypes defined in the standard library, no different from types users could define.³
In particular, Haskell is built around the idea that datatypes can be defined with multiple cases, and branching is done via pattern-matching (more on this shortly).

Let’s look at a basic Haskell datatype declaration. Suppose we want to define a type that represents a season:

data Season = Spring | Summer | Fall | Winter

If you are familiar with TypeScript, this may look rather similar to a union type; if you’re familiar with a C-family language, this may remind you more of an enum. Both are on the right track: this defines a new type named Season with four possible values, Spring, Summer, Fall, and Winter.

But what exactly are those values?

In TypeScript, we’d represent this type with a union of strings, like this:
```
type Season = "spring" | "summer" | "fall" | "winter";
```
Here, Season is a type that can be one of those four strings, but nothing else.
In C, we’d represent this type with an enum, like this:
```
enum season { SPRING, SUMMER, FALL, WINTER };
```
Here, SPRING, SUMMER, FALL, and WINTER are essentially defined to be global aliases for the integers 0, 1, 2, and 3, and the type enum season is essentially an alias for int.

So in TypeScript, the values are strings, and in C, the values are numbers. What are they in Haskell? Well… they simply are.

The Haskell declaration invents four completely new constants out of thin air, Spring, Summer, Fall, and Winter. They aren’t aliases for numbers, nor are they symbols or strings. The compiler doesn’t expose anything about how it chooses to represent these values at runtime; that’s an implementation detail. In Haskell, Spring is now a value distinct from all other values, even if someone in a different module were to also use the name Spring. Haskell type declarations let us play god, creating something from nothing.

Since these values are totally unique, abstract constants, what can we actually do with them? The answer is one thing and exactly one thing: we can branch on them. For example, we can write a function that takes a Season as an argument and returns whether or not Christmas occurs during it:

containsChristmas :: Season -> Bool
containsChristmas season = case season of
  Spring -> False
  Summer -> True  -- southern hemisphere
  Fall   -> False
  Winter -> True  -- northern hemisphere

case expressions are, to a first approximation, a lot like C-style switch statements (though they can do a lot more than this simple example suggests). Using case, we can also define conversions from our totally unique Season constants to other types, if we want:

seasonToString :: Season -> String
seasonToString season = case season of
  Spring -> "spring"
  Summer -> "summer"
  Fall   -> "fall"
  Winter -> "winter"

We can also go the other way around, converting a String to a Season, but if we try, we run into a problem: what do we return for a string like, say, "cheesecake"? In other languages, we might throw an error or return null, but Haskell does not have null, and errors are generally reserved for truly catastrophic failures. What can we do instead?

A particularly naïve solution would be to create a type called MaybeASeason that has two cases—it can be a valid Season, or it can be NotASeason:

data MaybeASeason = IsASeason Season | NotASeason

stringToSeason :: String -> MaybeASeason
stringToSeason seasonString = case seasonString of
  "spring" -> IsASeason Spring
  "summer" -> IsASeason Summer
  "fall"   -> IsASeason Fall
  "winter" -> IsASeason Winter
  _        -> NotASeason

This shows a feature of Haskell datatypes that C-style enums do not have: they aren’t just constants, they can contain other values. A MaybeASeason can be one of five different values: IsASeason Spring, IsASeason Summer, IsASeason Fall, IsASeason Winter, or NotASeason.

In TypeScript, we’d write MaybeASeason more like this:

type MaybeASeason = Season | "not-a-season";

This is kind of nice, because we don’t have to wrap all our Season values with IsASeason like we have to do in Haskell. But remember that Haskell doesn’t have subtyping—every value must belong to exactly one type—so the Haskell code needs the IsASeason wrapper to distinguish the value as a MaybeASeason rather than a Season.

Now, you may rightly point out that having to invent a type like MaybeASeason every time we need to create a variant of a type with a failure case is absurd, so fortunately we can define a type like MaybeASeason that works for any underlying type. In Haskell, it looks like this:

data Maybe a = Just a | Nothing

This defines a generic type, where the a in Maybe a is a stand-in for some other type, much like the T in Array<T> in other languages. We can change our stringToSeason function to use Maybe:

stringToSeason :: String -> Maybe Season
stringToSeason seasonString = case seasonString of
  "spring" -> Just Spring
  "summer" -> Just Summer
  "fall"   -> Just Fall
  "winter" -> Just Winter
  _        -> Nothing

Maybe gets us something a lot like nullable types, but it isn’t built into the type system, it’s just an ordinary type defined in the standard library.
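
To make that concrete, here is a minimal, self-contained sketch of how a caller might consume the result of stringToSeason. The describeSeason function and its messages are hypothetical, invented only for illustration; Maybe itself comes from the Prelude.

```haskell
-- Season and stringToSeason are repeated from above so the sketch stands alone.
data Season = Spring | Summer | Fall | Winter
  deriving (Eq, Show)

stringToSeason :: String -> Maybe Season
stringToSeason seasonString = case seasonString of
  "spring" -> Just Spring
  "summer" -> Just Summer
  "fall"   -> Just Fall
  "winter" -> Just Winter
  _        -> Nothing

-- Hypothetical caller: the only way to reach the Season is to branch on
-- the Maybe, so the failure case cannot be silently skipped.
describeSeason :: String -> String
describeSeason input = case stringToSeason input of
  Just season -> "parsed " ++ show season
  Nothing     -> "not a season: " ++ input

main :: IO ()
main = do
  putStrLn (describeSeason "fall")
  putStrLn (describeSeason "cheesecake")
```

Because Just and Nothing are ordinary constructors, the same case expression used for Season works here without any special support from the language.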

Positive versus negative space

At this point, you may be wondering to yourself why I am talking about all of this, seeing as everything in the previous section is information you could find in a basic Haskell tutorial. But the point of this blog post is not to teach you Haskell, it’s to focus on a particular philosophical approach to modeling data.

In TypeScript, when we write a type declaration like

type Season = "summer" | "spring" | "fall" | "winter";

we are defining a type that can be one of those four strings and nothing else. All the other strings that aren’t one of those four make up Season’s “negative space”—values that exist, but that we have intentionally excluded. In contrast, the Haskell type does not really have any “negative space” because we pulled four new values out of thin air.

Of course, I suspect you don’t really buy this argument. What makes a string like "cheesecake" “negative space” in TypeScript but not in Haskell? Well… nothing, really. The distinction I’m drawing here doesn’t really exist, it’s just a different perspective, and arguably a totally contrived and arbitrary one. But now that I’ve explained the premise and set up some context, let me provide a more compelling example.

Suppose you are writing a TypeScript program, and you want a function that only accepts non-empty arrays. What can you do? Your first instinct is that you need a way to somehow further restrict the function’s input type to exclude empty arrays. And indeed, there is a trick for doing that:

type NonEmptyArray<T> = [T, ...T[]];

Great! But what if the constraint was more complicated: what if you needed an array containing an even number of elements? Unfortunately, there isn’t really a trick for that one. At this point, you might start wishing the type system had support for something really fancy, like refinement types, so you could write something like this:

type EvenArray<T> = T[] satisfies (arr => arr.length % 2 === 0);

But TypeScript doesn’t support anything like that, so for now you’re stuck. You need a way to restrict the function’s domain in a way the type system does not have any special support for, so your conclusion might be “I guess the type system just can’t do this.” People tend to call this “running up against the limits of the type system.”

But what if we took a different perspective? Recall that in Haskell, lists aren’t built-in datatypes, they’re just ordinary datatypes defined in the standard library:⁴

data List a = Nil | Cons a (List a)

This type might be a bit confusing at first if you have not written any Haskell, since it’s recursive. All of these are valid values of type List Int:

Nil
Cons 1 Nil
Cons 1 (Cons 2 Nil)
Cons 1 (Cons 2 (Cons 3 Nil))

The recursive nature of Cons is what gives our user-defined datatype the ability to hold any number of values: we can have any number of nested Conses we want before we terminate the list with a final Nil.

If we wanted to define an EvenList type in Haskell, we might end up thinking along the same lines we did before, that we need some fancy type system extension so we can restrict List to exclude lists with odd numbers of elements. But that’s focusing on the negative space of things we want to exclude… what if instead, we focused on the positive space of things we want to include?

What do I mean by that? Well, we could define an entirely new type that’s just like List, but we make it impossible to ever include an odd number of elements:

data EvenList a = EvenNil | EvenCons a a (EvenList a)

Here are some valid values of type EvenList Int:

EvenNil
EvenCons 1 2 EvenNil
EvenCons 1 2 (EvenCons 3 4 EvenNil)

Lo and behold, a datatype that can only ever include even numbers of elements!

Now, at this point you might realize that this is kind of silly. We don’t need to invent an entirely new datatype for this! We could just create a list of pairs:

type EvenList a = List (a, a)

Now values like Cons (1, 2) (Cons (3, 4) Nil) would be valid values of type EvenList Int, and we wouldn’t have to reinvent lists. But again, this is an approach based on thinking not about which values we want to exclude, but rather about how to structure our data such that those illegal values aren’t even constructible.

This is the essence of the Haskeller’s mantra, “Make illegal states unrepresentable,” and sadly it is often misinterpreted. It’s much easier to think “hm, I want to make these states illegal, how can I add some post-hoc restrictions to rule them out?” And indeed, this is why refinement types really are awesome, and when they’re available, by all means use them! But checking totally arbitrary properties at the type level is not tractable in general, and sometimes you need to think a little more outside the box.

Types as axiom schemas

So far in this blog post, I’ve repeatedly touched upon a handful of different ideas in a few different ways:

Instead of thinking about how to restrict, it can be useful to think about how to correctly construct.
In Haskell, datatype declarations invent new values out of thin air.
We can represent a lot of different data structures using the incredibly simple framework of “datatypes with several possibilities.”

Independently, those ideas might not seem deeply related, but in fact, they’re all essential to the Haskell school of data modeling. I want to now explore how we can unify them into a single framework that makes this seem less magical and more like an iterative design process.

In Haskell, when you define a datatype, you’re really defining a new, self-contained set of axioms and inference rules. That is rather abstract, so let’s make it more concrete. Consider the List type again:

data List a = Nil | Cons a (List a)

Viewed as an axiom schema, this type has one axiom and one inference rule:

The empty list is a list.
If you have a list, and you add an element to the beginning, the result is also a list.

The axiom is Nil, and the inference rule is Cons. Every list⁵ is constructed by starting with the axiom, Nil, followed by some number of applications of the inference rule, Cons.

We can take a similar approach when designing the EvenList type. The axiom is the same:

The empty list is a list with an even number of elements.

But our inference rule must preserve the invariant that the list always contains an even number of elements. We can do this by always adding two elements at a time:

If you have a list with an even number of elements, and you add two elements to the beginning, the result is also a list with an even number of elements.

This corresponds precisely to our EvenList declaration:

data EvenList a = EvenNil | EvenCons a a (EvenList a)
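
One way to see the axiom and the inference rule at work is to try building an EvenList from an ordinary list. The following is only a sketch: the fromList helper is hypothetical (it is not part of any library), and it uses Haskell's built-in list syntax rather than the List type defined above.

```haskell
data EvenList a = EvenNil | EvenCons a a (EvenList a)
  deriving (Eq, Show)

-- Hypothetical helper: try to build an EvenList from an ordinary list.
-- The empty list maps to the axiom, and each pair of elements maps to one
-- application of the inference rule. A leftover single element has no
-- EvenList counterpart, so the whole conversion yields Nothing.
fromList :: [a] -> Maybe (EvenList a)
fromList []           = Just EvenNil
fromList (x : y : zs) = EvenCons x y <$> fromList zs
fromList [_]          = Nothing

main :: IO ()
main = do
  print (fromList [1, 2, 3, 4 :: Int])
  print (fromList [1, 2, 3 :: Int])
```

Notice that there is simply no constructor to reach for in the odd case, which is why the function has to report failure rather than produce a malformed value.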

We can also go through this same reasoning process to come up with a type that represents non-empty lists. That type has just one inference rule:

If you have a list, and you add an element to the beginning, the result is a non-empty list.

That inference rule corresponds to the following datatype:

data NonEmptyList a = NonEmptyCons a (List a)
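
As a small illustration of what this buys us, here is a sketch of a function that returns the first element. The names neHead and listHead are hypothetical, chosen only for this example.

```haskell
data List a = Nil | Cons a (List a)
data NonEmptyList a = NonEmptyCons a (List a)

-- Total: every NonEmptyList was built with NonEmptyCons, so a first
-- element always exists and there is no failure case to handle.
neHead :: NonEmptyList a -> a
neHead (NonEmptyCons x _) = x

-- For comparison, the same operation on a plain List has to account for
-- Nil somehow; here it does so by returning a Maybe.
listHead :: List a -> Maybe a
listHead Nil        = Nothing
listHead (Cons x _) = Just x

main :: IO ()
main = do
  print (neHead (NonEmptyCons 'a' (Cons 'b' Nil)))
  print (listHead (Nil :: List Char))
```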

Of course, it’s possible to do this with much more than just lists. A particularly classic example is the constructive definition of natural numbers:

Zero is a natural number.
If you have a natural number, its successor (i.e. that number plus one) is also a natural number.

These are two of the Peano axioms, which can be represented in Haskell as the following datatype:

data Natural = Zero | Succ Natural

Using this type, Zero represents 0, Succ Zero represents 1, Succ (Succ Zero) represents 2, and so on. Just as EvenList allowed us to represent any list with an even number of elements but made other values impossible to even express, this Natural type allows us to represent all natural numbers, while other numbers (such as, for example, negative integers) are impossible to express.

Now, of course, all this hinges on our interpretation of the values we’ve invented! We have chosen to interpret Zero as 0 and Succ n as n + 1, but that interpretation is not inherent to Natural’s definition—it’s all in our heads! We could choose to interpret Succ n as n - 1 instead, in which case we would only be able to represent non-positive integers, or we could interpret Zero as 1 and Succ n as n * 2, in which case we could only represent powers of two.
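
To make the point about interpretation concrete, here is a sketch in which each of the three readings is written as an ordinary function. The function names are hypothetical; nothing about the Natural declaration favors one of them over the others.

```haskell
data Natural = Zero | Succ Natural

-- The usual reading: Zero is 0 and Succ n is n + 1.
asCount :: Natural -> Integer
asCount Zero     = 0
asCount (Succ n) = asCount n + 1

-- Alternative reading: Succ n is n - 1, so only non-positive integers appear.
asNonPositive :: Natural -> Integer
asNonPositive Zero     = 0
asNonPositive (Succ n) = asNonPositive n - 1

-- Alternative reading: Zero is 1 and Succ n is n * 2, so only powers of two appear.
asPowerOfTwo :: Natural -> Integer
asPowerOfTwo Zero     = 1
asPowerOfTwo (Succ n) = asPowerOfTwo n * 2

main :: IO ()
main = do
  let three = Succ (Succ (Succ Zero))
  print (asCount three, asNonPositive three, asPowerOfTwo three)
```

The same value, Succ (Succ (Succ Zero)), comes out as 3, -3, or 8 depending solely on which function we choose to apply.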

I find that people sometimes find this approach troubling, or at least counterintuitive. Is Succ (Succ Zero) really 2? It certainly doesn’t look like a number we’re used to writing. When someone thinks “I need a datatype for a number greater than or equal to zero,” they’re going to reach for the type in their programming language called number or int, not think to invent a recursive datatype. And admittedly, the Natural type defined here is not very practical: it’s an incredibly inefficient representation of natural numbers.

But in less contrived situations, this approach is practical, and in fact it’s highly useful! The quibble that an EvenList Int isn’t “really” a List Int is rather meaningless, seeing as our definition of List was just as arbitrary. A great deal of our jobs as programmers is imbuing arbitrary symbols with meaning; at some point someone decided that the number 65 would correspond to the capital letter A, and it was no less arbitrary then.

So when you have a property you want to capture in your types, take a step back and think about it for a little bit. Is there a way you can structure your data so that, no matter how you build it, the result is always a valid value? In other words, don’t try to add post-hoc restrictions to exclude bad values, make your datatypes correct by construction.

“But what if I don’t write Haskell?” And other closing thoughts

I write Haskell for a living, and I wrote this blog post with both my coworkers and the broader Haskell community in mind, but if I had only written it with those people in mind, it wouldn’t make sense to have spent so much time explaining basic Haskell. These techniques can be used in almost any statically typed programming language, though it’s certainly easier in some than others.

I don’t want people to come away from this blog post with an impression that I think TypeScript is a bad language, or that I’m claiming Haskell can do things TypeScript can’t. In fact, TypeScript can do all the things I’ve talked about in this blog post! As proof, here are TypeScript definitions of both EvenList and Natural:

type EvenList<T> = [] | [T, T, EvenList<T>];
type Natural = "zero" | { succ: Natural };

If anything, the real point of this blog post is that a type system does not have a well-defined list of things it “can prove” and “can’t prove.” Languages like TypeScript don’t really encourage this approach to data modeling, where you restructure your values in a certain way so as to guarantee certain properties. Rather, they prefer to add increasingly sophisticated constraints and type system features that can capture the properties people want to capture without having to change their data representation.

And in general, that’s great!

Being able to reuse the same data representation is hugely beneficial. Functions like map and filter already exist for ordinary lists/arrays, but a home-grown EvenList type needs its own versions. Passing an EvenList to a function that expects a list requires explicitly converting between the two. All these things have both code complexity and performance costs, and type system features that make these issues just invisibly disappear are obviously a good thing.
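
For a sense of what that cost looks like, here is a rough sketch of the extra code a home-grown EvenList needs in Haskell. The names mapEven and toList are hypothetical helpers written for this example.

```haskell
data EvenList a = EvenNil | EvenCons a a (EvenList a)
  deriving (Eq, Show)

-- A hand-written counterpart to map; the ordinary map only works on
-- ordinary lists, so it cannot be reused here.
mapEven :: (a -> b) -> EvenList a -> EvenList b
mapEven _ EvenNil           = EvenNil
mapEven f (EvenCons x y zs) = EvenCons (f x) (f y) (mapEven f zs)

-- The explicit conversion required before calling any function that
-- expects an ordinary list. It walks and rebuilds the whole structure.
toList :: EvenList a -> [a]
toList EvenNil           = []
toList (EvenCons x y zs) = x : y : toList zs

main :: IO ()
main = do
  let xs = EvenCons 1 2 (EvenCons 3 4 EvenNil) :: EvenList Int
  print (mapEven (* 10) xs)
  print (sum (toList xs))
```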

But the danger of treating the type system this way is that it means you may find yourself unsure what to do when suddenly you have a new requirement that the type system doesn’t provide built-in support for. What then? Do you start punching holes through your type system? The more you do that, the less useful the type system becomes: type systems are great at detecting how changes in one part of a codebase can impact seemingly-unrelated areas in surprising ways, but every unsafe cast or use of any is a hard stop, a point past which the typechecker cannot propagate information. Do that once or twice in a leaf function, it’s okay, but do that even just a half dozen times in your application’s connective tissue, and your type system might not be able to catch those things anymore.

Even if it isn’t a technique you use every day, it’s worth getting comfortable tweaking your data representation to preserve those guarantees. It’s a magical experience having the typechecker teach you things about your domain you hadn’t even considered simply because you got a type error and started thinking through why. Yes, it’s extra work, but trust me: it’s a lot more pleasant to work for your typechecker when you know exactly how much your typechecker is working for you.

¹ Sort of. TypeScript will try to infer type annotations based on how variables and functions are used, but by default, it falls back on the dynamic, unchecked any type if it can’t find a solution that makes the program typecheck. That behavior can be changed via a configuration option, but that isn’t relevant here: I’m just trying to illustrate a perspective, not make any kind of value judgment about TypeScript specifically.
² Sort of. Haskell does have a limited notion of subtyping when polymorphism is involved; for example, the type forall a. a -> a is a subtype of the type Int -> Int. But Haskell does not have anything resembling inheritance (e.g. there is no common Number supertype that includes both Int and Double) nor does it have untagged unions (e.g. the argument to a function cannot be something like Int | String, you must define a wrapper type like data IntOrString = AnInt Int | AString String).
³ Lists, tuples, and strings do technically have special syntax, which is built into the compiler, but there is truly nothing special about their semantics. They would work exactly the same way without the syntax, the code would just look less pretty.
⁴ Haskell programmers will notice that this is not actually the definition of the list type, since the real list type uses special syntax, but I wanted to keep things as simple as possible for this blog post.
⁵ Ignoring infinite lists, but the fact that infinite lists are representable in Haskell is outside the scope of this blog post.

No, dynamic type systems are not inherently more open

2020-01-19T00:00:00Z

Internet debates about typing disciplines continue to be plagued by a pervasive myth that dynamic type systems are inherently better at modeling “open world” domains. The argument usually goes like this: the goal of static typing is to pin everything down as much as possible, but in the real world, that just isn’t practical. Real systems should be loosely coupled and worry about data representation as little as possible, so dynamic types lead to a more robust system in the large.

This story sounds compelling, but it isn’t true. The flaw is in the premise: static types are not about “classifying the world” or pinning down the structure of every value in a system. The reality is that static type systems allow specifying exactly how much a component needs to know about the structure of its inputs, and conversely, how much it doesn’t. Indeed, in practice static type systems excel at processing data with only a partially-known structure, as they can be used to ensure application logic doesn’t accidentally assume too much.

Two typing fallacies

I’ve wanted to write this blog post for a while, but what finally made me decide to do it were misinformed comments responding to my previous blog post. Two comments in particular caught my eye, the first of which was posted on /r/programming:

Strongly disagree with the post […] it promotes a fundamentally entangled and static view of the world. It assumes that we can or should theorize about what is "valid" input at the edge between the program and the world, thus introducing a strong sense of coupling through the entire software, where failure to conform to some schema will automatically crash the program.
This is touted as a feature here but imagine if the internet worked like this. A server changes their JSON output, and we need to recompile and reprogram the entire internet. This is the static view that is promoted as a feature here. […] The "parser mentality" is fundamentally rigid and global, whereas robust system design should be decentralised and leave interpretation of data to the receiver.

Given the argument being made in the blog post—that you should use precise types whenever possible—one can see where this misinterpretation comes from. How could a proxy server possibly be written in such a style, since it cannot anticipate the structure of its payloads? The commenter’s conclusion is that strict static typing is at odds with programs that don’t know the structure of their inputs ahead of time.

The second comment was left on Hacker News, and it is significantly shorter than the first one:

What would be the type signature of, say, Python's pickle.load()?

This is a different kind of argument, one that relies on the fact that the types of reflective operations may depend on runtime values, which makes them challenging to capture with static types. This argument suggests that static types limit expressiveness because they forbid such operations outright.

Both these arguments are fallacious, but in order to show why, we have to make explicit an implicit claim. The two comments focus primarily on illustrating how static type systems can’t process data of an unknown shape, but they simultaneously advance an implicit belief: that dynamically typed languages can process data of an unknown shape. As we’ll see, this belief is misguided; programs are not capable of processing data of a truly unknown shape regardless of typing discipline, and static type systems only make already-present assumptions explicit.

You can’t process what you don’t know

The claim is simple: in a static type system, you must declare the shape of data ahead of time, but in a dynamic type system, the type can be, well, dynamic! It sounds self-evident, so much so that Rich Hickey has practically built a speaking career upon its emotional appeal. The only problem is it isn’t true.

The hypothetical scenario usually goes like this. Say you have a distributed system, and services in the system emit events that can be consumed by any other service that might need them. Each event is accompanied by a payload, which listening services can use to inform further action. The payload itself is minimally-structured, schemaless data encoded using a generic interchange format such as JSON or EDN.

As a simple example, a login service might emit an event like this one whenever a new user signs up:

{
  "event_type": "signup",
  "timestamp": "2020-01-19T05:37:09Z",
  "data": {
    "user": {
      "id": 42,
      "name": "Alyssa",
      "email": "alyssa@example.com"
    }
  }
}

Some downstream services might listen for these signup events and take further action whenever they are emitted. For example, a transactional email service might send a welcome email whenever a new user signs up. If the service were written in JavaScript, the handler might look something like this:

const handleEvent = ({ event_type, data }) => {
  switch (event_type) {
    case 'login':
      /* ... */
      break
    case 'signup':
      sendEmail(data.user.email, `Welcome to Blockchain Emporium, ${data.user.name}!`)
      break
  }
}

But what if this service were written in Haskell instead? Since we are good, reality-fearing Haskell programmers who parse rather than validate, the Haskell code might look something like this instead:

data Event = Login LoginPayload | Signup SignupPayload
data LoginPayload = LoginPayload { userId :: Int }
data SignupPayload = SignupPayload
  { userId :: Int
  , userName :: Text
  , userEmail :: Text }

instance FromJSON Event where
  parseJSON = withObject "Event" \obj -> do
    eventType <- obj .: "event_type"
    case eventType of
      "login" -> Login <$> (obj .: "data")
      "signup" -> Signup <$> (obj .: "data")
      _ -> fail $ "unknown event_type: " <> eventType

instance FromJSON LoginPayload where { ... }
instance FromJSON SignupPayload where { ... }

handleEvent :: JSON.Value -> IO ()
handleEvent payload = case fromJSON payload of
  Success (Login LoginPayload { userId }) -> {- ... -}
  Success (Signup SignupPayload { userName, userEmail }) ->
    sendEmail userEmail $ "Welcome to Blockchain Emporium, " <> userName <> "!"
  Error message -> fail $ "could not parse event: " <> message

It’s definitely more boilerplate, but some extra overhead for type definitions is to be expected (and is greatly exaggerated in such tiny examples), and the arguments we’re discussing aren’t about boilerplate, anyway. The real problem with this version of the code, according to the Reddit comment from earlier, is that the Haskell code has to be updated whenever a service adds a new event type! A new case has to be added to the Event datatype, and it must be given new parsing logic. And what about when new fields get added to the payload? What a maintenance nightmare.

In comparison, the JavaScript code is much more permissive. If a new event type is added, it will just fall through the switch and do nothing. If extra fields are added to the payload, the JavaScript code will just ignore them. Seems like a win for dynamic typing.

Except that no, it isn’t. The only reason the statically typed program fails if we don’t update the Event type is that we wrote handleEvent that way. We could just as easily have done the same thing in the JavaScript code, adding a default case that rejects unknown event types:

const handleEvent = ({ event_type, data }) => {
  switch (event_type) {
    /* ... */
    default:
      throw new Error(`unknown event_type: ${event_type}`)
  }
}

We didn’t do that, since in this case it would clearly be silly. If a service receives an event it doesn’t know about, it should just ignore it. This is a case where being permissive is clearly the correct behavior, and we can easily implement that in the Haskell code too:

handleEvent :: JSON.Value -> IO ()
handleEvent payload = case fromJSON payload of
  {- ... -}
  Error _ -> pure ()

This is still in the spirit of “parse, don’t validate” because we’re still parsing the values we do care about as early as possible, so we don’t fall into the double-validation trap. At no point do we take a code path that depends on a value being well-formed without first ensuring (with the help of the type system) that it is, in fact, actually well-formed. We don’t have to respond to an ill-formed value by raising an error! We just have to be explicit about ignoring it.

This illustrates an important point: the Event type in this Haskell code doesn’t describe “all possible events,” it describes all the events that the application cares about. Likewise, the code that parses those events’ payloads only worries about the fields the application needs, and it ignores extraneous ones. A static type system doesn’t require you to eagerly write a schema for the whole universe, it simply requires you to be up front about the things you need.
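
The idea can be sketched end to end without any JSON library. Everything below is an assumption made for illustration: the JSON type is a deliberately tiny stand-in for a real library's value type, field and parseEvent are hypothetical helpers, and the email is printed rather than sent. Only the signup event is modeled, since that is the only payload whose shape was shown above.

```haskell
-- A deliberately tiny stand-in for a real JSON library's value type.
data JSON
  = JNull
  | JNumber Double
  | JString String
  | JObject [(String, JSON)]
  deriving (Eq, Show)

-- Only the events and fields this application cares about.
data Event = Signup String String  -- user name, user email
  deriving (Eq, Show)

-- Look up one field of an object; any non-object yields Nothing.
field :: String -> JSON -> Maybe JSON
field key (JObject fields) = lookup key fields
field _   _                = Nothing

-- Nothing means "not an event we understand". Extra fields are never
-- inspected, so they cannot cause a failure.
parseEvent :: JSON -> Maybe Event
parseEvent payload = do
  JString eventType <- field "event_type" payload
  case eventType of
    "signup" -> do
      user <- field "data" payload >>= field "user"
      JString name  <- field "name" user
      JString email <- field "email" user
      Just (Signup name email)
    _ -> Nothing

handleEvent :: JSON -> IO ()
handleEvent payload = case parseEvent payload of
  Just (Signup name email) ->
    putStrLn ("to " ++ email ++ ": Welcome to Blockchain Emporium, " ++ name ++ "!")
  Nothing -> pure ()  -- explicitly ignore anything we do not understand

signupEvent :: JSON
signupEvent = JObject
  [ ("event_type", JString "signup")
  , ("timestamp", JString "2020-01-19T05:37:09Z")  -- present, never inspected
  , ("data", JObject
      [ ("user", JObject
          [ ("id", JNumber 42)
          , ("name", JString "Alyssa")
          , ("email", JString "alyssa@example.com")
          ])
      ])
  ]

main :: IO ()
main = do
  handleEvent signupEvent
  handleEvent (JObject [("event_type", JString "password_reset")])
```

The second call in main does nothing at all: an event type the program has never heard of is dropped by the explicit Nothing branch rather than crashing anything.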

This turns out to have a lot of pleasant benefits even though knowledge about inputs is limited:

It’s easy to discover the assumptions of the Haskell program just by looking at the type definitions. We know, for example, that this application doesn’t care about the timestamp field, since it never appears in any of the payload types. In the dynamically-typed program, we’d have to audit every code path to see whether or not it inspects that field, which would be a lot of error-prone work!
What’s more, it turns out the Haskell code doesn’t actually use the userId field inside the SignupPayload type, so that type is overly conservative. If we want to ensure it isn’t actually needed (since, for example, maybe we’re phasing out providing the user ID in that payload entirely), we need only delete that record field; if the code typechecks, we can be confident it really doesn’t depend on that field.
Finally, we neatly avoid all the gotchas related to shotgun parsing mentioned in the previous blog post, since we still haven’t compromised on any of those principles.

We’ve already invalidated the first half of the claim: that statically typed languages can’t deal with data where the structure isn’t completely known. Let’s now look at the other half, which states that dynamically typed languages can process data where the structure isn’t known at all. Maybe that still sounds right, but if you slow down and think about it more carefully, you’ll find it can’t be.

The above JavaScript code makes all the same assumptions our Haskell code does: it assumes event payloads are JSON objects with an event_type field, and it assumes signup payloads include data.user.name and data.user.email fields. It certainly can’t do anything useful with truly unknown input! If a new event payload is added, our JavaScript code can’t magically adapt to handle it simply because it is dynamically typed. Dynamic typing just means the types of values are carried alongside them at runtime and checked as the program executes; the types are still there, and this program still implicitly relies on them being particular things.

Keeping opaque data opaque

In the previous section, we debunked the idea that statically typed systems can’t process partially-known data, but if you have been paying close attention, you may have noticed it did not fully refute the original claim.

Although we were able to handle unknown data, we always simply discarded it, which would not fly if we were trying to implement some sort of proxying. For example, suppose we have a forwarding service that broadcasts events over a public network, attaching a signature to each payload to ensure it can’t be spoofed. We might implement this in JavaScript this way:

const handleEvent = (payload) => {
  const signedPayload = { ...payload, signature: signature(payload) }
  retransmitEvent(signedPayload)
}

In this case, we don’t care about the structure of the payload at all (the signature function just works on any valid JSON object), but we still have to preserve all the information. How could we do that in a statically typed language, since a statically-typed language would have to assign the payload a precise type?

Once again, the answer involves rejecting the premise: there’s no need to give data a type that’s any more precise than the application needs. The same logic could be written in a straightforward way in Haskell:

handleEvent :: JSON.Value -> IO ()
handleEvent (Object payload) = do
  let signedPayload = Map.insert "signature" (signature payload) payload
  retransmitEvent signedPayload
handleEvent payload = fail $ "event payload was not an object " <> show payload

In this case, since we don’t care about the structure of the payload, we manipulate a value of type JSON.Value directly. This type is extremely imprecise compared to our Event type from earlier—it can hold any legal JSON value, of any shape—but in this case, we want it to be imprecise.

Thanks to that imprecision, the type system helped us here: it caught the fact that we’re assuming the payload is a JSON object, not some other JSON value, and it made us handle the non-object cases explicitly. In this case we chose to raise an error, but of course, as before, you could choose some other form of recovery if you wanted to. You just have to be explicit about it.

Once more, note that the assumption we were forced to make explicit in Haskell is also made by the JavaScript code! If our JavaScript handleEvent function were called with a string rather than an object, it’s unlikely the behavior would be desirable, since an object spread on a string results in the following surprise:

> { ..."payload", signature: "sig" }
{0: "p", 1: "a", 2: "y", 3: "l", 4: "o", 5: "a", 6: "d", signature: "sig"}

Oops. Once again, the parsing style of programming has helped us out, since if we didn’t “parse” the JSON value into an object by matching on the Object case explicitly, our code would not compile, and if we left off the fallthrough case, we’d get a warning about non-exhaustive patterns.
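To make the shape of that argument concrete without depending on any particular JSON library, here is a minimal, self-contained sketch. The Value type, the signPayload name, and the idea of passing the signing function as an argument are all assumptions made for illustration; they are stand-ins, not the API of a real library.

```haskell
-- A tiny stand-in for a JSON library's value type (hypothetical; a real
-- program would use the library's own type instead).
data Value
  = Null
  | Bool Bool
  | Number Double
  | String String
  | Array [Value]
  | Object [(String, Value)]
  deriving (Eq, Show)

-- Attach a signature without inspecting any other field. The signing
-- function is a parameter because its details do not matter here.
signPayload :: (Value -> String) -> Value -> Either String Value
signPayload sign payload@(Object fields) =
  -- Every existing field is preserved untouched; only one field is added.
  Right (Object (fields ++ [("signature", String (sign payload))]))
signPayload _ other =
  -- The non-object case has to be handled explicitly.
  Left ("event payload was not an object: " <> show other)
```

The only thing this function knows about its input is that it is an object, and that single assumption is exactly what the pattern match records.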

Let’s look at one more example of this phenomenon before moving on. Suppose we’re consuming an API that returns user IDs, and suppose those IDs happen to be UUIDs. A straightforward interpretation of “parse, don’t validate” might suggest we represent user IDs in our Haskell API client using a UUID type:

type UserId = UUID

However, our Reddit commenter would likely take umbrage with this! Unless the API contract explicitly states that all user IDs will be UUIDs, this representation is overstepping our bounds. Although user IDs might be UUIDs today, perhaps they won’t be tomorrow, and then our code would break for no reason! Is this the fault of static type systems?

Again, the answer is no. This is a case of improper data modeling, but the static type system is not at fault—it has simply been misused. The appropriate way to represent a UserId is to define a new, opaque type:

newtype UserId = UserId Text
  deriving (Eq, FromJSON, ToJSON)

Unlike the type alias defined above which simply creates a new name for the existing UUID type, this declaration creates a totally new UserId type that is distinct from all other types, including Text. If we keep the datatype’s constructor private (that is, we don’t export it from the module that defines this type), then the only way to produce a UserId will be to go through its FromJSON parser. Dually, the only things you can do with a UserId are compare it with other UserIds for equality or serialize it using the ToJSON instance. Nothing else is permitted: the type system will prevent you from depending on the remote service’s internal representation of user IDs.

This illustrates another way that static type systems can provide strong, useful guarantees when manipulating completely opaque data. The runtime representation of a UserId is really just a string, but the type system does not allow you to accidentally use it like it’s a string, nor does it allow you to forge a new UserId out of thin air from an arbitrary string.¹
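As a rough illustration of what “keeping the constructor private” buys us, here is a sketch that uses only base. The parseUserId and renderUserId functions are hypothetical stand-ins for the FromJSON and ToJSON instances, and String stands in for Text; the export list is shown in a comment because it is the part that actually enforces the opacity.

```haskell
-- In a real project this would live in its own module, exporting the type
-- but not its constructor, for example:
--
--   module UserId (UserId, parseUserId, renderUserId) where

-- The wrapped representation is an implementation detail of this module.
newtype UserId = UserId String
  deriving (Eq)

-- The only way in: parse an identifier received from the remote service.
-- (Rejecting the empty string is an assumption made for this sketch.)
parseUserId :: String -> Maybe UserId
parseUserId raw
  | null raw = Nothing
  | otherwise = Just (UserId raw)

-- The only way out: serialize the identifier to send it back.
renderUserId :: UserId -> String
renderUserId (UserId raw) = raw
```

Code outside the defining module can compare two UserIds and round-trip them, but it cannot take one apart or assume it looks like a UUID.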

The type system is not a ball and chain forcing you to describe the representation of every value that enters and leaves your program in exquisite detail. Rather, it’s a tool that you can use in whatever way best suits your needs.

Reflection is not special

We’ve now thoroughly debunked the claims made by the first commenter, but the question posed by the second commenter may still seem like a loophole in our logic. What is the type of Python’s pickle.load()? For those unfamiliar, Python’s cutely-named pickle library allows serializing and deserializing entire Python object graphs. Any object can be serialized and stored in a file using pickle.dump(), and it can be deserialized at a later point in time using pickle.load().

What makes this appear challenging to our static type system is that the type of value produced by pickle.load() is difficult to predict—it depends entirely on whatever happened to be written to that file using pickle.dump(). This seems inherently dynamic, since we cannot possibly know what type of value it will produce at compile-time. At first blush, this is something a dynamically typed system can pull off, but a statically-typed one just can’t.

However, it turns out this situation is actually identical to the previous examples using JSON, and the fact that Python’s pickling serializes native Python objects directly does not change things. Why? Well, consider what happens after a program calls pickle.load(). Say you write the following function:

def load_value(f):
  val = pickle.load(f)
  # do something with `val`

The trouble is that val can now be of any type, and just as you can’t do anything useful with truly unknown, unstructured input, you can’t do anything with a value unless you know at least something about it. If you call any method or access any field on the result, then you’ve already made an assumption about what sort of thing pickle.load(f) returned—and it turns out those assumptions are val’s type!

For example, imagine the only thing you do with val is call the val.foo() method and return its result, which is expected to be a string. If we were writing Java, then the expected type of val would be quite straightforward—we’d expect it to be an instance of the following interface:

interface Foo extends Serializable {
  String foo();
}

And indeed, it turns out a pickle.load()-like function can be given a perfectly reasonable type in Java:

static <T extends Serializable> Optional<T> load(InputStream in, Class<? extends T> cls);

Nitpickers will complain that this isn’t the same as pickle.load(), since you have to pass a Class<T> token to choose what type of thing you want ahead of time. However, nothing is stopping you from passing Serializable.class and branching on the type later, after the object has been loaded. And that’s the key point: the instant you do anything with the object, you must know something about its type, even in a dynamically typed language! The statically-typed language just forces you to be more explicit about it, just as it did when we were talking about JSON payloads.

Can we do this in Haskell, too? Absolutely—we can use the serialise library, which has a similar API to the Java one mentioned above. It also happens to have a very similar interface to the Haskell JSON library, aeson, as it turns out the problem of dealing with unknown JSON data is not terribly different from dealing with an unknown Haskell value—at some point, you have to do a little bit of parsing to do anything with the value.
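The following sketch shows the same idea using nothing but base. It is not the serialise API: the Read class and readMaybe stand in for a real deserialisation class, and the load, Loaded, and loadAny names are made up for this example.

```haskell
import Control.Applicative ((<|>))
import Text.Read (readMaybe)

-- The caller picks the result type; decoding yields Nothing when the
-- stored data does not fit that type.
load :: Read a => String -> Maybe a
load = readMaybe

-- Deferring the decision is still possible: enumerate the shapes the
-- program is prepared to handle and branch on them after loading.
data Loaded
  = LoadedInt Int
  | LoadedInts [Int]
  | LoadedText String
  deriving (Eq, Show)

-- Try each accepted shape in turn, keeping the first one that decodes.
loadAny :: String -> Maybe Loaded
loadAny raw =
      (LoadedInt  <$> load raw)
  <|> (LoadedInts <$> load raw)
  <|> (LoadedText <$> load raw)
```

Either way, the assumptions about the loaded value end up written down somewhere: in the type annotation at the call site, or in the constructors of Loaded.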

That said, while you can emulate the dynamic typing of pickle.load() if you really want to by deferring the type check until the last possible moment, the reality is that doing so is almost never actually useful. At some point, you have to make assumptions about the structure of the value in order to use it, and you know what those assumptions are because you wrote the code. While there are extremely rare exceptions to this that require true dynamic code loading (such as, say, implementing a REPL for your programming language), they do not occur in day-to-day programming, and programmers in statically-typed languages are perfectly happy to supply their assumptions up front.

This is one of the fundamental disconnects between the static typing camp and the dynamic typing camp. Programmers working in statically-typed languages are perplexed when a programmer suggests they can do something in a dynamically typed language that a statically-typed language “fundamentally” prevents, since a programmer in a statically-typed language may reply that the value has simply not been given a sufficiently precise type. From the perspective of a programmer working in a dynamically-typed language, the type system restricts the space of legal behaviors, but from the perspective of a programmer working in a statically-typed language, the set of legal behaviors is a value’s type.

Neither of these perspectives is actually inaccurate, from the appropriate point of view. Static type systems do impose restrictions on program structure, as it is provably impossible to reject all bad programs in a Turing-complete language without also rejecting some good ones (this is Rice’s theorem). But it is simultaneously true that the impossibility of solving the general problem does not preclude solving a slightly more restricted version of the problem in a useful way, and a lot of the so-called “fundamental” inabilities of static type systems are not fundamental at all.

Appendix: the reality behind the myths

The key thesis of this blog post has now been delivered: static type systems are not fundamentally worse than dynamic type systems at processing data with an open or partially-known structure. The sorts of claims made in the comments cited at the beginning of this blog post are not accurate depictions of what statically-typed program construction is like, and they misunderstand the limitations of static typing disciplines while exaggerating the capabilities of dynamically typed disciplines.

However, although greatly exaggerated, these myths do have some basis in reality. They appear to have developed at least in part from a misunderstanding about the differences between structural and nominal typing. This difference is unfortunately too big to address in this blog post, as it could likely fill several blog posts of its own. About six months ago I attempted to write a blog post on the subject, but I didn’t think it came out very compelling, so I scrapped it. Maybe someday I’ll find a better way to communicate the ideas.

Although I can’t give it the full treatment it deserves right now, I’d still like to touch on the idea briefly so that interested readers may be able to find other resources on the subject should they wish to do so. The key idea is that many dynamically typed languages idiomatically reuse simple data structures like hashmaps to represent what in statically-typed languages are often represented by bespoke datatypes (usually defined as classes or structs).

These two styles facilitate very different flavors of programming. A JavaScript or Clojure program may represent a record as a hashmap from string or symbol keys to values, written using object or hash literals and manipulated using ordinary functions from the standard library that manipulate keys and values in a generic way. This makes it straightforward to take two records and union their fields or to take an arbitrary (or even dynamic) subselection of fields from an existing record.

In contrast, most static type systems do not allow such free-form manipulation of records because records are not maps at all but unique types distinct from all other types. These types are uniquely identified by their (fully-qualified) name, hence the term nominal typing. If you wish to take a subselection of a struct’s fields, you must define an entirely new struct; doing this often creates an explosion of awkward boilerplate.

This is one of the main ideas that Rich Hickey has discussed in many of his talks that criticize static typing. He has advanced the idea that this ability to fluidly merge, separate, and transform records makes dynamic typing particularly suited to the domain of distributed, open systems. Unfortunately, this rhetoric has two significant flaws:

It skirts too close to calling this a fundamental limitation of type systems, suggesting that it is not simply inconvenient but impossible to model such systems in a nominal, static type system. Not only is this not true (as this blog post has demonstrated), it misdirects people away from the point of his that actually has value: the practical, pragmatic advantage of a more structural approach to data modeling.
It confuses the structural/nominal distinction with the dynamic/static distinction, incorrectly creating the impression that the fluid merging and splitting of records represented as key-value maps is only possible in a dynamically typed language. In fact, not only can statically-typed languages support structural typing, many dynamically-typed languages also support nominal typing. These axes have historically loosely correlated, but they are theoretically orthogonal.

For counterexamples to these claims, consider Python classes, which are quite nominal despite being dynamic, and TypeScript interfaces, which are structural despite being static. Indeed, modern statically-typed languages are increasingly acquiring native support for structurally-typed records. In these systems, record types work much like hashes in Clojure—they are not distinct, named types but rather anonymous collections of key-value pairs—and they support many of the same expressive manipulation operations that Clojure’s hashes do, all within a statically-typed framework.

If you are interested in exploring static type systems with strong support for structural typing, I would recommend taking a look at any of TypeScript, Flow, PureScript, Elm, OCaml, or Reason, all of which have some sort of support for structurally typed records. What I would not recommend for this purpose is Haskell, which has abysmal support for structural typing; Haskell is (for various reasons outside the scope of this blog post) aggressively nominal.²

Does this mean Haskell is bad, or that it cannot be practically used to solve these kinds of problems? No, certainly not; there are many ways to model these problems in Haskell that work well enough, though some of them suffer from significant boilerplate. The core thesis of this blog post applies just as much to Haskell as it does to any of the other languages I mentioned above. However, I would be remiss not to mention this distinction, as it may give programmers from a dynamically-typed background who have historically found statically-typed languages much more frustrating to work with a better understanding of the real reason they feel that way. (Essentially all mainstream, statically-typed OOP languages are even more nominal than Haskell!)

As closing thoughts: this blog post is not intended to start a flame war, nor is it intended to be an assault on dynamically typed programming. There are many patterns in dynamically-typed languages that are genuinely difficult to translate into a statically-typed context, and I think discussions of those patterns can be productive. The purpose of this blog post is to clarify why one particular discussion is not productive, so please: stop making these arguments. There are much more productive conversations to have about typing than this.

¹ Technically, you could abuse the FromJSON instance to convert an arbitrary string to a UserId, but this would not be as easy as it sounds, since fromJSON can fail. This means you’d somehow have to handle that failure case, so this trick would be unlikely to get you very far unless you’re already in a context where you’re doing input parsing… at which point it would be easier to just do the right thing. So yes, the type system doesn’t prevent you from going out of your way to shoot yourself in the foot, but it guides you towards the right solution (and there is no safeguard in existence that can completely protect a programmer from making their own life miserable if they are determined to do so).
² I consider this to be Haskell’s most significant flaw at the time of this writing.

Parse, don’t validate

2019-11-05T00:00:00Z

Historically, I’ve struggled to find a concise, simple way to explain what it means to practice type-driven design. Too often, when someone asks me “How did you come up with this approach?” I find I can’t give them a satisfying answer. I know it didn’t just come to me in a vision—I have an iterative design process that doesn’t require plucking the “right” approach out of thin air—yet I haven’t been very successful in communicating that process to others.

However, about a month ago, I was reflecting on Twitter about the differences I experienced parsing JSON in statically- and dynamically-typed languages, and finally, I realized what I was looking for. Now I have a single, snappy slogan that encapsulates what type-driven design means to me, and better yet, it’s only three words long:

Parse, don’t validate.

The essence of type-driven design

Alright, I’ll confess: unless you already know what type-driven design is, my catchy slogan probably doesn’t mean all that much to you. Fortunately, that’s what the remainder of this blog post is for. I’m going to explain precisely what I mean in gory detail—but first, we need to practice a little wishful thinking.

The realm of possibility

One of the wonderful things about static type systems is that they can make it possible, and sometimes even easy, to answer questions like “is it possible to write this function?” For an extreme example, consider the following Haskell type signature:

foo :: Integer -> Void

Is it possible to implement foo? Trivially, the answer is no, as Void is a type that contains no values, so it’s impossible for any function to produce a value of type Void.¹ That example is pretty boring, but the question gets much more interesting if we choose a more realistic example:

head :: [a] -> a

This function returns the first element from a list. Is it possible to implement? It certainly doesn’t sound like it does anything very complicated, but if we attempt to implement it, the compiler won’t be satisfied:

head :: [a] -> a
head (x:_) = x

warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In an equation for ‘head’: Patterns not matched: []

This message is helpfully pointing out that our function is partial, which is to say it is not defined for all possible inputs. Specifically, it is not defined when the input is [], the empty list. This makes sense, as it isn’t possible to return the first element of a list if the list is empty—there’s no element to return! So, remarkably, we learn this function isn’t possible to implement, either.

Turning partial functions total

To someone coming from a dynamically-typed background, this might seem perplexing. If we have a list, we might very well want to get the first element in it. And indeed, the operation of “getting the first element of a list” isn’t impossible in Haskell, it just requires a little extra ceremony. There are two different ways to fix the head function, and we’ll start with the simplest one.

Managing expectations

As established, head is partial because there is no element to return if the list is empty: we’ve made a promise we cannot possibly fulfill. Fortunately, there’s an easy solution to that dilemma: we can weaken our promise. Since we cannot guarantee the caller an element of the list, we’ll have to practice a little expectation management: we’ll do our best to return an element if we can, but we reserve the right to return nothing at all. In Haskell, we express this possibility using the Maybe type:

head :: [a] -> Maybe a

This buys us the freedom we need to implement head—it allows us to return Nothing when we discover we can’t produce a value of type a after all:

head :: [a] -> Maybe a
head (x:_) = Just x
head []    = Nothing

Problem solved, right? For the moment, yes… but this solution has a hidden cost.

Returning Maybe is undoubtedly convenient when we’re implementing head. However, it becomes significantly less convenient when we want to actually use it! Since head always has the potential to return Nothing, the burden falls upon its callers to handle that possibility, and sometimes that passing of the buck can be incredibly frustrating. To see why, consider the following code:

getConfigurationDirectories :: IO [FilePath]
getConfigurationDirectories = do
  configDirsString <- getEnv "CONFIG_DIRS"
  let configDirsList = split ',' configDirsString
  when (null configDirsList) $
    throwIO $ userError "CONFIG_DIRS cannot be empty"
  pure configDirsList

main :: IO ()
main = do
  configDirs <- getConfigurationDirectories
  case head configDirs of
    Just cacheDir -> initializeCache cacheDir
    Nothing -> error "should never happen; already checked configDirs is non-empty"

When getConfigurationDirectories retrieves a list of file paths from the environment, it proactively checks that the list is non-empty. However, when we use head in main to get the first element of the list, the Maybe FilePath result still requires us to handle a Nothing case that we know will never happen! This is terribly bad for several reasons:

First, it’s just annoying. We already checked that the list is non-empty, why do we have to clutter our code with another redundant check?
Second, it has a potential performance cost. Although the cost of the redundant check is trivial in this particular example, one could imagine a more complex scenario where the redundant checks could add up, such as if they were happening in a tight loop.
Finally, and worst of all, this code is a bug waiting to happen! What if getConfigurationDirectories were modified to stop checking that the list is non-empty, intentionally or unintentionally? The programmer might not remember to update main, and suddenly the “impossible” error becomes not only possible, but probable.

The need for this redundant check has essentially forced us to punch a hole in our type system. If we could statically prove the Nothing case impossible, then a modification to getConfigurationDirectories that stopped checking if the list was empty would invalidate the proof and trigger a compile-time failure. However, as-written, we’re forced to rely on a test suite or manual inspection to catch the bug.

Paying it forward

Clearly, our modified version of head leaves some things to be desired. Somehow, we’d like it to be smarter: if we already checked that the list was non-empty, head should unconditionally return the first element without forcing us to handle the case we know is impossible. How can we do that?

Let’s look at the original (partial) type signature for head again:

head :: [a] -> a

The previous section illustrated that we can turn that partial type signature into a total one by weakening the promise made in the return type. However, since we don’t want to do that, there’s only one thing left that can be changed: the argument type (in this case, [a]). Instead of weakening the return type, we can strengthen the argument type, eliminating the possibility of head ever being called on an empty list in the first place.

To do this, we need a type that represents non-empty lists. Fortunately, the existing NonEmpty type from Data.List.NonEmpty is exactly that. It has the following definition:

data NonEmpty a = a :| [a]

Note that NonEmpty a is really just a tuple of an a and an ordinary, possibly-empty [a]. This conveniently models a non-empty list by storing the first element of the list separately from the list’s tail: even if the [a] component is [], the a component must always be present. This makes head completely trivial to implement:²

head :: NonEmpty a -> a
head (x:|_) = x

Unlike before, GHC accepts this definition without complaint—this definition is total, not partial. We can update our program to use the new implementation:

getConfigurationDirectories :: IO (NonEmpty FilePath)
getConfigurationDirectories = do
  configDirsString <- getEnv "CONFIG_DIRS"
  let configDirsList = split ',' configDirsString
  case nonEmpty configDirsList of
    Just nonEmptyConfigDirsList -> pure nonEmptyConfigDirsList
    Nothing -> throwIO $ userError "CONFIG_DIRS cannot be empty"

main :: IO ()
main = do
  configDirs <- getConfigurationDirectories
  initializeCache (head configDirs)

Note that the redundant check in main is now completely gone! Instead, we perform the check exactly once, in getConfigurationDirectories. It constructs a NonEmpty a from a [a] using the nonEmpty function from Data.List.NonEmpty, which has the following type:

nonEmpty :: [a] -> Maybe (NonEmpty a)

The Maybe is still there, but this time, we handle the Nothing case very early in our program: right in the same place we were already doing the input validation. Once that check has passed, we now have a NonEmpty FilePath value, which preserves (in the type system!) the knowledge that the list really is non-empty. Put another way, you can think of a value of type NonEmpty a as being like a value of type [a], plus a proof that the list is non-empty.
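For readers who want to try this, here is one way the pieces might fit together using only base. The splitOnComma helper is an assumption standing in for the unspecified split function above, parseConfigDirs and cacheDirFrom are names invented for this sketch, and lookupEnv is used in place of getEnv so that a missing variable is reported the same way as an empty one.

```haskell
import Control.Exception (throwIO)
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import qualified Data.List.NonEmpty as NonEmpty
import System.Environment (lookupEnv)

-- Hypothetical stand-in for `split ','`: break a string at every comma.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnComma rest

-- The single place where emptiness is checked. Empty segments are dropped
-- first, so an empty variable yields Nothing rather than a list of "".
parseConfigDirs :: String -> Maybe (NonEmpty FilePath)
parseConfigDirs = nonEmpty . filter (not . null) . splitOnComma

getConfigurationDirectories :: IO (NonEmpty FilePath)
getConfigurationDirectories = do
  raw <- lookupEnv "CONFIG_DIRS"
  case raw >>= parseConfigDirs of
    Just dirs -> pure dirs
    Nothing   -> throwIO (userError "CONFIG_DIRS cannot be empty")

-- Downstream code needs no Maybe: the type already rules out emptiness.
cacheDirFrom :: NonEmpty FilePath -> FilePath
cacheDirFrom = NonEmpty.head
```

Keeping the check in a pure function like parseConfigDirs also makes it easy to exercise without touching the environment.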

By strengthening the type of the argument to head instead of weakening the type of its result, we’ve completely eliminated all the problems from the previous section:

The code has no redundant checks, so there can’t be any performance overhead.
Furthermore, if getConfigurationDirectories changes to stop checking that the list is non-empty, its return type must change, too. Consequently, main will fail to typecheck, alerting us to the problem before we even run the program!

What’s more, it’s trivial to recover the old behavior of head from the new one by composing head with nonEmpty:

head' :: [a] -> Maybe a
head' = fmap head . nonEmpty

Note that the inverse is not true: there is no way to obtain the new version of head from the old one. All in all, the second approach is superior on all axes.

The power of parsing

You may be wondering what the above example has to do with the title of this blog post. After all, we only examined two different ways to validate that a list was non-empty—no parsing in sight. That interpretation isn’t wrong, but I’d like to propose another perspective: in my mind, the difference between validation and parsing lies almost entirely in how information is preserved. Consider the following pair of functions:

validateNonEmpty :: [a] -> IO ()
validateNonEmpty (_:_) = pure ()
validateNonEmpty [] = throwIO $ userError "list cannot be empty"

parseNonEmpty :: [a] -> IO (NonEmpty a)
parseNonEmpty (x:xs) = pure (x:|xs)
parseNonEmpty [] = throwIO $ userError "list cannot be empty"

These two functions are nearly identical: they check if the provided list is empty, and if it is, they abort the program with an error message. The difference lies entirely in the return type: validateNonEmpty always returns (), the type that contains no information, but parseNonEmpty returns NonEmpty a, a refinement of the input type that preserves the knowledge gained in the type system. Both of these functions check the same thing, but parseNonEmpty gives the caller access to the information it learned, while validateNonEmpty just throws it away.

These two functions elegantly illustrate two different perspectives on the role of a static type system: validateNonEmpty obeys the typechecker well enough, but only parseNonEmpty takes full advantage of it. If you see why parseNonEmpty is preferable, you understand what I mean by the mantra “parse, don’t validate.” Still, perhaps you are skeptical of parseNonEmpty’s name. Is it really parsing anything, or is it merely validating its input and returning a result? While the precise definition of what it means to parse or validate something is debatable, I believe parseNonEmpty is a bona-fide parser (albeit a particularly simple one).

Consider: what is a parser? Really, a parser is just a function that consumes less-structured input and produces more-structured output. By its very nature, a parser is a partial function—some values in the domain do not correspond to any value in the range—so all parsers must have some notion of failure. Often, the input to a parser is text, but this is by no means a requirement, and parseNonEmpty is a perfectly cromulent parser: it parses lists into non-empty lists, signaling failure by terminating the program with an error message.
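To see that the same pattern applies beyond lists, consider a small sketch of a parser in this broad sense. The Port type and the parsePort function are hypothetical, and unlike parseNonEmpty this version reports failure through Either rather than by throwing an exception; that is a design choice for the example, not a requirement.

```haskell
import Text.Read (readMaybe)

-- The more-structured output: an Int known to be a valid port number.
newtype Port = Port Int
  deriving (Eq, Show)

-- The less-structured input is an arbitrary string. Some strings have no
-- corresponding Port, so the result type must leave room for failure.
parsePort :: String -> Either String Port
parsePort raw =
  case readMaybe raw of
    Nothing -> Left ("not a number: " <> raw)
    Just n
      | n >= 1 && n <= 65535 -> Right (Port n)
      | otherwise -> Left ("port out of range: " <> show n)
```

Once a value of type Port exists, nothing downstream needs to re-check the range, provided the constructor is only used by parsePort.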

Under this flexible definition, parsers are an incredibly powerful tool: they allow discharging checks on input up-front, right on the boundary between a program and the outside world, and once those checks have been performed, they never need to be checked again! Haskellers are well-aware of this power, and they use many different types of parsers on a regular basis:

The aeson library provides a Parser type that can be used to parse JSON data into domain types.
Likewise, optparse-applicative provides a set of parser combinators for parsing command-line arguments.
Database libraries like persistent and postgresql-simple have a mechanism for parsing values held in an external data store.
The servant ecosystem is built around parsing Haskell datatypes from path components, query parameters, HTTP headers, and more.

The common theme between all these libraries is that they sit on the boundary between your Haskell application and the external world. That world doesn’t speak in product and sum types, but in streams of bytes, so there’s no getting around a need to do some parsing. Doing that parsing up front, before acting on the data, can go a long way toward avoiding many classes of bugs, some of which might even be security vulnerabilities.

One drawback to this approach of parsing everything up front is that it sometimes requires values be parsed long before they are actually used. In a dynamically-typed language, this can make keeping the parsing and processing logic in sync a little tricky without extensive test coverage, much of which can be laborious to maintain. However, with a static type system, the problem becomes marvelously simple, as demonstrated by the NonEmpty example above: if the parsing and processing logic go out of sync, the program will fail to even compile.

The danger of validation

Hopefully, by this point, you are at least somewhat sold on the idea that parsing is preferable to validation, but you may have lingering doubts. Is validation really so bad if the type system is going to force you to do the necessary checks eventually anyway? Maybe the error reporting will be a little bit worse, but a bit of redundant checking can’t hurt, right?

Unfortunately, it isn’t so simple. Ad-hoc validation leads to a phenomenon that the language-theoretic security field calls shotgun parsing. In the 2016 paper, The Seven Turrets of Babel: A Taxonomy of LangSec Errors and How to Expunge Them, its authors provide the following definition:

Shotgun parsing is a programming antipattern whereby parsing and input-validating code is mixed with and spread across processing code—throwing a cloud of checks at the input, and hoping, without any systematic justification, that one or another would catch all the “bad” cases.

They go on to explain the problems inherent to such validation techniques:

Shotgun parsing necessarily deprives the program of the ability to reject invalid input instead of processing it. Late-discovered errors in an input stream will result in some portion of invalid input having been processed, with the consequence that program state is difficult to accurately predict.

In other words, a program that does not parse all of its input up front runs the risk of acting upon a valid portion of the input, discovering a different portion is invalid, and suddenly needing to roll back whatever modifications it already executed in order to maintain consistency. Sometimes this is possible—such as rolling back a transaction in an RDBMS—but in general it may not be.

It may not be immediately apparent what shotgun parsing has to do with validation—after all, if you do all your validation up front, you mitigate the risk of shotgun parsing. The problem is that validation-based approaches make it extremely difficult or impossible to determine if everything was actually validated up front or if some of those so-called “impossible” cases might actually happen. The entire program must assume that raising an exception anywhere is not only possible, it’s regularly necessary.

Parsing avoids this problem by stratifying the program into two phases—parsing and execution—where failure due to invalid input can only happen in the first phase. The set of remaining failure modes during execution is minimal by comparison, and they can be handled with the tender care they require.
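To make the two phases concrete, here is a minimal sketch. The Command type and the parseCommand, parseAll, and execute functions are all hypothetical names invented for illustration; the point is only that every failure caused by bad input is confined to the first phase.

```haskell
import Text.Read (readMaybe)

-- Hypothetical command type; nothing here comes from a real library.
data Command
  = Deposit Int
  | Withdraw Int
  deriving (Eq, Show)

-- Phase 1: parsing. This is the only place invalid input can cause failure.
parseCommand :: String -> Either String Command
parseCommand line = case words line of
  ["deposit", n]  | Just k <- readMaybe n -> Right (Deposit k)
  ["withdraw", n] | Just k <- readMaybe n -> Right (Withdraw k)
  _ -> Left ("unrecognized command: " ++ line)

-- Reject the whole input if any line fails, before anything is executed.
parseAll :: [String] -> Either String [Command]
parseAll = traverse parseCommand

-- Phase 2: execution. It has no failure case for malformed input,
-- because every Command it receives has already been parsed.
execute :: Int -> [Command] -> Int
execute = foldl step
  where
    step balance (Deposit k)  = balance + k
    step balance (Withdraw k) = balance - k
```

Because parseAll uses traverse, a single bad line rejects the entire batch, so execute never runs against a partially valid input.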

Parsing, not validating, in practice

So far, this blog post has been something of a sales pitch. “You, dear reader, ought to be parsing!” it says, and if I’ve done my job properly, at least some of you are sold. However, even if you understand the “what” and the “why,” you might not feel especially confident about the “how.”

My advice: focus on the datatypes.

Suppose you are writing a function that accepts a list of tuples representing key-value pairs, and you suddenly realize you aren’t sure what to do if the list has duplicate keys. One solution would be to write a function that asserts there aren’t any duplicates in the list:

checkNoDuplicateKeys :: (MonadError AppError m, Eq k) => [(k, v)] -> m ()

However, this check is fragile: it’s extremely easy to forget. Because its return value is unused, it can always be omitted, and the code that needs it would still typecheck. A better solution is to choose a data structure that disallows duplicate keys by construction, such as a Map. Adjust your function’s type signature to accept a Map instead of a list of tuples, and implement it as you normally would.

Once you’ve done that, the call site of your new function will likely fail to typecheck, since it is still being passed a list of tuples. If the caller was given the value via one of its arguments, or if it received it from the result of some other function, you can continue updating the type from list to Map, all the way up the call chain. Eventually, you will either reach the location the value is created, or you’ll find a place where duplicates actually ought to be allowed. At that point, you can insert a call to a modified version of checkNoDuplicateKeys:

checkNoDuplicateKeys :: (MonadError AppError m, Ord k) => [(k, v)] -> m (Map k v)

Now the check cannot be omitted, since its result is actually necessary for the program to proceed!
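For the curious, one possible implementation might look like the following sketch. The AppError type and its DuplicateKey constructor are placeholders I made up, since the signature only names the type. It also requires Ord k, because Data.Map needs an ordering on its keys.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad (foldM)
import Control.Monad.Except (MonadError, throwError)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Hypothetical application error type; a real one would carry more detail.
data AppError = DuplicateKey
  deriving (Eq, Show)

-- Build the Map one pair at a time, failing on the first repeated key.
-- The caller cannot get a Map without going through the check.
checkNoDuplicateKeys :: (MonadError AppError m, Ord k) => [(k, v)] -> m (Map k v)
checkNoDuplicateKeys = foldM insertUnique Map.empty
  where
    insertUnique acc (k, v)
      | Map.member k acc = throwError DuplicateKey
      | otherwise        = pure (Map.insert k v acc)
```

Any monad with a MonadError AppError instance will do as m; Either AppError is the simplest choice for trying it out.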

This hypothetical scenario highlights two simple ideas:

Use a data structure that makes illegal states unrepresentable. Model your data using the most precise data structure you reasonably can. If ruling out a particular possibility is too hard using the encoding you are currently using, consider alternate encodings that can express the property you care about more easily. Don’t be afraid to refactor.
Push the burden of proof upward as far as possible, but no further. Get your data into the most precise representation you need as quickly as you can. Ideally, this should happen at the boundary of your system, before any of the data is acted upon.³
If one particular code branch eventually requires a more precise representation of a piece of data, parse the data into the more precise representation as soon as the branch is selected. Use sum types judiciously to allow your datatypes to reflect and adapt to control flow.

In other words, write functions on the data representation you wish you had, not the data representation you are given. The design process then becomes an exercise in bridging the gap, often by working from both ends until they meet somewhere in the middle. Don’t be afraid to iteratively adjust parts of the design as you go, since you may learn something new during the refactoring process!

Here are a handful of additional points of advice, arranged in no particular order:

Let your datatypes inform your code, don’t let your code control your datatypes. Avoid the temptation to just stick a Bool in a record somewhere because it’s needed by the function you’re currently writing. Don’t be afraid to refactor code to use the right data representation—the type system will ensure you’ve covered all the places that need changing, and it will likely save you a headache later.
Treat functions that return m () with deep suspicion. Sometimes these are genuinely necessary, as they may perform an imperative effect with no meaningful result, but if the primary purpose of that effect is raising an error, it’s likely there’s a better way.
Don’t be afraid to parse data in multiple passes. Avoiding shotgun parsing just means you shouldn’t act on the input data before it’s fully parsed, not that you can’t use some of the input data to decide how to parse other input data. Plenty of useful parsers are context-sensitive.
Avoid denormalized representations of data, especially if it’s mutable. Duplicating the same data in multiple places introduces a trivially representable illegal state: the places getting out of sync. Strive for a single source of truth.
- Keep denormalized representations of data behind abstraction boundaries. If denormalization is absolutely necessary, use encapsulation to ensure a small, trusted module holds sole responsibility for keeping the representations in sync.
Use abstract datatypes to make validators “look like” parsers. Sometimes, making an illegal state truly unrepresentable is just plain impractical given the tools Haskell provides, such as ensuring an integer is in a particular range. In that case, use an abstract newtype with a smart constructor to “fake” a parser from a validator.
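As an illustration of that last point, here is a small sketch of an abstract newtype with a smart constructor. The Percentage type and its functions are hypothetical, chosen only because "an integer in a particular range" is the example mentioned above.

```haskell
-- In a real project this would live in its own module, with an export list
-- that deliberately omits the data constructor, for example:
--
--   module Percentage (Percentage, mkPercentage, getPercentage) where
--
-- so that mkPercentage is the only way for other modules to build a value.
newtype Percentage = UnsafePercentage Int
  deriving (Eq, Ord, Show)

-- The "parser": it validates once and returns a value of a more precise type.
mkPercentage :: Int -> Maybe Percentage
mkPercentage n
  | n >= 0 && n <= 100 = Just (UnsafePercentage n)
  | otherwise          = Nothing

-- Consumers can rely on the range invariant without rechecking it.
getPercentage :: Percentage -> Int
getPercentage (UnsafePercentage n) = n
```

The invariant is still enforced by a runtime check, but the check lives in exactly one place, and downstream code sees a type that records that it has been done.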

As always, use your best judgement. It probably isn’t worth breaking out singletons and refactoring your entire application just to get rid of a single error "impossible" call somewhere—just make sure to treat those situations like the radioactive substance they are, and handle them with the appropriate care. If all else fails, at least leave a comment to document the invariant for whoever needs to modify the code next.

Recap, reflection, and related reading

That’s all, really. Hopefully this blog post proves that taking advantage of the Haskell type system doesn’t require a PhD, and it doesn’t even require using the latest and greatest of GHC’s shiny new language extensions—though they can certainly sometimes help! Sometimes the biggest obstacle to using Haskell to its fullest is simply being aware what options are available, and unfortunately, one downside of Haskell’s small community is a relative dearth of resources that document design patterns and techniques that have become tribal knowledge.

None of the ideas in this blog post are new. In fact, the core idea—“write total functions”—is conceptually quite simple. Despite that, I find it remarkably challenging to communicate actionable, practicable details about the way I write Haskell code. It’s easy to spend lots of time talking about abstract concepts—many of which are quite valuable!—without communicating anything useful about process. My hope is that this is a small step in that direction.

Sadly, I don’t know very many other resources on this particular topic, but I do know of one: I never hesitate to recommend Matt Parsons’s fantastic blog post Type Safety Back and Forth. If you want another accessible perspective on these ideas, including another worked example, I’d highly encourage giving it a read. For a significantly more advanced take on many of these ideas, I can also recommend Matt Noonan’s 2018 paper Ghosts of Departed Proofs, which outlines a handful of techniques for capturing more complex invariants in the type system than I have described here.

As a closing note, I want to say that doing the kind of refactoring described in this blog post is not always easy. The examples I’ve given are simple, but real life is often much less straightforward. Even for those experienced in type-driven design, it can be genuinely difficult to capture certain invariants in the type system, so do not consider it a personal failing if you cannot solve something the way you’d like! Consider the principles in this blog post ideals to strive for, not strict requirements to meet. All that matters is to try.

Technically, in Haskell, this ignores “bottoms,” constructions that can inhabit any type. These aren’t “real” values (unlike null in some other languages); they’re things like infinite loops or computations that raise exceptions, and in idiomatic Haskell, we usually try to avoid them, so reasoning that pretends they don’t exist still has value. But don’t take my word for it: I’ll let Danielsson et al. convince you that Fast and Loose Reasoning is Morally Correct. ↩
In fact, Data.List.NonEmpty already provides a head function with this type, but just for the sake of illustration, we’ll reimplement it ourselves. ↩
Sometimes it is necessary to perform some kind of authorization before parsing user input to avoid denial of service attacks, but that’s okay: authorization should have a relatively small surface area, and it shouldn’t cause any significant modifications to the state of your system. ↩

Empathy and subjective experience in programming languages

2019-10-19T00:00:00Z

A stereotype about programmers is that they like to think in black and white. Programmers like things to be good or bad, moral or immoral, responsible or irresponsible. Perhaps there is something romantic in the idea that programmers like to be as binary as the computers they program. Reductionist? Almost certainly, but hey, laugh at yourself a bit: we probably deserve to be made fun of from time to time.

Personally, I have no idea if the trope of the nuance-challenged programmer is accurate, but whether it’s a property of programmers or just humans behind a keyboard, the intensity with which we disagree with one another never ceases to amaze. Ask any group of working programmers what their least favorite programming language is, and there’s a pretty good chance things are going to get heated real fast. Why? What is it about programming that makes us feel so strongly that we are right and others are wrong, even when our experiences contradict those of tens or hundreds of thousands of others?

I think about that question a lot.

2015 called, and they want their dress back

Humans have a knack for caring intensely about the most trivial of things. Name almost anything—cats versus dogs, the appropriate way to fasten a necktie, or even which day of the week comes first—and someone somewhere has probably written an essay about it on an internet forum. It would be easy to throw up our hands and give up trying to understand our peers, as sometimes they seem like aliens from another planet.

However, what interests me is how the littlest things seem to get people the most upset. Few people have shouting matches over the best interpretation of quantum mechanics, but friendships will be tested when someone says they just aren’t that into Star Wars. One explanation for this phenomenon is simple accessibility: most people aren’t equipped to understand quantum mechanics well enough to argue about it, but almost anyone can have an opinion on which direction the toilet paper is supposed to go.¹

There is truth in that explanation, but personally, I don’t think it’s the whole story. Rather, I think we grow so used to the idea that our experiences are universal that discovering someone else experienced the exact same thing we did yet came to a different conclusion is not just frustrating: it’s incomprehensible.

Take 2015’s phenomenon of “the dress” as an example. Some people see black and blue, others white and gold, and frankly, whether you see one or the other has no impact on anything remotely meaningful. How did this—something so completely irrelevant—become a cross-cultural phenomenon reported on by major news outlets? My guess: people just aren’t used to the idea that vision—the primary way we sense the world—does not provide us with an objective, universal understanding of reality.

When something objective isn’t

Our culture and society works because, in spite of our differences, we’re still all humans. We eat food, we sleep, we like spending time with each other, and we like feeling connected to those around us. So when we watch a movie, and it tickles us in a way that makes us feel good, we can have an awfully hard time understanding how our best friend—who we largely agree with about everything—didn’t like it at all.

The truth, of course, is that very little of what we experience is in any way objective. Yes, we can be pretty confident that basic arithmetic is true anywhere in the universe, and that if we all agree a table is brown it probably is. There are even things we accept as subjective without a second thought, such as the kinds of food people like or the fashions they find attractive. It’s all the in-betweens that are so pernicious! “The dress” was so unbelievable to most people because, nine hundred and ninety-nine times out of a thousand, when two humans look at a picture, they at least mostly agree on the colors contained within. We do not consider that we are looking through different lenses at the same objective reality; we simply think we are perceiving objective truths directly.

In the case of the dress, whether you heard “yanny” or “laurel,” or whether you believe the Sonic games were ever any good, subjective disagreement is essentially harmless. But what about when it isn’t? Might incorrect beliefs that our experiences are universal cause genuine harm?

I think the answer is absolutely, unequivocally yes.

Subjectivity in programming, and in programming languages specifically

Quick question: which is better, functional or imperative programming?

My guess, given the usual subject of my blog, is that most of my readers would pick the former. However, the actual answer you chose doesn’t matter: I suspect you feel like you have a pretty rational argument to back it up. It certainly isn’t simply a matter of taste… right?

Well, no, I hope not. I don’t think the world is so subjective that we cannot ever advocate for one thing over another—we tried that whole “everything is XML” thing for a while, and I think we agreed it really wasn’t a good idea. But if you truly believe your answer to the above question can be completely objectively justified (as many do), how does one explain the average Hacker News comment thread on just about any post about Haskell?

I generally try not to read Hacker News if I can help it, as I find doing it mostly just makes me angry,² but I did happen to find a link to a recent discussion on a blog post about using Haskell in production. Let’s take a look at a few comments, shall we?

In a branch of the discussion, one user writes:

Haskell is great for business and great in production
I disagree. It's a beautiful language but it lacks a lot of features to make it useable at scale. And in my experience Haskell engineers are extremely smart but the environment/culture they create makes it difficult to foster team spirit.
I've been in 2 companies in the last 4 years who initially used Haskell in production. One has transitioned to Go and the other is rewriting the codebase in Rust.

The first paragraph is an assertion without many specifics, but it does sound like it could be reasonable. And although the last two sentences are entirely anecdotal, anecdotes are still better than hunches. Let’s see what someone else has to say in response:

I’ve met some pretty damn solid engineers who started on Haskell and, even at a junior level in other languages, produce an elegant solution far more easily than a senior engineer in that language. You probably wouldn’t put the code in production verbatim but you can very easily see what’s going on and it isn’t haunted by spectre of early abstraction, which IMO is the biggest flaw of OOP at scale.
[…]
From my naive perspective it’s easy to make classes out of everything, and to hold state and put side-effects everywhere, but you don’t want to deal with the trouble of a monad until you need it. So you have an automatic inclination towards cleaner code when you start functional and move on.

Also pretty vague and high-level, but also sounds reasonable. If you read either of these comments, and your first inclination was to grow frustrated and start crafting counter-arguments in your head, I encourage you to step outside your feelings momentarily (rational as they may be!) and try your very hardest to interpret them charitably. The discussion continues:

Haskell gives one plenty of rope to hang himself on complexity.
So much that developers develop an aversion to it as deep as fear. It's unavoidable, the ones that didn't develop it are still buried at the working of their first Rube Goldberg machine and unavailable.

Whether you think it’s accurate or not, there is definitely a perception held by a great many people that Haskell is a very complicated language. Surely at least some of them must have given it an honest shot, so have they just not “seen the light” yet? What do you think they’re missing? Perhaps a followup commenter can help elucidate things:

Hi, I find that everything people here are complaining about (and they're valid complaints) has also been true of C++. C++ developed a lot of its complexity (particularly 15-20 yrs ago in the template space) after it got popular, so people were already wed to it.
[…]
The C++ community's really gotten good in the last 5 years or so about reigning in the bad impulses and getting people to write clean, clear, efficient code that has reasonable expressiveness.
Coming into Haskell from C++, I have the same instincts. Haskell's been a pure pleasure. The benefits are really there, and they're easy to get. You just have to think of the trade-offs.

That argument seems reasonable, too. Everything in moderation, right? If you disagree, and you think Haskell is just not worth it, what does this person value that you don’t? What are they missing that you see?

The unsatisfying subjective reality of programming languages

You can probably see where I’m going with all this. These arguments are not built on hard, refutable facts or rigorous real-world evidence; they’re based on gut feelings and personal preferences. Does that mean they’re wrong, invalid, and worthless, and we should do studies to determine which language allows programmers to ship features the fastest and with the fewest bugs, then all agree to use that?

No!

These conversations are subjective because, for better or for worse, humans think in different ways and value different things, and programming languages are the medium in which we express ourselves. To many people who write Haskell (myself included), there is an effervescent joy in modeling a problem with the type system—like capturing something in amber—that others just don’t care about. What’s more, some people clearly loathe Haskell’s significant whitespace and plethora of infix operators, but I’ve never really minded. Is one of us wrong? If so, why? Talk about reliability all you want, but the few rigorous numbers we have don’t provide much evidence one way or the other.

While one commenter in the aforementioned Hacker News thread described Haskell as nothing less than “pain and torture,” another says they “did some Haskell in production and it was delightful.” People push excuses and rationalizations for these differences constantly—they point out that most people are exposed to imperative programming first, while others retort that Haskell is clearly not very widely used despite being around for an awfully long time—but none of their arguments ever seem to change people’s minds.

Often, people walk away from these conversations confused and incensed. To them, their point of view is so obviously apparent that it is hard to fathom anyone else seeing things differently. They rack their brains trying to figure out why their opponents just don’t get it. There must be some key point they’ve misunderstood, some joy they haven’t experienced, some sharp edge they haven’t yet been cut by. But no matter how much time they spend trying to reach these people, somehow, it’s never enough.

Empathy, and how bad results come from good intentions

I’ll admit that these kinds of discussions aren’t always fruitless; sometimes they really do manage to change people’s minds or help them see some new idea they had not been able to grasp. When people manage to keep their cool and acknowledge the differences in their mindsets while still helping people learn, everyone benefits.

Sadly, in my experience, this rarely happens. We have a natural tendency to become angry if people don’t see things the way we do; it’s confusing and disorienting, and it can even disgust us. None of those emotions are conducive to empathy. When we fail to account for the ways in which others might think differently, we voluntarily reject any insights we might have otherwise gained from the conversation because we did not allow ourselves to embrace, even just temporarily, someone else’s strange and perhaps uncomfortable set of values and experiences. We refuse to accept that our perception of color might not be as universal as we thought, and we miss out on the amazing insights we could learn about the nature of light, color, and human vision.

Although failing to empathize with those we are arguing with is bad enough, in my mind, this failure to accept the potential subjectivity of one’s own views has even worse, indirect effects. Take this comment for example, again from the same thread:

Sounds like you've barely programmed in Haskell and don't know what you're talking about. Haskell was the first language I learned. I didn't think this at all and I still don't. It doesn't strike me as any more difficult than learning Java or something.

I have no doubts that this commenter meant what they said: they didn’t find Haskell difficult to learn. The comment they were replying to was vitriolic and combative, so one could almost feel they had a smackdown coming to them… but this isn’t a private conversation. How do you think someone feels when they are learning Haskell, scroll through this thread, and find a comment that tells them they ought to find it easy? If they’ve been struggling, even a little bit, what do you think they might think?

If I were in their place, I might feel a little stupid. I might wonder if I’m really cut out for Haskell or if I should just give up. I definitely wouldn’t feel encouraged and excited to keep trying.

Who knows why this commenter found Haskell straightforward. Maybe they were exposed to certain concepts already, maybe it just fit their style of thinking, perhaps they’re even exceptionally smart. I don’t know. But no matter what the answer is, insulting the intelligence of others, even indirectly in this way, betrays a lack of empathy in the face of frustration, and although the intent may not have been to hurt, it can still be seriously harmful.

To be clear, I’m not saying the commenter should have pretended their experiences were different or even kept them to themselves. I don’t believe in being “fake nice”—in my experience, I am best equipped to reach people when speaking genuinely, from the heart. What I would have done is tell my story in a different way, perhaps by writing something like this:

It’s true that a lot of people find Haskell challenging, and I totally accept that some people just don’t think it’s worth it. It’s fine if you don’t want to write Haskell. But personally, I really enjoy writing it, as do the people I work with, and I think we ship great software with it because it aligns naturally—even joyfully!—with the way we like to think about program construction.
Personally, I didn’t find Haskell as challenging to learn as I think some people have, but it was still work, and in some ways I was just exposed to it at the right time. Other people I know have struggled quite a lot at first, and reasonably so, but they’ve still managed to become great Haskell programmers, and they found it worthwhile. Our team dynamic just wouldn’t be the same in any other language.

When I respond to comments I disagree with, I try to tell a personal story that provides a different perspective without invalidating their experiences. Sometimes the result is ungrateful snark anyway (or just no response at all), but you might be surprised how often talking from an emotional place about your own experiences—while being neither aggressive nor especially defensive—can go a long way. Perhaps you can even learn something if they return the favor and explain what they find frustrating, beyond the fundamental, subjective disagreements.

It’s okay to have opinions. It’s okay to like and dislike things. It’s okay to be frustrated that others don’t see things the way you do, and to advocate for the technologies and values you believe in. It’s just not okay to tell someone else their reality is wrong.

Learn to embrace the subjective differences between us all, and you won’t just be kinder. You’ll be happier.

This is where I’m supposed to put a snarky footnote saying something like “obviously, the correct way is blah,” but you deserve better. So you, uh, get a meta snarky footnote instead. ↩
Which, to be entirely fair, may well be as subjective as anything else in this blog post. ↩

Demystifying MonadBaseControl

2019-09-07T00:00:00Z

MonadBaseControl from the monad-control package is a confusing typeclass, and its methods have complicated types. For many people, it’s nothing more than scary, impossible-to-understand magic that is, for some reason, needed when lifting certain kinds of operations. Few resources exist that adequately explain how, why, and when it works, which sadly seems to have resulted in some FUD about its use.

There’s no doubt that the machinery of MonadBaseControl is complex, and the role it plays in practice is often subtle. However, its essence is actually much simpler than it appears, and I promise it can be understood by mere mortals. In this blog post, I hope to provide a complete survey of MonadBaseControl—how it works, how it’s designed, and how it can go wrong—in a way that is accessible to anyone with a firm grasp of monads and monad transformers. To start, we’ll motivate MonadBaseControl by reinventing it ourselves.

The higher-order action problem

Say we have a function with the following type:¹

foo :: IO a -> IO a

If we have an action built from a transformer stack like

bar :: StateT X IO Y

then we might wish to apply foo to bar, but that is ill-typed, since IO is not the same as StateT X IO. In cases like these, we often use lift, but it’s not good enough here: lift adds a new monad transformer to an action, but here we need to remove a transformer. So we need a function with a type like this:

unliftState :: StateT X IO Y -> IO Y

However, if you think about that type just a little bit, it’s clear something’s wrong: it throws away information, namely the state. You may remember that a StateT X IO Y action is equivalent to a function of type X -> IO (Y, X), so our hypothetical unliftState function has two problems:

We have no X to use as the initial state.
We’ll lose any modifications bar made to the state, since the result type is just Y, not (Y, X).
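Both problems are easy to observe if we try it anyway. The following is a sketch under some assumptions: foo here is a stand-in I made up (any function of type IO a -> IO a would do), and unliftWith and fooLossy are hypothetical names for the doomed approach, written against the transformers package.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, modify, runStateT)

-- Stand-in for `foo`; it just announces itself and runs the action.
foo :: IO a -> IO a
foo action = putStrLn "inside foo" *> action

-- Problem 1: the caller has to invent an initial state out of thin air.
unliftWith :: s -> StateT s IO a -> IO a
unliftWith initial m = evalStateT m initial

-- Problem 2: whatever `m` does to the state is discarded by evalStateT.
fooLossy :: s -> StateT s IO a -> StateT s IO a
fooLossy initial m = lift (foo (unliftWith initial m))

-- An action that modifies the state before returning.
bar :: StateT Int IO String
bar = modify (+ 1) *> pure "result"

-- bar runs against the made-up state 100, and its increment never
-- reaches the enclosing StateT, whose state stays at 0.
demo :: IO (String, Int)
demo = runStateT (fooLossy 100 bar) 0
```

Running demo yields the result paired with the untouched outer state, which is exactly the information loss described above.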

Clearly, we’ll need something more sophisticated, but what?

A naïve solution

Given that foo doesn’t know anything about the state, we can’t easily thread it through foo itself. However, by using runStateT explicitly, we could do some of the state management ourselves:

foo' :: StateT s IO a -> StateT s IO a
foo' m = do
  s <- get
  (v, s') <- lift $ foo (runStateT m s)
  put s'
  pure v

Do you see what’s going on there? It’s not actually very complicated: we get the current state, then pass it as the initial state to runStateT. This produces an action IO (a, s) that has closed over the current state. We can pass that action to foo without issue, since foo is polymorphic in the action’s return type. Finally, all we have to do is put the modified state back into the enclosing StateT computation, and we can get on with our business.

That strategy works okay when we only have one monad transformer, but it gets hairy quickly as soon as we have two or more. For example, if we had baz :: ExceptT X (StateT Y IO) Z, then we could do the same trick by getting the underlying

Y -> IO (Either X Z, Y)

function, closing over the state, restoring it, and doing the appropriate case analysis to re-raise any ExceptT errors, but that’s a lot of work to do for every single function! What we’d like to do instead is somehow abstract over the pattern we used to write foo' in a way that scales to arbitrary monad transformers.
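To get a feel for how much ceremony that involves, here is a sketch of the hand-written version for that two-layer stack. The name fooExceptState is hypothetical, foo is again a made-up stand-in, and the imports come from the transformers package rather than mtl.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, get, put, runStateT)

-- Stand-in for `foo`; any function of type IO a -> IO a would do.
foo :: IO a -> IO a
foo action = putStrLn "inside foo" *> action

fooExceptState :: ExceptT e (StateT s IO) a -> ExceptT e (StateT s IO) a
fooExceptState m = do
  -- Capture the current state from the StateT layer.
  s <- lift get
  -- Peel off both layers, yielding an IO (Either e a, s) that foo accepts.
  (result, s') <- lift (lift (foo (runStateT (runExceptT m) s)))
  -- Restore the output state, then re-raise the error if there was one.
  lift (put s')
  either throwE pure result

-- An action that changes the state and then fails.
failing :: ExceptT String (StateT Int IO) ()
failing = lift (put 5) *> throwE "boom"

demo :: IO (Either String (), Int)
demo = runStateT (runExceptT (fooExceptState failing)) 0
```

Both the state change and the error survive the trip through foo, but only because every layer was unwrapped and rewrapped by hand.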

The essence of `MonadBaseControl`

To build a more general solution for “unlifting” arbitrary monad transformers, we need to start thinking about monad transformer state. The technique we used to implement foo' followed this process:

Capture the action’s input state and close over it.
Package up the action’s output state with its result and run it.
Restore the action’s output state into the enclosing transformer.
Return the action’s result.

For StateT s, it turns out that the input state and output state are both s, but other monad transformers have state, too. Consider the input and output state for the following common monad transformers:

transformer	representation	input state	output state
`StateT s m a`	`s -> m (a, s)`	`s`	`s`
`ReaderT r m a`	`r -> m a`	`r`	`()`
`WriterT w m a`	`m (a, w)`	`()`	`w`

Notice how the input state is whatever is to the left of the ->, while the output state is whatever extra information gets produced alongside the result. Using the same reasoning, we can also deduce the input and output state for compositions of multiple monad transformers, such as the following:

transformer	representation	input state	output state
`ReaderT r (WriterT w m) a`	`r -> m (a, w)`	`r`	`w`
`StateT s (ReaderT r m) a`	`s -> r -> m (a, s)`	`(s, r)`	`s`
`WriterT w (StateT s m) a`	`s -> m ((a, w), s)`	`s`	`(w, s)`
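If you want to double-check rows like these, you can ask the compiler. The sketch below uses the run functions from the transformers package to unfold each stack. The unfold names and the counter action are hypothetical, and the arguments appear in the order the run functions peel them off.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State.Strict (StateT, get, put, runStateT)
import Control.Monad.Trans.Writer.Strict (WriterT, runWriterT)

-- Input state: r. Output state: w.
unfoldRW :: ReaderT r (WriterT w m) a -> r -> m (a, w)
unfoldRW m r = runWriterT (runReaderT m r)

-- Input state: both s and r. Output state: s.
unfoldSR :: StateT s (ReaderT r m) a -> s -> r -> m (a, s)
unfoldSR m s r = runReaderT (runStateT m s) r

-- Input state: s. Output state: both w and s.
unfoldWS :: WriterT w (StateT s m) a -> s -> m ((a, w), s)
unfoldWS m s = runStateT (runWriterT m) s

-- A small action that touches both layers of the second stack.
counter :: StateT Int (ReaderT Int IO) Int
counter = do
  s <- get
  r <- lift ask
  put (s + r)
  pure (s * r)
```

Everything to the left of the final arrow in each signature is input state, and everything paired with the result a is output state.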

Notice that when monad transformers are composed, their states are composed, too. This is useful to keep in mind, since our goal is to capture the four steps above in a typeclass, polymorphic in the state of the monad transformers we need to lift through. At minimum, we need two new operations: one to capture the input state and close over it (step 1) and one to restore the output state (step 3). One class we might come up with could look like this:

class MonadBase b m => MonadBaseControl b m | m -> b where
  type InputState m
  type OutputState m
  captureInputState :: m (InputState m)
  closeOverInputState :: m a -> InputState m -> b (a, OutputState m)
  restoreOutputState :: OutputState m -> m ()

If we can write instances of that typeclass for various transformers, we can use the class’s operations to implement foo' in a generic way that works with any combination of them:

foo' :: MonadBaseControl IO m => m a -> m a
foo' m = do
  s <- captureInputState
  let m' = closeOverInputState m s
  (v, s') <- liftBase $ foo m'
  restoreOutputState s'
  pure v

So how do we implement those instances? Let’s start with IO, since that’s the base case:

instance MonadBaseControl IO IO where
  type InputState IO = ()
  type OutputState IO = ()
  captureInputState = pure ()
  closeOverInputState m () = m <&> (, ())
  restoreOutputState () = pure ()

Not very exciting. The StateT s instance, on the other hand, is significantly more interesting:

instance MonadBaseControl b m => MonadBaseControl b (StateT s m) where
  type InputState (StateT s m) = (s, InputState m)
  type OutputState (StateT s m) = (s, OutputState m)
  captureInputState = (,) <$> get <*> lift captureInputState
  closeOverInputState m (s, ss) = do
    ((v, s'), ss') <- closeOverInputState (runStateT m s) ss
    pure (v, (s', ss'))
  restoreOutputState (s, ss) = lift (restoreOutputState ss) *> put s

This instance alone includes most of the key ideas behind MonadBaseControl. There’s a lot going on, so let’s break it down, step by step:

Start by examining the definitions of InputState and OutputState. Are they what you expected? You’d be forgiven for expecting the following:
```
type InputState (StateT s m) = s
type OutputState (StateT s m) = s
```
After all, that’s what we wrote in the table, isn’t it?
However, if you give it a try, you’ll find it doesn’t work. InputState and OutputState must capture the state of the entire monad, not just a single transformer layer, so we have to combine the StateT s state with the state of the underlying monad. In the simplest case we get
```
InputState (StateT s IO) = (s, ())
```
which is boring, but in a more complex case, we need to get something like this:
```
InputState (StateT s (ReaderT r IO)) = (s, (r, ()))
```
Therefore, InputState (StateT s m) combines s with InputState m in a tuple, and OutputState does the same.
Moving on, take a look at captureInputState and closeOverInputState. Just as InputState and OutputState capture the state of the entire monad, these functions need to be inductive in the same way.
captureInputState acquires the current state using get, and it combines it with the remaining monadic state using lift captureInputState. closeOverInputState uses the captured state to peel off the outermost StateT layer, then calls closeOverInputState recursively to peel off the rest of them.
Finally, restoreOutputState restores the state of the underlying monad stack, then restores the StateT state, ensuring everything ends up back the way it’s supposed to be.
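For readers who want to experiment, here is one way the pieces so far might be assembled into a single module that loads in GHCi. It is a sketch under a few assumptions that are not part of the class above: the base monad is fixed to IO so that liftIO can stand in for liftBase (avoiding a dependency on transformers-base), the class is renamed MonadIOControl to make that simplification obvious, and foo is a stand-in that merely prints around the action it is given.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, get, modify, put, runStateT)

-- Simplified, IO-only variant of the class from the text (hypothetical name).
class MonadIO m => MonadIOControl m where
  type InputState m
  type OutputState m
  captureInputState :: m (InputState m)
  closeOverInputState :: m a -> InputState m -> IO (a, OutputState m)
  restoreOutputState :: OutputState m -> m ()

-- Base case: IO has no transformer state at all.
instance MonadIOControl IO where
  type InputState IO = ()
  type OutputState IO = ()
  captureInputState = pure ()
  closeOverInputState m () = fmap (\a -> (a, ())) m
  restoreOutputState () = pure ()

-- Inductive case: pair this layer's state with the state of the layers below.
instance MonadIOControl m => MonadIOControl (StateT s m) where
  type InputState (StateT s m) = (s, InputState m)
  type OutputState (StateT s m) = (s, OutputState m)
  captureInputState = (,) <$> get <*> lift captureInputState
  closeOverInputState m (s, ss) = do
    ((v, s'), ss') <- closeOverInputState (runStateT m s) ss
    pure (v, (s', ss'))
  restoreOutputState (s, ss) = lift (restoreOutputState ss) *> put s

-- Stand-in for an IO-only function we would like to lift (an assumption).
foo :: IO a -> IO a
foo action = putStrLn "foo: start" *> action <* putStrLn "foo: end"

-- The generic lifted version, following the four steps from earlier.
foo' :: MonadIOControl m => m a -> m a
foo' m = do
  s <- captureInputState
  let m' = closeOverInputState m s
  (v, s') <- liftIO (foo m')
  restoreOutputState s'
  pure v

-- State changes made inside foo' survive the round trip through IO.
demo :: IO (String, Int)
demo = runStateT (foo' (modify (+ 1) *> pure "ok")) 0
```

Evaluating demo prints the two lines from foo and returns ("ok", 1), which shows that the modification made inside the lifted action was restored into the enclosing StateT.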

Take the time to digest all that—work through it yourself if you need to—as it’s a dense piece of code. Once you feel comfortable with it, take a look at the instances for ReaderT and WriterT as well:

instance MonadBaseControl b m => MonadBaseControl b (ReaderT r m) where
  type InputState (ReaderT r m) = (r, InputState m)
  type OutputState (ReaderT r m) = OutputState m
  captureInputState = (,) <$> ask <*> lift captureInputState
  closeOverInputState m (s, ss) = closeOverInputState (runReaderT m s) ss
  restoreOutputState ss = lift (restoreOutputState ss)

instance (MonadBaseControl b m, Monoid w) => MonadBaseControl b (WriterT w m) where
  type InputState (WriterT w m) = InputState m
  type OutputState (WriterT w m) = (w, OutputState m)
  captureInputState = lift captureInputState
  closeOverInputState m ss = do
    ((v, s'), ss') <- closeOverInputState (runWriterT m) ss
    pure (v, (s', ss'))
  restoreOutputState (s, ss) = lift (restoreOutputState ss) *> tell s

Make sure you understand these instances, too. It should be easier this time, since they share most of their structure with the StateT instance, but note the asymmetry that arises from the differing input and output states. (It may even help to try and write these instances yourself, focusing on the types whenever you get stuck.)

If you feel alright with them, then congratulations: you’re already well on your way to grokking MonadBaseControl!

Hiding the input state

So far, our implementation of MonadBaseControl works, but it’s actually slightly more complicated than it needs to be. As it happens, all valid uses of MonadBaseControl will always end up performing the following pattern:

s <- captureInputState
let m' = closeOverInputState m s

That is, we close over the input state as soon as we capture it. We can therefore combine captureInputState and closeOverInputState into a single function:

captureAndCloseOverInputState :: m a -> m (b (a, OutputState m))

What’s more, we no longer need the InputState associated type at all! This is an improvement, since it simplifies the API and removes the possibility for any misuse of the input state, since it’s never directly exposed. On the other hand, it has a more complicated type: it produces a monadic action that returns another monadic action. This can be a little more difficult to grok, which is why I presented the original version first, but it may help to consider how the above type arises naturally from the following definition:

captureAndCloseOverInputState m = closeOverInputState m <$> captureInputState

Let’s update the MonadBaseControl class to incorporate this simplification:

class MonadBase b m => MonadBaseControl b m | m -> b where
  type OutputState m
  captureAndCloseOverInputState :: m a -> m (b (a, OutputState m))
  restoreOutputState :: OutputState m -> m ()

We can then update all the instances to use the simpler API by simply fusing the definitions of captureInputState and closeOverInputState together:

instance MonadBaseControl IO IO where
  type OutputState IO = ()
  captureAndCloseOverInputState m = pure (m <&> (, ()))
  restoreOutputState () = pure ()

instance MonadBaseControl b m => MonadBaseControl b (StateT s m) where
  type OutputState (StateT s m) = (s, OutputState m)
  captureAndCloseOverInputState m = do
    s <- get
    m' <- lift $ captureAndCloseOverInputState (runStateT m s)
    pure $ do
      ((v, s'), ss') <- m'
      pure (v, (s', ss'))
  restoreOutputState (s, ss) = lift (restoreOutputState ss) *> put s

instance MonadBaseControl b m => MonadBaseControl b (ReaderT r m) where
  type OutputState (ReaderT r m) = OutputState m
  captureAndCloseOverInputState m = do
    s <- ask
    lift $ captureAndCloseOverInputState (runReaderT m s)
  restoreOutputState ss = lift (restoreOutputState ss)

instance (MonadBaseControl b m, Monoid w) => MonadBaseControl b (WriterT w m) where
  type OutputState (WriterT w m) = (w, OutputState m)
  captureAndCloseOverInputState m = do
    m' <- lift $ captureAndCloseOverInputState (runWriterT m)
    pure $ do
      ((v, s'), ss') <- m'
      pure (v, (s', ss'))
  restoreOutputState (s, ss) = lift (restoreOutputState ss) *> tell s

This is already very close to a full MonadBaseControl implementation. The captureAndCloseOverInputState implementations are getting a little out of hand, but bear with me—they’ll get simpler before this blog post is over.

Coping with partiality

Our MonadBaseControl class now works with StateT, ReaderT, and WriterT, but one transformer we haven’t considered is ExceptT. Let’s try to extend our table from before with a row for ExceptT:

transformer	representation	input state	output state
`ExceptT e m a`	`m (Either e a)`	`()`	`???`

Hmm… what is the output state for ExceptT?

The answer can’t be e, since we might not end up with an e: the computation might not fail. The type Maybe e would be closer… could that work?

Well, let’s try it. Let’s write a MonadBaseControl instance for ExceptT:

instance MonadBaseControl b m => MonadBaseControl b (ExceptT e m) where
  type OutputState (ExceptT e m) = (Maybe e, OutputState m)
  captureAndCloseOverInputState m = do
    m' <- lift $ captureAndCloseOverInputState (runExceptT m)
    pure $ do
      ((v, s'), ss') <- m'
      pure (v, (s', ss'))
  restoreOutputState (s, ss) = lift (restoreOutputState ss) *> case s of
    Just e -> throwError e
    Nothing -> pure ()

Sadly, the above implementation doesn’t typecheck; it is rejected with the following type error:

• Couldn't match type ‘Either e a’ with ‘(a, Maybe e)’
  Expected type: m (b ((a, Maybe e), OutputState m))
    Actual type: m (b (Either e a, OutputState m))
• In the second argument of ‘($)’, namely
    ‘captureAndCloseOverInputState (runExceptT m)’
  In a stmt of a 'do' block:
    m' <- lift $ captureAndCloseOverInputState (runExceptT m)
  In the expression:
    do m' <- lift $ captureAndCloseOverInputState (runExceptT m)
       return do ((v, s'), ss') <- m'
                 pure (v, (s', ss'))

We promised an (a, Maybe e), but we have an Either e a, and there’s certainly no way to get the former from the latter. Are we stuck? (If you’d like, take a moment to think about how you’d solve this type error before moving on, as it may be helpful for understanding the following solution.)

The fundamental problem here is partiality. The type of the captureAndCloseOverInputState method always produces an action in the base monad that includes an a in addition to some other output state. But ExceptT is different: when an error is raised, it doesn’t produce an a at all; it only produces an e. Therefore, as written, it’s impossible to give ExceptT a MonadBaseControl instance.

Of course, we’d very much like to give ExceptT a MonadBaseControl instance, so that isn’t very satisfying. Somehow, we need to change captureAndCloseOverInputState so that it doesn’t always need to produce an a. There are a few ways we could accomplish that, but an elegant way to do it is this:

class MonadBase b m => MonadBaseControl b m | m -> b where
  type WithOutputState m a
  captureAndCloseOverInputState :: m a -> m (b (WithOutputState m a))
  restoreOutputState :: WithOutputState m a -> m a

We’ve replaced the old OutputState associated type with a new WithOutputState type, and the key difference between them is that WithOutputState describes the type of a combination of the result (of type a) and the output state, rather than describing the type of the output state alone. For total monad transformers like StateT, ReaderT, and WriterT, WithOutputState m a will just be a tuple of the result value and the output state, the same as before. For example, here’s an updated MonadBaseControl instance for StateT:

instance MonadBaseControl b m => MonadBaseControl b (StateT s m) where
  type WithOutputState (StateT s m) a = WithOutputState m (a, s)
  captureAndCloseOverInputState m = do
    s <- get
    lift $ captureAndCloseOverInputState (runStateT m s)
  restoreOutputState ss = do
    (a, s) <- lift $ restoreOutputState ss
    put s
    pure a

Before we consider how this helps us with ExceptT, let’s pause for a moment and examine the revised StateT instance in detail, as there are some new things going on here:

Take a close look at the definition of WithOutputState (StateT s m) a. Note that we’ve defined it to be WithOutputState m (a, s), not (WithOutputState m a, s). Consider, for a moment, the difference between these types. Can you see why we used the former, not the latter?
If it’s unclear to you, that’s okay—let’s illustrate the difference with an example. Consider two similar monad transformer stacks:
```
m1 :: StateT s (ExceptT e IO) a
m2 :: ExceptT e (StateT s IO) a
```
Both these stacks contain StateT and ExceptT, but they are layered in a different order. What’s the difference? Well, consider what m1 and m2 return once fully unwrapped:
```
runExceptT (runStateT m1 s) :: IO (Either e (a, s))
runStateT (runExceptT m2) s :: IO (Either e a, s)
```
These results are meaningfully different: in m1, the state is discarded if an error is raised, but in m2, the final state is always returned, even if the computation is aborted. What does this mean for WithOutputState?
Here’s the important detail: the state is discarded when ExceptT is “inside” StateT, not the other way around. This can be counterintuitive, since the s ends up inside the Either when the StateT constructor is on the outside and vice versa. This is really just a property of how monad transformers compose, not anything specific to MonadBaseControl, so an explanation of why this happens is outside the scope of this blog post, but the relevant insight is that the m in StateT s m a controls the eventual action’s output state.
If we had defined WithOutputState (StateT s m) a to be (WithOutputState m a, s), we’d be in a pickle, since m would be unable to influence the presence of s in the output state. Therefore, we have no choice but to use WithOutputState m (a, s). (If you are still confused by this, try it yourself; you’ll find that there’s no way to make the other definition typecheck.)
Now that we’ve developed an intuitive understanding of why WithOutputState must be defined the way it is, let’s look at things from another perspective. Consider the type of runStateT once more:
```
runStateT :: StateT s m a -> s -> m (a, s)
```
Note that the result type is m (a, s), with the m on the outside. As it happens, this correspondence simplifies the definition of captureAndCloseOverInputState, since we no longer have to do any fiddling with its result—it’s already in the proper shape, so we can just return it directly.
Finally, this instance illustrates an interesting change to restoreOutputState. Since the a is now packed inside the WithOutputState m a value, the caller of captureAndCloseOverInputState needs some way to get the a back out! Conveniently, restoreOutputState can play that role, both restoring the output state and unpacking the result.
Even ignoring partial transformers like ExceptT, this is an improvement over the old API, as it conveniently prevents the programmer from forgetting to call restoreOutputState. However, as we’ll see shortly, it is much more than a convenience: once ExceptT comes into play, it is essential!
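The difference between the two layerings is easy to observe directly. Here is a small sketch using only the transformers package; m1 and m2 are concrete versions of the stacks discussed above (with s = Int and e = String chosen arbitrarily), and run1 and run2 are names made up for this example.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, modify, runStateT)

-- StateT on the outside, ExceptT underneath: an error discards the state.
m1 :: StateT Int (ExceptT String IO) ()
m1 = modify (+ 1) *> lift (throwE "boom")

-- ExceptT on the outside, StateT underneath: the state survives the error.
m2 :: ExceptT String (StateT Int IO) ()
m2 = lift (modify (+ 1)) *> throwE "boom"

-- The s ends up inside the Either, so it is lost on failure.
run1 :: IO (Either String ((), Int))
run1 = runExceptT (runStateT m1 0)

-- The s sits next to the Either, so it is always returned.
run2 :: IO (Either String (), Int)
run2 = runStateT (runExceptT m2) 0
```

Both actions increment the state and then fail, yet run1 yields Left "boom" with no state at all, while run2 yields (Left "boom", 1).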

With those details addressed, let’s return to ExceptT. Using the new interface, writing an instance for ExceptT is not only possible, it’s actually rather easy:

instance MonadBaseControl b m => MonadBaseControl b (ExceptT e m) where
  type WithOutputState (ExceptT e m) a = WithOutputState m (Either e a)
  captureAndCloseOverInputState m =
    lift $ captureAndCloseOverInputState (runExceptT m)
  restoreOutputState ss =
    either throwError pure =<< lift (restoreOutputState ss)

This instance illustrates why it’s so crucial that restoreOutputState have the aforementioned dual role: it must handle the case where no a exists at all! In the case of ExceptT, it restores the state in the enclosing monad by re-raising an error.

Now all that’s left to do is update the other instances:

instance MonadBaseControl IO IO where
  type WithOutputState IO a = a
  captureAndCloseOverInputState = pure
  restoreOutputState = pure

instance MonadBaseControl b m => MonadBaseControl b (ReaderT r m) where
  type WithOutputState (ReaderT r m) a = WithOutputState m a
  captureAndCloseOverInputState m = do
    s <- ask
    lift $ captureAndCloseOverInputState (runReaderT m s)
  restoreOutputState ss = lift $ restoreOutputState ss

instance (MonadBaseControl b m, Monoid w) => MonadBaseControl b (WriterT w m) where
  type WithOutputState (WriterT w m) a = WithOutputState m (a, w)
  captureAndCloseOverInputState m =
    lift $ captureAndCloseOverInputState (runWriterT m)
  restoreOutputState ss = do
    (a, s) <- lift $ restoreOutputState ss
    tell s
    pure a

Finally, we can update our lifted variant of foo to use the new interface so it will work with transformer stacks that include ExceptT:

foo' :: MonadBaseControl IO m => m a -> m a
foo' m = do
  m' <- captureAndCloseOverInputState m
  restoreOutputState =<< liftBase (foo m')

At this point, it’s worth considering something: although getting the MonadBaseControl class and instances right was a lot of work, the resulting foo' implementation is actually incredibly simple. That’s a good sign, since we only have to write the MonadBaseControl instances once (in a library), but we have to write functions like foo' quite often.
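To see the revised interface cope with ExceptT in practice, here is a self-contained sketch that can be loaded in GHCi. As assumptions for the sake of a runnable example, the base monad is fixed to IO (so liftIO replaces liftBase and no extra packages are needed), the class is renamed MonadIOControl, throwE from transformers is used in place of throwError, and foo is a stand-in that only prints around its argument.

```haskell
{-# LANGUAGE TypeFamilies #-}
-- Needed because the associated type instances below expand to another
-- family application that is not smaller than the instance head.
{-# LANGUAGE UndecidableInstances #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, get, modify, put, runStateT)

-- Simplified, IO-only variant of the final class (hypothetical name).
class MonadIO m => MonadIOControl m where
  type WithOutputState m a
  captureAndCloseOverInputState :: m a -> m (IO (WithOutputState m a))
  restoreOutputState :: WithOutputState m a -> m a

instance MonadIOControl IO where
  type WithOutputState IO a = a
  captureAndCloseOverInputState = pure
  restoreOutputState = pure

instance MonadIOControl m => MonadIOControl (StateT s m) where
  type WithOutputState (StateT s m) a = WithOutputState m (a, s)
  captureAndCloseOverInputState m = do
    s <- get
    lift (captureAndCloseOverInputState (runStateT m s))
  restoreOutputState ss = do
    (a, s) <- lift (restoreOutputState ss)
    put s
    pure a

instance MonadIOControl m => MonadIOControl (ExceptT e m) where
  type WithOutputState (ExceptT e m) a = WithOutputState m (Either e a)
  captureAndCloseOverInputState m =
    lift (captureAndCloseOverInputState (runExceptT m))
  -- Restoring the "state" of ExceptT means re-raising the error, if any.
  restoreOutputState ss =
    either throwE pure =<< lift (restoreOutputState ss)

-- Stand-in for an IO-only function we would like to lift (an assumption).
foo :: IO a -> IO a
foo action = putStrLn "foo: start" *> action <* putStrLn "foo: end"

foo' :: MonadIOControl m => m a -> m a
foo' m = do
  m' <- captureAndCloseOverInputState m
  restoreOutputState =<< liftIO (foo m')

-- A successful action: the state change is restored.
demoOk :: IO (Either String ((), Int))
demoOk = runExceptT (runStateT (foo' (modify (+ 1))) 0)

-- A failing action: the error is re-raised in the outer stack.
demoErr :: IO (Either String ((), Int))
demoErr =
  runExceptT (runStateT (foo' (modify (+ 1) *> lift (throwE "boom"))) 0)
```

Here demoOk returns Right ((), 1) and demoErr returns Left "boom", so both the state and the failure make it back out of foo intact.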

Scaling to the real `MonadBaseControl`

The MonadBaseControl class we implemented in the previous section is complete. It is a working, useful class that is equivalent in power to the “real” MonadBaseControl class in the monad-control library. However, if you compare the two, you’ll notice that the version in monad-control looks a little bit different. What gives?

Let’s compare the two classes side by side:

-- ours
class MonadBase b m => MonadBaseControl b m | m -> b where
  type WithOutputState m a
  captureAndCloseOverInputState :: m a -> m (b (WithOutputState m a))
  restoreOutputState :: WithOutputState m a -> m a

-- theirs
class MonadBase b m => MonadBaseControl b m | m -> b where
  type StM m a
  liftBaseWith :: (RunInBase m b -> b a) -> m a
  restoreM :: StM m a -> m a

Let’s start with the similarities, since those are easy:

Our WithOutputState associated type is precisely equivalent to their StM associated type; they just use a (considerably) shorter name.
Likewise, our restoreOutputState method is precisely equivalent to their restoreM method, simply under a different name.

That leaves captureAndCloseOverInputState and liftBaseWith. Those two methods both do similar things, but they aren’t identical, and that’s where all the differences lie. To understand liftBaseWith, let’s start by inlining the definition of the RunInBase type alias so we can see the fully-expanded type:

liftBaseWith
  :: MonadBaseControl b m
  => ((forall c. m c -> b (StM m c)) -> b a)
  -> m a

That type is complicated! However, if we break it down, hopefully you’ll find it’s not as scary as it first appears. Let’s reimplement the foo' example from before using liftBaseWith to show how this version of MonadBaseControl works:

foo' :: MonadBaseControl IO m => m a -> m a
foo' m = do
  s <- liftBaseWith $ \runInBase -> foo (runInBase m)
  restoreM s

This is, in some ways, superficially similar to the version we wrote using our version of MonadBaseControl. Just like in our version, we capture the input state, apply foo in the IO monad, then restore the state. But what exactly is doing the state capturing, and what is runInBase?

Let’s start by adding a type annotation to runInBase to help make it a little clearer what’s going on:

foo' :: forall m a. MonadBaseControl IO m => m a -> m a
foo' m = do
  s <- liftBaseWith $ \(runInBase :: forall b. m b -> IO (StM m b)) ->
    foo (runInBase m)
  restoreM s

That type should look sort of recognizable. If we replace StM with WithOutputState, then we get a type that looks very similar to that of our original closeOverInputState function, except it doesn’t need to take the input state as an argument. How does that work?

Here’s the trick: liftBaseWith starts by capturing the input state, just as before. However, it then builds a function, runInBase, which is like closeOverInputState partially-applied to the input state it captured. It hands that function to us, and we’re free to apply it to m, which produces the IO (StM m a) action we need, and we can now pass that action to foo. The result is returned in the outer monad, and we restore the state using restoreM.

Sharing the input state

At first, this might seem needlessly complicated. When we first started, we separated capturing the input state and closing over it into two separate operations (captureInputState and closeOverInputState), but we eventually combined them so that we could keep the input state hidden. Why does monad-control split them back into two operations again?

As it turns out, when lifting foo, there’s no advantage to the more complicated API of monad-control. In fact, we could implement our captureAndCloseOverInputState operation in terms of liftBaseWith, and we could use that to implement foo' the same way we did before:

captureAndCloseOverInputState :: MonadBaseControl b m => m a -> m (b (StM m a))
captureAndCloseOverInputState m = liftBaseWith $ \runInBase -> pure (runInBase m)

foo' :: MonadBaseControl IO m => m a -> m a
foo' m = do
  m' <- captureAndCloseOverInputState m
  restoreM =<< liftBase (foo m')

However, that approach has a downside once we need to lift more complicated functions. foo is exceptionally simple, as it only accepts a single input argument, but what if we wanted to lift a more complicated function that took two monadic arguments, such as this one:

bar :: IO a -> IO a -> IO a

We could implement that by calling captureAndCloseOverInputState twice, like this:

bar' :: MonadBaseControl IO m => m a -> m a -> m a
bar' ma mb = do
  ma' <- captureAndCloseOverInputState ma
  mb' <- captureAndCloseOverInputState mb
  restoreM =<< liftBase (bar ma' mb')

However, that would capture the monadic state twice, which is rather inefficient. By using liftBaseWith, the state capturing is done just once, and it’s shared between all calls to runInBase:

bar' :: MonadBaseControl IO m => m a -> m a -> m a
bar' ma mb = do
  s <- liftBaseWith $ \runInBase ->
    bar (runInBase ma) (runInBase mb)
  restoreM s

By providing a “running” function (runInBase) instead of direct access to the input state, liftBaseWith allows sharing the captured input state between multiple actions without exposing it directly.
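To make the shape of liftBaseWith more concrete, here is a rough sketch of how instances in this style might be written. It is not the monad-control source: the base monad is fixed to IO to keep the example free of extra dependencies, the class name MonadIOControl and the alias RunInIO are invented for this sketch, and bar is a stand-in that simply runs its two arguments in sequence.

```haskell
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeFamilies #-}
-- Needed for the StM instance on StateT, whose right-hand side is a
-- family application no smaller than the instance head.
{-# LANGUAGE UndecidableInstances #-}

import Control.Monad.Trans.State.Strict (StateT (..), modify)

-- IO-only analogue of the RunInBase alias (hypothetical name).
type RunInIO m = forall a. m a -> IO (StM m a)

-- Simplified class with the base monad fixed to IO (hypothetical name).
class Monad m => MonadIOControl m where
  type StM m a
  liftBaseWith :: (RunInIO m -> IO a) -> m a
  restoreM :: StM m a -> m a

instance MonadIOControl IO where
  type StM IO a = a
  liftBaseWith f = f id
  restoreM = pure

instance MonadIOControl m => MonadIOControl (StateT s m) where
  type StM (StateT s m) a = StM m (a, s)
  -- Capture s once, then hand the caller a runner that closes over it.
  liftBaseWith f = StateT $ \s -> do
    a <- liftBaseWith (\runInBase -> f (\action -> runInBase (runStateT action s)))
    pure (a, s)
  restoreM ss = StateT (\_ -> restoreM ss)

-- Stand-in for an IO-only function of two actions (an assumption).
bar :: IO a -> IO a -> IO a
bar first second = first *> second

-- Both calls to runInBase share the single captured input state.
bar' :: MonadIOControl m => m a -> m a -> m a
bar' ma mb = do
  s <- liftBaseWith $ \runInBase ->
    bar (runInBase ma) (runInBase mb)
  restoreM s

demo :: IO ((), Int)
demo = runStateT (bar' (modify (+ 1)) (modify (+ 10))) 0
```

With this particular bar, demo returns ((), 10) rather than ((), 11). Both actions start from the same captured state, and only the output state of the action whose result bar returns is restored.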

Sidebar: continuation-passing and impredicativity

One last point before we move on: although the above explains why captureAndCloseOverInputState is insufficient, you may be left wondering why liftBaseWith can’t just return runInBase. Why does it need to be given a continuation? After all, it would be nicer if we could just write this:

bar' :: MonadBaseControl IO m => m a -> m a -> m a
bar' ma mb = do
  runInBase <- askRunInBase
  restoreM =<< liftBase (bar (runInBase ma) (runInBase mb))

To understand the problem with a hypothetical askRunInBase function, remember that the type of runInBase is polymorphic:

runInBase :: forall a. m a -> b (StM m a)

This is important, since if you need to lift a function with a type like

baz :: IO b -> IO c -> IO (Either b c)

then you’ll want to instantiate that a variable with two different types. We’d need to retain that power in askRunInBase, so it would need to have the following type:

askRunInBase :: MonadBaseControl b m => m (forall a. m a -> b (StM m a))

Sadly, that type is illegal in Haskell. Type constructors must be applied to monomorphic types, but in the above type signature, m is applied to a polymorphic type.² The RankNTypes GHC extension introduces a single exception: the (->) type constructor is special and may be applied to polymorphic types. That’s why liftBaseWith is legal, but askRunInBase is not: since liftBaseWith is passed a higher-order function that receives runInBase as an argument, the polymorphic type appears immediately under an application of (->), which is allowed.

The aforementioned restriction means we’re basically out of luck, but if you really want askRunInBase, there is a workaround. GHC is perfectly alright with a field of a datatype being polymorphic, so we can define a newtype that wraps a suitably-polymorphic function:

newtype RunInBase b m = RunInBase (forall a. m a -> b (StM m a))

We can now alter askRunInBase to return our newtype, and we can implement it in terms of liftBaseWith:³

askRunInBase :: MonadBaseControl b m => m (RunInBase b m)
askRunInBase = liftBaseWith $ \runInBase -> pure $ RunInBase runInBase

To use askRunInBase, we have to pattern match on the RunInBase constructor, but it isn’t very noisy, since we can do it directly in a do binding. For example, we could implement a lifted version of baz this way:

baz' :: MonadBaseControl IO m => m a -> m b -> m (Either a b)
baz' ma mb = do
  RunInBase runInBase <- askRunInBase
  s <- liftBase (baz (runInBase ma) (runInBase mb))
  bitraverse restoreM restoreM s

As of version 1.0.2.3, monad-control does not provide a newtype like RunInBase, so it also doesn’t provide a function like askRunInBase. For now, you’ll have to use liftBaseWith, but it might be a useful future addition to the library.

Pitfalls

At this point in the blog post, we’ve covered the essentials of MonadBaseControl: how it works, how it’s designed, and how you might go about using it. However, so far, we’ve only considered situations where MonadBaseControl works well, and I’ve intentionally avoided examples where the technique breaks down. In this section, we’re going to take a look at the pitfalls and drawbacks of MonadBaseControl, plus some ways they can be mitigated.

No polymorphism, no lifting

All of the pitfalls of MonadBaseControl stem from the same root problem, and that’s the particular technique it uses to save and restore monadic state. We’ll start by considering one of the simplest ways that technique is thwarted, and that’s monomorphism. Consider the following two functions:

poly :: IO a -> IO a
mono :: IO X -> IO X

Even after all we’ve covered, it may surprise you to learn that although poly can be easily lifted to MonadBaseControl IO m => m a -> m a, it’s impossible to lift mono to MonadBaseControl IO m => m X -> m X. It’s a little unintuitive, as we often think of polymorphic types as being more complicated (so surely lifting polymorphic functions ought to be harder), but in fact, it’s the flexibility of polymorphism that allows MonadBaseControl to work in the first place.

To understand the problem, remember that when we lift a function of type forall a. b a -> b a using MonadBaseControl, we actually instantiate a to (StM m c). That produces a function of type b (StM m c) -> b (StM m c), which is isomorphic to the m c -> m c type we want. The instantiation step is easily overlooked, but it’s crucial, since otherwise we have no way to thread the state through the otherwise opaque function we’re trying to lift!

In the case of mono, that’s exactly the problem we’re faced with. mono will not accept an IO (StM m X) as an argument, only precisely an IO X, so we can’t pass along the monadic state. For all its machinery, MonadBaseControl is no help at all if no polymorphism is involved. Trying to generalize mono without modifying its implementation is a lost cause.

The dangers of discarded state

Our inability to lift mono is frustrating, but at least it’s conclusively impossible. In practice, however, many functions lie in an insidious in-between: polymorphic enough to be lifted, but not without compromises. The simplest of these functions have types such as the following:

sideEffect :: IO a -> IO ()

Unlike mono, it’s entirely possible to lift sideEffect:

sideEffect' :: MonadBaseControl IO m => m a -> m ()
sideEffect' m = liftBaseWith $ \runInBase -> sideEffect (runInBase m)

This definition typechecks, but you may very well prefer it didn’t, since it has a serious problem: any changes made by m to the monadic state are completely discarded once sideEffect' returns! Since sideEffect' never calls restoreM, there’s no way the state of m can be any different from the original state, but it’s impossible to call restoreM since we don’t actually get an StM m () result from sideEffect.

Sometimes this may be acceptable, since some monad transformers don’t actually have any output state anyway, such as ReaderT r. In other cases, however, sideEffect' could be a bug waiting to happen. One way to make sideEffect' safe would be to add a StM m a ~ a constraint to its context, since that guarantees the monad transformers being lifted through are stateless, and nothing is actually being discarded. Of course, that significantly restricts the set of monad transformers that can be lifted through.
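The discarded state is easy to observe if we specialize to a single concrete transformer. The sketch below avoids MonadBaseControl entirely and writes out by hand roughly what sideEffect' amounts to for StateT s IO; sideEffect is given a trivial definition here (void), and the names sideEffectState and demo are made up for illustration.

```haskell
import Control.Monad (void)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, execStateT, get, modify, runStateT)

-- A trivial stand-in with the type from the text (an assumption).
sideEffect :: IO a -> IO ()
sideEffect = void

-- Hand-specialized lifting for StateT s IO: the input state is captured,
-- but the output state returned by runStateT is thrown away by sideEffect.
sideEffectState :: StateT s IO a -> StateT s IO ()
sideEffectState m = do
  s <- get
  lift (sideEffect (runStateT m s))

-- The increment happens inside the lifted action, then vanishes.
demo :: IO Int
demo = execStateT (sideEffectState (modify (+ 1))) 0
```

Evaluating demo yields 0, not 1: the action really does run, but its state change never makes it back to the caller.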

Rewindable state

One scenario where state discarding can actually be useful is operations with so-called rewindable or transactional state. The most common example of such an operation is catch:

catch :: Exception e => IO a -> (e -> IO a) -> IO a

When lifted, state changes from the action or from the exception handler will be “committed,” but never both. If an exception is raised during the computation, those state changes are discarded (“rewound”), giving catch a kind of backtracking semantics. This behavior arises naturally from the way a lifted version of catch must be implemented:

catch' :: (Exception e, MonadBaseControl IO m) => m a -> (e -> m a) -> m a
catch' m f = do
  s <- liftBaseWith $ \runInBase ->
    catch (runInBase m) (runInBase . f)
  restoreM s

If m raises an exception, it will never return an StM m a value, so there’s no way to get ahold of any of the state changes that happened before the exception. Therefore, the only option is to discard that state.

This behavior is actually quite useful, and it’s definitely not unreasonable. However, useful or not, it’s inconsistent with state changes to mutable values like IORefs or MVars (they stay modified whether an exception is raised or not), so it can still be a gotcha. Either way, it’s worth being aware of.
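The contrast with mutable references can be shown in a few lines. This sketch again specializes by hand to StateT s IO instead of using MonadBaseControl, so it needs only base and transformers; catchState, action, handler, and demo are names invented for the example.

```haskell
import Control.Exception (Exception, IOException, catch, throwIO)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, execStateT, get, modify, put, runStateT)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hand-specialized version of catch' for StateT s IO. The handler starts
-- from the state captured before the action ran.
catchState :: Exception e => StateT s IO a -> (e -> StateT s IO a) -> StateT s IO a
catchState m h = do
  s <- get
  (a, s') <- lift (catch (runStateT m s) (\e -> runStateT (h e) s))
  put s'
  pure a

-- Bumps both the StateT state and an IORef, then fails.
action :: IORef Int -> StateT Int IO ()
action ref = do
  modify (+ 1)
  lift (modifyIORef' ref (+ 1))
  lift (throwIO (userError "boom"))

handler :: IOException -> StateT Int IO ()
handler _ = modify (+ 10)

-- Returns the final StateT state and the final IORef contents.
demo :: IO (Int, Int)
demo = do
  ref <- newIORef 0
  s <- execStateT (catchState (action ref) handler) 0
  r <- readIORef ref
  pure (s, r)
```

Evaluating demo yields (10, 1). The StateT increment made before the exception is rewound, so only the handler's change is committed, while the IORef keeps the increment it received.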

Partially discarded state

The next function we’re going to examine is finally:

finally :: IO a -> IO b -> IO a

This function has a similar type to catch, and it even has similar semantics. Like catch, finally can be lifted, but unlike catch, its state can’t be given any satisfying treatment. The only way to implement a lifted version is

finally' :: MonadBaseControl IO m => m a -> m b -> m a
finally' ma mb = do
  s <- liftBaseWith $ \runInBase ->
    finally (runInBase ma) (runInBase mb)
  restoreM s

which always discards all state changes made by the second argument. This is clear just from looking at finally’s type: since b doesn’t appear anywhere in the return type, there’s simply no way to access that action’s result, and therefore no way to access its modified state.

However, don’t despair: there actually is a way to produce a lifted version of finally that preserves all state changes. It can’t be done by lifting finally directly, but if we reimplement finally in terms of simpler functions that are more amenable to lifting, we can preserve all the state:⁴

finally' :: MonadBaseControl IO m => m a -> m b -> m a
finally' ma mb = mask' $ \restore -> do
  a <- liftBaseWith $ \runInBase ->
    try (runInBase (restore ma))
  case a of
    Left e -> mb *> liftBase (throwIO (e :: SomeException))
    Right s -> restoreM s <* mb

This illustrates an important (and interesting) point about MonadBaseControl: whether or not an operation can be made state-preserving is not a fundamental property of the operation’s type, but rather a property of the types of the exposed primitives. There is sometimes a way to implement a state-preserving variant of operations that might otherwise seem unliftable given the right primitives and a bit of cleverness.

Forking state

As a final example, I want to provide an example where the state may not actually be discarded per se, just inaccessible. Consider the type of forkIO:

forkIO :: IO () -> IO ThreadId

Although forkIO isn’t actually polymorphic in its argument, we can convert any IO action to one that produces () via void, so it might as well be. Therefore, we can lift forkIO in much the same way we did with sideEffect:

forkIO' :: MonadBaseControl IO m => m () -> m ThreadId
forkIO' m = liftBaseWith $ \runInBase -> forkIO (void $ runInBase m)

As with sideEffect, we can’t recover the output state, but in this case, there’s a fundamental reason that goes deeper than the types: we’ve forked off a concurrent computation! We’ve therefore split the state in two, which might be what we want… but it also might not. forkIO is yet another illustration that it’s important to think about the state-preservation semantics when using MonadBaseControl, or you may end up with a bug!
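Here is a small sketch of that split, once more specialized by hand to StateT s IO so that it depends only on base and transformers. The names forkState and demo are made up, and the MVar exists purely so the example can wait for the child thread and report what it saw.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (void)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, get, modify, runStateT)

-- Hand-specialized version of forkIO' for StateT s IO: the child starts
-- from a copy of the parent's state, and its output state is dropped.
forkState :: StateT s IO () -> StateT s IO ThreadId
forkState m = do
  s <- get
  lift (forkIO (void (runStateT m s)))

-- Returns the parent's final state and the state the child observed.
demo :: IO (Int, Int)
demo = do
  done <- newEmptyMVar
  let child = do
        modify (+ 1)
        s <- get
        lift (putMVar done s)
  (_, parentState) <- runStateT (forkState child *> modify (+ 10)) 0
  childState <- takeMVar done
  pure (parentState, childState)
```

Evaluating demo yields (10, 1): each thread sees only its own modification, since after the fork there are effectively two independent copies of the state.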

MonadBaseControl in context

Congratulations: you’ve made it through most of this blog post. If you’ve followed everything so far, you now understand MonadBaseControl. All the tricky parts are over. However, before wrapping up, I’d like to add a little extra information about how MonadBaseControl relates to various other parts of the Haskell ecosystem. In practice, that information can be as important as understanding MonadBaseControl itself.

The remainder of monad-control

If you look at the documentation for monad-control, you’ll find that it provides more than just the MonadBaseControl typeclass. I’m not going to cover everything else in detail in this blog post, but I do want to touch upon it briefly.

First off, you should definitely take a look at the handful of helper functions provided by monad-control, such as control and liftBaseOp_. These functions provide support for lifting common function types without having to use liftBaseWith directly. It’s useful to understand liftBaseWith, since it’s the most general way to use MonadBaseControl, but in practice, it is simpler and more readable to use the more specialized functions wherever possible. Many of the examples in this very blog post could be simplified using them, and I only stuck to liftBaseWith to introduce as few new concepts at a time as possible.
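As a rough illustration of how much those helpers save, here is one lifting of mask_ written three ways: directly against liftBaseWith and restoreM, with control, and with liftBaseOp_. The names maskLong, maskControl, and maskShort are hypothetical, and the sketch assumes the mtl and monad-control packages:

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Exception (mask_)
import Control.Monad.State.Strict (execStateT, modify)
import Control.Monad.Trans.Control
  (MonadBaseControl, control, liftBaseOp_, liftBaseWith, restoreM)

-- Written out by hand: capture the state, run in the base monad, restore.
maskLong :: MonadBaseControl IO m => m a -> m a
maskLong m = do
  s <- liftBaseWith $ \runInBase -> mask_ (runInBase m)
  restoreM s

-- control fuses liftBaseWith with the trailing restoreM.
maskControl :: MonadBaseControl IO m => m a -> m a
maskControl m = control $ \runInBase -> mask_ (runInBase m)

-- liftBaseOp_ covers the common "base action in, base action out" shape.
maskShort :: MonadBaseControl IO m => m a -> m a
maskShort = liftBaseOp_ mask_

main :: IO ()
main = do
  s <- flip execStateT (0 :: Int) $ do
    maskLong (modify (+ 1))
    maskControl (modify (+ 1))
    maskShort (modify (+ 1))
  print s -- prints 3: all three variants preserve the state
```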

Second, I’d like to mention the related MonadTransControl typeclass. You hopefully remember from earlier in the blog post how we defined MonadBaseControl instances inductively so that we could lift all the way down to the base monad. MonadTransControl is like MonadBaseControl if it intentionally did not do that—it allows lifting through a single transformer at a time, rather than through all of them at once.

Usually, MonadTransControl is not terribly useful to use directly (though I did use it once in a previous blog post of mine to help derive instances of mtl-style classes), but it is useful for implementing MonadBaseControl instances for your own transformers. If you define a MonadTransControl instance for your monad transformer, you can get a MonadBaseControl implementation for free using the provided ComposeSt, defaultLiftBaseWith, and defaultRestoreM bindings; see the documentation for more details.
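Here is a sketch of what that looks like for a small transformer. CounterT and tick are hypothetical names, the instance bodies follow the pattern I understand the monad-control documentation to recommend (defaultLiftWith and defaultRestoreT for the newtype, then the ComposeSt defaults), and the sketch assumes the mtl, transformers, transformers-base, and monad-control packages:

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}

import Control.Exception (mask_)
import Control.Monad.Base (MonadBase (..), liftBaseDefault)
import Control.Monad.State.Strict (StateT, modify, runStateT)
import Control.Monad.Trans.Class (MonadTrans)
import Control.Monad.Trans.Control
  ( ComposeSt, MonadBaseControl (..), MonadTransControl (..)
  , defaultLiftBaseWith, defaultLiftWith, defaultRestoreM, defaultRestoreT
  , liftBaseOp_ )

-- A hypothetical transformer: a newtype around StateT Int.
newtype CounterT m a = CounterT { unCounterT :: StateT Int m a }
  deriving (Functor, Applicative, Monad, MonadTrans)

-- Lifting through this one transformer reuses the StateT instance.
instance MonadTransControl CounterT where
  type StT CounterT a = StT (StateT Int) a
  liftWith = defaultLiftWith CounterT unCounterT
  restoreT = defaultRestoreT CounterT

instance MonadBase b m => MonadBase b (CounterT m) where
  liftBase = liftBaseDefault

-- The MonadBaseControl instance then comes for free from the defaults.
instance MonadBaseControl b m => MonadBaseControl b (CounterT m) where
  type StM (CounterT m) a = ComposeSt CounterT m a
  liftBaseWith = defaultLiftBaseWith
  restoreM = defaultRestoreM

tick :: Monad m => CounterT m ()
tick = CounterT (modify (+ 1))

main :: IO ()
main = do
  (_, n) <- runStateT (unCounterT (liftBaseOp_ mask_ tick >> tick)) 0
  print n -- prints 2: the tick inside mask_ is preserved
```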

lifted-base and lifted-async

If you’re going to use MonadBaseControl, the lifted-base and lifted-async packages are good to know about. As their names imply, they provide lifted versions of bindings in the base and async packages, so you can use them directly without needing to lift them yourself. For example, if you needed a lifted version of mask from Control.Exception, you could swap it for the mask export from Control.Exception.Lifted, and everything would mostly just work (though always be sure to check the documentation for any caveats on state discarding).
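To make the state-discarding caveat concrete, this small sketch runs the finally exported by Control.Exception.Lifted over StateT. It assumes the mtl and lifted-base packages, and it relies on the behavior described in the lifted-base documentation, namely that monadic side effects of the finalizer are discarded:

```haskell
import qualified Control.Exception.Lifted as Lifted
import Control.Monad.State.Strict (execStateT, modify)

main :: IO ()
main = do
  -- The action's change (1 -> 2) is kept, but the finalizer's
  -- change (* 10) is thrown away by the lifted finally.
  s <- execStateT (modify (+ 1) `Lifted.finally` modify (* 10)) (1 :: Int)
  print s -- prints 2
```

The hand-written finally' from earlier in this post would keep the finalizer’s change as well, which is exactly the kind of difference the documentation caveats are warning about.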

Relationship to MonadUnliftIO

Recently, FP Complete has developed the unliftio package as an alternative to monad-control. It provides the MonadUnliftIO typeclass, which is similar in spirit to MonadBaseControl, but heavily restricted: it is specialized to IO as the base monad, and it only allows instances for stateless monads, such as ReaderT. This is designed to encourage the so-called ReaderT design pattern, which avoids ever using stateful monads like ExceptT or StateT over IO, encouraging the use of IO exceptions and mutable variables (e.g. MVars or TVars) instead.

I should be clear: I really like most of what FP Complete has done—to this day, I still use stack as my Haskell build tool of choice—and I think the suggestions given in the aforementioned “ReaderT design pattern” blog post have real weight to them. I have a deep respect for Michael Snoyman’s commitment to opinionated, user-friendly tools and libraries. But truthfully, I can’t stand MonadUnliftIO.

MonadUnliftIO is designed to avoid all the complexity around state discarding that MonadBaseControl introduces, and on its own, that’s a noble goal. Safety first, after all. The problem is that MonadUnliftIO really is extremely limiting, and what’s more, it can actually be trivially encoded in terms of MonadBaseControl as follows:

type MonadUnliftIO m = (MonadBaseControl IO m, forall a. StM m a ~ a)

This alias can be used to define safe, lifted functions that never discard state while still allowing functions that can be safely lifted through stateful transformers to do so. Indeed, the Control.Concurrent.Async.Lifted.Safe module from lifted-async does exactly that (albeit with a slightly different formulation than the above alias).
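A per-function version of the same idea can be sketched without the quantified constraint, by demanding StM m () ~ () only at the one result type the function needs. The name forkStateless is hypothetical, and the sketch assumes the mtl and monad-control packages:

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}

import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Reader (ask, runReaderT)
import Control.Monad.Trans.Control (MonadBaseControl (..))

-- Only monads whose captured state for () is just () are accepted,
-- so there is no state to discard in the forked thread.
forkStateless :: (MonadBaseControl IO m, StM m () ~ ()) => m () -> m ThreadId
forkStateless m = liftBaseWith $ \runInBase -> forkIO (runInBase m)

main :: IO ()
main = do
  done <- newEmptyMVar
  -- ReaderT is stateless, so this typechecks. Swapping in StateT would be
  -- rejected, because StM (StateT s IO) () is ((), s) rather than ().
  _ <- runReaderT
         (forkStateless (ask >>= liftIO . putMVar done))
         "hello from the fork"
  takeMVar done >>= putStrLn -- prints "hello from the fork"
```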

To be fair, the unliftio README does address this in its comparison section:

monad-control allows us to unlift both styles. In theory, we could write a variant of lifted-base that never does state discards […] In other words, this is an advantage of monad-control over MonadUnliftIO. We've avoided providing any such extra typeclass in this package though, for two reasons:
MonadUnliftIO is a simple typeclass, easy to explain. We don't want to complicated [sic] matters […]
Having this kind of split would be confusing in user code, when suddenly [certain operations are] not available to us.

In other words, the authors of unliftio felt that MonadBaseControl was simply not worth the complexity, and they could get away with MonadUnliftIO. Frankly, if you feel the same way, by all means, use unliftio. I just found it too limiting given the way I write Haskell, plain and simple.

Recap

So ends another long blog post. As often seems the case, I set out to write something short, but I ended up writing well over 5,000 words. I suppose that means I learned something from this experience, too: MonadBaseControl is more complicated than I had anticipated! Maybe there’s something to take away from that.

In any case, it’s over now, so I’d like to briefly summarize what we’ve covered:

MonadBaseControl allows us to lift higher-order monadic operations.
It operates by capturing the current monadic state and explicitly threading it through the action in the base monad before restoring it.
That technique works well for polymorphic operations of type forall a. b a -> b a, but it can be tricky or even impossible for more complex operations, sometimes leading to discarded state.
This can sometimes be mitigated by restricting certain operations to stateless monads using a StM m a ~ a constraint, or by reimplementing the operation in terms of simpler primitives.
The lifted-base and lifted-async packages provide lifted versions of existing operations, avoiding the need to lift them yourself.

As with many abstractions in Haskell, don’t worry too much if you don’t have a completely firm grasp of MonadBaseControl at first. Insight often comes with repeated experience, and monad-control can still be used in useful ways even without a perfect understanding. My hope is that this blog post has helped you build intuitions about MonadBaseControl even if some of the underlying machinery remains a little fuzzy, and I hope it can also serve as a reference for those who want or need to understand (or just be reminded of) all the little details.

Finally, I’ll admit MonadBaseControl isn’t especially elegant or beautiful as Haskell abstractions go. In fact, in many ways, it’s a bit of a kludge! Perhaps, in time, effect systems will evolve and mature so that it and its ilk are no longer necessary, and they may become distant relics of an inferior past. But in the meantime, it’s here, it’s useful, and I think it’s worth embracing. If you’ve shied away from it in the past, I hope I’ve illuminated it enough to make you consider giving it another try.

One example of a function with that type is mask_. ↩
Types with polymorphic types under type constructors are called impredicative. GHC technically has limited support for impredicativity via the ImpredicativeTypes language extension, but as of GHC 8.8, it has been fairly broken for some time. A fix is apparently being worked on, but even if that effort is successful, I don’t know what impact it will have on type inference. ↩
Note that askRunInBase = liftBaseWith (pure . RunInBase) does not typecheck, as it would require impredicative polymorphism: it would require instantiating the type of (.) with polymorphic types. The version using ($) works because GHC actually has special typechecking rules for ($)! Effectively, f $ x is really syntax in GHC. ↩
Assume that mask' is a suitably lifted version of mask (which can in fact be made state-preserving). ↩

Defeating Racket’s separate compilation guarantee

2019-04-21T00:00:00Z

Being a self-described programming-language programming language is an ambitious goal. To preserve predictability while permitting linguistic extension, Racket comes equipped with a module system carefully designed to accommodate composable and compilable macros. One of the module system’s foundational properties is its separate compilation guarantee, which imposes strong, unbreakable limits on the extent of compile-time side-effects. It is essential for preserving static guarantees in a world where compiling a module can execute arbitrary code, and despite numerous unsafe trapdoors that have crept into Racket since its birth as PLT Scheme, none have ever given the programmer the ability to cheat it.

Yet today, in this blog post, we’re going to do exactly that.

What is the separate compilation guarantee?

Before we get to the fun part (i.e. breaking things), let’s go over some fundamentals so we understand what we’re breaking. The authoritative source for the separate compilation guarantee is the Racket reference, but it is dense, as authoritative sources tend to be. Although I enjoy reading technical manuals for sport, it is my understanding that not all the people who read this blog are as strange as I am, so let’s start with a quick primer, instead. (If you’re already an expert, feel free to skip to the next section.)

Racket is a macro-enabled programming language. In Racket, a macro is a user-defined, code-to-code transformation that occurs at compile-time. These transformations cannot make arbitrary changes to the program—in Racket, they are usually required to be local, affecting a single expression or definition at a time—but they may be implemented using arbitrary code. This means that a macro can, if it so desires, read the SSH keys off your filesystem and issue an HTTP request to send them someplace.

That kind of attack is bad, admittedly, but it’s also uninteresting: Racket allows you to do all that and then some, making no attempt to prevent it.¹ Racket calls these “external effects,” things that affect state outside of the programming language. They sound scary, but in practice, internal effects (effects that mutate state inside the programming language) are a much bigger obstacle to practical programming. Let’s take a look at why.

Let’s say we have a module with some global, mutable state. Perhaps it is used to keep track of a set of delicious foods:

;; foods.rkt
#lang racket
(provide delicious-food? add-delicious-food!)

(define delicious-foods (mutable-set))

(define (delicious-food? food)
  (set-member? delicious-foods food))

(define (add-delicious-food! new-food)
  (set-add! delicious-foods new-food))

Using this interface, let’s write a program that checks if a particular food, given as a command-line argument, is delicious:

;; check-food.rkt
#lang racket
(require "foods.rkt")

(add-delicious-food! "pineapple")
(add-delicious-food! "sushi")
(add-delicious-food! "cheesecake")

(command-line
  #:args [food-to-check]
  (if (delicious-food? food-to-check)
      (printf "~a is a delicious food.\n" food-to-check)
      (printf "~a is not delicious.\n" food-to-check)))

$ racket check-food.rkt cheesecake
cheesecake is a delicious food.
$ racket check-food.rkt licorice
licorice is not delicious.

Exhilarating. (Sorry, licorice fans.) But what if a macro were to call add-delicious-food!? What would happen? For example, what if we wrote a macro to add a lot of foods at once?²

(require syntax/parse/define)
(define-simple-macro (add-food-combinations! [fst:string ...]
                                             [snd:string ...])
  #:do [(for* ([fst-str (in-list (syntax->datum #'[fst ...]))]
               [snd-str (in-list (syntax->datum #'[snd ...]))])
          (add-delicious-food! (string-append fst-str " " snd-str)))]
  (void))

; should add “fried chicken,” “fried potato,” “roasted chicken,” and “roasted potato”
(add-food-combinations! ["fried" "roasted"] ["chicken" "potato"])

Now, what do you think executing racket check-food.rkt 'fried chicken' will do?

Clearly, the program should print fried chicken is a delicious food, and indeed, many traditional Lisp systems would happily produce such a result. After all, running racket check-food.rkt 'fried chicken' must load the source code inside check-food.rkt, expand and compile it, then run the result. While the program is being expanded, the compile-time calls to add-delicious-food! should add new elements to the delicious-foods set, so when the program is executed, the string "fried chicken" ought to be in it.

But if you actually try this yourself, you will find that isn’t what happens. Instead, Racket rejects the program:

$ racket check-food.rkt 'fried chicken'
check-food.rkt:12:11: add-delicious-food!: reference to an unbound identifier
  at phase: 1; the transformer environment
  in: add-delicious-food!

Why does Racket reject this program? Well, consider that Racket allows programs to be pre-compiled using raco make, doing all the work of macroexpansion and compilation to bytecode ahead of time. Subsequent runs of the program will use the pre-compiled version, without having to run all the macros again. This is a problem, since expanding the add-food-combinations! macro had side-effects that our program depended on!

If Racket allowed the above program, it might do different things depending on whether it was pre-compiled. Running directly from source code might treat 'fried chicken' as a delicious food, while running from pre-compiled bytecode might not. Racket considers this unacceptable, so it disallows the program entirely.

Preserving separate compilation via phases

Hopefully, you are now mostly convinced that the above program is a bad one, but you might have some lingering doubts. You might, for example, wonder if Racket disallows mutable compile-time state entirely. That is not the case—Racket really does allow everything that happens at runtime to happen at compile-time—but it does prevent compile-time and run-time state from ever interacting. Racket stratifies every program into a compile-time part and a run-time part, and it restricts communication between them to limited, well-defined channels (mainly via expanding to code that does something at run-time).

Racket calls this system of stratification phases. Code that executes at run-time belongs to the run-time phase, while code that executes at compile-time (i.e. macros) belongs to the compile-time phase. When a variable is defined, it is always defined in a particular phase, so bindings declared with define can only be used at run-time, while bindings declared with define-for-syntax can only be used at compile-time. Since add-delicious-food! was declared using define, it was not allowed (and in fact was not even visible) in the body of the add-food-combinations! macro.

While the whole macro system could work precisely as just described, such a strict stratification would be incredibly rigid. Since every definition would belong to either run-time or compile-time, but never both, reusing run-time code to implement macros would be impossible. While the example in the previous section might make it seem like that’s a good thing, it very often isn’t: imagine if general-purpose functions like map and filter all needed to be written twice!

To avoid this problem, Racket allows modules to be imported at both run-time and compile-time, so long as it’s done explicitly. Writing (require "some-library.rkt") requires some-library.rkt for run-time code, but writing (require (for-syntax "some-library.rkt")) requires it for compile-time code. Requiring a module for-syntax is sort of like implicitly adjusting all of its uses of define to be define-for-syntax, instead, effectively shifting all the code from run-time to compile-time. This kind of operation is therefore known as phase shifting in Racket terminology.

We can use phase shifting to make the program we wrote compile. If we adjust the require at the beginning of our program, then we can ensure add-delicious-food! is visible to both the run-time and compile-time parts of check-food.rkt:

(require "foods.rkt" (for-syntax "foods.rkt"))

Now our program compiles. However, if you’ve been following everything carefully, you should be wondering why! According to the last section, sharing state between run-time and compile-time fundamentally can’t work without introducing inconsistencies between uncompiled and pre-compiled code. And that’s true—such a thing would cause all sorts of problems, and Racket doesn’t allow it. If you run the program, whether pre-compiled or not, you’ll find it always does the same thing:

$ racket check-food.rkt 'fried chicken'
fried chicken is not delicious.

This seems rather confusing. What happened to the calls to add-delicious-food! inside our add-food-combinations! macro? If we stick a printf inside add-delicious-food!, we’ll find that it really does get called:

(define (add-delicious-food! new-food)
  (printf "Registering ~a as a delicious food.\n" new-food)
  (set-add! delicious-foods new-food))

$ racket check-food.rkt 'fried chicken'
Registering fried chicken as a delicious food.
Registering fried potato as a delicious food.
Registering roasted chicken as a delicious food.
Registering roasted potato as a delicious food.
Registering pineapple as a delicious food.
Registering sushi as a delicious food.
Registering cheesecake as a delicious food.
fried chicken is not delicious.

And in fact, if we pre-compile check-food.rkt, we’ll see that the first four registrations appear at compile-time, exactly as we expect:

$ raco make check-food.rkt
Registering fried chicken as a delicious food.
Registering fried potato as a delicious food.
Registering roasted chicken as a delicious food.
Registering roasted potato as a delicious food.
$ racket check-food.rkt 'fried chicken'
Registering pineapple as a delicious food.
Registering sushi as a delicious food.
Registering cheesecake as a delicious food.
fried chicken is not delicious.

The compile-time registrations really are happening, but Racket is automatically restricting the compile-time side-effects so they only apply at compile-time. After compilation has finished, Racket ensures that compile-time side effects are thrown away, and the run-time code starts over with fresh, untouched state. This guarantees consistent behavior, since it becomes impossible to distinguish at run-time whether a module was just compiled on the fly, or if it was pre-compiled long ago (possibly even on someone else’s computer).

This is the essence of the separate compilation guarantee. To summarize:

Run-time and compile-time are distinct phases of execution, which cannot interact.
Modules can be required at multiple phases via phase shifting, but their state is kept separate. Each phase gets its own copy of the state.
Keeping the state separate ensures predictable program behavior, no matter when the program is compiled.

This summary is a simplification of phases in Racket. The full Racket module system does not have only two phases, since macros can also be used at compile-time to implement other macros, effectively creating a separate “compile-time” for the compile-time code. Each compile-time pass is isolated to its own phase, creating a finite but arbitrarily large number of distinct program phases (all but one of which occur at compile-time).

Furthermore, the separate compilation guarantee does not just isolate the state of each phase from the state of other phases but also isolates all compile-time state from the compile-time state of other modules. This ensures that compilation is still deterministic even if modules are compiled in a different order, or if several modules are sometimes compiled individually while other times compiled together all at once.

If you want to learn more, the full details of the module system are described at length in the General Phase Levels section of the Racket Guide, but the abridged summary I’ve described is enough for the purposes of this blog post. If the bulleted list above mostly made sense to you, you’re ready to move on.

How we’re going to break it

The separate compilation guarantee is a sturdy opponent, but it is not without weaknesses. Although no API in Racket, safe or unsafe, allows arbitrarily disabling phase separation, a couple of features of Racket are already known to allow limited forms of cross-phase communication.

The most significant of these, and the one we’ll be using as our vector of attack, is the logger. Unlike many logging systems, which are exclusively string-oriented, Racket’s logging interface allows structured logging by associating an arbitrary Racket value with each and every log message. Since it is possible to set up listeners within Racket that receive log messages sent to a particular “topic,” the logger can be used as a communication channel to send values between different parts of a program.

The following program illustrates how this works. One thread creates a listener for all log messages on the topic 'send-me-a-value using make-log-receiver, then uses sync to block until a value is received. Meanwhile, a second thread sends values through the logger using log-message. Together, this creates a makeshift buffered, asynchronous channel:

;; log-comm.rkt
#lang racket

(define t1
  (thread
   (lambda ()
     (define recv (make-log-receiver (current-logger) 'debug 'send-me-a-value))
     (let loop ()
       (println (sync recv))
       (loop)))))

(define t2
  (thread
   (lambda ()
     (let loop ([n 0])
       (log-message (current-logger) 'debug 'send-me-a-value "" n #f)
       (sleep 1)
       (loop (add1 n))))))

(thread-wait t1) ; wait forever

$ racket log-comm.rkt
'#(debug "" 1 send-me-a-value)
'#(debug "" 2 send-me-a-value)
'#(debug "" 3 send-me-a-value)
'#(debug "" 4 send-me-a-value)
^Cuser break

In this program, the value being sent through the logger is just a number, which isn’t very interesting. But the value really can be any value, even arbitrary closures or mutable data structures. It’s even possible to send a channel through a logger, which can subsequently be used to communicate directly, without having to abuse the logger.

Generally, this feature of loggers isn’t very useful, since Racket has plenty of features for cross-thread communication. What’s special about the logger, however, is that it is global, and it is cross-phase.

The cross-phase nature of the logger makes some sense. If a Racket program creates a namespace (that is, a fresh environment for dynamic evaluation), then uses it to expand and compile a Racket module, the process of compilation might produce some log messages, and the calling thread might wish to receive them. It wouldn’t be a very useful logging system if log messages during compile-time were always lost. However, this convenience is a loophole in the phase separation system, since it allows values to flow—bidirectionally—between phases.

This concept forms the foundation of our exploit, but it alone is not a new technique, and I did not discover it. However, all existing techniques I know of that use the logger for cross-phase communication require control of the parent namespace in which modules are being compiled, which means some code must exist “outside” the actual program. That approach does not work for ordinary programs run directly with racket or compiled directly with raco make, so to get there, we’ll need something more clever.

The challenge

Our goal, therefore, is to share state between phases without controlling the compilation namespace. More precisely, we want to be able to create an arbitrary module-level definition that is cross-phase persistent, which means it will be evaluated once and only once no matter how many times its enclosing module is re-instantiated (i.e. given a fresh, untouched state) at various phases. A phase-shifted require of the module that contains the definition should share state with an unshifted version of the module, breaking the separate compilation guarantee wide open.

To use the example from the previous section, we should be able to adjust foods.rkt very slightly…

;; foods.rkt
#lang racket
(require "define-cross-phase.rkt")
(provide delicious-food? add-delicious-food!)

; share across phases
(define/cross-phase delicious-foods (mutable-set))

#| ... |#

…and the delicious-foods mutable state should magically become cross-phase persistent. When running check-food.rkt from source, we should see the side-effects persisted from the module’s compilation, while running from pre-compiled bytecode should give us the result with compile-time effects discarded.

We already know the logger is going to be part of our exploit, but implementing define/cross-phase on top of it is more subtle than it might seem. In our previous example that used make-log-receiver, we had well-defined sender and receiver threads, but who is the “sender” in our multi-phased world? And what exactly is the sender sending?

To answer those questions, allow me to outline the general idea of our approach:

The first time our foods.rkt module is instantiated, at any phase, it evaluates the (mutable-set) expression to produce a new mutable set. It spawns a sender thread that sends this value via the logger to anyone who will listen, and that thread lingers in the background for the remaining duration of the program.
All subsequent instantiations of foods.rkt do not evaluate the (mutable-set) expression. Instead, they obtain the existing set by creating a log receiver and obtaining the value the sender thread is broadcasting. This ensures that a single value is shared across all instantiations of the module.

This sounds deceptively simple, but the crux of the problem is how to determine whether foods.rkt has previously been instantiated or not. Since we can only communicate across phases via the logger, we cannot use any shared state to directly record the first time the module is instantiated. We can listen to a log receiver and wait to see if we get a response, but this introduces a race condition: how long do we wait until giving up and deciding we’re the first instantiation? Worse, what if two threads instantiate the module at the same time, and both threads end up spawning a new sender thread, duplicating the state?

The true challenge, therefore, is to develop a protocol by which we can be certain we are the first instantiation of a module, without relying on any unspecified behavior, and without introducing any race conditions. This is possible, but it isn’t obvious, and it requires combining loggers with some extra tools available to the Racket programmer.

The key idea

It’s finally time to tackle the key idea at the heart of our exploit: garbage collection. In Racket, garbage collection is an observable effect, since Racket allows attaching finalizers to values via wills and executors. Since a single heap is necessarily shared by the entire VM, behavior happening on other threads (even in other phases) can be indirectly observed by creating a unique value—a “canary”—then sending it to another thread, and waiting to see if it will be garbage collected or not (that is, whether or not the canary “dies”).

Remember that logs and log receivers are effectively buffered, multicast, asynchronous FIFO channels. Since they are buffered, if any thread is already listening to a logger topic when a value is sent, that value cannot possibly be garbage collected until either the listening thread reads and discards it or the receiver itself is garbage collected. It’s possible to use this mechanism to observe whether or not another thread is already listening on a topic, as the following program demonstrates:³

;; check-receivers.rkt
#lang racket

(define (check-receivers topic)
  (define executor (make-will-executor))
  ; limit scope of `canary` so we don’t retain a reference
  (let ()
    (define canary (gensym 'canary))
    (will-register executor canary void)
    (log-message (current-logger) 'debug topic "" canary #f))
  (if (begin
        (collect-garbage)
        (collect-garbage)
        (collect-garbage)
        (sync/timeout 0 executor))
      (printf "no receivers for ~v\n" topic)
      (printf "receiver exists for ~v\n" topic)))

; add a receiver on topic 'foo
(define recv (make-log-receiver (current-logger) 'debug 'foo))

(define t1 (thread (λ () (check-receivers 'foo))))
(define t2 (thread (λ () (check-receivers 'bar))))

(thread-wait t1)
(thread-wait t2)

$ racket check-receivers.rkt
no receivers for 'bar
receiver exists for 'foo

However, this program has some problems. For one, it needs to call collect-garbage several times to be certain that the canary will be collected if there are no listeners, which can take a second or two. It also assumes that three calls to collect-garbage will be enough to collect the canary, though there is no guarantee that will be true.

A bulletproof solution should be both reasonably performant and guaranteed to work. To get there, we have to combine this idea with something more. Here’s the trick: instead of sending the canary alone, send a channel alongside it. Synchronize on both the canary’s executor and the channel so that the thread will unblock if either the canary is collected or the channel is received and sent a value using channel-put. Have the receiver listen for the channel on a separate thread, and when it receives it, send a value back to unblock the waiting thread as quickly as possible, without needing to rely on a timeout or a particular number of calls to collect-garbage.

Using that idea, we can revise the program:

;; check-receivers.rkt
#lang racket

(define (check-receivers topic)
  (define chan (make-channel))
  (define executor (make-will-executor))
  ; limit scope of `canary` so we don’t retain a reference
  (let ()
    (define canary (gensym 'canary))
    (will-register executor canary void)
    (log-message (current-logger) 'debug topic ""
                 ; send the channel + the canary
                 (vector-immutable chan canary) #f))
  (if (let loop ([n 0])
        (sleep) ; yield to try to let the receiver thread work
        (match (sync/timeout 0
                             (wrap-evt chan (λ (v) 'received))
                             (wrap-evt executor (λ (v) 'collected)))
          ['collected #t]
          ['received  #f]
          [_ ; collect garbage and try again
           (collect-garbage (if (< n 3) 'minor 'major))
           (loop (add1 n))]))
      (printf "no receivers for ~v\n" topic)
      (printf "receiver exists for ~v\n" topic)))

; add a receiver on topic 'foo
(define recv (make-log-receiver (current-logger) 'debug 'foo))
(void (thread
       (λ ()
         (let loop ()
           (match (sync recv)
             [(vector _ _ (vector chan _) _)
              (channel-put chan #t)
              (loop)])))))

(define t1 (thread (λ () (check-receivers 'foo))))
(define t2 (thread (λ () (check-receivers 'bar))))

(thread-wait t1)
(thread-wait t2)

Now the program completes almost instantly. For this simple program, the explicit (sleep) call is effective enough at yielding that, on my machine, (check-receivers 'foo) returns without ever calling collect-garbage, and (check-receivers 'bar) returns after performing a single minor collection.

This is extremely close to a bulletproof solution, but there are two remaining subtle issues:

There is technically a race condition between the (sync recv) in the receiver thread and the subsequent channel-put, since it’s possible for the canary to be received, discarded, and garbage collected before reaching the call to channel-put, which the sending thread would incorrectly interpret as indicating the topic has no receivers.
To fix that, the receiver thread can send the canary itself back through the channel, which fundamentally has to work, since the value cannot be collected until it has been received by the sending thread, at which point the sync has already chosen the channel.
It is possible for the receiver thread to receive the log message and call channel-put, but for the sending thread to somehow die in the meantime (which cannot be protected against in general in Racket, since kill-thread immediately and forcibly terminates a thread). If this were to happen, the sending thread would never obtain the value from the channel, blocking the receiving thread indefinitely.
A solution is to spawn a new thread for each channel-put instead of calling it directly from the receiving thread. Conveniently, this both ensures the receiving thread never gets stuck and avoids resource leaks, since the Racket runtime is smart enough to GC a thread blocked on a channel that has no other references and therefore can never be unblocked.
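As a rough sketch of what those two fixes might look like when applied to the receiver thread from check-receivers.rkt (the helper name start-receiver-thread! and the catch-all clause that skips unrelated messages are my own additions for illustration, not part of the original program):

```racket
#lang racket

;; Sketch of a revised receiver thread. `start-receiver-thread!` is a
;; hypothetical helper name introduced only for this example.
(define (start-receiver-thread! recv)
  (thread
   (λ ()
     (let loop ()
       (match (sync recv)
         [(vector _ _ (vector (? channel? chan) canary) _)
          ;; Fix 1: reply with the canary itself, so it stays reachable
          ;; until the sender has actually received it.
          ;; Fix 2: perform the channel-put on a fresh thread, so a dead
          ;; sender can never block this loop.
          (thread (λ () (channel-put chan canary)))
          (loop)]
         ;; Assumption: ignore any message that lacks the expected payload.
         [_ (loop)])))))

(define recv (make-log-receiver (current-logger) 'debug 'foo))
(void (start-receiver-thread! recv))
```

The sending side needs no changes, since it only cares that some value arrives on the channel.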

With those fixes in place, the program is, to the best of my knowledge, bulletproof. It will always correctly determine whether or not a logger has a listener, with no race conditions or reliance upon unspecified behavior of the Racket runtime. It does, however, make a couple of assumptions.

First, it assumes that the value of (current-logger) is shared between the threads. It is true that (current-logger) can be changed, and sometimes is, but it’s usually done via parameterize, not mutation of the parameter directly. Therefore, this can largely be mitigated by storing the value of (current-logger) at module instantiation time.

Second, it assumes that no other receivers are listening on the same topic. Technically, even using a unique, uninterned key for the topic is insufficient to ensure that no receivers are listening to it, since a receiver can choose to listen to all topics. However, in practice, it is highly unlikely that any receiver would willfully choose to listen to all topics at the 'debug level, since the receiver would be inundated with enormous amounts of useless information. Even if such a receiver were to be created, it is highly likely that it would dequeue the messages as quickly as possible and discard the accompanying payload, since doing otherwise would cause all messages to be retained in memory, leading to a significant memory leak.

Both these problems can be mitigated by using a logger other than the root logger, which is easy in this example. However, for the purpose of subverting the separate compilation guarantee, we would have no way to share the logger object itself across phases, defeating the whole purpose, so we are forced to use the root logger and hope the above two assumptions remain true (but they usually do).

Preparing the exploit

If you’ve made it here, congratulations! The most difficult part of this blog post is over. All that’s left is the fun part: performing the exploit.

The bulk of our implementation is a slightly adapted version of check-receivers:

;; define-cross-phase.rkt
#lang racket

(define root-logger (current-logger))

(define (make-cross-phase topic thunk)
  (define receiver (make-log-receiver root-logger 'debug topic))
  (define chan (make-channel))
  (define executor (make-will-executor))

  (let ()
    (define canary (gensym 'canary))
    (will-register executor canary (λ (v) 'collected))
    (log-message root-logger 'debug topic ""
                 (vector-immutable canary chan) #f)
    (let loop ()
      (match (sync receiver)
        [(vector _ _ (vector _ (== chan eq?)) _)
         (void)]
        [_
         (loop)])))

  (define execute-evt (wrap-evt executor will-execute))
  (define result (let loop ([n 0])
                   (sleep)
                   (or (sync/timeout 0 chan execute-evt)
                       (begin
                         (collect-garbage (if (< n 3) 'minor 'major))
                         (loop (add1 n))))))

  (match result
    [(vector _ value)
     value]
    ['collected
     (define value (thunk))
     (thread
      (λ ()
        (let loop ()
          (match (sync receiver)
            [(vector _ _ (vector canary chan) _)
             (thread (λ () (channel-put chan (vector-immutable canary value))))
             (loop)]))))
     value]))

There are a few minor differences, which I’ll list:

The most obvious difference is that make-cross-phase does the work of both checking if a receiver exists—which I’ll call the manager thread—and spawning it if it doesn’t. If it does end up spawning a manager thread, it evaluates the given thunk to produce a value, which becomes the cross-phase value that will be sent through the channel alongside the canary.
Once the manager thread is created, subsequent calls to make-cross-phase will receive the value through the channel and return it instead of re-invoking thunk. This is what ensures the right-hand side of each use of define/cross-phase is only ever evaluated once.
Since make-cross-phase needs to create a log receiver if no manager thread exists, it does so immediately, before sending the canary through the logger. This avoids a race condition between multiple threads that are simultaneously competing to become the manager thread, where both threads could send a canary through the logger before either was listening, both canaries would get GC’d, and both threads would spawn a new manager.⁴
Creating the receiver before sending the canary avoids this problem, but the thread now needs to receive its own canary and discard it before synchronizing on the channel and executor, since otherwise it will retain a reference to the canary. It’s possible that in between creating the receiver and sending the canary, another thread also sent a canary, so it needs to drop any log messages it finds that don’t include its own canary.
This ends up working out perfectly, since every thread drops all the messages received before the one containing its own canary, but retains all subsequent values. This means that only one thread can ever “win” and become the manager, since the first thread to send a canary is guaranteed to retain all subsequent canaries, yet also guaranteed its canary will be GC’d. Other threads racing to become the manager will remain blocked until the manager thread is created, since their canaries will be retained by the manager-to-be until it dequeues them.
(This is the most subtle part of the process to get right, but conveniently, it mostly just works out without very much code. If you didn’t understand any of the above three paragraphs, it isn’t a big deal.)

The final piece to this puzzle is to define the define/cross-phase macro that wraps make-cross-phase. The macro is actually slightly more involved than just generating a call to make-cross-phase directly, since we’d like to use an uninterned symbol for the topic instead of an interned one, just to ensure it is unique. Ordinarily, this might seem impossible, since an uninterned symbol is fundamentally a unique value that needs to be communicated across phases, and the whole problem we are solving is creating a communication channel that spans phases. However, Racket actually provides some built-in support for sharing uninterned symbols across phases (plus some other kinds of values, but they must always be immutable). To do this, we need to generate a cross-phase persistent submodule that exports an uninterned symbol, then pass that symbol as the topic to make-cross-phase:

(require (for-syntax racket/syntax)
         syntax/parse/define)

(provide define/cross-phase)

(define-simple-macro (define/cross-phase x:id e:expr)
  #:with topic-mod-name (generate-temporary 'cross-phase-topic-key)
  (begin
    (module topic-mod-name '#%kernel
      (#%declare #:cross-phase-persistent)
      (#%provide topic)
      (define-values [topic] (gensym "cross-phase")))
    (require 'topic-mod-name)
    (define x (make-cross-phase topic (λ () e)))))

And that’s really it. We’re done.

Executing the exploit

With our implementation of define/cross-phase in hand, all that’s left to do is run our original check-food.rkt program and see what happens:

$ racket check-food.rkt 'fried chicken'
set-add!: contract violation:
expected: set?
given: (mutable-set "fried chicken" "roasted chicken" "roasted potato" "fried potato")
argument position: 1st
other arguments...:
  x: "pineapple"

Well, I don’t know what you expected. Play stupid games, win stupid prizes.

This error actually makes sense, but it belies one reason (of many) why this whole endeavor is probably a bad idea. Although we’ve managed to make our mutable set cross-phase persistent, our references to set operations like set-add! and set-member? are not, and every time racket/set is instantiated in a fresh phase, it creates an entirely new instance of the set structure type. This means that even though we have a bona fide mutable set, it isn’t actually the type of set that this phase’s set-add! understands!
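The underlying behavior is that struct types in Racket are generative: evaluating the same struct definition twice produces two distinct types. The following is only a loose analogy, using a function call to stand in for instantiating racket/set in a fresh phase; the names instantiate-set-library and fake-set are hypothetical.

```racket
#lang racket

;; Each call evaluates the `struct` form again, yielding a brand-new type.
;; This loosely stands in for instantiating racket/set in a fresh phase.
(define (instantiate-set-library)
  (struct fake-set (elements))
  (values fake-set fake-set?))

(define-values (make-set-1 set-1?) (instantiate-set-library))
(define-values (make-set-2 set-2?) (instantiate-set-library))

(define s (make-set-1 '("fried chicken")))

(set-1? s) ; #t: predicate from the same instantiation
(set-2? s) ; #f: identical definition, but a different type
```

The second predicate rejects the value for the same reason this phase’s set-add! rejects a set created in another phase.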

Of course, this isn’t a problem that some liberal application of define/cross-phase can’t solve:

;; foods.rkt
#lang racket
(require "define-cross-phase.rkt")
(provide delicious-food? add-delicious-food!)

(define/cross-phase cross:set-member? set-member?)
(define/cross-phase cross:set-add! set-add!)

(define/cross-phase delicious-foods (mutable-set))

(define (delicious-food? food)
  (cross:set-member? delicious-foods food))

(define (add-delicious-food! new-food)
  (cross:set-add! delicious-foods new-food))

$ racket check-food.rkt 'fried chicken'
fried chicken is a delicious food.
$ raco make check-food.rkt
$ racket check-food.rkt 'fried chicken'
fried chicken is not delicious.

And thus we find that another so-called “guarantee” isn’t.

Reflection

Now comes the time in the blog post when I have to step back and think about what I’ve done. Have mercy.

Everything in this blog post is a terrible idea. No, you should not use loggers for anything that isn’t logging, you shouldn’t use wills and executors for critical control flow, and obviously you should absolutely not intentionally break one of the most helpful guarantees the Racket module system affords you.

But I thought it was fun to do all that, anyway.

The meaningful takeaways from this blog post aren’t that the separate compilation guarantee can be broken, nor that any of the particular techniques I used hold, but that

ensuring non-trivial guarantees is really hard,
despite that, the separate compilation guarantee is really, really hard to break,
the separate compilation guarantee is good, and you should appreciate the luxury it affords you while writing Racket macros,
avoiding races in a concurrent environment can be extremely subtle,
and Racket is totally awesome for giving me this much rope to hang myself with.

If you want to hang yourself with Racket, too, runnable code from this blog post is available here.

This isn’t strictly true, since Racket provides sandboxing mechanisms that can compile and execute untrusted code without file system or network access, but this is not the default compilation mode. Usually, it doesn’t matter nearly as much as it might sound: most of the time, if you’re compiling untrusted code, you’re also going to run it, and running untrusted code can do all those things, anyway. ↩
This is actually a terrible use case for a macro, since an ordinary function would do just fine, but I’m simplifying a little to keep the example small. ↩
Racket actually provides this functionality directly via the log-level? procedure. However, since log-level? provides no way to determine how many receivers are listening to a topic, using it to guard against creating a receiver is vulnerable to a race condition that the garbage collection-based approach can avoid, as is discussed later. Furthermore, the GC technique is more likely to be resilient to nosy log receivers listening on all topics at the 'debug level, since they will almost certainly dequeue and discard the value quickly (as otherwise they would leak large quantities of memory). ↩
This race is the one that makes using log-level? untenable, since the receiver needs to be created before the topic is checked for listeners to avoid the race, which can’t be done with log-level? (since it would always return #t). ↩

Macroexpand anywhere with local-apply-transformer!

2018-10-06T00:00:00Z

Racket programmers are accustomed to the language’s incredible capacity for extension and customization. Writing useful macros that do complicated things is easy, and it’s simple to add new syntactic forms to meet domain-specific needs. However, it doesn’t take long before many budding macrologists bump into the realization that only certain positions in Racket code are subject to macroexpansion.

To illustrate, consider a macro that provides a Clojure-style let form:

(require syntax/parse/define)

(define-simple-macro (clj-let [{~seq x:id e:expr} ...] body:expr ...+)
  (let ([x e] ...) body ...))

This can be used anywhere an expression is expected, and it does as one would expect:

> (clj-let [x 1
            y 2]
    (+ x y))
3

However, a novice macro programmer might realize that clj-let really only modifies the syntax of binding pairs for a let form. Therefore, could one define a macro that only adjusts the binding pairs of some existing let form instead of expanding to an entire let? That is, could one write the above example like this:

(define-simple-macro (clj-binding-pairs [{~seq x:id e:expr} ...])
  ([x e] ...))

> (let (clj-binding-pairs
        [x 1
         y 2])
    (+ x y))
3

The answer is no: the binding pairs of a let form are not subject to macroexpansion, so the above attempt fails with a syntax error. In this blog post, we will examine the reasons behind this limitation, then explain how to overcome it using a solution that allows macroexpansion anywhere in a Racket program.

Why only some positions are subject to macroexpansion

To understand why the macroexpander refuses to touch certain positions in a program, we must first understand how the macro system operates. In Racket, a macro is defined as a compile-time function associated with a particular binding, and macros are given complete control over the syntax trees they are surrounded with. If we define a macro mac and then write the expression (mac form), form is provided as-is to mac as a syntax object. Its structure can be anything at all, since mac can be an arbitrary Racket function, and that function can use form however it pleases.

To give a concrete illustration, consider a macro that binds some identifiers to symbols in a local scope:

(define-simple-macro (let-symbols (x:id ...) body ...+)
  (let ([x 'x] ...) body ...))

> (let-symbols (hello goodbye)
    (list hello goodbye))
'(hello goodbye)

It isn’t the most exciting macro in the world, but it illustrates a key point: the first subform to let-symbols is a list of identifiers that are eventually put in binding position. This means that hello and goodbye are bindings, not uses, and such bindings shadow any existing bindings that might have been in scope:

> (let ([foo 42])
    (let-symbols (foo)
      foo))
'foo

This might not seem very interesting, but it’s critical to understand, since it means that the expander can’t know which sub-pieces of a use of let-symbols will eventually be expressions themselves until it expands the macro and discovers it produces a let form, so it can’t know where it’s safe to perform macroexpansion. To make this more explicit, imagine we define a macro under some name, then try and use that name with our let-symbols macro:

(define-simple-macro (hello x:id)
  (x))

> (let-symbols (hello goodbye)
    hello)

What should the above program do? If we treat the first use of hello in the let-symbols form as a macro application, then (hello goodbye) should be transformed into (goodbye), and the use of hello in the body should be a syntax error. But if the first use of hello was instead intended to be a binder, then it should shadow the hello definition above, and the output of the program should be 'hello.

To avoid the chaos that would ensue if defining a macro could completely break local reasoning about other macros, Racket chooses the second option, and the program produces 'hello. The macroexpander has no way of knowing how each macro will inspect its constituent pieces, so it avoids touching anything until the macro expands. After it discovers the let form in the expansion of let-symbols, it can safely determine that the body expressions are, indeed, expressions, and it can recursively expand any macros they contain. To put things another way, a macro’s sub-forms are never expanded before the macro itself is expanded, only after.

Forcing sub-form expansion

The above section explains why the expander must operate as it does, but it’s a little bit unsatisfying. What if we write a macro where we want certain sub-forms to be expanded before they are passed to us? Fortunately, the Racket macro system provides an API to handle this use case, too.

It is true that the Racket macro system never automatically expands sub-forms before outer forms are expanded, but macro transformers can explicitly opt in to recursive expansion via the local-expand function. This function effectively yields control back to the expander to expand some arbitrary piece of syntax as an expression, and when it returns, the macro transformer can inspect the expanded expression however it wishes. In theory, this can be used to implement extensible macros that allow macroexpansion in locations other than expression position.
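As a minimal sketch of what calling local-expand looks like in practice, the following macro expands its argument before doing anything else with it. The names my-unless and show-expansion are hypothetical, and the macro merely quotes the expanded form so it can be inspected at runtime.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; `my-unless` and `show-expansion` are hypothetical names for illustration.
(define-syntax-rule (my-unless c body)
  (if c (void) body))

(define-syntax (show-expansion stx)
  (syntax-parse stx
    [(_ e:expr)
     ;; Hand #'e back to the expander as an expression. The empty stop
     ;; list means expansion proceeds recursively into sub-expressions.
     #:with expanded (local-expand #'e 'expression '())
     ;; A real macro would inspect `expanded` here; this one just quotes it.
     #'(quote expanded)]))

(define result (show-expansion (my-unless #f "ran")))

(car result) ; the head of the expanded form is `if`, not `my-unless`
```

By the time show-expansion sees its argument again, the use of my-unless is gone, which is exactly the eager sub-form expansion the expander never performs on its own.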

To give an example of such a macro, consider the Racket match form, which implements an expressive pattern-matcher as a macro. One of the most interesting qualities of Racket’s match macro is that its pattern language is user-extensible, essentially allowing pattern-level macros. For example, a user might find they frequently match against natural numbers, and they wish to be able to write (nat n) as a shorthand for (? exact-nonnegative-integer? n). Fortunately, this is easy using define-match-expander:

(define-match-expander nat
  (syntax-parser
    [(_ pat)
     #'(? exact-nonnegative-integer? pat)]))

> (match '(-5 -2 4 -7)
    [(list _ ... (nat n) _ ...)
     n])
4

Clearly, match is somehow expanding the nat match expander as a part of its expansion. Is it using local-expand?

Well, no. While a previous blog post of mine has illustrated that it is possible to do such a thing with local-expand via some clever trickery, local-expand is really designed to expand expressions. This is a problem, since (nat n) is not an expression, it’s a pattern: it will expand into (? exact-nonnegative-integer? n), which will lead to a syntax error, since ? is not bound in the world of expressions. Instead, for a long while, match and forms like it have emulated how the expander performs macroexpansion in ad-hoc ways. Fortunately, as of Racket v7.0, the new local-apply-transformer API provides a way to invoke recursive macroexpansion in a consistent way, and it doesn’t assume that what’s being expanded is an expression.

A closer look at `local-apply-transformer`

If local-apply-transformer is the answer, what does it actually do? Well, local-apply-transformer allows explicitly invoking a transformer function on some piece of syntax and retrieving the result. In other words, local-apply-transformer allows expanding an arbitrary macro, but since it doesn’t make any assumptions about what the output will be, it only expands it once: just a single step of macro transformation.

To illustrate, we can write a macro that uses local-apply-transformer to invoke a transformer function and preserve the result using quote-syntax:

(require (for-syntax syntax/apply-transformer))

(define-for-syntax flip
  (syntax-parser
    [(a b more ...)
     #'(b a more ...)]))

(define-simple-macro (mac)
  #:with result (local-apply-transformer flip #'(([x 1]) let x) 'expression)
  (quote-syntax result))

When we use mac, our flip function will be applied, as a macro, to the syntax object we provide:

> (mac)
#<syntax (let ((x 1)) x)>

Alright, so this works, but it raises some questions. Why is flip defined as a function at phase 1 (using define-for-syntax) instead of as a macro (using define-syntax)? What’s the deal with the 'expression argument to local-apply-transformer given that local-apply-transformer is supposedly decoupled from expression expansion? And finally, how is this any different from just calling our flip function on the syntax object directly by writing (flip #'(([x 1]) let x))?

Let’s start with the first of those questions: why is flip defined as a function rather than as a macro? Well, local-apply-transformer is a fairly low-level operation: remember, it doesn’t assume anything about the argument it’s given! Therefore, it doesn’t take an expression containing a macro and expand it based on its structure, it needs to be explicitly provided the macro transformer function to apply. In practice, this might not seem very useful, since presumably we want to write our macros as macros, not as phase 1 functions. Fortunately, it’s possible to look up the function associated with a macro binding using the syntax-local-value function, so if we use that, we can define flip using define-syntax as usual:

(define-syntax flip
  (syntax-parser
    [(a b more ...)
     #'(b a more ...)]))

(define-simple-macro (mac)
  #:with result (local-apply-transformer (syntax-local-value #'flip)
                                         #'(([x 1]) let x)
                                         'expression)
  (quote-syntax result))

Now for the next question: what is the meaning of the 'expression argument? This one is more of a historical artifact than anything else: when the expander applies a macro transformer, it does it in a “context”, which is accessible using the syntax-local-context function. This context can be one of a predefined enumeration of cases, including 'expression, 'top-level, 'module, 'module-begin, or a list representing a definition context. Whether or not any of those actually apply to our use case, we still have to pick one, but aside from how they affect the value returned by syntax-local-context (which some macros inspect), the value we choose is largely irrelevant. Using 'expression will do, even if it’s a bit of a lie.
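To see the context a transformer is applied in, here is a small sketch of a macro that reports the value of syntax-local-context. The macro name which-context is hypothetical, and collapsing definition contexts to a single symbol is a simplification made for readability.

```racket
#lang racket

;; `which-context` is a hypothetical macro that reports the context in
;; which the expander applied it.
(define-syntax (which-context stx)
  (define ctx (syntax-local-context))
  ;; Definition contexts are reported as lists of opaque values, so
  ;; collapse them to a readable symbol.
  #`(quote #,(if (list? ctx) 'internal-definition ctx)))

(list (which-context))   ; an argument position is an 'expression context
(let () (which-context)) ; a body position is a definition context
```

A macro applied through local-apply-transformer would observe whichever context value was passed as the third argument.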

Finally, how does any of this differ from just applying the function we get directly? Well, the critical answer is all about hygiene. Racket’s macro system is hygienic, which, among other things, ensures bindings defined with the same name in different places do not unintentionally conflict. Racket’s hygiene mechanism is implemented in the macroexpander, when macro transformers are applied. If we just applied the flip transformer procedure to a syntax object directly, we would circumvent this hygiene mechanism, potentially causing all sorts of problems. By using local-apply-transformer, we ensure hygiene is preserved.

There is one small problem left with our program, however. Can you spot it? The key is to consider what would happen if we used flip as an ordinary macro, without using local-apply-transformer:

> (flip (([x 1]) let x))
let: bad syntax
  in: let

What happened? Well, remember that when a macro in Racket is used, it receives the whole use site as a syntax object: in this case, #'(flip (([x 1]) let x)). This means that flip ought to be written to parse its argument slightly differently:

(define-syntax flip
  (syntax-parser
    [(_ (a b more ...))
     #'(b a more ...)]))

Indeed, now that we’ve properly restructured the macro, we can easily switch to using the convenient define-simple-macro shorthand:

(define-simple-macro (flip (a b more ...))
  (b a more ...))

This means we also need to update our definition of mac to provide the full syntax object the expander would:

(define-simple-macro (mac)
  #:with result (local-apply-transformer (syntax-local-value #'flip)
                                         #'(flip (([x 1]) let x))
                                         'expression)
  (quote-syntax result))

This might seem redundant, but remember, local-apply-transformer is very low-level! While the convention that (mac . _) is the syntax for a macro transformation might seem obvious, local-apply-transformer makes no assumptions. It just does what we tell it to do.

Applying `local-apply-transformer`

So what does local-apply-transformer have to do with the problem at the beginning of this blog post? Well, as it happens, we can use local-apply-transformer to implement a macro that allows expansion anywhere using some simple tricks. While it’s true that we cannot magically divine which locations ought to be expanded, what we can do is explicitly annotate which places to expand.

To do this, we will implement a macro, expand-inside, that looks for subforms annotated with a special $expand identifier and performs macro transformation on those locations before proceeding with ordinary macroexpansion. Using the clj-binding-pairs example from the beginning of this blog post, our solution to that problem will look like this:

(define-simple-macro (clj-binding-pairs [{~seq x:id e:expr} ...])
  ([x e] ...))

> (expand-inside
   (let ($expand
         (clj-binding-pairs
          [x 1
           y 2]))
     (+ x y)))
3

Put another way, expand-inside will force eager expansion on any subform surrounded with an $expand annotation.

We’ll start by defining the $expand binding itself. This binding won’t mean anything at all outside of expand-inside, but we’d like it to be a unique binding so that users can rename it (using rename-in, for example) if they wish. To do this, we’ll use the usual trick of defining it as a macro that always produces an error if it’s ever used:

(define-syntax ($expand stx)
  (raise-syntax-error #f "illegal outside an ‘expand-inside’ form" stx))

Next, we’ll implement a syntax class that will form the bulk of our implementation of expand-inside. Since we need to find uses of $expand that might be deeply-nested inside the syntax object provided to expand-inside, we need to recursively look through the syntax object, find any instances of $expand, and put it all back together once we’re done. This can be done relatively cleanly using a recursive syntax class:

(begin-for-syntax
  (define-syntax-class do-expand-inside
    #:literals [$expand]
    #:attributes [expansion]
    [pattern {~or $expand ($expand . _)}
             #:with :do-expand-inside (do-$expand this-syntax)]
    [pattern (a:do-expand-inside . b:do-expand-inside)
             #:attr expansion
             (let ([reassembled (cons (attribute a.expansion)
                                      (attribute b.expansion))])
               (if (syntax? this-syntax)
                   (datum->syntax this-syntax reassembled
                                  this-syntax this-syntax)
                   reassembled))]
    [pattern _ #:attr expansion this-syntax]))

There are some tricky details to get right in the reassembly of pairs, since syntax lists are actually composed of ordinary pairs rather than syntax pairs, but ultimately, the code for walking a syntax object is small. The key case of this syntax class is the call to do-$expand in the first clause, which we have not yet defined. This function will actually handle performing the expansion by invoking local-apply-transformer:

(begin-for-syntax
  (define (do-$expand stx)
    (syntax-parse stx
      [(_ {~and form {~or trans (trans . _)}})
       #:declare trans (static (disjoin procedure? set!-transformer?)
                               "syntax transformer")
       (local-apply-transformer (attribute trans.value)
                                #'form
                                'expression)])))

This uses the handy static syntax class that comes with syntax/parse, which implicitly handles the call to syntax-local-value and produces a nice error message if the value returned does not match a predicate. All we have to do is apply the transformer value bound to the trans.value attribute using local-apply-transformer, and now the expand-inside macro can be written in just a couple lines of code:

(define-syntax-parser expand-inside
  #:track-literals
  [(_ form:do-expand-inside) #'form.expansion])

(Using the #:track-literals option, also new in Racket v7.0, ensures that Check Syntax will be able to recognize the uses of $expand that disappear after expand-inside is expanded.)

Putting everything together, our example from above really works:

(define-simple-macro (clj-binding-pairs [{~seq x:id e:expr} ...])
  ([x e] ...))

> (expand-inside
   (let ($expand
         (clj-binding-pairs
          [x 1
           y 2]))
     (+ x y)))
3

That’s it. All told, the entire implementation is only about 30 lines of code. For a full, compilable, working example, see this gist.

Custom core forms in Racket, part II: generalizing to arbitrary expressions and internal definitions

2018-09-13T00:00:00Z

In my previous blog post, I covered the process involved in creating a small language with a custom set of core forms. Specifically, it discussed what was necessary to create Hackett’s type language, which involved expanding to custom expressions. While somewhat involved, Hackett’s type language was actually a relatively simple example to use, since it only made use of a subset of the linguistic features Racket supports. In this blog post, I’ll demonstrate how that same technique can be generalized to support runtime bindings and internal definitions, two key concepts useful if intending to develop a more featureful language than Hackett’s intentionally-restrictive type system.

What are internal definitions?

This blog post is going to be largely focused on how to properly implement a form that handles the expansion of internal definitions in Racket. This is a tricky topic to get right, but before we can discuss internal definitions, we have to establish what definitions themselves are and how they relate to other binding forms.

In a traditional Lisp, there are two kinds of bindings: top-level bindings and local bindings. In Scheme and its descendants, this distinction is characterized by two different binding forms, define and let. To a first approximation, define is used for defining top-level, global bindings, and it resembles variable definitions in many mainstream languages in the sense that definitions using define are not really expressions. They don’t produce a value, they define a new binding. Definitions written with define look like this:

(define x 42)
(define y "hello")

Each definition is made up of two parts: the binding identifier, in this case x and y, and the right hand side, or RHS for short. Each RHS is a single expression that will be evaluated and used as the value for the introduced binding.

In Scheme and Racket, define also supports a shorthand form for defining functions in a natural syntax without the explicit need to write lambda, which looks like this:

(define (double x)
  (* x 2))

However, this is just syntactic sugar. The above form is really just a macro for the following equivalent, expanded version:

(define double
  (lambda (x) (* x 2)))

Since we only care about fully-expanded programs, we’ll focus exclusively on the expanded version of define in this blog post, since if we handle that, we’ll also handle the function shorthand’s expansion.

In contrast to define, there is also let, which has a rather different shape. A let form is an expression, and it creates local bindings in a delimited scope:

(let ([x 2]
      [y 3])
  (+ x y))

The binding clauses of a let expression are known as the binding pairs, and the sequence of expressions afterwards are known as the body of the let. Each binding pair consists of a binding identifier and a RHS, just like a top-level definition created with define, but while define is a standalone form, the binding pairs cannot meaningfully exist outside of a let—they are recognized as part of the grammar of the let form itself.

Like other Lisps, Racket distinguishes between top-level—or, more precisely, module-level—bindings and local bindings. A module-level binding can be exported using provide, which will allow other modules to access the binding by importing the module with require. Such definitions are treated specially by the macroexpander, compiler, and runtime system alike. There is a pervasive, meaningful difference between module-level definitions and local definitions besides simply scope.

I am making an effort to make this as clear as possible before discussing internal definitions because without it, the following point can be rather confusing: internal definitions are written using define, but they are local bindings, not module-level ones! In Racket, define is allowed to appear in the body of virtually all block forms like let, so the following is a legal program:

(let ()
  (define x 2)
  (define y 3)
  (+ x y))

This program is equivalent to the one expressed using let. In fact, when the Racket macroexpander expands these local uses of define, it actually translates them into uses of letrec. After expanding the above expression, it would look closer to the following:

(let ()
  (letrec ([x 2]
           [y 3])
    (+ x y)))

In this sense, define is a form with a double life in Racket. When used at the module level, it creates module-level definitions, which remain in a fully-expanded program and can be imported by other modules. When used inside local blocks, it creates internal definitions, which do not remain in fully expanded programs, since they are translated into recursive local binding forms.
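To see the recursive behavior of internal definitions concretely, here is a minimal sketch. The names parity, even-steps?, and odd-steps? are made up for illustration. Because internal definitions become recursive local bindings, each helper may refer to the other regardless of the order in which they are written:

```racket
#lang racket/base

;; Hypothetical example: two internal definitions that refer to each
;; other, which only works because they are bound recursively.
(define (parity n)
  (define (even-steps? k) (if (zero? k) #t (odd-steps? (sub1 k))))
  (define (odd-steps? k) (if (zero? k) #f (even-steps? (sub1 k))))
  (if (even-steps? n) 'even 'odd))

(parity 10) ; evaluates to 'even
(parity 7)  ; evaluates to 'odd
```

Neither helper is visible outside the body of parity, which is what distinguishes these definitions from module-level ones.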

In this blog post, we will ignore module-level definitions. Like in the previous blog post, we will focus exclusively on expanding expressions, not whole modules. However, we will extend our language to allow internal definitions inside local binding forms, and we will translate them into letrec forms in the same way as the Racket macroexpander.

Revisiting and generalizing the expression expander

In the previous blog post, our expander expanded types, which were essentially expressions from the perspective of the Racket macroexpander. We wrote a syntax class that handled the parsing of a restricted type grammar that disallowed most Racket-level expression forms, like begin, if, #%plain-lambda, and quote. After all, Hackett is not dependently-typed, and it disallows explicit type abstraction to preserve type inference, so it would be a very bad thing if we allowed if or explicit lambda abstraction to appear in our types. For this blog post, however, we will restructure the type expander to handle the full grammar of expressions permitted by Racket.

While the syntax class approach used in the previous blog post was cute, this blog post will use ordinary functions defined at phase 1 instead of syntax classes. In practice, this provides superior error reporting, since it reports syntax errors in terms of the form that went wrong, not the form prior to expansion. Since we can still use syntax-parse to parse the arguments to these functions, we don’t lose any expressive power in the expression of our pattern language.

To start, we’ll extract the call to local-expand into its own function. This corresponds to the type syntax class from the previous blog post, but we’ll use phase 1 parameters to avoid threading so many explicit function arguments around:

(begin-for-syntax
  (define current-context (make-parameter #f))
  (define current-stop-list (make-parameter (list #'define-values #'define-syntaxes)))
  (define current-intdef-ctx (make-parameter #f))

  (define (current-expand stx)
    (local-expand stx
                  (current-context)
                  (current-stop-list)
                  (current-intdef-ctx))))

Due to the way local-expand implicitly extends the stop list, as discussed in the previous blog post, we can initialize the stop list to a list containing just define-values and define-syntaxes, and the other forms we care about will be included automatically.
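For readers less familiar with parameters, the following sketch shows the general mechanism the phase 1 parameters above rely on, using an ordinary runtime parameter. The names current-depth and describe-depth are hypothetical and not part of the expander:

```racket
#lang racket/base

;; Hypothetical parameter, analogous in spirit to current-context above.
(define current-depth (make-parameter 0))

;; Reads the parameter instead of taking the depth as an argument.
(define (describe-depth)
  (format "depth ~a" (current-depth)))

(describe-depth)                    ; "depth 0"
(parameterize ([current-depth 3])
  (describe-depth))                 ; "depth 3"
(describe-depth)                    ; "depth 0" again
```

The value installed by parameterize is visible only during the dynamic extent of its body, which is why it is a convenient substitute for threading extra arguments through recursive calls.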

Next, we’ll use this function to implement an expand-expression function, which will emulate the way the expander expands a single expression, as the name implies. We’ll ignore any custom core forms for now, so we’ll just focus exclusively on the Racket core forms.

A few of Racket’s core forms are not actually subject to any expansion at all, and they expand to themselves. These forms are quote, quote-syntax, and #%variable-reference. Additionally, #%top is not something useful to handle ourselves, since it involves no recursive expansion, so we’ll treat it as if it expands to itself as well and allow the expander to raise any unbound identifier errors it produces. Here’s what the expand-expression function looks like when exclusively handling these things:

(define (expand-expression stx)
  (syntax-parse (parameterize ([current-context 'expression])
                  (current-expand stx))
    #:literal-sets [kernel-literals]
    [({~or quote quote-syntax #%top #%variable-reference} ~! . _)
     this-syntax]))

Another set of Racket core forms are simple expressions which contain subforms, all of which are themselves expressions. These forms include things like #%expression, begin, and if, and they can be expanded recursively. We’ll add another clause to handle these, which can be written with a straightforward recursive call to expand-expression:

[({~and head {~or #%expression #%plain-app begin begin0 if with-continuation-mark}} ~! form ...)
 #:with [form* ...] (map expand-expression (attribute form))
 (syntax/loc/props this-syntax
   (head form* ...))]

Another easy form to handle is set!, since it also requires simple recursive expansion, but it can’t be handled in the same way as the above forms since one of its subforms (the variable to mutate) should not be expanded. It needs another small clause:

[(head:set! ~! x:id rhs)
 (quasisyntax/loc/props this-syntax
   (head x #,(expand-expression #'rhs)))]

The other expressions are harder, since they’re all the binding forms. Fully-expanded Racket code has four local binding forms: #%plain-lambda, case-lambda, let-values, and letrec-values. Additionally, as discussed in the previous blog post, local-expand can also produce letrec-syntaxes+values forms, which result from local syntax bindings. In the type expander, we disallowed runtime bindings from appearing in the resulting program, so we removed letrec-syntaxes+values entirely in our expansion, but in the case of handling arbitrary Racket programs, we actually want to leave a letrec-values form behind to hold any runtime bindings (i.e. the values part of letrec-syntaxes+values).

We’ll start with #%plain-lambda, which is the simplest of the five aforementioned binding forms. It binds a sequence of identifiers at runtime, and they are in scope within the body of the lambda expression. Just as we created and used an internal-definition context to hold the bindings of a letrec-syntaxes+values form in the previous blog post, we’ll do the same for Racket’s other binding forms as well:

[(head:#%plain-lambda ~! [x:id ...] body ...)
 #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
       (syntax-local-bind-syntaxes (attribute x) #f intdef-ctx)]
 #:with [x* ...] (internal-definition-context-introduce intdef-ctx #'[x ...])
 #:with [body* ...] (parameterize ([current-intdef-ctx intdef-ctx])
                      (map expand-expression (attribute body)))
 (syntax/loc/props this-syntax
   (head [x* ...] body* ...))]

However, the above handling of #%plain-lambda isn’t quite right, since the argument list can also include a “rest argument” binding in addition to a sequence of positional arguments. To accommodate this, we can introduce a simple syntax class that handles the different permutations:

(begin-for-syntax
  (define-syntax-class plain-formals
    #:description "formals"
    #:attributes [[id 1]]
    #:commit
    [pattern (id:id ...)]
    [pattern (id*:id ... . id**:id) #:with [id ...] #'[id* ... id**]]))
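As a quick sanity check of the syntax class, the sketch below uses it at phase 0 (outside of begin-for-syntax) so it can be run directly. The helper formal-names is hypothetical and exists only for this demonstration:

```racket
#lang racket/base
(require syntax/parse)

;; The same syntax class as above, defined at phase 0 for experimentation.
(define-syntax-class plain-formals
  #:description "formals"
  #:attributes [[id 1]]
  #:commit
  [pattern (id:id ...)]
  [pattern (id*:id ... . id**:id) #:with [id ...] #'[id* ... id**]])

;; Hypothetical helper: returns the names of all bound identifiers.
(define (formal-names stx)
  (syntax-parse stx
    [formals:plain-formals
     (map syntax-e (attribute formals.id))]))

(formal-names #'(a b c))      ; '(a b c)
(formal-names #'(a b . rest)) ; '(a b rest)
```

In both cases the id attribute collects every identifier the formals bind, so callers do not need to care whether a rest argument was present.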

Now we can use this to adjust #%plain-lambda to handle rest arguments:

[(head:#%plain-lambda ~! formals:plain-formals body ...)
 #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
       (syntax-local-bind-syntaxes (attribute formals.id) #f intdef-ctx)]
 #:with formals* (internal-definition-context-introduce intdef-ctx #'formals)
 #:with [body* ...] (parameterize ([current-intdef-ctx intdef-ctx])
                      (map expand-expression (attribute body)))
 (syntax/loc/props this-syntax
   (head formals* body* ...))]

Next, we’ll handle case-lambda. As it turns out, expanding case-lambda is almost exactly the same as expanding #%plain-lambda, except that it has multiple clauses. Since each clause is expanded identically to the body of a #%plain-lambda, and it even has the same shape, the clauses can be extracted into a separate syntax class to share code between the two forms:

(begin-for-syntax
  (define-syntax-class lambda-clause
    #:description #f
    #:attributes [expansion]
    #:commit
    [pattern [formals:plain-formals body ...]
             #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
                   (syntax-local-bind-syntaxes (attribute formals.id) #f intdef-ctx)]
             #:with formals* (internal-definition-context-introduce intdef-ctx #'formals)
             #:with [body* ...] (parameterize ([current-intdef-ctx intdef-ctx])
                                  (map expand-expression (attribute body)))
             #:attr expansion #'[formals* body* ...]]))

Now, both #%plain-lambda and case-lambda can be handled in a few lines of code each:

[(head:#%plain-lambda ~! . clause:lambda-clause)
 (syntax/loc/props this-syntax
   (head . clause.expansion))]

[(head:case-lambda ~! clause:lambda-clause ...)
 (syntax/loc/props this-syntax
   (head clause.expansion ...))]

Finally, we need to tackle the three let forms. None of these involve any fundamentally new ideas, but they are a little bit more involved than the variants of lambda due to the need to handle the RHSs. Each variant is slightly different, but not dramatically so: the bindings aren’t in scope when expanding the RHSs of let-values, but they are for letrec-values and letrec-syntaxes+values, and letrec-syntaxes+values creates transformer bindings and must evaluate some RHSs in phase 1 while let-values and letrec-values exclusively bind runtime bindings. It would be possible to implement these three forms in separate clauses, but since we’d ideally like to duplicate as little code as possible, we can write a rather elaborate syntax/parse pattern to handle all three binding forms all at once.
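The scoping difference between let-values and letrec-values described above can be observed directly. This is a small illustrative sketch; the names let-result and letrec-result are made up for the example:

```racket
#lang racket/base

(define x 'outer)

;; let-values: the RHS `x` is outside the scope of the new binding,
;; so it refers to the module-level x.
(define let-result
  (let-values ([(x) 'inner]
               [(y) x])
    y))

;; letrec-values: the new bindings are in scope in every RHS,
;; so the RHS `x` refers to the local x bound just before it.
(define letrec-result
  (letrec-values ([(x) 'inner]
                  [(y) x])
    y))

let-result    ; evaluates to 'outer
letrec-result ; evaluates to 'inner
```

This is exactly the distinction our expander has to respect when deciding which definition context to use while expanding each RHS.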

We’ll start by handling let-values alone to keep things simple:

[(head:let-values ~! ([(x:id ...) rhs] ...) body ...)
 #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
       (syntax-local-bind-syntaxes (append* (attribute x)) #f intdef-ctx)]
 #:with [[x* ...] ...] (internal-definition-context-introduce intdef-ctx #'[[x ...] ...])
 #:with [rhs* ...] (map expand-expression (attribute rhs))
 #:with [body* ...] (parameterize ([current-intdef-ctx intdef-ctx])
                      (map expand-expression (attribute body)))
 (syntax/loc/props this-syntax
   (head ([(x* ...) rhs*] ...) body* ...))]

This isn’t dramatically different from the implementation of #%plain-lambda. The only difference is that we have to recursively invoke expand-expression on the RHSs in addition to expanding the body expressions. To handle letrec-values in the same clause, however, we’ll have to get a little more creative.

So far, we haven’t actually tapped very far into syntax/parse’s pattern language over the course of these two blog posts. The full language available to patterns is rather extensive, and we can take advantage of that to write a modification of the above clause that handles both let-values and letrec-values at once:

[({~or {~and head:let-values {~bind [rec? #f]}}
       {~and head:letrec-values {~bind [rec? #t]}}}
  ~! ([(x:id ...) rhs] ...) body ...)
 #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
       (syntax-local-bind-syntaxes (append* (attribute x)) #f intdef-ctx)]
 #:with [[x* ...] ...] (internal-definition-context-introduce intdef-ctx #'[[x ...] ...])
 #:with [rhs* ...] (if (attribute rec?)
                       (parameterize ([current-intdef-ctx intdef-ctx])
                         (map expand-expression (attribute rhs)))
                       (map expand-expression (attribute rhs)))
 #:with [body* ...] (parameterize ([current-intdef-ctx intdef-ctx])
                      (map expand-expression (attribute body)))
 (syntax/loc/props this-syntax
   (head ([(x* ...) rhs*] ...) body* ...))]

The ~bind pattern allows us to explicitly control how attributes are bound as part of the pattern-matching process, which allows us to track when we want to enable the recursive binding behavior of letrec-values in our handler code. Since the vast majority of the logic is otherwise identical, this is a significant improvement over duplicating the clause.
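To isolate what ~bind is doing, here is a stripped-down sketch that runs at phase 0. It matches two made-up keywords, plain-let and rec-let, as datum literals rather than real binding forms, and the function classify is hypothetical:

```racket
#lang racket/base
(require syntax/parse)

;; Hypothetical example: each alternative of ~or matches a different
;; keyword and uses ~bind to record which one it was in rec?.
(define (classify stx)
  (syntax-parse stx
    #:datum-literals [plain-let rec-let]
    [({~or {~and plain-let {~bind [rec? #f]}}
           {~and rec-let {~bind [rec? #t]}}}
      body ...)
     (list (attribute rec?) (length (attribute body)))]))

(classify #'(plain-let a b)) ; '(#f 2)
(classify #'(rec-let a))     ; '(#t 1)
```

The rest of the clause is shared between both alternatives, and the handler code consults (attribute rec?) only where the behavior needs to differ.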

Adding support for letrec-syntaxes+values is done in the same general way, but the pattern is even more involved. In addition to tracking whether or not the bindings are recursive, we have to track if any syntax bindings were present at all, and if they were, bind them with syntax-local-bind-syntaxes:

[({~or {~or {~and head:let-values ~! {~bind [rec? #f] [stxs? #f]}}
            {~and head:letrec-values ~! {~bind [rec? #t] [stxs? #f]}}}
       {~seq head:letrec-syntaxes+values {~bind [rec? #t] [stxs? #t]}
             ~! ([(x/s:id ...) rhs/s] ...)}}
  ([(x:id ...) rhs] ...) body ...)
 #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
       (syntax-local-bind-syntaxes (append* (attribute x)) #f intdef-ctx)
       (when (attribute stxs?)
         (for ([xs/s (in-list (attribute x/s))]
               [rhs/s (in-list (attribute rhs/s))])
           (syntax-local-bind-syntaxes xs/s rhs/s intdef-ctx)))]
 #:with [[x* ...] ...] (internal-definition-context-introduce intdef-ctx #'[[x ...] ...])
 #:with [rhs* ...] (if (attribute rec?)
                       (parameterize ([current-intdef-ctx intdef-ctx])
                         (map expand-expression (attribute rhs)))
                       (map expand-expression (attribute rhs)))
 #:with [body* ...] (parameterize ([current-intdef-ctx intdef-ctx])
                      (map expand-expression (attribute body)))
 (if (attribute stxs?)
     (~> (syntax/loc this-syntax
           (letrec-values ([(x* ...) rhs*] ...) body* ...))
         (syntax-track-origin this-syntax #'head))
     (syntax/loc/props this-syntax
       (head ([(x* ...) rhs*] ...) body* ...)))]

This behemoth clause handles all three varieties of let forms that can appear in the result of local-expand. Notably, in the letrec-syntaxes+values case, we expand into letrec-values, since the transformer bindings are effectively erased, and we use syntax-track-origin to record that the result originally came from a use of letrec-syntaxes+values.

With these clauses, we’ve handled all the special forms that can appear in expression position in Racket’s kernel language. To tie things off, we just need to handle variable references, which are represented by bare identifiers not bound to syntax, and literal data, like numbers or strings. We can add one more clause at the end to handle those:

[_
 this-syntax]

Putting them all together, our expand-expression function looks as follows:

(begin-for-syntax
  (define (expand-expression stx)
    (syntax-parse (parameterize ([current-context 'expression])
                    (current-expand stx))
      #:literal-sets [kernel-literals]
      [({~or quote quote-syntax #%top #%variable-reference} ~! . _)
       this-syntax]

      [({~and head {~or #%expression #%plain-app begin begin0 if with-continuation-mark}} ~! form ...)
       #:with [form* ...] (map expand-expression (attribute form))
       (syntax/loc/props this-syntax
         (head form* ...))]

      [(head:set! ~! x:id rhs)
       (quasisyntax/loc/props this-syntax
         (head x #,(expand-expression #'rhs)))]

      [(head:#%plain-lambda ~! . clause:lambda-clause)
       (syntax/loc/props this-syntax
         (head . clause.expansion))]

      [(head:case-lambda ~! clause:lambda-clause ...)
       (syntax/loc/props this-syntax
         (head clause.expansion ...))]

      [({~or {~or {~and head:let-values ~! {~bind [rec? #f] [stxs? #f]}}
                  {~and head:letrec-values ~! {~bind [rec? #t] [stxs? #f]}}}
             {~seq head:letrec-syntaxes+values {~bind [rec? #t] [stxs? #t]}
                   ~! ([(x/s:id ...) rhs/s] ...)}}
        ([(x:id ...) rhs] ...) body ...)
       #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
             (syntax-local-bind-syntaxes (append* (attribute x)) #f intdef-ctx)
             (when (attribute stxs?)
               (for ([xs/s (in-list (attribute x/s))]
                     [rhs/s (in-list (attribute rhs/s))])
                 (syntax-local-bind-syntaxes xs/s rhs/s intdef-ctx)))]
       #:with [[x* ...] ...] (internal-definition-context-introduce intdef-ctx #'[[x ...] ...])
       #:with [rhs* ...] (if (attribute rec?)
                             (parameterize ([current-intdef-ctx intdef-ctx])
                               (map expand-expression (attribute rhs)))
                             (map expand-expression (attribute rhs)))
       #:with [body* ...] (parameterize ([current-intdef-ctx intdef-ctx])
                            (map expand-expression (attribute body)))
       (if (attribute stxs?)
           (~> (syntax/loc this-syntax
                 (letrec-values ([(x* ...) rhs*] ...) body* ...))
               (syntax-track-origin this-syntax #'head))
           (syntax/loc/props this-syntax
             (head ([(x* ...) rhs*] ...) body* ...)))]

      [_
       this-syntax])))

If we try it out, we’ll see that it really does work! Even complicated local binding forms are handled properly by our expander:

> (expand-expression
   #'(let ([x 42])
       (letrec-syntax ([y (make-rename-transformer #'z)]
                       [z (make-rename-transformer #'x)])
         (+ y 3))))
#<syntax (let-values (((x) '42))
           (letrec-values ()
             (#%plain-app + x '3)))>

We are now able to expand arbitrary Racket expressions in the same way that the expander does. While this might not seem immediately useful—after all, we haven’t actually gained anything here over just calling local-expand with an empty stop list—we can use this as the basis of an expander that can extensibly handle custom core forms, which I may cover in a future blog post.

Adding support for internal definitions

In the previous section, we defined an expander that could expand arbitrary Racket expressions, but our expander is still incomplete: it does not support internal definitions. For all forms that have bodies, including #%plain-lambda, case-lambda, let-values, letrec-values, and letrec-syntaxes+values, Racket permits the use of internal definitions.

In practice, internal-definition contexts allow for an increased degree of modularity compared to traditional local binding forms, since they provide an extensible binding language. Users may mix many different binding forms within a single definition context, such as define, define-syntax, match-define, and even struct. However, this means the rewriting process described earlier in this blog post is not as simple as detecting the definitions and lifting them into a local binding form, since it’s not immediately apparent which forms are binding forms and which are expressions!

For this reason, expanding internal-definition contexts happens to be a nontrivial problem in itself. It involves a little more care than expanding expressions does, since it requires using partial expansion to discover which forms are definitions and which forms are expressions. We must take care to never expand too much, but also to expand enough that we reveal all uses of define-values and define-syntaxes (which all definition forms eventually expand into). We also must handle the splicing behavior of begin, which is necessary to allow single forms to expand into multiple definitions.
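The idea of partial expansion can be tried in isolation with a small macro. The sketch below is not part of the expander; classify-form is a hypothetical name. It calls local-expand with a fresh internal-definition context and the same two-element stop list, then reports which kind of form came back:

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; Hypothetical macro: partially expands `form` in a fresh
;; internal-definition context and reports what kind of form it found.
(define-syntax (classify-form stx)
  (syntax-parse stx
    [(_ form)
     (define partially-expanded
       (local-expand #'form
                     (list (gensym))
                     (list #'define-values #'define-syntaxes)))
     (syntax-parse partially-expanded
       #:literal-sets [kernel-literals]
       [(define-values ~! . _)   #''definition]
       [(define-syntaxes ~! . _) #''syntax-definition]
       [(begin ~! . _)           #''begin]
       [_                        #''expression])]))

(define classifications
  (list (classify-form (define x 1))
        (classify-form (define-syntax m (lambda (stx) #'1)))
        (classify-form (begin (define a 1) (define b 2)))
        (classify-form (+ 1 2))))

classifications ; '(definition syntax-definition begin expression)
```

Note that the forms are only inspected and then discarded, so nothing is actually defined. The real expand-body loop has to go further and record each binding in the definition context as it is discovered.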

We’ll start by writing an expand-body function, which operates similarly to our previous expand-expression function. Unlike expand-expression, expand-body will accept a list of syntax objects, which represents the sequence of forms that make up the body. Logically, each body will create a first-class definition context with syntax-local-make-definition-context to represent the sequence of definitions:

(begin-for-syntax
  (define (expand-body stxs)
    (define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
    (parameterize ([current-context (list (gensym))]
                   [current-intdef-ctx intdef-ctx])
      )))

The bulk of our expand-body function will be a loop that partially expands body forms, adds definitions to the definition context as it discovers them, and returns the expressions and runtime definitions to be rewritten into binding pairs for a letrec-values form. Additionally, the loop will also track so-called disappeared uses and disappeared bindings, which are attached to the expansion using syntax properties to allow tools like DrRacket to learn about the binding structure of phase 1 definitions that are erased as part of macroexpansion.

The skeleton of this loop is relatively straightforward to write. We will iterate over the syntax objects that make up the body, expand them, and process the expansion using syntax-parse:

(begin-for-syntax
  (define (expand-body stxs)
    (define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
    (parameterize ([current-context (list (gensym))]
                   [current-intdef-ctx intdef-ctx])
      (define-values [binding-clauses exprs disappeared-uses disappeared-bindings]
        (let loop ([stxs stxs]
                   [binding-clauses '()]
                   [exprs '()]
                   [disappeared-uses '()]
                   [disappeared-bindings '()])
          (if (empty? stxs)
              (values (reverse binding-clauses) (reverse exprs) disappeared-uses disappeared-bindings)
              (syntax-parse (current-expand (first stxs))
                #:literal-sets [kernel-literals]
                )))))))

The hard part, of course, is actually handling the potential results of that expansion. We need to handle three forms specially: begin, define-values, and define-syntaxes. All other results of partial expansion will be treated as expressions. We’ll start by handling begin, since it’s the simplest case; we only need to prepend the subforms to the list of body forms to be processed, then continue looping:

[(head:begin ~! form ...)
 (loop (append (attribute form) stxs) binding-clauses exprs
       disappeared-uses disappeared-bindings)]

However, as is often the case, this isn’t quite perfect, since the information that these forms came from a surrounding begin is lost, which tools like DrRacket want to know. To solve this problem, the expander adjusts the origin property for all spliced forms, which we can mimic using syntax-track-origin:

[(head:begin ~! form ...)
 (loop (append (for/list ([form (in-list (attribute form))])
                 (syntax-track-origin form this-syntax #'head))
               stxs)
       binding-clauses exprs disappeared-uses disappeared-bindings)]
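The splicing behavior being emulated here is easy to observe from the user's side. In this sketch, define-both is a hypothetical macro that expands into a begin containing two definitions, both of which become visible in the enclosing body:

```racket
#lang racket/base

;; Hypothetical macro that expands into two definitions wrapped in begin.
(define-syntax-rule (define-both a b expr)
  (begin
    (define a expr)
    (define b expr)))

(define total
  (let ()
    (define-both x y 5)
    (+ x y)))

total ; evaluates to 10
```

If begin did not splice into the surrounding definition context, x and y would not be in scope in the final expression.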

This is sufficient for begin, so we can move onto the actual definitions themselves. This actually isn’t too hard, since we just need to add the bindings we discover to the first-class definition context and preserve define-values bindings as binding pairs:

[(head:define-values ~! [x:id ...] rhs)
 #:do [(syntax-local-bind-syntaxes (attribute x) #f intdef-ctx)]
 (loop (rest stxs) (cons #'[(x ...) rhs] binding-clauses) exprs
       disappeared-uses disappeared-bindings)]

This solution is missing one thing, however: a call to syntax-local-identifier-as-binding to remove any use-site scopes that were added to the binding identifier while expanding the binding form in the definition context. Explaining precisely why this is necessary is outside the scope of this blog post; it is best understood by reading the section on use-site scopes in Bindings as Sets of Scopes, the paper that describes the theory behind Racket’s current macro system. In any case, the impact on our implementation is small:

[(head:define-values ~! [x:id ...] rhs)
 #:with [x* ...] (map syntax-local-identifier-as-binding (attribute x))
 #:do [(syntax-local-bind-syntaxes (attribute x*) #f intdef-ctx)]
 (loop (rest stxs) (cons #'[(x* ...) rhs] binding-clauses) exprs
       disappeared-uses disappeared-bindings)]

Finally, as with begin, we want to track that the binding pairs we generate actually came from a use of define-values (which in turn likely came from a use of some other definition form). Therefore, we’ll add another use of syntax-track-origin to copy and extend the necessary properties:

[(head:define-values ~! [x:id ...] rhs)
 #:with [x* ...] (map syntax-local-identifier-as-binding (attribute x))
 #:do [(syntax-local-bind-syntaxes (attribute x*) #f intdef-ctx)]
 (loop
  (rest stxs)
  (cons (syntax-track-origin #'[(x* ...) rhs] this-syntax #'head) binding-clauses)
  exprs disappeared-uses disappeared-bindings)]

That’s it for define-values. All that’s left is to handle define-syntaxes, which is quite similar, but instead of storing the definition in a binding pair, its RHS is immediately evaluated and added to the definition context using syntax-local-bind-syntaxes:

[(head:define-syntaxes ~! [x:id ...] rhs)
 #:with [x* ...] (map syntax-local-identifier-as-binding (attribute x))
 #:do [(syntax-local-bind-syntaxes (attribute x*) #'rhs intdef-ctx)]
 (loop (rest stxs) binding-clauses exprs
       (cons #'head disappeared-uses) (cons (attribute x*) disappeared-bindings))]

As the above snippet indicates, this is also where the disappeared uses and disappeared bindings come in. In previous cases, we’ve used syntax-track-origin to indicate that a piece of syntax was the result of expanding a different piece of syntax, but in this case, define-syntaxes doesn’t expand into anything at all; it’s simply removed from the expansion entirely. Therefore, we need to resort to tracking the information in syntax properties on the resulting letrec-values form, so we’ll save them for later.
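As a reminder of how syntax properties behave, here is a minimal sketch that attaches and reads back a property by hand. It assumes the 'disappeared-use and 'disappeared-binding keys, which are the ones Check Syntax consults, and the name annotated is made up for the example:

```racket
#lang racket/base

;; Attach a hypothetical list of disappeared uses to a piece of syntax.
(define annotated
  (syntax-property #'(letrec-values () 'body)
                   'disappeared-use
                   (list #'define-syntaxes)))

(map syntax-e (syntax-property annotated 'disappeared-use)) ; '(define-syntaxes)
(syntax-property annotated 'disappeared-binding) ; #f, since it was never set
```

Properties travel with the syntax object without affecting how it is expanded or evaluated, which makes them a natural place to store information about forms that no longer exist in the output.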

Finally, to finish things up, we can add a catchall clause that handles all other forms, which are now guaranteed to be expressions:

[_
 (loop (rest stxs) binding-clauses (cons this-syntax exprs)
       disappeared-uses disappeared-bindings)]

This completes our loop that processes definition forms, so all that’s left to do is handle the results. The only significant remaining work is to actually expand the RHSs of the binding pairs we collected and the body expressions, which can be done by calling our own expand-expression function directly:

(define expanded-binding-clauses
  (for/list ([binding-clause (in-list binding-clauses)])
    (syntax-parse binding-clause
      [[(x ...) rhs]
       (quasisyntax/loc/props this-syntax
         [(x ...) #,(expand-expression #'rhs)])])))
(define expanded-exprs (map expand-expression exprs))

Finally, we can assemble all the pieces together into a single local binding form with the appropriate syntax properties:

(~> #`(letrec-values #,expanded-binding-clauses #,@expanded-exprs)
    (syntax-property 'disappeared-use disappeared-uses)
    (syntax-property 'disappeared-binding disappeared-bindings))

That’s it. We’ve now written an expand-body function that can process internal definition contexts in the same way that the macroexpander does. Overall, the whole function is about 45 lines of code:

(begin-for-syntax
  (define (expand-body stxs)
    (define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
    (parameterize ([current-context (list (gensym))]
                   [current-intdef-ctx intdef-ctx])
      (define-values [binding-clauses exprs disappeared-uses disappeared-bindings]
        (let loop ([stxs stxs]
                   [binding-clauses '()]
                   [exprs '()]
                   [disappeared-uses '()]
                   [disappeared-bindings '()])
          (if (empty? stxs)
              (values (reverse binding-clauses) (reverse exprs) disappeared-uses disappeared-bindings)
              (syntax-parse (current-expand (first stxs))
                #:literal-sets [kernel-literals]
                [(head:begin ~! form ...)
                 (loop (append (for/list ([form (in-list (attribute form))])
                                 (syntax-track-origin form this-syntax #'head))
                               stxs)
                       binding-clauses exprs disappeared-uses disappeared-bindings)]
                [(head:define-values ~! [x:id ...] rhs)
                 #:with [x* ...] (map syntax-local-identifier-as-binding (attribute x))
                 #:do [(syntax-local-bind-syntaxes (attribute x*) #f intdef-ctx)]
                 (loop
                  (rest stxs)
                  (cons (syntax-track-origin #'[(x* ...) rhs] this-syntax #'head) binding-clauses)
                  exprs disappeared-uses disappeared-bindings)]
                [(head:define-syntaxes ~! [x:id ...] rhs)
                 #:with [x* ...] (map syntax-local-identifier-as-binding (attribute x))
                 #:do [(syntax-local-bind-syntaxes (attribute x*) #'rhs intdef-ctx)]
                 (loop (rest stxs) binding-clauses exprs
                       (cons #'head disappeared-uses) (cons (attribute x*) disappeared-bindings))]
                [_
                 (loop (rest stxs) binding-clauses (cons this-syntax exprs)
                       disappeared-uses disappeared-bindings)]))))
      (define expanded-binding-clauses
        (for/list ([binding-clause (in-list binding-clauses)])
          (syntax-parse binding-clause
            [[(x ...) rhs]
             (quasisyntax/loc/props this-syntax
               [(x ...) #,(expand-expression #'rhs)])])))
      (define expanded-exprs (map expand-expression exprs))
      (~> #`(letrec-values #,expanded-binding-clauses #,@expanded-exprs)
          (syntax-property 'disappeared-use disappeared-uses)
          (syntax-property 'disappeared-binding disappeared-bindings)))))

The next step is to actually use this function. We need to replace certain recursive calls to expand-expression with calls to expand-body, but if we do this naïvely, we’ll have some problems. Currently, when we expand body forms, they’re always immediately inside another definition context (i.e. the bindings introduced by lambda formals or by let binding pairs), but they haven’t actually been expanded in that context yet. When we call expand-body, we create a nested context, which will inherit the bindings, but won’t automatically add the parent context’s scope. Therefore, we need to manually call internal-definition-context-introduce on the body syntax objects before calling expand-body. We can write a small helper function to make this easier:

(begin-for-syntax
  (define (expand-body/in-ctx stxs ctx)
    (define (add-ctx-scope stx)
      (internal-definition-context-introduce ctx stx 'add))
    (parameterize ([current-intdef-ctx ctx])
      (add-ctx-scope (expand-body (map add-ctx-scope stxs))))))

Now we just need to replace the relevant calls to expand-expression with calls to expand-body/in-ctx, starting with a minor adjustment to our lambda-clause syntax class from earlier:

(begin-for-syntax
  (define-syntax-class lambda-clause
    #:description #f
    #:attributes [expansion]
    #:commit
    [pattern [formals:plain-formals body ...]
             #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
                   (syntax-local-bind-syntaxes (attribute formals.id) #f intdef-ctx)]
             #:with formals* (internal-definition-context-introduce intdef-ctx #'formals)
             #:with body* (expand-body/in-ctx (attribute body) intdef-ctx)
             #:attr expansion #'[formals* body*]]))

The only other change must occur in the handling of the various let forms, which similarly replaces expand-expression with expand-body/in-ctx:

[({~or {~or {~and head:let-values ~! {~bind [rec? #f] [stxs? #f]}}
            {~and head:letrec-values ~! {~bind [rec? #t] [stxs? #f]}}}
       {~seq head:letrec-syntaxes+values {~bind [rec? #t] [stxs? #t]}
             ~! ([(x/s:id ...) rhs/s] ...)}}
  ([(x:id ...) rhs] ...) body ...)
 #:do [(define intdef-ctx (syntax-local-make-definition-context (current-intdef-ctx)))
       (syntax-local-bind-syntaxes (append* (attribute x)) #f intdef-ctx)
       (when (attribute stxs?)
         (for ([xs/s (in-list (attribute x/s))]
               [rhs/s (in-list (attribute rhs/s))])
           (syntax-local-bind-syntaxes xs/s rhs/s intdef-ctx)))]
 #:with [[x* ...] ...] (internal-definition-context-introduce intdef-ctx #'[[x ...] ...])
 #:with [rhs* ...] (if (attribute rec?)
                       (parameterize ([current-intdef-ctx intdef-ctx])
                         (map expand-expression (attribute rhs)))
                       (map expand-expression (attribute rhs)))
 #:with body* (expand-body/in-ctx (attribute body) intdef-ctx)
 (if (attribute stxs?)
     (~> (syntax/loc this-syntax
           (letrec-values ([(x* ...) rhs*] ...) body*))
         (syntax-track-origin this-syntax #'head))
     (syntax/loc/props this-syntax
       (head ([(x* ...) rhs*] ...) body*)))]

With these changes, we’ve now extended our expression expander with the ability to expand internal definitions. We can see this in action on a simple example:

> (expand-expression
   #'(let ()
       (define x 42)
       (define-syntax y (make-rename-transformer #'z))
       (define-syntax z (make-rename-transformer #'x))
       (+ y 3)))
#<syntax (let-values ()
           (letrec-values ([(x) '42])
             (#%app + x '3)))>

Just as we’d like, the transformer bindings were expanded and subsequently eliminated, and the runtime binding was collected into a letrec-values form. The outer let-values is left over from the outer let, which is needed only to create an internal-definition context to hold our internal definitions.

Putting the expression expander to work

So far, we’ve done a lot of work to emulate the behavior of Racket’s macroexpander, and as the above example demonstrates, we’ve been fairly successful in that goal. However, you might be wondering why we did any of this, as replicating the behavior of local-expand is not very useful on its own. As mentioned above, this can be used as the foundation of an expander for custom core forms that extends, rather than replaces, the built-in Racket core forms. It can also be used to “cheat” and expand through the behavior of the local-expand stop list, which implicitly adds the Racket core forms to any non-empty stop list. Hopefully, I’ll have a chance to cover some of these things more deeply in the future, but for now, I’ll just give a small taste of the latter.

By using the power of our expand-expression function, it’s actually possible to use this kind of expression expander to do genuinely nefarious things, such as hijack the behavior of arbitrary macros! For example, we could do something evil like make for loops run in reverse order by adding for to current-stop-list, then adding an additional special case to expand-expression for for:

(begin-for-syntax
  (define current-stop-list (make-parameter (list #'define-values #'define-syntaxes #'for)))

  (define (expand-expression stx)
    (syntax-parse (parameterize ([current-context 'expression])
                    (current-expand stx))
      #:literal-sets [kernel-literals]
      #:literals [for]
      ; ...
      [(head:for ([x:id seq:expr] ...) body ...+)
       (syntax/loc/props this-syntax
         (head ([x (in-list (reverse (sequence->list seq)))] ...)
           body ...))]
      ; ...
    )))

Amazingly, due to the fact that we’ve taken complete control of the expansion process, this will rewrite uses of for even if they are introduced by macroexpansion. For example, we could write a small macro that expands into a use of for:

(define-simple-macro (print-up-to n)
  (for ([i (in-range n)])
    (println i)))

> (print-up-to 5)
0
1
2
3
4

If we write a wrapper macro that applies our evil version of expand-expression to its body, then wrap a use of our print-up-to macro with it, it will execute the loop in reverse order:

(define-syntax-parser hijack-for-loops
  [(_ form:expr) (expand-expression #'form)])

> (hijack-for-loops
   (print-up-to 5))
4
3
2
1
0

On its own, this is not that impressive, since we could have just used local-expand on the body directly to achieve this. However, what’s remarkable about hijack-for-loops is that it will work even if the for loop is buried deep inside some arbitrary expression:

> (define foo
    (hijack-for-loops
     (lambda (x)
       (define n (* x 2))
       (print-up-to n))))
> (foo 3)
5
4
3
2
1
0
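For comparison, here is a rough sketch of the simpler approach mentioned above, which calls local-expand directly with for in the stop list. The name reverse-outermost-for is made up for this illustration, and print-up-to is redefined with define-syntax-rule so the snippet stands on its own. It only rewrites a for loop that ends up at the head of the expanded form; anything else is returned untouched.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse)
         racket/sequence)

;; Same macro as in the post, written with define-syntax-rule so that
;; this snippet has no extra dependencies.
(define-syntax-rule (print-up-to n)
  (for ([i (in-range n)])
    (println i)))

;; Hypothetical helper: expand the form until `for` reaches the head,
;; then rewrite that single loop to iterate in reverse.
(define-syntax (reverse-outermost-for stx)
  (syntax-parse stx
    [(_ form:expr)
     (syntax-parse (local-expand #'form 'expression (list #'for))
       #:literals [for]
       [(for ([x:id seq:expr] ...) body ...+)
        #'(for ([x (in-list (reverse (sequence->list seq)))] ...)
            body ...)]
       ;; Not a `for` loop at the head: give back the original form.
       [_ #'form])]))

(reverse-outermost-for (print-up-to 3))
```

Running this module should print 2, 1, and 0 on separate lines. A for loop nested inside a lambda body would not be at the head of the expanded form, so this version would leave it alone, which is exactly the gap that hijack-for-loops closes.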

Of course, this example is rather contrived—mucking with for loops like this isn’t useful at all, and nobody would really write print-up-to as a macro, anyway—but there is potential for using this technique to do more interesting things.

Closing thoughts

The system outlined in this blog post is not something I would recommend using in any real macro. It is enormously complicated, requires knowledge well above that of your average working macrologist, and it involves doing rather horrible things to the macro system, things it was undoubtedly never designed to do. Still, I believe this blog post is useful, for a few different reasons:

The technology outlined in this post, while perhaps not directly applicable to existing real-world problems, provides a framework for implementing various new kinds of syntax transformations in Racket without extending the macro system. It demonstrates the expressive power of the macro system, and it hopefully lays the foundation for a better, more high-level interface for users who wish to define their own languages with custom core forms.
This system provides insight into the way the Racket macroexpander operates, in terms of the userspace syntax API. The canonical existing model of hygienic macroexpansion, in the aforementioned Bindings as Sets of Scopes paper, does not explain the workings of internal definition contexts in detail, and it certainly doesn’t explain them in terms that a Racket programmer would already be familiar with. By reencoding those ideas within the macro system itself, an advanced macro writer may be able to more easily connect concepts in the macro system’s implementation to concepts they have already been exposed to.
The capability of the proof-of-concept outlined here demonstrates that the limitation imposed by the existing implementation of the stop list (namely, the way it is implicitly extended with additional identifiers) is essentially artificial, and it can be hacked around with sufficient (albeit significant) effort. This isn’t enormously important, but it is somewhat relevant to a recent debate in a GitHub issue about the handling of the local-expand stop list.
Finally, for myself as much as anyone else, this implementation records in a concise way (perhaps overly concise at times) the collection of very subtle details I’ve learned over the past six months about how information is preserved and propagated during the expansion process.

This blog post is not for everybody. If you made it to the end, give yourself a pat on the back. If you made it to the end and understood everything you read: congratulations, you are a certified expert in Racket macro programming. If not, do not fear, and do not lose hope—I plan for something significantly more mellow next time.

As always, I’d like to give thanks to the people who contributed significantly, if indirectly, to the contents of this blog post, namely Matthew Flatt, Michael Ballantyne, and Ryan Culpepper. And finally, for those interested, all of the code in this blog post can be found in a runnable form in this GitHub gist.

Reimplementing Hackett’s type language: expanding to custom core forms in Racket

2018-04-15T00:00:00Z

In the past couple of weeks, I completely rewrote the implementation of Hackett’s type language to improve the integration between the type representation and Racket’s macro system. The new type language effectively implements a way to reuse as much of the Racket macroexpanding infrastructure as possible while expanding a completely custom language, which uses a custom set of core forms. The fundamental technique used to do so is not novel, and it seems to be periodically rediscovered, but it has never been published or documented anywhere, and getting it right involves understanding a great number of subtleties about the Racket macro system. While I cannot entirely eliminate the need to understand those subtleties, in this blog post, I hope to make the secret sauce considerably less secret.

This blog post is both a case study on how I implemented the expander for Hackett’s new type language and a discussion of how such a technique can apply more generally. Like my previous blog post on Hackett, which covered the implementation of its namespace system, the implementation section of this blog post is highly technical and probably requires significant experience with Racket’s macro system to completely comprehend. However, the surrounding material is written to be more accessible, so even if you are not a Racket programmer, you should hopefully be able to understand the big ideas behind this change.

What are core forms?

Before we can get started writing custom core forms, we need to understand the meaning of Racket’s plain old core forms. What is a core form? In order to answer that question, we need to think about how Racket’s expansion and compilation model works.

To start, let’s consider a simple Racket program. Racket programs are organized into modules, which are usually written with a #lang line at the top. In this case, we’ll use #lang racket to keep things simple:

#lang racket

(define (add2 x)
  (+ x 2))

(add2 3)

How does Racket see this program? Well, before it can do anything with it, it must parse the program text, which is known in Racket as reading the program. The #lang line controls how the program is read—some #langs provide parsers that allow syntax that is very different from the parser used for #lang racket—but no matter which reader is used, the result is an s-expression (actually a syntax object, but essentially an s-expression) representing a module. In the case of the above program, the result looks like this:

(module m racket
  (#%module-begin
    (define (add2 x)
      (+ x 2))

    (add2 3)))

Note the introduction of #%module-begin. Despite the fancy name, this is really just an ordinary macro provided by the racket language. By convention, the reader and expander cooperate to ensure the body of every module is wrapped with #%module-begin; as we’ll see shortly, this allows languages to add functionality that affects the entire contents of the module.

Once the program has been read, it is subsequently expanded by the macroexpander. As the name implies, this is the phase that expands all the macros in a module. What does the above module look like after expansion? Well, it doesn’t look unrecognizable, but it certainly does look different:

(module m racket
  (#%plain-module-begin
    (define-values (add2)
      (lambda (x) (#%plain-app + x '2)))

    (#%plain-app call-with-values
                 (lambda () (#%plain-app add2 '3))
                 print-values)))

Let’s note the things that changed:

#%module-begin was replaced with #%plain-module-begin. #%plain-module-begin is a binding that wraps the body of every expanded module, and all definitions of #%module-begin in any language must eventually expand to #%plain-module-begin. However, #lang racket’s #%module-begin doesn’t just expand to #%plain-module-begin, it also wraps bare expressions at the top level of a module so that their results are printed. This is why running the above program prints 5 even though there is no code related to printing in the original program!
The function shorthand used with define was converted to an explicit use of lambda, and define itself was expanded to define-values. In Racket, define and define-syntax are really just macros for define-values and define-syntaxes that only bind a single identifier.
All function applications were tagged explicitly with #%plain-app. This syntactically distinguishes function applications from uses of forms like define-values or lambda. It also allows languages to customize function application by providing their own macros named #%app (just like languages can provide their own macros named #%module-begin that expand to #%plain-module-begin), but that is outside the scope of this blog post.
All literals have been wrapped with quote, so 2 became '2 and 3 became '3.

Importantly, the resulting program contains no macros. Such programs are called fully expanded, since all macros have been eliminated and no further expansion can take place.

So what’s left behind? Well, some of the things in the program are literal data, like the numbers 2 and 3. There are also some variable references, x and add2. Most of the program, however, is built out of primitives like module, #%plain-module-begin, #%plain-app, define-values, and lambda. These primitives are core forms—they are not variables, since they do not represent bindings that contain values at runtime, but they are also not macros, since they cannot be expanded any further.

In this sense, a fully-expanded program is just like a program in most languages that do not have macros. Core forms in Racket correspond to the syntax of other languages. We can imagine a JavaScript program similar to the above fully-expanded Racket program:

var add2 =
  function (x) { return x + 2; };

console.log(add2(3));

Just as this JavaScript program is internally transformed into an AST containing a definition node, a function abstraction node, and some function application nodes, a fully-expanded Racket program represents an AST ready to be sent off to be compiled. The Racket compiler has built-in rules for how to compile core forms like define-values, lambda, and #%plain-app, and the result is optimized Racket bytecode.

In the remainder of this blog post, as most discussions of macros do, we’ll ignore the read and compile steps of the Racket program pipeline and focus exclusively on the expand step. It’s useful, however, to keep the other steps in mind, since we’re going to be discussing what it means to implement custom core forms, and core forms really only make sense in the context of the subsequent compilation step that consumes them.

Racket’s default core forms

So, now that we know what core forms are in an abstract sense, what are they in practice? We’ve already encountered module, #%plain-module-begin, #%plain-app, define-values, lambda, and quote, but there are many more. The full list is available in the section of the Racket reference named Fully Expanded Programs, and I will not list all of them here. In general, they are more or less what you’d expect. The list of Racket’s core forms also includes things like define-syntaxes, if, let-values, letrec-values, begin, quote-syntax, and set!. Fundamentally, these correspond to the basic operations the Racket compiler understands, and having such a small, fixed set allows the remainder of Racket’s compilation pipeline to ignore the complexities of macroexpansion.

These forms are fairly versatile, and it’s easy to build high-level abstractions on top of them. For example, #lang racket implements cond as a macro that eventually expands into if, and it implements syntax as a macro that eventually expands into function calls and quote-syntax. The real power comes in the way new macros can be built out of other macros, not just core forms, so Racket’s match can expand into uses of let and cond, and it doesn’t need to concern itself with using let-values and if. For this reason, Racket’s core forms are quite capable of representing any language imaginable, since fully-expanded programs are essentially instructions for the Racket virtual machine, and macros are mini-compilers that can be mixed and matched.
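One way to see this for yourself is to ask the expander for the fully-expanded form of an expression at runtime. The sketch below uses the standard expand function in a fresh racket/base namespace; the helper name expand-to-datum is just something I made up for the example.

```racket
#lang racket/base

;; Fully expand a quoted expression in a fresh racket/base namespace
;; and return the result as a plain s-expression.
(define (expand-to-datum expr)
  (parameterize ([current-namespace (make-base-namespace)])
    (syntax->datum (expand expr))))

;; `let` and `cond` are macros, so neither should survive expansion.
(define expanded
  (expand-to-datum
   '(let ([x 1])
      (cond [(zero? x) 'none]
            [else 'some]))))

expanded
```

The printed result should be built entirely from core forms such as let-values, if, and quote. One caveat: when converted to a datum, the function application core form shows up under the name #%app rather than #%plain-app.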

The need for custom core forms

With that in mind, why might we wish to define custom core forms? In fact, what would such a thing even mean? By their very nature, all Racket programs eventually expand into Racket’s core forms; new core forms cannot be added because Racket’s underlying compiler infrastructure is not (currently) extensible. New forms can be added that are defined in terms of other forms, but adding new primitives doesn’t make any sense, since the compiler would not know what to do with them.

Despite this, there are at least two use-cases in which a programmer might wish to customize the set of core forms produced by the macroexpander. Each situation is slightly different, but they both revolve around the same idea.

Supporting multiple backends

The most commonly discussed use case for customizing the set of core forms is for languages that wish to use the Racket macroexpander, but target backends that are not the Racket compiler. For example, a user might implement a Racket #lang that describes electronic circuits, and they might even implement a way to execute such a program in Racket, but they might also wish to compile the result to a more traditional hardware description language. Like other languages in the Racket ecosystem, such a language would be made up of a tower of macros built on top of core forms; unlike other languages, the core forms might need to be more abstract than the ones provided by Racket to efficiently compile to other targets.

In the case of a hardware description language, the custom core forms might include things like input and output for declaring circuit inputs and outputs, and expressions might be built out of hardware operations rather than high-level things like function calls. The Racket macroexpander would expand the input program into the custom set of core forms, at which point an external compiler program could compile the resulting AST in a more traditional way. If the language author wished, they could additionally define implementations of these core forms as Racket macros that eventually expand into Racket, which would allow them to emulate their circuits in Racket at little cost, but this would be a wholly optional step.

Essentially, this use case stems from a desire to reuse Racket’s advanced language-development technology, such as the macroexpander, the module system, and editor tooling, without also committing to using Racket as a runtime, which is not appropriate for every language. This use case is not nearly as easy as it ought to be, but it is a common request, and it is possible that future improvements to the Racket toolchain will be designed specifically to address this problem.

Compiling an extensible embedded language

A second use case for custom core forms is less frequently discussed, but I think it might actually be significantly more common in practice were it available in a form accessible to working macro programmers. In this scenario, users might wish to remain within Racket, but still want to define a custom language that other macros can consume.

This concept is a little more vague and fuzzily-defined than the case of developing a separate backend, so allow me to propose an example. Imagine a Racket programmer decides to build an embedded DSL for asynchronously producing and consuming events, similar to first-order functional reactive programming. In this case, the DSL is designed to be used in larger Racket programs, so it will eventually expand to Racket’s core forms. However, it’s possible that such a language might wish to enforce static invariants about the network graph, and in doing so, it might be able to produce significantly more optimal Racket code via a compile-time analysis.

Performing such a compile-time analysis is essentially writing a custom optimizer as part of a macro, which has been done numerous times already within the Racket ecosystem. One of the most prominent examples of such a thing is the match macro, which parses users’ patterns into compile-time data structures, performs a fairly traditional optimization pass designed to efficiently compile pattern matching, and emits optimized Racket code as a result. This approach works well for fairly contained problems like pattern-matching, but it works less well for entirely new embedded languages that include everything from their own notion of evaluation to their own binding forms.

Existing DSLs of this type are rare, but they do exist. syntax/parse provides an expressive, specialized pattern-matching language designed specifically for matching syntax objects, and it uses a different model from racket/match to be more suitable for that task. It allows backtracking with cuts, an extensible pattern language, an abstraction language for defining reusable parsers that can accept inputs and produce outputs, and fine-grained control over both parsing and binding. While match is essentially just a traditional pattern-matcher, albeit an extensible one, syntax-parse is its own programming language, closer in some ways to Prolog than to Racket.

For this reason, syntax/parse has an extensive language to do everything from creating new bindings to controlling when and how parsing fails. This language is represented in two ways: an inline pattern language, and an alternate syntax known as pattern directives. Here is an example of pattern directives in action, from my own threading library:

[(_ ex:expr cl:clause remaining:clause ...)
 #:do [(define call (syntax->list #'cl.call))
       (define-values (pre post)
         (split-at call (add1 (or (attribute cl.insertion-point) 0))))]
 #:with [pre ...] pre
 #:with [post ...] post
 #:with app/ctx (adjust-outer-context this-syntax #'(pre ... ex post ...) #'cl)
 (adjust-outer-context this-syntax #'(~> app/ctx remaining ...) this-syntax)]

Each directive is represented by a keyword, in this case #:do and #:with. Each directive has a corresponding keyword in the pattern language, in this case ~do and ~parse. Therefore, the above pattern could equivalently be written this way:

[{~and (_ ex:expr cl:clause remaining:clause ...)
       {~do (define call (syntax->list #'cl.call))
            (define-values (pre post)
              (split-at call (add1 (or (attribute cl.insertion-point) 0))))}
       {~parse [pre ...] pre}
       {~parse [post ...] post}
       {~parse app/ctx (adjust-outer-context this-syntax #'(pre ... ex post ...) #'cl)}}
 (adjust-outer-context this-syntax #'(~> app/ctx remaining ...) this-syntax)]

The transformation can go in the other direction, too—each syntax class annotation on each pattern variable can be extracted into the directive language using #:declare, so this is also equivalent:

[(_ ex cl remaining ...)
 #:declare ex expr
 #:declare cl clause
 #:declare remaining clause
 #:do [(define call (syntax->list #'cl.call))
       (define-values (pre post)
         (split-at call (add1 (or (attribute cl.insertion-point) 0))))]
 #:with [pre ...] pre
 #:with [post ...] post
 #:with app/ctx (adjust-outer-context this-syntax #'(pre ... ex post ...) #'cl)
 (adjust-outer-context this-syntax #'(~> app/ctx remaining ...) this-syntax)]

This is very much a programming language, but it has very different semantics from programming in Racket! Failure to match against a #:with or ~parse pattern causes pattern-matching to backtrack, and though it’s possible to escape to Racket using #:do or ~do, practical uses of syntax/parse really do involve quite a lot of programming in its pattern DSL.
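To make the backtracking behavior concrete, here is a small, self-contained sketch that uses syntax-parse at runtime on quoted syntax. The classify function and its result symbols are invented for this example. The first clause uses the directive style, the second uses the equivalent inline style, and a failed #:with or ~parse simply moves on to the next clause.

```racket
#lang racket/base
(require syntax/parse)

;; Hypothetical example function: classifies a two-element syntax list
;; of numbers by the value of their sum.
(define (classify stx)
  (syntax-parse stx
    ;; Directive style: #:do escapes to Racket, #:with matches the
    ;; computed value against a pattern (here, the literal datum 10).
    [(a:number b:number)
     #:do [(define sum (+ (syntax-e #'a) (syntax-e #'b)))]
     #:with 10 sum
     'sums-to-ten]
    ;; Inline style: the same idea written with ~do and ~parse.
    [{~and (a:number b:number)
           {~do (define sum (+ (syntax-e #'a) (syntax-e #'b)))}
           {~parse 0 sum}}
     'sums-to-zero]
    ;; Reached only after both clauses above have failed.
    [_ 'other]))

(classify #'(4 6))   ; first clause succeeds
(classify #'(-3 3))  ; #:with fails, so parsing backtracks to the second clause
(classify #'(1 2))   ; both fail, so the catch-all clause is used
```

With these inputs, the three calls should produce 'sums-to-ten, 'sums-to-zero, and 'other respectively.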

But the Racket programmer might not find this DSL wholly satisfying. Why? Well, it isn’t extensible! The pattern directives—#:declare, #:do, and #:with, among others—are essentially the core forms of syntax/parse’s pattern-matching language, but new ones cannot be defined. The desire to make this language easy to analyze statically in order to emit optimal pattern-matching code meant its author opted to define the language in terms of a specific grammar rather than a tower of macros.

But what if syntax/parse could define its own core forms? What if, instead of #:do, #:declare, and #:with being implemented as keyword options specially recognized by the syntax-parse grammar, it defined do, declare, and with as core forms for a new, macro-enabled language? A user of the language could then define a completely ordinary Racket macro and use it with this new language as long as it eventually expanded into the syntax/parse core forms. The implementation of syntax/parse could then invoke the macroexpander to request each clause be expanded into its core forms, perform its static analysis on the result, and finally emit optimized Racket code.

Now, to be fair, syntax/parse is not actually entirely inextensible. While new directives cannot be defined, new patterns can be added through a pattern-expander API that was added to the library after its initial design. However, pattern expanders are still not ideal because they are not ordinary Racket macros—users must explicitly define each pattern expander differently from how they would a macro—and they cannot use existing Racket forms, even ones that would theoretically be compatible with an arbitrary set of core forms.

The technique described in this blog post avoids all those problems. In the following sections, I’ll show that it’s possible to define an embedded language with a custom set of core forms that works well with the rest of the Racket ecosystem and still permits arbitrary static analysis.

The need for a custom type language in Hackett

In the previous section, I described two use cases for custom core forms. Hackett, in fact, has uses for both of them:

Hackett can definitely make use of custom core forms to compile to multiple backends. Eventually, it would be nice to compile Hackett to an intermediate language that can target both the Racket runtime and Haskell or GHC Core. This would allow Hackett to take advantage of GHC’s advanced optimizing compiler that already has decades of tuning for a pure, lazy, functional programming language, at the cost of not having access to the rest of Racket’s ecosystem of libraries at runtime.
Hackett can also make use of custom core forms for an embedded DSL. In this case, that embedded DSL is actually Hackett’s type language.

The second of those two use cases is simpler, and it’s what I ended up implementing first, so it’s what I will focus on in this blog post. Hackett’s type language is fundamentally quite simple, so its set of custom core forms is small as well. Everything in the type language eventually compiles into only seven core forms:

(#%type:con id) — Type constructors, like Integer or Maybe. These are one of the fundamental building blocks of Hackett types.
(#%type:app type type) — Type application, such as (Maybe Integer). Types are curried, so type constructors that accept multiple arguments are represented by nested uses of #%type:app.
(#%type:forall id type) — Universal quantification. This is essentially a binding form, which binds any uses of (#%type:bound-var id) in type.
(#%type:qual type type) — Qualified types, aka types with typeclass constraints. Constraints in Hackett, like in GHC, are represented by types, so typeclass names like Eq are bound as type constructors.
Finally, Hackett types support three different varieties of type variables:
- (#%type:bound-var id) — Bound type variables. These are only legal under a corresponding #%type:forall.
- (#%type:wobbly-var id) — Solver variables, which may unify with any other type as part of the typechecking process.
- (#%type:rigid-var id) — Rigid variables, aka skolem variables, which only unify with themselves. They represent a unique, anonymous type used to ensure types are suitably polymorphic.

To implement our custom core forms in Racket, we need to somehow define them, but how? Intentionally, these should never be expanded, since we want the expander to stop expanding whenever it encounters one of these identifiers. While we can’t encode this directly, we can bind them to macros that do nothing but raise an exception if something attempts to expand them:

(define-syntaxes [#%type:con #%type:app #%type:forall #%type:qual
                  #%type:bound-var #%type:wobbly-var #%type:rigid-var]
  (let ([type-literal (λ (stx) (raise-syntax-error #f "cannot be used as an expression" stx))])
    (values type-literal type-literal type-literal type-literal
            type-literal type-literal type-literal)))

This will ensure our core forms are never accidentally expanded, and we’ll instruct the macroexpander to stop whenever it sees one of them via a separate mechanism.

Expanding types in our type language

We’ve now defined our core forms, but we’ve intentionally left them meaningless. How do we actually inform the expander about how our types ought to be expanded? While it’s true that we don’t want the core forms themselves to be eliminated, we do want to expand some of their subforms. For example, in the type (#%type:app a b), we want to recursively expand a and b.

In order to do this, we’ll use the API made available by the expander for manually invoking macroexpansion from within another macro. This API is called local-expand, and it has an option relevant to our needs: the stop list.

Often, local-expand is used to force the expander to completely, recursively expand a form. For example, by using local-expand, we can produce a fragment of a fully-expanded program from a piece of syntax that still includes macros:

(local-expand #'(let ([x 1]) (+ x 2)) 'expression '())
; => (let-values ([(x) '1]) (#%plain-app + x '2))

The third argument to local-expand is the stop list, which controls how deep the expander ought to expand a given form. By providing an empty list, we ask for a complete, recursive expansion. In this case, however, we don’t want a complete expansion! We can inform the expander to stop whenever it sees any of our custom core forms by passing a list of our core form identifiers instead of an empty list:

(begin-for-syntax
  (define type-literal-ids
    (list #'#%type:con #'#%type:app #'#%type:forall #'#%type:qual
          #'#%type:bound-var #'#%type:wobbly-var #'#%type:rigid-var))

  (local-expand #'(#%type:forall x t) 'expression type-literal-ids))
  ; => (#%type:forall x t)

Of course, this isn’t very interesting, since it just gives us back exactly what we gave it. It spotted the #%type:forall identifier, which is in our stop list, and immediately halted expansion. It didn’t attempt to continue expanding t since the expander has no way of knowing which pieces of (#%type:forall x t) it should expand! In this case, we want it to recur to expand t, since it should be a type, but not x, since #%type:forall essentially puts x in binding position.
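Since local-expand can only be called while a macro is being expanded, a runnable version of this experiment needs to live inside a transformer. The following sketch does that with made-up names (my-core, my-sugar, and expand-with-stop-list are all hypothetical stand-ins, not part of Hackett) and returns the partially expanded form as quoted data so it can be inspected at runtime.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical core form: a macro that only ever raises an error.
(define-syntax (my-core stx)
  (raise-syntax-error #f "cannot be used as an expression" stx))

;; Hypothetical sugar that expands into the core form.
(define-syntax-rule (my-sugar e) (my-core e))

;; Expand the argument with my-core in the stop list, then quote the
;; result so the partially expanded form becomes ordinary data.
(define-syntax (expand-with-stop-list stx)
  (syntax-case stx ()
    [(_ form)
     (let ([expanded (local-expand #'form 'expression (list #'my-core))])
       #`(quote #,expanded))]))

(define result
  (expand-with-stop-list (my-sugar (my-sugar x))))

result ; => '(my-core (my-sugar x))
```

The outer my-sugar is expanded because it appears at the head of the form, but once my-core is at the head, expansion halts, and the inner (my-sugar x) is left exactly as it was written.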

Therefore, we have to get more clever. We need to call local-expand to produce a type, then we have to pattern-match on it and subsequently call local-expand again on any of the pieces of syntax we want to keep expanding. Eventually, we’ll run out of things to expand, and our type will be fully-expanded.
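As a rough illustration of that loop, here is a minimal sketch written with plain syntax-case rather than anything Hackett actually uses. All of the names in it (ty:con, ty:app, ty:app*, expand-type, and expanded-type) are hypothetical stand-ins, and it handles only two core forms.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Two hypothetical core forms, bound to macros that always raise.
(define-syntaxes [ty:con ty:app]
  (let ([type-literal
         (λ (stx) (raise-syntax-error #f "cannot be used as an expression" stx))])
    (values type-literal type-literal)))

;; Hypothetical sugar: (ty:app* f a b) => (ty:app (ty:app f a) b)
(define-syntax (ty:app* stx)
  (syntax-case stx ()
    [(_ f) #'f]
    [(_ f a more ...) #'(ty:app* (ty:app f a) more ...)]))

(begin-for-syntax
  (define stop-ids (list #'ty:con #'ty:app))

  ;; Expand until a core form reaches the head, then pattern-match on
  ;; it and recur into the subforms that are themselves types.
  (define (expand-type stx)
    (define expanded (local-expand stx 'expression stop-ids))
    (syntax-case expanded (ty:con ty:app)
      [(ty:con _) expanded]
      [(ty:app a b) #`(ty:app #,(expand-type #'a) #,(expand-type #'b))]
      [_ (raise-syntax-error #f "not a type" expanded)])))

;; Driver macro: fully expand a type and return it as quoted data.
(define-syntax (expanded-type stx)
  (syntax-case stx ()
    [(_ t) #`(quote #,(expand-type #'t))]))

(define result
  (expanded-type
   (ty:app* (ty:con Either)
            (ty:app* (ty:con Maybe) (ty:con String))
            (ty:con Integer))))

result
; => '(ty:app (ty:app (ty:con Either) (ty:app (ty:con Maybe) (ty:con String)))
;             (ty:con Integer))
```

The nested use of ty:app* is only expanded because expand-type explicitly recurs into both arguments of ty:app; a single call to local-expand would have left it alone.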

One good way to do this is to use syntax/parse syntax classes, since they provide a convenient way for other macros to invoke the type expander. To implement our type expander, we’ll use two mutually recursive syntax classes: one to perform the actual expansion using local-expand and a second to pattern-match on the resulting expanded type. For example, here’s what these two classes would look like if they only handled #%type:con and #%type:app:

(begin-for-syntax
  (define-literal-set type-literals
    [#%type:con #%type:app #%type:forall #%type:qual
     #%type:bound-var #%type:wobbly-var #%type:rigid-var])

  (define-syntax-class type
    #:description "type"
    #:attributes [expansion]
    [pattern _ #:with :expanded-type
                      (local-expand this-syntax 'expression type-literal-ids)])

  (define-syntax-class expanded-type
    #:description #f
    #:attributes [expansion]
    #:commit
    #:literal-sets [type-literals]
    [pattern (#%type:con ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:app ~! a:type b:type)
             #:attr expansion #'(#%type:app a.expansion b.expansion)]))

This blog post is definitely not a syntax/parse tutorial, so I will not explain in detail everything that’s going on here, but the gist of it is that the above code defines two syntax classes, both of which produce a single output attribute named expansion. This attribute contains the fully expanded version of the type currently being parsed. In the #%type:con case, expansion is just this-syntax, which holds the current piece of syntax being parsed. This makes sense, since uses of #%type:con just expand to themselves—expanding (#%type:con Maybe) should not perform any additional expansion on Maybe. This is one of Hackett’s atomic types.

In contrast, #%type:app does recursively expand its arguments. By annotating its two subforms with :type, the type syntax class will invoke local-expand on each subform, which will in turn use expanded-type to parse the resulting type. This is what implements the expansion loop that will eventually expand each type completely. Once a and b have been expanded, #%type:app reassembles them into a new syntax object using #'(#%type:app a.expansion b.expansion), which replaces their unexpanded versions with their new, expanded versions.

We can see this behavior by writing a small expand-type function that will expand its argument:

(begin-for-syntax
  (define expand-type (syntax-parser [t:type #'t.expansion])))

Now we can use it to observe what happens when we try expanding a type using #%type:app:

(expand-type #'(#%type:app Maybe Integer))
; => #%type:app: expected type
;      at: Maybe
;      in: (#%type:app Maybe Integer)

Okay, it failed with an error, which is not ideal, but it makes sense. We haven’t actually defined Maybe or Integer anywhere. Let’s do so! We can define them as simple macros that expand into uses of #%type:con, which can be done easily using make-variable-like-transformer from syntax/transformer:

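; make-variable-like-transformer is used at phase 1, so it must be required for-syntax
(require (for-syntax racket/base syntax/transformer))
(define-syntax Maybe (make-variable-like-transformer #'(#%type:con Maybe)))
(define-syntax Integer (make-variable-like-transformer #'(#%type:con Integer)))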

Now, if we try expanding that same type again:

(expand-type #'(#%type:app Maybe Integer))
; => (#%type:app (#%type:con Maybe) (#%type:con Integer))

…it works! Neat. Now we just need to add the cases for the remaining forms in our type language:

(begin-for-syntax
  (define-syntax-class expanded-type
    #:description #f
    #:attributes [expansion]
    #:commit
    #:literal-sets [type-literals]
    [pattern (#%type:con ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:app ~! a:type b:type)
             #:attr expansion #'(#%type:app a.expansion b.expansion)]
    [pattern (#%type:forall ~! x:id t:type)
             #:attr expansion #'(#%type:forall x t.expansion)]
    [pattern (#%type:qual ~! a:type b:type)
             #:attr expansion #'(#%type:qual a.expansion b.expansion)]
    [pattern (#%type:bound-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:wobbly-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:rigid-var ~! _:id)
             #:attr expansion this-syntax]))

This is pretty good already, and to a first approximation, it’s done! However, it doesn’t actually work as well as we’d really like it to. One of the whole points of doing things this way is to allow other macros like let-syntax to work in types. For example, we ought to be able to create a local type binding with let-syntax and have it just work. Unfortunately, it doesn’t:

(expand-type #'(let-syntax ([Bool (make-variable-like-transformer #'(#%type:con Bool))])
                 (#%type:app Maybe Bool)))
; => let-syntax: expected one of these identifiers: `#%type:con', `#%type:app', `#%type:forall', `#%type:qual', `#%type:bound-var', `#%type:wobbly-var', or `#%type:rigid-var'
;     at: letrec-syntaxes+values
;     in: (let-syntax ((Bool (make-variable-like-transformer (syntax Bool)))) (#%type:app Maybe Bool))

What went wrong? And why is it complaining about letrec-syntaxes+values? Well, if you read the documentation for local-expand, you’ll find that its behavior is a little more complicated than you might at first believe:

If stop-ids is [a nonempty list containing more than just module*], then begin, quote, set!, #%plain-lambda, case-lambda, let-values, letrec-values, if, begin0, with-continuation-mark, letrec-syntaxes+values, #%plain-app, #%expression, #%top, and #%variable-reference are implicitly added to stop-ids. Expansion stops when the expander encounters any of the forms in stop-ids, and the result is the partially-expanded form.

That’s a little strange, isn’t it? I am not completely sure why local-expand behaves quite this way, though I’m sure backwards compatibility plays a significant part. While some of the implicit additions seem unnecessary, the one for letrec-syntaxes+values (which let-syntax expands to) is reasonable. If the expander naïvely expanded letrec-syntaxes+values in the presence of a nonempty stop list, it could cause some significant problems!

Allow me to illustrate with an example. Let’s imagine we are the expander, and we are instructed to expand the following program:

(let-syntax ([Bool (make-variable-like-transformer #'(#%type:con Bool))])
  (#%type:app Maybe Bool))

We see let-syntax, so we start by evaluating the expression on the right-hand side of the Bool binding. This produces a transformer, so we bind Bool to that transformer in the local environment, then move on to expanding the body. At this point, the expander is looking at this:

; local bindings:
;   Bool -> #<variable-like-transformer>
(#%type:app Maybe Bool)

Now, the identifier in application position is #%type:app, and #%type:app is in the stop list. Therefore, expansion must stop, and it does not attempt to expand any further. But what should the result of expansion be? Well, the let-syntax needs to go away when we expand it—local syntax bindings are erased as part of macroexpansion—so the logical thing to expand into is (#%type:app Maybe Bool). But this is a problem, because when we then go to expand Bool, Bool isn’t in the local binding table anymore! The let-syntax was already erased, and Bool is unbound!

When expanding recursively, this isn’t a problem, since the entire expression is guaranteed to be expanded while the local binding is still in the expander’s environment. As soon as we introduce partial expansion, however, we run the risk of a binding getting erased too early. So we’re stuck: we can’t recursively expand, or we’ll expand too much, but we can’t partially expand, since we might expand too little.
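
The failure is easy to reproduce in a toy model. The following Python sketch is not how Racket’s expander is implemented. It assumes a hypothetical expander, naive_local_expand, that honors a stop list and erases let-syntax eagerly, with macros modeled as simple replacements in a dictionary.

```python
# Toy model only: a hypothetical expander that erases let-syntax eagerly.
STOP_IDS = {"#%type:con", "#%type:app"}

def naive_local_expand(form, env):
    """Partially expand `form`, stopping as soon as its head is in STOP_IDS."""
    while True:
        head = form[0] if isinstance(form, tuple) else form
        if head in STOP_IDS:
            return form  # stop here; subforms are left unexpanded
        if head == "let-syntax":
            _, bindings, body = form
            env = {**env, **bindings}  # these bindings live only inside this call
            form = body                # the let-syntax wrapper is erased
        elif head in env:
            form = env[head]
        else:
            raise LookupError(f"unbound identifier: {head}")

global_env = {"Maybe": ("#%type:con", "Maybe")}
program = ("let-syntax", {"Bool": ("#%type:con", "Bool")},
           ("#%type:app", "Maybe", "Bool"))

partial = naive_local_expand(program, global_env)
print(partial)  # ('#%type:app', 'Maybe', 'Bool')

# Expanding the leftover pieces later works for Maybe but not for Bool,
# because the environment that knew about Bool no longer exists.
print(naive_local_expand(partial[1], global_env))  # ('#%type:con', 'Maybe')
try:
    naive_local_expand(partial[2], global_env)
except LookupError as err:
    print(err)  # unbound identifier: Bool
```

The extended environment is a local variable of the first call, so nothing that runs afterwards can see Bool. That is the binding getting erased too early.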

Confronted with this problem, there is some good news and some bad news. The good news is that, while the macroexpander can’t help us, we can help the macroexpander by doing some of the necessary bookkeeping for it. We can do this using first-class definition contexts, which allow us to manually extend the local environment when we call local-expand. The bad news is that first-class definition contexts are complicated, and using them properly is a surprisingly subtle problem.

Fortunately, I’ve already spent a lot of time figuring out what needs to be done to properly manipulate the necessary definition contexts in this particular situation. The first step is to parameterize our type and expanded-type syntax classes so that we may thread a definition context around as we recursively expand:

(begin-for-syntax
  (define-syntax-class (type [intdef-ctx #f])
    #:description "type"
    #:attributes [expansion]
    [pattern _ #:with {~var || (expanded-type intdef-ctx)}
                      (local-expand this-syntax 'expression type-literal-ids intdef-ctx)])

  (define-syntax-class (expanded-type intdef-ctx)
    #:description #f
    #:attributes [expansion]
    #:commit
    #:literal-sets [type-literals]
    [pattern (#%type:con ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:app ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion #'(#%type:app a.expansion b.expansion)]
    [pattern (#%type:forall ~! x:id {~var t (type intdef-ctx)})
             #:attr expansion #'(#%type:forall x t.expansion)]
    [pattern (#%type:qual ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion #'(#%type:qual a.expansion b.expansion)]
    [pattern (#%type:bound-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:wobbly-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:rigid-var ~! _:id)
             #:attr expansion this-syntax]))

Now, we can add an additional case to expanded-type to handle letrec-syntaxes+values, which will explicitly create a new definition context, add bindings to it, and use it when parsing the body:

[pattern (letrec-syntaxes+values ~! ([(id:id ...) e:expr] ...) () t:expr)
         #:do [(define intdef-ctx* (syntax-local-make-definition-context))
               (for ([ids (in-list (attribute id))]
                     [e (in-list (attribute e))])
                 (syntax-local-bind-syntaxes ids e intdef-ctx*))]
         #:with {~var t* (type intdef-ctx*)} #'t
         #:attr expansion #'t*.expansion]

But even this isn’t quite right. The problem with this implementation is that it throws away the existing intdef-ctx argument to expanded-type, which means those bindings will be lost as soon as we introduce a new set. To fix this, we have to make the new definition context a child of the previous definition context by passing the old context as an argument to syntax-local-make-definition-context. This will ensure the parent bindings are brought into scope when expanding using the child context:

[pattern (letrec-syntaxes+values ~! ([(id:id ...) e:expr] ...) () t:expr)
         #:do [(define intdef-ctx* (syntax-local-make-definition-context intdef-ctx))
               (for ([ids (in-list (attribute id))]
                     [e (in-list (attribute e))])
                 (syntax-local-bind-syntaxes ids e intdef-ctx*))]
         #:with {~var t* (type intdef-ctx*)} #'t
         #:attr expansion #'t*.expansion]

With this in place, our example using let-syntax actually works!

(expand-type #'(let-syntax ([Bool (make-variable-like-transformer #'(#%type:con Bool))])
                 (#%type:app Maybe Bool)))
; => (#%type:app (#%type:con Maybe) (#%type:con Bool))

Pretty cool, isn’t it?
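
The parent/child relationship between definition contexts can be sketched in a toy model, too. This Python version is an analogy under stated assumptions, not Racket’s API: collections.ChainMap stands in for a definition context, new_child plays the role of passing the old context to syntax-local-make-definition-context, and expand_type is a hypothetical name.

```python
from collections import ChainMap

CORE_FORMS = {"#%type:con", "#%type:app"}

def expand_type(form, ctx):
    """Fully expand `form`, threading the binding context through the recursion."""
    while True:
        head = form[0] if isinstance(form, tuple) else form
        if head == "let-syntax":
            _, bindings, body = form
            ctx = ctx.new_child(bindings)  # child context; parent bindings stay visible
            form = body
        elif head in CORE_FORMS:
            break
        elif head in ctx:
            form = ctx[head]
        else:
            raise LookupError(f"unbound identifier: {head}")
    if head == "#%type:app":
        _, a, b = form
        # The same context is handed to both subforms, so nothing is lost.
        return (head, expand_type(a, ctx), expand_type(b, ctx))
    return form  # #%type:con is atomic

root = ChainMap({"Either": ("#%type:con", "Either")})
program = ("let-syntax", {"Bool": ("#%type:con", "Bool")},
           ("let-syntax", {"Unit": ("#%type:con", "Unit")},
            ("#%type:app", ("#%type:app", "Either", "Bool"), "Unit")))

result = expand_type(program, root)
print(result)
```

The inner let-syntax creates a child of the context that already holds Bool, so both Bool and Unit resolve in the body. Building a fresh, parentless context at that point would lose Bool, which is the bug the earlier version of the clause had.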

Preserving syntax properties and source locations

We’ve now managed to essentially implement an expander for our custom language by periodically yielding to the Racket macroexpander, and for the most part, it works. However, our implementation isn’t perfect. The real Racket macroexpander takes great care to preserve source locations and syntax properties on syntax objects wherever possible, which our implementation does not do. Normally we don’t have to worry so much about such things, since the macroexpander automatically copies properties when expanding macros, but since we’re circumventing the expander, we don’t get that luxury. In order to properly preserve this information, we’ll have to be a little more careful.

To start, we really ought to copy the identifier in application position into the output wherever we can. In addition to preserving source location information and syntax properties, it also preserves the even more visible renamings. For example, if a user imports #%type:app under a different name, like #%type:apply, we should expand to a piece of syntax that still has #%type:apply in application position instead of replacing it with #%type:app.

To do this, we just need to bind each of the identifiers in application position, then use that binding when we produce output. For example, we would adjust the #%type:app clause to the following:

[pattern (head:#%type:app ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
         #:attr expansion #'(head a.expansion b.expansion)]

But even after doing this, some source locations and syntax properties are lost, since we’re still reconstructing the pair from scratch. To ensure we copy everything, we can define two helper macros, syntax/loc/props and quasisyntax/loc/props, which are like syntax/loc and quasisyntax/loc but copy properties in addition to source location information:

(begin-for-syntax
  (define-syntaxes [syntax/loc/props quasisyntax/loc/props]
    (let ()
      (define (make-syntax/loc/props name syntax-id)
        (syntax-parser
          [(_ from-stx-expr:expr {~describe "template" template})
           #`(let ([from-stx from-stx-expr])
               (unless (syntax? from-stx)
                 (raise-argument-error '#,name "syntax?" from-stx))
               (let* ([stx (#,syntax-id template)]
                      [stx* (syntax-disarm stx #f)])
                 (syntax-rearm (datum->syntax stx* (syntax-e stx*) from-stx from-stx) stx)))]))
      (values (make-syntax/loc/props 'syntax/loc/props #'syntax)
              (make-syntax/loc/props 'quasisyntax/loc/props #'quasisyntax)))))
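
Stripped of the arming and disarming details, the idea behind these helpers is small. The Python sketch below is only an analogy: Stx is a hypothetical stand-in for a syntax object, and rebuild_like plays the role of syntax/loc/props by carrying the location and properties of an existing node over to a freshly built one.

```python
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class Stx:
    """Hypothetical stand-in for a syntax object: a datum plus metadata."""
    datum: object
    srcloc: object = None                      # e.g. a (line, column) pair
    props: dict = field(default_factory=dict)  # arbitrary key/value properties

def rebuild_naive(new_datum):
    """Build a node from scratch: location and properties are lost."""
    return Stx(new_datum)

def rebuild_like(original, new_datum):
    """Swap in a new datum while keeping the original's location and properties."""
    return replace(original, datum=new_datum)

original = Stx(("#%type:app", "Maybe", "Integer"),
               srcloc=(3, 7), props={"paren-shape": "["})
expanded = ("#%type:app", ("#%type:con", "Maybe"), ("#%type:con", "Integer"))

lossy = rebuild_naive(expanded)
faithful = rebuild_like(original, expanded)
print(lossy.srcloc, lossy.props)        # None {}
print(faithful.srcloc, faithful.props)  # (3, 7) {'paren-shape': '['}
```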

Using syntax/loc/props, we can be truly thorough about ensuring all properties are preserved:

[pattern (head:#%type:app ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
         #:attr expansion (syntax/loc/props this-syntax
                            (head a.expansion b.expansion))]

Applying this to the other relevant clauses, we get an updated version of the expanded-type syntax class:

(begin-for-syntax
  (define-syntax-class (expanded-type intdef-ctx)
    #:description #f
    #:attributes [expansion]
    #:commit
    #:literal-sets [kernel-literals type-literals]
    [pattern (letrec-syntaxes+values ~! ([(id:id ...) e:expr] ...) () t:expr)
             #:do [(define intdef-ctx* (syntax-local-make-definition-context intdef-ctx))
                   (for ([ids (in-list (attribute id))]
                         [e (in-list (attribute e))])
                     (syntax-local-bind-syntaxes ids e intdef-ctx*))]
             #:with {~var t* (type intdef-ctx*)} #'t
             #:attr expansion #'t*.expansion]
    [pattern (#%type:con ~! _:id)
             #:attr expansion this-syntax]
    [pattern (head:#%type:app ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head a.expansion b.expansion))]
    [pattern (head:#%type:forall ~! x:id {~var t (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head x t.expansion))]
    [pattern (head:#%type:qual ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head a.expansion b.expansion))]
    [pattern (#%type:bound-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:wobbly-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:rigid-var ~! _:id)
             #:attr expansion this-syntax]))

Now we’re getting closer, but if you can believe it, even this isn’t good enough. The real expander’s implementation of letrec-syntaxes+values does two things our implementation does not: it copies properties and updates the 'origin property to indicate the syntax came from a use of letrec-syntaxes+values, and it adds a 'disappeared-binding property to record the erased bindings for use by tools like DrRacket. We can apply syntax-track-origin (for the former) and internal-definition-context-track (for the latter) to the resulting syntax to add the same properties the expander would:

[pattern (head:letrec-syntaxes+values ~! ([(id:id ...) e:expr] ...) () t:expr)
         #:do [(define intdef-ctx* (syntax-local-make-definition-context intdef-ctx))
               (for ([ids (in-list (attribute id))]
                     [e (in-list (attribute e))])
                 (syntax-local-bind-syntaxes ids e intdef-ctx*))]
         #:with {~var t* (type intdef-ctx*)} #'t
         #:attr expansion (~> (internal-definition-context-track intdef-ctx* #'t*.expansion)
                              (syntax-track-origin this-syntax #'head))]

Now we’ve finally dotted all our i’s and crossed our t’s. While it does take a lot to properly emulate what the macroexpander is doing, the important thing is that it’s actually possible! The end result of all this definition context juggling and property copying is that we’ve effectively managed to move some of the macroexpander’s logic into userspace code, which allows us to manipulate it as we see fit.

Connecting our custom language to Hackett

It took a lot of work, but we finally managed to write a custom type language, and while the code is not exactly simple, it’s not actually very long. The entire implementation of our custom type language is less than 80 lines of code:

#lang racket/base

(require (for-meta 2 racket/base
                     syntax/parse)
         (for-syntax racket/base
                     syntax/intdef
                     threading)
         syntax/parse/define)

(begin-for-syntax
  (define-syntaxes [syntax/loc/props quasisyntax/loc/props]
    (let ()
      (define (make-syntax/loc/props name syntax-id)
        (syntax-parser
          [(_ from-stx-expr:expr {~describe "template" template})
           #`(let ([from-stx from-stx-expr])
               (unless (syntax? from-stx)
                 (raise-argument-error '#,name "syntax?" from-stx))
               (let* ([stx (#,syntax-id template)]
                      [stx* (syntax-disarm stx #f)])
                 (syntax-rearm (datum->syntax stx* (syntax-e stx*) from-stx from-stx) stx)))]))
      (values (make-syntax/loc/props 'syntax/loc/props #'syntax)
              (make-syntax/loc/props 'quasisyntax/loc/props #'quasisyntax)))))

(define-syntaxes [#%type:con #%type:app #%type:forall #%type:qual
                  #%type:bound-var #%type:wobbly-var #%type:rigid-var]
  (let ([type-literal (λ (stx) (raise-syntax-error #f "cannot be used as an expression" stx))])
    (values type-literal type-literal type-literal type-literal
            type-literal type-literal type-literal)))

(begin-for-syntax
  (define type-literal-ids
    (list #'#%type:con #'#%type:app #'#%type:forall #'#%type:qual
          #'#%type:bound-var #'#%type:wobbly-var #'#%type:rigid-var))

  (define-literal-set type-literals
    [#%type:con #%type:app #%type:forall #%type:qual
     #%type:bound-var #%type:wobbly-var #%type:rigid-var])

  (define-syntax-class (type [intdef-ctx #f])
    #:description "type"
    #:attributes [expansion]
    [pattern _ #:with {~var || (expanded-type intdef-ctx)}
                      (local-expand this-syntax 'expression type-literal-ids intdef-ctx)])

  (define-syntax-class (expanded-type intdef-ctx)
    #:description #f
    #:attributes [expansion]
    #:commit
    #:literal-sets [kernel-literals type-literals]
    [pattern (head:letrec-syntaxes+values ~! ([(id:id ...) e:expr] ...) () t:expr)
             #:do [(define intdef-ctx* (syntax-local-make-definition-context intdef-ctx))
                   (for ([ids (in-list (attribute id))]
                         [e (in-list (attribute e))])
                     (syntax-local-bind-syntaxes ids e intdef-ctx*))]
             #:with {~var t* (type intdef-ctx*)} #'t
             #:attr expansion (~> (internal-definition-context-track intdef-ctx* #'t*.expansion)
                                  (syntax-track-origin this-syntax #'head))]
    [pattern (#%type:con ~! _:id)
             #:attr expansion this-syntax]
    [pattern (head:#%type:app ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head a.expansion b.expansion))]
    [pattern (head:#%type:forall ~! x:id {~var t (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head x t.expansion))]
    [pattern (head:#%type:qual ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head a.expansion b.expansion))]
    [pattern (#%type:bound-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:wobbly-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:rigid-var ~! _:id)
             #:attr expansion this-syntax])

  (define expand-type (syntax-parser [t:type #'t.expansion])))

But what now? Just as Racket fully-expanded programs are useless without a compiler to turn them into something useful, our custom type language doesn’t do anything at all in isolation. As it happens, in the case of the type language, we don’t have a compiler at all—we have a typechecker. The Hackett typechecker consumes fully-expanded types as input and uses them to perform its typechecking process. The actual implementation of Hackett’s typechecker is outside the scope of this blog post, since it’s really an entirely separate problem, but you can probably imagine what such a thing might look like, in an extremely vague, handwavy sense.

But we don’t just need a typechecker. Just as the authors of Racket don’t expect users to write programs using the core forms directly, we also don’t expect users to write their types using the fully-expanded syntax. If we did, all this fancy expansion machinery would be pretty pointless! Hackett provides a custom #%app binding that converts n-ary type applications to nested uses of #%type:app, as well as a nicer forall macro that supports specifying multiple type variables and multiple typeclass constraints all at once. The best part, though, is that these macros can be defined in a completely straightforward way, just as any ordinary Racket macro would be written, and the machinery will work precisely as intended. It’s also perfectly okay to have two different versions of #%app—one for types and one for values—since Hackett supports multiple namespaces, and each can have its own #%app binding.
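
As a rough illustration of the first of those conveniences, here is a Python sketch of turning an n-ary application into nested binary ones. It assumes the conventional left-nested (curried) reading, and curry_type_app is a hypothetical name. Hackett’s actual #%app is a Racket macro, not this function.

```python
from functools import reduce

def curry_type_app(fn, *args):
    """Turn an n-ary type application into left-nested binary #%type:app forms."""
    return reduce(lambda acc, arg: ("#%type:app", acc, arg), args, fn)

print(curry_type_app("Either", "String", "Integer"))
# ('#%type:app', ('#%type:app', 'Either', 'String'), 'Integer')
```

With no arguments the function returns fn unchanged, so a bare type constructor needs no special case.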

The real implementation of Hackett’s type language is a little bit longer than the one in this blog post because it includes some extra definitions to provide custom syntax/parse pattern expanders for matching types and some template metafunctions for producing them, which are used by the typechecker, but if you’d like to see the whole thing, it’s available on GitHub here.

Evaluation, limitations, and acknowledgements

Reimplementing Hackett’s type language took about a week and a half, about half of which was supplemented by the extra time I had before I started my new job this past week. A portion of that time was spent deciding what I actually wanted to do, and a lot of it was spent hunting down fiddly bugs. All told, the rewrite resulted in a net addition of 250 lines of code to the Hackett codebase. However, 350 of the added lines reside in a new, self-contained module dedicated to Hackett’s type language, so the change actually resulted in a net removal of 100 lines from the rest of the codebase, which I consider an organizational win.

As for whether or not the change will accomplish the goals I had in mind, I think signs currently point to a strong likelihood of the answer being yes. The very same night I finalized and merged the changes to the type language, I dusted off an old prototype of typeclass deriving I had not been able to get working due to insufficiencies of the old type representation. Not only was I able to get it working quickly and easily, I was able to do it in no more than 20 lines of code. While the implementation is not as robust as it should ideally be, nor is it safe or simple enough yet to be easy for Hackett users to write themselves, making the impossible possible is usually a sign of motion in the right direction.

Unfortunately, the technique outlined in this blog post is not completely flawless. Due to its reliance on the local-expand stop list, this technique is incompatible with macros that force recursive expansion using an empty stop list. In the upcoming reimplementation of the Racket macroexpander to be released in Racket 7, this includes syntax-parameterize, which unfortunately means syntax parameters don’t work in the type language. This is a problem, and while it’s not a dealbreaker, it is something that will almost certainly have to be fixed at some point. Fortunately, it isn’t intractable, and I’ve been discussing some potential approaches to fixing the problem, whether via changes to the macroexpander or by making macros like syntax-parameterize cooperate better with things like Hackett’s type language.

Finally, as seems to be the case more and more with my blog posts, I cannot express enough thanks to Matthew Flatt, without whose help I would probably not have been able to get everything working (not to mention that the Racket macro system would not exist without Matthew inventing and implementing it nearly single-handedly). Matthew does an almost unfathomable number of things for Racket already without me pestering him with questions, bug reports, and feature requests, but he’s always patient and helpful all the same. Also, once again, I’d like to thank Ryan Culpepper for his incredible work on constructing tools for the working macro developer, including writing the fantastic syntax/parse library that powers essentially everything I do. Thank you both.

An opinionated guide to Haskell in 2018

2018-02-10T00:00:00Z

For me, this month marks the end of an era in my life: as of February 2018, I am no longer employed writing Haskell. It’s been a fascinating two years, and while I am excitedly looking forward to what I’ll be doing next, it’s likely I will continue to write Haskell in my spare time. I’ll probably even write it again professionally in the future.

In the meantime, in the interest of both sharing with others the small amount of wisdom I’ve gained and preserving it for my future self, I’ve decided to write a long, rather dry overview of a few select parts of the Haskell workflow I developed and the ecosystem I settled into. This guide is, as the title notes, opinionated—it is what I used in my day-to-day work, nothing more—and I don’t claim that anything here is the only way to write Haskell, nor even the best way. It is merely what I found helpful and productive. Take from it as much or as little as you’d like.

Build tools and how to use them

When it comes to building Haskell, you have options. And frankly, most of them are pretty good. There was a time when cabal-install had a (warranted) reputation for being nearly impossible to use and regularly creating dependency hell, but I don’t think that’s the case anymore (though you do need to be a little careful about how you use it). Sandboxed builds work all right, and cabal new-build and the other cabal new-* commands are even better. That said, the UX of cabal-install is still less than stellar, and it has sharp edges, especially for someone coming from an ecosystem without a heavyweight compilation process, such as JavaScript, Ruby, or Python.

Nix is an alternative way to manage Haskell dependencies, and it seems pretty cool. It has a reputation for being large and complicated, and that reputation does not seem especially unfair, but you get lots of benefits if you’re willing to pay the cost. Unfortunately, I have never used it (though I’ve read a lot about it), so I can’t comment much on it here. Perhaps I’ll try to go all-in with Nix when I purchase my next computer, but for now, my workflow works well enough that I don’t feel compelled to switch.

Personally, I use stack as my Haskell build tool. It’s easy to use, it works out of the box, and while it doesn’t enjoy the same amount of caching as cabal new-build or Nix, it caches most packages, and it also makes things like git-hosted sources incredibly easy, which (as far as I can tell) can’t be done with cabal-install alone.

This section is going to be a guide on how I use stack. If you use cabal-install with or without Nix, great! Those tools seem good, too. This is not an endorsement of stack over the other build tools, just a description of how I use it, the issues I ran into, and my solutions to them.

Understanding stack’s model and avoiding its biggest gotcha

Before using stack, there are a few things every programmer should know:

stack is not a package manager, it is a build tool. It does not manage a set of “installed” packages; it simply builds targets and their dependencies.
The command to build a target is stack build <target>. Just using stack build on its own will build the current project’s targets.
You almost certainly do not want to use stack install.

This is the biggest point of confusion I see among new users of stack. After all, when you want to install a package with npm, you type npm install <package>. So a new Haskeller decides to install lens, types stack install lens, and then later tries stack uninstall lens, only to discover that no such command exists. What happened?

stack install is not like npm install. stack install is like make install. It is nothing more than an alias for stack build --copy-bins, and all it does is build the target and copy all of its executables into some relatively global location like ~/.local/bin. This is usually not what you want.

This design decision is not unique to stack; cabal-install suffers from it as well. One can argue that it isn’t unintuitive because it really is just following what make install conventionally does, and the fact that it happens to conflict with things like npm install or even apt-get install is just a naming clash. I think that argument is a poor one, however, and I think including a stack install command at all was a mistake.

So, remember: don’t use stack install! stack works best when everything lives inside the current project’s local sandbox, and stack install copies executables into a global location by design. While it might sometimes appear to work, it’s almost always wrong. The only situation in which stack install is the right answer is when you want to install an executable for a use unrelated to Haskell development (that is, something like pandoc) that just so happens to be provided by a Haskell package. This means no running stack install ghc-mod or stack install intero either, no matter what READMEs might tell you! Don’t worry: I’ll cover the proper way to install those things later.

Actually building your project with stack

Okay, so now that you know to never use stack install, what do you use? Well, stack build is probably all you need. Let’s cover some variations of stack build that I use most frequently.

Once you have a stack project, you can build it by simply running stack build within the project directory. However, for local development, this is usually unnecessarily slow because it runs the GHC optimizer. For faster development build times, pass the --fast flag to disable optimizations:

$ stack build --fast

By default, stack builds dependencies with coarse-grained, package-level parallelism, but you can enable more fine-grained, module-level parallel builds by adding --ghc-options=-j. Unfortunately, there are conflicting accounts of whether this makes builds faster or slower in practice, and I haven’t tested it extensively myself, so I mostly leave it off.

Usually, you also want to build and run the tests along with your code, which you can enable with the --test flag. Additionally, stack test is an alias for stack build --test, so these two commands are equivalent:

$ stack build --fast --test
$ stack test --fast

Also, it is useful to build documentation as well as code! You can do this by passing the --haddock flag, but unfortunately, I find Haddock sometimes takes an unreasonably long time to run. Since I mostly care about running Haddock on my dependencies, I usually pass the --haddock-deps flag instead, which avoids re-running Haddock every time you build:

$ stack test --fast --haddock-deps

Finally, I usually want to build and test my project in the background whenever my code changes. Fortunately, this can be done easily by using the --file-watch flag, making it easy to incrementally change project code and immediately see results:

$ stack test --fast --haddock-deps --file-watch

This is the command I usually use to develop my Haskell projects.

Accessing local documentation

While Haskell does not always excel on the documentation front, a small amount of documentation is almost always better than no documentation at all, and I find my dependencies’ documentation to be an invaluable resource while developing. I find many people just look at docs on Hackage or use the hosted instance of Hoogle, but this sometimes leads people astray: they might end up looking at the wrong version of the documentation! Fortunately, there’s an easy solution to this problem, which is to browse the documentation stack installs locally, which is guaranteed to match the version you are using in your current project.

The easiest way to open local documentation for a particular package is to use the stack haddock --open command. For example, to open the documentation for lens, you could use the following command:

$ stack haddock --open lens

This will open the local documentation in your web browser, and you can browse it at your leisure. If you have already built the documentation using the --haddock-deps option I recommended in the previous section, this command should complete almost instantly, but if you haven’t built the documentation yet, you’ll have to wait as stack builds it for you on-demand.

While this is a good start, it isn’t perfect. Ideally, I want to have searchable documentation, and fortunately, this is possible to do by running Hoogle locally. This is easy enough with modern versions of stack, which have built-in Hoogle integration, but it still requires a little bit of per-project setup, since you need to build the Hoogle search index with the following command:

$ stack hoogle -- generate --local

This will install Hoogle into the current project if it isn’t already installed, and it will index your dependencies’ documentation and generate a new Hoogle database. Once you’ve done that, you can start a web server that serves a local Hoogle search page with the following command:

$ stack hoogle -- server --local --port=8080

Navigate to http://localhost:8080 in your web browser, and you’ll have a fully-searchable index of all your Haskell packages’ documentation. Isn’t that neat?

Unfortunately, you will have to manually regenerate the Hoogle database when you install new packages and their documentation, which you can do by re-running stack hoogle -- generate --local. Fortunately, regenerating the database doesn’t take very long, as long as you’ve been properly rebuilding the documentation with --haddock-deps.

Configuring your project

Every project built with stack is configured with two separate files:

The stack.yaml file, which controls which packages are built and what versions to pin your dependencies to.
The <project>.cabal file or package.yaml file, which specifies build targets, their dependencies, and which GHC options to apply, among other things.

The .cabal file is, ultimately, what is used to build your project, but modern versions of stack generate projects that use hpack, which uses an alternate configuration file, the package.yaml file, to generate the .cabal file. This can get a little bit confusing, since it means you have three configuration files in your project, one of which is generated from another.

I happen to use and like hpack, so I use a package.yaml file and allow hpack to generate the .cabal file. I have no real love for YAML, and in fact I think custom configuration formats are completely fine, but the primary advantage of hpack is the ability to specify things like GHC options and default language extensions for all targets at once, instead of needing to duplicate them per-target.

You can think of the .cabal or package.yaml file as a specification for how your project is built and what packages it depends on, but the stack.yaml file is a specification of precisely which version of each package should be used and where it should be fetched from. Also, each .cabal file corresponds to precisely one Haskell package (though it may have any number of executable targets), but a stack.yaml file can specify multiple different packages to build, useful for multi-project builds that share a common library. The details here can be a little confusing, more than I am likely going to be able to explain in this blog post, but for the most part, you can get away with the defaults unless you’re doing something fancy.

Setting up editor integration

Currently, I use Atom to write Haskell. Atom is not a perfect editor by any means, and it leaves a lot to be desired, but it’s easy to set up, and the Haskell editor integration is decent.

Atom’s editor integration is powered by ghc-mod, a program that uses the GHC API to provide tools to inspect Haskell programs. Installing ghc-mod must be done manually so that Atom’s haskell-ghc-mod package can find it, and this is where a lot of people get tripped up. They run stack install ghc-mod, it installs ghc-mod into ~/.local/bin, they put that in their PATH, and things work! …except when a new version of GHC is released a few months later, everything stops working.

As mentioned above, stack install is not what you want. Tools like ghc-mod, hlint, hoogle, weeder, and intero work best when installed as part of the sandbox, not globally, since that ensures they will match the current GHC version your project is using. This can be done per-project using the ordinary stack build command, so the easiest way to properly install ghc-mod into a stack project is with the following command:

$ stack build ghc-mod

Unfortunately, this means you will need to run that command inside every single stack project individually in order to properly set it up so that stack exec -- ghc-mod will find the correct executable. One way to circumvent this is by using a recently-added stack flag designed for this explicit purpose, --copy-compiler-tool. This is like --copy-bins, but it copies the executables into a compiler-specific location, so a tool built for GHC 8.0.2 will be stored separately from the same tool built for GHC 8.2.2. stack exec arranges for the executables for the current compiler version to end up in the PATH, so you only need to build and install your tools once per compiler version.

Does this kind of suck? Yes, a little bit, but it sucks a whole lot less than all your editor integration breaking every time you switch to a project that uses a different version of GHC. I use the following command in a fresh sandbox when a Stackage LTS comes out for a new version of GHC:

$ stack build --copy-compiler-tool ghc-mod hoogle weeder

This way, I only have to build those tools once, and I don’t worry about rebuilding them again until the next release of GHC. To verify that things are working properly, you should be able to create a fresh stack project, run a command like this one, and get a similar result:

$ stack exec -- which ghc-mod
/Users/alexis/.stack/compiler-tools/x86_64-osx/ghc-8.2.2/bin/ghc-mod

Note that this path is scoped to my operating system and my compiler version, but nothing else—no LTS or anything like that.

Warning flags for a safe build

Haskell is a relatively strict language as programming languages go (strict in what the compiler accepts, that is, not in evaluation order), but in my experience, it isn’t quite strict enough. Many things are not errors that probably ought to be, like orphan instances and non-exhaustive pattern matches. Fortunately, GHC provides warnings that catch these problems statically, which fill in the gaps. I recommend using the following flags on all projects to ensure everything is caught:

The -Wall option turns on most warnings, but (ironically) not all of them. The -Weverything flag truly turns on all warnings, but some of the warnings left disabled by -Wall really are quite silly, like warning when type signatures on polymorphic local bindings are omitted. Some of them, however, are legitimately useful, so I recommend turning them on explicitly.

-Wcompat enables warnings that make your code more robust in the face of future backwards-incompatible changes. These warnings are trivial to fix and serve as free future-proofing, so I see no reason not to turn these warnings on.

-Wincomplete-record-updates and -Wincomplete-uni-patterns are things I think ought to be enabled by -Wall because they both catch what are essentially partial pattern-matches (and therefore runtime errors waiting to happen). The fact that -Wincomplete-uni-patterns isn’t enabled by -Wall is so surprising that it can lead to bugs being overlooked, since the extremely similar -Wincomplete-patterns is enabled by -Wall.

-Wredundant-constraints is a useful warning that helps to eliminate unnecessary typeclass constraints on functions, which can sometimes occur if a constraint was previously necessary but ends up becoming redundant due to a change in the function’s behavior.

I put all five of these flags in the .cabal file (or package.yaml), which enables them everywhere, but this alone is unlikely to enforce a warning-free codebase, since the build will still succeed even in the presence of warnings. Therefore, when building projects in CI, I pass the -Werror flag (using --ghc-options=-Werror for stack), which treats warnings as errors and halts the build if any warnings are found. This is useful, since it means warnings don’t halt the whole build while developing, making it possible to write some code that has warnings and still run the test suite, but it still enforces that pushed code be warning-free.
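
To make the flags above a little more concrete, here is a small sketch of the kind of code they complain about. The module compiles and runs, but each marked definition should produce a warning when the flags are on. The names (Shape, grow, firstOr, same) are made up for illustration, and the OPTIONS_GHC pragma is only there to keep the example self-contained; in a real project the flags would live in the .cabal or package.yaml file as described above.

```haskell
{-# OPTIONS_GHC -Wall -Wcompat -Wincomplete-record-updates -Wincomplete-uni-patterns -Wredundant-constraints #-}

-- Hypothetical type: only one constructor has the `radius` field.
data Shape
  = Circle { radius :: Double }
  | Square Double

-- -Wincomplete-record-updates: this update would fail at runtime
-- if it were ever applied to a Square.
grow :: Shape -> Shape
grow s = s { radius = radius s * 2 }

-- -Wincomplete-uni-patterns: the pattern binding fails on an empty list.
firstOr :: [Int] -> Int
firstOr xs = let (x : _) = xs in x

-- -Wredundant-constraints: the Show constraint is never used.
same :: (Eq a, Show a) => a -> a -> Bool
same x y = x == y

main :: IO ()
main = do
  print (radius (grow (Circle 1.5)))  -- prints 3.0
  print (firstOr [7, 8, 9])           -- prints 7
  print (same 'a' 'a')                -- prints True
```

Adding -Werror to the pragma (or passing it on the command line) should turn each of those warnings into a build failure, which is the behavior I want in CI.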

Any flavor you like

Haskell is both a language and a spectrum of languages. It is both a standard and a specific implementation. Haskell 98 and Haskell 2010 are good, small languages, and there are a few different implementations, but when people talk about “Haskell”, unqualified, they’re almost always talking about GHC.

GHC Haskell, in stark contrast to standard Haskell, is neither small nor particularly specific, since GHC ships with dozens of knobs and switches that can be used to configure the language. In theory, this is a little terrifying. How could anyone ever hope to talk about Haskell and agree upon how to write it if there are so many different Haskells, each a little bit distinct? Having a cohesive ecosystem would be completely hopeless.

Fortunately, in practice, this is not nearly as bad as it seems. The majority of GHC extensions are simple switches: a feature is either on or it is off. Turning a feature on rarely affects code that does not use it, so most extensions can be turned on by default, and programmers may simply avoid the features they do not wish to use, just as any programmer in any programming language likely picks a subset of their language’s features to use on a daily basis. Writing Haskell is no different in this regard; it differs only in that it does not allow all features to be used by default. Everything from minor syntactic tweaks to entirely new facets of the type system is opt-in.

Frankly, I think the UX around this is terrible. I recognize the desire to implement a standard Haskell, and the old -fglasgow-exts was not an especially elegant solution for people wishing to use nonstandard Haskell, but having to insert LANGUAGE pragmas at the top of every module just to take advantage of the best features GHC has to offer is a burden, and it is unnecessarily intimidating. I think much of the Haskell community finds the use of LANGUAGE pragmas preferable to enabling extensions globally using the default-extensions list in the .cabal file, but I cut across the grain on that issue hard. The vast majority of language extensions I use are extensions I want enabled all the time; a list of them at the top of a module is just distracting noise, and it only serves to bury the extensions I really do want to enable on a module-by-module basis. It also makes it tricky to communicate with a team which extensions are acceptable (or even preferable) and which are discouraged.

My strong recommendation if you decide to write GHC Haskell on a team is to agree as a group on a list of extensions the team is happy with enabling everywhere and put those extensions in the default-extensions list in the .cabal file. This eliminates clutter, busywork, and the conceptual overhead of remembering which extensions are in favor and which are discouraged. This is a net win, and it isn’t at all difficult to look in the .cabal file when you want to know which extensions are in use.

Now, with that small digression out of the way, the question becomes precisely which extensions should go into that default-extensions list. I happen to like using most of the features GHC makes available, so I enable a whopping 34 language extensions by default. As of GHC 8.2, here is my list:

This is a lot, and a few of them are likely to be more controversial than others. Since I do not imagine everyone will agree with everything in this list, I’ve broken it down into smaller chunks, arranged from what I think ought to be least controversial to most controversial, along with a little bit of justification why each extension is in each category. If you’re interested in coming up with your own list of extensions, the rest of this section is for you.

Trivial lifting of standards-imposed limitations

A few extensions are tiny changes that lift limitations that really have no reason to exist, other than that they are mandated by the standard. I am not sure why these restrictions are in the standard to begin with, other than perhaps a misguided attempt at making the language simpler. These extensions include the following:

EmptyCase
FlexibleContexts
FlexibleInstances
InstanceSigs
MultiParamTypeClasses

These extensions have no business not being turned on everywhere. FlexibleContexts and FlexibleInstances end up being turned on in almost any nontrivial Haskell module, since without them, the typeclass system is pointlessly and artificially limited.

InstanceSigs is extremely useful, completely safe, and has zero downsides.

MultiParamTypeClasses are almost impossible to avoid, given how many libraries use them, and they are a completely obvious generalization of single-parameter typeclasses. Much like FlexibleContexts and FlexibleInstances, I see no real reason to ever leave these disabled.

EmptyCase is even stranger to me, since EmptyDataDecls is in Haskell 2010, so it’s possible to define empty datatypes in standard Haskell but not exhaustively pattern-match on them! This is silly, and EmptyCase should be standard Haskell.
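
As a quick illustration of how unremarkable these extensions are in use, here is a minimal sketch that exercises four of them. The Never type and the Convert class are hypothetical names invented for the example, not part of any library.

```haskell
{-# LANGUAGE EmptyCase #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- An empty datatype, which Haskell 2010 already permits.
data Never

-- EmptyCase: an exhaustive match with zero alternatives.
impossible :: Never -> a
impossible n = case n of {}

-- MultiParamTypeClasses: a (hypothetical) class relating two types.
class Convert a b where
  convert :: a -> b

-- FlexibleInstances: the instance head mentions the concrete type [Char].
instance Convert Int [Char] where
  -- InstanceSigs: a type signature inside the instance body.
  convert :: Int -> [Char]
  convert = show

instance Convert Bool Int where
  convert :: Bool -> Int
  convert b = if b then 1 else 0

main :: IO ()
main = do
  putStrLn (convert (42 :: Int))
  print (convert True :: Int)
  print (either impossible id (Right 'x' :: Either Never Char))
```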

Syntactic conveniences

A few GHC extensions are little more than trivial, syntactic abbreviations. These things would be tiny macros in a Lisp, but they need to be extensions to the compiler in Haskell:

LambdaCase
MultiWayIf
NamedFieldPuns
TupleSections

All of these extensions are only triggered by explicit use of new syntax, so existing programs will never change behavior when these extensions are introduced.

LambdaCase only saves a few characters, but it eliminates the need to come up with a fresh, unique variable name that will only be used once, which is sometimes hard to do and leads to worse names overall. Sometimes, it really is better to leave something unnamed.

MultiWayIf isn’t something I find I commonly need, but when I do, it’s nice to have. It’s far easier to read than nested if...then...else chains, and it uses the existing guard syntax already used with function declarations and case...of, so it’s easy to understand, even to those unfamiliar with the extension.

NamedFieldPuns avoids headaches and clutter when using Haskell records without the accidental identifier capture issues of RecordWildCards. It’s a nice, safe compromise that brings some of the benefits of RecordWildCards without any downsides.

TupleSections is a logical generalization of tuple syntax in the same vein as standard operator sections, and it’s quite useful when using applicative notation. I don’t see any reason to not enable it.
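
Here is a small sketch showing all four side by side. The User record and the function names are made up for the example.

```haskell
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE MultiWayIf #-}
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE TupleSections #-}

-- Hypothetical record used only for this example.
data User = User { name :: String, age :: Int }

-- LambdaCase: no throwaway argument name is needed.
describe :: Maybe Int -> String
describe = \case
  Nothing -> "none"
  Just n  -> "some " ++ show n

-- MultiWayIf: guard syntax instead of nested if/then/else.
bucket :: Int -> String
bucket n =
  if | n < 13    -> "child"
     | n < 20    -> "teen"
     | otherwise -> "adult"

-- NamedFieldPuns: bind fields by name, without RecordWildCards.
summary :: User -> String
summary User{name, age} = name ++ " (" ++ bucket age ++ ")"

-- TupleSections: a section of the tuple constructor.
tagAll :: [a] -> [(String, a)]
tagAll = map ("tag",)

main :: IO ()
main = do
  putStrLn (describe (Just 3))
  putStrLn (summary User{name = "Ada", age = 36})
  print (tagAll [1 :: Int, 2])
```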

Extensions to the deriving mechanism

GHC’s typeclass deriving mechanism is one of the things that makes Haskell so pleasant to write, and in fact I think Haskell would be nearly unpalatable to write without it. Boilerplate generation is a good thing, since it defines operations in terms of a single source of truth, and generated code is code you do not need to maintain. There is rarely any reason to write a typeclass instance by hand when the deriving mechanism will write it automatically.

These extensions give GHC’s typeclass deriving mechanism more power without any cost. Therefore, I see no reason not to enable them:

The first five of these simply extend the list of typeclasses GHC knows how to derive, something that will only ever be triggered if the user explicitly requests GHC derive one of those classes. GeneralizedNewtypeDeriving is quite possibly one of the most important extensions in all of Haskell, since it dramatically improves newtypes’ utility. Wrapper types can inherit instances they need without any boilerplate, and making increased type safety easier and more accessible is always a good thing in my book.

DerivingStrategies is new to GHC 8.2, but it finally presents the functionality of GHC’s DeriveAnyClass extension in a useful way. DeriveAnyClass is useful when used with certain libraries that use DefaultSignatures (discussed later) with GHC.Generics to derive instances of classes without the deriving being baked into GHC. Unfortunately, enabling DeriveAnyClass essentially disables the far more useful GeneralizedNewtypeDeriving, so I do not recommend enabling DeriveAnyClass. Fortunately, with DerivingStrategies, it’s possible to opt into the anyclass deriving strategy on a case-by-case basis, getting some nice boilerplate reduction in the process.

StandaloneDeriving is useful when GHC’s deriving algorithms aren’t quite clever enough to deduce the instance context automatically, so it allows specifying it manually. This is only useful in a few small situations, but it’s nice to have, and there are no downsides to enabling it, so it ought to be turned on.
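
A minimal sketch of how these fit together, assuming GHC 8.2 or later for DerivingStrategies. The types (Cents, Pair, Wrap) are hypothetical, and FlexibleContexts is enabled only because the standalone instance context mentions f a.

```haskell
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE StandaloneDeriving #-}

-- GeneralizedNewtypeDeriving: reuse Int's Num instance for the wrapper,
-- while Eq, Ord, and Show are derived the ordinary (stock) way.
newtype Cents = Cents Int
  deriving stock (Eq, Ord, Show)
  deriving newtype (Num)

-- DeriveFunctor: GHC writes fmap for us.
data Pair a = Pair a a
  deriving stock (Show, Functor)

-- StandaloneDeriving: an ordinary deriving clause cannot work out the
-- context Show (f a), so it is spelled out by hand.
newtype Wrap f a = Wrap (f a)
deriving stock instance Show (f a) => Show (Wrap f a)

main :: IO ()
main = do
  print (Cents 250 + Cents 175)
  print (fmap (* 2) (Pair 1 2 :: Pair Int))
  print (Wrap (Just 'x'))
```

Naming the strategy explicitly (stock or newtype) is what removes the ambiguity between deriving mechanisms described above.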

Lightweight syntactic adjustments

A couple extensions tweak Haskell’s syntax in more substantial ways than things like LambdaCase, but not in a significant enough way for them to really be at all surprising:

BangPatterns
KindSignatures
TypeOperators

BangPatterns mirror strictness annotations on datatypes, so they are unlikely to be confusing, and they provide a much more pleasant notation for annotating the strictness of bindings than explicit uses of seq.

KindSignatures are also fairly self-explanatory: they’re just like type annotations, but for types instead of values. Writing kind signatures explicitly is usually unnecessary, but they can be helpful for clarity or for annotating phantom types when PolyKinds is not enabled. Enabling KindSignatures doesn’t have any adverse effects, so I see no reason not to enable it everywhere.

TypeOperators adjusts the syntax of types slightly, allowing operators to be used as type constructors and written infix, which is technically backwards-incompatible, but I’m a little suspicious of anyone using (!@#$) as a type variable (especially since standard Haskell does not allow them to be written infix). This extension is useful with some libraries like natural-transformations that provide infix type constructors, and it makes the type language more consistent with the value language.
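
For a sense of what these look like in practice, here is a short sketch. The Tagged and :+: types are defined locally for illustration rather than taken from any library.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)

-- BangPatterns: force the accumulator on every step instead of using seq.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- KindSignatures: annotate the kinds of type parameters,
-- here including a phantom `tag` parameter.
newtype Tagged (tag :: Type) (a :: Type) = Tagged a
  deriving Show

-- TypeOperators: an infix type constructor (a local, hypothetical sum type).
data a :+: b = L a | R b
  deriving Show

infixr 5 :+:

describe :: Int :+: Bool :+: String -> String
describe (L n)     = "int " ++ show n
describe (R (L b)) = "bool " ++ show b
describe (R (R s)) = "string " ++ s

main :: IO ()
main = do
  print (sumStrict [1 .. 100])
  print (Tagged 'x' :: Tagged Int Char)
  putStrLn (describe (R (L True)))
```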

Polymorphic string literals

I’m putting this extension in a category all of its own, mostly because I don’t think any other Haskell extensions have quite the same set of tradeoffs:

OverloadedStrings

For me, OverloadedStrings is not optional. Haskell’s infamous “string problem” (discussed in more detail at the end of this blog post) means that String is a linked list of characters, and all code that cares about performance actually uses Text. Manually invoking pack on every single string literal in a program is just noise, and OverloadedStrings solves that noise.

That said, I actually find I don’t use the polymorphism of string literals very often, and I’d be alright with monomorphic literals if I could make them all have type Text. Unfortunately, there isn’t a way to do this, so OverloadedStrings is the next best thing, even if it sometimes causes some unnecessary ambiguities that require type annotations to resolve.

OverloadedStrings is an extension that I use so frequently, in so many modules (especially in my test suites) that I would rather keep it on everywhere so I don’t have to care about whether or not it’s enabled in the module I’m currently writing. On the other hand, it certainly isn’t my favorite language extension, either. I wouldn’t go as far as to call it a necessary evil, since I don’t think it’s truly “evil”, but it does seem to be necessary.
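
To show both the convenience and the occasional ambiguity, here is a sketch that only depends on base. The Name newtype is a hypothetical stand-in for Text, which has its own IsString instance in the text package.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Hypothetical wrapper standing in for a type like Text.
newtype Name = Name String
  deriving (Eq, Show)

-- String literals can now be used wherever a Name is expected.
instance IsString Name where
  fromString = Name

greet :: Name -> String
greet (Name n) = "Hello, " ++ n

main :: IO ()
main = do
  -- The literal is a Name here; no explicit conversion is needed.
  putStrLn (greet "world")
  print ("literal" == Name "literal")
  -- Without the annotation, the literal's type would be ambiguous,
  -- since length works on any Foldable container.
  print (length ("abc" :: String))
```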

Simple extensions to aid type annotation

The following two extensions significantly round out Haskell’s language for referring to types, making it much easier to insert type annotations where necessary (for removing ambiguity or for debugging type errors):

ScopedTypeVariables
TypeApplications

That the behavior of ScopedTypeVariables is not the default is actually one of the most common gotchas for new Haskellers. Sadly, it can theoretically adjust the behavior of existing Haskell programs, so I cannot include it in the list of trivial changes, but I would argue such programs were probably confusing to begin with, and I have never seen a program in practice that was impacted by that problem. I think leaving ScopedTypeVariables off is much, much more likely to be confusing than turning it on.

TypeApplications is largely unrelated, but I include it in this category because it’s quite useful and cooperates well with ScopedTypeVariables. Use of TypeApplications makes instantiation much more lightweight than full-blown type annotations, and once again, it has no downsides if it is enabled and unused (since it is a syntactic addition). I recommend enabling it.
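
A small sketch of the two working together. The function names are invented for the example.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

-- ScopedTypeVariables: the explicit forall brings `a` into scope, so the
-- local signature refers to the same `a` as the outer one. Without the
-- extension, the inner `a` would be a fresh variable and this would not
-- typecheck.
pairUp :: forall a. [a] -> [(a, a)]
pairUp xs = zip xs shifted
  where
    shifted :: [a]
    shifted = drop 1 xs

-- TypeApplications: pick the type that `read` should produce directly.
parseInt :: String -> Int
parseInt = read @Int

roundTrip :: String -> String
roundTrip = show . read @Double

main :: IO ()
main = do
  print (pairUp "abc")
  print (parseInt "42" + 1)
  putStrLn (roundTrip "2.50")
```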

Simple extensions to the Haskell type system

A few extensions tweak the Haskell type system in ways that I think are simple enough to be self-explanatory, even to people who might not have known they existed. These are as follows:

ConstraintKinds
RankNTypes

ConstraintKinds is largely just used to define typeclass aliases, which is both useful and self-explanatory. Unifying the type and constraint language also has the effect of allowing type-level programming with constraints, which is sometimes useful, but far rarer in practice than the aforementioned use case.

RankNTypes are uncommon, looking at the average type in a Haskell program, but they’re certainly nice to have when you need them. The idea of pushing foralls further into a type to adjust how variables are quantified is something people seem to find fairly intuitive, especially after seeing it used once or twice, and higher-rank types do crop up regularly, if infrequently.
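
Here is a minimal sketch of each. Printable and applyBoth are hypothetical names chosen for the example.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE RankNTypes #-}

-- ConstraintKinds: a type synonym that stands for a pair of constraints.
type Printable a = (Show a, Ord a)

largest :: Printable a => [a] -> String
largest = show . maximum

-- RankNTypes: the argument must itself be polymorphic, because it is
-- used at two different element types inside the body.
applyBoth :: (forall x. [x] -> Int) -> ([Int], String) -> (Int, Int)
applyBoth f (ns, s) = (f ns, f s)

main :: IO ()
main = do
  putStrLn (largest [3, 1, 4 :: Int])
  print (applyBoth length ([1, 2, 3], "hello"))
```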

Intermediate syntactic adjustments

Three syntactic extensions to Haskell are a little bit more advanced than the ones I’ve already covered, and none of them are especially related:

ApplicativeDo
DefaultSignatures
PatternSynonyms

ApplicativeDo is, on the surface, simple. It changes do notation to use Applicative operations where possible, which allows using do notation with applicative functors that are not monads, and it also makes operations potentially more performant when (<*>) can be implemented more efficiently than (>>=). In theory, it sounds like there are no downsides to enabling this everywhere. However, there are a few drawbacks that lead me to put it so low on this list:

It considerably complicates the desugaring of do blocks, to the point where the algorithm cannot even be easily syntactically documented. In fact, an additional compiler flag, -foptimal-applicative-do, is a way to opt into optimal solutions for do block expansions, tweaking the desugaring algorithm to have an O(n³) time complexity! This means that the default behavior is guided by a heuristic, and desugaring isn’t even especially predictable. This isn’t necessarily so bad, since it’s really only intended as an optimization when some Monad operations are still necessary, but it does dramatically increase the complexity of one of Haskell’s core forms.
The desugaring, despite being O(n²) by default, isn’t even especially clever. It relies on a rather disgusting hack that recognizes return e, return $ e, pure e, or pure $ e expressions syntactically, and it completely gives up if an expression with precisely that shape is not the final statement in a do block. This is a bit awkward, since it effectively turns return and pure into syntax when before they were merely functions, but that isn’t all. It also means that the following do block is not desugared using Applicative operations:
```
do foo a b
   bar s t
   baz y z
```
This will use the normal, monadic desugaring, despite the fact that it is trivially desugared into Applicative operations as foo a b *> bar s t *> baz y z. In order to get ApplicativeDo to trigger here, the do block must be contorted into the following:
```
do foo a b
   bar s t
   r <- baz y z
   pure r
```
This seems like an odd oversight.
TemplateHaskell doesn’t seem able to cope with do blocks when ApplicativeDo is enabled. I reported this as an issue on the GHC bug tracker, but it hasn’t received any attention, so it’s not likely to get fixed unless someone takes the initiative to do so.
Enabling ApplicativeDo can cause problems with code that may have assumed do would always be monadic, and sometimes, that can cause code that typechecks to lead to an infinite loop at runtime. Specifically, if do notation is used to define (<*>) in terms of (>>=), enabling ApplicativeDo will cause the definition of (<*>) to become self-referential and therefore divergent. Fortunately, this issue can be easily mitigated by simply writing (<*>) = ap instead, which is clearer and shorter than the equivalent code using do.
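
The mitigation from the last point can be sketched as follows. Counter is a hypothetical state-like monad written just for this example; the important line is (<*>) = ap, which goes through (>>=) explicitly and so cannot be rewritten back into (<*>) by the extension.

```haskell
{-# LANGUAGE ApplicativeDo #-}

import Control.Monad (ap, liftM)

-- Hypothetical monad: threads an Int counter through a computation.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

instance Functor Counter where
  fmap = liftM

instance Applicative Counter where
  pure x = Counter (\n -> (x, n))
  -- Writing this with a do block would, under ApplicativeDo, desugar
  -- back into (<*>) and diverge. `ap` is defined via (>>=) instead.
  (<*>) = ap

instance Monad Counter where
  Counter m >>= k = Counter $ \n ->
    let (x, n') = m n
    in runCounter (k x) n'

-- Return the current counter value and increment it.
tick :: Counter Int
tick = Counter (\n -> (n, n + 1))

-- The statements are independent and the block ends in `pure`, so this
-- block is a candidate for the Applicative desugaring.
twoTicks :: Counter (Int, Int)
twoTicks = do
  a <- tick
  b <- tick
  pure (a, b)

main :: IO ()
main = print (runCounter twoTicks 0)  -- prints ((0,1),2)
```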

Given all these things, it seems ApplicativeDo is a little too new in a few places, and it isn’t quite baked. Still, I keep it enabled by default. Why? Well, usually it works fine without any problems, and when I run into issues, I can disable it on a per-module basis by writing {-# LANGUAGE NoApplicativeDo #-}. Keeping it enabled by default is fine the vast majority of the time; I just sometimes need to work around the bugs.

In contrast, DefaultSignatures isn’t buggy at all, as far as I can tell; it’s just not usually useful without fairly advanced features like GADTs (for type equalities) or GHC.Generics. I mostly use it for making lifting instances for mtl-style typeclasses easier to write, which I’ve found to be a tiny bit tricky to explain (mostly due to the use of type equalities in the context), but it works well. I don’t see any real reason to leave this disabled, but if you don’t think you’re going to use it anyway, it doesn’t really matter one way or the other.

Finally, PatternSynonyms allow users to extend the pattern language just as they are allowed to extend the value language. Bidirectional pattern synonyms are isomorphisms, and it’s quite useful to allow those isomorphisms to be used with Haskell’s usual pattern-matching syntax. I think this extension is actually quite benign, but I put it so low on this list because it seems infrequently used, and I get the sense most people consider it fairly advanced. I would argue, however, that it’s a very pleasant, useful extension, and it’s no more complicated than a number of the features in Haskell 98.
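
Here is a deliberately simple sketch of both extensions. It uses a plain Show-based default rather than the mtl-style lifting mentioned above, and the Describe class and Point type are hypothetical.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE PatternSynonyms #-}

-- DefaultSignatures: the default implementation may demand more
-- (here, Show) than the class itself does.
class Describe a where
  describe :: a -> String
  default describe :: Show a => a -> String
  describe = show

instance Describe Int  -- picks up the Show-based default

instance Describe Bool where
  describe b = if b then "yes" else "no"

-- The underlying representation is a tuple...
newtype Point = MkPoint (Double, Double)

-- ...but a bidirectional pattern synonym lets us both match on and
-- construct points as if there were a two-argument constructor.
pattern Point :: Double -> Double -> Point
pattern Point x y = MkPoint (x, y)

{-# COMPLETE Point #-}

norm1 :: Point -> Double
norm1 (Point x y) = abs x + abs y

mirror :: Point -> Point
mirror (Point x y) = Point y x

main :: IO ()
main = do
  putStrLn (describe (5 :: Int))
  putStrLn (describe True)
  print (norm1 (mirror (Point 3 (-4))))
```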

Intermediate extensions to the Haskell type system

Now we’re getting into the meat of things. Everything up to this point has been, in my opinion, completely self-evident in its usefulness and simplicity. As far as I’m concerned, the extensions in the previous six sections have no business ever being left disabled. Starting in this section, however, I could imagine a valid argument being made either way.

The following three extensions add some complexity to the Haskell type system in return for some added expressive power:

ExistentialQuantification
FunctionalDependencies
GADTs

ExistentialQuantification and GADTs are related, given that the former is subsumed by the latter, but GADTs also enables an alternative syntax. Both syntaxes allow packing away a typeclass dictionary or equality constraint that is brought into scope upon a successful pattern-match against a data constructor, something that is sometimes quite useful but certainly a departure from Haskell’s simple ADTs.

FunctionalDependencies extend multi-parameter typeclasses, and they are almost unavoidable, given their use in the venerable mtl library. Like GADTs, FunctionalDependencies add an additional layer of complexity to the typeclass system in order to express certain things that would otherwise be difficult or impossible.

All of these extensions involve a tradeoff. Enabling GADTs also implies MonoLocalBinds, which disables let generalization, one of the most likely ways a program that used to typecheck might subsequently fail to do so. Some might argue that this is a good reason to turn GADTs on on a per-module basis, but I disagree: I actually want my language to be fairly consistent, and given that I know I am likely going to want to use GADTs somewhere, I want MonoLocalBinds enabled everywhere, not inconsistently and sporadically.

That aside, all these extensions are relatively safe. They are well-understood, and they are fairly self-contained extensions to the Haskell type system. I think these extensions have a very good power to cost ratio, and I find myself using them regularly (especially FunctionalDependencies), so I keep them enabled globally.
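
For readers who have not seen them, here is a compact sketch of all three. Every name in it (Showable, Expr, Container) is hypothetical, and FlexibleInstances is only needed for the instance at the bottom.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE GADTs #-}

-- ExistentialQuantification: the constructor packs up a Show dictionary
-- that becomes available again when we pattern-match.
data Showable = forall a. Show a => MkShowable a

render :: Showable -> String
render (MkShowable x) = show x

-- GADTs: each constructor refines the type index, so eval needs no tags
-- or runtime checks.
data Expr t where
  IntE  :: Int -> Expr Int
  BoolE :: Bool -> Expr Bool
  Add   :: Expr Int -> Expr Int -> Expr Int
  If    :: Expr Bool -> Expr t -> Expr t -> Expr t

eval :: Expr t -> t
eval (IntE n)   = n
eval (BoolE b)  = b
eval (Add a b)  = eval a + eval b
eval (If c t e) = if eval c then eval t else eval e

-- FunctionalDependencies: the container type determines its element type.
class Container f e | f -> e where
  empty  :: f
  insert :: e -> f -> f
  toL    :: f -> [e]

instance Container [a] a where
  empty  = []
  insert = (:)
  toL    = id

main :: IO ()
main = do
  putStrLn (unwords (map render [MkShowable (1 :: Int), MkShowable True]))
  print (eval (If (BoolE True) (Add (IntE 2) (IntE 3)) (IntE 0)))
  print (toL (insert 'a' (insert 'b' (empty :: String))))
```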

Advanced extensions to the Haskell type system

Finally, we arrive at the last set of extensions in this list. These are the most advanced features Haskell’s type system currently has to offer, and they are likely to be the most controversial to enable globally:

DataKinds
TypeFamilies
TypeFamilyDependencies

All of these extensions exist exclusively for the purpose of type-level programming. DataKinds allows datatype promotion, creating types that are always uninhabited and therefore can only be used as phantom types. TypeFamilies allows the definition of type-level functions that map types to other types. Both of these are minor extensions to Haskell’s surface area, but they have rather significant ramifications on the sort of programming that can be done and the way GHC’s typechecker must operate.

TypeFamilies is an interesting extension because it comes in so many flavors: associated type synonyms, associated datatypes, open and closed type synonym families, and open datatype families (GHC has no closed form of data families). Associated types tend to be easier to grok and easier to use, though they can also be replaced by functional dependencies. Open type families are also quite similar to classes and instances, so they aren’t too tricky to understand. Closed type families, on the other hand, are a rather different beast, and they can be used to do fairly advanced things, especially in combination with DataKinds.

I happen to appreciate GHC’s support for these features, and while I’m hopeful that an eventual DependentHaskell will alleviate many of the existing infelicities with dependently typed programming in GHC, in the meantime, it’s often useful to enjoy what exists where practically applicable. Therefore, I have little problem keeping them enabled, since, like the vast majority of extensions on this list, these extensions merely lift restrictions, not adjust semantics of the language without the extensions enabled. When I am going to write a type family, I am going to turn on TypeFamilies; I see no reason to annotate the modules in which I decide to do so. I do not write an annotation at the top of each module in which I define a typeclass or a datatype, so why should I do so with type families?

TypeFamilyDependencies is a little bit different, since it’s a very new extension, and it doesn’t seem to always work as well as I would hope. Still, when it doesn’t work, it fails with a very straightforward error message, and when it works, it is legitimately useful, so I don’t see any real reason to leave it off if TypeFamilies is enabled.
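For a sense of what the extension buys you when it does work, here is a hedged sketch. The family `Unsigned` and the class `Labelled` are hypothetical names chosen for the example; the point is only that the injectivity annotation lets GHC work backwards from a result type to the argument type:

```haskell
{-# LANGUAGE TypeFamilyDependencies #-}

import Numeric.Natural (Natural)

-- The annotation `r | r -> a` declares the family injective: the result
-- type determines the argument type.
type family Unsigned a = r | r -> a where
  Unsigned Int     = Word
  Unsigned Integer = Natural

class Labelled a where
  -- `a` appears only underneath the type family. Without the injectivity
  -- annotation, GHC rejects this signature as ambiguous.
  label :: Unsigned a -> String

instance Labelled Int where
  label w = "fixed-width: " ++ show w

instance Labelled Integer where
  label n = "arbitrary-precision: " ++ show n
```

With these definitions, a call like `label (7 :: Word)` should select the `Int` instance, since `Word` can only have come from `Unsigned Int`.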

Extensions intentionally left off this list

Given what I’ve said so far, it may seem like I would advocate flipping on absolutely every lever GHC has to offer, but that isn’t actually true. There are a few extensions I quite intentionally do not enable.

UndecidableInstances is something I turn on semi-frequently, since GHC’s termination heuristic is not terribly advanced, but I turn it on per-module, since it’s useful to know when it’s necessary (and in application code, it rarely is). OverlappingInstances and IncoherentInstances, in contrast, are completely banned—not only are they almost always a bad idea, GHC has a better, more fine-grained way to opt into overlapping instances, using the {-# OVERLAPPING #-}, {-# OVERLAPPABLE #-}, and {-# INCOHERENT #-} pragmas.
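A minimal sketch of the per-instance pragmas, using a made-up `Pretty` class: the general list instance is marked as overridable, and the `String` instance explicitly overrides it. Only FlexibleInstances is needed, and only because of the `[Char]` instance head:

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.List (intercalate)

-- Hypothetical class for illustration.
class Pretty a where
  pretty :: a -> String

instance Pretty Int where
  pretty = show

-- The general list instance agrees to be overridden by more specific ones.
instance {-# OVERLAPPABLE #-} Pretty a => Pretty [a] where
  pretty xs = "[" ++ intercalate ", " (map pretty xs) ++ "]"

-- The String instance explicitly overrides the general one.
instance {-# OVERLAPPING #-} Pretty [Char] where
  pretty s = s
```

Because the opt-in lives on the instances themselves, every other instance in the module keeps the ordinary non-overlapping behavior.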

TemplateHaskell and QuasiQuotes are tricky ones. Anecdotes seem to suggest that enabling TemplateHaskell everywhere leads to worse compile times, but after trying this on a few projects and measuring, I wasn’t able to detect any meaningful difference. Unless I manage to come up with some evidence that these extensions actually slow down compile times just by being enabled, even if they aren’t used, I may add them to my list of globally-enabled extensions, since I use them regularly.

Other extensions I haven’t mentioned are probably things I just don’t use very often and therefore haven’t felt the need to include on this list. It certainly isn’t exhaustive, and I add to it all the time, so I expect I will continue to do so in the future. This is just what I have for now, and if your favorite extension isn’t included, it probably isn’t a negative judgement against that extension. I just didn’t think to mention it.

Libraries: a field guide

Now that you’re able to build a Haskell project and have chosen which handpicked flavor of Haskell you are going to write, it’s time to decide which libraries to use. Haskell is an expressive programming language, and the degree to which different libraries can shape the way you structure your code is significant. Picking the right libraries can lead to clean code that’s easy to understand and maintain, but picking the wrong ones can lead to disaster.

Of course, there are thousands of Haskell libraries on Hackage alone, so I cannot hope to cover all of the ones I have ever found useful, and I certainly cannot cover ones that would be useful but I did not have the opportunity to try (of which there are certainly many). This blog post is long enough already, so I’ll just cover a few categories of libraries that I think I can offer interesting commentary on; most libraries can generally speak for themselves.

Having an effect

One of the first questions Haskell programmers bump into when they begin working on a large application is how they’re going to model effects. Few practical programming languages are pure, but Haskell is one of them, so there’s no getting away from coming up with a way to manage side-effects.

For some applications, Haskell’s built-in solution might be enough: IO. This can work decently for data processing programs that do very minimal amounts of I/O, and the types of side-effects they perform are minimal. For these applications, most of the logic is likely to be pure, which means it’s already easy to reason about and easy to test. For other things, like web applications, it’s more likely that a majority of the program logic is going to be side-effectful by its nature—it may involve making HTTP requests to other services, interacting with a database, and writing to logfiles.

Figuring out how to structure these effects in a type-safe, decoupled, composable way can be tricky, especially since Haskell has so many different solutions. I could not bring myself to choose just one, but I did choose two: the so-called “mtl style” and freer monads.

mtl style is so named because it is inspired by the technique of interlocking monadic typeclasses and lifting instances used to model effects using constraints that is used in the mtl library. Here is a small code example of what mtl style typeclasses and handlers look like:

class Monad m => MonadFileSystem m where
  readFile :: FilePath -> m String
  writeFile :: FilePath -> String -> m ()

  default readFile :: (MonadTrans t, MonadFileSystem m', m ~ t m') => FilePath -> m String
  readFile a = lift $ readFile a

  default writeFile :: (MonadTrans t, MonadFileSystem m', m ~ t m') => FilePath -> String -> m ()
  writeFile a b = lift $ writeFile a b

instance MonadFileSystem IO where
  readFile = Prelude.readFile
  writeFile = Prelude.writeFile

instance MonadFileSystem m => MonadFileSystem (ExceptT e m)
instance MonadFileSystem m => MonadFileSystem (MaybeT m)
instance MonadFileSystem m => MonadFileSystem (ReaderT r m)
instance MonadFileSystem m => MonadFileSystem (StateT s m)
instance MonadFileSystem m => MonadFileSystem (WriterT w m)

newtype InMemoryFileSystemT m a = InMemoryFileSystemT (StateT [(FilePath, String)] m a)
  deriving (Functor, Applicative, Monad, MonadError e, MonadReader r, MonadWriter w)

instance Monad m => MonadFileSystem (InMemoryFileSystemT m) where
  readFile path = InMemoryFileSystemT $ do
    vfs <- get
    case lookup path vfs of
      Just contents -> pure contents
      Nothing -> error ("readFile: no such file " ++ path)

  writeFile path contents = InMemoryFileSystemT $ modify $ \vfs ->
    (path, contents) : delete (path, contents) vfs

This is the most prevalent way to abstract over effects in Haskell, and it’s been around for a long time. Due to the way it uses the typeclass system, it’s also very fast, since GHC can often specialize and inline the typeclass dictionaries to avoid runtime dictionary passing. The main drawbacks are the amount of boilerplate required and the conceptual difficulty of understanding exactly how monad transformers, monadic typeclasses, and lifting instances all work together to discharge mtl style constraints.
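To show what this buys you in practice, here is a cut-down sketch that depends only on base. It is not the class above: `MonadFS`, `copyFS`, and the hand-rolled `InMemory` monad are stand-ins I am assuming for illustration, so that the example runs without mtl. Program logic is written once against the constraint and then run under whichever handler is convenient:

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical, cut-down variant of the class above, using only base.
class Monad m => MonadFS m where
  readFS  :: FilePath -> m String
  writeFS :: FilePath -> String -> m ()

-- Program logic is written once against the constraint.
copyFS :: MonadFS m => FilePath -> FilePath -> m ()
copyFS from to = readFS from >>= writeFS to

-- Production handler.
instance MonadFS IO where
  readFS  = readFile
  writeFS = writeFile

-- Test handler: a hand-rolled state monad over an association list.
type VFS = [(FilePath, String)]

newtype InMemory a = InMemory { runInMemory :: VFS -> (a, VFS) }

instance Functor InMemory where
  fmap f (InMemory g) = InMemory $ \s -> let (a, s') = g s in (f a, s')

instance Applicative InMemory where
  pure a = InMemory $ \s -> (a, s)
  InMemory f <*> InMemory g = InMemory $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad InMemory where
  InMemory g >>= k = InMemory $ \s ->
    let (a, s1) = g s in runInMemory (k a) s1

instance MonadFS InMemory where
  -- Unlike the example above, a missing file reads as "" to stay total.
  readFS path = InMemory $ \vfs -> (fromMaybe "" (lookup path vfs), vfs)
  writeFS path contents = InMemory $ \vfs ->
    ((), (path, contents) : filter ((/= path) . fst) vfs)
```

Running `runInMemory (copyFS "a" "b") [("a", "hello")]` exercises `copyFS` without touching the disk, while the very same function runs in `IO` in production.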

There are various alternatives to mtl’s direct approach to effect composition, most of which are built around the idea of reifying a computation as a data structure and subsequently interpreting it. The most popular of these is the Free monad, a clever technique for deriving a monad from a functor that happens to be useful for modeling programs. Personally, I think Free is overhyped. It’s a cute, mathematically elegant technique, but it involves a lot of boilerplate, and composing effect algebras is still a laborious process. The additional expressive power of Free, namely its ability to choose an interpreter dynamically, at runtime, is rarely necessary or useful, and it adds complexity and reduces performance for few benefits. (And in fact, this is still possible to do with mtl style, it’s just uncommon because there is rarely any need to do so.)
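For readers who have not seen it, this is roughly what the technique looks like. The sketch below hand-rolls `Free` rather than using any particular library, and the `FileSystemF` functor and `runPure` interpreter are names assumed for the example. Note how much of it is boilerplate:

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.Maybe (fromMaybe)

-- A hand-rolled free monad over a functor `f`.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- The effect algebra must be a functor, so each operation carries an
-- explicit continuation (`next`).
data FileSystemF next
  = ReadFileF FilePath (String -> next)
  | WriteFileF FilePath String next
  deriving Functor

readFileF :: FilePath -> Free FileSystemF String
readFileF path = Free (ReadFileF path Pure)

writeFileF :: FilePath -> String -> Free FileSystemF ()
writeFileF path contents = Free (WriteFileF path contents (Pure ()))

-- A program is now a data structure...
appendF :: FilePath -> String -> Free FileSystemF ()
appendF path extra = do
  old <- readFileF path
  writeFileF path (old ++ extra)

-- ...which an interpreter walks. This one is pure; a missing file reads as "".
runPure :: [(FilePath, String)] -> Free FileSystemF a -> (a, [(FilePath, String)])
runPure vfs (Pure a) = (a, vfs)
runPure vfs (Free (ReadFileF path k)) =
  runPure vfs (k (fromMaybe "" (lookup path vfs)))
runPure vfs (Free (WriteFileF path contents next)) =
  runPure ((path, contents) : filter ((/= path) . fst) vfs) next
```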

A 2017 blog post entitled Free monad considered harmful discussed Free in comparison with mtl style, and unsurprisingly cast Free in a rather unflattering light. I largely agree with everything outlined in that blog post, so I will not retread its arguments here. I do, however, think that there is another abstraction that is quite useful: the so-called “freer monad” used to implement extensible effects.

Freer moves even further away from worrying about functors and monads, since its effect algebras do not even need to be functors. Instead, freer’s effect algebras are ordinary GADTs, and reusable, composable effect handlers are easily written to consume elements of these datatypes. Unfortunately, the way this works means that GHC is still not clever enough to optimize freer monads as efficiently as mtl style, since it can’t easily detect when the interpreter is chosen statically and use that information to specialize and inline effect implementations, but the cost difference is significantly reduced, and I’ve found that in real application code, the vast majority of the cost does not come from the extra overhead introduced by a more expensive (>>=).
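The core idea fits in a few lines. What follows is a deliberately simplified sketch, restricted to a single effect with no open union of effects, so it is not how any real freer library is implemented. The names `Freer`, `send`, and `runInMemory` are assumptions for the example:

```haskell
{-# LANGUAGE GADTs #-}

import Data.Maybe (fromMaybe)

-- A freer monad over a single effect type `f`. No Functor instance for `f`
-- is needed, because the continuation is stored separately.
data Freer f a where
  Pure :: a -> Freer f a
  Bind :: f x -> (x -> Freer f a) -> Freer f a

instance Functor (Freer f) where
  fmap g (Pure a)    = Pure (g a)
  fmap g (Bind fx k) = Bind fx (fmap g . k)

instance Applicative (Freer f) where
  pure = Pure
  Pure g    <*> x = fmap g x
  Bind fx k <*> x = Bind fx (\v -> k v <*> x)

instance Monad (Freer f) where
  Pure a    >>= k = k a
  Bind fx j >>= k = Bind fx (\v -> j v >>= k)

-- The effect algebra is an ordinary GADT with no continuation plumbing.
data FileSystem r where
  ReadFile  :: FilePath -> FileSystem String
  WriteFile :: FilePath -> String -> FileSystem ()

-- Hand-rolled stand-in for a library's `send`.
send :: f a -> Freer f a
send op = Bind op Pure

type VFS = [(FilePath, String)]

-- A pure handler. Matching on the GADT constructor tells GHC what type the
-- continuation expects. A missing file reads as "".
runInMemory :: VFS -> Freer FileSystem a -> (a, VFS)
runInMemory vfs (Pure a) = (a, vfs)
runInMemory vfs (Bind (ReadFile path) k) =
  runInMemory vfs (k (fromMaybe "" (lookup path vfs)))
runInMemory vfs (Bind (WriteFile path contents) k) =
  runInMemory ((path, contents) : filter ((/= path) . fst) vfs) (k ())
```

Compared with the `Free` encoding, the effect type needs no `Functor` instance and no explicit continuation fields, which is where most of the boilerplate savings come from.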

There are a few different implementations of freer monads, but I, sadly, was not satisfied with any of them, so I decided to contribute to the problem by creating yet another one. My implementation is called freer-simple, and it includes a streamlined API with more documentation than any other freer implementation. Writing the above mtl style example using freer-simple is more straightforward:

data FileSystem r where
  ReadFile :: FilePath -> FileSystem String
  WriteFile :: FilePath -> String -> FileSystem ()

readFile :: Member FileSystem r => FilePath -> Eff r String
readFile a = send $ ReadFile a

writeFile :: Member FileSystem r => FilePath -> String -> Eff r ()
writeFile a b = send $ WriteFile a b

runFileSystemIO :: LastMember IO r => Eff (FileSystem ': r) ~> Eff r
runFileSystemIO = interpretM $ \case
  ReadFile a -> Prelude.readFile a
  WriteFile a b -> Prelude.writeFile a b

runFileSystemInMemory :: [(FilePath, String)] -> Eff (FileSystem ': effs) ~> Eff effs
runFileSystemInMemory initVfs = runState initVfs . fsToState where
  fsToState :: Eff (FileSystem ': effs) ~> Eff (State [(FilePath, String)] ': effs)
  fsToState = reinterpret $ \case
    ReadFile path -> get >>= \vfs -> case lookup path vfs of
      Just contents -> pure contents
      Nothing -> error ("readFile: no such file " ++ path)
    WriteFile path contents -> modify $ \vfs ->
      (path, contents) : delete (path, contents) vfs

(It could be simplified further with a little bit of Template Haskell to generate the readFile and writeFile function definitions, but I haven’t gotten around to writing that.)

So which effect system do I recommend? I used to recommend mtl style, but as of only two months ago, I now recommend freer-simple. It’s easier to understand, involves less boilerplate, achieves “good enough” performance, and generally gets out of the way wherever possible. Its API is designed to make it easy to do the sorts of things you most commonly need to do, and it provides a core set of effects that can be used to build a real-world application.

That said, freer is indisputably relatively new and relatively untested. It has success stories, but mtl style is still the approach used by the majority of the ecosystem. mtl style has more library support, its performance characteristics are better understood, and it is a tried and true way to structure effects in a Haskell application. If you understand it well enough to use it, and you are happy with it in your application, my recommendation is to stick with it. If you find it confusing, however, or you end up running up against its limits, give freer-simple a try.

Through the looking glass: to lens or not to lens

There’s no getting around it: lens is a behemoth of a library. For a long time, I wrote Haskell without it, and honestly, it worked out alright. I just wasn’t doing a whole lot of work that involved complicated, deeply-nested data structures, and I didn’t feel the need to bring in a library with such a reputation for having impenetrable operators and an almost equally impenetrable learning curve.

But, after some time, I decided I wanted to take the plunge. So I braced myself for the worst, pulled out my notebook, and started writing some code. To my surprise… it wasn’t that hard. It made sense. Sure, I still don’t know how it works on the inside, and I never did learn the majority of the exports in Control.Lens.Operators, but I had no need to. Lenses were useful in the way I had expected them to be, and so were prisms. One thing led to another, and before long, I understood the relationship between the various optics, the most notable additions to my toolkit being folds and traversals. Sure, the type errors were completely opaque much of the time, but I was able to piece things together with ample type annotations and time spent staring at ill-typed expressions. Before long, I had developed an intuition for lens.

After using it for a while, I retrospected on whether or not I liked it, and honestly, I still can’t decide. Some lensy expressions were straightforward to read and were a pleasant simplification, like this one:

paramSpecs ^.. folded._Required

Others were less obviously improvements, such as this beauty:

M.fromList $ paramSpecs ^.. folded._Optional.filtered (has $ _2._UsePreviousValue)

But operator soup aside, there was something deeper about lens that bothered me, and I just wasn’t sure what. I didn’t know how to articulate my vague feelings until I read a 2014 blog post entitled Lens is unidiomatic Haskell, which includes a point that I think is spot-on:

Usually, types in Haskell are rigid. This leads to a distinctive style of composing programs: look at the types and see what fits where. This is impossible with lens, which takes overloading to the level mainstream Haskell probably hasn’t seen before.
We have to learn the new language of the lens combinators and how to compose them, instead of enjoying our knowledge of how to compose Haskell functions. Formally, lens types are Haskell function types, but while with ordinary Haskell functions you immediately see from types whether they can be composed, with lens functions this is very hard in practice.
[…]
Now let me clarify that this doesn’t necessarily mean that lens is a bad library. It’s an unusual library. It’s almost a separate language, with its own idioms, embedded in Haskell.

The way lens structures its types deliberately introduces a sort of subtyping relationship—for example, all lenses are traversals and all traversals are folds, but not vice versa—and indeed, knowing this subtyping relationship is essential to working with the library and understanding how to use it. It is helpfully documented with a large diagram on the lens package overview page, and that diagram was most definitely an invaluable resource for me when I was learning how to use the library.

On the surface, this isn’t unreasonable. Subtyping is an enormously useful concept! The only reason Haskell dispenses with it entirely is because it makes type inference notoriously difficult. The subtyping relation between optics is one of the things that makes them so useful, since it allows you to easily compose a lens with a prism and get a traversal out. Unfortunately, the downside of all this is that Haskell does not truly have subtyping, so all of lens’s “types” really must be type aliases for types of roughly the same shape, namely functions. This makes type errors completely baffling, since the errors do not mention the aliases, only the fully-expanded types (which are often rather complicated, and their meaning is not especially clear without knowing how lens works under the hood).
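To see why, it helps to look at a stripped-down version of the encoding. The sketch below uses only base and simplified, type-preserving aliases; `firstL`, `eachL`, and `firstEach` are hypothetical names, and the real library's definitions are more general than this. The "subtyping" falls out of the class hierarchy: every `Applicative` is a `Functor`, so a lens can be used wherever a traversal is expected.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Simplified aliases: both are just functions of the same shape.
type Lens s a      = forall f. Functor f     => (a -> f a) -> s -> f s
type Traversal s a = forall f. Applicative f => (a -> f a) -> s -> f s

firstL :: Lens (a, c) a
firstL f (a, c) = fmap (\a' -> (a', c)) (f a)

eachL :: Traversal [a] a
eachL = traverse

-- Ordinary function composition. The result needs Applicative, so it is
-- "only" a traversal.
firstEach :: Traversal ([a], c) a
firstEach = firstL . eachL

-- Modify every target by choosing f = Identity.
over :: ((a -> Identity a) -> s -> Identity s) -> (a -> a) -> s -> s
over l g = runIdentity . l (Identity . g)

-- Collect every target by choosing f = Const [a].
toListOf :: ((a -> Const [a] a) -> s -> Const [a] s) -> s -> [a]
toListOf l = getConst . l (\a -> Const [a])
```

Since `Lens` and `Traversal` are only aliases, a mistake in `firstEach` is reported in terms of `(a -> f a) -> s -> f s`, not in terms of the names you actually wrote.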

So the above quote is correct: working with lens really is like working in a separate embedded language, but I’m usually okay with that. Embedded, domain-specific languages are good! Unfortunately, in this case, the host language is not very courteous to its guest. Haskell does not appear to be a powerful enough language for lens to be a language in its own right, so it must piggyback on top of Haskell’s error reporting mechanisms, which are insufficient for lens to be a cohesive linguistic abstraction. The experience is much like debugging code by stepping through the assembly it produces (or, perhaps more relevant in 2018, debugging a compile-to-JS language by looking at the emitted JavaScript instead of the source code), which makes for an unacceptably leaky language. We would never stand for such a thing in our general-purpose language tooling, and we should demand better even in our embedded languages.

That said, lens is just too useful to ignore. It is a hopelessly leaky abstraction, but it’s still an abstraction, and a powerful one at that. Given my selection of default extensions as evidence, I think it’s clear I have zero qualms with “advanced” Haskell; I will happily use even singletons where it makes sense. Haskell’s various language extensions are sometimes confusing in their own right, but their complexity is usually fundamental to the expressive power they bring. lens has some fundamental complexity, too, but it is mostly difficult for the wrong reasons. Still, while it is not the first library I reach for on every new Haskell project, manipulating nested data without lens is just too unpleasant after tasting the nectar, so I can’t advise against it in good faith.

Sadly, this means I’m a bit wishy-washy when it comes to using lens, but I do have at least one recommendation: if you decide to use lens, it’s better to go all-in. Don’t generate lenses for just a handful of datatypes, do it for all of them. You can definitely stick to a subset of the lens library’s features, but don’t apply it in some functions but not others. Having too many different, equally valid ways of doing things leads to confusion and inconsistency, and inconsistency minimizes code reuse and leads to duplication and spaghetti. Commit to using lens, or don’t use it at all.

Mitigating the string problem

Finally, Haskell has a problem with strings. Namely, String is a type alias for [Char], a lazy, singly linked list of characters, which is an awful representation of text. Fortunately, the answer to this problem is simple: ban String in your programs.

Use Text everywhere. I don’t really care if you pick strict Text or lazy Text, but pick one and stick to it. Don’t ever use String, and especially don’t ever, ever, ever use ByteString to represent text! There are vanishingly few legitimate cases for using ByteString in a program that is not explicitly about reading or writing raw data, and even at that level, ByteString should only be used at program boundaries. In that sense, I treat ByteString much the same way I treat IO: push it to the boundaries of your program.

One of Haskell’s core tenets is making illegal states unrepresentable. Strings are not especially useful datatypes for this, since they are sequences of arbitrary length made up of atoms that can be an enormously large number of different things. Still, string types enforce a very useful invariant, a notion of a sequence of human-readable characters. In the presence of Unicode, this is a more valuable abstraction than it might seem, and the days of treating strings as little different from sequences of bytes are over. While strings make a poor replacement for enums, they are quite effective at representing the incredible amount of text humans produce in a staggeringly large number of languages, and they are the right type for that job.

ByteString, on the other hand, is essentially never the right type for any job. If a type classifies a set of values, ByteString is no different from Any. It is the structureless type, the all-encompassing blob of bits. A ByteString could hold anything at all—some text, an image, an executable program—and the type system certainly isn’t going to help to answer that question. The only use case I can possibly imagine for passing around a ByteString in your program rather than decoding it into a more precise type is if it truly holds opaque data, e.g. some sort of token or key provided by a third party with no structure guaranteed whatsoever. Still, even this should be wrapped in a newtype so that the type system enforces this opaqueness.

Troublingly, ByteString shows up in many libraries’ APIs where it has no business being. In many cases, this seems to be things where ASCII text is expected, but this is hardly a good reason to willingly accept absolutely anything and everything! Make an ASCII type that forbids non-ASCII characters, and provide a ByteString -> Maybe ASCII function. Alternatively, think harder about the problem in question to properly support Unicode as you almost certainly ought to.
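Such a type takes only a few lines. This is a minimal sketch, not an existing library: the `ASCII` newtype and the `fromByteString` and `toString` functions are names I am assuming for the example. In a real module, the `ASCII` constructor would not be exported, so the smart constructor would be the only way in:

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical wrapper: every byte inside is guaranteed to be below 0x80.
newtype ASCII = ASCII B.ByteString
  deriving (Eq, Show)

-- The validation happens exactly once, at the boundary.
fromByteString :: B.ByteString -> Maybe ASCII
fromByteString bytes
  | B.all (< 0x80) bytes = Just (ASCII bytes)
  | otherwise            = Nothing

-- Safe because of the invariant: each byte is a valid ASCII character.
toString :: ASCII -> String
toString (ASCII bytes) = BC.unpack bytes
```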

Other places ByteString appears are similarly unfortunate. Base-64 encoding, for example, could be given the wonderfully illustrative type ByteString -> Text, or even ByteString -> ASCII! Such a type makes it immediately clear why base-64 is useful: it allows transforming arbitrary binary data into a reliable textual encoding. If we consider that ByteString is essentially Any, this function has the type Any -> ASCII, which is amazingly powerful! We can convert anything to ASCII text!

Existing libraries, however, just provide the boring, disappointingly inaccurate type ByteString -> ByteString, which is one of the most useless types there is. It is essentially Any -> Any, the meaningless function type. It conveys nothing about what it does, other than that it is pure. Giving a function this type is scarcely better than dynamic typing. Its mere existence is a failure of Haskell library design.

But wait, it gets worse! Data.Text.Encoding exports a function called decodeUtf8, which has type ByteString -> Text. What an incredible function with a captivating type! Whatever could it possibly do? Again, this function’s type is basically Any -> Text, which is remarkable in the power it gives us. Let’s try it out, shall we?

ghci> decodeUtf8 "\xc3\x28"
"*** Exception: Cannot decode byte '\x28': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream

Oh. Well, that’s a disappointment.

Haskell’s string problem goes deeper than String versus Text; it seems to have wound its way around the collective consciousness of the Haskell community and made it temporarily forget that it cares about types and totality. This isn’t that hard, I swear! I can only express complete befuddlement at how many of these APIs are just completely worthless.

Fortunately, there is a way out, and that way out is text-conversions. It is the first Haskell library I ever wrote. It provides type safe, total conversions between Text and various other types, and it is encoding aware. It provides appropriately-typed base-16 and base-64 conversion functions, and is guaranteed to never raise any exceptions. Use it, and apply the Haskell philosophy to your strings, just as you already do for everything else in your program.
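If you would rather not add a dependency, the text package itself also exports a total variant, `decodeUtf8'`, which returns an `Either` instead of throwing. The wrapper below is only a sketch of how one might use it, and `decodeText` is a name made up for the example:

```haskell
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')

-- Total decoding: invalid UTF-8 becomes Nothing rather than an exception.
decodeText :: B.ByteString -> Maybe Text
decodeText bytes = either (const Nothing) Just (decodeUtf8' bytes)
```

Applied to the two bytes from the GHCi session above, `decodeText (B.pack [0xc3, 0x28])` produces `Nothing` instead of blowing up.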

Closing thoughts

Phew.

When I started writing this blog post, it used the phrase “short overview” in the introduction. It is now over ten thousand words long. I think that’s all I have it in me to say for now.

Haskell is a wonderful language built by a remarkable group of people. Its community is often fraught with needlessly inflammatory debates about things like the value of codes of conduct, the evils of Hackage revisions, and precisely how much or how little people ought to care about the monad laws. These flame wars frustrate me to no end, and they sometimes go so far as to make me ashamed to call myself a part of the Haskell community. Many on the “outside” seem to view Haskellers as an elitist, mean-spirited cult, more interested in creating problems for itself than solving them.

That perception is categorically wrong.

I have never been in a community of programmers so dedicated and passionate about applying thought and rigor to building software, then going out and actually doing it. I don’t know anywhere else where a cutting-edge paper on effect systems is discussed by the very same people who are figuring out how to reliably deploy distributed services to AWS. Some people view the Haskell community as masturbatory, and to some extent, they are probably right. One of my primary motivators for writing Haskell is that it is fun and it challenges me intellectually in ways that other languages don’t. But that challenge is not a sign of uselessness, it is a sign that Haskell is so close to letting me do the right thing, to solving the problem the right way, to letting me work without compromises. When I write in most programming languages, I must constantly accept that my program will never be robust in all the ways I want it to be, and I might as well give up before I even start. Haskell’s greatest weakness is that it tempts me to try.

Haskell is imperfect, as it will always be. I doubt I will ever be satisfied by any language or any ecosystem. There will always be more to learn, more to discover, better tools and abstractions to develop. Many of them will not look anything like Haskell; they may not involve formal verification or static types or effect systems at all. Perhaps live programming, structural editors, and runtime hotswapping will finally take over the world, and we will find that the problems we thought we were solving were irrelevant to begin with. I can’t predict the future, and while I’ve found great value in the Haskell school of program construction, I dearly hope that we do not develop such tunnel vision that we cannot see that there may be other ways to solve these problems. Many of the solutions are things we likely have not even begun to think about. Still, whether that happens or not, it is clear to me that Haskell is a point in the design space unlike any other, and we learn almost as much from the things it gets wrong as we do from the things it gets right.

It’s been a wonderful two years, Haskell. I won’t be a stranger.

A space of their own: adding a type namespace to Hackett

2017-10-27T00:00:00Z

As previously discussed on this blog, my programming language, Hackett, is a fusion of two languages, Haskell and Racket. What happens when two distinctly different programming languages collide? Hackett recently faced that very problem when it came to the question of namespacing: Haskell has two namespaces, one for values and another for types, but Racket is a staunch Lisp-1 with a single namespace for all bindings. Which convention should Hackett adopt?

For now, at least, the answer is that Hackett will emulate Haskell: Hackett now has two namespaces. Of course, Hackett is embedded in Racket, so what did it take to add an entirely new namespace to a language that possesses only one? The answer was a little more than I had hoped, but it was still remarkably simple given the problem: after two weeks of hacking, I’ve managed to get something working.

Why two namespaces?

Before delving into the mechanics of how multi-namespace Hackett is implemented, it’s important to understand what Hackett’s namespaces actually are and why they exist in the first place. Its host language, Racket, is a descendant of Scheme, a Lisp derivative that famously chose to only use a single namespace. This means everything—from values to functions to classes—lives in a single namespace in Racket.

This is in stark contrast to Common Lisp, which opts to divide bindings into many namespaces, most notably pushing functions into a separate namespace from other variables. You can see this difference most strikingly when applying higher-order functions. In Racket, Clojure, and Scheme, functions can be passed freely as values:

> (map first '((1 a) (2 b) (3 c)))
'(1 2 3)

In Common Lisp and other languages with two namespaces, functions may still be passed as values, but the programmer must explicitly annotate when they wish to use a value from a different namespace:

> (mapcar #'car '((1 a) (2 b) (3 c)))
(1 2 3)

The Common Lisp #'x reader abbreviation is equivalent to (function x), and function is a special form that references a value in the function namespace.

While this distinction is somewhat arbitrary, it is generally my belief that the Scheme approach was, indeed, the right one. Runtime values are values, whether they are numbers, strings, or functions, and they ought to all be treated as equal citizens. After all, if a programmer wishes to define their own function-like thing, they should not be forced to make their abstraction a second-class citizen merely because it is slightly different from the built-in notion of a function. Higher-order functional programming encourages treating functions as ordinary values, and an arbitrary stratification of the namespace is antithetical to that mental model.

However, Hackett is a little different from all of the aforementioned languages because Hackett has types. Types are rather different from runtime values because they do not exist at all at runtime. One cannot use a type where a value is expected, nor can one use a value where a type is expected, so this distinction is always syntactically unambiguous.¹ Even if types and values live in separate namespaces, there is no need for a type form a la CL’s function because it can always be determined implicitly.

For this reason, it makes a great deal of sense for Hackett to have separate type and value namespaces, permitting declarations such as the following:

(data (Tuple a b) (Tuple a b))

This defines a binding named Tuple at the type level, which is a type constructor of two arguments that produces a type of kind *,² and another binding named Tuple at the value level, which is a value constructor of two arguments that produces a value of type (Tuple a b).

But why do we want to overload names in this way, anyway? How hard would it really be to just name the value constructor tuple instead of Tuple? Well, it wouldn’t be hard at all, if it weren’t for the unpleasant ambiguity such a naming convention introduces when pattern-matching. Consider the following code snippet:

(data Foo bar (baz Integer))

(defn foo->integer : {Foo -> Integer}
  [[bar    ] 0]
  [[(baz y)] y])

This works fine. But what happens if the programmer decides to change the name of the bar value?

(data Foo qux (baz Integer))

(defn foo->integer : {Foo -> Integer}
  [[bar    ] 0]
  [[(baz y)] y])

Can you spot the bug? Disturbingly, this code still compiles! Even though bar is not a member of Foo anymore, it’s still a valid pattern, since names used as patterns match anything, just as the y pattern matches against any integer inside the baz constructor. If Hackett had a pattern redundancy checker, it could at least hopefully catch this mistake, but as things are, this code would silently compile and do the wrong thing: (foo->integer (baz 42)) will still produce 0, not 42, since the first case always matches.

Haskell escapes this flaw by syntactically distinguishing between patterns and ordinary bindings by requiring all constructors start with an uppercase letter. This means that programmers often want to define data constructors and type constructors with the same name, such as the Tuple example above, which is illegal if a programming language only supports a single namespace.
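For comparison, here is roughly how the same definitions look in Haskell itself. This is an illustrative translation of the Hackett snippets above, with constructor names capitalized as Haskell requires; `swapTuple` and `alwaysZero` are extra names added for the example:

```haskell
-- A type constructor and a data constructor may share a name, because
-- they live in separate namespaces.
data Tuple a b = Tuple a b

-- `Tuple` on the left of `::` is the type; in the equation it is the value.
swapTuple :: Tuple a b -> Tuple b a
swapTuple (Tuple a b) = Tuple b a

data Foo = Qux | Baz Integer

fooToInteger :: Foo -> Integer
fooToInteger Qux     = 0  -- uppercase: must name a constructor of Foo
fooToInteger (Baz y) = y  -- lowercase y: a variable that binds the payload

-- A lowercase pattern is always a variable, so this matches every Foo.
alwaysZero :: Foo -> Integer
alwaysZero bar = 0
```

If `Qux` is renamed in the data declaration but not in `fooToInteger`, GHC reports an out-of-scope data constructor instead of silently treating the old name as a catch-all variable.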

Although Hackett now supports two namespaces, it does not currently enforce this naming convention, but it seems like an increasingly good idea. Separating the namespaces is the biggest hurdle needed to implement such a feature, and happily, it is now complete. The Tuple example from above is perfectly legal Hackett.

Adding namespaces to a language

Hopefully, we now agree that it would be nice if Hackett had two namespaces, but that doesn’t really get us any closer to being able to implement such a feature. At its core, Hackett is still a Racket language, and Racket’s binding structure has no notion of namespaces. How can it possibly support a language with more than one namespace?

Fortunately, Racket is no ordinary language—it is a language with a highly formalized notion of lexical scope, and many of its low-level scope control features are accessible to ordinary programmers. Before we get into the details, however, a forewarning: the remainder of this blog post is highly technical, and some of it involves some of the more esoteric corners of Racket’s macro system. This blog post is not representative of most macros written in Racket, nor is it at all necessary to understand these things to be a working Racket or Hackett macrologist. It is certainly not a tutorial on any of these concepts, so if you find it intimidating, there is no shame in skipping the rest of this post! If, however, you think you can handle it, or if you simply want to stare into the sun, by all means, read on.

Namespaces as scopes

With that disclaimer out of the way, let’s begin. As of this writing, the current Racket macroexpander uses a scoping model known as sets of scopes, which characterizes the binding structure of a program by annotating identifiers with sets of opaque markers known as “scopes”. The details of Racket’s macro system are well outside the scope of this blog post, but essentially, two identifiers with the same name can be made to refer to different bindings by adding a unique scope to each identifier.

Using this system of scopes, it is surprisingly simple to create a system of two namespaces: we only need to arrange for all identifiers in a value position to have a particular scope, which we will call the value scope, and all identifiers in type position must have a different scope, which we will call the type scope. How do we create these scopes and apply them to identifiers? In Racket, we use a function called make-syntax-introducer, which produces a function that encapsulates a fresh scope. This function can be applied to any syntax object (Racket’s structured representation of code that includes lexical binding information) to do one of three things: it can add the scope to all pieces of the syntax object, remove the scope, or flip the scope (that is, add it to pieces of the syntax object that do not have it and remove it from pieces that do have it). In practice, this means we need to call make-syntax-introducer once for each namespace:

(begin-for-syntax
  (define value-introducer (make-syntax-introducer))
  (define type-introducer (make-syntax-introducer)))

We define these in a begin-for-syntax block because these definitions will be used in our compile-time macros (aka “phase 1”), not in runtime code (aka “phase 0”). Now, we can write some macros that use these introducer functions to apply the proper scopes to their contents:

(require syntax/parse/define)

(define-simple-macro (begin/value form ...)
  #:with [form* ...] (map (λ (stx) (value-introducer stx 'add))
                          (attribute form))
  (begin form* ...))

(define-simple-macro (begin/type form ...)
  #:with [form* ...] (map (λ (stx) (type-introducer stx 'add))
                          (attribute form))
  (begin form* ...))

Each of these two forms is like begin, which is a Racket form that is, for our purposes, essentially a no-op, but it applies value-introducer or type-introducer to add the appropriate scope. We can test that this works by writing a program that uses the two namespaces:

(begin/value
  (define x 'value-x))

(begin/type
  (define x 'type-x))

(begin/value
  (println x))

(begin/type
  (println x))

This program produces the following output:

'value-x
'type-x

It works! Normally, if you try to define two bindings with the same name in Racket, it will produce a compile-time error, but by assigning them different scopes, we have essentially managed to create two separate namespaces.

However, although this is close, it isn’t quite right. What happens if we nest the two inside each other?

(begin/value
  (begin/type
    (println x)))

x: identifier's binding is ambiguous
  context...:
   #(189267 module) #(189268 module anonymous-module 0) #(189464 use-site)
   #(189465 use-site) #(190351 use-site) #(190354 use-site) #(190358 local)
   #(190359 intdef)
  matching binding...:
   #<module-path-index:()>
   #(189267 module) #(189268 module anonymous-module 0) #(189464 use-site)
  matching binding...:
   #<module-path-index:()>
   #(189267 module) #(189268 module anonymous-module 0) #(189465 use-site)

Oh no! That didn’t work at all. The error is a bit of a scary one, but the top of the error message is essentially accurate: the use of x is ambiguous because it has both scopes on it, so it could refer to either binding. What we really want is for nested uses of begin/value or begin/type to override outer ones, ensuring that a use can only be in a single namespace at a time.

To do this, we simply need to adjust begin/value and begin/type to remove the other scope in addition to adding the appropriate one:

(define-simple-macro (begin/value form ...)
  #:with [form* ...] (map (λ (stx)
                            (type-introducer (value-introducer stx 'add) 'remove))
                          (attribute form))
  (begin form* ...))

(define-simple-macro (begin/type form ...)
  #:with [form* ...] (map (λ (stx)
                            (value-introducer (type-introducer stx 'add) 'remove))
                          (attribute form))
  (begin form* ...))

Now our nested program runs, and it produces 'type-x, which is exactly what we want—the “nearest” scope wins.

With just a few lines of code, we’ve managed to implement the two-namespace system Hackett needs: we simply maintain two scopes, one for each namespace, and arrange for all the types to have the type scope applied and everything else to have the value scope applied. Easy, right? Well, not quite. Things start to get a lot more complicated once our programs span more than a single module.

Namespaces that cross module boundaries

The system of using two syntax introducers to manage scopes is wonderfully simple as long as all of our programs are contained within a single module, but obviously, that is never true in practice. It is critical that users are able to export both values and types from one module and import them into another, as that is a pretty fundamental feature of any language. This is, unfortunately, where we start to run into problems.

Racket’s notion of hygiene is pervasive, but it is still essentially scoped to a single module. This makes sense, since each module conceptually has its own “module scope”, and it wouldn’t be very helpful to inject a binding from a different module with the other module’s scope—it would be impossible to reference the binding in the importing module. Instead, Racket’s modules essentially export symbols, not identifiers (which, in Racket terminology, are symbols packaged together with their lexical scope). When a Racket module provides a binding named foo, there is no other information attached to that binding. It does not have any scopes attached to it, since it is the require form’s job to attach the correct scopes to imported identifiers.

This completely makes sense for all normal uses of the Racket binding system, but it has unfortunate implications for our namespace system: Racket modules cannot export more than one binding with a given symbolic name!³ This won’t work at all, since a Hackett programmer might very well want to export a type and value with the same name from a single module. Indeed, this capability is one of the primary points of having multiple namespaces.

What to do? Sadly, Racket does not have nearly as elegant a solution for this problem, at least not at the time of this writing. Fortunately, hope is not lost. While far from perfect, we can get away with a relatively simple name-mangling scheme to prefix types upon export and unprefix them upon import. Since Racket’s require and provide forms are extensible, it’s even possible to implement this mangling in a completely invisible way.

Currently, the scheme that Hackett uses is to prefix #%hackett-type: onto the beginning of any type exports. This can be defined in terms of a provide pre-transformer, which is essentially a macro that cooperates with Racket’s provide form to control the export process. In this case, we can define our type-out provide pre-transformer in terms of prefix-out, a form built into Racket that allows prefixing the names of exports:

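;; Compile-time dependencies assumed by this snippet.
(require (for-syntax racket/base racket/provide-transform syntax/parse))

(define-syntax type-out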
  (make-provide-pre-transformer
   (λ (stx modes)
     (syntax-parse stx
       [(_ provide-spec ...)
        (pre-expand-export
         #`(prefix-out #%hackett-type:
                       #,(type-introducer
                          #'(combine-out provide-spec ...)))
         modes)]))))

Note that we call type-introducer in this macro! That’s because we want to ensure that, when a user writes (provide (type-out Foo)), we look for Foo in the module’s type namespace. Of course, once it is provided, all that scoping information is thrown away, but we still need it around so that provide knows which Foo is being provided.

Once we have referenced the correct binding, the use of prefix-out will appropriately add the #%hackett-type: prefix, so the exporting side is already done. Users do need to explicitly write (type-out ....) if they are exporting a particular type-level binding, but this is rarely necessary, since most users use data or class to export datatypes or typeclasses respectively, which can be modified to use type-out internally. Very little user code actually needs to change to support this adjustment.
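To see the effect of prefix-out in isolation, here is a small sketch in plain Racket with no scope manipulation involved. The module name shapes, the ty: prefix, and the Tuple-type binding are made up for illustration; Hackett itself uses the #%hackett-type: prefix described above:

```racket
#lang racket/base

(module shapes racket/base
  ;; Export two different bindings whose external names differ only by a prefix.
  (provide Tuple
           (prefix-out ty: (rename-out [Tuple-type Tuple])))
  (define Tuple 'value-level-Tuple)
  (define Tuple-type 'type-level-Tuple))

(require 'shapes)

;; Both exports are visible in the importing module under distinct symbolic names.
(define both (list Tuple ty:Tuple))
```

Because the two exports have different symbolic names (Tuple and ty:Tuple), the module is allowed to provide both, which is exactly the property the mangling scheme relies on.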

Handling imports is, comparatively, tricky. When exporting, we can just force the user to annotate which exports are types, but we don’t have that luxury when importing, since it is merely whether or not a binding has the #%hackett-type: prefix that indicates which namespace it should be imported into. This means we’ll need to explicitly iterate through every imported binding and check if it has the prefix or not. If it does, we need to strip it off and add the type namespace; otherwise, we just pass it through unchanged.

Just as we extended provide with a provide pre-transformer, we can extend require using a require transformer. In code, this entire process looks like this:

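;; Compile-time dependencies assumed by this snippet; and~> is taken to come
;; from the threading package.
(require (for-syntax racket/base racket/list racket/match
                     racket/require-transform syntax/parse threading))

(begin-for-syntax
  (define (unmangle-type-name name)
    (and~> (regexp-match #rx"^#%hackett-type:(.+)$" name) second)))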

(define-syntax unmangle-types-in
  (make-require-transformer
   (syntax-parser
     [(_ require-spec ...)
      #:do [(define-values [imports sources]
              (expand-import #'(combine-in require-spec ...)))]
      (values
       (map (match-lambda
              [(and i (import local-id src-sym src-mod-path mode req-mode orig-mode orig-stx))
               (let* ([local-name (symbol->string (syntax-e local-id))]
                      [unmangled-type-name (unmangle-type-name local-name)])
                 (if unmangled-type-name
                     (let* ([unmangled-id
                             (datum->syntax local-id
                                            (string->symbol unmangled-type-name)
                                            local-id
                                            local-id)])
                       (import (type-introducer unmangled-id)
                               src-sym src-mod-path mode req-mode orig-mode orig-stx))
                     i))])
            imports)
       sources)])))

This is a little intimidating if you are not familiar with the intricacies of Racket’s low-level macro system, but the bulk of the code isn’t as scary as it may seem. It essentially does three things:

It iterates over each import and calls unmangle-type-name on the imported symbol. If the result is #f, that means the import does not have the #%hackett-type: prefix, and it can be safely passed through unchanged.
If unmangle-type-name does not return #f, then it returns the unprefixed name, which is then provided to datum->syntax, which allows users to forge new identifiers in an unhygienic (or “hygiene-bending”) way. In this case, we want to forge a new identifier with the name we get back from unmangle-type-name, but with the lexical context of the original identifier.
Finally, we pass the new identifier to type-introducer to properly add the type scope, injecting the fresh binding into the type namespace.
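The name-based check in the first step can be tried on its own. This sketch rewrites unmangle-type-name using only racket/base (the version above uses and~> and second, which need additional libraries); it is an illustration of the check, not Hackett’s actual code:

```racket
#lang racket/base

;; Returns the unprefixed name as a string, or #f when the prefix is absent.
(define (unmangle-type-name name)
  (define m (regexp-match #rx"^#%hackett-type:(.+)$" name))
  (and m (cadr m)))

(unmangle-type-name "#%hackett-type:Tuple") ; a mangled type export
(unmangle-type-name "Tuple")                ; an ordinary value export
```

The first call produces "Tuple" and the second produces #f, which is what lets the require transformer decide whether to rename an import and add the type scope or pass it through untouched.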

With this in place, we now have a way for Hackett users to import and export type bindings, but while it is not much of a burden to write type-out when exporting types, it is unlikely that users will want to write unmangle-types-in around each and every import in their program. For that reason, we can define a slightly modified version of require that implicitly wraps all of its subforms with unmangle-types-in:

(provide (rename-out [require/unmangle require]))

(define-simple-macro (require/unmangle require-spec ...)
  (require (unmangle-types-in require-spec) ...))

…and we’re done. Now, Hackett modules can properly import and export type-level bindings.

Namespaces plus submodules: the devil’s in the details

Up until this point, adding namespaces has required some understanding of the nuances of Racket’s macro system, but it hasn’t been particularly difficult to implement. However, getting namespaces right is a bit trickier than it appears. One area where namespaces are less than straightforward is Racket’s system of submodules.

Submodules are a Racket feature that allows the programmer to arbitrarily nest modules. Each file always corresponds to a single outer module, but that module can contain an arbitrary number of submodules. Each submodule can have its own “module language”, which even allows different languages to be mixed within a single file.

Submodules in Racket come in two flavors: module and module*. The difference is what order, semantically, they are defined in. Submodules defined with module are essentially defined before their enclosing module, so they cannot import their enclosing module, but their enclosing module can import them. Modules defined with module* are the logical dual to this: they are defined after their enclosing module, so they can import their enclosing module, but the enclosing module cannot import them.
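A minimal sketch of the two flavors side by side may help. The submodule names helper and client and the bindings base, answer, and doubled are hypothetical:

```racket
#lang racket/base

;; Declared with module: conceptually defined before the enclosing module,
;; so the enclosing module can require it.
(module helper racket/base
  (provide base)
  (define base 41))

(require 'helper)
(provide answer)
(define answer (+ base 1))

;; Declared with module*: conceptually defined after the enclosing module,
;; so it can require the enclosing module through (submod "..").
(module* client racket/base
  (require (submod ".."))
  (define doubled (* 2 answer)))
```

Note that client only sees answer because the enclosing module explicitly provides it; the #f form of module* discussed next lifts that restriction.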

How do submodules interact with namespaces? Well, for the most part, they work totally fine. This is because submodules are really, for the most part, treated like any other module, so the same machinery that works for ordinary Racket modules works fine with submodules.

However, there is a special sort of module* submodule that uses #f in place of a module language, which gives a module access to all of its enclosing module’s bindings, even ones that aren’t exported! This is commonly used to create a test submodule that contains unit tests, and functions can be tested in such a submodule even if they are not part of the enclosing module’s public API:

#lang racket

; not provided
(define (private-add1 x)
  (+ x 1))

(module* test #f
  (require rackunit)
  (check-equal? (private-add1 41) 42))

It would be nice to be able to use these sorts of submodules in Hackett, too, but if we try, we’ll find that types from the enclosing module mysteriously can’t be referenced by the submodule. Why? Well, the issue is in how we naïvely create our type and value introducers:

(begin-for-syntax
  (define value-introducer (make-syntax-introducer))
  (define type-introducer (make-syntax-introducer)))

Remember that make-syntax-introducer is generative: each time it is called, it produces a function that operates on a fresh scope. This is a problem, since those definitions will be re-evaluated on every module instantiation, as ensured by Racket’s separate compilation guarantee. As a result, each module gets its own pair of scopes, so the body of a module* submodule will have different scopes from its enclosing module, and the enclosing module’s bindings will not be accessible.
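A small experiment illustrates this generativity. It is a sketch that calls the introducers directly at phase 0 rather than inside a macro, and the names intro-a, intro-b, and id are purely illustrative:

```racket
#lang racket/base

(define intro-a (make-syntax-introducer))
(define intro-b (make-syntax-introducer))
(define id #'x)

;; The same introducer always applies the same scope...
(define same-scope?
  (bound-identifier=? (intro-a id 'add) (intro-a id 'add)))

;; ...but two separately created introducers apply different scopes.
(define different-scope?
  (not (bound-identifier=? (intro-a id 'add) (intro-b id 'add))))

;; Adding and then removing a scope restores the original scope set.
(define round-trip?
  (bound-identifier=? id (intro-a (intro-a id 'add) 'remove)))
```

All three definitions evaluate to #t. The second one is the crux of the problem: an identifier scoped by one call to make-syntax-introducer can never be bound-identifier=? to one scoped by another call, even if both calls came from the same line of source code.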

Fortunately, there is a way to circumvent this. While we cannot directly preserve syntax introducers across module instantiations, we can preserve syntax objects by embedding them in the expanded program, and we can attach scopes to syntax objects. Using make-syntax-delta-introducer, we can create a syntax introducer that adds or removes the difference between scopes on two syntax objects. Pairing this with a little bit of clever indirection, we can arrange for value-introducer and type-introducer to always operate on the same scopes on each module instantiation:

(define-simple-macro (define-value/type-introducers
                       value-introducer:id type-introducer:id)
  #:with scopeless-id (datum->syntax #f 'introducer-id)
  #:with value-id ((make-syntax-introducer) #'scopeless-id)
  #:with type-id ((make-syntax-introducer) #'scopeless-id)
  (begin-for-syntax
    (define value-introducer
      (make-syntax-delta-introducer #'value-id #'scopeless-id))
    (define type-introducer
      (make-syntax-delta-introducer #'type-id #'scopeless-id))))

(define-value/type-introducers value-introducer type-introducer)

The way this trick works is subtle, but to understand it, it’s important to understand that when a module is compiled, its macro uses are only evaluated once. Subsequent imports of the same module will not re-expand the module. However, code inside begin-for-syntax blocks is still re-evaluated every time the module is instantiated! This means we are not circumventing that re-evaluation directly, we are merely arranging for each re-evaluation to always produce the same result.

We still use make-syntax-introducer to create our two scopes, but critically, we only call make-syntax-introducer inside the define-value/type-introducers macro, which is, again, only run once (when the module is expanded). The resulting compiled module embeds value-id and type-id as syntax objects in the fully-expanded program, so they never change on each module instantiation, and they already contain the appropriate scopes. We can use make-syntax-delta-introducer to convert the “inert” scopes into introducer functions that we can use to apply the scopes to other syntax objects as we see fit.

By guaranteeing each namespace’s scope is always the same, even for different modules, module* submodules now work properly, and they are able to refer to bindings inherited from their enclosing module as desired.

The final stretch: making Scribble documentation namespace-aware

As discussed in my previous blog post, Hackett has comprehensive documentation powered by Racket’s excellent documentation tool, Scribble. Fortunately for Hackett, Scribble is incredibly flexible, and it can absolutely cope with a language with multiple namespaces. Less fortunately, it is clear that Scribble’s built-in documentation forms were not at all designed with multiple namespaces in mind.

In general, documenting such a language is tricky, assuming one wishes all identifiers to be properly hyperlinked to their appropriate definition (which, of course, I do). However, documentation is far more ambiguous than code when attempting to determine which identifiers belong in which namespace. When actually writing Hackett code, forms can always syntactically deduce the appropriate namespace for their subforms and annotate them accordingly, but this is not true in documentation. Indeed, it’s entirely possible that a piece of documentation might include intentionally incorrect code, which cannot be expanded at all!

Haskell’s documentation tool, Haddock, does not appear to attempt to tackle this problem at all—when given an identifier that exists in both namespaces, it will generate a hyperlink to the type, not the value. I do not know if there is a way around this, but if there is, it isn’t documented. This works alright for Haddock because Haskell’s documentation generally contains fewer examples, and Haskell programmers do not expect all examples to be appropriately hyperlinked, so a best-effort approach is accepted. Racket programmers, however, are used to a very high standard of documentation, and incorrectly hyperlinked docs are unacceptable.

To work around this problem, Hackett’s documentation requires that users explicitly annotate which identifiers belong to the type namespace. Identifiers in the type namespace are prefixed with t: upon import, and they are bound to Scribble element transformers that indicate they should be typeset without the t: prefix. Fortunately, Scribble’s documentation forms do understand Racket’s model of lexical scope (mostly), so they can properly distinguish between two identifiers with the same name but different lexical context.

In practice, this means Hackett documentation must now include a proliferation of t: prefixes. For example, here is the code for a typeset REPL interaction:

@(hackett-examples
  (defn square : (t:-> t:Integer t:Integer)
    [[x] {x * x}])
  (square 5))

Note the use of t:-> and t:Integer instead of -> and Integer. When the documentation is rendered and the example is evaluated, the prefixes are stripped, resulting in properly-typeset Hackett code.

This also means Hackett’s documentation forms have been updated to understand multiple namespaces. Hackett now provides deftype and deftycon forms for documenting types and type constructors, respectively, which will use the additional lexical information attached to t:-prefixed identifiers to properly index documented forms. Similarly, defdata and defclass have been updated with an understanding of types.

The implementation details of these changes are less interesting than the ones made to the code itself, since they mostly just involved tweaking Racket’s implementation of defform slightly to cooperate with the prefixed identifiers. To summarize, Hackett defines a notion of “type binding transformers” that include information about both prefixed and unprefixed versions of types, and Hackett provides documentation forms that consume that information when typesetting. A require transformer converts imported bindings into t:-prefixed ones and attaches the necessary compile-time information to them. It isn’t especially elegant, but it works.

Analysis and unsolved problems

When laid out from top to bottom in this blog post, the amount of code it takes to actually implement multiple namespaces in Racket is surprisingly small. In hindsight, it does not feel like two weeks’ worth of effort, but it would be disingenuous to suggest that any of this was obvious. I tried a variety of different implementation strategies and spent a great deal of time staring at opaque error messages and begging Matthew Flatt for help before I got things working properly. Fortunately, with everything in place, the implementation seems reliable, predictable, and useful for Hackett’s users (or, as the case may be, users-to-be).

For the most part, all the machinery behind multiple namespaces is invisible to the average Hackett programmer, and it seems to “just work”. For completeness, however, I must mention one unfortunate exception: remember the work needed to unmangle type names? While it’s true that all imports into Hackett modules are automatically unmangled by the custom require form, types provided by a module’s language are not automatically unmangled. This is because Racket does not currently provide a hook to customize how bindings from a module language are introduced, unlike require’s require transformers.

To circumvent this restriction, #lang hackett’s reader includes a somewhat ad-hoc solution that actually inserts a require into users’ programs that unmangles and imports all the types provided by the module. This mostly works, but due to the way Racket’s imports work, it isn’t possible for Racket programmers to import different types with the same names as Hackett core types; the two bindings will conflict, and there is no way for users to hide these implicitly imported bindings. Whether or not this is actually a common problem remains to be seen. If it is rare, it might be sufficient to introduce an ad-hoc mechanism to hide certain type imports, but it might be better to extend Racket in some way to better support this use-case.

That issue aside, multi-namespace Hackett is now working smoothly. It’s worth noting that I did not have to do any special work to help Racket’s tooling, such as DrRacket’s Check Syntax tool, understand the binding structure of Hackett programs. Since other tools, such as racket-mode for Emacs, use the same mechanisms under the hood, Racket programmers’ existing tools will be able to properly locate the distinct definition sites for types and values with the same name, another example of how Racket successfully internalizes extra-linguistic mechanisms.

As closing notes, even if the majority of this blog post was gibberish to you, do note that Hackett has come quite a long way in just the past two months, adding much more than just a separate type namespace. I might try and give a more comprehensive update at a later date, but here’s a quick summary of the meaningful changes for those interested:

Multi-parameter typeclasses are implemented, along with default typeclass method implementations.
Pattern-matching performs basic exhaustiveness checking, so unmatched cases are a compile-time error.
Hackett ships with a larger standard library, including an Either type and appropriate functions, an Identity type, a MonadTrans typeclass, and the ReaderT and ErrorT monad transformers.
More things are documented, and parts of the documentation are slightly improved. Additionally, Hackett’s internals are much more heavily commented, hopefully making the project more accessible to new contributors.
Parts of the typechecker are dramatically simplified, improving the mechanisms behind dictionary elaboration and clearing the way for a variety of additional long-term improvements, including multiple compilation targets and a type-aware optimizer.
As always, various bug fixes.

Finally, special mention to two new contributors to Hackett, Milo Turner and Brendan Murphy. Also special thanks to Matthew Flatt and Michael Ballantyne for helping me overcome two of the trickiest macro-related problems I’ve encountered in Hackett to date. It has now been just over a year since Hackett’s original conception and roughly six months since the first commit of its current implementation, and the speed at which I’ve been able to work would not have been possible without the valuable help of the wonderful Racket community. Here’s hoping this is only the beginning.

“But what about dependent types?” you may ask. Put simply, Hackett is not dependently typed, and it is not going to be dependently typed. Dependent types are currently being bolted onto Haskell, but Haskell does not have #lang. Racket does. It seems likely that a dependently-typed language would be much more useful as a separate #lang, not a modified version of Hackett, so Hackett can optimize its user experience for what it is, not what it might be someday. ↩
Hackett does not actually have a real kind system yet, but pleasantly, this same change will allow * to be used to mean “type” at the kind level and “multiply” at the value level. ↩
This isn’t strictly true, as readers familiar with Racket’s macro system may be aware: Racket modules export bindings at different “phase levels”, where phase levels above 0 correspond to compile-time macroexpansion phases. Racket modules are allowed to export a single binding per name, per phase, so the same symbolic name can be bound to different things at different phases. This isn’t meaningfully relevant for Hackett, however: types and values are both exported at phase 0, and there are reasons that must be the case, so this phase separation does not make the problem any simpler. ↩

Hackett progress report: documentation, quality of life, and snake

2017-08-28T00:00:00Z

Three months ago, I wrote a blog post describing my new, prototype implementation of my programming language, Hackett. At the time, some things looked promising—the language already included algebraic datatypes, typeclasses, laziness, and even a mini, proof of concept web server. It was, however, clearly still rather rough around the edges—error messages were poor, features were sometimes brittle, the REPL experience was less than ideal, and there was no documentation to speak of. In the time since, while the language is still experimental, I have tackled a handful of those issues, and I am excited to announce the first (albeit quite incomplete) approach to Hackett’s documentation.

I’d recommend clicking that link above and at least skimming around before reading the rest of this blog post, as its remainder will describe some of the pieces that didn’t end up in the documentation: the development process, the project’s status, a small demo, and some other details from behind the scenes.

A philosophy of documentation

Racket, as a project, has always had wonderful documentation. There are many reasons for this (Racket’s educational origins almost certainly play a part, and it helps that the core packages set the bar high), but one of the biggest reasons is undoubtedly Scribble, the Racket documentation tool. Scribble is, in many ways, the embodiment of the Racket philosophy: it is a user-extensible, fully-featured, domain-specific programming language designed for typesetting, with a powerful library for documenting Racket code. Like the Racket language itself, Scribble comes with a hygienic macro system, and in fact, all Racket libraries are trivially usable from within Scribble documents, if desired. The macro system is used to great effect to provide typesetting forms tailored to the various sorts of things a Racket programmer might wish to document, such as procedures, structures, and macros.

Scribble documents are decoupled from a rendering backend, so a single Scribble document can be rendered to plain text, a PDF, or HTML, but the HTML backend is the most useful for writing docs. Scribble documents themselves use a syntax inspired by (La)TeX’s syntax, but Scribble uses an @ character instead of \. It also generalizes and regularizes TeX in many ways, creating a much more uniform language without nearly so much magic or complexity. Since Scribble’s “at-expressions” are merely an alternate syntax for Racket’s more traditional s-expressions, Scribble documents can be built out of ordinary Racket macros. For example, to document a procedure in Racket, one would use the provided defproc form:

@defproc[(add1 [z number?]) number?]{
Returns @racket[(+ z 1)].}

This syntax may look alien to someone more familiar with traditional, Javadoc-style documentation comments, but the results are quite impressive. The above snippet renders into something like this:

The fact that Scribble documents are fully-fledged programs equips the programmer with a lot of power. One of the most remarkable tools Scribble provides is the scribble/example module, a library that performs sandboxed evaluation as part of the rendering process. This allows Scribble documents to include REPL-style examples inline, automatically generated as part of typesetting, always kept up to date from a single source of truth: the implementation. It even provides a special eval:check form that enables doctest-like checking, which allows documentation to serve double duty as a test suite.

Of course, Hackett is not Racket, though it shares many similarities. Fortunately, all of Racket is designed with the goal of supporting many different programming languages, and Scribble is no exception. Things like scribble/example essentially work out of the box with Hackett, and most of scribble/manual can be reused. However, what about documenting algebraic datatypes? What about documenting typeclasses? Well, remember: Scribble is extensible. The defproc and defstruct forms are hardly builtins; they are defined as part of the scribble/manual library in terms of Scribble primitives, and we can do the same.

Hackett’s documentation already defines three new forms, defdata, defclass, and defmethod, for documenting algebraic datatypes, typeclasses, and typeclass methods, respectively. They typeset documentation custom-tailored to Hackett’s needs, so Hackett’s documentation need not be constrained by Racket’s design decisions. For example, one could document the Functor typeclass using defclass like this:

@defclass[(Functor f)
          [map : (forall [a b] {(a -> b) -> (f a) -> (f b)})]]{

A class of types that are @deftech{functors}, essentially types that provide a
mapping or “piercing” operation. The @racket[map] function can be viewed in
different ways:

...}

With only a little more than the above code, Hackett’s documentation includes a beautifully-typeset definition of the Functor typeclass, including examples and rich prose:

Scribble makes Hackett’s documentation shine.

A tale of two users

For a programming language, documentation is critical. Once we have grown comfortable with a language, it’s easy to take for granted our ability to work within it, but there is always a learning period, no matter how simple or familiar the language may be. When learning a new language, we often relate the language’s concepts and features to those which we already know, which is why having a broad vocabulary of languages makes picking up new ones so much easier.

A new user of a language needs a gentle introduction to its features, structured in a logical way, encouraging this period of discovery and internalization. Such an introduction should come equipped with plenty of examples, and it shouldn’t worry itself with being an authoritative reference. Some innocent simplifications are often conducive to learning, and it is unlikely to be helpful to force the full power of a language onto a user all at once.

However, for experienced users, an authoritative reference is exactly what they need. While learners want tutorial-style documentation that encourages experimentation and exploration, working users of a language need something closer to a dictionary or encyclopedia: a way to look up forms and functions by name and find precise definitions, complete explanations, and hopefully a couple of examples. Such a user does not want information to be scattered across multiple chapters of explanatory text; they simply need a focused, targeted, one-stop shop for the information they’re looking for.

This dichotomy is rarely well-served by existing programming language documentation. Most programming languages suffer from either failing entirely to serve both types of users, or doing so in a way that enforces too strong a separation between the styles of documentation. For example:

Java ships with a quintessential example of a documentation generator: Javadoc. Java is a good case study because, although its documentation is not particularly good, it still manages to be considerably better than most languages’ docs.
Java’s API documentation documents its standard library, but it doesn’t document the language. Reference-style language documentation is largely relegated to the Java Language Specification, which is highly technical and rather low-level. It is more readable than the standards for some other languages, but it’s still mostly only useful to language lawyers. For Java, this ends up being mostly okay, largely because Java is a fairly small language that does not often change.
On the other hand, Java’s reference documentation is inconsistent, rarely provides any examples, and certainly does not do a good job of serving new users. Java does provide guide-style documentation in the form of the Java Tutorials, but they are of inconsistent quality.
More importantly, while the Java tutorials link to the API docs, the reverse is not true, which is a real disservice. One of the most beautiful things about the web is how information can be extensively cross-linked, and exploring links is many times easier than turning pages of a physical book. Anyone who’s explored topics on Wikipedia for an hour (or more) at a time knows how amazing this can be.
Language documentation isn’t quite the same as an encyclopedia, but it’s a shame that Java’s documentation does not lend itself as easily to curious, open-ended learning. If the API docs frequently linked to relevant portions of the tutorials, then a user could open the Javadoc for a class or method they are using, then quickly jump to the relevant guide. As the documentation is currently organized, this is nearly impossible, and tutorials are only discovered when explicitly looking for them.
Other languages, such as JavaScript, are in even worse boats than Java when it comes to documentation. For whatever reason, structured documentation of any kind doesn’t seem to have caught on in the JavaScript world, probably largely because no documentation tool ships with the language, and no such tool ever became standard. Whatever the reason, JavaScript libraries’ documentation largely resides in markdown documents spread across version control repositories and various webpages.
The closest thing that JavaScript has to official language documentation, aside from the (largely incomprehensible) language standard, is MDN. MDN’s docs are actually quite good, and they tend to mix lots of examples together with reference-style documentation. They’re indexed and searchable, and they have a great Google search ranking. MDN is easily my go-to place to read about core JavaScript functions.
The trouble, of course, is that MDN only houses documentation for the standard library, and while new standards make it bigger than ever, huge amounts of critical functionality are often offloaded to separate packages. These libraries all have their own standards and styles of documentation, and virtually none of them even compare to MDN.
This means that documentation for JavaScript libraries, even the most popular ones, tends to be all over the map. Ramda’s documentation is nothing but a reference, which makes it easy to look up information about a specific function, but nearly impossible to find anything if you don’t have a specific name to look for. In contrast, Passport’s docs are essentially only a set of tutorials, which is great for learners, but enormously frustrating if I just want to look up what the heck a specific function or method does. Fortunately, there are some libraries, like React, that absolutely nail this, and they have both styles of documentation that are actually cross-referenced. Unfortunately, those are mostly the exceptions, not the norm.
Python’s documentation is interesting, since it includes a set of tutorials alongside the API reference, and it also ships a language reference written for ordinary users. In many ways, it does everything right, but disappointingly, it generally doesn’t link back to the tutorials from the API docs, even though the reverse is true. For example, the section in the tutorial on if links to the section in the reference about if, but nothing goes in the other direction, which is something of a missed opportunity.
Haskell manages to be especially bad here (maybe even notoriously bad) despite having a ubiquitous documentation generator, Haddock. Unfortunately, Haddock’s format makes writing prose and examples somewhat unpleasant, and very few packages provide any sort of tutorial. For those that do, the tutorial is often not included in the API docs, a common theme at this point.
It’s generally a bad sign when your documentation tool isn’t even powerful enough to document itself, and Haddock’s docs are pretty impressively bad, though mostly serviceable if you’re willing to look.

The takeaway here is that I just don’t think most languages’ documentation is particularly good, and programmers seem to have gotten so used to this state of affairs that the bar is set disappointingly low. Fortunately, this is another area where Racket delivers. Racket, like Python, ships with two pieces of documentation: the Racket Guide and the Racket Reference. The guide includes over one hundred thousand words of explanations and examples, and the reference includes roughly half a million. Racket’s documentation is impressive on its own, but what’s equally impressive is how carefully and methodically cross-linked it is. Margin notes often provide links to corresponding sections in the relevant companion manual, so it’s easy to look up a form or function by name, then quickly jump to the section of the guide explaining it.

Hackett is obviously not going to have hundreds of thousands of words worth of documentation in its first few months of existence, but it already has nearly ten thousand, and that’s not nothing. More importantly, it is structured the same way that Racket’s docs are: it’s split into the Hackett Guide and the Hackett Reference, and the two are cross-referenced as much as possible. Haskell is a notoriously difficult language to learn, but my hope is that this does not necessarily need to be the case. Documentation cannot make the language trivial, but my hope is that it can make it a lot more accessible without making it any less useful for power users.

Rounding Hackett’s library, sanding its edges

One of the best things about sitting down and writing documentation—whether it’s for a tool, a library, or a language—is how it forces you, the author, to think about how someone else might perceive the project when seeing it for the first time. This encompasses everything: error messages, ease of installation, completeness of a standard library, friendliness of tooling, etc. Writing Hackett’s documentation forced me to make a lot of improvements, and while very few of them are flashy features, they make Hackett feel much less like a toy and more like a tool.

Hackett currently has no formal changelog because it is considered alpha quality, and its API is still unstable. There is no guarantee that things won’t change at any moment. Still, it’s useful to put together an ad-hoc list of changes made in the past few months. Here’s a very brief summary:

Hackett includes a Double type for working with IEEE 754 double-precision floating-point numbers.
Local definitions are supported via the let and letrec forms.
The prelude includes many more functions, especially functions on lists.
The Hackett reader has been adjusted to support using . as a bare symbol, since . is the function composition operator.
The Hackett REPL supports many more forms, including ADT, class, and instance definitions. Additionally, the REPL now uses Show instances to display the results of expressions. To compensate for the inability to print non-Showable things, a new (#:type expr) syntax is permitted to print the type of any expression.
Missing instance errors are now dramatically improved and correctly highlight the source location of expressions that led to the error.

Alongside these changes are a variety of internal code improvements that make the Hackett code simpler, more readable, and hopefully more accessible to contributors. Many of the trickiest functions are now heavily commented with the hope that the codebase won’t be so intimidating to people unfamiliar with Racket or the techniques behind Hackett’s typechecker. I will continue to document the internals of Hackett as I change different parts of the codebase, and I have even considered writing a separate Scribble document describing the Hackett internals. It certainly wouldn’t hurt.

One of the most exciting things about documenting Hackett has been realizing just how much already exists. Seriously, if you have gotten to this point in the blog post but haven’t read the actual documentation yet, I would encourage you to do so. No longer does the idea of writing real programs in this language feel out of reach; indeed, aside from potential performance problems, the language is likely extremely close to being usable for very simple things. After all, that’s the goal, isn’t it? As I’ve mentioned before, I’m writing Hackett for other people, but I’m also very much writing it for me: it’s a language I’d like to use.

Still, writing a general-purpose programming language is a lot of work, and I’ve known from the start that it isn’t something I can accomplish entirely on my own. While this iteration of work on Hackett is a sort of “documentation release”, it might be more accurate to call it an “accessibility release”. If you’re interested in contributing, I finally feel comfortable encouraging you to get involved!

A demo with pictures

Now, if you’re like me, all of this documentation stuff is already pretty exciting. Still, even I view documentation as simply a means to an end, not an end in itself. Documentation is successful when it gets out of the way and makes it possible to write good code that does cool things. Let’s write some, shall we?

Hackett ships with a special package of demo libraries in the aptly-named hackett-demo package, which are essentially simple, lightweight bindings to existing, dynamically-typed Racket libraries. In the previous Hackett blog post, I demonstrated the capabilities of hackett/demo/web-server. In this blog post, we’re going to use hackett/demo/pict and hackett/demo/pict/universe, which make it possible to write interactive, graphical programs in Hackett with just a few lines of code!

As always, we’ll start with #lang hackett, and we’ll import the necessary libraries:

#lang hackett

(require hackett/demo/pict
         hackett/demo/pict/universe)

With that, we can start immediately with a tiny example. Just to see how hackett/demo/pict works, let’s start by rendering a red square. We can do this by writing a main action that calls print-pict:

(main (print-pict (colorize red (filled-square 50.0))))

If you run the above program in DrRacket, you should see a 50 pixel red square printed into the interactions window!

Using the REPL, we can inspect the type of print-pict:

> (#:type print-pict)
: (-> Pict (IO Unit))

Unsurprisingly, displaying a picture to the screen needs IO. However, what’s interesting is that the rest of the expression is totally pure. Take a look at the type of filled-square:

> (#:type filled-square)
: (-> Double Pict)

No IO to be seen! This is because “picts” are entirely pure values that represent images built out of simple shapes, and they can be put together to make more complex images. For example, we can put two squares next to one another:

(main (print-pict {(colorize red (filled-square 50.0))
                   hc-append
                   (colorize blue (filled-square 50.0))}))

This code will print out a red square to the left of a blue one.

Again, hc-append is a simple, pure function, a binary composition operator that places two picts side by side to produce a new one:

> (#:type hc-append)
: (-> Pict (-> Pict Pict))

Using the various features of this toolkit, not only can we make interesting pictures and diagrams, we can even create a foundation for a game!

Implementing a snake clone

This blog post is not a Hackett tutorial; it is merely a demo. For that reason, I am not going to spend much time explaining how the following program is built. This section is closer to annotated source code than a guide to the pict or universe libraries. Hopefully it’s still illustrative.

We’ll start by writing some type definitions. We’ll need a type to represent 2D points on a grid, as well as a type to represent a cardinal direction (to keep track of which direction the player is moving, for example). We’ll also want an Eq instance for our points.

(data Point (point Integer Integer))
(data Direction d:left d:right d:up d:down)

(instance (Eq Point)
  [== (λ [(point a b) (point c d)] {{a == c} && {b == d}})])

With these two datatypes, we can implement a move function that accepts a point and a direction and produces a new point for an adjacent tile:

(defn move : {Direction -> Point -> Point}
  [[d:left  (point x y)] (point {x - 1} y)]
  [[d:right (point x y)] (point {x + 1} y)]
  [[d:up    (point x y)] (point x {y - 1})]
  [[d:down  (point x y)] (point x {y + 1})])

The next step is to define a type for our world state. The big-bang library operates using a game loop, with a function to update the state that’s called each “tick”. Our state will need to hold all the information about our game, which in this case, is just three things:

(data World-State (world-state
                   Direction    ; snake direction
                   (List Point) ; snake blocks
                   (List Point) ; food blocks
                   ))

It will also be useful to have a functional setter for the direction, which we’ll have to write ourselves, since Hackett does not (currently) have anything like Haskell’s record syntax:

(defn set-ws-direction [[d (world-state a b c)] (world-state d b c)])

Next, we’ll write some top-level constants that we’ll use in our rendering function, such as the number of tiles in the game board, the size of each tile in pixels, and some simple picts that represent the tiles we’ll use to draw our game:

(def board-width 50)
(def board-height 30)
(def tile->absolute {(d* 15.0) . integer->double})
(def empty-board (blank-rect (tile->absolute board-width) (tile->absolute board-height)))

(def block (filled-square 13.0))
(def food-block (colorize red block))
(def snake-block (colorize black block))

Now we can write our actual render function. To do this, we simply need to render each Point in our World-State’s two lists as a block on an empty-board. We’ll write a helper function, render-on-board, which does exactly that:

(defn render-on-board : {Pict -> (List Point) -> Pict}
  [[pict points]
   (foldr (λ [(point x y) acc]
            (pin-over acc (tile->absolute x) (tile->absolute y) pict))
          empty-board points)])

This function uses foldr to collect each point and place the provided pict at the right location using pin-over on an empty board. Using render-on-board, we can write the render function in just a couple of lines:

(defn render : {World-State -> Pict}
  [[(world-state _ snake-points food-points)]
   (pin-over (render-on-board snake-block snake-points)
             0.0 0.0
             (render-on-board food-block food-points))])

Next, we’ll need to handle the update logic. On each tick, the snake should advance by a single tile in the direction it’s currently moving. If it runs into a food tile, it should grow one tile larger, and we need to generate a new food tile elsewhere on the board. To help with that last part, the big-bang library provides a random-integer function, which we can use to write a random-point action:

(def random-point : (IO Point)
  {point <$> (random-integer 0 board-width)
         <*> (random-integer 0 board-height)})

Hackett supports applicative notation using infix operators, so random-point looks remarkably readable. It also runs in IO, since the result is, obviously, random. Fortunately, the on-tick function runs in IO as well (unlike render, which must be completely pure), so we can use random-point when necessary to generate a new food block:

(def init! : (forall [a] {(List a) -> (List a)})
  {reverse . tail! . reverse})

(defn on-tick : {World-State -> (IO World-State)}
  [[(world-state dir snake-points food-points)]
   (let ([new-snake-point (move dir (head! snake-points))])
     (if {new-snake-point elem? food-points}
         (do [new-food-point <- random-point]
             (pure (world-state dir {new-snake-point :: snake-points}
                                {new-food-point :: (delete new-snake-point food-points)})))
         (pure (world-state dir {new-snake-point :: (init! snake-points)}
                            food-points))))])

This function is the most complicated one in the whole program, but it’s still not terribly complex. It figures out what the snake’s next location is and binds it to new-snake-point, then checks if there is a food block at that location. If there is, it generates a new-food-point, then puts it in the new world state. Otherwise, it removes the last snake point and continues as usual.
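
For readers more at home in plain Racket than in Hackett, the same decision can be sketched as a pure function. This is only an illustration under a few assumptions: points are represented as (cons x y) pairs, directions as symbols, and the names move and step are made up here. The replacement food point is passed in as an argument so that the function needs no randomness.

```racket
#lang racket

;; Hypothetical Racket rendition of the Hackett logic above.
;; A point is a (cons x y) pair; the snake's head is the first list element.
(define (move dir p)
  (match* (dir p)
    [('left  (cons x y)) (cons (- x 1) y)]
    [('right (cons x y)) (cons (+ x 1) y)]
    [('up    (cons x y)) (cons x (- y 1))]
    [('down  (cons x y)) (cons x (+ y 1))]))

;; Returns (list new-snake new-food). If the new head lands on food, the
;; snake keeps its tail (so it grows) and the eaten food is replaced by
;; new-food; otherwise the last snake block is dropped.
(define (step dir snake food new-food)
  (define new-head (move dir (first snake)))
  (if (member new-head food)
      (list (cons new-head snake)
            (cons new-food (remove new-head food)))
      (list (cons new-head (drop-right snake 1))
            food)))

(define start '((2 . 0) (1 . 0) (0 . 0)))
(step 'right start '((3 . 0) (9 . 9)) '(5 . 5)) ; head lands on food
(step 'right start '((9 . 9)) '(5 . 5))         ; ordinary move
```

In the first call the snake grows to four blocks and the eaten food is swapped for the supplied point; in the second it keeps its length and the food list is untouched.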

The game is already almost completely written. The next step is just to handle key events, which are obviously important for allowing the player to control the snake. Fortunately, this is easy, since we can just use our set-ws-direction function that we wrote earlier:

(defn on-key : {KeyEvent -> World-State -> (IO World-State)}
  [[ke:left ] {pure . (set-ws-direction d:left)}]
  [[ke:right] {pure . (set-ws-direction d:right)}]
  [[ke:up   ] {pure . (set-ws-direction d:up)}]
  [[ke:down ] {pure . (set-ws-direction d:down)}]
  [[_       ] {pure . id}])

The on-key function runs in IO, but we don’t actually need that power, since all of our keypress update logic is completely pure, so we just wrap everything in pure.

We’re almost done now—all we need to do is set up the initial state when the game begins. We’ll write a small binding that creates a world state with the snake in the middle of the board and some random food locations scattered about:

(def initial-state
  (do [initial-food <- (sequence (take 5 (repeat random-point)))]
      (pure (world-state d:right
                         {(point 25 15) :: (point 24 15) :: (point 23 15) :: nil}
                         initial-food))))

Notably, we can use the repeat function to create an infinite list of random-point actions, take the first five of them, then call sequence to execute them from left to right. Now, all we have to do is put the pieces together in a main block:

(main (do [state <- initial-state]
          (big-bang state
            #:to-draw render
            #:on-tick on-tick 0.2
            #:on-key on-key)))

And that’s it! We haven’t implemented any win or loss conditions, but the basics are all there. In 80 lines of code, we’ve implemented a working snake game in Hackett.

Contributing to Hackett

If you are excited enough about Hackett to be interested in contributing, your first question is very likely “What can I do?” or “Where do I start?” My answer to that is (perhaps a little unhelpfully): it depends! My general recommendation is to try and write something with Hackett, and if you run into anything that prevents you from accomplishing your goal, look into what would need to be changed to support your program. Having a use case is a great way to come up with useful improvements.

On the other hand, you might not have anything in mind, or you might find Hackett’s scope a little too overwhelming to just jump right in and start contributing. Fortunately, Hackett has an issue tracker, so feel free to take a look and pick something that looks interesting and achievable. Alternatively, the standard library can always use fleshing out, and quite a lot of that can be written without ever even touching the scary Hackett internals.

Additionally, if you have any questions, please don’t hesitate to ask them! If you have a question about the codebase, get stuck implementing something, or just don’t know where to start, feel free to open an issue on GitHub, send me a message on the #racket IRC channel on Freenode, or ping me on the Racket Slack team.

Acknowledgements

Speaking of contributors, I’m excited to say that this is the first time I can truly say Hackett includes code written by someone other than me! I want to call attention to Samuel Gélineau, aka gelisam, who is officially the second contributor to Hackett. He helped to implement the new approach the Hackett REPL uses for printing expressions, which ended up being quite useful when implementing some of the other REPL improvements.

Additionally, I want to specially thank Matthew Flatt, Robby Findler, and Sam Tobin-Hochstadt for being especially responsive and helpful to my many questions about Scribble and the Racket top level. Racket continues to be extremely impressive, both as a project and as a community.

Finally, many thanks to the various people who have expressed interest in the project and continue to push me and ask questions. Working on Hackett is a lot of work—both time and effort—and it’s your continued enthusiasm that inspires me to put in the hours.

User-programmable infix operators in Racket

2017-08-12T00:00:00Z

Lisps are not known for infix operators. Quite the opposite: infix operators generally involve more syntax and parsing than Lispers are keen to support. However, in Hackett, all functions are curried, and variable-arity functions do not exist. Infix operators are almost necessary for that to be palatable, and though there are other reasons to want them, it may not be obvious how to support them without making the reader considerably more complex.

Fortunately, if we require users to syntactically specify where they wish to use infix expressions, support for infix operators is not only possible, but can be done without modifying the stock #lang racket reader. Furthermore, the resulting technique makes it possible for fixity information to be specified locally in a way that cooperates nicely with the Racket macro system, allowing the parsing of infix expressions to be manipulated at compile-time by users’ macros.

Our mission

Before we embark, let’s clarify our goal. We want to support infix operators in Racket, of course, but that could mean a lot of different things! Let’s start with what we do want:

Infix operators should be user-extensible, not limited to a special set of built-in operators.
Furthermore, operators’ names should not be restricted to a separate “operator” character set. Any valid Lisp identifier should be usable as an infix operator.
We want to be able to support fixity/associativity annotations. Some operators should associate to the left, like subtraction, but others should associate to the right, like cons. This allows 5 - 1 - 2 to be parsed as (- (- 5 1) 2), but 5 :: 1 :: nil to be parsed as (:: 5 (:: 1 nil)).
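
To make the two associativities concrete, here is a small sketch in plain Racket that nests a flat infix sequence either way. The helper names nest-left and nest-right are made up for illustration; they work on quoted lists at runtime and are not the macros this post goes on to build.

```racket
#lang racket

;; Hypothetical helpers: each takes a flat infix sequence such as
;; '(5 - 1 - 2) and returns the nested prefix s-expression it denotes.

;; Left-associative: fold the first operator application into a single
;; operand, then continue with the rest of the sequence.
(define (nest-left seq)
  (match seq
    [(list a) a]
    [(list a op b more ...) (nest-left (cons (list op a b) more))]))

;; Right-associative: everything after the first operator becomes the
;; second operand.
(define (nest-right seq)
  (match seq
    [(list a) a]
    [(list a op more ...) (list op a (nest-right more))]))

(nest-left '(5 - 1 - 2))      ; => '(- (- 5 1) 2)
(nest-right '(5 :: 1 :: nil)) ; => '(:: 5 (:: 1 nil))
```

The two functions differ only in where the recursive call goes, which is essentially the choice a fixity annotation has to make.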

These are nice goals, but we also won’t be too ambitious. In order to keep things simple and achievable, we’ll keep the following restrictions:

We will not permit infix expressions in arbitrary locations, since that would be impossible to parse given how we want to allow users to pick any names for operators they wish. Instead, infix expressions must be wrapped in curly braces, e.g. replacing (+ 1 2) with {1 + 2}.
Our implementation will not support any notion of operator precedence; all operators will have equal precedence, and it will be illegal to mix operators of different associativity in the same expression. Precedence is entirely possible to implement in theory, but it would be considerably more work, so this blog post does not include it.
All operators will be binary, and we will not support unary or mixfix operators. My intuition is that this technique should be able to be generalized to both of those things, but it would be considerably more complicated.

With those points in mind, what would the interface for our infix operator library look like for our users? Ideally, something like this:

#lang racket

(require (prefix-in racket/base/ racket/base)
         "infix.rkt")

(define-infix-operator - racket/base/- #:fixity left)
(define-infix-operator :: cons #:fixity right)

{{2 - 1} :: {10 - 3} :: '()}
; => '(1 7)

Let’s get started.

Implementing infix operators

Now that we know what we want, how do we get there? Well, there are a few pieces to this puzzle. We’ll need to solve two main problems:

How do we “hook into” expressions wrapped with curly braces so that we can perform a desugaring pass?
How can we associate fixity information with certain operators?

We’ll start by tackling the first problem, since its solution will inform the answer to the second. Since we won’t have any fixity information to start with, we’ll just assume that all operators associate left by default.

So, how do we detect if a Racket expression is surrounded by curly braces? Normally, in #lang racket, parentheses, square brackets, and curly braces are all interchangeable. Indeed, if you use curly braces in the REPL, you will find that they are treated exactly the same as parentheses:

> {+ 1 2}
3

If they are treated identically, giving them special behavior might seem hopeless, but don’t despair! Racket is no ordinary programming language, and it provides some tools to help us out here.

Someone who has worked with Lisps before is likely already aware that Lisp source code is a very direct representation of its AST, composed mostly of lists, pairs, symbols, numbers, and strings. In Racket, this is also true, but Racket also wraps these datums in boxes known as syntax objects. Syntax objects contain extra metadata about the code, most notably its lexical context, necessary for Racket’s hygiene system. However, syntax objects can also contain arbitrary metadata, known as syntax properties. Macros can attach arbitrary values to the syntax objects they produce using syntax properties, and other macros can inspect them. Racket’s reader (the syntax parser that turns program text into Racket syntax objects) also attaches certain syntax properties as part of its parsing process. One of those is named 'paren-shape.

This syntax property, as the name implies, keeps track of the shape of parentheses in syntax objects. You can see that for yourself by inspecting the property’s value for different syntax objects in the REPL:

> (syntax-property #'(1 2 3) 'paren-shape)
#f
> (syntax-property #'[1 2 3] 'paren-shape)
#\[
> (syntax-property #'{1 2 3} 'paren-shape)
#\{

This syntax property gives us the capability to distinguish between syntax objects that use curly braces and those that don’t, which is a step in the right direction, but it still doesn’t give us any hook with which we can change the behavior of certain expressions. Fortunately, there’s something else that can.
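
As a small sketch of both halves of that story, the module below attaches a custom property to a syntax object and then asks the reader directly for the 'paren-shape of some program text. The property key 'note and the helper shape-of are invented for this example.

```racket
#lang racket

;; Attach a custom property. syntax-property with three arguments returns
;; a new syntax object; the key 'note is made up for this example.
(define tagged (syntax-property #'(+ 1 2) 'note "checked by hand"))
(syntax-property tagged 'note)    ; => "checked by hand"
(syntax-property #'(+ 1 2) 'note) ; => #f, a fresh syntax object has no note

;; Hypothetical helper: read one form from a string and report the
;; 'paren-shape the reader attached to it.
(define (shape-of str)
  (syntax-property (read-syntax 'example (open-input-string str))
                   'paren-shape))

(shape-of "(1 + 2)") ; => #f
(shape-of "[1 + 2]") ; => #\[
(shape-of "{1 + 2}") ; => #\{
```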

Customizing application

Racket is a language designed to be extended, and it provides a variety of hooks in the language for the purposes of tweaking pieces in minor ways. One such hook is named #%app, which is automatically introduced by the macroexpander whenever it encounters a function application. That means it effectively turns this:

(+ 1 2)

…into this:

(#%app + 1 2)

What’s special about #%app is that the macroexpander will use whichever #%app is in scope in the expression’s lexical context, so if we write our own version of #%app, it will be used instead of the one from #lang racket. This is what we will use to hook into ordinary Racket expressions.
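
To see the hook in isolation, here is a minimal sketch of a module-local #%app that records each application written in the module before performing it. The app-log variable is hypothetical, and the macro uses define-syntax-rule only to keep the example short.

```racket
#lang racket

(require (prefix-in racket/base/ racket/base))

;; Hypothetical log of the function expressions applied in this module,
;; most recent first.
(define app-log '())

;; Our own #%app: note each application, then defer to the ordinary one.
;; The explicit racket/base/#%app keeps the macro from expanding into
;; itself.
(define-syntax-rule (#%app f arg ...)
  (begin
    (set! app-log (racket/base/#%app cons 'f app-log))
    (racket/base/#%app f arg ...)))

(define result (+ 1 (* 2 3)))
(define log-so-far app-log)

result     ; => 7
log-so-far ; => '(* +)
```

The outer application is logged first and the inner one second, so the inner operator ends up at the front of the list. Only applications whose lexical context comes from this module are affected; code in other modules keeps using its own #%app.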

To write our custom version of #%app, we will use the usual tool: Racket’s industrial-strength macro-authoring DSL, syntax/parse. We’ll also use a helper library that provides some tools for pattern-matching on syntax objects with the 'paren-shape syntax property, syntax/parse/class/paren-shape. Using these, we can transform expressions that are surrounded in curly braces differently from how we would transform expressions surrounded by parentheses:

#lang racket

(require (for-syntax syntax/parse/class/paren-shape)
         (prefix-in racket/base/ racket/base)
         syntax/parse/define)

(define-syntax-parser #%app
  [{~braces _ arg ...}
   #'(#%infix arg ...)]
  [(_ arg ...)
   #'(racket/base/#%app arg ...)])

This code will transform any applications surrounded in curly braces into one that starts with #%infix instead of #%app, so {1 + 2} will become (#%infix 1 + 2), for example. The identifier #%infix isn’t actually special in any way; it just has a funny name. We haven’t actually defined #%infix yet, though, so we need to do that next!

To start, we’ll just handle the simplest case: infix expressions with precisely three subexpressions, like {1 + 2}, should be converted into the equivalent prefix expressions, in this case (+ 1 2). We can do this with a simple macro:

(define-syntax-parser #%infix
  [(_ a op b)
   #'(racket/base/#%app op a b)])

Due to the way Racket propagates syntax properties, we explicitly indicate that the resulting expansion should use the #%app from racket/base, which will avoid any accidental infinite recursion between our #%app and #%infix. With this in place, we can now try our code out in the REPL, and believe it or not, we now support infix expressions with just those few lines of code:

> (+ 1 2)
3
> {1 + 2}
3

That’s pretty cool!

Of course, we probably want to support infix applications with more than just a single binary operator, such as {1 + 2 + 3}. We can implement that just by adding another case to #%infix that handles more subforms:

(define-syntax-parser #%infix
  [(_ a op b)
   #'(racket/base/#%app op a b)]
  [(_ a op b more ...)
   #'(#%infix (#%infix a op b) more ...)])

…and now, just by adding those two lines, we support arbitrarily-large sequences of infix operators:

> {1 + 2 + 3}
6
> {1 + 2 + 3 + 4}
10

I don’t know about you, but I think being able to do this in fewer than 20 lines of code is pretty awesome. We can even mix different operators in the same expression:

> {1 + 2 * 3 - 4}
5

Of course, all of our infix expressions currently assume that all operators associate left, as was our plan. In general, though, there are lots of useful operators that associate right, such as cons, nested -> types or contracts for curried functions, and expt, the exponentiation operator.

Tracking operator fixity

Clearly, we need some way to associate operator fixity with certain identifiers, and we need to be able to do it at compile-time. Fortunately, Racket has a very robust mechanism for creating compile-time values. Unfortunately, simply associating metadata with an identifier is a little less convenient than it could be, but there is a general technique that can be done with little boilerplate.

Essentially, Racket (like Scheme) uses a define-syntax form to define macros, which is what define-syntax-parser eventually expands into. However, unlike Scheme, Racket’s define-syntax is not just for defining macros—it’s for defining arbitrary bindings with compile-time (aka “phase 1”) values. Using this, we can define bindings that have entirely arbitrary values at compile-time, including plain data like numbers or strings:

(define-syntax foo 3)

Once a binding has been defined using define-syntax, a macro can look up the value associated with it by using the syntax-local-value function, which returns the compile-time value associated with an identifier:

(begin-for-syntax
  (println (syntax-local-value #'foo)))
; => 3

The cool thing is that syntax-local-value gets the value associated with a specific binding, not a specific name. This means a macro can look up the compile-time value associated with an identifier provided to it as a subform. This is close to what we want, since we could use syntax-local-value to look up something associated with our infix operator bindings, but the trouble is that they would then cease to be usable as ordinary functions. For example, if you try and use the foo binding from the above example as an expression, Racket will complain about an “illegal use of syntax”, which makes sense, because foo is not bound to anything at runtime.

To solve this problem, we can use something of a trick: any compile-time binding that happens to have a procedure as its value will be treated like a macro—that is, using it as an expression will cause the macroexpander to invoke the procedure with a syntax object representing the macro invocation, and the procedure is expected to produce a new syntax object as output. Additionally, Racket programmers can make custom datatypes valid procedures by using the prop:procedure structure type property.

If you are not familiar with the Racket macro system, this probably sounds rather complicated, but in practice, it’s not as confusing as it might seem. The trick here is to create a custom structure type at compile-time that we can use to track operator fixity alongside its runtime binding:

(require (for-syntax syntax/transformer))

(begin-for-syntax
  (struct infix-operator (runtime-binding fixity)
    #:property prop:procedure
    (λ (operator stx)
      ((set!-transformer-procedure
        (make-variable-like-transformer
         (infix-operator-runtime-binding operator)))
       stx))))

This is quite the magical incantation, and all the details of what is going on here are outside the scope of this blog post. Essentially, though, we can use values of this structure as a compile-time binding that will act just like the identifier provided for runtime-binding, but we can also include a value of our choosing for fixity. Here’s an example:

(define-syntax :: (infix-operator #'cons 'right))

This new :: binding will act, in every way, just like cons. If we use it in the REPL, you can see that it acts exactly the same:

> (:: 1 '())
'(1)

However, we can also use syntax-local-value to extract this binding’s fixity at compile-time, and that’s what makes it interesting:

(begin-for-syntax
  (println (infix-operator-fixity (syntax-local-value #'::))))
; => 'right

Using this extra compile-time information, we can adjust our #%infix macro to inspect bindings and determine their fixity, then use that to make decisions about parsing. Just like we used syntax/parse/class/paren-shape to make decisions based on the 'paren-shape syntax property, we can use syntax/parse/class/local-value to pattern-match on bindings with a particular compile-time value. We’ll wrap this in a syntax class of our own to make the code easier to read:

(begin-for-syntax
  (define-syntax-class infix-op
    #:description "infix operator"
    #:attributes [fixity]
    [pattern {~var op (local-value infix-operator?)}
             #:attr fixity (infix-operator-fixity (attribute op.local-value))]))

Now, we can update #%infix to use our new infix-op syntax class:

(define-syntax-parser #%infix
  [(_ a op:infix-op b)
   #'(racket/base/#%app op a b)]
  [(_ a op:infix-op b more ...)
   #:when (eq? 'left (attribute op.fixity))
   #'(#%infix (#%infix a op b) more ...)]
  [(_ more ... a op:infix-op b)
   #:when (eq? 'right (attribute op.fixity))
   #'(#%infix more ... (#%infix a op b))])

Notably, we now require all operators to be bound to compile-time infix operator values, and we include two conditions via #:when clauses. These clauses check to ensure that the operator in question has the expected fixity before committing to that clause; if the condition fails, then parsing backtracks. Using this new definition of #%infix, we can successfully use :: in an infix expression, and it will be parsed with the associativity that we expect:

> {1 :: 2 :: 3 :: '()}
'(1 2 3)

Exciting!
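To make the rewriting strategy of the three #%infix clauses easier to follow outside of the macro system, here is a small model of it written in Haskell rather than Racket. It is only a sketch under stated assumptions: Fixity, Expr, Item, FixityTable, resolve, and exampleTable are hypothetical names invented for this illustration, and an association list stands in for the compile-time bindings that the real macro inspects with syntax-local-value.

```haskell
-- A model of the three #%infix clauses. All names here are hypothetical.
data Fixity = LeftAssoc | RightAssoc
  deriving (Eq, Show)

data Expr
  = Lit Integer
  | Apply String Expr Expr   -- operator name applied to two operands
  deriving (Eq, Show)

-- One element of a braced form: an operand or an operator name.
data Item
  = Operand Expr
  | Operator String
  deriving (Eq, Show)

-- Stand-in for the compile-time fixity information attached to bindings.
type FixityTable = [(String, Fixity)]

resolve :: FixityTable -> [Item] -> Maybe Expr
resolve table items = case items of
  -- Clause 1: exactly three subforms with a known operator.
  [Operand a, Operator op, Operand b]
    | Just _ <- lookup op table -> Just (Apply op a b)
  -- Clause 2: the first operator associates left, so group the first three.
  Operand a : Operator op : Operand b : more
    | lookup op table == Just LeftAssoc ->
        resolve table (Operand (Apply op a b) : more)
  -- Clause 3: the last operator associates right, so group the last three.
  _ -> case reverse items of
    Operand b : Operator op : Operand a : front
      | lookup op table == Just RightAssoc ->
          resolve table (reverse front ++ [Operand (Apply op a b)])
    -- No clause applies; the macro would report a syntax error here.
    _ -> Nothing

exampleTable :: FixityTable
exampleTable = [("+", LeftAssoc), ("^", RightAssoc)]
```

In this model, the items for 1 + 2 + 3 resolve to Apply "+" (Apply "+" (Lit 1) (Lit 2)) (Lit 3), while the items for 2 ^ 2 ^ 3 resolve to Apply "^" (Lit 2) (Apply "^" (Lit 2) (Lit 3)). An expression such as 1 ^ 2 + 3 resolves to Nothing, because its first operator is not left associative and its last operator is not right associative.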

A nicer interface for defining infix operators

We currently have to define infix operators by explicitly using define-syntax, but this is not a very good interface. Users of infix syntax probably don’t want to have to understand the internal workings of the infix operator implementation, so we just need to define one final macro to consider this done: the define-infix-operator form from the example at the very beginning of this blog post.

Fortunately, this macro is absolutely trivial to write. In fact, we can do it in a mere three lines of code, since it’s very minor sugar over the define-syntax definitions we were already writing:

(define-simple-macro (define-infix-operator op:id value:id
                       #:fixity {~and fixity {~or {~datum left} {~datum right}}})
  (define-syntax op (infix-operator #'value 'fixity)))

With this in hand, we can define some infix operators with a much nicer syntax:

(define-infix-operator + racket/base/+ #:fixity left)
(define-infix-operator - racket/base/- #:fixity left)
(define-infix-operator * racket/base/* #:fixity left)
(define-infix-operator / racket/base// #:fixity left)

(define-infix-operator ^ expt #:fixity right)
(define-infix-operator :: cons #:fixity right)

With these simple definitions, we can write some very nice mathematical expressions that use infix syntax, in ordinary #lang racket:

> {1 + 2 - 4}
-1
> {2 ^ 2 ^ 3}
256
> {{2 ^ 2} ^ 3}
64

And you know what’s most amazing about this? The entire thing is only 50 lines of code. Here is the entire implementation of infix operators from this blog post in a single code block, with absolutely nothing hidden or omitted:

#lang racket

(require (for-syntax syntax/parse/class/local-value
                     syntax/parse/class/paren-shape
                     syntax/transformer)
         (prefix-in racket/base/ racket/base)
         syntax/parse/define)

(begin-for-syntax
  (struct infix-operator (runtime-binding fixity)
    #:property prop:procedure
    (λ (operator stx)
      ((set!-transformer-procedure
        (make-variable-like-transformer
         (infix-operator-runtime-binding operator)))
       stx)))

  (define-syntax-class infix-op
    #:description "infix operator"
    #:attributes [fixity]
    [pattern {~var op (local-value infix-operator?)}
             #:attr fixity (infix-operator-fixity (attribute op.local-value))]))

(define-syntax-parser #%app
  [{~braces _ arg ...}
   #'(#%infix arg ...)]
  [(_ arg ...)
   #'(racket/base/#%app arg ...)])

(define-syntax-parser #%infix
  [(_ a op:infix-op b)
   #'(racket/base/#%app op a b)]
  [(_ a op:infix-op b more ...)
   #:when (eq? 'left (attribute op.fixity))
   #'(#%infix (#%infix a op b) more ...)]
  [(_ more ... a op:infix-op b)
   #:when (eq? 'right (attribute op.fixity))
   #'(#%infix more ... (#%infix a op b))])

(define-simple-macro (define-infix-operator op:id value:id
                       #:fixity {~and fixity {~or {~datum left} {~datum right}}})
  (define-syntax op (infix-operator #'value 'fixity)))

(define-infix-operator + racket/base/+ #:fixity left)
(define-infix-operator - racket/base/- #:fixity left)
(define-infix-operator * racket/base/* #:fixity left)
(define-infix-operator / racket/base// #:fixity left)

(define-infix-operator ^ expt #:fixity right)
(define-infix-operator :: cons #:fixity right)

Racket is a hell of a programming language.

Applications, limitations, and implications

This blog post has outlined a complete, useful model for infix operators, and it is now hopefully clear how they work, but many of the most interesting properties of this implementation are probably not obvious. As far as I can make out, this embedding of infix operators into a macro system is novel, and I am almost certain that the way this implementation tracks fixity information is unique. One of the most interesting capabilities gained from this choice of implementation is the ability for macros to define infix operators and control their fixity, even locally.

What does this mean? Well, remember that infix operators are just special syntax bindings. Racket includes a variety of forms for binding or adjusting macros locally, such as let-syntax and syntax-parameterize. Using these tools, it would be entirely possible to implement a with-fixity macro, that could adjust the fixity of an operator within a syntactic block. This could be used, for example, to make / right associative within a block of code:

> {1 / 2 / 3}
1/6
> (with-fixity ([/ right])
    {1 / 2 / 3})
1 1/2

In fact, this macro is hardly theoretical, since it could be implemented in a trivial 7 lines, simply expanding to uses of splicing-let and splicing-let-syntax:

(define-simple-macro
  (with-fixity ([op:id {~and fixity {~or {~datum left} {~datum right}}}] ...)
    body ...)
  #:with [op-tmp ...] (generate-temporaries #'[op ...])
  (splicing-let ([op-tmp op] ...)
    (splicing-let-syntax ([op (infix-operator #'op-tmp 'fixity)] ...)
      body ...)))

This is not especially useful given the current set of infix operator features, but it’s easy to imagine how useful it could be in a system that also supported a notion of precedence. It is not entirely uncommon to encounter certain expressions that could be more cleanly expressed with a local set of operator precedence rules, perhaps described as a set of relations between operators rather than a global table of magic precedence numbers. With traditional approaches to infix operators, parsing such code would be difficult without a very rigid syntactic structure, but this technique makes it easy.

As mentioned at the beginning of this blog post, this technique is also not merely a novelty: as of now, I am actively using this in Hackett to support infix operators with all of the features outlined here. The Hackett implementation is a little bit fancier than the one in this blog post, since it works harder to produce better error messages. It explicitly disallows mixing left-associative and right-associative operators in the same expression, so it does some additional validation as part of expansion, and it arranges for source location information to be copied onto the result. It also makes a different design decision to allow any expression to serve as an infix operator, assuming left associativity if no fixity annotation is available.

If you’re interested in the code behind the additional steps Hackett takes to make infix operators more usable and complete, take a look at this file for the definition of infix bindings, as well as this file for the definition of infix application. My hope is to eventually add support for some sort of precedence information, though who knows: maybe infix operators will be easier to reason about if the rules are kept extremely simple. I am also considering adding support for so-called “operator sections” at some point, which would allow things like {_ - 1} to serve as a shorthand for (lambda [x] {x - 1}), but I haven’t yet decided if I like the tradeoffs involved.

It’s possible that this implementation of infix operators might also be useful in languages in the Racket ecosystem besides Hackett. However, I’m not sure it makes a ton of sense in #lang racket without modifications, as variadic functions subsume many of the cases where infix operators are needed in Haskell. If there is a clamoring for this capability, I would be happy to consider extracting the functionality into a library, but as of right now, I don’t have any plans to do so.

Finally, the main point of this blog post is to showcase how easy it is to do things in Racket that would be impossible in most languages and difficult even in most Lisps. It also helps to show off how Hackett is already benefitting from those capabilities: while this particular feature is built into #lang hackett, there’s no reason something similar but more powerful couldn’t be built as a separate library by a user of Hackett. Even as Hackett’s author, I think that’s exciting, since it makes it possible for users to experiment with improvements to the language on their own. Some of those improvements may eventually be rolled into the core language or standard library, but many of them can likely live effectively in separate libraries, accessible on demand to those who need them. After all, that’s one of Racket’s most important promises (languages as libraries), and it’s why Hackett is a part of the Racket ecosystem.

Unit testing effectful Haskell with monad-mock

2017-06-29T00:00:00Z

Nearly eight months ago, I wrote a blog post about unit testing effectful Haskell code using a library called test-fixture. That library has served us well, but it wasn’t as easy to use as I would have liked, and it worked better with certain patterns than others. Since then, I’ve learned more about Haskell and more about testing, and I’m pleased to announce that I am releasing an entirely new testing library, monad-mock.

A first glance at monad-mock

The monad-mock library is, first and foremost, designed to be easy. It doesn’t ask much from you, and it requires almost zero boilerplate.

The first step is to write an mtl-style interface that encodes an effect you want to mock. For example, you might want to test some code that interacts with the filesystem:

class Monad m => MonadFileSystem m where
  readFile :: FilePath -> m String
  writeFile :: FilePath -> String -> m ()

Now you just have to write your code as normal. For demonstration purposes, here’s a function that defines copying a file in terms of readFile and writeFile:

copyFile :: MonadFileSystem m => FilePath -> FilePath -> m ()
copyFile a b = do
  contents <- readFile a
  writeFile b contents

Making this function work on the real filesystem is trivial, since we just need to define an instance of MonadFileSystem for IO:

instance MonadFileSystem IO where
  readFile = Prelude.readFile
  writeFile = Prelude.writeFile

But how do we test this? Well, we could run some real code in IO, which might not be so bad for such a simple function, but this seems like a bad idea. For one thing, a bad implementation of copyFile could do some pretty horrible things if it misbehaved and decided to overwrite important files, and if you’re constantly running a test suite whenever a file changes, it’s easy to imagine causing a lot of damage. Running tests against the real filesystem also makes tests slower and harder to parallelize, and it only gets much worse once you are doing more complex effects than interacting with the filesystem.

Using monad-mock, we can test this function in just a couple of lines of code:

import Control.Exception (evaluate)
import Control.Monad.Mock
import Control.Monad.Mock.TH
import Data.Function ((&))
import Test.Hspec

makeMock "FileSystemAction" [ts| MonadFileSystem |]

spec = describe "copyFile" $
  it "reads a file and writes its contents to another file" $
    evaluate $ copyFile "foo.txt" "bar.txt"
      & runMock [ ReadFile "foo.txt" :-> "contents"
                , WriteFile "bar.txt" "contents" :-> () ]

That’s it!

The last two lines of the above snippet are the really interesting bits: they specify the actions that are expected to be executed and couple them with their results. You will find that if you tweak the list in any way, such as reordering the actions, eliminating one or both of them, or adding an additional action to the end, the test will fail. We could even turn this into a property-based test that generated arbitrary file paths and file contents.

Admittedly, in this trivial example, the mock is a little silly, since converting this into a property-based test would demonstrate how much we’ve basically just reimplemented the function in our test. However, once our function starts to do somewhat more complicated things, then our tests become more meaningful. Here’s a similar function that only copies a file if it is nonempty:

copyNonemptyFile :: MonadFileSystem m => FilePath -> FilePath -> m ()
copyNonemptyFile a b = do
  contents <- readFile a
  unless (null contents) $
    writeFile b contents

This function has some logic which is very clearly not expressed in its type, and it would be difficult to encode that information into the type in a safe way. Fortunately, we can guarantee that it works by writing some tests:

describe "copyNonemptyFile" $ do
  it "copies a file with contents" $
    evaluate $ copyNonemptyFile "foo.txt" "bar.txt"
      & runMock [ ReadFile "foo.txt" :-> "contents"
                , WriteFile "bar.txt" "contents" :-> () ]

  it "does nothing with an empty file" $
    evaluate $ copyNonemptyFile "foo.txt" "bar.txt"
      & runMock [ ReadFile "foo.txt" :-> "" ]

These tests are much more useful, and they have some actual value to them. Imagine we had accidentally written when instead of unless, an easy typo to make. Our tests would fail with some useful error messages:

1) copyNonemptyFile copies a file with contents
     uncaught exception: runMockT: expected the following unexecuted actions to be run:
       WriteFile "bar.txt" "contents"

2) copyNonemptyFile does nothing with an empty file
     uncaught exception: runMockT: expected end of program, called writeFile
       given action: WriteFile "bar.txt" ""

You now know enough to write tests with monad-mock.

Why unit test?

When the issue of testing is brought up in Haskell, it is often treated with a certain distaste by a portion of the community. There are some points I’ve seen a number of times, and though they take different forms, they boil down to two ideas:

“Haskell code does not need tests because the type system can prove correctness.”
“Testing in Haskell is trivial because it is a pure language, and testing pure functions is easy.”

I’ve been writing Haskell professionally for over a year now, and I can happily say that there is some truth to both of those things! When my Haskell code typechecks, I feel a confidence in it that I would not feel were I using a language with a less powerful type system. Furthermore, Haskell encourages a “pure core, impure shell” approach to system design that makes testing many things pleasant and straightforward, and it completely eliminates the worry of subtle nondeterminism leaking into tests.

That said, Haskell is not a proof assistant, and its type system cannot guarantee everything, especially for code that operates on the boundaries of what Haskell can control. For much the same reason, I find that my pure code is the code I am least likely to need to test, since it is also the code with the strongest type safety guarantees, operating on types in my application’s domain. In contrast, the effectful code is often what I find the most value in extensively testing, since it often contains the most subtle complexity, and it is frequently difficult or even impossible to encode into types.

Haskell has the power to provide remarkably strong correctness guarantees with a surprisingly small amount of effort by using a combination of tests and types, using each to compensate for the other’s weaknesses and playing to each technique’s strengths. Some code is test-driven, other code is type-driven. Most code ends up being a mix of both. Testing is just a tool like any other, and it’s nice to feel confident in one’s ability to effectively structure code in a decoupled, testable manner.

Why mock?

Even if you accept that testing is good, the question of whether or not to mock is a subtler issue. To some people, “unit testing” is synonymous with mocks. This is emphatically not true, and in fact, overly aggressive mocking is one of the best ways to make your test suite completely worthless. The monad-mock approach to mocking is a bit more principled than mocking in many dynamic, object-oriented languages, but it comes with many of the same drawbacks: mocks couple your tests to your implementation in ways that make them less valuable and less meaningful.

For the MonadFileSystem example above, I would actually probably not use a mock. Instead, I would use a fake, in-memory filesystem implementation:

newtype FakeFileSystemT m a = FakeFileSystemT (StateT [(FilePath, String)] m a)
  deriving (Functor, Applicative, Monad)

fakeFileSystemT :: Monad m => [(FilePath, String)]
                -> FakeFileSystemT m a -> m (a, [(FilePath, String)])
fakeFileSystemT fs (FakeFileSystemT x) = second sort <$> runStateT x fs

instance Monad m => MonadFileSystem (FakeFileSystemT m) where
  readFile path = FakeFileSystemT $ get >>= \fs -> lookup path fs &
    maybe (fail $ "readFile: no such file ‘" ++ path ++ "’") return
  writeFile path contents = FakeFileSystemT . modify $ \fs ->
    (path, contents) : filter ((/= path) . fst) fs

The above snippet demonstrates how easy it is to define a MonadFileSystem implementation in terms of StateT, and while this may seem like a lot of boilerplate, it really isn’t. You have to write a fake once per interface, and the above block is a minuscule twelve lines of code. With this technique, you are still able to write tests that depend on the state of the filesystem before and after running the implementation, but you decouple yourself from the precise process of getting there:

describe "copyNonemptyFile" $ do
  it "copies a file with contents" $ do
    let ((), fs) = runIdentity $ copyNonemptyFile "foo.txt" "bar.txt"
          & fakeFileSystemT [ ("foo.txt", "contents") ]
    fs `shouldBe` [ ("bar.txt", "contents"), ("foo.txt", "contents") ]

  it "does nothing with an empty file" $ do
    let ((), fs) = runIdentity $ copyNonemptyFile "foo.txt" "bar.txt"
          & fakeFileSystemT [ ("foo.txt", "") ]
    fs `shouldBe` [ ("foo.txt", "") ]

This is better than using a mock, and I would highly recommend doing it if you can! However, a lot of real applications have to interact with services of much greater complexity than an idealized filesystem, and creating that sort of in-memory fake is not always practical. One such situation might be interacting with AWS CloudFormation, for example:

class Monad m => MonadAWS m where
  createStack :: StackName -> StackTemplate -> m (Either AWSError StackId)
  listStacks :: m (Either AWSError [StackSummaries])
  describeStack :: StackId -> m (Either AWSError StackInfo)
  -- and so on...

AWS is a very complex system, and it can do dozens of different things (and fail in dozens of different ways) based on an equally complex set of inputs. For example, in the above API, createStack needs to parse its template, which can be YAML or JSON, in order to determine which of many possible errors and behaviors can be produced, both on the initial call and on subsequent ones.

Creating a fake implementation of AWS is hardly feasible, and this is where a mock can be useful. By simply writing makeMock "AWSAction" [ts| MonadAWS |], we can test functions that interact with AWS in a pure way without necessarily needing to replicate all of its complexity.

Isolating mocks

Of course, tests that use mocks provide less value than tests that use “smarter” fakes, since they are far more tightly coupled to the implementation, and it’s dramatically more likely that you will need to change the tests when you change the logic. To avoid this, it can be helpful to create multiple interfaces to the same thing: a high-level interface and a low-level one. If our above MonadAWS is a low-level interface, we could create a high-level counterpart that does precisely what our application needs:

class Monad m => MonadDeploy m where
  executeDeployment :: Deployment -> m (Either DeployError ())

When running our application “for real”, we would use MonadAWS to implement MonadDeploy:

executeDeploymentImpl :: MonadAWS m => Deployment -> m (Either DeployError ())
executeDeploymentImpl = ...

The nice thing about this is we can actually test executeDeploymentImpl using a MonadAWS mock, so we can still have unit test coverage of the code on the boundaries of our system! Additionally, by containing the mock to a single place, we can test the rest of our code using a smarter fake implementation of MonadDeploy, helping to decouple our code from AWS’s complex API and improve the reliability and usefulness of our test suite.
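To illustrate what a “smarter fake” of the high-level interface might look like, here is a minimal sketch. The post does not define Deployment or DeployError, so the definitions below are hypothetical stand-ins, as are FakeDeploy, runFakeDeploy, deployAll, and the convention that a deployment named "broken" fails. The sketch assumes the mtl package is available for Control.Monad.State.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.State (State, modify, runState)

-- Hypothetical stand-ins; the real types would come from the application.
newtype Deployment = Deployment String deriving (Eq, Show)
newtype DeployError = DeployError String deriving (Eq, Show)

class Monad m => MonadDeploy m where
  executeDeployment :: Deployment -> m (Either DeployError ())

-- A fake that records successful deployments instead of talking to AWS.
newtype FakeDeploy a = FakeDeploy (State [Deployment] a)
  deriving (Functor, Applicative, Monad)

-- Runs the fake and returns the recorded deployments in execution order.
runFakeDeploy :: FakeDeploy a -> (a, [Deployment])
runFakeDeploy (FakeDeploy x) = fmap reverse (runState x [])

instance MonadDeploy FakeDeploy where
  executeDeployment d@(Deployment name)
    -- Assumed convention for this sketch: "broken" always fails.
    | name == "broken" = pure (Left (DeployError "fake failure"))
    | otherwise = FakeDeploy (Right () <$ modify (d :))

-- Hypothetical application logic: deploy in order, stop at the first error.
deployAll :: MonadDeploy m => [Deployment] -> m (Either DeployError ())
deployAll [] = pure (Right ())
deployAll (d : ds) = do
  result <- executeDeployment d
  case result of
    Left err -> pure (Left err)
    Right () -> deployAll ds
```

A test can then assert on the final state rather than on the exact sequence of calls. For example, running deployAll over deployments named "a", "broken", and "c" with runFakeDeploy yields the fake failure together with a log that contains only the first deployment.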

The key point here is that mocking is just a small piece of the larger testing puzzle in any language, and that is just as true in Haskell. An overemphasis on mocking is an easy way to end up with a test suite that feels useless, probably because it is. Use mocks as a technique to insulate your application from the complexity in others’ APIs, then use more domain-specific testing techniques and type-level assertions to ensure the correctness of your logic.

How monad-mock works

If you’ve read this far and are convinced that monad-mock is useful, you may safely stop reading now. However, if you are interested in the details of what it actually does and what makes it tick, the rest of this blog post is going to focus on how the implementation works and how it compares to other techniques.

The centerpiece of monad-mock’s API is its monad transformer, MockT, which is a type constructor that accepts three types:

newtype MockT (f :: * -> *) (m :: * -> *) (a :: *)

The m and a type variables obviously correspond to the usual monad transformer arguments, which represent the underlying monad and the result of the monadic computation, respectively. The f variable is more interesting, since it’s what makes MockT work at all, and it isn’t even a type: it’s a type constructor with kind * -> *. What does it mean?

Looking at the type signature of runMockT gives us a little bit more information about what that f actually represents:

runMockT :: (Action f, Monad m) => [WithResult f] -> MockT f m a -> m a

This type signature provides two pieces of key information:

The f parameter is constrained by the Action f constraint.
Running a mocked computation requires supplying a list of WithResult f values. This list corresponds to the list of expectations provided to runMock in earlier examples.

To understand both of these things, it helps to examine the definition of an actual datatype that can have an Action instance. For the filesystem example, the action datatype looks like this:

data FileSystemAction r where
  ReadFile :: FilePath -> FileSystemAction String
  WriteFile :: FilePath -> String -> FileSystemAction ()

Notice how each constructor clearly corresponds to one of the methods of MonadFileSystem, with a type to match. Now the purpose of the type parameter of the FileSystemAction type constructor (in this case r) should hopefully become clear: it represents the type of the value produced by each method. Also note that the parameter is completely phantom: each constructor fixes it in its result type, but none of them stores the method’s result.

With this in mind, we can take a look at the definition of WithResult:

data WithResult f where
  (:->) :: f r -> r -> WithResult f

This is what defines the (:->) constructor from earlier in the blog post, and you can see that it effectively just represents a tuple of an action and a value of its associated result. It’s completely type-safe, since it ensures the result matches the type argument to the action.

Finally, this brings us to the Action class, which is not complex, but is unfortunately necessary:

class Action f where
  eqAction :: f a -> f b -> Maybe (a :~: b)
  showAction :: f a -> String

Notice that these methods are effectively just (==) and show, lifted to type constructors of kind * -> *. One significant difference is that eqAction produces Maybe (a :~: b) instead of Bool, where (:~:) is from Data.Type.Equality. This is a type equality witness, which means a successful equality between two values allows the compiler to be sure that the two types are equal. This is necessary for the implementation of runMockT due to the phantom type in actions—in order to convince GHC that we can properly return the result of a mocked action, we need to assure it that the value we’re going to return is actually of the proper type.

Implementing this typeclass is not particularly burdensome, but it’s entirely boilerplate, so even if you want to define your own action type (that is, you don’t want to use makeMock), you can use the deriveAction function from Control.Monad.Mock.TH to derive an Action instance on an existing datatype.
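For a sense of what that boilerplate looks like, here is a hand-written instance for FileSystemAction. It is a sketch, not necessarily what deriveAction generates: the Action class is redeclared locally so the snippet compiles without monad-mock, and castResult is a hypothetical helper added only to show what the equality witness buys us.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}

import Data.Type.Equality ((:~:)(..))

-- Local copies of the class and action type shown in the post.
class Action f where
  eqAction   :: f a -> f b -> Maybe (a :~: b)
  showAction :: f a -> String

data FileSystemAction r where
  ReadFile  :: FilePath -> FileSystemAction String
  WriteFile :: FilePath -> String -> FileSystemAction ()

instance Action FileSystemAction where
  -- Matching both constructors tells GHC that a and b are the same type,
  -- which is what allows Refl to typecheck in each successful case.
  eqAction (ReadFile a) (ReadFile b)
    | a == b = Just Refl
  eqAction (WriteFile a x) (WriteFile b y)
    | a == b && x == y = Just Refl
  eqAction _ _ = Nothing

  showAction (ReadFile a) = "ReadFile " ++ show a
  showAction (WriteFile a x) = "WriteFile " ++ show a ++ " " ++ show x

-- Hypothetical helper: reuse a result recorded for one action as the
-- result of another, but only when the two actions are equal.
castResult :: Action f => f a -> f b -> a -> Maybe b
castResult expected actual result =
  case eqAction expected actual of
    Just Refl -> Just result   -- here GHC knows a ~ b
    Nothing   -> Nothing
```

Without the Refl match, returning result at type b would be rejected by the typechecker, which is exactly the problem a plain Bool-returning equality could not solve.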

Connecting the mock to its class

Now that we have an action with which to mock a class, we need to actually define an instance of that class for MockT. For this process, monad-mock provides a mockAction function with the following type:

mockAction :: (Action f, Monad m) => String -> f r -> MockT f m r

This function accepts two arguments: the name of the method being mocked and the action that represents the current call. This is easier to illustrate with an actual instance of MonadFileSystem using MockT and our FileSystemAction type:

instance Monad m => MonadFileSystem (MockT FileSystemAction m) where
  readFile a = mockAction "readFile" (ReadFile a)
  writeFile a b = mockAction "writeFile" (WriteFile a b)

This allows readFile and writeFile to defer to the mock, and providing the names of the functions as strings helps monad-mock to produce useful error messages upon failure. Internally, MockT is a StateT that keeps track of a list of WithResult f values as its state. Each call to the mock checks the action against the internal list of calls, and if they match, it returns the associated result. Otherwise, it throws an exception.

This scheme is simple, but it seems to work remarkably well. There are some obvious enhancements that will probably be necessary eventually, like allowing action results that run in the underlying monad m in order to support things like throwError from MonadError, but so far, it hasn’t been necessary for what we’ve been using it for. Certain tricky signatures defy this simple technique, such as signatures where a monadic action appears in a negative position (that is, the signatures you need things like monad-control or monad-unlift for), but we’ve found that most of our effects don’t have any reason to include such signatures.
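The bookkeeping described above, a state that holds the remaining expectations and is consulted on every mocked call, can be sketched in a few dozen lines. This is a simplified model and not monad-mock’s actual implementation: Mock is a hypothetical stand-in for MockT that reports failures with Left instead of throwing exceptions, the error messages are only loosely modelled on the ones shown earlier, and copyFileMocked and showExpected are invented for the example. The sketch assumes the mtl and transformers packages are available.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}

import Control.Monad.State (StateT, get, put, runStateT)
import Control.Monad.Trans.Class (lift)
import Data.List (intercalate)
import Data.Type.Equality ((:~:)(..))

class Action f where
  eqAction   :: f a -> f b -> Maybe (a :~: b)
  showAction :: f a -> String

data WithResult f where
  (:->) :: f r -> r -> WithResult f

-- Simplified stand-in for MockT: the state is the list of expectations
-- that have not been consumed yet; failures are reported with Left.
type Mock f = StateT [WithResult f] (Either String)

mockAction :: Action f => String -> f r -> Mock f r
mockAction name action = do
  expectations <- get
  case expectations of
    [] -> lift (Left ("expected end of program, called " ++ name))
    (expected :-> result) : rest ->
      case eqAction expected action of
        -- The witness proves the recorded result has the type we must return.
        Just Refl -> do
          put rest
          pure result
        Nothing -> lift (Left (name ++ ": expected " ++ showAction expected
                               ++ ", given " ++ showAction action))

showExpected :: Action f => WithResult f -> String
showExpected (action :-> _) = showAction action

runMock :: Action f => [WithResult f] -> Mock f a -> Either String a
runMock expectations program = do
  (result, leftover) <- runStateT program expectations
  if null leftover
    then Right result
    else Left ("unexecuted actions: "
               ++ intercalate ", " (map showExpected leftover))

-- The action type from the post, with a hand-written Action instance.
data FileSystemAction r where
  ReadFile  :: FilePath -> FileSystemAction String
  WriteFile :: FilePath -> String -> FileSystemAction ()

instance Action FileSystemAction where
  eqAction (ReadFile a) (ReadFile b) | a == b = Just Refl
  eqAction (WriteFile a x) (WriteFile b y) | a == b && x == y = Just Refl
  eqAction _ _ = Nothing
  showAction (ReadFile a) = "ReadFile " ++ show a
  showAction (WriteFile a x) = "WriteFile " ++ show a ++ " " ++ show x

-- copyFile written directly against the mock, to keep the sketch short.
copyFileMocked :: FilePath -> FilePath -> Mock FileSystemAction ()
copyFileMocked a b = do
  contents <- mockAction "readFile" (ReadFile a)
  mockAction "writeFile" (WriteFile b contents)
```

Running copyFileMocked "foo.txt" "bar.txt" with runMock against the two expectations from the first example produces Right (). Dropping the WriteFile expectation yields a Left that complains about the unexpected writeFile call, and adding an extra expectation yields a Left that lists the unexecuted action.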

A brief comparison with free(r) monads

At this point, astute readers will likely be thinking about free monads, which parts of this technique greatly resemble. The representation of actions as GADTs is especially close to freer, which takes much the same approach. Indeed, you can think of this technique as something that combines a freer-style representation with mtl-style classes. Given that freer already does this, you might ask yourself what the point is.

If you are already sold on free monads, monad-mock may very well be uninteresting to you. From the perspective of theoretical novelty, monad-mock is not anything new or different. However, there are a variety of practical reasons to prefer mtl over free, and it’s nice to see how easy it is to enjoy the testing benefits of free without too much extra effort.

An in-depth comparison between mtl and free is well outside the scope of this blog post. However, the key point is that this technique only affects test code, so the real runtime implementation will not be affected in any way. This means you can take advantage of the performance benefits and ecosystem support of mtl without sacrificing simple, expressive testing.

Conclusion

To cap things off, I want to emphasize monad-mock’s role as a single part of a larger initiative we’ve been making for the better part of the past eighteen months. Haskell is a language with ever-evolving techniques and style, and it’s sometimes dizzying to figure out how to use all the pieces together to develop robust, maintainable applications. While monad-mock might not be anything drastically different from existing testing techniques, my hope is that it can provide an opinionated mechanism to make testing easy and accessible, even for complex interactions with other services and systems.

I’ve made an effort to make it abundantly clear in this blog post that monad-mock is not a silver bullet to testing, and in fact, I would prefer other techniques for ensuring correctness whenever possible. Even so, mocking is a nice tool to have in your toolbox, and it’s a good fallback to get even the worst APIs under test coverage.

If you want to try out monad-mock for yourself, take a look at the documentation on Hackage and start playing around! It’s still early software, so it’s not the most proven or featureful, but we’ve managed to get mileage out of it already, all the same. If you find any problems, have a use case it does not support, or just find something about it unclear, please do not hesitate to open an issue on the GitHub repository—we obviously can’t fix issues we don’t know about.

Thanks as always to the many people who have contributed ideas that have shaped my philosophy and approach to testing and have helped provide the tools that make this library work. Happy testing!

Realizing Hackett, a metaprogrammable Haskell

2017-05-27T00:00:00Z

Almost five months ago, I wrote a blog post about my new programming language, Hackett, a fanciful sketch of a programming language from a far-off land with Haskell’s type system and Racket’s macros. At that point in time, I had a little prototype that barely worked, that I barely understood, and that was a little bit of a technical dead-end. People saw the post, they got excited, but development sort of stopped.

Then, almost two months ago, I took a second stab at the problem in earnest. I read a lot, I asked a lot of people for help, and eventually I got something sort of working. Suddenly, Hackett is not only real, it’s working, and you can try it out yourself!

A first look at Hackett

Hackett is still very new, very experimental, and an enormous work in progress. However, that doesn’t mean it’s useless! Hackett is already a remarkably capable programming language. Let’s take a quick tour.

As Racket law decrees it, every Hackett program must begin with #lang. We can start with the appropriate incantation:

#lang hackett

If you’re using DrRacket or racket-mode with background expansion enabled, then congratulations: the typechecker is online. We can begin by writing a well-typed, albeit boring program:

#lang hackett

(main (println "Hello, world!"))

In Hackett, a use of main at the top level indicates that running the module as a program should execute some IO action. In this case, println is a function of type {String -> (IO Unit)}. Just like Haskell, Hackett is pure, and the runtime will figure out how to actually run an IO value. If you run the above program, you will notice that it really does print out Hello, world!, exactly as we would like.

Of course, hello world programs are boring—so imperative! We are functional programmers, and we have our own class of equally boring programs we must write when learning a new language. How about some Fibonacci numbers?

#lang hackett

(def fibs : (List Integer)
  {0 :: 1 :: (zip-with + fibs (tail! fibs))})

(main (println (show (take 10 fibs))))

Again, Hackett is just like Haskell in that it is lazy, so we can construct an infinite list of Fibonacci numbers, and the runtime will happily do nothing at all. When we call take, we realize the first ten numbers in the list, and when you run the program, you should see them printed out, clear as day!
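For readers coming from Haskell, the same program can be sketched roughly as follows. This is only a point of comparison, not part of Hackett: the name firstTen is made up for the example, and drop 1 stands in for tail! so the definition stays total.

```haskell
-- The Haskell counterpart of the Hackett definition: an infinite,
-- lazily constructed list of Fibonacci numbers.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (drop 1 fibs)

-- Only these ten elements are ever forced.
firstTen :: [Integer]
firstTen = take 10 fibs
```

ghci> firstTen
[0,1,1,2,3,5,8,13,21,34]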

But these programs are boring. Printing strings and laziness may have been novel when you first learned about them, but if you’re reading this blog post, my bet is that you probably aren’t new to programming. How about something more interesting, like a web server?

#lang hackett

(require hackett/demo/web-server)

(data Greeting (greeting String))

(instance (->Body Greeting)
  [->body (λ [(greeting name)] {"Hello, " ++ name ++ "!"})])

(defserver run-server
  [GET "/"               -> String   => "Hello, world!"]
  [GET "greet" -> String -> Greeting => greeting])

(main (do (println "Running server on port 8080.")
          (run-server 8080)))

$ racket my-server.rkt
Running server on port 8080.
^Z
$ bg
$ curl 'http://localhost:8080/greet/Alexis'
Hello, Alexis!

Welcome to Hackett.

What is Hackett?

Excited yet? I hope so. I certainly am.

Before you get a little too excited, however, let me make a small disclaimer: the above program, while quite real, is a demo. It is certainly not a production web framework, and it actually just uses the Racket web server under the hood. It does not handle very many things right now. You cannot use it to build your super awesome webapp, and even if you could, I would not recommend attempting to do so.

All that said, it is a real tech demo, and it shows off the potential for Hackett to do some pretty cool things. While the server implementation is just reusing Racket’s dynamically typed web server, the Hackett interface to it is 100% statically typed, and the above example shows off a host of features:

Algebraic datatypes. Hackett has support for basic ADTs, including recursive datatypes (though not yet mutually recursive datatypes).
Typeclasses. The demo web server uses a ->Body typeclass to render server responses, and this module implements a ->Body instance for the custom Greeting datatype.
Macros. The defserver macro provides a concise, readable, type safe way to define a simple, RESTful web server. It defines two endpoints, a homepage and a greeting, and the latter parses a segment from the URL.
Static typechecking. Obviously. If you try and change the homepage endpoint to produce a number instead of a string, you will get a type error! Alternatively, try removing the ->Body instance and see what happens.
Infix operators. In Hackett, { curly braces } enter infix mode, which permits arbitrary infix operators. Most Lisps have variadic functions, so infix operators are not strictly necessary, but Hackett only supports curried, single-argument functions, so infix operators are some especially sweet sugar.
Pure, monadic I/O. The println and run-server functions both produce (IO Unit), and IO is a monad. do notation is provided as a macro, and it works with any type that implements the Monad typeclass.
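As a rough point of comparison, the typeclass portion of the demo corresponds to something like the following Haskell. The names ToBody, toBody, and greet are hypothetical stand-ins chosen for this sketch; they are not taken from the Hackett demo package or from any Haskell web framework.

```haskell
-- Hypothetical Haskell counterpart of Hackett's ->Body class:
-- anything that can be rendered as a response body.
class ToBody a where
  toBody :: a -> String

-- The custom datatype from the demo, with a single constructor.
newtype Greeting = Greeting String

instance ToBody Greeting where
  toBody (Greeting name) = "Hello, " ++ name ++ "!"

-- What the "greet" endpoint does with the URL segment it parses:
-- wrap it in a Greeting, then render that Greeting as a body.
greet :: String -> String
greet = toBody . Greeting
```

ghci> greet "Alexis"
"Hello, Alexis!"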

All these features are already implemented, and they really work! Of course, you might look at this list and be a little confused: sure, there are macros, but all these other things are firmly Haskellisms. If you thought that, you’d be quite right! Hackett is much closer to Haskell than Racket, even though it is syntactically a Lisp. Keep this guiding principle in mind as you read this blog post or explore Hackett. Where Haskell and Racket conflict, Hackett usually prefers Haskell.

For a bit more information about what Hackett is and what it aims to be, check out my blog post from a few months ago, back when Hackett was called Rascal. I won’t reiterate everything I said there, but I do want to give a bit of a status update, explain what I’ve been working on, and hopefully give you some idea about where Hackett is going.

The story so far, and getting to Hackett 0.1

In September of 2016, I attended (sixth RacketCon), where I saw a pretty incredible and extremely exciting talk about implementing type systems as macros. Finally, I could realize my dream of having an elegant Lisp with a safe, reliable macro system and a powerful, expressive type system! Unfortunately, reality ensued, and I remembered I didn’t actually know any type theory.

Therefore, in October, I started to learn about type systems, and I began to read through Pierce’s Types and Programming Languages, then tried to learn the things I would need to understand Haskell’s type system. I learned about Hindley-Milner and basic typeclasses, and I tried to apply these things to the Type Systems as Macros approach. Throughout October, I hacked and I hacked, and by the end of the month, I stood back and admired my handiwork!

…it sort of worked?

The trouble was that I found myself stuck. I wasn’t sure how to proceed. My language had bugs, programs sometimes did things I didn’t understand, the typechecker was clearly unsound, and there didn’t seem to be an obvious path forward. Other things in my life became distracting or difficult, and I didn’t have the energy to work on it anymore, so I stopped. I put Hackett (then Rascal) on the shelf for a couple months, only to finally return to it in late December.

At the beginning of January, I decided it would be helpful to be public about what I was working on, so I wrote a blog post! Feedback was positive, overwhelmingly so, and while it was certainly encouraging, I suddenly felt nervous about expectations I had not realized I was setting. Could I really build this? Did I have the knowledge or the time? At that point, I didn’t really, so work stalled.

Fortunately, in early April, some things started to become clear. I took another look at Hackett, and I knew I needed to reimplement it from the ground up. I also knew that I needed a different technique, but this time, I knew a bit more about where to find it. I got some help from Sam Tobin-Hochstadt and put together an implementation of Pierce and Turner’s Local Type Inference. Unfortunately, it didn’t really provide the amount of type inference I was looking for, but fortunately, implementing it helped me figure out how to understand the rather more complicated (though very impressive) Complete and Easy Bidirectional Typechecking for Higher-Rank Polymorphism. After that, things just sort of started falling into place:

First, I implemented the Complete and Easy paper in Haskell, including building a little parser and interpreter. That helped me actually understand the paper, and Haskell really is a rather wonderful language for doing such a thing.
Three days later, I ported the Haskell implementation to Racket, using (and somewhat abusing) the Type Systems as Macros techniques. It wasn’t the prettiest, but it seemed to work, and that was rather encouraging.
After that, however, I got a little stuck again, as I wasn’t sure how to generalize what I had. I was also incredibly busy with my day job, and I wasn’t able to really make progress for a few weeks. In early May, however, I decided to take a vacation for a week, and with some time to focus, I souped up the Haskell implementation with products and sums. This was progress!
The following day I managed to make similar changes to the Racket implementation, but rather than add anonymous products and sums, I added arbitrary type constructors.
A couple days later and with more than a bit of help from Phil Freeman, I rebranded the Racket implementation as Hackett, Mk II, and I started working towards turning it into a real programming language.

Less than three weeks later, and I have a programming language with everything from laziness and typeclasses to a tiny, proof-of-concept web server with editor support. The future of Hackett looks bright, and though there’s a lot of work left before I will be even remotely satisfied with it, I am excited and reassured that it already seems to be bearing some fruit.

So what’s left? Is Hackett ready for an initial release? Can you start writing programs in it today? Well, unfortunately, the answer is mostly no, at least if you want those programs to be at all reliable in a day or two. If everything looks so cheery, though, what’s left? What is Hackett still missing?

What Hackett still isn’t

I have a laundry list of features I want for Hackett. I want GADTs, indexed type families, newtype deriving, and a compiler that can target multiple backends. These things, however, are not essential. You can probably imagine writing useful software without any of them. Before I can try to tackle those, I first need to tackle some of the bits of the foundation that simply don’t exist yet (or have at least been badly neglected).

Fortunately, these things are not insurmountable, nor are they necessarily especially hard. They’re things like default class methods, static detection and prevention of orphan instances, exhaustiveness checking for pattern-matching, and a real kind system. That’s right—right now, Hackett’s type system is effectively dynamically typed, and even though you can write a higher-kinded type, there is no such thing as a “kind error”.

Other things are simply necessary quality of life improvements before Hackett can become truly usable. Type errors are currently rather atrocious, though they could certainly be worse. Additionally, typechecking currently just halts whenever it encounters a type error, and it makes no attempt to generate more than one type error at a time. Derivation of simple instances like Show and Eq is important, and it will also likely pave the way for a more general form of typeclass deriving (since it can most certainly be implemented via macros), so it’s uncharted territory that still needs to be explored.

Bits of plumbing are still exposed in places, whether it’s unexpected behavior when interoperating with Racket or errors sometimes reported in terms of internal forms. Local bindings are, if you can believe it, still entirely unimplemented, so let and letrec need to be written up. The standard library needs fleshing out, and certain bits of code need to be cleaned up and slotted into the right place.

Oh, and of course, the whole thing needs to be documented. That in and of itself is probably a pretty significant project, especially since there’s a good chance I’ll want to figure out how to best make use of Scribble for a language that’s a little bit different from Racket.

All in all, there’s a lot of work to be done! I am eager to make it happen, but I also work a full-time job, and I don’t have it in me to continue at the pace I’ve been working at for the past couple of weeks. Still, if you’re interested in the project, stay tuned and keep an eye on it—if all goes as planned, I hope to make it truly useful before too long.

Answering some questions

It’s possible that this blog post does not seem like much; after all, it’s not terribly long. However, if you’re anything like me, there’s a good chance you are interested enough to have some questions! Obviously, I cannot anticipate all your questions and answer them here in advance, but I will try my best.

Can I try Hackett?

Yes! With the caveat that it’s alpha software in every sense of the word: undocumented, not especially user friendly, and completely unstable. However, if you do want to give it a try, it isn’t difficult: just install Racket, then run raco pkg install hackett. Open DrRacket and write #lang hackett at the top of the module, then start playing around.

Also, note that the demo web server used in the example at the top of this blog post is not included when you install the hackett package. If you want to try that out, you’ll have to run raco pkg install hackett-demo to install the demo package as well.

Are there any examples of Hackett code?

Unfortunately, not a lot right now, aside from the tiny examples in this blog post. However, if you are already familiar with Haskell, the syntax likely won’t be hard to pick up. Reading the Hackett source code is not especially recommended, given that it is filled with implementation details. However, if you are interested, reading the module where most of the prelude is defined isn’t so bad. You can find it on GitHub here, or you can open the hackett/private/prim/base module on a local installation.

How can I learn more / ask questions about Hackett?

Feel free to ping me and ask me questions! I may not always be able to get back to you immediately, but if you hang around, I will eventually send you a response. The best ways to contact me are via the #racket IRC channel on Freenode, the snek Slack community (which you can sign up for here), sending me a DM on Twitter, opening an issue on the GitHub repo, or even just sending me an email (though I’m usually a bit slower to respond to the latter).

How can I help?

Probably the easiest way to help out is to try Hackett for yourself and report any bugs or infelicities you run into. Of course, many issues right now are known; there’s just so much to do that I haven’t had the chance to clean everything up. For that reason, the most effective way to contribute is probably to pick an existing issue and try to tackle it yourself, but I wouldn’t be surprised if most people found the existing implementation a little intimidating.

If you are interested in helping out, I’d be happy to give you some pointers and answer some questions, since it would be extremely nice to have some help. Please feel free to contact me using any of the methods mentioned in the previous section, and I’ll try and help you find something you could work on.

How does Hackett compare to X / why doesn’t Hackett support Y?

These tend to be complex questions, and I don’t always have comprehensive answers for them, especially since the language is evolving so quickly. Still, if you want to ask me about this, feel free to just send the question to me directly. In my experience, it’s usually better to have a conversation about this sort of thing rather than just answering in one big comparison, since there’s usually a fair amount of nuance.

When will Hackett be ready for me to use?

I don’t know.

Obviously, there is a lot left to implement, that is certainly true, but there’s more to it than that. If all goes well, I don’t see any reason why Hackett can’t be early beta quality by the end of this year, even if it doesn’t support all of the goodies necessary to achieve perfection (which, of course, it never really can).

However, there are other things to consider, too. The Racket package system is currently flawed in ways that make rapidly iterating on Hackett hard, since it is extremely difficult (if not impossible) to make backwards-incompatible changes without potentially breaking someone’s program (even if they don’t update anything about their dependencies)! This is a solvable problem, but it would take some work modifying various elements of the package system and build tools, so that might need to get done before I can recommend Hackett in good faith.

Appendix

It would be unfair not to mention all the people that have made Hackett possible. I cannot list them all here, but I want to give special thanks to Stephen Chang, Joshua Dunfield, Robby Findler, Matthew Flatt, Phil Freeman, Ben Greenman, Alex Knauth, Neelakantan Krishnaswami, and Sam Tobin-Hochstadt. I’d also like to thank everyone involved in the Racket and Haskell projects as a whole, as well as everyone who has expressed interest and encouragement about what I’ve been working on.

As a final point, just for fun, I thought I’d keep track of all the albums I’ve been listening to while working on Hackett, just in the past few weeks. It is on theme with the name, after all. This list is not completely exhaustive, as I’m sure some slipped through the cracks, but you can thank the following artists for helping me power through a few of the hills in Hackett’s implementation:

The Beach Boys — Pet Sounds
Boards of Canada — Music Has The Right To Children, Geogaddi
Bruce Springsteen — Born to Run
King Crimson — In the Court of the Crimson King, Larks’ Tongues in Aspic, Starless and Bible Black, Red, Discipline
Genesis — Nursery Cryme, Foxtrot, Selling England by the Pound, The Lamb Lies Down on Broadway, A Trick of the Tail
Mahavishnu Orchestra — Birds of Fire
Metric — Fantasies, Synthetica, Pagans in Vegas
Muse — Origin of Symmetry, Absolution, The Resistance
Peter Gabriel — Peter Gabriel I, II, III, IV / Security, Us, Up
Pink Floyd — Wish You Were Here
Supertramp — Breakfast In America
The Protomen — The Protomen, Act II: The Father of Death
Talking Heads — Talking Heads: 77, More Songs About Buildings and Food, Fear of Music, Remain in Light
Yes — Fragile, Relayer, Going For The One

And of course, Voyage of the Acolyte, by Steve Hackett.

Lifts for free: making mtl typeclasses derivable

2017-04-28T00:00:00Z

Perhaps the most important abstraction a Haskell programmer must understand to effectively write modern Haskell code, beyond the level of the monad, is the monad transformer, a way to compose monads together in a limited fashion. One frustrating downside to monad transformers is a proliferation of lifts, which explicitly indicate which monad in a transformer “stack” a particular computation should run in. Fortunately, the venerable mtl provides typeclasses that make this lifting mostly automatic, using typeclass machinery to insert lift where appropriate.

Less fortunately, the mtl approach does not actually eliminate lift entirely; it simply moves it from use sites to instances. This requires a small zoo of extraordinarily boilerplate-y instances, most of which simply implement each typeclass method using lift. While we cannot eliminate the instances entirely without somewhat dangerous techniques like overlapping instances, we can automatically derive them using features of modern GHC, eliminating the truly unnecessary boilerplate.

The problem with mtl-style typeclasses

To understand exactly what problem we’re trying to solve, we first need to take a look at an actual mtl-style typeclass. I am going to start with an mtl-style typeclass, rather than an actual typeclass in the mtl, due to slight complications with mtl’s actual typeclasses that we’ll get into later. Let’s begin with a somewhat boring typeclass, which we’ll call MonadExit:

import System.Exit (ExitCode (..))

class Monad m => MonadExit m where
  exitWith :: ExitCode -> m ()

This is a simple typeclass that abstracts over the concept of early exit, given an exit code. The most obvious implementation of this typeclass is over IO, which will actually exit the program:

import qualified System.Exit as IO (exitWith)

instance MonadExit IO where
  exitWith = IO.exitWith

One of the cool things about these typeclasses, though, is that we don’t have to have just one implementation. We could also write a pure implementation of MonadExit, which would simply short-circuit the current computation and return the ExitCode:

instance MonadExit (Either ExitCode) where
  exitWith = Left
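To see the pure instance in action, here is a small self-contained sketch. The checkAge function and the exit code it uses are made up for illustration; the class and the Either instance are the ones defined above.

```haskell
import Control.Monad (when)
import System.Exit (ExitCode (..))

class Monad m => MonadExit m where
  exitWith :: ExitCode -> m ()

-- The pure implementation: exiting short-circuits with the exit code.
instance MonadExit (Either ExitCode) where
  exitWith = Left

-- A hypothetical computation that is polymorphic in its monad,
-- so it can run against IO or against the pure instance in a test.
checkAge :: MonadExit m => Int -> m String
checkAge age = do
  when (age < 0) $ exitWith (ExitFailure 2)
  pure ("age is " ++ show age)
```

ghci> checkAge 30 :: Either ExitCode String
Right "age is 30"
ghci> checkAge (-1) :: Either ExitCode String
Left (ExitFailure 2)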

Instead of simply having an instance on a concrete monad, though, we probably want to be able to use this in a larger monad stack, so we can define an ExitT monad transformer that can be inserted into any monad transformer stack:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.Trans (MonadTrans, lift)

newtype ExitT m a = ExitT (ExceptT ExitCode m a)
  deriving (Functor, Applicative, Monad, MonadTrans)

runExitT :: ExitT m a -> m (Either ExitCode a)
runExitT (ExitT x) = runExceptT x

instance Monad m => MonadExit (ExitT m) where
  exitWith = ExitT . throwError

With this in place, we can write actual programs using our ExitT monad transformer:

ghci> runExitT $ do
        lift $ putStrLn "hello"
        exitWith (ExitFailure 1)
        lift $ putStrLn "world"
hello
Left (ExitFailure 1)

This is pretty cool! Unfortunately, experienced readers will see the rather large problem with what we have so far. Specifically, it won’t actually work if we try and wrap ExitT in another monad transformer:

ghci> logIn password = runExitT $ flip runReaderT password $ do
        password <- ask
        unless (password == "password1234") $ -- super secure password
          exitWith (ExitFailure 1)
        return "access granted"

ghci> logIn "not the right password"
<interactive>: error:
    • No instance for (MonadExit (ReaderT [Char] (ExitT m0)))
        arising from a use of ‘it’
    • In a stmt of an interactive GHCi command: print it

The error message is relatively self-explanatory if you are familiar with mtl error messages: there is no MonadExit instance for ReaderT. This makes sense, since we only defined a MonadExit instance for ExitT, nothing else. Fortunately, the instance for ReaderT is completely trivial, since we just need to use lift to delegate to the next monad in the stack:

instance MonadExit m => MonadExit (ReaderT r m) where
  exitWith = lift . exitWith

Now that the delegating instance is set up, we can actually use our logIn function:

ghci> logIn "not the right password"
Left (ExitFailure 1)
ghci> logIn "password1234"
Right "access granted"
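For reference, the pieces so far can be assembled into a single module along the following lines. This is a sketch under a couple of assumptions: the stack is run over Identity so that logIn becomes a pure function, and the bound variable is renamed to supplied to avoid shadowing the argument. Everything else mirrors the definitions above.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad (unless)
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.Class (MonadTrans, lift)
import Data.Functor.Identity (runIdentity)
import System.Exit (ExitCode (..))

class Monad m => MonadExit m where
  exitWith :: ExitCode -> m ()

newtype ExitT m a = ExitT (ExceptT ExitCode m a)
  deriving (Functor, Applicative, Monad, MonadTrans)

runExitT :: ExitT m a -> m (Either ExitCode a)
runExitT (ExitT x) = runExceptT x

-- The "real" instance: exiting throws in the underlying ExceptT.
instance Monad m => MonadExit (ExitT m) where
  exitWith = ExitT . throwError

-- The delegating instance: defer to the next monad in the stack.
instance MonadExit m => MonadExit (ReaderT r m) where
  exitWith = lift . exitWith

-- Assumption for this sketch: run over Identity to get a pure result.
logIn :: String -> Either ExitCode String
logIn password = runIdentity . runExitT $ flip runReaderT password $ do
  supplied <- ask
  unless (supplied == "password1234") $ -- super secure password
    exitWith (ExitFailure 1)
  return "access granted"
```

Loading this module into GHCi should reproduce the session shown above.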

An embarrassment of instances

We’ve managed to make our program work properly now, but we’ve still only defined the delegating instance for ReaderT. What if someone wants to use ExitT with WriterT? Or StateT? Or any of ExceptT, RWST, or ContT? Well, we have to define instances for each and every one of them, and as it turns out, the instances are all identical!

instance (MonadExit m, Monoid w) => MonadExit (WriterT w m) where
  exitWith = lift . exitWith

instance MonadExit m => MonadExit (StateT s m) where
  exitWith = lift . exitWith

instance (MonadExit m, Monoid w) => MonadExit (RWST r w s m) where
  exitWith = lift . exitWith

instance MonadExit m => MonadExit (ExceptT e m) where
  exitWith = lift . exitWith

instance MonadExit m => MonadExit (ContT r m) where
  exitWith = lift . exitWith

This is bad enough on its own, but this is actually the simplest case: a typeclass with a single method which is trivially lifted through any other monad transformer. Another thing we’ve glossed over is actually defining all the delegating instances for the other mtl typeclasses on ExitT itself. Fortunately, we can derive these ones with GeneralizedNewtypeDeriving, since ExceptT has already done most of the work for us:

newtype ExitT m a = ExitT (ExceptT ExitCode m a)
  deriving ( Functor, Applicative, Monad, MonadIO -- base
           , MonadBase IO -- transformers-base
           , MonadTrans, MonadReader r, MonadWriter w, MonadState s -- mtl
           , MonadThrow, MonadCatch, MonadMask -- exceptions
           , MonadTransControl, MonadBaseControl IO -- monad-control
           )

Unfortunately, we have to write the MonadError instance manually if we want it, since we don’t want to pick up the instance from ExceptT, but rather wish to defer to the underlying monad. This means writing some truly horrid delegation code:

instance MonadError e m => MonadError e (ExitT m) where
  throwError = lift . throwError

  catchError (ExitT x) f = ExitT . ExceptT $ catchError (runExceptT x) $ \e ->
    let (ExitT x') = f e in runExceptT x'

(Notably, this is so awful because catchError is more complex than the simple exitWith method we’ve studied so far, which is why we’re starting with a simpler typeclass. We’ll get more into this later, as promised.)

This huge number of instances is sometimes referred to as the “n² instances” problem, since it requires every monad transformer have an instance of every single mtl-style typeclass. Fortunately, in practice, this proliferation is often less horrible than it might seem, mostly because deriving helps a lot. However, remember that if ExitT weren’t a simple wrapper around an existing monad transformer, we wouldn’t be able to derive the instances at all! Instead, we’d have to write them all out by hand, just like we did with all the MonadExit instances.

It’s a shame that these typeclass instances can’t be derived in a more general way, allowing derivation for arbitrary monad transformers instead of simply requiring the newtype deriving machinery. As it turns out, with clever use of modern GHC features, we actually can. It’s not even all that hard.

Default instances with default signatures

It’s not hard to see that our MonadExit instances are all exactly the same: just lift . exitWith. Why is that, though? Well, every instance is an instance on a monad transformer over a monad that is already an instance of MonadExit. In fact, we can express this in a type signature, and we can extract lift . exitWith into a separate function:

defaultExitWith :: (MonadTrans t, MonadExit m) => ExitCode -> t m ()
defaultExitWith = lift . exitWith

Writing defaultExitWith really isn’t any easier than writing lift . exitWith, so this deduplication doesn’t buy us anything on its own. However, it does indicate that we could write a default implementation of exitWith if we could require just a little bit more from the implementing type. With GHC’s DefaultSignatures extension, we can do precisely that.

The idea is that we can write a separate type signature for a default implementation of exitWith, which can be more specific than the type signature for exitWith in general. This allows us to use our defaultExitWith implementation more or less directly:

{-# LANGUAGE DefaultSignatures #-}

class Monad m => MonadExit m where
  exitWith :: ExitCode -> m ()

  default exitWith :: (MonadTrans t, MonadExit m1) => ExitCode -> t m1 ()
  exitWith = lift . exitWith

We have to use m1 instead of m, since type variables in the class head are always scoped, and the names would conflict. However, this creates another problem, since our specialized type signature replaces m with t m1, which won’t quite work (as GHC can’t automatically figure out they should be the same). Instead, we can use m in the type signature, then just add a type equality constraint ensuring that m and t m1 must be the same type:

class Monad m => MonadExit m where
  exitWith :: ExitCode -> m ()

  default exitWith :: (MonadTrans t, MonadExit m1, m ~ t m1) => ExitCode -> m ()
  exitWith = lift . exitWith

Now we can write all of our simple instances without even needing to write a real implementation! All of the instance bodies can be empty:

instance MonadExit m => MonadExit (ReaderT r m)
instance (MonadExit m, Monoid w) => MonadExit (WriterT w m)
instance MonadExit m => MonadExit (StateT s m)
instance (MonadExit m, Monoid w) => MonadExit (RWST r w s m)
instance MonadExit m => MonadExit (ExceptT e m)
instance MonadExit m => MonadExit (ContT r m)
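Here is a compact, self-contained sketch that exercises the default signature through a two-layer stack. The tick function is invented for the example, and only the ReaderT and StateT instances are included to keep it short; the TypeFamilies and TypeOperators pragmas are there so the equality constraint is accepted across GHC versions.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Control.Monad (when)
import Control.Monad.Trans.Class (MonadTrans, lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State (StateT, evalStateT, get, modify)
import System.Exit (ExitCode (..))

class Monad m => MonadExit m where
  exitWith :: ExitCode -> m ()

  -- Used whenever an instance leaves exitWith undefined and the
  -- instance type is a transformer over another MonadExit monad.
  default exitWith :: (MonadTrans t, MonadExit m1, m ~ t m1) => ExitCode -> m ()
  exitWith = lift . exitWith

-- The base of the stack still needs a real implementation.
instance MonadExit (Either ExitCode) where
  exitWith = Left

-- Empty bodies: both instances pick up the default.
instance MonadExit m => MonadExit (ReaderT r m)
instance MonadExit m => MonadExit (StateT s m)

-- Hypothetical example: bump a counter, exiting once it passes the limit.
tick :: ReaderT Int (StateT Int (Either ExitCode)) Int
tick = do
  limit <- ask
  lift (modify (+ 1))
  count <- lift get
  when (count > limit) $ exitWith (ExitFailure count)
  pure count
```

ghci> evalStateT (runReaderT tick 5) 0
Right 1
ghci> evalStateT (runReaderT tick 0) 0
Left (ExitFailure 1)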

While this doesn’t completely alleviate the pain of writing instances, it’s definitely an improvement over what we had before. With GHC 8.2’s new DerivingStrategies extension, it becomes especially beneficial when defining entirely new transformers that should also have MonadExit instances, since they can be derived with DeriveAnyClass:

newtype ParserT m a = ParserT (Text -> m (Maybe (Text, a)))
  deriving anyclass (MonadExit)

This is pretty wonderful.

Given that only MonadExit supports being derived in this way, we sadly still need to implement the other, more standard mtl-style typeclasses ourselves, like MonadIO, MonadBase, MonadReader, MonadWriter, etc. However, what if all of those classes provided the same convenient default signatures that our MonadExit does? If that were the case, then we could write something like this:

newtype ParserT m a = ParserT (Text -> m (Maybe (Text, a)))
  deriving anyclass ( MonadIO, MonadBase b
                    , MonadReader r, MonadWriter w, MonadState s
                    , MonadThrow, MonadCatch, MonadMask
                    , MonadExit
                    )

Compared to having to write all those instances by hand, this would be a pretty enormous difference. Unfortunately, many of these typeclasses are not quite as simple as our MonadExit, and we’d have to be a bit more clever to make them derivable.

Making mtl’s classes derivable

Our MonadExit class was extremely simple, since it only had a single method with a particularly simple type signature. For reference, this was the type of our generic exitWith:

exitWith :: MonadExit m => ExitCode -> m ()

Let’s now turn our attention to MonadReader. At first blush, this typeclass should not be any trickier to implement than MonadExit, since the types of ask and reader are both quite simple:

ask :: MonadReader r m => m r
reader :: MonadReader r m => (r -> a) -> m a

However, the type of the other method, local, throws a bit of a wrench in our plans. It has the following type signature:

local :: MonadReader r m => (r -> r) -> m a -> m a

Why is this so much more complicated? Well, the key is in the second argument, which has the type m a. That’s not something that can be simply lifted away! Try it yourself: try to write a MonadReader instance for some monad transformer. It’s not as easy as it looks!

We can illustrate the problem by creating our own version of MonadReader and implementing it for something like ExceptT ourselves. We can start with the trivial methods first:

class Monad m => MonadReader r m | m -> r where
  ask :: m r
  local :: (r -> r) -> m a -> m a
  reader :: (r -> a) -> m a

instance MonadReader r m => MonadReader r (ExceptT e m) where
  ask = lift ask
  reader = lift . reader

However, implementing local is harder. Let’s specialize the type signature to ExceptT to make it more clear why:

local :: MonadReader r m => (r -> r) -> ExceptT e m a -> ExceptT e m a

Our base monad, m, implements local, but we have to convert the first argument from ExceptT e m a into m (Either e a) first, run it through local in m, then wrap it back up in ExceptT:

instance MonadReader r m => MonadReader r (ExceptT e m) where
  ask = lift ask
  reader = lift . reader
  local f x = ExceptT $ local f (runExceptT x)

This operation is actually a mapping operation of sorts, since we’re mapping local f over x. For that reason, this can be rewritten using the mapExceptT function exported by Control.Monad.Except:

instance MonadReader r m => MonadReader r (ExceptT e m) where
  ask = lift ask
  reader = lift . reader
  local = mapExceptT . local

If you implement MonadReader instances for other transformers, like StateT and WriterT, you’ll find that the instances are exactly the same except for mapExceptT, which is replaced with mapStateT and mapWriterT, respectively. This is sort of obnoxious, given that we want to figure out how to create a generic version of local that works with any monad transformer, but this requires concrete information about which transformer we’re in. Obviously, the power MonadTrans gives us is not enough to make this generic. Fortunately, there is a typeclass that does give us enough: MonadTransControl from the monad-control package.

Using MonadTransControl, we can write a generic mapT function that maps over an arbitrary monad transformer with a MonadTransControl instance:

mapT :: (Monad m, Monad (t m), MonadTransControl t)
     => (m (StT t a) -> m (StT t b))
     -> t m a
     -> t m b
mapT f x = liftWith (\run -> f (run x)) >>= restoreT . return

This type signature may look complicated (and, well, it is), but the idea is that the StT associated type family encapsulates the monadic state that t introduces. For example, for ExceptT, StT (ExceptT e) a is Either e a. For StateT, StT (StateT s) a is (a, s). Some transformers, like ReaderT, have no state, so StT (ReaderT r) a is just a.
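To make StT and mapT a little more concrete, here is a self-contained sketch. It assumes a cut-down, locally defined copy of MonadTransControl rather than the real class from monad-control, so it only needs base and transformers. The helper localFn and the demo values are hypothetical names for this illustration. The base monad is the plain function monad (->) Int, which plays the role of a reader:

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeFamilies #-}

import Control.Monad.Trans.Class (MonadTrans)
import Control.Monad.Trans.Except (ExceptT (..), runExceptT)
import Control.Monad.Trans.State (StateT (..))
import Data.Kind (Type)

-- Simplified local stand-in for the class from monad-control.
type Run t = forall n b. Monad n => t n b -> n (StT t b)

class MonadTrans t => MonadTransControl t where
  type StT t a :: Type
  liftWith :: Monad m => (Run t -> m a) -> t m a
  restoreT :: Monad m => m (StT t a) -> t m a

instance MonadTransControl (ExceptT e) where
  type StT (ExceptT e) a = Either e a
  liftWith f = ExceptT (fmap Right (f runExceptT))
  restoreT = ExceptT

instance MonadTransControl (StateT s) where
  type StT (StateT s) a = (a, s)
  liftWith f = StateT (\s -> fmap (\x -> (x, s)) (f (\t -> runStateT t s)))
  restoreT = StateT . const

mapT :: (Monad m, Monad (t m), MonadTransControl t)
     => (m (StT t a) -> m (StT t b)) -> t m a -> t m b
mapT f x = liftWith (\run -> f (run x)) >>= restoreT . return

-- Hypothetical analogue of local for the function monad.
localFn :: (r -> r) -> (r -> a) -> (r -> a)
localFn f m = m . f

-- Fails unless the environment is positive.
checkPositive :: ExceptT String ((->) Int) Int
checkPositive = ExceptT (\r -> if r > 0 then Right r else Left "non-positive")

-- Returns the environment and adds it to the state.
tick :: StateT Int ((->) Int) Int
tick = StateT (\s r -> (r, s + r))

-- Each pair is (unmodified run, run with the environment mapped).
demoExcept :: (Either String Int, Either String Int)
demoExcept = (runExceptT checkPositive 0, runExceptT (mapT (localFn (+ 1)) checkPositive) 0)

demoState :: ((Int, Int), (Int, Int))
demoState = (runStateT tick 100 5, runStateT (mapT (localFn (* 2)) tick) 100 5)
```

The same mapT (localFn f) call works for both transformers even though their StT types differ, which is the property a generic local needs.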

I will not go into the precise mechanics of how MonadTransControl works in this blog post, but it doesn’t matter significantly; the point is that we can now use mapT to create a generic implementation of local for use with DefaultSignatures:

class Monad m => MonadReader r m | m -> r where
  ask :: m r
  default ask :: (MonadTrans t, MonadReader r m1, m ~ t m1) => m r
  ask = lift ask

  local :: (r -> r) -> m a -> m a
  default local :: (MonadTransControl t, MonadReader r m1, m ~ t m1) => (r -> r) -> m a -> m a
  local = mapT . local

  reader :: (r -> a) -> m a
  reader f = f <$> ask

Once more, we now get instances of our typeclass, in this case MonadReader, for free:

instance MonadReader r m => MonadReader r (ExceptT e m)
instance (MonadReader r m, Monoid w) => MonadReader r (WriterT w m)
instance MonadReader r m => MonadReader r (StateT s m)

It’s also worth noting that we don’t get a ContT instance for free, even though ContT has a MonadReader instance in mtl. Unlike the other monad transformers mtl provides, ContT does not have a MonadTransControl instance because it cannot be generally mapped over. While a mapContT function does exist, its signature is more restricted:

mapContT :: (m r -> m r) -> ContT r m a -> ContT r m a

It happens that local can still be implemented for ContT, so it can still have a MonadReader instance, but it cannot be derived in the same way as it can for the other transformers. Still, in practice, I’ve found that most user-defined transformers do not have such complex control flow, so they can safely be instances of MonadTransControl, and they get this deriving for free.
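For the curious, one way local can be written for ContT is with the liftLocal function from Control.Monad.Trans.Cont in transformers. The sketch below specializes the base monad to Reader so that it stays self-contained; localCont, prog, and demoCont are hypothetical names for this illustration, not part of any library:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (ContT, evalContT, liftLocal)
import Control.Monad.Trans.Reader (Reader, ask, local, runReader)

-- local for ContT over Reader, built from the base monad's ask and local.
localCont :: (r' -> r') -> ContT r (Reader r') a -> ContT r (Reader r') a
localCont = liftLocal ask local

-- Read the environment inside a modified scope, then again outside it.
prog :: ContT (Int, Int) (Reader Int) (Int, Int)
prog = do
  inner <- localCont (+ 10) (lift ask)
  outer <- lift ask
  pure (inner, outer)

demoCont :: (Int, Int)
demoCont = runReader (evalContT prog) 1
```

Here demoCont should evaluate to (11, 1): the modified environment is visible only inside the localCont block, because liftLocal restores the original environment before running the continuation. That extra bookkeeping is exactly what the generic mapT-based default cannot express.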

Extending this technique to other mtl typeclasses

The default instances for the other mtl typeclasses are slightly different from the one for MonadReader, but for the most part, the same general technique applies. Here’s a derivable MonadError:

class Monad m => MonadError e m | m -> e where
  throwError :: e -> m a
  default throwError :: (MonadTrans t, MonadError e m1, m ~ t m1) => e -> m a
  throwError = lift . throwError

  catchError :: m a -> (e -> m a) -> m a
  default catchError :: (MonadTransControl t, MonadError e m1, m ~ t m1) => m a -> (e -> m a) -> m a
  catchError x f = liftWith (\run -> catchError (run x) (run . f)) >>= restoreT . return

instance MonadError e m => MonadError e (ReaderT r m)
instance (MonadError e m, Monoid w) => MonadError e (WriterT w m)
instance MonadError e m => MonadError e (StateT s m)
instance (MonadError e m, Monoid w) => MonadError e (RWST r w s m)

The MonadState interface turns out to be extremely simple, so it doesn’t even need MonadTransControl at all:

class Monad m => MonadState s m | m -> s where
  get :: m s
  default get :: (MonadTrans t, MonadState s m1, m ~ t m1) => m s
  get = lift get

  put :: s -> m ()
  default put :: (MonadTrans t, MonadState s m1, m ~ t m1) => s -> m ()
  put = lift . put

  state :: (s -> (a, s)) -> m a
  state f = do
    s <- get
    let (a, s') = f s
    put s'
    return a

instance MonadState s m => MonadState s (ExceptT e m)
instance MonadState s m => MonadState s (ReaderT r m)
instance (MonadState s m, Monoid w) => MonadState s (WriterT w m)

Everything seems to be going well! However, not everything is quite so simple.

A `MonadWriter` diversion

Unexpectedly, MonadWriter turns out to be by far the trickiest of the bunch. It’s not too hard to create default implementations for most of the methods of the typeclass:

class (Monoid w, Monad m) => MonadWriter w m | m -> w where
  writer :: (a, w) -> m a
  default writer :: (MonadTrans t, MonadWriter w m1, m ~ t m1) => (a, w) -> m a
  writer = lift . writer

  tell :: w -> m ()
  default tell :: (MonadTrans t, MonadWriter w m1, m ~ t m1) => w -> m ()
  tell = lift . tell

  listen :: m a -> m (a, w)
  default listen :: (MonadTransControl t, MonadWriter w m1, m ~ t m1) => m a -> m (a, w)
  listen x = do
    (y, w) <- liftWith (\run -> listen (run x))
    y' <- restoreT (return y)
    return (y', w)

However, MonadWriter has a fourth method, pass, which has a particularly tricky type signature:

pass :: m (a, w -> w) -> m a

As far as I can tell, this is not possible to generalize using MonadTransControl alone, since it would require inspection of the result of the monadic argument (that is, it would require a function of type StT t (a, b) -> (StT t a, b)), which is not possible in general. My gut is that this could likely also be generalized with a slightly more powerful abstraction than MonadTransControl, but it is not immediately obvious to me what that abstraction should be.

One extremely simple way to make this possible would be to design something to serve this specific use case:

type RunSplit t = forall m a b. Monad m => t m (a, b) -> m (StT t a, Maybe b)
class MonadTransControl t => MonadTransSplit t where
  liftWithSplit :: Monad m => (RunSplit t -> m a) -> t m a

Instances of MonadTransSplit would basically just provide a way to pull out bits of the result, if possible:

instance MonadTransSplit (ReaderT r) where
  liftWithSplit f = liftWith $ \run -> f (fmap split . run)
    where split (x, y) = (x, Just y)

instance MonadTransSplit (ExceptT e) where
  liftWithSplit f = liftWith $ \run -> f (fmap split . run)
    where split (Left e) = (Left e, Nothing)
          split (Right (x, y)) = (Right x, Just y)

instance MonadTransSplit (StateT s) where
  liftWithSplit f = liftWith $ \run -> f (fmap split . run)
    where split ((x, y), s) = ((x, s), Just y)

Then, using this, it would be possible to write a generic version of pass:

default pass :: (MonadTransSplit t, MonadWriter w m1, m ~ t m1) => m (a, w -> w) -> m a
pass m = do
  r <- liftWithSplit $ \run -> pass $ run m >>= \case
    (x, Just f) -> return (x, f)
    (x, Nothing) -> return (x, id)
  restoreT (return r)

However, this seems pretty overkill for just one particular method, given that I have no idea if MonadTransSplit would be useful anywhere else. One interesting thing about going down this rabbit hole, though, is that I learned that pass has some somewhat surprising behavior when mixed with transformers like ExceptT or MaybeT, if you don’t carefully consider how it works. It’s a strange method with a somewhat strange interface, so I don’t think I have a satisfactory conclusion about MonadWriter yet.

Regrouping and stepping back

Alright, that was a lot of fairly intense, potentially confusing code. What the heck did we actually accomplish? Well, we got a couple of things:

First, we developed a technique for writing simple mtl-style typeclasses that are derivable using DeriveAnyClass (or simply writing an empty instance declaration). We used a MonadExit class as a proof of concept, but really, the technique is applicable to most mtl-style typeclasses that represent simple effects (including, for example, MonadIO).
This technique is useful in isolation, even if you completely disregard the rest of the blog post. For an example where I recently applied it in real code, see the default signatures provided with MonadPersist from the monad-persist library, which make defining instances completely trivial. If you use mtl-style typeclasses in your own application to model effects, I don’t see much of a reason not to use this technique.
After MonadExit, we applied the same technique to the mtl-provided typeclasses MonadReader, MonadError, and MonadState. These are a bit trickier, since the first two need MonadTransControl in addition to the usual MonadTrans.
Whether or not this sort of thing should actually be added to mtl itself probably remains to be seen. For the simplest typeclass, MonadState, it seems like there probably aren’t many downsides, but given the difficulty implementing it for MonadWriter (or, heaven forbid, MonadCont, which I didn’t even seriously take a look at for this blog post), it doesn’t seem like an obvious win. Consistency is important.
Another downside that I sort of glossed over is possibly even more significant from a practical point of view: adding default signatures to MonadReader would require the removal of the default implementation of ask that is provided by the existing library (which implements ask in terms of reader). This would be backwards-incompatible, so it’d be difficult to change, even if people wanted to do it. Still, it’s interesting to consider what these typeclasses might look like if they were designed today.

Overall, these techniques are not a silver bullet for deriving mtl-style typeclasses, nor do they eliminate the n² instances problem that mtl style suffers from. That said, they do significantly reduce boilerplate and clutter in the simplest cases, and they demonstrate how modern Haskell’s hierarchy of typeclasses provides a lot of power, both to describe quite abstract concepts and to alleviate the need to write code by hand.

I will continue to experiment with the ideas described in this blog post, and I’m sure some more pros and cons will surface as I explore the design space. If you have any suggestions for how to deal with “the MonadWriter problem”, I’d be very interested to hear them! In the meantime, consider using the technique in your application code when writing effectful, monadic typeclasses.

Rascal is now Hackett, plus some answers to questions

2017-01-05T00:00:00Z

Since I published my blog post introducing Rascal, I’ve gotten some amazing feedback, more than I had ever anticipated! One of the things that was pointed out, though, is that Rascal is a language that already exists. Given that the name “Rascal” came from a mixture of “Racket” and “Haskell”, I always had an alternative name planned, and that’s “Hackett”. So, to avoid confusion as much as possible, Rascal is now known as Hackett.

With that out of the way, I also want to answer some of the other questions I received, both to hopefully clear up some confusion and to have something I can point to if I get the same questions in the future.

What’s in a name?

First, a little trivia.

I’ve already mentioned that the old “Rascal” name was based on the names “Racket” and “Haskell”, which is true. However, it had a slightly deeper meaning, too: the name fit a tradition of naming languages in the Scheme family after somewhat nefarious things, such as “Gambit”, “Guile”, “Larceny”, and “Racket” itself. The tradition goes back a little bit further to the Planner programming language; Scheme was originally called Schemer, but it was (no joke) shortened due to filename length restrictions.

Still, my language isn’t really a Scheme, so the weak connection wasn’t terribly relevant. Curious readers might be wondering if there’s any deeper meaning to the name “Hackett” than a mixture of the two language names. In fact, there is. Hackett is affectionately named after the Genesis progressive rock guitarist, Steve Hackett, one of my favorite musicians. The fact that the name is a homophone with “hack-it” is another convenient coincidence.

Perhaps not the most interesting thing in this blog post, but there it is.

Why Racket? Why not Haskell?

One of the most common questions I received is why I used Racket as the implementation language instead of Haskell. This is a decent question, and I think it likely stems at least in part from an item of common confusion: Racket is actually two things, a programming language and a programming language platform. The fact that the two things have the same name is probably not ideal, but it’s what we’ve got.

Racket-the-language is obviously the primary language used on the Racket platform, but there’s actually surprisingly little need for that to be the case; it’s simply the language that is worked on the most. Much of the Racket tooling, including the compiler, macroexpander, and IDE, is actually totally language agnostic. If someone came along and wrote a language that got more popular than #lang racket, then there wouldn’t really be anything hardcoded into any existing tooling that would give the impression that #lang racket was ever the more “dominant” language, aside from the name.

For this reason, Racket is ideal for implementing new programming languages, more so than pretty much any other platform out there. The talk I linked to in the previous blog post, Languages in an Afternoon, describes this unique capability. It’s short, only ~15 minutes, but if you’re not into videos, I can try and explain why Racket is so brilliant for this sort of thing.

By leveraging the Racket platform instead of implementing my language from scratch, I get the following things pretty much for free:

I get a JIT compiler for my code, and I don’t have to implement a compiler myself.
I also get a package manager that can cooperate with Hackett code to deliver Hackett modules.
I get a documentation system that is fully indexed and automatically locally installed when you install Hackett or any package written in Hackett, and that documentation is automatically integrated with the editor.
The DrRacket IDE can be used out of the box with Hackett code: it automatically does syntax highlighting and indenting, and it even provides interactive tools for inspecting bindings (something that I demo in my aforementioned talk).
If you don’t want to use DrRacket, you can use the racket-mode major mode for Emacs, which uses the same sets of tools that DrRacket uses under the hood, so you get most of the same DrRacket goodies without sacrificing Emacs’s power of customization.

Reimplementing all of that in another language would take years of work, and I haven’t even mentioned Racket’s module system and macroexpander, which are the underpinnings of Hackett. GHC’s typechecker is likely roughly as complex as Racket’s macroexpander combined with its module system, but I am not currently implementing GHC’s typechecker, since I do not need all of OutsideIn(X)’s features, just Haskell 98 + some extensions.

In contrast, I truly do need all of the Racket macroexpander to implement Hackett, since the Type Systems as Macros paper uses pretty much every trick the Racket macro system has to offer to implement typechecking as macroexpansion. For those reasons, implementing the Racket macroexpander alone in Haskell would likely be monumentally more work than implementing a Hindley-Milner typechecker in Racket, so it doesn’t really make sense to use Haskell for that job.

Actually running Hackett code

Now, it’s worth noting that GHC is much more efficient as a compiler than Racket is, for a whole host of reasons. However, since typechecking and macroexpansion are inherently strictly compile-time phases, it turns out to be totally feasible to run the typechecker/macroexpander in Racket (since in Hackett, the two things are one and the same), then compile the resulting fully-expanded, well-typed code to GHC Core. That could then be handed off to GHC itself and compiled using the full power of the GHC optimizer and compiler toolchain.

This would be no small amount of work, but it seems theoretically possible, so eventually it’s something I’d love to look into. There are various complexities to making it work, but I think it would let me get the best of both worlds without reinventing the wheel, so it’s something I want long-term.

There’s also the question of how “native” Hackett code would be, were it compiled to GHC Core. Would Hackett code be able to use Haskell libraries, and vice versa? My guess is that the answer is “yes, with some glue”. It probably wouldn’t be possible to do it completely seamlessly, because Hackett provides type information at macroexpansion time that likely wouldn’t exist in the same form in GHC. It might be possible to do some incredibly clever bridging to be able to use Haskell libraries in Hackett almost directly, but the inverse might not be true if a library’s interface depends on macros.

How do Template Haskell quasiquoters compete with macros?

Quasiquoters have a number of drawbacks, but the two main ones are complexity and lack of composition.

S-expressions happen to be simple, and this means s-expression macros have two lovely properties: they’re easy to write, given good libraries (Racket has syntax/parse), and they’re easy for tools to understand. Quasiquoters force implementors to write their own parsers from raw strings of characters, which is quite a heavy burden, and it usually means those syntaxes are confusing and brittle. To give a good example, consider persistent’s quasiquoters: they look sort of like Haskell data declarations, but they’re not really, and I honestly have no idea what their actual syntax really is. It feels pretty finicky, though. In contrast, an s-expression based version of the same syntax would basically look just like the usual datatype declaration form, plus perhaps some extra goodies.

Additionally, s-expression macros compose, and this should probably be valued more than anything else. If you’re writing code that doesn’t compose, it’s usually a bad sign. So much of functional programming is about writing small, reusable pieces of code that can be composed together, and macros are no different. Racket’s match, for example, is an expression, and it contains expressions, so match can be nested within itself, as well as other arbitrary macros that produce expressions. Similarly, many Racket macros can be extended, which is possible due to having such uniform syntax.

Making macros “stand out” is an issue of some subjectivity, but in my experience such a fear of macros tends to stem from a familiarity with bad macro systems (which, to be fair, is almost all of them) and poor tooling. I’ve found that, in practice, the main reason people want to know “is this a macro??” is that macros are scary black boxes and people want to know which things to be suspicious of.

Really, though, what makes macros complicated isn’t knowing which things are macros; it’s knowing which identifiers are uses and which identifiers are bindings, and things like that. Just knowing that something is a macro use doesn’t actually help at all there, since the syntax won’t tell you. Solve that problem with tools that address the problem head on, not by making a syntax that makes macros second-class citizens. One of the reasons I used the phrase “syntactic abstractions” in my previous blog post is that you specifically want them to be abstractions. If you have to think of a macro in terms of the thing it expands to, then it isn’t a very watertight abstraction. You don’t think about Haskell pattern-matching in terms of what the patterns compile to; you just use them. Macros should be (and can be) just as fluid.

How can I help?

Right now, what I really need is someone who understands type system implementation. You don’t need to be up to date on what’s cutting edge—I’m not implementing anything nearly as complicated as GADTs or dependent types yet—you just need to understand how to implement Haskell 98. If you have that knowledge and you’re interested in helping, even if it just means answering some of my questions, please contact me via email, IRC (the #racket channel on Freenode is a good place for now), or Slack (I’m active in the snek Slack community, which you can sign up for here).

If you aren’t familiar with those things, but you’re still interested in helping out, there’s definitely plenty of work that needs doing. If you want to find somewhere you can pitch in, contacting me via any of the above means is totally fine, and I can point you in the right direction. Even if you just want to be a guinea pig, that’s useful.

Rascal: a Haskell with more parentheses

2017-01-02T00:00:00Z

Note: since the writing of this blog post, Rascal has been renamed to Hackett. You can read about why in the followup blog post.

“Hey! You got your Haskell in my Racket!”

“No, you got your Racket in my Haskell!”

Welcome to the Rascal programming language.

Why Rascal?

Why yet another programming language? Anyone who knows me knows that I already have two programming languages that I really like: Haskell and Racket. Really, I think they’re both great! Each brings some things to the table that aren’t really available in any other programming language I’ve ever used.

Haskell, in many ways, is a programming language that fits my mental model of how to structure programs better than any other programming language I’ve used. Some people would vehemently disagree, and it seems that there is almost certainly some heavy subjectivity in how people think about programming. I think Haskell’s model is awesome once you get used to it, but this blog post is not really going to try and convince you why you should care about Haskell (though that is something I want to write at some point). What you should understand, though, is that to me, Haskell is pretty close to what I want in a programming language.

At the same time, though, Haskell has problems, and a lot of that revolves around its story for metaprogramming. “Metaprogramming” is another M word that people seem to be very afraid of, and for good reason: most metaprogramming systems are ad-hoc, unsafe, unpredictable footguns that require delicate care to use properly, and even then the resulting code is brittle and difficult to understand. Haskell doesn’t suffer from this problem as much as some languages, but it isn’t perfect by any means: Haskell has at least two different metaprogramming systems (generics and Template Haskell) that are designed for different tasks, but they’re both limited in scope and both tend to be pretty complicated to use.

Discussing the merits and drawbacks of Haskell’s various metaprogramming capabilities is also outside the scope of this blog post, but there’s one fact that I want to bring up, which is that Haskell does not provide any mechanism for adding syntactic abstractions to the language. What do I mean by this? Well, in order to understand what a “syntactic abstraction” is and why you should care about it, I want to shift gears a little and take a look at why Racket is so amazing.

A programmable programming language: theory and practice

I feel confident in saying that Racket has the most advanced macro system in the world, and it is pretty much unparalleled in that space. There are many languages with powerful type systems, but Racket is more or less alone in many of the niches it occupies. Racket has a large number of innovations that I don’t know of in any other programming language, and a significant portion of them focus on making Racket a programmable programming language, a language for building languages.

This lofty goal is backed up by decades of research, providing Racket with an unparalleled toolkit for creating languages that can communicate, be extended, and even cooperate with tooling to provide introspection and error diagnostics. Working in Haskell feels like carefully designing a mould that cleanly and precisely fits your domain, carefully carving, cutting, and whittling. In contrast, working with Racket feels like moulding your domain until it looks the way you want it to look, poking and prodding at a pliable substrate. The sheer ease of it all is impossible for me to convey in words, so you will have to see it for yourself.

All this stuff is super abstract, though. What does it mean for practical programming, and why should you care? Well, I’m not going to try and sell you if you’re extremely skeptical, but if you’re interested, I gave a talk on some of Racket’s linguistic capabilities last year called Languages in an Afternoon. If you’re curious, give it a watch, and you might find yourself (hopefully) a little impressed. If you prefer reading, well, I have some blog posts on this very blog that demonstrate what Racket can do.

The basic idea, though, is that by having a simple syntax and a powerful macro system with a formalization of lexical scope, users can effectively invent entirely new language constructs as ordinary libraries, constructs that would have to be core forms in other programming languages. For example, Racket supports pattern-matching, but it isn’t built into the compiler: it’s simply implemented in the racket/match module distributed with Racket. Not only is it defined in ordinary Racket code, it’s actually extensible, so users can add their own pattern-matching forms that cooperate with match.

This is the power of a macro system to produce “syntactic abstractions”, things that can transform the way a user thinks of the code they’re writing. Racket has the unique capability of making these abstractions both easy to write and watertight, so instead of being a scary tool you have to handle with extreme care, you can easily whip up a powerful, user-friendly embedded domain specific language in a matter of minutes, and it’ll be safe, provide error reporting for misuse, and cooperate with existing tooling pretty much out of the box.

Fusing Haskell and Racket

So, let’s assume that we do want Haskell’s strong type system and that we also want a powerful metaprogramming model that permits syntactic extensions. What would that look like? Well, one way we could do it is to put one in front of the other: macro expansion is, by nature, a compile-time pass, so we could stick a macroexpander in front of the typechecker. This leads to a simple technique: first, macroexpand the program to erase the macros, then typecheck it and erase the types, then send the resulting code off to be compiled. This technique has the following properties:

First of all, it’s easy to implement. Racket’s macroexpander, while complex, is well-documented in academic literature and works extremely well in practice. In fact, this strategy has already been implemented! Typed Racket, the gradually-typed sister language of Racket, expands every program before typechecking. It would be possible to effectively create a “Lisp-flavored Haskell” by using this technique, and it might not even be that hard.
Unfortunately, there’s a huge problem with this approach: type information is not available at macroexpansion time. This is the real dealbreaker with the “expand, then typecheck” model, since static type information is some of the most useful information possibly available to a macro writer. In an ideal world, macros should not only have access to type information, they should be able to manipulate it and metaprogram the typechecker as necessary, but if macroexpansion is a separate phase from typechecking, then that information simply doesn’t exist yet.

For me, the second property is unacceptable. I am not satisfied by a “Lisp-flavored Haskell”; I want my types and macros to be able to cooperate and communicate with each other. The trouble, though, is that solving that problem is really, really hard! For a couple years now, I’ve been wishing this ideal language existed, but I’ve had no idea how to make it actually work. Template Haskell implements a highly restricted system of interweaving typechecking and splice evaluation, but it effectively does it by running the typechecker and the splice expander alternately, splitting the source into chunks and typechecking them one at a time. This works okay for Template Haskell, but for the more powerful macro system I am looking for, it wouldn’t scale.

There’s something a little bit curious, though, about the problem as I just described it. The processes of “macroexpanding the program to erase the macros” and “typechecking the program to erase the types” sound awfully similar. It seems like maybe these are two sides of the same coin, and it would be wonderful if we could encode one in terms of the other, effectively turning the two passes into a single, unified pass. Unfortunately, while this sounds great, I had no idea how to do this (and it didn’t help that I really had no idea how existing type systems were actually implemented).

Fortunately, last year, Stephen Chang, Alex Knauth, and Ben Greenman put together a rather exciting paper called Type Systems as Macros, which does precisely what I just described, and it delivers it all in a remarkably simple and elegant presentation. The idea is to “distribute” the task of typechecking over the individual forms of the language, leveraging existing macro communication facilities available in the Racket macroexpander to propagate type information as macros are expanded. To me, it was exactly what I was looking for, and I almost immediately started playing with it and seeing what I could do with it.

The result is Rascal, a programming language built in the Racket ecosystem that attempts to implement a Haskell-like type system.

A first peek at Rascal

Rascal is a very new programming language I’ve only been working on over the past few months. It is extremely experimental, riddled with bugs, half-baked, and may turn your computer into scrambled eggs. Still, while I might not recommend that you actually use it just yet, I want to try and share what it is I’m working on, since I’d bet at least a few other people will find it interesting, too.

First, let me say this up front: Rascal is probably a lot closer to Haskell than Racket. That might come as a surprise, given that Rascal has very Lisp-y syntax, it’s written in Racket, and it runs on the Racket platform, but semantically, Rascal is mostly just Haskell 98. This is worth stating explicitly because there are so few statically typed Lisps, even though there’s no inherent reason that Lisps need to be dynamically typed. They just seem to have mostly evolved that way.

Taking a look at a snippet of Rascal code, it’s easy to see that the language doesn’t work quite like a traditional Lisp, though:¹

(def+ map-every-other : (forall [a] {{a -> a} -> (List a) -> (List a)})
  [_ nil            -> nil]
  [_ {x :: nil}     -> {x :: nil}]
  [f {x :: y :: ys} -> {x :: (f y) :: (map-every-other f ys)}])

This is a Lisp with all the goodies you would expect out of Haskell: static types, parametric polymorphism, automatically curried functions, algebraic datatypes, pattern-matching, infix operators, and of course, typeclasses. Yes, with Rascal you can have your monads in all their statically dispatched glory:

(data (Maybe a)
  (just a)
  nothing)

(instance (Monad Maybe)
  [join (case-lambda
          [(just (just x)) (just x)]
          [_               nothing])])

So far, though, this really is just “Haskell with parentheses”. As alluded to above, however, Rascal is a bit more than that.

Core forms can be implemented as derived concepts

Rascal’s type system is currently very simple, being nothing more than Hindley-Milner plus ad-hoc polymorphism in the form of typeclasses. Something interesting to note about it is that it does not implement ADTs or pattern-matching anywhere in the core! In fact, ADTs are defined as two macros data and case, in an entirely separate module, which can be imported just like any other library.

The main rascal language provides ADTs by default, of course, but it would be perfectly possible to produce a rascal/kernel language which does not include them at all. In this particular case, it seems unlikely that Rascal programmers would want their own implementation of ADTs, but it’s an interesting proof of concept, and it hints at other “core” features that could be implemented using macros.

Simple syntactic transformations are, of course, trivially defined as macros. Haskell do notation is defined as an eleven-line macro in rascal/monad, and GHC’s useful LambdaCase extension is also possible to implement without modifying Rascal at all. This is useful, because there are many syntactic shorthands that are extremely useful to implement, but don’t make any sense to be in GHC because they are specific to certain libraries or applications. Racket’s macro system makes those not only possible, but actually pretty easy.

While the extent of what is possible to implement as derived forms remains to be seen, many useful GHC features seem quite possible to implement without touching the core language, including things like GeneralizedNewtypeDeriving and other generic deriving mechanisms like GHC.Generics, DeriveGeneric, and DeriveAnyClass.

The language is not enough

No language is perfect. Most people would agree with this, but I would take it a step further: no language is even sufficient! This makes a lot of sense, given that general-purpose programming languages are designed to do everything, and it’s impossible to do everything well.

Haskell programmers know this, and they happily endorse the creation of embedded domain specific languages. These are fantastic, and we need more of them. Things like servant let me write a third of the code I might otherwise need to, and the most readable code is the code you didn’t have to write in the first place. DSLs are good.

Unfortunately, building DSLs is traditionally difficult, in large part because building embedded DSLs means figuring out a way to encode your domain into your host language of choice. Sometimes, your domain simply does not elegantly map to your host language’s syntax or semantics, and you have to come up with a compromise. This is easy to see with servant, which, while it does a remarkably good job, still has to resort to some very clever type magic to create some semblance of an API description in Haskell types:

type UserAPI = "users" :> Get '[JSON] [User]
          :<|> "users" :> ReqBody '[JSON] User :> Post '[JSON] User
          :<|> "users" :> Capture "userid" Integer
                       :> Get '[JSON] User
          :<|> "users" :> Capture "userid" Integer
                       :> ReqBody '[JSON] User
                       :> Put '[JSON] User

The above code is remarkably readable for what it is, but what if we didn’t have to worry about working within the constraints of Haskell’s syntax? What if we could design a syntax that was truly the best for the job? Perhaps we would come up with something like this:

(define-api User-API
  #:content-types [JSON]
  [GET  "users"                    => (List User)]
  [POST "users"                    => User -> User]
  [GET  "users" [userid : Integer] => User]
  [PUT  "users" [userid : Integer] => User -> User])

This would be extremely easy to write with Racket’s macro-writing utilities, and it could even be made extensible. This could also avoid having to do the complicated typeclass trickery servant has to perform to then generate code from the above specification, since it would be much easier to just generate the necessary code directly (while still maintaining type safety).

In addition to the type-level hacks that Haskell programmers often have to pull in order to make these kinds of fancy DSLs work, free monads tend to be used to create domain-specific languages. This works okay for some DSLs, but remember that when you use a free monad, you are effectively writing a runtime interpreter for your language! Macros, on the other hand, are compiled, and you get the ability to compile your DSL to code that can be optimized by all the existing facilities of the compiler toolchain.
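To make the “runtime interpreter” point concrete, here is a minimal sketch of a free-monad DSL in Haskell. Everything in it is hypothetical and invented for illustration: the hand-rolled Free type (written out so the sketch needs nothing beyond base), the KVF instruction set, and the runKV interpreter. The thing to notice is that program is only a data structure, and runKV has to walk it one instruction at a time while the program is running.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A hand-rolled free monad, so the sketch depends only on base.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- Hypothetical instruction set for a tiny key-value DSL.
data KVF next
  = Put String Int next
  | Get String (Maybe Int -> next)
  deriving Functor

type KV = Free KVF

put :: String -> Int -> KV ()
put k v = Free (Put k v (Pure ()))

get :: String -> KV (Maybe Int)
get k = Free (Get k Pure)

-- The interpreter: it pattern-matches on the program tree at runtime,
-- threading an association list as the store.
runKV :: [(String, Int)] -> KV a -> a
runKV _     (Pure a)              = a
runKV store (Free (Put k v next)) = runKV ((k, v) : store) next
runKV store (Free (Get k next))   = runKV store (next (lookup k store))

-- A DSL program is an ordinary value built with do notation.
program :: KV (Maybe Int)
program = do
  put "x" 1
  put "x" 2
  get "x"
```

Under these definitions, runKV [] program evaluates to Just 2. Nothing about program is known to the compiler beyond its type, which is the contrast with a macro that expands into ordinary code ahead of time.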

Rascal is embryonic

I’m pretty excited about Rascal. I think that it could have the potential to do some pretty interesting things, and I have some ideas in my head for how having macros in a Haskell-like language could change things. I also think that, based on what I’ve seen so far, having both macros and a Haskell-like type system could give rise to completely different programming paradigms than exist in either Haskell or Racket today. My gut tells me that this is a case where the whole might actually be greater than the sum of its parts.

That said, Rascal doesn’t really exist yet. Yes, there is a GitHub repository, and it has some code in it that does… something. Unfortunately, the code is also currently extremely buggy, to the point of being borderline broken, and it’s also in such early stages that you can’t really do anything interesting with it, aside from some tiny toy programs.

As I have worked on Rascal, I’ve come to a somewhat unfortunate conclusion, which is that I really have almost zero interest in implementing type systems. I felt that way before I started the project, but I was hoping that maybe once I got into them, I would find them more interesting. Unfortunately, as much as I love working with powerful type systems (and really, I adore working with Haskell and using all the fancy features GHC provides), I find implementing the software that makes them tick completely dull.

Still, I’m willing to invest the time to get something that I can use. Even so, resources for practical type system implementation are scarce. I want to thank Mark P Jones for his wonderful resource Typing Haskell in Haskell, without which getting to where I am now would likely have been impossible. I also want to thank Stephen Diehl for his wonderful Write You a Haskell series, which was also wonderfully useful to study, even if it is unfinished and doesn’t cover anything beyond ML just yet.

Even with these wonderful resources, I’ve come to the realization that I probably can’t do all of this on my own. I consider myself pretty familiar with macros and macro expanders at this point, but I don’t know much about type systems (at least not their implementation), and I could absolutely use some help. So if you’re interested in Rascal and think you might be able to pitch in, please: I would appreciate even the littlest bits of help or guidance!

In the meantime, I will try to keep picking away at Rascal in the small amount of free time I currently have. Thanks, as always, to all the amazing people who have contributed to the tools I’ve been using for this project: special thanks to the authors of Type Systems as Macros for their help as well as the people I mentioned just above, and also to all of the people who have built Racket and Haskell and made them what they are today. Without them, Rascal would most definitely not exist.

Note that most of the Rascal code in this blog post probably doesn’t actually work on the current Rascal implementation. Pretty much all of it can be implemented in the current implementation, the syntax just isn’t quite as nice yet. ↩

Using types to unit-test in Haskell

2016-10-03T00:00:00Z

Object-oriented programming languages make unit testing easy by providing obvious boundaries between units of code in the form of classes and interfaces. These boundaries make it easy to stub out parts of a system to test functionality in isolation, which makes it possible to write fast, deterministic test suites that are robust in the face of change. When writing Haskell, it can be unclear how to accomplish the same goals: even inside pure code, it can become difficult to test a particular code path without also testing all its collaborators.

Fortunately, by taking advantage of Haskell’s expressive type system, it’s possible to not only achieve parity with object-oriented testing techniques, but also to provide stronger static guarantees as well. Furthermore, it’s all possible without resorting to extra-linguistic hacks that static object-oriented languages sometimes use for mocking, such as dynamic bytecode generation.

First, an aside on testing philosophy

Testing methodology is a controversial topic within the larger programming community, and there are a multitude of different approaches. This blog post is about unit testing, an already nebulous term with a number of different definitions. For the purposes of this post, I will define a unit test as a test that stubs out collaborators of the code under test in some way. Accomplishing that in Haskell is what this is primarily about.

I want to be clear that I do not think that unit tests are the only way to write tests, nor the best way, nor even always an applicable way. Depending on your domain, rigorous unit testing might not even make sense, and other forms of tests (end-to-end, integration, benchmarks, etc.) might fulfill your needs.

In practice, though, implementing those other kinds of tests seems to be well-documented in Haskell compared to pure, object-oriented style unit testing. As my Haskell applications have grown, I have found myself wanting a more fine-grained testing tool that allows me to both test a piece of my codebase in isolation and also use my domain-specific types. This blog post is about that.

With that disclaimer out of the way, let’s talk about testing in Haskell.

Drawing seams using types

One of the primary attributes of unit tests in object-oriented languages, especially statically-typed ones, is the concept of “seams” within a codebase. These are internal boundaries between components of a system. Some boundaries are obvious—interactions with a database, manipulation of the file system, and performing I/O over the network, to name a few examples—but others are more subtle. Especially in larger codebases, it can be helpful to isolate two related but distinct pieces of functionality as much as possible, which makes them easier to reason about, even if they’re actually part of the same codebase.

In OO languages, these seams are often marked using interfaces, whether explicitly (in the case of static languages) or implicitly (in the case of dynamic ones). By programming to an interface, it’s possible to create “fake” implementations of that interface for use in unit tests, effectively making it possible to stub out code that isn’t directly relevant to the code being tested.

In Haskell, representing these seams is a lot less obvious. Consider a fairly trivial function that reverses a file’s contents on the file system:

reverseFile :: FilePath -> IO ()
reverseFile path = do
  contents <- readFile path
  writeFile path (reverse contents)

This function is impossible to test without testing against a real file system. It simply performs I/O directly, and there’s no way to “mock out” the file system for testing purposes. Now, admittedly, this function is so trivial that a unit test might not seem worth the cost, but consider a slightly more complicated function that interacts with a database:

renderUserProfile :: Id User -> IO HTML
renderUserProfile userId = do
  user <- fetchUser userId
  posts <- fetchRecentPosts userId

  return $ div
    [ h1 (userName user <> "’s Profile")
    , h2 "Recent Posts"
    , ul (map (li . postTitle) posts)
    ]

It might now be a bit more clear that it could be useful to test the above function without running a real database and doing all the necessary context setup before each test case. Indeed, it would be nice if a test could just provide stubbed implementations for fetchUser and fetchRecentPosts, then make assertions about the output.

One way to solve this problem is to pass the results of those two functions to renderUserProfile as arguments, turning it into a pure function that could be easily tested. This becomes obnoxious for functions of even just slightly more complexity, though (it is not unreasonable to imagine needing a handful of different queries to render a user’s profile page), and it requires significantly restructuring code simply because the tests need it.
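As a rough sketch of that restructuring, assuming a deliberately tiny HTML type and stand-in User and Post records (all hypothetical, since the real definitions are not shown here), the pure version and its effectful wrapper might look like this:

```haskell
-- Hypothetical stand-ins for the domain types used in this post.
newtype User = User { userName :: String }
newtype Post = Post { postTitle :: String }

-- A deliberately tiny HTML representation: an element with children, or text.
data HTML = Element String [HTML] | Text String
  deriving (Eq, Show)

-- The pure core: everything it needs arrives as an argument.
renderUserProfilePure :: User -> [Post] -> HTML
renderUserProfilePure user posts = Element "div"
  [ Element "h1" [Text (userName user <> "'s Profile")]
  , Element "h2" [Text "Recent Posts"]
  , Element "ul" [Element "li" [Text (postTitle p)] | p <- posts]
  ]

-- The effectful shell only runs the two queries and delegates.
-- Every additional query adds another parameter here and in the pure core.
renderUserProfileWith :: Monad m => m User -> m [Post] -> m HTML
renderUserProfileWith getUser getPosts =
  renderUserProfilePure <$> getUser <*> getPosts
```

The pure function is trivially testable, but the shape of the code is now dictated by the tests: a page that needs five queries needs five parameters threaded through both functions.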

The IO-based code above is not only difficult to test, however; it has another problem, too. Specifically, both functions return IO values, which means they can effectively do anything. Haskell has a very strong type system for typing terms, but it doesn’t provide any guarantees about effects beyond a simple yes/no answer about function purity. Even though the renderUserProfile function should really only need to interact with the database, it could theoretically delete files, send emails, make HTTP requests, or do any number of other things.

Fortunately, it’s possible to solve both problems—a lack of testability and a lack of type safety—using the same general technique. This approach is reminiscent of the interface-based seams of object-oriented languages, but unlike most object-oriented approaches, it provides additional type safety guarantees without the need to explicitly modify the code to support some kind of dependency injection.

Making implicit interfaces explicit

Statically typed, object-oriented languages provide interfaces as a language construct to encode certain kinds of contracts into the type system, and Haskell has something similar. Typeclasses are, in many ways, an analog to OO interfaces, and they can be used in a similar way. In the above case, let’s write down interfaces that the reverseFile and renderUserProfile functions can use:

class Monad m => MonadFS m where
  readFile :: FilePath -> m String
  writeFile :: FilePath -> String -> m ()

class Monad m => MonadDB m where
  fetchUser :: Id User -> m User
  fetchRecentPosts :: Id User -> m [Post]

The really nice thing about these interfaces is that our function implementations don’t have to change at all to take advantage of them. In fact, all we have to change is their types:

reverseFile :: MonadFS m => FilePath -> m ()
reverseFile path = do
  contents <- readFile path
  writeFile path (reverse contents)

renderUserProfile :: MonadDB m => Id User -> m HTML
renderUserProfile userId = do
  user <- fetchUser userId
  posts <- fetchRecentPosts userId

  return $ div
    [ h1 (userName user <> "’s Profile")
    , h2 "Recent Posts"
    , ul (map (li . postTitle) posts)
    ]

This is pretty neat, since we haven’t had to alter our code at all, but we’ve managed to completely decouple ourselves from IO. This has the direct effect of both making our code more abstract (we no longer rely on the “real” file system or a “real” database, which makes our code easier to test) and restricting what our functions can do (just from looking at the type signatures, we know what side-effects they can perform).

Of course, since we’re now coding against an interface, our code doesn’t actually do much of anything. If we want to actually use the functions we’ve written, we’ll have to define instances of MonadFS and MonadDB. When actually running our code, we’ll probably still use IO (or some monad transformer stack with IO at the bottom), so we can define trivial instances for that existing use case:

instance MonadFS IO where
  readFile = Prelude.readFile
  writeFile = Prelude.writeFile

instance MonadDB IO where
  fetchUser = SQL.fetchUser
  fetchRecentPosts = SQL.fetchRecentPosts

Even if we go no further, this is already incredibly useful. By restricting the sorts of effects our functions can perform at the type level, it becomes a lot easier to see which code is interacting with what. This can be invaluable when working in a part of a moderately large codebase that you are unfamiliar with. Even if the only instance of these typeclasses is IO, the benefits are immediately apparent.
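For anyone who wants to try this in a single file, here is a minimal self-contained sketch of the MonadFS half. Two details are my own assumptions rather than anything stated above. First, the class methods share their names with Prelude functions, so the sketch hides the Prelude versions and refers to them qualified. Second, Prelude.readFile reads lazily, and reading and then writing the same path can fail with a “file is locked” error, so this IO instance forces the whole string before returning it.

```haskell
import Prelude hiding (readFile, writeFile)
import qualified Prelude

class Monad m => MonadFS m where
  readFile  :: FilePath -> m String
  writeFile :: FilePath -> String -> m ()

instance MonadFS IO where
  readFile path = do
    contents <- Prelude.readFile path
    -- Force the entire string so the lazily read handle is closed
    -- before any later write to the same path.
    length contents `seq` return contents
  writeFile = Prelude.writeFile

-- Unchanged from the version above; only the constraint ties it to MonadFS.
reverseFile :: MonadFS m => FilePath -> m ()
reverseFile path = do
  contents <- readFile path
  writeFile path (reverse contents)
```

Calling reverseFile "foo.txt" in IO then goes through this instance, while any other instance of MonadFS can supply a different meaning for the same code.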

Of course, this blog post is about testing, so we’re going to go further and take advantage of these seams we’ve now drawn. The question is: how?

Testing with typeclasses: an initial attempt

Given that we now have functions depending on an interface instead of IO, we can create separate instances of our typeclasses for use in tests. Let’s start with the renderUserProfile function. We’ll create a simple wrapper around the Identity type, since we don’t actually care much about the “effects” of our MonadDB methods:

import Data.Functor.Identity

newtype TestM a = TestM (Identity a)
  deriving (Functor, Applicative, Monad)

unTestM :: TestM a -> a
unTestM (TestM (Identity x)) = x

Now, we’ll create a trivial instance of MonadDB for TestM:

instance MonadDB TestM where
  fetchUser _ = return User { userName = "Alyssa" }
  fetchRecentPosts _ = return
    [ Post { postTitle = "Metacircular Evaluator" } ]

With this instance, it’s now possible to write a simple unit test of the renderUserProfile function that doesn’t need a real database running at all:

spec = describe "renderUserProfile" $ do
  it "shows the user’s name" $ do
    let result = unTestM (renderUserProfile (intToId 1234))
    result `shouldContainElement` h1 "Alyssa’s Profile"

  it "shows a list of the user’s posts" $ do
    let result = unTestM (renderUserProfile (intToId 1234))
    result `shouldContainElement` ul [ li "Metacircular Evaluator" ]

This is pretty nice, and running the above tests reveals a nice property of these kinds of isolated test cases: the test suite runs really, really fast. Communicating with a database, even in extremely simple ways, takes a measurable amount of time, especially with dozens of tests. In contrast, even with hundreds of tests, our unit test suite runs in less than a tenth of a second.
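The snippets above lean on types that are defined elsewhere, so here is a compact, self-contained sketch of the same idea that can be loaded on its own. The Id, User, and Post definitions are hypothetical stand-ins, and profileSummary is an invented, simplified substitute for renderUserProfile that returns lines of text instead of HTML. Note that deriving Monad for the newtype relies on the GeneralizedNewtypeDeriving extension.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.Functor.Identity (Identity (..))

-- Hypothetical stand-ins for the real domain types.
newtype Id a = Id Int
data User = User { userName :: String }
data Post = Post { postTitle :: String }

class Monad m => MonadDB m where
  fetchUser        :: Id User -> m User
  fetchRecentPosts :: Id User -> m [Post]

-- Simplified code under test: a heading followed by post titles.
profileSummary :: MonadDB m => Id User -> m [String]
profileSummary userId = do
  user  <- fetchUser userId
  posts <- fetchRecentPosts userId
  return ((userName user <> "'s Profile") : map postTitle posts)

-- The test monad: Identity, because the stubs have no effects at all.
newtype TestM a = TestM (Identity a)
  deriving (Functor, Applicative, Monad)

unTestM :: TestM a -> a
unTestM (TestM (Identity x)) = x

-- Stubbed queries that ignore their arguments.
instance MonadDB TestM where
  fetchUser _ = return User { userName = "Alyssa" }
  fetchRecentPosts _ = return [ Post { postTitle = "Metacircular Evaluator" } ]

-- Running the code under test is now an ordinary pure expression.
summary :: [String]
summary = unTestM (profileSummary (Id 1234))
```

Here summary is ["Alyssa's Profile", "Metacircular Evaluator"], computed without any database or IO.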

This all seems to be successful, so let’s try and apply the same testing technique to reverseFile.

Testing side-effectful code

Looking at the type signature for reverseFile, we have a small problem:

reverseFile :: MonadFS m => FilePath -> m ()

Specifically, the return type is (). Making any assertions against the result of this function would be completely worthless, given that it’s guaranteed to be the same exact thing each time. Instead, reverseFile is inherently side-effectful, so we want to be able to test that it interacts with the file system correctly.

In order to do this, a simple wrapper around Identity won’t be enough, but we can replace it with something more powerful: Writer. Specifically, we can use a writer monad to “log” what gets called in order to test side-effects. We’ll start by creating a new TestM type, just like last time:

newtype TestM a = TestM (Writer [String] a)
  deriving (Functor, Applicative, Monad, MonadWriter [String])

logTestM :: TestM a -> [String]
logTestM (TestM w) = execWriter w

Using this slightly more powerful type, we can write a useful instance of MonadFS that will track the argument given to writeFile:

instance MonadFS TestM where
  readFile _ = return "hello"
  writeFile _ contents = tell [contents]

Again, the instance is quite simple, but it now enables us to write a straightforward unit test for reverseFile:

spec = describe "reverseFile" $
  it "reverses a file’s contents on the filesystem" $ do
    let calls = logTestM (reverseFile "foo.txt")
    calls `shouldBe` ["olleh"]

Again, quite simple to both implement and use, and the test itself is blindingly fast. There’s another problem, though, which is that we have technically left part of reverseFile untested: we’ve completely ignored the path argument.

In this contrived example, it may seem silly to test something so trivial, but in real code, it’s quite possible that one would care very much about testing multiple different aspects of a single function. When testing renderUserProfile, this was not hard, since we could reuse the same TestM type and MonadDB instance for both test cases, but in the reverseFile example, we’ve ignored the path entirely.

We could adjust our MonadFS instance to also track the path provided to each method, but this has a few problems. First, it means every test case would depend on all the various properties we are testing, which would mean updating every test case when we add a new one. It would also be simply impossible if we needed to track multiple types—in this particular case, it turns out that String and FilePath are actually the same type, but in practice, there may be a handful of disparate, incompatible types.

Both of the above issues could be fixed by creating a sum type and manually filtering out the relevant elements in each test case, but a much more intuitive approach would be to simply have a separate instance for each case. Unfortunately, in Haskell, creating a new instance means creating an entirely new type. To illustrate how much duplication that would entail, we could create the following type and instance for testing proper propagation of the path argument:

newtype TestM' a = TestM' (Writer [FilePath] a)
  deriving (Functor, Applicative, Monad, MonadWriter [FilePath])

logTestM' :: TestM' a -> [FilePath]
logTestM' (TestM' w) = execWriter w

instance MonadFS TestM' where
  readFile path = tell [path] >> return ""
  writeFile path _ = tell [path]

Now it’s possible to add an extra test case that asserts that the proper path is provided to the two filesystem functions:

spec = describe "reverseFile" $ do
  it "reverses a file’s contents on the filesystem" $ do
    let calls = logTestM (reverseFile "foo.txt")
    calls `shouldBe` ["olleh"]

  it "operates on the file at the provided path" $ do
    let paths = logTestM' (reverseFile "foo.txt")
    paths `shouldBe` ["foo.txt", "foo.txt"]

This works, but it’s ultimately unacceptably complicated. Our test harness code is now significantly larger than the actual tests themselves, and the amount of boilerplate is frustrating. Verbose test suites are especially bad, since forcing programmers to jump through hoops just to implement a single test reduces the likelihood that people will actually write good tests, if they write tests at all. In contrast, if writing tests is easy, then people will naturally write more of them.
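For comparison, the sum-type alternative mentioned earlier could be sketched roughly as follows. The FSCall type and the two filtering helpers are hypothetical names invented for this illustration. A single TestM logs every call as a tagged value, and each test case picks out the entries it cares about.

```haskell
{-# LANGUAGE FlexibleInstances, GeneralizedNewtypeDeriving, MultiParamTypeClasses #-}

import Prelude hiding (readFile, writeFile)
import Control.Monad.Writer (MonadWriter, Writer, execWriter, tell)

class Monad m => MonadFS m where
  readFile  :: FilePath -> m String
  writeFile :: FilePath -> String -> m ()

reverseFile :: MonadFS m => FilePath -> m ()
reverseFile path = do
  contents <- readFile path
  writeFile path (reverse contents)

-- Hypothetical log entry: one constructor per method, carrying its arguments.
data FSCall
  = ReadFile FilePath
  | WriteFile FilePath String
  deriving (Eq, Show)

newtype TestM a = TestM (Writer [FSCall] a)
  deriving (Functor, Applicative, Monad, MonadWriter [FSCall])

logTestM :: TestM a -> [FSCall]
logTestM (TestM w) = execWriter w

-- One instance records everything.
instance MonadFS TestM where
  readFile path = tell [ReadFile path] >> return "hello"
  writeFile path contents = tell [WriteFile path contents]

-- Each test then filters the log manually.
writtenContents :: [FSCall] -> [String]
writtenContents calls = [contents | WriteFile _ contents <- calls]

touchedPaths :: [FSCall] -> [FilePath]
touchedPaths = map pathOf
  where
    pathOf (ReadFile p)    = p
    pathOf (WriteFile p _) = p
```

With this in place, writtenContents (logTestM (reverseFile "foo.txt")) is ["olleh"] and touchedPaths of the same log is ["foo.txt", "foo.txt"]. It avoids the duplicate newtype, but the stubbed return values are still fixed for every test, and each new method or assertion means another constructor and another filter.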

The approach above, with a new type and instance for every test case, is not good enough, but it does reveal a particular problem: in Haskell, typeclass instances are not first-class values that can be manipulated and abstracted over. They are static constructs that can only be managed by the compiler, and users do not have a direct way to modify them. With some cleverness, however, we can actually create an approximation of first-class typeclass dictionaries, which will allow us to dramatically simplify the above testing mechanism.

Creating first-class typeclass instances

In order to provide an easy way to construct instances, we need a way to represent instances as ordinary Haskell values. This is not terribly difficult, given that instances are conceptually just records containing a collection of functions. For example, we could create a datatype that represents an instance of the MonadFS typeclass:

data MonadFSInst m = MonadFSInst
  { _readFile :: FilePath -> m String
  , _writeFile :: FilePath -> String -> m ()
  }

To avoid namespace clashes with the actual method identifiers, the record fields are prefixed with an underscore, but otherwise, the translation is remarkably straightforward. Using this record type, we can easily create values that represent the two instances we defined above:

contentInst :: MonadWriter [String] m => MonadFSInst m
contentInst = MonadFSInst
  { _readFile = \_ -> return "hello"
  , _writeFile = \_ contents -> tell [contents]
  }

pathInst :: MonadWriter [FilePath] m => MonadFSInst m
pathInst = MonadFSInst
  { _readFile = \path -> tell [path] >> return ""
  , _writeFile = \path _ -> tell [path]
  }

These two values represent two different implementations of MonadFS, but since they’re ordinary Haskell values, they can be manipulated and even extended like any other records. This can be extremely useful, since it makes it possible to create a sort of “base” instance, then have individual test cases override individual pieces of functionality piecemeal.

Of course, although we’ve written these two instances, we have no way to actually use them. After all, Haskell does not provide a way to explicitly provide typeclass dictionaries. Fortunately, we can create a sort of “proxy” type that will use a reader to thread the dictionary around explicitly, and the instance can defer to the dictionary’s implementation.

Creating an instance proxy

To represent our proxy type, we’ll use a combination of a Writer and a ReaderT; the former to implement the logging used by instances, and the latter to actually thread around the dictionary. Our type will look like this:

newtype TestM log a =
    TestM (ReaderT (MonadFSInst (TestM log)) (Writer log) a)
  deriving ( Functor, Applicative, Monad
           , MonadReader (MonadFSInst (TestM log))
           , MonadWriter log
           )

logTestM :: MonadFSInst (TestM log) -> TestM log a -> log
logTestM inst (TestM m) = execWriter (runReaderT m inst)

This might look rather complicated, and it is, but let’s break down exactly what it’s doing.

The TestM type includes two type parameters. The first is the type of value that will be logged (hence the name log), which corresponds to the argument to Writer from previous incarnations of TestM. Unlike those types, though, we want this version to work with any Monoid, so we’ll make it a type parameter. The second parameter is simply the type of the current monadic value, as before.
The type itself is defined as a wrapper around a small monad transformer stack, the first of which is ReaderT. The state threaded around by the reader is, in this case, the instance dictionary, which is MonadFSInst.
However, recall that MonadFSInst accepts a type variable—the type of a monad itself—so we must provide TestM log as an argument to MonadFSInst. This slight bit of indirection allows us to tie the knot between the mutually dependent instances and proxy type.
The base monad in the transformer stack is Writer, which is used to actually implement the logging functionality, just like in prior cases. The only difference now is that the log type parameter now determines what the writer actually produces.
Finally, as before, we use GeneralizedNewtypeDeriving to derive all the relevant mtl classes, adding the somewhat wordy MonadReader constraint to the list.

Using this single type, we can now implement a MonadFS instance that defers to the dictionary carried around within TestM’s reader state:

instance Monoid log => MonadFS (TestM log) where
  readFile path = do
    f <- asks _readFile
    f path
  writeFile path contents = do
    f <- asks _writeFile
    f path contents

This may seem somewhat boilerplate-y, and it is to some extent, but the important consideration is that this boilerplate only needs to be written once. With this in place, it’s now possible to write an arbitrary number of first-class instances that use the above mechanism without extending the mechanism at all.

To see what actually using this code would look like, let’s update the reverseFile tests to use the new TestM implementation, as well as the contentInst and pathInst dictionaries from earlier:

spec = describe "reverseFile" $ do
  it "reverses a file’s contents on the filesystem" $ do
    let calls = logTestM contentInst (reverseFile "foo.txt")
    calls `shouldBe` ["olleh"]

  it "operates on the file at the provided path" $ do
    let paths = logTestM pathInst (reverseFile "foo.txt")
    paths `shouldBe` ["foo.txt", "foo.txt"]

We can do a little bit better, though. Really, the definitions of contentInst and pathInst are specific to each test case. With ordinary typeclass instances, we cannot scope them to any particular block, but since MonadFSInst is just an ordinary Haskell datatype, we can manipulate its values just like any other Haskell values. Therefore, we can just inline those instances’ definitions into the test cases themselves to keep them closer to the actual tests.

spec = describe "reverseFile" $ do
  it "reverses a file’s contents on the filesystem" $ do
    let contentInst = MonadFSInst
          { _readFile = \_ -> return "hello"
          , _writeFile = \_ contents -> tell [contents]
          }
    let calls = logTestM contentInst (reverseFile "foo.txt")
    calls `shouldBe` ["olleh"]

  it "operates on the file at the provided path" $ do
    let pathInst = MonadFSInst
          { _readFile = \path -> tell [path] >> return ""
          , _writeFile = \path _ -> tell [path]
          }
    let paths = logTestM pathInst (reverseFile "foo.txt")
    paths `shouldBe` ["foo.txt", "foo.txt"]

This is pretty good. We’re now able to create inline instances of our MonadFS typeclass, which allows us to write extremely concise unit tests using ordinary Haskell typeclasses as system seams. We’ve managed to cut down on the boilerplate considerably, though we still have a couple problems. For one, this example only uses a single typeclass containing only two methods. A real MonadFS typeclass would likely have at least a dozen methods for performing various filesystem operations, and writing out the instance dictionaries for every single method, even the ones that aren’t used within the code under test, would be pretty frustratingly verbose.

This problem is solvable, though. Since instances are just ordinary Haskell records, we can create a “base” instance that just throws an exception whenever the method is called:

baseInst :: MonadFSInst m
baseInst = MonadFSInst
  { _readFile = error "unimplemented instance method ‘_readFile’"
  , _writeFile = error "unimplemented instance method ‘_writeFile’"
  }

Then a test for code that only uses readFile needs to override only that particular method, for example:

let myInst = baseInst { _readFile = ... }

Normally, of course, this would be a terrible idea. However, since this is all just test code, it can be extremely useful in quickly figuring out what methods need to be stubbed out for a particular test case. Since all the code actually gets run at test time, attempts to use unimplemented instance methods will immediately raise an error, informing the programmer which methods need to be implemented to make the test pass. This can also help to significantly cut down on the amount of effort it takes to implement each test.

Another problem is that our approach is specialized exclusively to MonadFS. What about functions that use both MonadFS and MonadDB, for example? Fortunately, that is not hard to solve, either. We can adapt the MonadFSInst type to include fields for all of the typeclasses relevant to our system, turning it into a generic test fixture of sorts:

data FixtureInst m = FixtureInst
  { -- MonadFS
    _readFile :: FilePath -> m String
  , _writeFile :: FilePath -> String -> m ()

    -- MonadDB
  , _fetchUser :: Id User -> m User
  , _fetchRecentPosts :: Id User -> m [Post]
  }

Updating TestM to use FixtureInst instead of MonadFSInst is trivial, and all the rest of the infrastructure still works. However, this means that every time a new typeclass is added, three things need to be updated:

Its methods need to be added to the FixtureInst record.
Those methods need to be given error-raising defaults in the baseInst value.
An actual instance of the typeclass needs to be written for TestM that defers to the FixtureInst value.

Furthermore, most of this manual manipulation of methods is required every time a particular typeclass changes, whether that means adding a method, removing a method, renaming a method, or changing a method’s type. This is especially frustrating given that all this code is really just mechanical boilerplate that could be derived from the set of typeclasses being tested.

That last point is especially important: aside from the instances themselves, every piece of boilerplate above is obviously possible to generate from existing types alone. With that piece of information in mind, we can do even better: we can use Template Haskell.

Removing the boilerplate using `test-fixture`

The above code was not only rather boilerplate-heavy, it was pretty complicated. Fortunately, you don’t actually have to write it. Enter the library test-fixture:

import Control.Monad.TestFixture
import Control.Monad.TestFixture.TH

mkFixture "FixtureInst" [''MonadFS, ''MonadDB]

spec = describe "reverseFile" $ do
  it "reverses a file’s contents on the filesystem" $ do
    let contentInst = def
          { _readFile = \_ -> return "hello"
          , _writeFile = \_ contents -> log contents
          }
    let calls = logTestFixture (reverseFile "foo.txt") contentInst
    calls `shouldBe` ["olleh"]

  it "operates on the file at the provided path" $ do
    let pathInst = def
          { _readFile = \path -> log path >> return ""
          , _writeFile = \path _ -> log path
          }
    let paths = logTestFixture (reverseFile "foo.txt") pathInst
    paths `shouldBe` ["foo.txt", "foo.txt"]

That’s it. The above code automatically generates everything you need to write fast, simple, deterministic unit tests in Haskell. The mkFixture function is a Template Haskell macro that expands into a definition quite similar to the FixtureInst type we wrote by hand, but since it’s automatically generated from the typeclass definitions, it never needs to be updated.

The logTestFixture function replaces the logTestM function we wrote by hand, but it works exactly the same. The Control.Monad.TestFixture library also exports a log function that is a synonym for tell . singleton, but using tell directly still works if you prefer.

The mkFixture function also generates a Default instance, which replaces the baseInst value defined earlier. It functions the same way, though, producing useful error messages that refer to the names of unimplemented typeclass methods that have not been stubbed out.

This blog post is not a test-fixture tutorial—indeed, it is much more complicated than a test-fixture tutorial would be, since it covers what the library is really doing under the hood—but if you’re interested, I would highly recommend you take a look at the test-fixture documentation on Hackage.

Conclusion, credits, and similar techniques

This blog post came about as the result of a need my coworkers and I found when writing Haskell code; we wanted a way to write unit tests quickly and easily, but we didn’t find much advice from the rest of the Haskell ecosystem. The test-fixture library is the result of that exploratory work, and we currently use it to test a significant portion of our Haskell code.

It would be extremely unfair to suggest that I was the inventor of this technique or the inventor of the library. Two of my coworkers, Joe Vargas and Greg Wiley, came up with the general approach and wrote Control.Monad.TestFixture, and I simply wrote the Template Haskell macro to eliminate the boilerplate. With that in mind, I think I can say with some fairness that this technique is a joy to use when unit testing is a desirable goal, and I would definitely recommend it if you are interested in doing isolated testing in Haskell.

The general technique of using typeclasses to emulate effects was in part inspired by the well-known mtl library. An alternate approach to writing unit-testable Haskell code is using free monads, but overall, I prefer this approach over free monads because the typeclass constraints add type safety in ways that free monads do not (at least not without additional boilerplate), and this approach also lends itself well to static analysis-based boilerplate reduction techniques. It has its own tradeoffs, though, so if you’ve had success with free monads, then I certainly make no claim this is a superior approach, just one that I’ve personally found pleasant.

As a final note, if you do check out test-fixture, feel free to leave feedback by opening issues on the GitHub issue tracker—even things like confusing documentation are worth a bug report.

Understanding the npm dependency model

2016-08-24T00:00:00Z

Currently, npm is the package manager for the frontend world. Sure, there are alternatives, but for the time being, npm seems to have won. Even tools like Bower are being pushed to the wayside in favor of the One True Package Manager, but what’s most interesting to me is npm’s relatively novel approach to dependency management. Unfortunately, in my experience, it is actually not particularly well understood, so consider this an attempt to clarify how exactly it works and how it affects you as a user or package developer.

First, the basics

At a high level, npm is not too dissimilar from other package managers for programming languages: packages depend on other packages, and they express those dependencies with version ranges. npm happens to use the semver versioning scheme to express those ranges, but the way it performs version resolution is mostly immaterial; what matters is that packages can depend on ranges rather than specific versions of packages.

This is rather important in any ecosystem, since locking a library to a specific set of dependencies could cause significant problems, but it’s actually much less of a problem in npm’s case compared to other, similar package systems. Indeed, it is often safe for a library author to pin a dependency to a specific version without affecting dependent packages or applications. The tricky bit is determining when this is safe and when it’s not, and this is what I so frequently find that people get wrong.

Dependency duplication and the dependency tree

Most users of npm (or at least most package authors) eventually learn that, unlike other package managers, npm installs a tree of dependencies. That is, every package installed gets its own set of dependencies rather than forcing every package to share the same canonical set of packages. Obviously, virtually every single package manager in existence has to model a dependency tree at some point, since that’s how dependencies are expressed by programmers.

For example, consider two packages, foo and bar. Each of them has its own set of dependencies, which can be represented as a tree:

foo
├── hello ^0.1.2
└── world ^1.0.7

bar
├── hello ^0.2.8
└── goodbye ^3.4.0

Imagine an application that depends on both foo and bar. Obviously, the world and goodbye dependencies are totally unrelated, so how npm handles them is relatively uninteresting. However, consider the case of hello: both packages require conflicting versions.

Most package managers (including RubyGems/Bundler, pip, and Cabal) would simply barf here, reporting a version conflict. This is because, in most package management models, only one version of any particular package can be installed at a time. In that sense, one of the package manager’s primary responsibilities is to figure out a set of package versions that will satisfy every version constraint simultaneously.

In contrast, npm has a somewhat easier job: it’s totally okay with installing different versions of the same package because each package gets its own set of dependencies. In the aforementioned example, the resulting directory structure would look something like this:

node_modules/
├── foo/
│   └── node_modules/
│       ├── hello/
│       └── world/
└── bar/
    └── node_modules/
        ├── hello/
        └── goodbye/

Notably, the directory structure very closely mirrors the actual dependency tree. The above diagram is something of a simplification: in practice, each transitive dependency would have its own node_modules directory and so on, but the directory structure can get pretty messy pretty quickly. (Furthermore, npm 3 performs some optimizations to attempt to share dependencies when it can, but those are ultimately unnecessary to actually understanding the model.)

This model is, of course, extremely simple. The obvious effect is that every package gets its own little sandbox, which works absolutely marvelously for utility libraries like ramda, lodash, or underscore. If foo depends on ramda@^0.19.0 but bar depends on ramda@^0.22.0, they can both coexist completely peacefully without any problems.

At first blush, this system is obviously better than the alternative, flat model, so long as the underlying runtime supports the required module loading scheme. However, it is not without drawbacks.

The most apparent downside is a significant increase in code size, given the potential for many, many copies of the same package, all with different versions. An increase in code size can often mean more than just a larger program—it can have a significant impact on performance. Larger programs just don’t fit into CPU caches as easily, and merely having to page a program in and out can significantly slow things down. That’s mostly just a tradeoff, though, since you’re sacrificing performance, not program correctness.

The more insidious problem (and the one that I see crop up quite a lot in the npm ecosystem without much thought) is how dependency isolation can affect cross-package communication.

Dependency isolation and values that pass package boundaries

The earlier example of using ramda is a place where npm’s default dependency management scheme really shines, given that Ramda just provides a bunch of plain ol’ functions. Passing these around is totally harmless. In fact, mixing functions from two different versions of Ramda would be totally okay! Unfortunately, not all cases are nearly that simple.

Consider, for a moment, react. React components are very much not plain old data; they are complex values that can be extended, instantiated, and rendered in a variety of ways. React represents component structure and state using an internal, private format, using a mixture of carefully arranged keys and values and some of the more powerful features of JavaScript’s object system. This internal structure might very well change between React versions, so a React component defined with react@0.3.0 likely won’t work quite right with react@15.3.1.

With that in mind, consider two packages that define their own React components and export them for consumers to use. Looking at their dependency tree, we might see something like this:

awesome-button
└── react ^0.3.0

amazing-modal
└── react ^15.3.1

Given that these two packages use wildly different versions of React, npm would give each of them their own copy of React, as requested, and packages would happily install. However, if you tried to use these components together, they wouldn’t work at all! A newer version of React simply cannot understand an old version’s component, so you would get a (likely confusing) runtime error.

What went wrong? Well, dependency isolation works great when a package’s dependencies are purely implementation details, never observable from outside of a package. However, as soon as a package’s dependency becomes exposed as part of its interface, dependency isolation is not only subtly wrong, it can cause complete failure at runtime. These are cases when traditional dependency management is much better: it will tell you as soon as you attempt to install two packages that they just don’t work together, rather than waiting for you to figure that out for yourself.

This might not sound too bad—after all, JavaScript is a very dynamic language, so static guarantees are mostly few and far between, and your tests should catch these problems should they arise—but it can cause unnecessary issues when two packages can theoretically work together fine, but because npm assigned each one its own copy of a particular package (that is, it wasn’t quite smart enough to figure out it could give them both the same copy), things break down.

Looking outside of npm specifically and considering this model when applied to other languages, it becomes increasingly clear that this won’t do. This blog post was inspired by a Reddit thread discussing the npm model applied to Haskell, and this flaw was touted as a reason why it couldn’t possibly work for such a static language.

Due to the way the JavaScript ecosystem has evolved, it’s true that most people can often get away with this subtle potential for incorrect behavior without any problems. Specifically, JavaScript tends to rely on duck typing rather than more restrictive checks like instanceof, so objects that satisfy the same protocol will still be compatible, even if their implementations aren’t quite the same. However, npm actually provides a robust solution to this problem that allows package authors to explicitly express these “cross-interface” dependencies.

Peer dependencies

Normally, npm package dependencies are listed under a "dependencies" key in the package’s package.json file. There is, however, another, less-used key called "peerDependencies", which has the same format as the ordinary dependencies list. The difference shows up in how npm performs dependency resolution: rather than getting its own copy of a peer dependency, a package expects that dependency to be provided by its dependent.

This means that peer dependencies are effectively resolved using the “traditional” dependency resolution mechanism that tools like Bundler and Cabal use: there must be one canonical version that satisfies everyone’s constraint. Since npm 3, things are a little bit less straightforward (specifically, peer dependencies are not automatically installed unless a dependent package explicitly depends on the peer package itself), but the basic idea is the same. This means that package authors must make a choice for each dependency they install: should it be a normal dependency or a peer dependency?

This is where I think people tend to get a little lost, even those familiar with the peer dependency mechanism. Fortunately, the answer is relatively simple: is the dependency in question visible in any place in the package’s interface?

This is sometimes hard to see in JavaScript because the “types” are invisible; that is, they are dynamic and rarely explicitly written out. However, just because the types are dynamic does not mean they are not there at runtime (and in the heads of various programmers), so the rule still holds: if the type of a function in a package’s public interface somehow depends on a dependency, it should be a peer dependency.

To make this a little more concrete, let’s look at a couple of examples. First off, let’s take a look at some simple cases, starting with some uses of ramda:

import { merge, add } from 'ramda'

export const withDefaultConfig = (config) =>
  merge({ path: '.' }, config)

export const add5 = add(5)

The first example here is pretty obvious: in withDefaultConfig, merge is used purely as an implementation detail, so it’s safe, and it’s not part of the module’s interface. In add5, the example is a little trickier: the result of add(5) is a partially-applied function created by Ramda, so technically, a Ramda-created value is a part of this module’s interface. However, the contract add5 has with the outside world is simply that it is a JavaScript function that adds five to its argument, and it doesn’t depend on any Ramda-specific functionality, so ramda can safely be a non-peer dependency.

Now let’s look at another example using the jpeg image library:

import { Jpeg } from 'jpeg'

export const createSquareBuffer = (size, cb) =>
  createSquareJpeg(size).encode(cb)

export const createSquareJpeg = (size) =>
  new Jpeg(Buffer.alloc(size * size, 0), size, size)

In this case, the createSquareBuffer function invokes a callback with an ordinary Node.js Buffer object, so the jpeg library is an implementation detail. If that were the only function exposed by this module, jpeg could safely be a non-peer dependency. However, the createSquareJpeg function violates that rule: it returns a Jpeg object, which is an opaque value with a structure defined exclusively by the jpeg library. Therefore, a package with the above module must list jpeg as a peer dependency.

This sort of restriction works in reverse, too. For example, consider the following module:

import { writeFile } from 'fs'

export const writeJpeg = (filename, jpeg, cb) =>
  jpeg.encode((image) => writeFile(filename, image, cb))

The above module does not even import the jpeg package, yet it implicitly depends on the encode method of the Jpeg interface. Therefore, despite not even explicitly using it anywhere in the code, a package containing the above module should include jpeg as a peer dependency.

The key is to carefully consider what contract your modules have with their dependents. If those contracts involve other packages in any way, they should be peer dependencies. If they don’t, they should be ordinary dependencies.

Applying the npm model to other programming languages

The npm model of package management is more complicated than that of other languages, but it provides a real advantage: implementation details are kept as implementation details. In other systems, it’s quite possible to find yourself in “dependency hell”, when you personally know that the version conflict reported by your package manager is not a real problem, but because the package system must pick a single canonical version, there’s no way to make progress without adjusting code in your dependencies. This is extremely frustrating.

This sort of dependency isolation is not the most advanced form of package management in existence—indeed, far from it—but it’s definitely more powerful than most other mainstream systems out there. Of course, most other languages could not adopt the npm model simply by changing the package manager: having a global package namespace can prevent multiple versions of the same package being installed at a runtime level. The reason npm is able to do what it does is because Node itself supports it.

That said, the dichotomy between peer and non-peer dependencies is a little confusing, especially to people who aren’t package authors. Figuring out which packages need to go in which group is not always obvious or trivial. Fortunately, other languages might be able to help.

Returning to Haskell, its strong static type system would potentially allow this distinction to be detected entirely automatically, and Cabal could actually report an error when a package used in an exposed interface was not listed as a peer dependency (much like how it currently prevents importing a transitive dependency without explicitly depending on it). This would allow helper function packages to keep on being implementation details while still maintaining strong interface safety. This would likely take a lot of work to get just right—managing the global nature of typeclass instances would likely make this much more complicated than a naïve approach would accommodate—but it would add a nice layer of flexibility that does not currently exist.

From the perspective of JavaScript, npm has demonstrated that it can be a capable package manager, despite the monumental burden placed upon it by the ever-growing, ever-changing JS ecosystem. As a package author myself, I would implore other users to carefully consider the peer dependencies feature and work hard to encode their interfaces’ contracts using it—it’s a commonly misunderstood gem of the npm model, and I hope this blog post helped to shed at least a little more light upon it.

Climbing the infinite ladder of abstraction

2016-08-11T00:00:00Z

I started programming in elementary school.

When I was young, I was fascinated by the idea of automation. I loathed doing the same repetitive task over and over again, and I always yearned for a way to solve the general problem. When I learned about programming, I was immediately hooked: it was so easy to turn repetitive tasks into automated pipelines that would free me from ever having to do the same dull, frustrating exercise ever again.

Of course, one of the first things I found out once I’d started was that nothing is ever quite so simple. Before long, my solutions to eliminate repetition grew repetitive, and it became clear I spent a lot of time typing out the same things, over and over again, creating the very problem I had initially set out to destroy. It was through this that I grew interested in functions, classes, and other repetition-reducing aids, and soon enough, I discovered the wonderful world of abstraction.

The brick wall of inexpressiveness

When I started programming, I was mostly playing with ActionScript and Java, just tinkering with things and seeing what I could come up with. I had quite a lot of fun, and the joy of solving problems hooked me almost immediately, but I also ran into frustrations pretty quickly. Specifically, I started writing a lot of code that looked like this:

public String getName() {
  return this.name;
}

public void setName(String name) {
  this.name = name;
}

This is a bit of a cheap example, given that Java getters and setters are something of a programming language punching bag at this point, but I really did write them, and I really did get frustrated by them! I learned object-oriented design patterns, and I pored over books, forum threads, blog posts, and Stack Overflow questions about how to structure code to prevent spaghetti, but no matter how hard I tried, I kept having to type things that looked suspiciously similar to each other.

It was really quite frustrating, because no matter how I approached the problem, I ended up with a boilerplate-heavy mess. The whole reason I got started programming was to avoid this sort of thing, so what could I do? Well, it became increasingly obvious to me that Java had to go, and I needed to try something else. I started learning two very different programming languages, JavaScript and Objective-C, and I liked them both, for different reasons.

When I learned JavaScript, I discovered the closure, the first-class function, and I was entranced by it. Through jQuery, I learned of its power to design APIs that could be fun to use, dropping the boring, “heavy” feeling that Java carried around everywhere. With Objective-C, on the other hand, I learned about the power of a more dynamic object system, something with interesting syntax and the ability to handle “message passing” at a far higher level than Java ever could.

Both of these languages were flawed, as all languages are, but they opened my mind to the idea that programming languages could drastically influence the way I thought about problem solving, and they set me on a quest to find the programming language that would eliminate boilerplate once and for all.

Discovering Lisp

Over the next few years, I grew to appreciate JavaScript’s small, simple core, despite rather disliking its object system and poor faculties for user-friendly data modeling. I pored over its history, and I found out that its design was heavily influenced by an obscure little language called Scheme, as well as an even more obscure language called Self, and a part of me started to wonder what it would be like to incorporate those languages’ ideas without some of the compromises JavaScript had made.

This idea lingered in the back of my head for a couple years, and while I tried to play with Scheme a couple times, it was simply too inaccessible for me. I was used to languages with powerful, easy to use IDEs, and when I found myself with nothing more than a command-line executable and rather scarce documentation, I was at a loss for how to begin. Even if I could do math in the REPL, where could I go from there? I’d started programming by building games, then websites. What could I possibly do with Scheme?

The language (or rather, its lack of an ecosystem) proved too intimidating for me at that young age, but the idea of Lisp’s homoiconicity stuck with me. Eventually, I started to design my very own programming language, a highly dynamic Lisp with a prototypal object system called Sol. I worked on it for about a year, and when I was done with it, it had a not-too-shabby complement of features: it had lambdas, macros, a fully-featured object model, and a CommonJS-esque module system, complete with the ability to dynamically import arbitrary C extensions. It was by far the largest project I’d ever worked on, and when I was done, I was pretty pleased.

Unfortunately, it was also abysmally slow.

I turned to a local college to find some people who could give me feedback and maybe point me in the right direction, and someone told me about another obscure programming language called Racket. At about the same time, someone pointed me to a totally different language called Haskell. This was uncharted territory for me, and for a while, I didn’t really explore either of those languages further. Eventually, though, I dove into them in earnest, and what I found has dramatically altered my perspective on programming since then.

A journey into complexity

Fast forward about three years, and today, I am employed writing Haskell, and I spend most of my free time writing Racket. These languages left a mark on me, and while I’ve learned so much more since then, I find myself continually bucking the mainstream and coming back to functional programming, hygienic macros, and possibly the most powerful type system in existence in a production-ready programming language.

I’ve also started realizing something else, though: the languages I’ve settled into are really complicated.

When I started programming, I thought about things like numbers, text, and shapes on a screen. Before long, I learned about functions, then classes, then message-passing and lambdas. I dove into macros and typeclasses, and now I speak in functors and monads, sets of scopes and internal definition contexts, and parser combinators and domain specific languages.

Why?

Sometimes I talk to fellow programmers, and they are horrified by the types of terms I fling around. “Why would you ever need something called a ‘monad’?” they ask, completely perplexed. “Macros are confusing,” they argue. “Being explicit is better.”

Obviously, I disagree, but why? What have I given up? If my fellow programmers cannot understand what I’m writing, is it actually worth it?

I’ve searched for years to find a programming language that will eliminate boilerplate, that will allow me to express my ideas succinctly and cleanly, that will let me turn hard problems into trivial ones, and I’ve discovered two completely different approaches to tackling those issues. Racket has macros, and Haskell has its fancy type system. Both of these things are lightyears ahead of where I was nearly a decade ago, writing dozens of lines of repetitive Java that ultimately did very little, but I’m still dealing with the same problems.

Racket knows too little about my program—it can’t figure out what I mean based on the type of thing I’m operating on because it is (mostly) dynamically typed. I still have to clarify myself and write things that feel redundant because the computer isn’t smart enough to figure out the “obvious”. Similarly, Haskell is too limiting—the compiler cannot deduce constraints I can solve in my head in seconds, and its syntax is not extensible like Racket’s is. Every day, I peer into piles upon piles of monadic computation, and really, what have I gained?

Improvement, but never mastery

Like almost anything in life, programming is not really a perfectible art. There’s always some unlearned skill or undiscovered technique, and part of this potential for perpetual self-improvement is one of the things that I find so attractive about the field. That said, I think it is reasonable to say that certain languages have higher ceilings than others.

For example, I am pretty confident that I get JavaScript. The language has lots of nooks and crannies that I don’t completely understand, but I feel pretty confident that I understand its semantics well enough to be able to grasp any piece of JavaScript code without too much incredulity. Now, that’s not to say that JavaScript is a simplistic language (far from it), but most of the ways I improve my JavaScripting abilities are learning new techniques within the language, not entirely new linguistic constructs.

On the other hand, languages like Haskell and Racket tend to blur the line. I feel like I have a good grasp of Haskell’s core, but do I have a good intuition for laziness? Do I completely grok type families? What about TypeInType? Ultimately, I have to come to the conclusion that I do not fully understand Haskell, much less a lot of the advanced category theory that composes some of its most powerful libraries. Racket manages to blur the line between language and library even further, and while I consider myself a decent Racketeer, I absolutely do not have a good grasp on all the intricacies of Racket’s macro system.

This is especially obvious to me at work, given that I write Haskell in a team setting. Just like back when I was writing Java, I end up with solutions that don’t satisfy me, and I reach for increasingly powerful constructs to help alleviate my qualms. Sometimes, I find myself cracking out DataKinds, and it might even help my problem, but there’s a cost: my coworkers are sometimes confused.

Every time I climb to the next rung on the ladder of abstraction, those only a couple rungs below me (even if we’re all hundreds of rungs up!) find themselves perplexed. In the worst case, people may even blame their confusion on their own inadequacy or lack of skill. This is terrible, especially when I know that, by the time they’ve caught up, I’ll be off playing with some new toy: comonads or type families or classy lenses. The cycle continues, and nobody is ever truly satisfied—I always want to find a new abstraction that will make things simpler, and those just a couple steps behind me struggle to keep up.

Of course, I experience it from the opposite perspective just as often: I delve into Edward Kmett’s fancier libraries or Phil Freeman’s blog posts about category theory, and I recognize that I am rather lost. Sometimes, I find myself understanding things, but just as often, I cannot wrap my head around the concepts being discussed. I may figure them out eventually, sure, but by then everyone else has moved on to even more advanced things, and still, none of them truly solve my problems.

Ultimately, it all has (at least a little) value

It would be nice to think about all that and say, well, “Let’s finally break the cycle. Let’s stop deluding ourselves into thinking our solutions to our self-made problems are actually solving anything.” It would be great if I could tell myself that, but I unfortunately really can’t.

The scariest part of all is that I think it’s completely worthwhile.

So many of these more and more complicated abstractions are trying to do the same basic thing: come up with a better way of modeling the problem. In some sense, that’s all programming really is: modeling a domain in a way that can be leveraged by a digital computer. Our increasingly complicated DSLs seem unnecessarily complicated and increasingly removed from reality, but that’s only because we’re getting better at creating languages that are closer to our domains without the baggage of preconceptions that came before us.

The downside is that, without an understanding of those preconceptions, a lot of what we come up with seems like patent gibberish to those unaware of our languages’ history.

Most programmers, even those who have never seen BASIC before, can figure out what this snippet does:

10 INPUT "What is your name: "; U$
20 PRINT "Hello "; U$

On the other hand, very few would probably understand this one:

-- | A class for categories.
--   id and (.) must form a monoid.
class Category cat where
    -- | the identity morphism
    id :: cat a a

    -- | morphism composition
    (.) :: cat b c -> cat a b -> cat a c

Yet very few new programs are being written in BASIC, and lots are being written in Haskell.

Even one of the most popular, fastest-growing programming languages in the world, JavaScript, a language considered relatively accessible compared to things like Haskell, would likely be incomprehensible to a programmer not familiar with its syntax:

export const composeWithProps = curry((a, parentProps, b) => {
  const composed = childProps =>
    createElement(a, parentProps, createElement(b, omit(['children'], childProps), childProps.children));
  // give the composed component a pretty display name for debugging
  composed.displayName = `Composed(${getDisplayName(a)}, ${getDisplayName(b)})`;
  return composed;
});

Moving towards increasingly specialized syntaxes is not inherently bad—it can often be indicative of a more streamlined, domain-specific way of thinking—but while it may dramatically increase the productivity of a seasoned programmer, it can be nothing short of baffling to a newcomer.

That, specifically, is the crux of my fear: are we always aware of who we are optimizing for? I do not have a moral problem with writing code to optimize concision for seasoned programmers; after all, brevity is one of the primary ways code is made more readable (verbosity is the enemy of understanding). However, when that concision comes at the cost of beginners’ understanding, the picture becomes a bit more grey. It is not wrong to write things that are highly optimized for one’s own knowledge and understanding, and establishing a group of such people can make for an extremely productive team. It’s just also important to understand that others will likely be confused, and without being willing to invest the time and money into education, smart, diligent people will still fail to grasp the concepts, and they will likely be wholly uninterested in them.

Reactionary anti-intellectualism and the search for moderation

I have noticed lately that people close to my circles have started regularly slinging insults at people who work in highly specialized notation. Math, including things like category and type theory, has become an especially acceptable punching bag. I recently tweeted a picture of some rather dense mathematics from a paper I’d read, and I was frankly disturbed at some of the vitriolic responses. Academia is sometimes described as “masturbatory”, and honestly, that is both offensive and hypocritical.

Mathematical notation is not perfect, no more than dense Haskell, heavily metaprogrammed Ruby, or IIFE-packed JavaScript. Still, it serves a purpose, and sometimes spelling things out is neither practically feasible nor a theoretical improvement. Programmers would not take kindly to being asked to write all their code out as prose, nor would they like being told that using higher-order functions like map should be banned because they are too confusing and not immediately self-explanatory.

I am glad that people are focusing on usability and accessibility more than ever, and I think that’s one of the areas I’m the most interested in. I want to get the best of both worlds: I aim to write code in a highly concise, precise style, but I try and produce intuitive interfaces with human-readable errors upon failure. To me, a user-hostile yet technically functional library is a buggy one, and I would happily file a bug report about a confusing API or error message.

Abstraction is what seems to make programming possible, and indeed, it’s what makes most modern technology possible. It’s what allows people to drive a car without knowing how an internal combustion engine works, and it’s what allows people to browse the web without having a deep understanding of internet protocol. In programming, abstraction serves a similar purpose. Of course, just like all tools, abstractions can have rather different goals: the average user will not pick up Photoshop in a day, but a power user is not going to be satisfied with Paint.

Programmers are professionals, and we work in a technical domain. I am absolutely of the belief that programming, like any other field, is not always about what comes easiest: sometimes it’s important to sit down and study for a while to grok a particularly complicated concept, and other times, it’s simply important to learn by trying, failing, and asking questions. I strive to find that blend of accessible, concise, and robust, and just like everything else, that target shifts depending on the situation and people I’m working with.

I honestly don’t know if Racket and Haskell are worth their costs in complexity. At the end of the day, maybe what really matters is writing simple, consistent things that other people can understand. I really hope that there is a place for more powerful languages within a team, but there’s something to be said about which languages tend to get the most popular.

Ultimately, though, I am just trying to be aware of the tradeoffs I’m making, the benefits I’m getting, and the impact on those I’m working with. I will continue to search for abstractions that can better fit my needs, and I am sure I will keep on climbing the ladder of abstraction for years to come—I just really hope I’m not wasting my time.

Four months with Haskell

2016-06-12T00:00:00Z

At the end of January of this year, I switched to a new job, almost exclusively because I was enticed by the idea of being able to write Haskell. The concept of using such an interesting programming language every day instead of what I’d been doing before (mostly Rails and JavaScript) was very exciting, and I’m pleased to say that the switch seems to have been well worth it.

Haskell was a language I had played with in the past but never really used for anything terribly practical, but lately I think I can confidently say that it really is an incredible programming language. At the same time, it has some significant drawbacks, too, though probably not the ones people expect. I certainly wasn’t prepared for some of the areas where Haskell would blow me away, nor was I capable of realizing which parts would leave me hopelessly frustrated until I actually sat down and started writing lots and lots of code.

Dispelling some myths

Before moving on and discussing my experiences in depth, I want to take a quick detour to dispel some frequent rumors I hear about why Haskell is at least potentially problematic. These are things I hear a lot, and nothing in my experience so far would lead me to believe these are actually true. Ultimately, I don’t want to spend too much time on these—I think that, for the most part, they are nitpicks that people complain about to avoid understanding the deeper and more insidious problems with the language—but I think it’s important to at least mention them.

Hiring Haskell developers is not hard

I am on the first Haskell team in my company, and I am among the first Haskell developers we ever hired. Not only were we hiring without much experience with Haskell at all, we explicitly did not want to hire remote. Debate all you like about whether or not permitting remote work is a good idea, but I don’t think anyone would dispute that this constraint makes hiring much harder. We didn’t have any trouble finding a very large stream of qualified applicants, and it definitely seems to have dispelled any fears that we would have trouble finding new candidates in the future.

Performing I/O in Haskell is easy

Haskell’s purity is a point of real contention, and it’s one of the most frustrating complaints I often hear about Haskell. It is surprisingly common to hear concerns along the lines of “I don’t want to use Haskell because its academic devotion to purity sounds like it would make it very hard to get anything done”. There are very valid reasons to avoid Haskell, but in practice, I/O is not one of them. In fact, I found that isolating I/O in Haskell was much the same as isolating I/O in every other language, which I need to do anyway to permit unit testing.

...you do write deterministic unit tests for your impure logic, right?

Working with lots of monads is not very difficult

The “M word” has become a running joke about Haskell, yet it actually comes up fairly rarely within the Haskell community. To be clear, there is no doubt in my mind that monads make Haskell intimidating and provide a steep learning curve for new users. The proliferation of the joke that monads are impossible to explain, to the point of becoming mythologized, is absolutely indicative of a deeper problem about Haskell’s accessibility. However, once people learn the basics about monads, I’ve found that applying them is just as natural as applying any other programming pattern.

Monads are used to assist the programmer, not impede them, and they really do pay off in practice. When something has a monadic interface, there’s a decent chance I already know what that interface is going to do, and that makes working with lots of different monads surprisingly easy. Admittedly, I do rely very, very heavily on tooling to help me out here, but with things like mouseover type tooltips, I’ve found that working with a variety of different monads and monad transformers is actually quite pleasant, and it makes things very readable!

Haskell: the good parts

With the disclaimers out of the way, I really just want to gush for a little bit. This is not going to be an objective, reasoned survey of why Haskell is good. I am not even really going to touch upon why types are so great and why purity is so wonderful—I’d love to discuss those in depth, but that’s for a different blog post. For now, I just want to touch upon the real surprises, the real things that made me excited about Haskell in ways I didn’t expect. These are the things that my subjective little experience has found fun.

Language extensions are Haskell

There was a time in my life when I spent a lot of time writing C. There are a lot of compilers for C, and they all implement the language in subtly different but often incompatible ways, especially on different platforms. The only way to maintain a modicum of predictability was to adhere to the standards religiously, even when certain GCC or MSVC extensions seem tantalizingly useful. I was actually bitten a few times by real instances where I figured I’d just use a harmless extension that was implemented everywhere, then found out it worked slightly differently across different compilers in a particular edge case. It was a learning experience.

It seems that this fear provides a very real distrust for using GHC’s numerous language extensions, and indeed, for a long time, I felt that it was probably an admirable goal to stick to Haskell 98 or Haskell 2010 as closely as possible. Sometimes I chose a slightly more verbose solution that was standard Haskell to avoid turning on a trivial extension that would make the code look a little bit cleaner.

About a year later, I’m finding that attitude was not only a mistake, but it forced me to often completely miss out on a lot of Haskell’s core value. GHC won, and now GHC and Haskell are basically synonymous. With that in mind, the portability concerns of language extensions are a bit of a non-issue, and turning them on is a very good idea! Some extensions are more than a little dangerous, so they cannot all be turned on without thinking, but the question is absolutely not “Is using language extensions a good idea?” and more “Is using this language extension a good idea?”

This is important, and I bring it up for a reason: so much of the awesomeness of Haskell is locked behind language extensions. Turning a lot of these on is one of the main things that made me really start to see how incredibly powerful Haskell actually is.

Phantom types

I’m going to start out by talking about phantom types, which are a pretty simple concept but a powerful one, and they serve as the foundation for a lot of other cool type-level tricks that can make Haskell extremely interesting. The basic idea of a phantom type is simple; it’s a type parameter that isn’t actually used to represent any particular runtime value:

newtype Id a = Id Text

This type represents an id for some kind of value, but although the kind of value is specified in the type as the a type parameter, it isn’t actually used anywhere on the data definition—no matter what a is, an Id is just a piece of text. This makes it possible to write functions that operate on specific kinds of ids, and those invariants will be statically checked by the compiler, even though the runtime representation is entirely identical:

fetchUser :: MonadDB m => Id User -> m User

Using FlexibleInstances, it’s also possible to create different instances for different kinds of ids. For example, it would be possible to have different Show instances depending on the type of id in question.

instance Show (Id User) where
  show (Id txt) = "user #" <> unpack txt

instance Show (Id Post) where
  show (Id txt) = "post #" <> unpack txt

This provides a simple framework for encoding entirely arbitrary information into the type system, then asking the compiler to actually check assertions about that information. This is made even more powerful with some other extensions, which I’ll talk about shortly.
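To make the idea concrete, here is a minimal, self-contained sketch of the snippets above. It makes a few assumptions that are not in the original: String stands in for Text so that only base is needed, User and Post are empty placeholder types, and describeUser and the two example ids are hypothetical names invented for illustration.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical placeholder types; they exist only to tag ids.
data User
data Post

-- The parameter 'a' is a phantom: it never appears on the right-hand side.
-- Assumption: String is used here instead of Text to avoid extra packages.
newtype Id a = Id String

instance Show (Id User) where
  show (Id txt) = "user #" ++ txt

instance Show (Id Post) where
  show (Id txt) = "post #" ++ txt

-- A hypothetical function that only accepts user ids.
describeUser :: Id User -> String
describeUser uid = "looking up " ++ show uid

exampleUserId :: Id User
exampleUserId = Id "42"

examplePostId :: Id Post
examplePostId = Id "42"

-- describeUser examplePostId
-- ^ would be rejected by the type checker: Id Post is not Id User,
--   even though both wrap exactly the same runtime value.
```

Both ids hold the same string at runtime, so the distinction costs nothing; it only exists for the type checker.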

Letting the compiler write code

One of the things I really dislike, more than most things, is boilerplate. A little bit of boilerplate is fine—even necessary at times—but as soon as I start wondering if a code generator would improve things, I think the programming language has pretty much failed me.

I write a lot of Racket because, in a sense, Racket is the ultimate boilerplate killer: the macro system is a first-class code generator integrated with the rest of the language, and it means that boilerplate is almost never an issue. Of course, that’s not always true: sometimes a bit of boilerplate is still necessary because macros cannot deduce enough information about the program to generate the code entirely on their own, and in Haskell, some of that information is actually present in the type system.

This leads to two absolutely incredible extensions, both of which are simple and related, but which actually completely change how I approach problems when programming. These two extensions are GeneralizedNewtypeDeriving and StandaloneDeriving.

Newtypes and type safety

The basic idea is that “newtypes” are just simple wrapper types in Haskell. This turns out to be extremely important when trying to find the value of Haskell because they allow you to harden type safety by specializing types to your domain. For example, consider a type representing a user’s name:

newtype Name = Name Text

This type is extremely simple, and in fact isn’t even at all different from a simple Text value with respect to its representation, since all combinations of Unicode characters are allowed in a name. So what’s the point of a separate type? Well, this allows Haskell to introduce actual compilation failures when two different kinds of textual data are mixed. This is not a new idea, and even in languages that don’t support this sort of thing, Joel Spolsky’s old blog post Making Wrong Code Look Wrong describes how it can be done by convention. Still, almost every modern language makes this possible: in C, it would be a single-member struct, in class-based OO languages, it would be a single-member class... this is not a complicated idea.

The difference lies in its usage. In other languages, this strategy is actually not very frequently employed for the simple reason that it is almost always extremely annoying. You are forced to do tons of wrapping/unwrapping, and at that point it isn’t really clear if you’re even getting all that much value out of the distinction when your first solution to a type mismatch is wrapping or unwrapping the value without a second thought. In Haskell, however, this can be heavily mitigated by asking the compiler to automatically derive typeclass implementations, which allow the unwrapping/wrapping to effectively happen implicitly for a constrained set of operations.

Using `GeneralizedNewtypeDeriving`

Consider the Name type once again, but this time, let’s derive a class:

newtype Name = Name Text
  deriving (IsString)

The IsString typeclass in Haskell allows custom types to automatically be created from string literals. It is not handled specially by Haskell’s deriving mechanism. Since Text implements IsString, an instance will be generated that simply defers to the underlying type, automatically generating the code to wrap the result up in a Name box at the end. This means that code like this will now just magically work (provided the OverloadedStrings extension is enabled, so that string literals can produce types other than String):

name :: Name
name = "Alyssa P. Hacker"

No boilerplate needs to be written! This is a neat trick, but it actually turns out to be far more useful than that simple example in practice. What really makes this functionality shine is when you want to derive some kinds of functionality but disallow some others. For example, using the text-conversions package, it’s possible to do something like this:

newtype Id a = Id Text
  deriving (Eq, Show, ToText, ToJSON)

This creates an opaque Id type, but it automatically generates conversions to textual formats. However, it does not automatically create FromText or FromJSON instances, which would be dangerous because decoding Ids can potentially fail. It’s then possible to write out those instances manually to preserve type safety:

instance FromText (Maybe (Id a)) where
  fromText str = if isValidId str then Just (Id str) else Nothing

instance FromJSON (Id a) where
  parseJSON (String val) = maybe (fail "invalid id") return (fromText val)
  parseJSON _            = fail "invalid id"
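The same pattern can be tried out without any extra packages. The following is only a sketch under some assumptions of my own: String replaces Text, and parseId (with its digits-only rule) is a hypothetical stand-in for the hand-written FromText and FromJSON instances above, not part of text-conversions or aeson.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Char (isDigit)
import Data.String (IsString)

-- IsString is not a stock-derivable class, so this instance comes from
-- GeneralizedNewtypeDeriving and simply reuses the instance for String.
newtype Name = Name String
  deriving (Eq, Show, IsString)

-- OverloadedStrings lets the literal be typed as a Name directly.
name :: Name
name = "Alyssa P. Hacker"

-- Conversions *out of* an Id are derived, but nothing that builds an Id
-- from raw input is derived, so construction has to go through parseId.
newtype Id a = Id String
  deriving (Eq, Show)

-- Hypothetical validation rule: an id is a non-empty run of digits.
parseId :: String -> Maybe (Id a)
parseId str
  | not (null str) && all isDigit str = Just (Id str)
  | otherwise                         = Nothing
```

In a real module, the Id constructor would also be left out of the export list, so that parseId is the only way for other code to obtain an Id.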

Using `StandaloneDeriving`

The ordinary deriving mechanism is extremely useful, especially when paired with the above, but sometimes it is desirable to have a little bit more flexibility. In these cases, StandaloneDeriving can help.

Take the Id example again: it has a phantom type, and simply adding something like deriving (ToText) will derive ToText instances for all kinds of ids. It is potentially useful, however, to derive instances for more specific id types. Using standalone deriving constructs permits this sort of flexibility.

deriving instance ToText (Id User)

instance ToText (Id Post) where
  toText = postIdToText

This is an example where GHC language extensions end up becoming significantly more than the sum of their parts, which seems to be a fairly frequent realization. The StandaloneDeriving mechanism is a little bit useful without GeneralizedNewtypeDeriving, but when combined, they are incredibly powerful tools for getting a very fine-grained kind of type safety without writing any boilerplate.
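For anyone who wants to experiment with this combination, here is a small sketch that compiles on its own. The ToText class below is a local, simplified stand-in that I am assuming for illustration (it returns a String and is not the class from text-conversions), and the "post-" prefix is an arbitrary made-up format.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE StandaloneDeriving #-}

-- Simplified, hypothetical stand-in for a text conversion class.
class ToText a where
  toText :: a -> String

instance ToText String where
  toText = id

-- Hypothetical placeholder types used only as phantom tags.
data User
data Post

newtype Id a = Id String

-- Derived through the newtype: user ids reuse the String instance as-is.
deriving instance ToText (Id User)

-- Written by hand: post ids get their own (made-up) textual format.
instance ToText (Id Post) where
  toText (Id txt) = "post-" ++ txt
```

Ids tagged with any other type have no ToText instance at all, so trying to convert them is a compile-time error rather than a silent default.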

DataKinds are super cool, with caveats

Phantom types are quite wonderful, but they can only encode types, not arbitrary data. That’s where DataKinds and KindSignatures come in: they allow lifting arbitrary datatypes to the type level so that things that would normally be purely runtime values can be used at compile-time as well.

The way this works is pretty simple—when you define a datatype, you also define a “datakind”:

data RegistrationStatus = Registered | Anonymous

Normally, the above declaration declares a type, RegistrationStatus, and two data constructors, Registered and Anonymous. With DataKinds, it also defines a kind, RegistrationStatus, and two type constructors, Registered and Anonymous.

If that’s confusing, the way to understand that is to realize there is a sort of natural ordering here: types describe values, and kinds describe types. Therefore, turning on DataKinds “lifts” each definition by a single level, so types become kinds and values become types. This permits using these things at the type level:

newtype UserId (s :: RegistrationStatus) = UserId Text

In this example, UserId still has a single phantom type variable, s, but this time it is constrained to the RegistrationStatus kind. Therefore, it can only be Registered or Anonymous. This cooperates well with the aforementioned StandaloneDeriving mechanism, and it mostly provides a convenient way to constrain type variables to custom kinds.
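As a rough illustration of what that buys you, the sketch below restricts one function to registered users and provides a conversion for anonymous ones. The function names sendNewsletter and register are hypothetical, and String is assumed in place of Text to keep the example dependency-free.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

data RegistrationStatus = Registered | Anonymous

-- 's' is a phantom parameter restricted to the RegistrationStatus kind.
newtype UserId (s :: RegistrationStatus) = UserId String

-- Hypothetical operation that only makes sense for registered users.
-- The tick marks 'Registered as the promoted, type-level constructor.
sendNewsletter :: UserId 'Registered -> String
sendNewsletter (UserId txt) = "sending newsletter to user " ++ txt

-- Hypothetical conversion; a real one would presumably perform some
-- actual registration step before changing the tag.
register :: UserId 'Anonymous -> UserId 'Registered
register (UserId txt) = UserId txt

anonymousUser :: UserId 'Anonymous
anonymousUser = UserId "42"

-- sendNewsletter anonymousUser
-- ^ rejected by the type checker until the id goes through 'register'.
-- UserId Bool
-- ^ also rejected: Bool does not have kind RegistrationStatus.
```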

In general, DataKinds is a much more powerful extension, allowing things like type-level natural numbers or strings, which can be used to perform actual type-level computation (especially in combination with TypeFamilies) or a sort of metaprogramming. In some cases, they can even be used to emulate things you can do with dependent types.

I think DataKinds are a very cool Haskell extension, but there are a couple caveats. One of the main ones is how new kinds are defined: DataKinds “hijacks” the existing datatype declaration syntax by making every single datatype declaration define a type and a kind. This is a little confusing, and it would be nice if a different syntax was used so that each could be defined independently.

Similarly, it seems that a lot of work is being done to allow using runtime values at the type level, but I wonder if people will ever need to use, say, runtime values at the kind level. This immediately evokes thoughts of Racket’s phase-based macro system, and I wonder if some of this duplication would be unnecessary with something similar.

Food for thought, but overall, DataKinds are a very nice addition to help with precisely and specifically typing particular problems.

Typeclasses can emulate effects

This is something that I’ve found interesting in my time writing Haskell because I have no idea if it’s idiomatic or not, but it seems pretty powerful. The initial motivator for this idea was figuring out how to test our code without constantly dropping into IO.

More generally, we wanted to be able to unit test by “mocking” out collaborators, as it would be described in object oriented programming. I was always semi-distrustful of mocking, and indeed, it seems likely that it is heavily abused in certain circles, but I’ve come to appreciate that sometimes it is important to stub things out, even in pure code.

As an example, consider some code that needs access to the current time. This is something that would normally require IO, but we likely want to be able to use the value in a pure context without “infecting” the entire program with IO types. In Haskell, I have generally seen three ways of handling this sort of thing:

Just inject the required values into the function and produce them “higher up” where I/O is okay. If threading the value around becomes too burdensome, use a Reader monad.
Use a free monad or similar to create a pure DSL of sorts, then write interpreters for various implementations, one of which uses IO.
Create custom monadic typeclasses that provide interfaces to the functionality you want to perform, then create instances, one of which is an instance over IO.

This last approach seems to be less common in Haskell, but it’s the approach we took, and it seems to work out remarkably well. Returning to the need to get the current time, we could pretty easily write such a typeclass to encode that need:

class Monad m => CurrentTime m where
  getCurrentTime :: m UTCTime

Now we can write functions that use the current time:

validateToken :: CurrentTime m => Token -> m Bool
validateToken tok = do
  currentTime <- getCurrentTime
  return (tokenExpirationDate tok > currentTime)

Now, we can write instances for CurrentTime that will allow us to run the same code in different contexts:

newtype AppM a = AppM { runAppM :: IO a }
  deriving (Functor, Applicative, Monad, MonadIO)

newtype TestM a = TestM (Identity a)
  deriving (Functor, Applicative, Monad)

runTestM :: TestM a -> a
runTestM (TestM x) = runIdentity x

instance CurrentTime AppM where
  getCurrentTime = liftIO Data.Time.Clock.getCurrentTime

instance CurrentTime TestM where
  getCurrentTime = return $ posixSecondsToUTCTime 0
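To show how the test side of this plays out, here is a pared-down sketch that needs only base. It rests on assumptions that differ from the code above: a Timestamp of plain seconds replaces UTCTime, Token is a hypothetical record with just an expiration field, and the test clock is fixed at an arbitrary value of 1000.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.Functor.Identity (Identity, runIdentity)

-- Assumption: seconds since the epoch stand in for UTCTime.
type Timestamp = Integer

-- Hypothetical token type carrying only its expiration time.
newtype Token = Token { tokenExpirationDate :: Timestamp }

class Monad m => CurrentTime m where
  getCurrentTime :: m Timestamp

-- The logic under test never mentions IO.
validateToken :: CurrentTime m => Token -> m Bool
validateToken tok = do
  currentTime <- getCurrentTime
  return (tokenExpirationDate tok > currentTime)

-- A pure monad for tests, built on Identity.
newtype TestM a = TestM (Identity a)
  deriving (Functor, Applicative, Monad)

runTestM :: TestM a -> a
runTestM (TestM x) = runIdentity x

-- In tests the clock is frozen at an arbitrary, fixed instant.
instance CurrentTime TestM where
  getCurrentTime = return 1000
```

With this in place, runTestM (validateToken (Token 2000)) is an ordinary pure Bool, so a test can compare it directly against the expected answer without touching the real clock.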

Where this really starts to shine is when adding additional effects. For example, the above token validation function might also need information about some kind of secret used for signing. Under this model, it’s just another typeclass:

class Monad m => TokenSecret m where
  getTokenSecret :: m Secret

validateToken :: (CurrentTime m, TokenSecret m) => Token -> m Bool
validateToken tok = do
  currentTime <- getCurrentTime
  secret <- getTokenSecret
  return (tokenExpirationDate tok > currentTime
       && verifySignature tok secret)

Of course, so far all of these functions have been extremely simple, and we’ve basically been using them as a glorified reader monad. In practice, though, we use this pattern for lots more than just retrieving values. For example, we might have a typeclass for database interactions:

class Monad m => Persistence m where
  fetchUser :: Id User -> m (Maybe User)
  insertUser :: User -> m (Either PersistenceError (Id User))

With all of this done, it becomes incredibly easy to see which functions are using which effects:

postUsers
  :: (CurrentTime m, Persistence m, TokenSecret m)
  => User -> m Response
postUsers = ...

getHealthcheck
  :: CurrentTime m
  => m Response
getHealthcheck = ...

There’s no need to perform any lifting, and this all seems to scale quite nicely. We’ve written some additional utilities to help write tests against functions using these kinds of monadic interfaces, and even though there’s a little bit of annoying boilerplate in a few spots, overall it seems to work quite elegantly.

I’m not entirely sure how common this is in the Haskell community, but it’s certainly pretty neat how easy it is to get nearly all of the benefits of effect types in other languages simply by composing some of Haskell’s simplest features.

Atom’s ide-haskell tooling is invaluable

Alright, so, confession time: I don’t use Emacs.

I know, I know, how is that possible? I write Lisp, after all. Well, honestly, I tried picking it up a number of times, but none of those times did I get far enough to ditch my other tools. For Racket work, I use DrRacket, but for almost everything else, I use Atom.

Atom has a lot of flaws, but it’s also pretty amazing in places, and I absolutely love the Haskell tooling written by the wonderful atom-haskell folks. I use it constantly, and even though it doesn’t always work perfectly, it works pretty well. When it has problems, I’ve at least figured out how to get it working correctly.

This is probably hard to really explain without seeing it for yourself, but I’ve found that I basically depend on this sort of tooling to be fully productive in Haskell, and I have no problem admitting that. The ability to get instant feedback about type errors tied to visual source locations, to be able to directly manipulate the source by selecting expressions and getting type information, and even the option to get inline linter suggestions means I spend a lot less time glancing at the terminal, and even less time in the REPL.

The tooling is far from perfect, and it leaves a lot to be desired in places (the idea of using that static information for automated, project-wide refactoring a la Java is tantalizing), but most of those things are ideas of what amazing things could be, not broken or missing essentials. I am pretty satisfied with ide-haskell right now, and I can only hope it continues to get better and better.

Frustrations, drawbacks, and pain points

Haskell is not perfect. In fact, far from it. There is a vast array of little annoyances that I have with the language, as is the case with any language. Still, there are a few overarching problems that I would really like to at least mention. These are the biggest sources of frustration for me so far.

Purity, failure, and exception-handling

One of Haskell’s defining features is its purity—I don’t think many would disagree with that. Some people consider it a drawback, others consider it one of its greatest boons. Personally, I like it a lot, and I think one of the best parts about it is how it requires the programmer to be incredibly deliberate about failure.

In many languages, when looking up a value from a container where the key doesn’t exist, there are really two ways to go about expressing this failure:

Throw an exception.
Return null.

The former is scary because it means any call to any function can make the entire program blow up, and it’s often impossible to know which functions even have the potential to throw. This creates a certain kind of non-local control flow that can sometimes cause a lot of unpredictability. The second option is much the same, especially when any value in a program might be null; it just defers the failure.

In languages with option types, this is somewhat mitigated. Java now has option types, too, but they are still frequently cumbersome to use because there is nothing like monads to use to simply chain operations together. Haskell, in comparison, has an incredible complement of tools to simply handle errors without a whole lot of burden on the programmer, and I have found that, in practice, this is actually helpful and I really do write better error-handling code.

First, the good parts

I have seen a comparison drawn between throwing checked exceptions and returning Maybe or Either types, but in practice the difference is massive. Handling checked exceptions is a monotonous chore because they are not first-class values, they are actually entirely separate linguistic constructs. Consider a library that throws a LibraryException, and you want to wrap that library and convert those exceptions to ApplicationExceptions. Well, have fun writing this code dozens of times:

try {
  x = doSomething();
} catch (LibraryException ex) {
  throw ApplicationException.fromLibraryException(ex);
}

// ...

try {
  y = doSomethingElse();
} catch (LibraryException ex) {
  throw ApplicationException.fromLibraryException(ex);
}

In Haskell, failure is just represented by first-class values, and it’s totally possible to write helper functions to abstract over that kind of boilerplate:

libraryToApplication :: LibraryError -> ApplicationError
libraryToApplication = ...

liftLibrary :: Either LibraryError a -> Either ApplicationError a
liftLibrary = mapLeft libraryToApplication

Now, that same boilerplate-y code becomes nearly invisible:

x <- liftLibrary doSomething

-- ...

y <- liftLibrary doSomethingElse

This might not seem like much, but it really cuts down on the amount of visual noise, which ends up making all the difference. Boilerplate incurs a cost much bigger than simply taking the time to type it all out (though that’s important, too): the cognitive overhead of parsing which parts of a program are boilerplate has a significant impact on readability.
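
For a version of this pattern that compiles on its own, here is a minimal sketch. The two error types and the doSomething and doSomethingElse values are hypothetical stand-ins, and first from Data.Bifunctor (part of base) plays the role of mapLeft:

```haskell
import Data.Bifunctor (first)

-- Hypothetical error types standing in for a library's and an application's.
newtype LibraryError = LibraryError String
  deriving (Eq, Show)

newtype ApplicationError = ApplicationError String
  deriving (Eq, Show)

libraryToApplication :: LibraryError -> ApplicationError
libraryToApplication (LibraryError msg) =
  ApplicationError ("library failure: " ++ msg)

-- 'first' maps over the Left side of an Either, just like 'mapLeft'.
liftLibrary :: Either LibraryError a -> Either ApplicationError a
liftLibrary = first libraryToApplication

-- Hypothetical library operations: one succeeds, one fails.
doSomething :: Either LibraryError Int
doSomething = Right 20

doSomethingElse :: Either LibraryError Int
doSomethingElse = Left (LibraryError "no such record")

-- The Either monad stops at the first Left, already converted.
combined :: Either ApplicationError Int
combined = do
  x <- liftLibrary doSomething
  y <- liftLibrary doSomethingElse
  pure (x + y)
```

Because the conversion lives in one ordinary function, changing how library errors are translated means editing a single definition rather than every call site.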

So what’s the problem?

If error handling is so great in Haskell, then why am I putting it under the complaints section? Well, it turns out that not everyone seems to think it’s as great as I make it out to be because people seem to keep writing Haskell APIs that throw exceptions!

Despite what some purists would have you believe, Haskell has exceptions, and they are not uncommon. Lots of things can throw exceptions, some of which are probably reasonable. Failing to connect to a database is a pretty catastrophic error, so it seems fair that it would throw. On the other hand, inserting a duplicate record is a pretty normal operation, so it seems like that should not throw.

I mostly treat exceptions in Haskell as unrecoverable catastrophes. If I throw an error in my code, I do not intend to catch it. That means something horrible happened, and I just want that horribleness to show up in a log somewhere so I can fix the problem. If I care about failure, there are better ways to handle that failure gracefully.

It’s also probably worth noting that exceptions in Haskell can be thrown from anywhere, even pure code, but can only be caught within the IO monad. This is especially scary, but I’ve seen it happen in actual libraries out in the wild, even ones that the entire Haskell ecosystem is built on. One of the crowning examples of this is the text package, which provides a function called decodeUtf8 to convert bytestrings into text. Its type is very simple:

decodeUtf8 :: ByteString -> Text

But wait, what if the bytestring is not actually a valid UTF-8 string?

Boom. There goes the application.

Okay, okay, well, at least the text package provides another function, this one called decodeUtf8', which returns an Either. This is good, and I’ve trained myself to only ever use decodeUtf8', but it still has some pretty significant problems:

The safe version of this function is the “prime” version, rather than the other way around, which encourages people to use the unsafe one. Ideally, the unsafe one should be explicitly labeled as such... maybe call it unsafeDecodeUtf8?
This is not a hypothetical problem. When using a Haskell JWT library, we found a function that converts a string into a JWT. Since not all strings are JWTs, the library intelligently returns a Maybe. Therefore, we figured we were safe.
A couple weeks later, we found that providing this function with invalid data was returning HTTP 500 errors. Why? Our error handling was meticulous! Well, the answer was a decodeUtf8 call, hidden inside of the JWT library. This is especially egregious, given that the API it exposed returned a Maybe anyway! It would have been trivial to use the safe version there, instead, but the poor, misleading name led the library developer to overlook the bug lurking in the otherwise innocuous function.
Even worse, this function was totally pure, and we used it in pure code, so we could not simply wrap the function and catch the exception. We had two options: use unsafePerformIO (yuck!) or perform a check before handing the data to the buggy function. We chose the latter, but in some cases, I imagine that check could be too difficult to be feasible.
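
As a rough illustration of the safe route, the sketch below wraps decodeUtf8' so that invalid input becomes a Nothing instead of an exception. The decodeMaybe and describe helpers and the sample byte strings are made up for this example; only decodeUtf8' comes from the text package:

```haskell
import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')

-- decodeUtf8' returns a Left on invalid input instead of throwing,
-- so the failure can be handled in pure code.
decodeMaybe :: BS.ByteString -> Maybe Text
decodeMaybe = either (const Nothing) Just . decodeUtf8'

-- A caller is forced to decide what to do with bad input.
describe :: BS.ByteString -> String
describe = maybe "invalid UTF-8" T.unpack . decodeMaybe

validBytes :: BS.ByteString
validBytes = BS.pack [0x68, 0x69]  -- the ASCII bytes for "hi"

invalidBytes :: BS.ByteString
invalidBytes = BS.pack [0xff]      -- 0xFF never occurs in valid UTF-8
```

A library that already returns a Maybe, like the JWT one described above, could fold this kind of result into its existing failure case.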

The point I’m trying to make is that this is a real problem, and it seems to me that throwing exceptions invalidates one of the primary advantages of Haskell. It disappointed me to realize that a significant amount of code written by FP Complete, one of the primary authors of some of the most important “modern Haskell” code in existence (including Stack), seems to very frequently expose APIs that will throw.

I’m not sure how much of this stems from a fundamental divide in the Haskell ecosystem and how much it is simply due to Michael Snoyman’s coding style, given that he is the primary author of a number of these tools and libraries that seem very eager to throw exceptions. As just one example of a real situation in which we were surprised by this behavior, we used Snoyman’s http-client library and found that it mysteriously throws upon nearly any failure state:

A note on exceptions: for the most part, all actions that perform I/O should be assumed to throw an HttpException in the event of some problem, and all pure functions will be total. For example, withResponse, httpLbs, and BodyReader can all throw exceptions.

This doesn’t seem entirely unreasonable—after all, isn’t a failure to negotiate TLS fairly catastrophic?—until you consider our use case. We needed to make a subrequest during the extent of another HTTP request to our server, and if that subrequest fails, we absolutely need to handle that failure gracefully. Of course, this is not terrible given that we are in IO so we can actually catch these exceptions, but since this behavior was only noted in a single aside at the top of the documentation, we didn’t realize we were forgetting error handling until far too late and requests were silently failing.
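
One way to make that kind of failure visible again is to convert the exception back into a value at the call site. The sketch below uses try from Control.Exception with a hypothetical flakySubrequest action that throws an IOException; with http-client, the exception type would presumably be the HttpException named in the note above:

```haskell
import Control.Exception (IOException, try)

-- Hypothetical stand-in for an I/O action that reports failure by throwing.
flakySubrequest :: Bool -> IO String
flakySubrequest shouldFail
  | shouldFail = ioError (userError "connection refused")
  | otherwise  = pure "200 OK"

-- 'try' reifies the exception as a Left, so the failure shows up in the
-- type and the caller has to deal with it.
safeSubrequest :: Bool -> IO (Either String String)
safeSubrequest shouldFail = do
  result <- try (flakySubrequest shouldFail) :: IO (Either IOException String)
  pure (either (Left . show) Right result)
```

This only helps if you know the action can throw in the first place, which is exactly the information the type signature fails to provide.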

Exceptions seem to devalue one of the most powerful concepts in Haskell: if I don’t consider all the possibilities, my code does not compile. In practice, when working with APIs that properly encode these possibilities into the type system, this value proposition seems to be real. I really do find myself writing code that works correctly as soon as it compiles. It’s almost magical.

Using exceptions throws that all out the window, and I wish the Haskell ecosystem was generally more cautious about when to use them.

The String problem

I sort of alluded to this a tiny bit in the last section, and that is probably indicative of how bad this issue is. I’m just going to be blunt: In Haskell, strings suck.

This is always a bit of an amusing point whenever it is discussed because of how silly it seems. Haskell is a research language with a cutting-edge type system and some of the fanciest features of any language in existence. When everyday programming might use things like “profunctors”, “injective type families”, and “generalized algebraic datatypes”, you would think that dealing with strings would be a well-solved problem.

But it isn’t. Haskell libraries frequently use not one, not two, but five kinds of strings. Let’s list them off, shall we?

First off, there’s the built-in String type, which is actually an alias for the [Char] type. For those not intimately familiar with Haskell, that’s a linked list of characters. As Stephen Diehl recently put it in a blog post describing the disaster that is Haskell string types:
This is not only a bad representation, it’s quite possibly the least efficient (non-contrived) representation of text data possible and has horrible performance in both time and space. And it’s used everywhere in Haskell.
The point is, it’s really bad. This type is not a useful representation for textual data in practical applications.
Moving on, we have a fairly decent type, Text, which comes from Data.Text in the text package. This is a decent representation of text, and it’s probably the one that everything should use. Well, maybe. Because Text comes in two varieties: lazy and strict. Nobody seems to agree on which of those two should be used, though, and they are totally incompatible types: functions that work with one kind of text won’t work with the other. You have to manually convert between them.
Finally, we have ByteString, which is horribly misnamed because it really isn’t a string at all, at least not in the textual sense. A better name for this type would have simply been Bytes, which sounds a lot scarier. And that would be good, because data typed as a ByteString is as close as you can get in Haskell to not assigning a type at all: a bytestring holds arbitrary bytes without assigning them any meaning whatsoever!
Or at least, that’s the intention. The trouble is that people don’t treat bytestrings like that—they just use them to toss pieces of text around, even when those pieces of text have a well-defined encoding and represent textual data. This leads to the decodeUtf8 problem mentioned above, but it’s bigger than that because it often ends up with some poor APIs that assign some interpretation to ByteString data without assigning it a different type.
Again, this is throwing away so much of Haskell’s safety. It would be like using Int to keep track of boolean data (“just use 0 and 1!”) or using empty and singleton lists instead of using Maybe. When you use the precise type, you encode invariants and contracts into statically-checked assertions, but when you use general types like ByteString, you give that up.
Oh, and did I mention that ByteStrings also come in incompatible lazy and strict versions, too?

So, obviously, the answer is to just stop using the bad types and to just use (one kind of) Text everywhere. Great! Except that the other types are totally inescapable. The entire standard library uses String exclusively—after all, text is a separate package—and small libraries often use String instead of text because they have no need to bring in the dependency. Of course, this just means every real application pays the performance hit of converting between all these different kinds of strings.

Similarly, those that do use Text often use different kinds of text, so code ends up littered with fromStrict or toStrict coercions, which (again) have a cost. I’ve already ranted enough about ByteString, but basically, if you’re using ByteString in your API to pass around data that is semantically text, you are causing me pain. Please stop.
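
To show what that littering looks like, here is a small sketch. The shout function is a hypothetical API that only accepts strict Text, and shoutLazy is what a caller holding lazy Text ends up writing:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- A hypothetical API that only accepts strict Text.
shout :: T.Text -> T.Text
shout = T.toUpper

-- A caller holding lazy Text has to convert on the way in and on the way out.
shoutLazy :: TL.Text -> TL.Text
shoutLazy = TL.fromStrict . shout . TL.toStrict
```

Each conversion is cheap to write but not free to run, and the wrapper adds nothing except plumbing.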

It seems that the way Data.Text probably should have been designed was by making Text a typeclass, then making the lazy and strict implementations instances of that typeclass. Still, the fact that both of them exist would always cause problems. I’m actually unsure which one is the “correct” choice—I don’t know enough about how the two perform in practice—but it seems likely that picking either one would be a performance improvement over the current system, which is constantly spending time converting between the two.
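
A typeclass-based design might look something like the sketch below. The TextLike class and its methods are entirely hypothetical (nothing like this exists in the text package), and a real design would need far more operations:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- Hypothetical class; this is not part of the text package.
class TextLike t where
  packText   :: String -> t
  unpackText :: t -> String
  lengthText :: t -> Int

instance TextLike T.Text where
  packText   = T.pack
  unpackText = T.unpack
  lengthText = T.length

instance TextLike TL.Text where
  packText   = TL.pack
  unpackText = TL.unpack
  lengthText = fromIntegral . TL.length  -- lazy length is an Int64

-- Written once, usable with either representation.
isBlank :: TextLike t => t -> Bool
isBlank = all (== ' ') . unpackText
```

Functions written against the class would accept either kind of text without conversions, though this does nothing about code that still names one concrete type.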

This issue has been ranted about plenty, so I won’t ramble on, but if you’re designing new libraries, please, please use Text. Your users will thank you.

Documentation is nearly worthless

Finally, let’s talk about documentation.

One of my favorite programming languages is Racket. Racket has a documentation tool called Scribble. Scribble is special because it is a totally separate domain-specific language for writing documentation, and it makes it fun and easy to write good explanations. There are even forms for typesetting automatically-rendered examples that look like a REPL. If the examples ever break or become incorrect, the docs don’t even compile.

All of the Racket core library documentation makes sure to set a good example about what good documentation should look like. The vast majority of the documentation is paragraphs of prose and simple but practical examples. There are also type signatures (in the form of contracts), and those are super important, but they are so effective because of how the prose explains what each function does, when to use it, why you’d use it, and why you wouldn’t use it.

Everything is cross-referenced automatically. The documentation is completely searchable locally out of the box. As soon as you install a package, its docs are automatically indexed. User-written libraries tend to have pretty good docs, too, because the standard libraries set such a good example and because the tools are so fantastic. Racket docs are really nice, and they’re so good they actually make things like Stack Overflow or even Google mostly irrelevant. It’s all there in the manual.

Haskell documentation is the opposite of everything I just said.

The core libraries are poorly documented. Most functions include a sentence of description, and almost none include examples. At their worst, the descriptions simply restate the type signature.
Third-party libraries’ documentation is even worse: many libraries are completely undocumented and include only type signatures and nothing else.
Haddock is an incredibly user-hostile tool for writing anything other than tiny snippets of documentation and is not very good at supporting prose. Notably, Haddock’s documentation is not generated using Haddock (and it still manages to be almost unusable). Forcing all documentation into inline comments makes users unlikely to write much explanation, and there is no ability for abstraction.
Reading documentation locally is very difficult because there is no easy way to open documentation for a particular package in a web browser, and it’s certainly not searchable. This is especially ridiculous given that Hoogle exists, which is one of the best ways to search API docs in existence. There should be a stack hoogle command that just opens a Hoogle page for all locally-installed packages and Just Works, but there isn’t.
Most valuable information exists outside of documentation, so Google becomes a go-to immediately after a quick glance at the docs, and information is spread across blog posts, mailing lists, and obscure reddit posts.

This is a problem that cannot be fixed by just making Haddock better, nor can it be fixed simply by improving the existing standard library documentation. There is a fundamental problem with Haskell documentation (which, to be completely fair, is not unique to Haskell), which is that its tools do not support anything more than API docs.

Good documentation is so much more than “here’s what this function does”; it’s about guides and tutorials and case studies and common pitfalls. This is documentation for someone new to lenses. This is not. Take note of the difference.

Conclusion and other thoughts

Haskell is an incredible programming platform, and indeed, it is sometimes mind-boggling how complete it is. It also has a lot of rough edges, sometimes in places that feel like they need a lot more care, or perhaps they’re even simply unfinished.

I could spend weeks writing about all the things I really like or dislike about the language, discussing in fine detail all the things that have made me excited or all the little bits that have made me want to tear my hair out. Heck, I could probably spend a month writing about strings alone. That’s not the point, though... I took a risk with Haskell, and it’s paid off. I’m not yet sure exactly how I feel about it, or when I would choose it relative to other tools, but it is currently very high on my list of favorite technologies.

I did not come to Haskell with a distaste for static typing, despite the fact that I write so much Racket, a dynamically typed language (by default, at least). I don’t really use Typed Racket, and despite my love for Haskell and its type system, I am not sure I will use much more of it than I did before. Haskell and Racket are very different languages, which is justified in some places and probably sort of circumstantial in others.

The future of Haskell seems bright, and a lot of the changes in the just-released GHC 8 are extremely exciting. I did not list records as a pain point because the changes in GHC 8 appear to make them a lot more palatable, although whether or not they solve that problem completely remains to be seen. I will absolutely continue to write Haskell and push it to its limits where I can, and hopefully try and take as much as I can from it along the way.

Simple, safe multimethods in Racket

2016-02-18T00:00:00Z

Racket ships with racket/generic, a system for defining generic methods, functions that work differently depending on what sort of value they are supplied. I have made heavy use of this feature in my collections library, and it has worked well for my needs, but that system does have a bit of a limitation: it only supports single dispatch. Method implementations may only be chosen based on a single argument, so multiple dispatch is impossible.

Motivating multiple dispatch

What is multiple dispatch and why is it necessary? Well, in most cases, it isn’t necessary at all. It has been shown that multiple dispatch is much rarer than single dispatch in practice. However, when actually needed, having multiple dispatch in the toolbox is a valuable asset.

A classic example of multiple dispatch is multiplication over both scalars and vectors. Ideally, all of the following operations should work:

2 × 3 = 6
2 × ⟨3, 4⟩ = ⟨6, 8⟩
⟨3, 4⟩ × 2 = ⟨6, 8⟩

In practice, most languages do not support such flexible dispatch rules without fairly complicated branching constructs to handle each permutation of input types. Furthermore, since most languages (including most object-oriented ones) only support single dispatch, it is nearly impossible to add support for a new combination of types to an existing method.

To illustrate the above, even if a language supported operator overloading and it included a Vector class that overloaded multiplication to properly work with numbers and vectors, it might not implement matrix multiplication. If a user defines a Matrix class, they may overload its multiplication to support numbers, vectors, and matrices, but it is impossible to extend the multiplication implementation for the Vector class. That method is now completely set in stone, unless it is edited directly (and the programmer may not have access to Vector’s implementation).

Multiple dispatch solves all of these problems. Rather than specify implementations of functions for singular types, it is possible to specify implementations for sets of types. In the above example, a programmer would be able to define a new function that operates on Vector and Matrix arguments. Since each definition does not “belong” to any given type, extending this set of operations is trivial.

Multiple dispatch in Racket

This blog post is somewhat long and technical, so before proceeding any further, I want to show some real code that actually works so you can get a feel for what I’m talking about. As a proof-of-concept, I have created a very simple implementation of multiple dispatch in Racket. The above example would look like this in Racket using my module:

#lang racket

(require multimethod)

(provide mul
         (struct-out num)
         (struct-out vec))

(struct num (val))
(struct vec (vals))

(define-generic (mul a b))

(define-instance ((mul num num) x y)
  (num (* (num-val x) (num-val y))))

(define-instance ((mul num vec) n v)
  (vec (map (curry * (num-val n)) (vec-vals v))))

(define-instance ((mul vec num) v n)
  (mul n v))

Pardon the somewhat clunky syntax, but the functionality is there. Using the above code works as expected:

> (mul (num 2) (num 3))
(num 6)
> (mul (num 2) (vec '(3 4)))
(vec '(6 8))
> (mul (vec '(3 4)) (num 2))
(vec '(6 8))

Making the above snippet work is not particularly hard. In fact, it’s likely that most competent Racketeers could do it without much thought. However, there’s a tiny bit more going on behind the scenes than it may seem.

The problem with multiple dispatch

The single-dispatch design limitation of racket/generic comes directly from a desire to avoid what has been described as “spooky action at a distance”, a problem that is prevalent in many systems that support methods with multiple dispatch (aka multimethods). Specifically, the issue arises when new method implementations are defined for existing datatypes, which can have far-reaching effects throughout a program because the method table is global state. Both CLOS and Clojure suffer from this shortcoming.

Interestingly, Haskell with multi-parameter typeclasses (a nonstandard but highly useful extension) makes it quite trivial to create constructs similar to multiple dispatch (though the overload resolution is done at compile-time). The similarities are significant: Haskell also suffers from the possibility of a certain sort of “spooky action”. However, Haskell’s static typing and resolution allow the compiler to catch these potential issues, known as “orphan instances”, at compile time. Even though Racket does not support the same sort of static typing, the same idea can be used to keep multiple dispatch safe using the macro system.
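
For comparison, the scalar and vector example might be written in Haskell roughly as follows. The Scalar, Vec, and Mul names are invented for this sketch, and the functional dependency is one possible way to let the argument types determine the result type:

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses  #-}

newtype Scalar = Scalar Double deriving (Eq, Show)
newtype Vec    = Vec [Double]  deriving (Eq, Show)

-- The dependency "a b -> c" says the two argument types fix the result type.
class Mul a b c | a b -> c where
  mul :: a -> b -> c

instance Mul Scalar Scalar Scalar where
  mul (Scalar x) (Scalar y) = Scalar (x * y)

instance Mul Scalar Vec Vec where
  mul (Scalar n) (Vec xs) = Vec (map (n *) xs)

-- Scaling is commutative, so this instance defers to the one above.
instance Mul Vec Scalar Vec where
  mul v n = mul n v
```

Each instance is selected by the pair of argument types, which is the compile-time analogue of the runtime dispatch shown in the Racket code.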

Safe, dynamically-typed multiple dispatch

In order to make multiple dispatch safe, we first need to determine exactly what is unsafe. Haskell has rules for determining what constitutes an “orphan instance”, and these rules are equally applicable for determining dangerous multimethod implementations. Specifically, a definition can be considered unsafe if both of the following conditions are true:

The multimethod that is being implemented was declared in a different module from the implementation.
All of the types used for dispatch in the multimethod instance were declared in a different module from the implementation.

Conversely, a multimethod implementation is safe if either of the following conditions is true:

The multimethod that is being implemented is declared in the same module as the implementation.
Any of the types used for dispatch in the multimethod instance are declared in the same module as the implementation.

Why do these two rules provide a strong enough guarantee to eliminate the dangers created by global state? Well, to understand that, we need to understand what can go wrong if these rules are ignored.

Multimethods and dangerous instances

What exactly is this dangerous-sounding “spooky action”, and what causes it? Well, the trouble stems from the side-effectful nature of multimethod instance definitions. Consider the Racket module from earlier, which defines multiplication instances for scalars and vectors:

(provide mul
         (struct-out num)
         (struct-out vec))

(struct num (val))
(struct vec (vals))

(define-generic (mul a b))

(define-instance ((mul num num) x y)
  (num (* (num-val x) (num-val y))))

(define-instance ((mul num vec) n v)
  (vec (map (curry * (num-val n)) (vec-vals v))))

(define-instance ((mul vec num) v n)
  (mul n v))

Note that there is not actually a (mul vec vec) implementation. This is intentional: there are two ways to take the product of two vectors, so no default implementation is provided. However, it is possible that another module might desire an instance for mul that takes the dot product, and the programmer might write the following definition:

(define-instance ((mul vec vec) x y)
  (num (foldl + 0 (map * (vec-vals x) (vec-vals y)))))

However, there is something fishy about the above definition: it doesn’t need to be exported with provide to work! Since instances don’t create new bindings (they only add dispatch options), they never need to provide anything. This is problematic, though: it means that a program could continue to happily compile even if the module containing the dot product instance was never loaded with require, but an attempt to multiply two vectors would fail at runtime, claiming that there was no (mul vec vec) implementation. This drastic change of behavior violates Racket programmers’ assumptions about the guarantees made by modules (require should not cause any side-effects if the module’s bindings are not used).

Of course, while this seems potentially unexpected, it is workable: just be careful to require modules containing instances. Unfortunately, it gets much worse—what if a different library defines its own (mul vec vec) instance? What if that instance takes the cross product instead? That library may function entirely properly on its own, but when loaded alongside the program that defines a dot product instance, it is impossible to determine which instance should be used where. Because define-instance operates by modifying the aforementioned global state, the implementations clash, and the two systems cannot continue to operate together as written.
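
The clash can be modeled with a toy dispatch table. In this sketch (written in Haskell rather than Racket, with strings standing in for struct types and implementations), registering an instance simply overwrites whatever was stored under the same key:

```haskell
import qualified Data.Map as Map

-- A hypothetical dispatch table: a list of type names maps to the name of
-- an implementation.
type DispatchTable = Map.Map [String] String

-- Registering an instance replaces any existing entry for the same types.
register :: [String] -> String -> DispatchTable -> DispatchTable
register = Map.insert

-- The instances defined alongside mul itself.
baseTable :: DispatchTable
baseTable = Map.fromList
  [ (["num", "num"], "scalar product")
  , (["num", "vec"], "scaling")
  , (["vec", "num"], "scaling")
  ]

-- Two unrelated libraries each register a (mul vec vec) instance.
withDot, withBoth :: DispatchTable
withDot  = register ["vec", "vec"] "dot product" baseTable
withBoth = register ["vec", "vec"] "cross product" withDot
```

In this model the last registration wins, so which product a call to (mul vec vec) computes depends on the order in which the two libraries happened to be loaded.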

This is pretty bad. Defining extra instances is a reasonable use-case for multiple dispatch, but if these instances can break third-party code, how can they be trusted? This sort of problem can make multiple dispatch difficult to reason about and even more difficult to trust.

What determines safety?

With those problems in mind, we can turn back to the two rules for safe multiple dispatch. How do they prevent the above issues? Well, let’s take them one at a time.

Remember that an instance can be unequivocally determined to be safe if either of the two conditions is true, so we can consider them entirely independently. The first one is simple. An instance is safe if the following condition holds:

The multimethod that is being implemented is declared in the same module as the implementation.

This one is pretty obvious. It is impossible to create a “bad” instance of a method declared in the same module because it is impossible to import the method without also bringing in the instance. Furthermore, a conflicting instance cannot be defined at the place where the types themselves are defined because that would require a circular module dependency, which Racket does not permit.

With the above explanation in mind, the second condition should make sense, too:

Any of the types used for dispatch in the multimethod instance are declared in the same module as the implementation.

The same argument for the first point holds for the second, but with the parties swapped. Again, it is impossible to use the instance without somehow requiring the module that defines the datatype itself, so the instance would always be required, anyway. The most interesting aspect of this condition is that it demonstrates that instances can be defined for existing datatypes (that are defined in other modules) just so long as at least one of the datatypes is defined in the same module. This continues to permit the important use-case of extending the interfaces of existing types.
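
Taken together, the two rules amount to a small predicate. The sketch below models it in Haskell, using plain strings for module names; the Instance record and its field names are invented for illustration and do not correspond to anything in the Racket implementation:

```haskell
-- Module names are plain strings in this hypothetical model.
type ModuleName = String

data Instance = Instance
  { instanceModule :: ModuleName    -- where the instance is written
  , methodModule   :: ModuleName    -- where the multimethod was declared
  , typeModules    :: [ModuleName]  -- where each dispatch type was declared
  }

-- Safe: the method, or at least one dispatch type, is local to the module
-- that defines the instance.
isSafe :: Instance -> Bool
isSafe inst =
  methodModule inst == here || any (== here) (typeModules inst)
  where
    here = instanceModule inst

-- Unsafe: the method and every dispatch type come from other modules.
isOrphan :: Instance -> Bool
isOrphan inst =
  methodModule inst /= here && all (/= here) (typeModules inst)
  where
    here = instanceModule inst
```

Under this model, the dot product instance from earlier is an orphan, since mul and vec both come from another module, while an instance that mentions a locally defined matrix type alongside vec is safe.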

Encoding the safety rules into Racket’s macro system

In order to keep track of which methods and instances are defined where, I leveraged a technique based on the one used by Typed Racket to keep track of whether or not a typed identifier is used in a typed or untyped context. However, instead of using a simple mutable boolean flag, I used a mutable free identifier set, which keeps track of the identifiers within a given module that should be considered “privileged”.

#lang racket/base

(require syntax/id-set)

(provide mark-id-as-privileged!
         id-privileged?)

(define privileged-ids (mutable-free-id-set))

(define (mark-id-as-privileged! id)
  (free-id-set-add! privileged-ids id))

(define (id-privileged? id)
  (free-id-set-member? privileged-ids id))

Making this work with define-generic is obvious: just invoke mark-id-as-privileged! on the method name to note that the method is “privileged” in the scope of the current module. Keeping track of privileged structs is similarly straightforward, though it is a little more devious: the multimethod module provides a custom struct macro that just expands to struct from racket/base, but adds privilege information.

The define-instance macro does all the heavy lifting to ensure that only privileged identifiers can be used in instance definitions. A simple check for the identifier annotations is performed before proceeding with macro expansion:

(unless (or privileged? (ormap id-privileged? types))
  (assert-privileged-struct! (first types)))

When the privilege checks fail, an error is raised:

(define (assert-privileged-struct! id)
  (unless (id-privileged? id)
    (raise-syntax-error 'define-instance
                        "expected name of struct defined in current module"
                        id)))

With the above safeguards in place, the dangerous dot product implementation from above would not be allowed. The checks manage to encode both of the safety rules into the macro system such that invalid instances will fail at compile time, preventing dangerous uses of multimethods from ever slipping by unnoticed.

Actually implementing multiple dispatch

The rest of the multimethod implementation is relatively straightforward and is not even particularly robust. If anything, it is the bare minimum of what would be needed to allow the safety mechanisms above to work. Lots of features that would likely be needed in a real implementation are not included, and graceful error handling is largely ignored.

Multimethods themselves are implemented as Racket transformer bindings containing custom data, including a reference to the multimethod’s arity and dispatch table. The custom datatype includes a prop:procedure structure type property, which allows such bindings to also function as macros. The macro procedure expands to an operation that looks up the proper instance to use in the multimethod’s dispatch table and invokes it with the supplied arguments.

The relevant code for defining multimethods is reproduced below:

(begin-for-syntax
  (struct multimethod (arity dispatch-table)
    #:property prop:procedure
    (λ (method stx)
      (syntax-parse stx
        [(method arg ...)
         #'(apply-multimethod method (list arg ...))]
        [method
         #'(λ args (apply-multimethod method args))]))))

(define-syntax define-generic
  (syntax-parser
    [(_ (method:id arg:id ...+))
     (with-syntax ([arity (length (attribute arg))]
                   [dispatch-table (generate-temporary #'method)])
       (mark-id-as-privileged! #'method)
       #'(begin
           (define dispatch-table (make-hash))
           (define-syntax method (multimethod arity #'dispatch-table))))]))

The dispatch tables are implemented entirely in terms of Racket’s structure types, so while they can be defined on arbitrary structure types (including ones defined in the Racket standard library), they cannot be defined on primitives such as pairs or vectors. Implementations are registered in the dispatch table using the compile-time information associated with structs’ transformer bindings, and the same information is retrieved from struct instances at runtime to look up the proper implementation to call. Notably, this only works if the struct is #:transparent, or more generally and accurately, if the calling code has access to the struct’s inspector. All structs defined by the struct form from the multimethod module are automatically marked as #:transparent.

The following code implements defining multimethod instances:

(begin-for-syntax
  (define (assert-privileged-struct! id)
    (unless (id-privileged? id)
      (raise-syntax-error 'define-instance
                          "expected name of struct defined in current module"
                          id))))

(define-syntax define-instance
  (syntax-parser
    ; standard (define (proc ...) ...) shorthand
    [(_ ((method type:id ...+) . args) body:expr ...+)
     #'(define-instance (method type ...) (λ args body ...))]
    ; full (define proc lambda-expr) notation
    [(_ (method type:id ...+) proc:expr)
     (let* ([multimethod (syntax-local-value #'method)]
            [privileged? (id-privileged? #'method)])
       (unless (or privileged? (ormap id-privileged? (attribute type)))
         (assert-privileged-struct! (first (attribute type))))
       (with-syntax ([dispatch-table (multimethod-dispatch-table multimethod)]
                     [(struct-type-id ...) (map (compose1 first extract-struct-info syntax-local-value)
                                                (attribute type))])
         #'(let ([struct-types (list struct-type-id ...)])
             (hash-set! dispatch-table struct-types proc))))]))

The result is a useful, if certainly incomplete, implementation of multimethods in Racket that does not sacrifice the safety provided by racket/generic’s single-dispatch approach.
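
For readers who do not know Racket, the runtime half of the mechanism can be sketched in Python: a table keyed by the types of all the arguments, consulted at call time. This is only an illustration under stated assumptions. The names Multimethod, define_instance, Num, and Vec are hypothetical, and the sketch leaves out the compile-time safety checks entirely.

```python
from dataclasses import dataclass


class Multimethod:
    """A function that dispatches on the types of all of its arguments."""

    def __init__(self, name, arity):
        self.name = name
        self.arity = arity
        # Maps a tuple of types, one per argument, to an implementation.
        self.dispatch_table = {}

    def define_instance(self, *types):
        """Return a decorator that registers an implementation for `types`."""
        if len(types) != self.arity:
            raise TypeError(
                f"{self.name}: expected {self.arity} types, got {len(types)}"
            )

        def register(proc):
            self.dispatch_table[types] = proc
            return proc

        return register

    def __call__(self, *args):
        # Exact types are used as the key, so subtyping is not considered.
        key = tuple(type(arg) for arg in args)
        impl = self.dispatch_table.get(key)
        if impl is None:
            names = ", ".join(t.__name__ for t in key)
            raise TypeError(f"{self.name}: no instance for ({names})")
        return impl(*args)


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Vec:
    items: tuple


add = Multimethod("add", arity=2)


@add.define_instance(Num, Num)
def add_num_num(a, b):
    return Num(a.value + b.value)


@add.define_instance(Num, Vec)
def add_num_vec(a, b):
    return Vec(tuple(a.value + x for x in b.items))
```

Calling add(Num(1), Num(2)) selects the first implementation and add(Num(1), Vec((1, 2))) the second, while add(Vec(()), Num(1)) raises a TypeError that names the missing instance.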

Related work, advantages and disadvantages, and areas for future improvement

As previously mentioned, this implementation of multiple dispatch was inspired by the types of APIs offered by CLOS and Clojure while also maintaining the safety of racket/generic. The inspiration for the safety rules came from GHC’s detection of orphan instances. Although most of the ideas presented above exist in other places, I am unsure if the concept of safety checking has been used before in any dynamically-typed programming languages.

The primary advantage offered over Racket’s existing generics system is obvious: multiple dispatch. Furthermore, this system can supersede many uses of racket/generic simply by dispatching on a single type. However, the current implementation does not support all of the features of racket/generic, such as supporting non-structure types and allowing fallback implementations. While those are well within the realm of possibility, other things like attaching structure type properties are probably not possible with this approach, so it is unlikely that the existing system could be subsumed by one like this one.

Additionally, this implementation would almost certainly need numerous improvements before being useful to most programmers:

Good error reporting for failure cases. Right now, even something obvious like calling a method on values that do not implement it simply fails with an error produced by hash-ref. In a more interesting sense, using the arity to generate compile-time error messages for define-instance would be a nice improvement.
Support for Racket primitive data types. This might require some cooperation from Racket itself to permit an elegant implementation, but they could also just be special-cased. So long as lookup for primitives was done after consulting the main dispatch table, there wouldn’t be any performance hit for non-primitive types.
Option to supply fallback implementations. This wouldn’t be too hard at all, though it’s questionable whether or not it would be useful without method groupings like define/generic provides. There would likely also need to be some sort of way to check if a set of values implements a particular method.
Better cooperation with structure inspectors to alleviate the need for all structures to be transparent. It’s currently unclear to me how exactly this works and how it should work. There might be a better way to do this without mucking with inspectors.
Much more flexible argument lists, including the ability to specify arguments that are not used for dispatch. This is really a pretty fundamental requirement, but the parsing required was significant enough for me to put it off for this initial prototype.
Scribble forms to document generic methods and their instances. This is something racket/generic doesn’t have, and it has suffered for it. It would be very nice to have easy documentation forms for multimethods.
Proper consideration of struct subtyping. Racket structs support subtyping, which I have not given much thought for this prototype. It is possible that subtyping violates constraints I had assumed would hold, so reviewing the existing code with that context would be useful.

I’m not sure how much effort is involved in most of the above ideas, and in fact I’m not even completely sure how useful this system is to begin with. I have not found myself reaching much for multiple dispatch in my time as a Racket programmer, but that could simply be because it was previously unavailable. It will be interesting to see if that changes now that I have built this system, even if it is a bit rough around the edges.

Conclusion

Despite the lack of need for multiple dispatch to solve most problems, as indicated by its general lack of support in mainstream programming languages, it’s a nice tool to have in the toolbox, and it is asked for in the Racket community from time to time (perhaps due to its familiarity in other parts of the Lisp world). Time will tell if pointing people to something like this will create or stifle interest in multiple dispatch for Racket.

The source for the multimethod package can be found here if you are at all interested in playing with it yourself.

ADTs in Typed Racket with macros

2015-12-21T00:00:00Z

Macros are one of Racket's flagship features, and its macro system really is state of the art. Of course, it can sometimes be difficult to demonstrate why macros are so highly esteemed, in part because it can be hard to find self-contained examples of using macros in practice. One thing macros are perfect for, though, is filling a "hole" in the language by introducing a feature it lacks, and one of the features Typed Racket lacks is ADTs.

Warning: this is not a macro tutorial

First, a disclaimer: this post assumes at least some knowledge of Scheme/Racket macros. Ideally, you would be familiar with Racket itself. But if you aren't, don't worry if you get lost. Hold on to the bigger picture, and you'll likely learn more than someone who knows enough to follow all the way through. If you are interested in learning about macros, I must recommend Greg Hendershott's Fear of Macros. It is good. This is not that.

Now, with that out of the way, let's get started.

What we’re building

Algebraic data types, or ADTs, are a staple of the ML family of functional programming languages. I won't go into detail here—I want to focus on the implementation—but they're a very descriptive way of modeling data that encourages designing functions in terms of pattern-matching, something that Racket is already good at.

Racket also already has a facility for creating custom data structures in the form of structs, which are extremely flexible, but also a little verbose. Racket structs are more powerful than we need, but that means we can implement our ADTs in terms of Racket's struct system.

With that in mind, what should our syntax look like? Well, let's consider a quintessential example of ADTs: modeling a simple tree. For now, let's just consider a tree of integers. For reference, the Haskell syntax for such a data structure would look like this:

data Tree = Empty
          | Leaf Int
          | Node Tree Tree

This already demonstrates a few of the core things we'll need to build:

Each ADT has a data type, in this case Tree. This name only exists in the world of types; it isn't a value.
Each ADT has various data constructors, in this case Empty, Leaf, and Node.
Each data constructor may accept any number of arguments, each of which has a specific type.
The types that data constructors may accept include the ADT's datatype itself—that is, definitions can be recursive.

Of course, there's one more important feature we're missing: polymorphism. Our definition of a tree is overly specific, and really, it should be able to hold any kind of data, not just integers. In Haskell, we can do that by adding a type parameter:

data Tree a = Empty
            | Leaf a
            | Node (Tree a) (Tree a)

With this in mind, we can add a fifth and final point to our list:

ADTs must be able to be parametrically polymorphic.

That covers all of our requirements for basic ADTs. Now we're ready to port this idea to Racket.

Describing ADTs in Racket

How should we take the Haskell syntax for an ADT definition and adapt it to Racket's parenthetical s-expressions? By taking some cues from the Haskell implementation, Typed Racket's type syntax, and Racket's naming conventions, a fairly logical syntax emerges:

(define-datatype (Tree a)
  Empty
  (Leaf a)
  (Node (Tree a) (Tree a)))

This looks pretty good. Just like with the Haskell implementation, Tree should only exist at the type level, and Empty, Leaf, and Node should be constructor functions. Our syntax mirrors Racket function application, too—the proper way to create a leaf would be (Leaf 7).

Now that we can create ADT values, how should we extract the values from them? Well, just like in ML-likes, we can use pattern-matching. We don't need to reinvent the wheel for this one; we should be able to just use Racket's match with our datatypes. For example, a function that sums all the values in a tree might look like this:

(: tree-sum ((Tree Integer) -> Integer))
(define (tree-sum tree)
  (match tree
    [(Empty)    0               ]
    [(Leaf n)   n               ]
    [(Node l r) (+ (tree-sum l)
                   (tree-sum r))]))

Given that Racket's struct form automatically produces identifiers that cooperate with match, this shouldn't be hard at all. And with our syntax settled, we're ready to begin implementation.
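
The same shape can be sketched outside the ML family, too. The Python below is only an analogy, with field names chosen for illustration: one class per data constructor, a union alias standing in for the type, and a function that branches on the constructor.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

A = TypeVar("A")


@dataclass(frozen=True)
class Empty(Generic[A]):
    pass


@dataclass(frozen=True)
class Leaf(Generic[A]):
    value: A


@dataclass(frozen=True)
class Node(Generic[A]):
    left: "Tree[A]"
    right: "Tree[A]"


# The type itself is just a union of its constructors; it is not something
# that can be constructed directly.
Tree = Union[Empty[A], Leaf[A], Node[A]]


def tree_sum(tree: "Tree[int]") -> int:
    """Sum every value stored in a tree of integers."""
    if isinstance(tree, Empty):
        return 0
    if isinstance(tree, Leaf):
        return tree.value
    if isinstance(tree, Node):
        return tree_sum(tree.left) + tree_sum(tree.right)
    raise TypeError(f"not a tree: {tree!r}")
```

With these definitions, tree_sum(Node(Leaf(3), Node(Empty(), Leaf(7)))) adds 3 and 7.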

Implementing ADTs as syntax

Now for the fun part. To implement our ADT syntax, we'll employ Racket's industrial-strength macro DSL, syntax/parse. The syntax/parse library works like the traditional Scheme syntax-case on steroids, and one of the most useful features is the ability to define "syntax classes" that encapsulate reusable parsing rules into declarative components.

Since this is not a macro tutorial, the following implementation assumes you already know how to use syntax/parse. However, all of the concepts here are well within the reaches of any intermediate macrologist, so don't be intimidated by some of the more complex topics at play.

Parsing types with a syntax class

To implement ADTs, we're going to want to define exactly one syntax class, a class that describes the grammar for a type. As we've seen, types can be bare identifiers, like Tree, or they can be identifiers with parameters, like (Tree a). We'll want to cover both cases.

(begin-for-syntax
  (define-syntax-class type
    (pattern name:id #:attr [param 1] '())
    (pattern (name:id param ...+))))

This syntax class has two rules, one that's a bare identifier, and one that's a list. The ellipsis followed by a plus (...+) in the second rule means "one or more", so parsing those parameters will automatically be handled for us. In the bare identifier rule, we use #:attr to give the param attribute the default value of an empty list, so this syntax class will normalize the input we get in addition to parsing it.

A first attempt at `define-datatype`

Now we can move on to actually implementing define-datatype. The rules are simple: we need to generate a structure type for each one of the data constructors, and we need to generate a type definition for the parent type itself. This is pretty simple to implement using syntax-parser, which actually does the parsing for our macro.

(define-syntax define-datatype
  (syntax-parser
    [(_ type-name:type data-constructor:type ...)
     ]))

This definition will do all the parsing we need. It parses the entire macro "invocation", ignoring the first datum with _ (which will just be the identifier define-datatype), then expecting a type-name, which uses the type syntax class we defined above. Next, we expect zero or more data-constructors, which also use the type syntax class. That's all we have to do for parsing. We now have all the information we need to actually output the expansion for the macro.

Of course, it won't be that easy: this is the difficult part. The first step is to generate a Racket struct for each data constructor. We can do this pretty easily with some simple use of Racket's syntax templating facility. A naïve attempt would look like this:

(define-syntax define-datatype
  (syntax-parser
    [(_ type-name:type data-constructor:type ...)
     #'(begin
         (struct data-constructor.name ([f : data-constructor.param] ...))
         ...)]))

This is actually really close to being correct. This will generate a struct definition for each data-constructor, where each struct has the name of the data constructor and the same number of fields as arguments provided. The trouble is that in Racket structs, all of the fields have names, but in our ADTs, all the fields are anonymous and by-position. Currently, we're just using the same name for all the fields, f, so if any data constructor has two or more fields, we'll get an error.

Since we don't care about the field names, what we want to do is just generate random names for every field. To do this, we can use a Racket function called generate-temporary, which generates random identifiers. Our next attempt might look like this:

#`(begin
    (struct data-constructor.name
      ([#,(generate-temporary) : data-constructor.param] ...))
    ...)

The #, lets us "escape" from the template to execute (generate-temporary) and interpolate its result into the syntax. Unfortunately, this doesn't work. We do generate a random field name, but the ellipsis will re-use the same generated value when it repeats the fields, rendering our whole effort pointless. We need a fresh field name for each field, not one name shared by all of them.

More leveraging syntax classes

As it turns out, this is also easy to do with syntax classes. We can add an extra attribute to our type syntax class to generate a random identifier with each one. Again, we can use #:attr to do that automatically. Our new definition for type will look like this:

(begin-for-syntax
  (define-syntax-class type
    (pattern name:id
             #:attr [param 1] '()
             #:attr [field-id 1] '())
    (pattern (name:id param ...+)
             #:attr [field-id 1] (generate-temporaries #'(param ...)))))

Here we're using generate-temporaries instead of generate-temporary, which will conveniently generate a new identifier for each of the elements in the list we provide it. This way, we'll get a fresh identifier for each param.
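
The difference between the failed attempt and this one is the familiar difference between evaluating an expression once and repeating its result, and evaluating it once per element. A rough Python analogy, using a hypothetical gensym helper as a stand-in for generate-temporary:

```python
import itertools

_counter = itertools.count(1)


def gensym(prefix="temp"):
    """Return a name that has not been returned before."""
    return f"{prefix}{next(_counter)}"


params = ["Tree", "Tree"]

# Evaluated once, then repeated: every field ends up with the same name.
repeated = [gensym()] * len(params)

# Evaluated once per parameter, like generate-temporaries: all names differ.
fresh = [gensym() for _ in params]
```

Here repeated holds the same name twice, which is exactly the duplicate-field error, while fresh holds two distinct names.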

We can now fix our macro to use this field-id attribute instead of the static field name:

#'(begin
    (struct data-constructor.name
      ([data-constructor.field-id : data-constructor.param] ...))
    ...)

Creating the supertype

We're almost done—now we just need to implement our overall type, the one defined by type-name. This is implemented as a trivial type alias, but we need to ensure that polymorphic types are properly handled. For example, a non-polymorphic type would need to be handled like this:

(define-type Tree (U Empty Leaf Node))

However, a polymorphic type alias would need to include the type parameters in each subtype, like this:

(define-type (Tree a) (U (Empty a) (Leaf a) (Node a)))

How can we do this? Well, so far, we've been very declarative by using syntax patterns, templates, and classes. However, this is a more pernicious problem to solve with our declarative tools. Fortunately, it's very easy to fall back to using procedural macros.

To build each properly-instantiated type, we'll use a combination of define/with-syntax and Racket's list comprehensions, for/list. The define/with-syntax form binds values to pattern identifiers, which can be used within syntax templates just like the ones bound by syntax-parser. This will allow us to break up our result into multiple steps. Technically, define/with-syntax is not strictly necessary (we could just use #` and #,), but it's cleaner to work with.

We'll start by defining a set of instantiated data constructor types, one per data-constructor:

(define/with-syntax [data-type ...]
  (for/list ([name (in-syntax #'(data-constructor.name ...))])
    ))

Now we can fill in the body with any code we'd like, so long as each body returns a syntax object. We can use some trivial branching logic to determine which form we need:

(define/with-syntax [data-type ...]
  (for/list ([name (in-syntax #'(data-constructor.name ...))])
    (if (stx-null? #'(type-name.param ...))
        name
        #`(#,name type-name.param ...))))

Now with our definition for data-type, we can implement our type alias for the supertype extremely easily:

#'(define-type type-name (U data-type ...))
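
To see the branching in isolation, here is a small Python sketch that builds the same define-type form as plain text. It works on strings rather than syntax objects, so it only illustrates the logic, and the function names instantiate and type_alias are made up for this example.

```python
def instantiate(constructor_names, type_params):
    """Build the members of the union, applying type parameters if there are any."""
    if not type_params:
        return list(constructor_names)
    args = " ".join(type_params)
    return [f"({name} {args})" for name in constructor_names]


def type_alias(type_name, constructor_names, type_params):
    """Render the define-type form for the supertype as text."""
    members = " ".join(instantiate(constructor_names, type_params))
    if type_params:
        head = f"({type_name} {' '.join(type_params)})"
    else:
        head = type_name
    return f"(define-type {head} (U {members}))"
```

Passing an empty parameter list yields the non-polymorphic alias, and passing ["a"] yields the polymorphic one, matching the two hand-written forms shown earlier.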

Putting it all together

There's just one more thing to do before we can call this macro finished: we need to ensure that all the type parameters defined by type-name are in scope for each data constructor's structure definition. We can do this by making use of type-name.param within each produced struct definition, resulting in this:

#'(begin
    (struct (type-name.param ...) data-constructor.name
      ([data-constructor.field-id : data-constructor.param] ...))
    ...)

And we're done! The final macro, now completed, looks like this:

(begin-for-syntax
  (define-syntax-class type
    (pattern name:id
             #:attr [param 1] '()
             #:attr [field-id 1] '())
    (pattern (name:id param ...+)
             #:attr [field-id 1] (generate-temporaries #'(param ...)))))

(define-syntax define-datatype
  (syntax-parser
    [(_ type-name:type data-constructor:type ...)

     (define/with-syntax [data-type ...]
       (for/list ([name (in-syntax #'(data-constructor.name ...))])
         (if (stx-null? #'(type-name.param ...))
             name
             #`(#,name type-name.param ...))))

     #'(begin
         (struct (type-name.param ...) data-constructor.name
           ([data-constructor.field-id : data-constructor.param] ...)) ...
         (define-type type-name (U data-type ...)))]))

It's a little bit dense, certainly, but it is not as complicated or scary as it might seem. It's a simple, mostly declarative, powerful way to transform a DSL into ordinary Typed Racket syntax, and now all we have to do is put it to use.

Using our ADTs

With the macro built, we can now actually use our ADTs using the syntax we described! The following is now valid code:

(define-datatype (Tree a)
  Empty
  (Leaf a)
  (Node (Tree a) (Tree a)))

> (Node (Leaf 3) (Node (Empty) (Leaf 7)))
- : (Node Positive-Byte)
(Node (Leaf 3) (Node (Empty) (Leaf 7)))

We can use this to define common data types, such as Haskell's Maybe:

(define-datatype (Maybe a)
  (Just a)
  Nothing)

(: maybe-default (All [a] (Maybe a) a -> a))
(define (maybe-default m v)
  (match m
    [(Just a)  a]
    [(Nothing) v]))

(: maybe-then (All [a] (Maybe a) (a -> (Maybe a)) -> (Maybe a)))
(define (maybe-then m f)
  (match m
    [(Just a)  (f a)]
    [(Nothing) (Nothing)]))

And of course, we can also use it to define ADTs that use concrete types rather than type parameters, if we so desire. This implements a small mathematical language, along with a trivial interpreter:

(define-datatype Expr
  (Value Number)
  (Add Expr Expr)
  (Subtract Expr Expr)
  (Multiply Expr Expr)
  (Divide Expr Expr))

(: evaluate (Expr -> Number))
(define (evaluate e)
  (match e
    [(Value x)      x                            ]
    [(Add a b)      (+ (evaluate a) (evaluate b))]
    [(Subtract a b) (- (evaluate a) (evaluate b))]
    [(Multiply a b) (* (evaluate a) (evaluate b))]
    [(Divide a b)   (/ (evaluate a) (evaluate b))]))

> (evaluate (Add (Value 1)
                 (Multiply (Divide (Value 1) (Value 2))
                           (Value 7))))
4 1/2

There's all the power of ADTs, right in Racket, all implemented in 22 lines of code. If you'd like to see all the code together in a runnable form, I've put together a gist here.
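
As a cross-check of the arithmetic in that last interaction, the interpreter can be transcribed into Python with exact rational numbers. This is a sketch rather than a port: BinaryOp and OPERATORS are helpers introduced here and are not part of the Racket code.

```python
from dataclasses import dataclass
from fractions import Fraction
import operator


@dataclass(frozen=True)
class Value:
    number: Fraction


@dataclass(frozen=True)
class BinaryOp:
    left: object
    right: object


class Add(BinaryOp):
    pass


class Subtract(BinaryOp):
    pass


class Multiply(BinaryOp):
    pass


class Divide(BinaryOp):
    pass


# One arithmetic function per binary constructor.
OPERATORS = {
    Add: operator.add,
    Subtract: operator.sub,
    Multiply: operator.mul,
    Divide: operator.truediv,
}


def evaluate(expr):
    """Reduce an expression tree to a single exact number."""
    if isinstance(expr, Value):
        return expr.number
    apply_op = OPERATORS[type(expr)]
    return apply_op(evaluate(expr.left), evaluate(expr.right))


result = evaluate(
    Add(
        Value(Fraction(1)),
        Multiply(Divide(Value(Fraction(1)), Value(Fraction(2))), Value(Fraction(7))),
    )
)
```

The expression is 1 + (1/2 × 7), so result is Fraction(9, 2), the same value Racket displays as 4 1/2.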

Conclusions and credit

This isn't the simplest macro to create, nor is it the most complex. The code examples might not even make much sense until you try it out yourself. Macros, like any difficult concept, are not always easy to pick up, but they certainly are powerful. The ability to extend the language in such a way, in a matter of minutes, is unparalleled in languages other than Lisp.

This is, of course, a blessing and a curse. Lisps reject some of the syntactic landmarks that often aid in readability for the power to abstract programs into their bare components. In the end, is this uniform conciseness more or less readable? That's an incredibly subjective question, one that has prompted powerfully impassioned discussions, and I will not attempt to argue one way or the other here.

That said, I think it's pretty cool.

Finally, I must give credit where credit is due. Thanks to Andrew M. Kent for the creation of the datatype package, which served as the inspiration for this blog post. Many thanks to Sam Tobin-Hochstadt for his work creating Typed Racket, as well as helping me dramatically simplify the implementation used in this blog post. Also thanks to Ryan Culpepper and Matthias Felleisen for their work on creating syntax/parse, which is truly a marvelous tool for exploring the world of macros, and, of course, a big thanks to Matthew Flatt for his implementation of hygiene in Racket, as well as much of the rest of Racket itself. Not to mention the entire legacy of those who formulated the foundations of the Scheme macro system and created the framework for all of this to be possible so many decades later.

Truly, working in Racket feels like standing on the shoulders of giants. If you're intrigued, give it a shot. It's a fun feeling.

Functionally updating record types in Elm

2015-11-06T00:00:00Z

Elm is a wonderful language for building web apps, and I love so much of its approach to language design. Elm does so many things right straight out of the box, and that's a real breath of fresh air in the intersection of functional programming and web development. Still, it gets one thing wrong, and unfortunately, that one thing is incredibly important. Elm took the "functions" out of "functional record types".

Almost any software program, at its core, is all about data. Maybe it's about computing data, maybe it's about manipulating data, or maybe it's about displaying data, but at the end of the day, some sort of data model is going to be needed. The functional model is a breathtakingly elegant system for handling data and shuttling it around throughout a program, and functional reactive programming, which Elm uses to model event-like interactions, makes this model work even better. The really important thing, though, is what tools Elm actually gives you to model your data.

A brief primer on Elm records

Elm supports all the core datatypes one would expect—numbers, strings, booleans, optionals, etc.—and it allows users to define their own types with ADTs. However, Elm also provides another datatype, which it calls "records". Records are similar to objects in JavaScript: they're effectively key-value mappings. They're cool data structures, and they work well. Here's an example of creating a Point datatype in Elm:

type alias Point =
  { x : Float, y : Float }

Notice that Point is declared as a type alias, not as a separate type like an ADT. This is because record types are truly encoded in the type system as values with named fields, not as disparate types. This allows for some fun tricks, but that's outside the scope of this blog post.

The good

What I'd like to discuss is what it looks like to manipulate these data structures. Constructing them is completely painless, and reading from them is super simple. This is where the record system gets everything very right.

origin : Point
origin = { x = 0, y = 0 }

distanceBetween : Point -> Point -> Float
distanceBetween a b =
  let dx = a.x - b.x
      dy = a.y - b.y
  in sqrt (dx*dx + dy*dy)

The syntax is clean and simple. Most importantly, however, the record system is functional (in the "functional programming" sense). In a functional system, it's useful to express concepts in terms of function composition, and this is very easy to do in Elm. Creating a function to access a field would normally be clunky if you always needed to do record.field to access the value. Fortunately, Elm provides some sugar:

-- These two expressions are equivalent:
(\record -> record.field)
.field

Using the .field shorthand allows writing some other functions in terms of composition, as most functional programmers would desire:

doubledX : Point -> Float
doubledX = ((*) 2) << .x

This satisfies me.

The bad

So if everything in Elm is so great, what am I complaining about? Well, while the syntax to access fields is convenient, the syntax to functionally set fields is questionably clunky. Consider a function that accepts a point and returns a new point with its x field set to 0:

zeroedX : Point -> Point
zeroedX point = { point | x <- 0 }

This doesn't look too bad, does it? It's clear and concise. To me, though, there's something deeply wrong here... this function has a lot of redundancy! It seems to me like we should be able to write this function more clearly in a point-free style. The .field shorthand "functionalizes" the record getter syntax, so there must be a function version of the update syntax, right? Maybe it would look something like this:

zeroedX : Point -> Point
zeroedX = !x 0

But alas, there is no such syntax.

Now you may ask... why does it matter? This seems trivial, and in fact, the explicit updater syntax may actually be more readable by virtue of how explicit it is. You'd be right, because so far, these examples have been horribly contrived. But let's consider a slightly more useful example: functionally updating a record.

What's the difference? Well, say I wanted to take a point and increment its x field by one. I can easily write a function for that:

incrementX : Point -> Point
incrementX point = { point | x <- point.x + 1 }

Not terrible, though a little verbose. Still, what if we want to also add a function that decrements x?

decrementX : Point -> Point
decrementX point = { point | x <- point.x - 1 }

Oh, gosh. That's basically the exact same definition but with the operation flipped. Plus we probably want these operations for y, too. Fortunately, there's an easy solution: just pass a function in to transform the value! We can define an updateX function that allows us to do that easily, then we can define our derived operations in terms of that:

updateX : (Float -> Float) -> Point -> Point
updateX f point = { point | x <- f point.x }

incrementX : Point -> Point
incrementX = updateX ((+) 1)

decrementX : Point -> Point
decrementX = updateX (\x -> x - 1)

Not only is that much cleaner, but we can now use it to implement all sorts of other operations that allow us to add, subtract, multiply, or divide the x field. Now we just need to generalize our solution to work with the x and y fields!

Oh, wait. We can't.

The ugly

This is where everything breaks down completely. Elm does not offer enough abstraction to reduce this level of crazy duplication:

updateX : (Float -> Float) -> Point -> Point
updateX f point = { point | x <- f point.x }

incrementX : Point -> Point
incrementX = updateX ((+) 1)

decrementX : Point -> Point
decrementX = updateX (\x -> x - 1)

updateY : (Float -> Float) -> Point -> Point
updateY f point = { point | y <- f point.y }

incrementY : Point -> Point
incrementY = updateY ((+) 1)

decrementY : Point -> Point
decrementY = updateY (\x -> x - 1)

We sure can give it a shot, though. At the very least, we can implement the increment and decrement functions in a more general way by passing in an updater function:

increment : ((Float -> Float) -> a -> a) -> a -> a
increment update = update ((+) 1)

Now, with updateX and updateY, we can increment either field very clearly and expressively. If we shorten the names to uX and uY, then the resulting code is actually very readable:

pointAbove = uY (\x -> x + 1)
pointBelow = uY (\x -> x - 1)

It's almost like English now: "update Y using this transformation". This is actually pretty satisfactory. The trouble arises when you have a record with many fields:

type alias PlayerStats =
  { health : Int
  , strength : Int
  , charisma : Int
  , intellect : Int
  -- etc.
  }

It might be very convenient to have generic functional updaters in this case. One could imagine a game that has Potion items:

type Potion = Potion String (PlayerStats -> PlayerStats)

And then some different kinds of potions:

potions =
  [ (Potion "Health Potion" (uHealth ((+) 1)))
  , (Potion "Greater Intellect Potion" (uIntellect ((+) 3)))
  , (Potion "Potion of Weakness" (uStrength (\x -> x // 5)))
  ]

This is a really elegant way to think about items that can affect a player's stats! Unfortunately, it also means you have to define updater functions for every single field in the record. This can get tedious rather quickly:

uHealth : (Int -> Int) -> PlayerStats -> PlayerStats
uHealth f stats = { stats | health <- f stats.health }

uStrength : (Int -> Int) -> PlayerStats -> PlayerStats
uStrength f stats = { stats | strength <- f stats.strength }

uCharisma : (Int -> Int) -> PlayerStats -> PlayerStats
uCharisma f stats = { stats | charisma <- f stats.charisma }

-- etc.

This is pretty icky. Could there be a better way?

Trying to create a more general abstraction

Interestingly, this pattern doesn't need to be this bad. There are better ways to do this. Let's revisit our updater functions.

Really, update can be defined in terms of two other primitive operations: a read and a (functional) write. What would it look like if we implemented it that way instead of requiring special updater functions to be defined? Well, it would look like this:

update : (a -> b) -> (b -> a -> a) -> (b -> b) -> a -> a
update get set f x = set (f (get x)) x

The type definition is a little long, but it's really pretty simple. We just supply a getter and a setter, then a function to do the transformation, and finally a record to actually transform. Of course, as you can see from the type, this function isn't actually specific to records: it can be used with any value for which a getter and setter can be provided.
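
The same decomposition can be sketched in Python, assuming an immutable dataclass in place of the Elm record. The names get_health, set_health, and u_health are illustrative only, and dataclasses.replace plays the role of the functional write.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PlayerStats:
    health: int
    strength: int
    charisma: int
    intellect: int


def update(get, set_, f, record):
    """Read a value with `get`, transform it with `f`, write it back with `set_`."""
    return set_(f(get(record)), record)


def get_health(stats):
    return stats.health


def set_health(value, stats):
    # replace returns a modified copy, leaving the original untouched.
    return replace(stats, health=value)


def u_health(f):
    """An updater for health, derived from its getter and setter."""
    return lambda stats: update(get_health, set_health, f, stats)


health_potion = u_health(lambda h: h + 1)

stats = PlayerStats(health=10, strength=5, charisma=3, intellect=7)
healed = health_potion(stats)
```

After this runs, healed.health is 11 while stats.health is still 10. Note that the getter and setter are still written by hand for each field, so the sketch relocates the boilerplate rather than removing it.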

The trouble here is that writing field setters isn't any easier in Elm than writing field updaters. They still look pretty verbose:

sHealth : Int -> PlayerStats -> PlayerStats
sHealth x stats = { stats | health <- x }

uHealth : (Int -> Int) -> PlayerStats -> PlayerStats
uHealth = update .health sHealth

So, at the end of it all, this isn't really a better abstraction. Remember my fantasy !field setter shorthand from half a blog post ago? Now perhaps it makes a little more sense. If such a syntax existed, then defining the updater would be incredibly simple:

uHealth : (Int -> Int) -> PlayerStats -> PlayerStats
uHealth = update .health !health

Still, no syntax, no easy updaters, and by extension, no easy, declarative description of behavior without quite a bit of boilerplate.

Conclusions and related work

Elm is a very promising language, and it seems to be in fairly rapid development. So far, its author, Evan Czaplicki, has taken a very cautious approach to implementing language features, especially potentially redundant ones. This caution is why things like operator sections, "where" clauses, and special updater syntax have not yet made it into the language. Maybe at some point these will be deemed important enough to include, but for the time being, they've been excluded.

I obviously think that having this sort of thing is incredibly important to being able to write expressive code without a huge amount of overhead. However, I also do not want to give the impression that I think adding special setter syntax is the only way to do it.

Seasoned functional programmers will surely have noticed that many of these concepts sound a lot like lenses, and Elm actually already has a lens-like library authored by Evan himself, called Focus. This, however, does not actually solve the problem: it requires manual description of setters just like the purely function-based approach does. Really, lenses are just the logical next step in the line of abstraction I've already laid out above.

Interestingly, PureScript and Elm, the two Haskell-likes-on-the-frontend that I've paid attention to (though PureScript is much closer to Haskell than Elm), both have this very same problem. Haskell itself solves it with macros via Template Haskell. My favorite language, Racket, solves it with its own macro system. Is there another way to do these things that doesn't involve introducing a heavyweight macro system? Definitely. But I think this is a necessary feature, not a "nice to have", so if a macro system is out of the picture, then a simpler, less flexible solution is the obvious logical alternative.

I really like Elm, and most of my experiences with it have been more than enough to convince me that it is a fantastic language for the job. Unfortunately, the issue of functional record updaters has been quite the frustrating obstacle in my otherwise frictionless ride. I will continue to happily use Elm over other, far less accommodating tools, but I hope that issues like these will be smoothed out as the language and its ecosystem mature.

Canonical factories for testing with factory_girl_api

2015-09-23T00:00:00Z

Modern web applications are often built as single-page apps, which are great for keeping concerns separated, but problematic when tested. Logic needs to be duplicated in front- and back-end test suites, and if the two apps diverge, the tests won't catch the failure. I haven't found a very good solution to this problem aside from brittle, end-to-end integration tests.

To attempt to address a fraction of this problem, I built factory_girl_api, a way to share context setup between both sides of the application.

A brief overview of factory_girl

In the land of Ruby and Rails, factory_girl is a convenient gem for managing factories for models. Out of the box, it integrates with Rails' default ORM, ActiveRecord, and provides declarative syntax for describing what attributes factories should initialize. For example, a factory declaration used to create a widget might look like this:

FactoryGirl.define do
  factory :widget do
    sequence(:name) { |id| "Widget ##{id}" }
    price 10

    trait :expensive do
      price 1000
    end
  end
end

This makes it easy to create new instances of Widget and use them for unit tests. For example, this would create and persist a widget with a unique name and a price of 10 units:

widget = FactoryGirl.create :widget

We can also create more expensive widgets by using the :expensive trait.

expensive_widget = FactoryGirl.create :widget, :expensive

Any number of traits can be specified at once. Additionally, it is possible to override individual attributes manually.

fancy_widget = FactoryGirl.create :widget, :expensive, name: 'Fancy Widget'

It works well, and it keeps initialization boilerplate out of individual tests.
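
To make the precedence between defaults, traits, and manual overrides concrete, here is a minimal sketch in plain Ruby. It is not how factory_girl is implemented; WIDGET_FACTORY and build_attributes are hypothetical names that only model the order in which the values are applied: factory defaults first, then each trait in the order given, then explicit overrides.

```ruby
# Hypothetical stand-in for the :widget factory above. Each attribute is a
# lambda that receives the sequence number, mirroring the sequence(:name) block.
WIDGET_FACTORY = {
  defaults: { name: ->(n) { "Widget ##{n}" }, price: ->(_n) { 10 } },
  traits:   { expensive: { price: ->(_n) { 1000 } } }
}.freeze

# Resolve attributes in order: defaults, then traits (later traits win),
# then explicit overrides (which always win).
def build_attributes(factory, sequence_number, traits: [], overrides: {})
  generators = traits.reduce(factory[:defaults]) do |acc, trait|
    acc.merge(factory[:traits].fetch(trait))
  end
  generators.transform_values { |generator| generator.call(sequence_number) }
            .merge(overrides)
end

p build_attributes(WIDGET_FACTORY, 1)
p build_attributes(WIDGET_FACTORY, 2, traits: [:expensive])
p build_attributes(WIDGET_FACTORY, 3, traits: [:expensive],
                   overrides: { name: "Fancy Widget" })
```

The third call corresponds to the fancy_widget example: the trait replaces the default price, and the override replaces the generated name.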

Testing on the front-end

Trouble arises when we need to write tests for the JavaScript application that use the same models. Suddenly, we need to duplicate the same kind of logic in our front-end tests. We might start out by setting up object state manually:

var fancyWidget = new Widget({
  name: 'Fancy Widget',
  price: 1000
});

Things can quickly get out of hand when models grow complex. Even if we use a factory library in JavaScript, it's possible for our front-end factories to diverge from their back-end counterparts. This means our integration tests will fail, but our unit tests will still blindly pass. Having to duplicate all that logic in two places is dangerous. It would be nice to have a single, canonical source for all of our factories.

Reusing server-side factories with factory_girl_api

To help alleviate this problem, I created the factory_girl_api gem for Rails and the angular-factory-girl-api Bower package for Angular. These packages cooperate with each other to allow server-side factories to be used in JavaScript tests.

The Angular module provides a service with syntax comparable to factory_girl itself. Both traits and custom attributes are supported:

FactoryGirl.create('widget', 'expensive', { name: 'Fancy Widget' });

In this case, however, a round-trip API call must be made to the server in order to call the factory and return the result. Because of this, the Angular version of FactoryGirl returns a promise that is resolved with the serialized version of the model, which can then be used as sample data in unit tests.
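
The server side of that round trip can be pictured roughly as follows. This is only a sketch under assumptions: the request shape (factory, traits, and attributes keys), the FACTORIES table, and handle_factory_request are all hypothetical, and the real gem's routes and wire format may look quite different. The point is just that the front-end sends a factory name plus options, and gets back the serialized model.

```ruby
require "json"

# Hypothetical in-memory factories standing in for real factory_girl
# definitions, which would build and persist ActiveRecord models.
FACTORIES = {
  "widget" => {
    defaults: { "name" => "Widget", "price" => 10 },
    traits:   { "expensive" => { "price" => 1000 } }
  }
}.freeze

# Takes a JSON request body and returns the serialized model as JSON.
# The key names used here are assumptions made for illustration.
def handle_factory_request(body)
  request = JSON.parse(body)
  factory = FACTORIES.fetch(request.fetch("factory"))
  attributes = request.fetch("traits", []).reduce(factory[:defaults]) do |acc, trait|
    acc.merge(factory[:traits].fetch(trait))
  end
  JSON.generate(attributes.merge(request.fetch("attributes", {})))
end

request_body = JSON.generate(
  "factory"    => "widget",
  "traits"     => ["expensive"],
  "attributes" => { "name" => "Fancy Widget" }
)
puts handle_factory_request(request_body)
```

The JSON string returned here plays the role of the value the Angular promise is resolved with.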

The problems with relying on the server for data

In my preliminary use of this tool, it works. In many ways, it's much nicer than duplicating logic in both places. However, I'm not completely convinced it's the right solution yet.

First of all, it couples the front-end to the back-end, even during unit testing, which is disappointing. It means that a server needs to be running (in test mode) in order for the tests to run at all. For the kinds of projects I work on, this isn't really a bad thing, and the benefits of the reduced duplication far outweigh the downsides.

My real concern is that this solves a very small facet of the general problem with fragile front-end test suites. Single-page applications usually depend wholly on their integration with back-end APIs. If those APIs change, the tests will continue to happily pass as long as the API is simply mocked, which seems to be the usual solution in the front-end universe. This is, frankly, unacceptable in real application development.

Potential improvements and other paths to success

I am ultimately unsatisfied with this approach, but writing brittle end-to-end integration tests is not the solution. This kind of thing may be a step in the right direction: writing tests that aren't really pure unit tests, but also aren't fragile full-stack integration tests. This is a middle ground that seems infrequently traveled, perhaps due to a lack of tooling (or perhaps because it just doesn't work). I don't know.

Either way, I'm interested in where this is headed, and I'll be curious to see if I run into any roadblocks using the workflow I've created. If anyone else is interested in playing with these two libraries, the READMEs are much more comprehensive than what I've covered here. Take a look, and give them a spin!

Managing application configuration with Envy

2015-08-30T00:00:00Z

Application configuration can be a pain. Modern web apps don't live on dedicated boxes; they run on VPSes somewhere in the amorphous "cloud", and keeping configuration out of your application's repository can seem like more trouble than it's worth. Fortunately, The Twelve-Factor App provides a set of standards for keeping web apps sane, and one of those guidelines advises keeping configuration in the environment.

Envy is the declarative bridge between Racket code and the outside world of the environment.

Introducing Envy

I built Envy to distill the common tasks needed when working with environment variables into a single, declarative interface that eliminates boilerplate and makes it easy to see which environment variables an application depends on (instead of having them littered throughout the codebase). Using it is simple. Just require envy and you're good to go.

The best way to use Envy is to create a "manifest" module that declares all the environment variables your application might use. For example, the following module is a manifest that describes an application that uses three environment variables:

; environment.rkt
#lang typed/racket/base

(require envy)

(define/provide-environment
  api-token
  [log-level : Symbol #:default 'info]
  [parallel? : Boolean])

When this module is required, Envy will automatically do the following:

Envy will check the values of three environment variables: API_TOKEN, LOG_LEVEL, and PARALLEL.

If either API_TOKEN or PARALLEL is not set, an error will be raised:

envy: The required environment variable "API_TOKEN" is not defined.

The values for LOG_LEVEL and PARALLEL will be parsed to match their type annotations.
If LOG_LEVEL is not set, it will use the default value, 'info.
The values will be stored in api-token, log-level, and parallel?, all of which will be provided by the enclosing module.

Now just (require (prefix-in env: "environment.rkt")), and the environment variables are guaranteed to be available in your application's code.
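
For readers who think better in terms of plain lookups, the behavior listed above can be approximated with a short Ruby sketch. This is not Envy's implementation (Envy is a Typed Racket macro); env_name, read_env, PARSERS, and NO_DEFAULT are hypothetical names, and the accepted boolean spellings are an assumption. Only the name mapping and the error message are taken from the examples above.

```ruby
# Derive the environment variable name from a Racket-style identifier,
# matching the examples in the text: api-token -> API_TOKEN, parallel? -> PARALLEL.
def env_name(identifier)
  identifier.to_s.delete_suffix("?").tr("-", "_").upcase
end

# Sentinel so that nil or false can be used as a legitimate default.
NO_DEFAULT = Object.new.freeze

# Hypothetical parsers keyed by type. Which strings count as booleans is an
# assumption made for this sketch.
PARSERS = {
  string:  ->(raw) { raw },
  symbol:  ->(raw) { raw.to_sym },
  boolean: ->(raw) { { "true" => true, "false" => false }.fetch(raw) }
}.freeze

# Look up a variable, parse it if present, fall back to the default if one
# was given, and otherwise fail loudly.
def read_env(env, identifier, type: :string, default: NO_DEFAULT)
  name = env_name(identifier)
  raw = env[name]
  return PARSERS.fetch(type).call(raw) unless raw.nil?
  return default unless default.equal?(NO_DEFAULT)

  raise KeyError, %(envy: The required environment variable "#{name}" is not defined.)
end

env = { "API_TOKEN" => "abc123", "PARALLEL" => "true" }
p read_env(env, "api-token")
p read_env(env, "log-level", type: :symbol, default: :info)
p read_env(env, "parallel?", type: :boolean)
```

Passing the environment in as an argument keeps the sketch easy to test; in a real program the first argument would simply be ENV.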

Working with Typed Racket

As you may have noticed from the example above, Envy is built with Typed Racket in mind. In fact, define/provide-environment will only work within a Typed Racket module, but that doesn't mean Envy can't be used with plain Racket: the manifest module can always be required by any kind of Racket module.

However, when using Typed Racket, Envy provides additional bonuses. Environment variables are inherently untyped—they're all just strings—but Envy assigns the proper type to each environment variable automatically, so no casting is necessary.

> parallel?
- : Boolean
#t

Envy really shines when using optional environment variables with the #:default option. The type of the value given to #:default doesn't need to be the same as the type of the environment variable itself, and if it isn't, Envy will assign the value a union type.

> (define-environment
    [num-threads : Positive-Integer #:default #f])
> num-threads
- : (U Positive-Integer #f)
#f

This added level of type-safety means it's easy to manage optional variables that don't have reasonable defaults: the type system will enforce that all code considers the possibility that such variables do not exist.

And more...

To see the full set of features that Envy already provides, take a look at the documentation. This is just the first release, based on my initial use cases, and I'm sure there are more features Envy could have to accommodate common application configuration patterns. If you have an idea that could make Envy better, open an issue and make a suggestion! I already have plans for a #lang envy DSL, which will hopefully cut the boilerplate out in its entirety.

And finally, to give credit where credit is due, Envy is heavily inspired by Envied (both in name and function), an environment variable manager for Ruby, which I've used to great effect.

Try it out!

raco pkg install envy
Envy on GitHub
Envy documentation

Deploying Racket applications on Heroku

2015-08-22T00:00:00Z

Heroku is a "platform as a service" that provides an incredibly simple way to deploy internet applications, and I take liberal advantage of its free tier for testing out small projects. It has support for a variety of languages built in, but Racket is not currently among them. Fortunately, Heroku provides an interface for adding custom build processes for arbitrary types of applications, called "buildpacks". I've built one for Racket apps, and with just a little bit of configuration, it's possible to get a Racket webserver running on Heroku.

Building the server

Racket's web-server package makes building and running a simple server incredibly easy. Here's all the code we'll need to get going:

#lang racket

(require web-server/servlet
         web-server/servlet-env)

(define (start req)
  (response/xexpr
   '(html (head (title "Racket Heroku App"))
          (body (h1 "It works!")))))

(serve/servlet start #:servlet-path "/")

Running the above file will start up the server on the default port, 8000. When running on Heroku, however, we're required to bind to the port that Heroku provides via the PORT environment variable. We can access this using Racket's getenv function.

Additionally, the Racket web server specifically binds to localhost, but Heroku doesn't allow that restriction, so we need to pass #f for the #:listen-ip argument.

(define port (if (getenv "PORT")
                 (string->number (getenv "PORT"))
                 8080))
(serve/servlet start
               #:servlet-path "/"
               #:listen-ip #f
               #:port port)

Also, by default, serve/servlet will open a web browser automatically when the program is run, which is very useful for rapid prototyping within something like DrRacket, but we'll want to turn that off.

(serve/servlet start
               #:servlet-path "/"
               #:listen-ip #f
               #:port port
               #:command-line? #t)

That's it! Now we have a Racket web server that can run on Heroku. Obviously it's not a very interesting application right now, but that's fine for our purposes.

Setting up our app for Heroku

The next step is to actually create an app on Heroku. Don't worry—it's free! That said, explaining precisely how Heroku works is outside the scope of this article. Just make an account, then create an app. I called mine "racket-heroku-sample". Once you've created an app and set up Heroku's command-line tool, you can specify the proper buildpack:

$ git init
$ heroku git:remote -a racket-heroku-sample
$ heroku buildpacks:set https://github.com/lexi-lambda/heroku-buildpack-racket

We'll also need to pick a particular Racket version before we deploy our app. At the time of this writing, Racket 6.2.1 is the latest version, so I just set the RACKET_VERSION environment variable as follows:

$ heroku config:set RACKET_VERSION=6.2.1

Now there's just one thing left to do before we can push to Heroku: we need to tell Heroku what command to use to run our application. To do this, we use something called a "Procfile" that contains information about the process types for our app. Heroku supports multiple processes of different types, but we're just going to have a single web process.

Specifically, we just want to run our server.rkt module. The Racket buildpack installs the repository as a package, so we can run racket with the -l flag to specify a module path, which will be more robust than specifying a filesystem path directly. Therefore, our Procfile will look like this:

web: racket -l sample-heroku-app/server

Now all that's left to do is push our repository to Heroku's git remote. Once the build completes, we can navigate to our app's URL and actually see it running live!

Conclusion

That's all that's needed to get a Racket app up and running on Heroku, but it probably isn't the best way to manage a real application. Usually it's best to use a continuous integration service to automatically deploy certain GitHub branches to Heroku, after running the tests, of course. Also, a real application would obviously be a little more complicated.

That said, this provides the foundation and shell. If you'd like to see the sample app used in this post, you can find it on GitHub here. For more details on the buildpack itself, it's also available on GitHub here.

Automatically deploying a Frog-powered blog to GitHub pages

2015-07-18T00:00:00Z

So, I have a blog now. It's a simple static blog, but what's unique about it is that it's powered by Racket; specifically, it uses Greg Hendershott's fantastic Frog tool. I've taken this and moulded it to my tastes to build my blog, including configuring automatic deployment via Travis CI, so my blog is always up-to-date.

Setting up Frog

I should note that Frog itself was wonderfully easy to drop in and get running. Just following the readme, a simple raco pkg install frog followed by raco frog --init and raco frog -bp created a running blog and opened it in my web browser. There was nothing more to it. Once that's done, all it takes to write a blog post is raco frog -n "Post Title", and you're good to go.

By default, Frog uses Bootstrap, which provides a lot of the necessary scaffolding for you, but I opted to roll my own layout using flexbox. I also decided to use Sass for my stylesheets, potentially with support for CoffeeScript later, so I wanted to have a good flow for compiling all the resources for deployment. To do that, I used Gulp in conjunction with NPM for build and dependency management.

Going this route has a few advantages, primarily the fact that updating dependencies becomes much easier, and I can build and deploy my blog with just a couple of commands without needing to commit compiled or minified versions of my sources to version control.

Configuring automatic deployment with Travis

Once Frog itself was configured and my styling was finished, I started looking into how to deploy my blog to a GitHub page without needing to check in any of the generated files to source control. I found a couple of resources, the most useful one being this Gist, which describes how to set up deployment for any project. The basic idea is to create a deployment script which will automatically generate your project, initialize a git repository with the generated files, and push to GitHub's special gh-pages branch.

To make this easy, Frog can be configured to output to a separate directory via the .frogrc configuration file. I chose to output to the out directory:

output-dir = out

I also configured my Gulp build to output my CSS into the same output directory. Now, all that's necessary in order to deploy the blog to GitHub is to initialize a Git repository in the output directory, and push the files to the remote branch.

$ cd out
$ git init
$ git add .
$ git commit -m "Deploy to GitHub Pages"
$ git push --force "$REMOTE_URL" master:gh-pages

The next step is to configure Travis so that it can securely push to the GitHub repository with the required credentials. This can be done with Travis's encryption keys along with a GitHub personal access token. Just install the Travis CLI client, copy the access token, and run a command:

$ gem install travis
$ travis encrypt GH_TOKEN=<access token...>

The output of that command is an encrypted value to be placed in an environment variable in the project's .travis.yml configuration file. The URL for the repository on GitHub will need to be specified as well:

env:
  global:
  - GH_REF: 'github.com/<gh-username>/<gh-repo>.git'
  - secure: <encrypted data...>

Now all that's left is configuring the .travis.yml to run Frog. Since Travis doesn't natively support Racket at the time of this writing, the choice of "language" is somewhat arbitrary, but since I want Pygments installed for syntax highlighting, I set my project type to python, then installed Racket and Frog as pre-installation steps.

env:
  global:
  - GH_REF: 'github.com/<gh-username>/<gh-repo>.git'
  - secure: <encrypted data...>
  - RACKET_DIR: '~/racket'
  - RACKET_VERSION: '6.2'

before_install:
- git clone https://github.com/greghendershott/travis-racket.git
- cat travis-racket/install-racket.sh | bash
- export PATH="${RACKET_DIR}/bin:${PATH}"

install:
- raco pkg install --deps search-auto frog

(It might be worth noting that Greg Hendershott also maintains the repository that contains the above Travis build script!)

Finally, in my case, I wasn't deploying to a project-specific GitHub page. Instead, I wanted to deploy to my user page, which uses master, not gh-pages. Obviously, I didn't want Travis running on my master branch, since it would be deploying to that, so I added a branch whitelist:

branches:
  only:
  - source

All that was left to do was to write up the actual deployment script to be used by Travis. Based on the one provided in the above Gist, mine looked like this:

#!/bin/bash
set -ev # exit with nonzero exit code if anything fails

# clear the output directory
rm -rf out || exit 0;

# build the blog files + install pygments for highlighting support
npm install
npm run build
pip install pygments
raco frog --build

# go to the out directory and create a *new* Git repo
cd out
git init

# inside this git repo we'll pretend to be a new user
git config user.name "Travis CI"
git config user.email "<your@email.here>"

# The first and only commit to this new Git repo contains all the
# files present with the commit message "Deploy to GitHub Pages".
git add .
git commit -m "Deploy to GitHub Pages"

# Force push from the current repo's master branch to the remote
# repo. (All previous history on the branch will be lost, since we are
# overwriting it.) We redirect any output to /dev/null to hide any sensitive
# credential data that might otherwise be exposed.
git push --force --quiet "https://${GH_TOKEN}@${GH_REF}" master > /dev/null 2>&1

For reference, my final .travis.yml looked like this:

language: python
python:
- '3.4'

branches:
  only:
  - source

env:
  global:
  - GH_REF: 'github.com/lexi-lambda/lexi-lambda.github.io.git'
  - secure: <long secure token...>
  - RACKET_DIR: '~/racket'
  - RACKET_VERSION: '6.2'

before_install:
- git clone https://github.com/greghendershott/travis-racket.git
- cat travis-racket/install-racket.sh | bash
- export PATH="${RACKET_DIR}/bin:${PATH}"

install:
- raco pkg install --deps search-auto frog

script: bash ./deploy.sh

That's it! Now I have a working blog that I can publish just by pushing to the source branch on GitHub.
