Alexis King’s Blog: An introduction to typeclass metaprogramming (Alexis King, 2021-03-25)<article><p><em>Typeclass metaprogramming</em> is a powerful technique that lets Haskell programmers automatically generate term-level code from static type information. It has been used to great effect in several popular Haskell libraries (such as the <a href="https://hackage.haskell.org/package/servant">servant</a> ecosystem), and it is the core mechanism used to implement generic programming via <a href="https://hackage.haskell.org/package/base-4.14.1.0/docs/GHC-Generics.html">GHC generics</a>. Despite this, remarkably little material exists that explains the technique, leaving it as folk knowledge among advanced Haskell programmers.</p><p>This blog post attempts to remedy that by providing an overview of the foundational concepts behind typeclass metaprogramming. It does <em>not</em> attempt to be a complete guide to type-level programming in Haskell (such a task could easily fill a book), but it does provide explanations and illustrations of the most essential components. This is also <em>not</em> a blog post for Haskell beginners (familiarity with the essentials of the Haskell type system and several common GHC extensions is assumed), but it does not assume any prior knowledge of type-level programming.</p><h2><a name="part-1-basic-building-blocks"></a>Part 1: Basic building blocks</h2><p>Typeclass metaprogramming is a big subject, which makes covering it in a blog post tricky.
To break it into more manageable chunks, this post is divided into several parts, each of which introduces new type system features or type-level programming techniques, then presents an example of how they can be applied.</p><p>To start, we’ll cover the absolute foundations of typeclass metaprogramming.</p><h3><a name="typeclasses-as-functions-from-types-to-terms"></a>Typeclasses as functions from types to terms</h3><p>As its name implies, typeclass metaprogramming (henceforth TMP<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup>) centers around Haskell’s typeclass construct. Traditionally, typeclasses are viewed as a mechanism for principled operator overloading; for example, they underpin Haskell’s polymorphic <code>==</code> operator via the <code>Eq</code> class. Though that is often the most useful way to think about typeclasses, TMP encourages a different perspective: <strong>typeclasses are functions from types to (runtime) terms</strong>.</p><p>What does that mean? Let’s illustrate with an example. Suppose we define a typeclass called <code>TypeOf</code>:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">TypeOf</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">String</span></code></pre><p>The idea is that the <code>typeOf</code> method accepts some value and returns the name of its type as a string. To illustrate, here are a couple of potential instances:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">TypeOf</span> <span class="kt">Bool</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="kr">_</span> <span class="ow">=</span> <span class="s">"Bool"</span>
<span class="kr">instance</span> <span class="kt">TypeOf</span> <span class="kt">Char</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="kr">_</span> <span class="ow">=</span> <span class="s">"Char"</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">TypeOf</span> <span class="n">a</span><span class="p">,</span> <span class="kt">TypeOf</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">TypeOf</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=</span> <span class="s">"("</span> <span class="o">++</span> <span class="n">typeOf</span> <span class="n">a</span> <span class="o">++</span> <span class="s">", "</span> <span class="o">++</span> <span class="n">typeOf</span> <span class="n">b</span> <span class="o">++</span> <span class="s">")"</span></code></pre><p>Given these instances, we can observe that they do what we expect in GHCi:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">typeOf</span> <span class="p">(</span><span class="kt">True</span><span class="p">,</span> <span class="sc">'a'</span><span class="p">)</span>
<span class="s">"(Bool, Char)"</span></code></pre><p>Note that both the <code>TypeOf Bool</code> and <code>TypeOf Char</code> instances ignore the argument to <code>typeOf</code> altogether. This makes sense, as the whole point of the <code>TypeOf</code> class is to get access to <em>type</em> information, which is the same regardless of which value is provided. To make this more explicit, we can take advantage of some GHC extensions to eliminate the value-level argument altogether:</p><pre><code class="pygments"><span class="cm">{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}</span>
<span class="kr">class</span> <span class="kt">TypeOf</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">::</span> <span class="kt">String</span></code></pre><p>This typeclass definition is a little unusual, as the type parameter <code>a</code> doesn’t appear anywhere in the body. To understand what it means, recall that the type of each method of a typeclass is implicitly extended with the typeclass’s constraint. For example, in the definition</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Show</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">show</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">String</span></code></pre><p>the full type of the <code>show</code> method is implicitly extended with a <code>Show a</code> constraint to yield:</p><pre><code class="pygments"><span class="nf">show</span> <span class="ow">::</span> <span class="kt">Show</span> <span class="n">a</span> <span class="ow">=></span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">String</span></code></pre><p>Furthermore, if we write <code>forall</code>s explicitly, each typeclass method is also implicitly quantified over the class’s type parameters, which makes the following the <em>full</em> type of <code>show</code>:</p><pre><code class="pygments"><span class="nf">show</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="kt">Show</span> <span class="n">a</span> <span class="ow">=></span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">String</span></code></pre><p>In the same vein, we can write out the full type of <code>typeOf</code>, as given by our new definition of <code>TypeOf</code>:</p><pre><code class="pygments"><span class="nf">typeOf</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="kt">TypeOf</span> <span class="n">a</span> <span class="ow">=></span> <span class="kt">String</span></code></pre><p>This type is still unusual, as the <code>a</code> type parameter doesn’t appear anywhere to the right of the <code>=></code> arrow. This makes the type parameter trivially <em>ambiguous</em>, which is to say it’s impossible for GHC to infer what <code>a</code> should be at any call site. 
Fortunately, <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/type_applications.html">we can use <code>TypeApplications</code></a> to pass a type for <code>a</code> directly, as we can see in the updated definition of <code>TypeOf (a, b)</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">TypeOf</span> <span class="kt">Bool</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">=</span> <span class="s">"Bool"</span>
<span class="kr">instance</span> <span class="kt">TypeOf</span> <span class="kt">Char</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">=</span> <span class="s">"Char"</span>
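<span class="c1">-- Hypothetical extra instance, not part of the original post. It is only a</span>
<span class="c1">-- sketch to show that extending this type-to-term function to another type</span>
<span class="c1">-- is just one more instance declaration.</span>
<span class="kr">instance</span> <span class="kt">TypeOf</span> <span class="kt">Int</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">=</span> <span class="s">"Int"</span>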
<span class="kr">instance</span> <span class="p">(</span><span class="kt">TypeOf</span> <span class="n">a</span><span class="p">,</span> <span class="kt">TypeOf</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">TypeOf</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">=</span> <span class="s">"("</span> <span class="o">++</span> <span class="n">typeOf</span> <span class="o">@</span><span class="n">a</span> <span class="o">++</span> <span class="s">", "</span> <span class="o">++</span> <span class="n">typeOf</span> <span class="o">@</span><span class="n">b</span> <span class="o">++</span> <span class="s">")"</span></code></pre><p>Once again, we can test out our new definitions in GHCi:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">typeOf</span> <span class="o">@</span><span class="kt">Bool</span>
<span class="s">"Bool"</span>
<span class="nf">ghci</span><span class="o">></span> <span class="n">typeOf</span> <span class="o">@</span><span class="p">(</span><span class="kt">Bool</span><span class="p">,</span> <span class="kt">Char</span><span class="p">)</span>
<span class="s">"(Bool, Char)"</span></code></pre><p>This illustrates very succinctly how typeclasses can be seen as functions from types to terms. Our <code>typeOf</code> function is, quite literally, a function that accepts a single type as an argument and returns a term-level <code>String</code>. Of course, the <code>TypeOf</code> typeclass is not a particularly <em>useful</em> example of such a function, but it demonstrates how easy it is to construct.</p><h3><a name="type-level-interpreters"></a>Type-level interpreters</h3><p>One important consequence of eliminating the value-level argument of <code>typeOf</code> is that there is no need for its argument type to actually be <em>inhabited</em>. For example, consider the <code>TypeOf</code> instance on <code>Void</code> from <code>Data.Void</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">TypeOf</span> <span class="kt">Void</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">=</span> <span class="s">"Void"</span></code></pre><p>The above instance is no different from the ones on <code>Bool</code> and <code>Char</code>, even though <code>Void</code> is a completely uninhabited type. This is a key point: as we delve into type-level programming, keep in mind that the language of types is mostly blind to the term-level meaning of those types. Although we usually write typeclasses that operate on values, this is not at all essential. This turns out to matter in practice, even in something as simple as the definition of <code>TypeOf</code> on lists:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">TypeOf</span> <span class="n">a</span> <span class="ow">=></span> <span class="kt">TypeOf</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="kr">where</span>
  <span class="n">typeOf</span> <span class="ow">=</span> <span class="s">"["</span> <span class="o">++</span> <span class="n">typeOf</span> <span class="o">@</span><span class="n">a</span> <span class="o">++</span> <span class="s">"]"</span></code></pre><p>If <code>typeOf</code> required a value-level argument, not just a type, our instance above would be in a pickle when given the empty list, since it would have no value of type <code>a</code> to recursively apply <code>typeOf</code> to. But since <code>typeOf</code> only accepts a type-level argument, the term-level meaning of the list type poses no obstacle.</p><p>A perhaps unintuitive consequence of this property is that we can use typeclasses to write interesting functions on types even if none of the types are inhabited at all. For example, consider the following pair of type definitions:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Z</span>
<span class="kr">data</span> <span class="kt">S</span> <span class="n">a</span></code></pre><p>It is impossible to construct any values of these types, but we can nevertheless use them to construct natural numbers at the type level:</p><ul><li><p><code>Z</code> is a type that represents 0.</p></li><li><p><code>S Z</code> is a type that represents 1.</p></li><li><p><code>S (S Z)</code> is a type that represents 2.</p></li></ul><p>And so on. These types might not seem very useful, since they aren’t inhabited by any values, but remarkably, we can still use a typeclass to distinguish them and convert them to term-level values:</p><pre><code class="pygments"><span class="kr">import</span> <span class="nn">Numeric.Natural</span>
<span class="kr">class</span> <span class="kt">ReifyNat</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">reifyNat</span> <span class="ow">::</span> <span class="kt">Natural</span>
<span class="kr">instance</span> <span class="kt">ReifyNat</span> <span class="kt">Z</span> <span class="kr">where</span>
  <span class="n">reifyNat</span> <span class="ow">=</span> <span class="mi">0</span>
<span class="kr">instance</span> <span class="kt">ReifyNat</span> <span class="n">a</span> <span class="ow">=></span> <span class="kt">ReifyNat</span> <span class="p">(</span><span class="kt">S</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">reifyNat</span> <span class="ow">=</span> <span class="mi">1</span> <span class="o">+</span> <span class="n">reifyNat</span> <span class="o">@</span><span class="n">a</span></code></pre><p>As its name implies, <code>reifyNat</code> reifies a type-level natural number encoded using our datatypes above into a term-level <code>Natural</code> value:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">reifyNat</span> <span class="o">@</span><span class="kt">Z</span>
<span class="mi">0</span>
<span class="nf">ghci</span><span class="o">></span> <span class="n">reifyNat</span> <span class="o">@</span><span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">)</span>
<span class="mi">1</span>
<span class="nf">ghci</span><span class="o">></span> <span class="n">reifyNat</span> <span class="o">@</span><span class="p">(</span><span class="kt">S</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">))</span>
<span class="mi">2</span></code></pre><p>One way to think about <code>reifyNat</code> is as an <em>interpreter</em> of a type-level language. In this case, the type-level language is very simple, only capturing natural numbers, but in general, it could be arbitrarily complex, and typeclasses can be used to give it a useful meaning, even if it has no term-level representation.</p><h3><a name="overlapping-instances"></a>Overlapping instances</h3><p>Generally, typeclass instances aren’t supposed to overlap. That is, if you write an instance for <code>Show (Maybe a)</code>, you aren’t supposed to <em>also</em> write an instance for <code>Show (Maybe Bool)</code>, since it isn’t clear whether <code>show (Just True)</code> should use the first instance or the second. For that reason, by default, GHC rejects any form of instance overlap as soon as it detects it.</p><p>Usually, this is the right behavior. Because Haskell’s typeclass system is designed to preserve coherence (that is, the same combination of type arguments always selects the same instance), overlapping instances can be unintuitive or even cause nonsensical behavior if orphan instances are defined. However, when doing TMP, it’s useful to make exceptions to that rule of thumb, so GHC provides the option to explicitly opt in to overlapping instances.</p><p>As a simple example, suppose we wanted to write a typeclass that checks whether a given type is <code>()</code> or not:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">IsUnit</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">isUnit</span> <span class="ow">::</span> <span class="kt">Bool</span></code></pre><p>If we were to write an ordinary, value-level function, we could write something like this pseudo-Haskell:</p><pre><code class="pygments"><span class="c1">-- not actually valid Haskell, just an example</span>
<span class="nf">isUnit</span> <span class="ow">::</span> <span class="o">*</span> <span class="ow">-></span> <span class="kt">Bool</span>
<span class="nf">isUnit</span> <span class="nb">()</span> <span class="ow">=</span> <span class="kt">True</span>
<span class="nf">isUnit</span> <span class="kr">_</span> <span class="ow">=</span> <span class="kt">False</span></code></pre><p>But if we try to translate this to typeclass instances, we’ll get a problem:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">IsUnit</span> <span class="nb">()</span> <span class="kr">where</span>
  <span class="n">isUnit</span> <span class="ow">=</span> <span class="kt">True</span>
<span class="kr">instance</span> <span class="kt">IsUnit</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">isUnit</span> <span class="ow">=</span> <span class="kt">False</span></code></pre><p>The problem is that a function definition has a closed set of clauses matched from top to bottom, but typeclass instances are open and unordered.<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup> This means GHC will complain about instance overlap if we try to evaluate <code>isUnit @()</code>:</p><pre><code>ghci> isUnit @()
error:
    • Overlapping instances for IsUnit ()
        arising from a use of ‘isUnit’
      Matching instances:
        instance IsUnit a
        instance IsUnit ()
</code></pre><p>To fix this, we have to explicitly mark <code>IsUnit ()</code> as overlapping:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="cm">{-# OVERLAPPING #-}</span> <span class="kt">IsUnit</span> <span class="nb">()</span> <span class="kr">where</span>
  <span class="n">isUnit</span> <span class="ow">=</span> <span class="kt">True</span></code></pre><p>Now GHC accepts the expression without complaint:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">isUnit</span> <span class="o">@</span><span class="nb">()</span>
<span class="kt">True</span></code></pre><p>What does the <code>{-# OVERLAPPING #-}</code> pragma do, exactly? The gory details are <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/instances.html#overlapping-instances">spelled out in the GHC User’s Guide</a>, but the simple explanation is that <code>{-# OVERLAPPING #-}</code> relaxes the overlap checker as long as the instance is <em>strictly more specific</em> than the instance(s) it overlaps with. In this case, that is true: <code>IsUnit ()</code> is trivially more specific than <code>IsUnit a</code>, since the former only matches <code>()</code> while the latter matches anything at all. That means our overlap is well-formed, and instance resolution should behave the way we’d like.</p><p>Overlapping instances are a useful tool when performing TMP, as they make it possible to write piecewise functions on types in the same way it’s possible to write piecewise functions on terms. However, they must still be used with care, as without understanding how they work, they can produce unintuitive results. For an example of how things can go wrong, consider the following definition:</p><pre><code class="pygments"><span class="nf">guardUnit</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Either</span> <span class="kt">String</span> <span class="n">a</span>
<span class="nf">guardUnit</span> <span class="n">x</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">isUnit</span> <span class="o">@</span><span class="n">a</span> <span class="kr">of</span>
  <span class="kt">True</span> <span class="ow">-></span> <span class="kt">Left</span> <span class="s">"unit is not allowed"</span>
  <span class="kt">False</span> <span class="ow">-></span> <span class="kt">Right</span> <span class="n">x</span></code></pre><p>The intent of <code>guardUnit</code> is to use <code>isUnit</code> to detect if its argument is of type <code>()</code>, and if it is, to return an error. However, even though we marked <code>IsUnit ()</code> overlapping, we still get an overlapping instance error:</p><pre><code>error:
    • Overlapping instances for IsUnit a arising from a use of ‘isUnit’
      Matching instances:
        instance IsUnit a
        instance [overlapping] IsUnit ()
    • In the expression: isUnit @a
</code></pre><p>What gives? The problem is that GHC simply doesn’t know what type <code>a</code> is when compiling <code>guardUnit</code>. It <em>could</em> be instantiated to <code>()</code> where it’s called, but it might not be. Therefore, GHC doesn’t know which instance to pick, and an overlapping instance error is still reported.</p><p>This behavior is actually a very, very good thing. If GHC were to blindly pick the <code>IsUnit a</code> instance in this case, then <code>guardUnit</code> would always take the <code>False</code> branch, even when passed a value of type <code>()</code>! That would certainly not be what was intended, so it’s better to reject this program than to silently do the wrong thing. However, in more complicated situations, it can be quite surprising that GHC is complaining about instance overlap even when <code>{-# OVERLAPPING #-}</code> annotations are used, so it’s important to keep their limitations in mind.</p><p>As it happens, in this particular case, the error is easily remedied. We simply have to add an <code>IsUnit</code> constraint to the type signature of <code>guardUnit</code>:</p><pre><code class="pygments"><span class="nf">guardUnit</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="kt">IsUnit</span> <span class="n">a</span> <span class="ow">=></span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Either</span> <span class="kt">String</span> <span class="n">a</span>
<span class="nf">guardUnit</span> <span class="n">x</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">isUnit</span> <span class="o">@</span><span class="n">a</span> <span class="kr">of</span>
  <span class="kt">True</span> <span class="ow">-></span> <span class="kt">Left</span> <span class="s">"unit is not allowed"</span>
  <span class="kt">False</span> <span class="ow">-></span> <span class="kt">Right</span> <span class="n">x</span></code></pre><p>Now picking the right <code>IsUnit</code> instance is deferred to the place where <code>guardUnit</code> is used, and the definition is accepted.<sup><a href="#footnote-3" id="footnote-ref-3-1">3</a></sup></p><h3><a name="type-families-are-functions-from-types-to-types"></a>Type families are functions from types to types</h3><p>In the previous sections, we discussed how typeclasses are functions from types to terms, but what about functions from types to types? For example, suppose we wanted to sum two type-level natural numbers and get a new type-level natural number as a result. For that, we can use a type family:</p><pre><code class="pygments"><span class="cm">{-# LANGUAGE TypeFamilies #-}</span>
<span class="kr">type</span> <span class="kr">family</span> <span class="kt">Sum</span> <span class="n">a</span> <span class="n">b</span> <span class="kr">where</span>
  <span class="kt">Sum</span> <span class="kt">Z</span> <span class="n">b</span> <span class="ow">=</span> <span class="n">b</span>
  <span class="kt">Sum</span> <span class="p">(</span><span class="kt">S</span> <span class="n">a</span><span class="p">)</span> <span class="n">b</span> <span class="ow">=</span> <span class="kt">S</span> <span class="p">(</span><span class="kt">Sum</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span></code></pre><p>The above is a <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/type_families.html#closed-type-families">closed type family</a>, which works quite a lot like an ordinary Haskell function definition, just at the type level instead of at the value level. For comparison, the equivalent value-level definition of <code>Sum</code> would look like this:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Nat</span> <span class="ow">=</span> <span class="kt">Z</span> <span class="o">|</span> <span class="kt">S</span> <span class="kt">Nat</span>
<span class="c1">-- In a module (rather than at the GHCi prompt), add `import Prelude hiding (sum)` so this name does not clash with Prelude's sum.</span>
<span class="nf">sum</span> <span class="ow">::</span> <span class="kt">Nat</span> <span class="ow">-></span> <span class="kt">Nat</span> <span class="ow">-></span> <span class="kt">Nat</span>
<span class="nf">sum</span> <span class="kt">Z</span> <span class="n">b</span> <span class="ow">=</span> <span class="n">b</span>
<span class="nf">sum</span> <span class="p">(</span><span class="kt">S</span> <span class="n">a</span><span class="p">)</span> <span class="n">b</span> <span class="ow">=</span> <span class="kt">S</span> <span class="p">(</span><span class="n">sum</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span></code></pre><p>As you can see, the two are quite similar. Both are defined via a pair of pattern-matching clauses, and though it doesn’t matter here, both closed type families and ordinary functions evaluate their clauses top to bottom.</p><p>To test our definition of <code>Sum</code> in GHCi, we can use <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/ghci.html#ghci-cmd-:kind">the <code>:kind!</code> command</a>, which prints out a type and its kind after reducing it as much as possible:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="kt">:</span><span class="n">kind</span><span class="o">!</span> <span class="kt">Sum</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">)</span> <span class="p">(</span><span class="kt">S</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">))</span>
<span class="kt">Sum</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">)</span> <span class="p">(</span><span class="kt">S</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">))</span> <span class="ow">::</span> <span class="o">*</span>
<span class="ow">=</span> <span class="kt">S</span> <span class="p">(</span><span class="kt">S</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">))</span></code></pre><p>We can also combine <code>Sum</code> with our <code>ReifyNat</code> class from earlier:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">reifyNat</span> <span class="o">@</span><span class="p">(</span><span class="kt">Sum</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">)</span> <span class="p">(</span><span class="kt">S</span> <span class="p">(</span><span class="kt">S</span> <span class="kt">Z</span><span class="p">)))</span>
<span class="mi">3</span></code></pre><p>Type families are a useful complement to typeclasses when performing type-level programming. They allow computation to occur entirely at the type level, which is necessarily computation that occurs entirely at compile time, and the result can then be passed to a typeclass method to produce a term-level value.</p><h3><a name="example-1-generalized-concat"></a>Example 1: Generalized <code>concat</code></h3><p>Finally, using what we’ve discussed so far, we can do our first bit of practical TMP. Specifically, we’re going to define a <code>flatten</code> function similar to like-named functions provided by many dynamically typed languages. In those languages, <code>flatten</code> is like <code>concat</code>, but it works on a list of arbitrary depth. For example, we might use it like this:</p><pre><code class="pygments"><span class="o">></span> <span class="n">flatten</span> <span class="p">[[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]],</span> <span class="p">[[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">],</span> <span class="p">[</span><span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">]]]</span>
<span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">]</span></code></pre><p>In Haskell, lists of different depths have different types, so multiple levels of <code>concat</code> have to be applied explicitly. But using TMP, we can write a generic <code>flatten</code> function that operates on lists of any depth!</p><p>Since this is <em>typeclass</em> metaprogramming, we’ll unsurprisingly begin with a typeclass:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Flatten</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">flatten</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">[</span><span class="o">???</span><span class="p">]</span></code></pre><p>Our first challenge is writing the return type of <code>flatten</code>. Since the argument could be a list of any depth, there’s no direct way to obtain its element type. Fortunately, we can define a type family that does precisely that:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">family</span> <span class="kt">ElementOf</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kt">ElementOf</span> <span class="p">[[</span><span class="n">a</span><span class="p">]]</span> <span class="ow">=</span> <span class="kt">ElementOf</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span>
  <span class="kt">ElementOf</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">=</span> <span class="n">a</span>
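<span class="c1">-- Hypothetical sanity checks, not part of the original post. Each signature</span>
<span class="c1">-- below only typechecks against `id` if ElementOf reduces to Int at that</span>
<span class="c1">-- nesting depth (the second goes through the first equation twice, then</span>
<span class="c1">-- falls through to the second equation).</span>
<span class="nf">_elementOfDepth1</span> <span class="ow">::</span> <span class="kt">ElementOf</span> <span class="p">[</span><span class="kt">Int</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">Int</span>
<span class="nf">_elementOfDepth1</span> <span class="ow">=</span> <span class="n">id</span>
<span class="nf">_elementOfDepth3</span> <span class="ow">::</span> <span class="kt">ElementOf</span> <span class="p">[[[</span><span class="kt">Int</span><span class="p">]]]</span> <span class="ow">-></span> <span class="kt">Int</span>
<span class="nf">_elementOfDepth3</span> <span class="ow">=</span> <span class="n">id</span>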
<span class="kr">class</span> <span class="kt">Flatten</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">flatten</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">[</span><span class="kt">ElementOf</span> <span class="n">a</span><span class="p">]</span></code></pre><p>Now we can write our <code>Flatten</code> instances. The base case is when the type is a list of depth 1, in which case we don’t have any flattening to do:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">Flatten</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="kr">where</span>
  <span class="n">flatten</span> <span class="n">x</span> <span class="ow">=</span> <span class="n">x</span></code></pre><p>The inductive case is when the type is a nested list, in which case we want to apply <code>concat</code> and recur:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="cm">{-# OVERLAPPING #-}</span> <span class="kt">Flatten</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">=></span> <span class="kt">Flatten</span> <span class="p">[[</span><span class="n">a</span><span class="p">]]</span> <span class="kr">where</span>
  <span class="n">flatten</span> <span class="n">x</span> <span class="ow">=</span> <span class="n">flatten</span> <span class="p">(</span><span class="n">concat</span> <span class="n">x</span><span class="p">)</span></code></pre><p>Sadly, if we try to compile these definitions, GHC will reject our <code>Flatten [a]</code> instance:</p><pre><code>error:
• Couldn't match type ‘a’ with ‘ElementOf [a]’
‘a’ is a rigid type variable bound by
the instance declaration
Expected type: [ElementOf [a]]
Actual type: [a]
• In the expression: x
In an equation for ‘flatten’: flatten x = x
In the instance declaration for ‘Flatten [a]’
|
| flatten x = x
| ^
</code></pre><p>At first blush, this error looks very confusing. Why doesn’t GHC think <code>a</code> and <code>ElementOf [a]</code> are the same type? Well, consider what would happen if we picked a type like <code>[Int]</code> for <code>a</code>. Then <code>[a]</code> would be <code>[[Int]]</code>, a nested list, so the first case of <code>ElementOf</code> would apply. Therefore, GHC refuses to pick the second equation of <code>ElementOf</code> so hastily.</p><p>In this particular case, we might think that’s rather silly. After all, if <code>a</code> were <code>[Int]</code>, then GHC wouldn’t have picked the <code>Flatten [a]</code> instance to begin with; it would have picked the more specific <code>Flatten [[a]]</code> instance defined below instead. Therefore, the hypothetical situation above could never happen. Unfortunately, GHC does not realize this, so we find ourselves at an impasse.</p><p>Fortunately, we can soothe GHC’s anxiety by adding an extra constraint to our <code>Flatten [a]</code> instance:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="p">(</span><span class="kt">ElementOf</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="o">~</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">Flatten</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="kr">where</span>
<span class="n">flatten</span> <span class="n">x</span> <span class="ow">=</span> <span class="n">x</span></code></pre><p>This is a <em>type equality constraint</em>. Type equality constraints are written with the syntax <code>a ~ b</code>, and they state that <code>a</code> must be the same type as <code>b</code>. Type equality constraints are mostly useful when type families are involved, since they can be used (as in this case) to require a type family reduce to a certain type. In this case, we’re asserting that <code>ElementOf [a]</code> must always be <code>a</code>, which allows the instance to typecheck.</p><p>Note that this doesn’t let us completely wriggle out of our obligation, as the type equality constraint must <em>eventually</em> be checked when the instance is actually used, so initially this might seem like we’ve only deferred the problem to later. But in this case, that’s exactly what we need: by the time the <code>Flatten [a]</code> instance is selected, GHC will know that <code>a</code> is <em>not</em> a list type, and it will be able to reduce <code>ElementOf [a]</code> to <code>a</code> without difficulty. Indeed, we can see this for ourselves by using <code>flatten</code> in GHCi:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">flatten</span> <span class="p">[[[</span><span class="mi">1</span> <span class="ow">::</span> <span class="kt">Integer</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]],</span> <span class="p">[[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">],</span> <span class="p">[</span><span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">]]]</span>
<span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">7</span><span class="p">,</span><span class="mi">8</span><span class="p">]</span></code></pre><p>It works! But why do we need the type annotation on <code>1</code>? If we leave it out, we get a rather hairy type error:</p><pre><code>error:
• Couldn't match type ‘ElementOf [a0]’ with ‘ElementOf [a]’
Expected type: [ElementOf [a]]
Actual type: [ElementOf [a0]]
NB: ‘ElementOf’ is a non-injective type family
The type variable ‘a0’ is ambiguous
</code></pre><p>The issue here stems from the polymorphic nature of Haskell number literals. Theoretically, someone could define a <code>Num [a]</code> instance, in which case <code>1</code> could actually have a list type, and either case of <code>ElementOf</code> could match depending on the choice of <code>Num</code> instance. Of course, no such <code>Num</code> instance exists, nor should it, but the possibility of it being defined means GHC can’t be certain of the depth of the argument list.</p><p>This issue happens to come up a lot in simple examples of TMP, since polymorphic number literals introduce a level of ambiguity. In real programs, this is much less of an issue, since there’s no reason to call <code>flatten</code> on a completely hardcoded list! However, it’s still important to understand what these type errors mean and why they occur.</p><p>That wrinkle aside, <code>flatten</code> is a functioning example of what useful TMP can look like. We’ve written a single, generic definition that flattens lists of any depth, taking advantage of static type information to choose what to do at runtime.</p><h4><a name="typeclasses-as-compile-time-code-generation"></a>Typeclasses as compile-time code generation</h4><p>Presented with the above definition of <code>Flatten</code>, it might not be immediately obvious how to think about <code>Flatten</code> as a function from types to terms. 
After all, it looks a lot more like an “ordinary” typeclass (like, say, <code>Eq</code> or <code>Show</code>) than the <code>TypeOf</code> and <code>ReifyNat</code> classes we defined above.</p><p>One useful way to shift our perspective is to consider equivalent <code>Flatten</code> instances written using point-free style:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="p">(</span><span class="kt">ElementOf</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="o">~</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">Flatten</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="kr">where</span>
<span class="n">flatten</span> <span class="ow">=</span> <span class="n">id</span>
<span class="kr">instance</span> <span class="cm">{-# OVERLAPPING #-}</span> <span class="kt">Flatten</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">=></span> <span class="kt">Flatten</span> <span class="p">[[</span><span class="n">a</span><span class="p">]]</span> <span class="kr">where</span>
  <span class="n">flatten</span> <span class="ow">=</span> <span class="n">flatten</span> <span class="o">.</span> <span class="n">concat</span></code></pre><p>These definitions of <code>flatten</code> no longer (syntactically) depend on term-level arguments, just as our earlier definitions of <code>typeOf</code> and <code>reifyNat</code> accepted no term-level arguments. This allows us to consider what <code>flatten</code> might “expand to” given a type argument alone:</p><ul><li><p><code>flatten @[Int]</code> is just <code>id</code>, since the <code>Flatten [a]</code> instance is selected.</p></li><li><p><code>flatten @[[Int]]</code> is <code>flatten @[Int] . concat</code>, since the <code>Flatten [[a]]</code> instance is selected. That then becomes <code>id . concat</code>, which can be further simplified to just <code>concat</code>.</p></li><li><p><code>flatten @[[[Int]]]</code> is <code>flatten @[[Int]] . concat</code>, which simplifies to <code>concat . concat</code> by the same reasoning as above.</p></li><li><p><code>flatten @[[[[Int]]]]</code> is then <code>concat . concat . concat</code>, and so on.</p></li></ul><p>This meshes quite naturally with our intuition of typeclasses as functions from types to terms. Each application of <code>flatten</code> takes a type as an argument and produces some number of composed <code>concat</code>s as a result. From this perspective, <code>Flatten</code> is performing a kind of compile-time code generation, synthesizing an expression to do the concatenation on the fly by inspecting the type information.</p><p>This framing is one of the key ideas that makes TMP so powerful, and indeed, it explains why it’s worthy of the name <em>metaprogramming</em>. 
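</p><p>To see that expansion end to end, the pieces from this section can be collected into a single file. Treat this as a sketch rather than a canonical listing: the extension pragmas are an assumed sufficient set (the post never lists them), and <code>flattenDepth2</code> and <code>flattenDepth3</code> are hypothetical helper names that pin <code>flatten</code> to concrete depths, so instance resolution has nothing left to guess:</p><pre><code class="pygments">{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

-- Closed type family: strips list layers until only one remains.
type family ElementOf a where
  ElementOf [[a]] = ElementOf [a]
  ElementOf [a]   = a

class Flatten a where
  flatten :: a -> [ElementOf a]

-- Base case: a list of depth 1 is already flat.
instance (ElementOf [a] ~ a) => Flatten [a] where
  flatten = id

-- Inductive case: remove one layer with concat, then recur.
instance {-# OVERLAPPING #-} Flatten [a] => Flatten [[a]] where
  flatten = flatten . concat

-- Hypothetical helpers (not from the post): fixing the argument type
-- means each one should resolve to a fixed chain of instances.
flattenDepth2 :: [[Int]] -> [Int]
flattenDepth2 = flatten  -- expected to behave like concat

flattenDepth3 :: [[[Int]]] -> [Int]
flattenDepth3 = flatten  -- expected to behave like concat . concat
</code></pre><p>Loaded into GHCi, <code>flattenDepth3 [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]</code> evaluates to <code>[1,2,3,4,5,6,7,8]</code>, the same result <code>concat . concat</code> produces on that input.</p><p>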
As we continue to more sophisticated examples of TMP, try to keep this perspective in mind.</p><h2><a name="part-2-generic-programming"></a>Part 2: Generic programming</h2><p>Part 1 of this blog post established the foundational techniques used in TMP, all of which are useful on their own. If you’ve read up to this point, you now know enough to start applying TMP yourself, and the remainder of this blog post will simply continue to build upon what you already know.</p><p>In the previous section, we discussed how to use TMP to write a generic <code>flatten</code> operation. In this section, we’ll aim a bit higher: totally generic functions that operate on <em>arbitrary</em> datatypes.</p><h3><a name="open-type-families-and-associated-types"></a>Open type families and associated types</h3><p>Before we can dive into examples, we need to revisit type families. In the previous sections, we discussed closed type families, but we did not cover their counterpart, <em>open type families</em>. Like closed type families, open type families are effectively functions from types to types, but unlike closed type families, they are not defined with a predefined set of equations. Instead, new equations are added separately using <code>type instance</code> declarations. For example, we could define our <code>Sum</code> family from above like this:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">family</span> <span class="kt">Sum</span> <span class="n">a</span> <span class="n">b</span>
<span class="kr">type</span> <span class="kr">instance</span> <span class="kt">Sum</span> <span class="kt">Z</span> <span class="n">b</span> <span class="ow">=</span> <span class="n">b</span>
<span class="kr">type</span> <span class="kr">instance</span> <span class="kt">Sum</span> <span class="p">(</span><span class="kt">S</span> <span class="n">a</span><span class="p">)</span> <span class="n">b</span> <span class="ow">=</span> <span class="kt">S</span> <span class="p">(</span><span class="kt">Sum</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span></code></pre><p>In the case of <code>Sum</code>, this would not be very useful, and indeed, <code>Sum</code> is much better expressed as a closed type family than an open one. But the advantage of open type families is similar to the advantage of typeclasses: new equations can be added at any time, even in modules other than the one that declares the open type family.</p><p>This extensibility means open type families are used less for type-level computation and more for type-level maps that associate types with other types. For example, one might define a <code>Key</code> open type family that relates types to the types used to index them:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">family</span> <span class="kt">Key</span> <span class="n">a</span>
<span class="kr">type</span> <span class="kr">instance</span> <span class="kt">Key</span> <span class="p">(</span><span class="kt">Vector</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Int</span>
<span class="kr">type</span> <span class="kr">instance</span> <span class="kt">Key</span> <span class="p">(</span><span class="kt">Map</span> <span class="n">k</span> <span class="n">v</span><span class="p">)</span> <span class="ow">=</span> <span class="n">k</span>
<span class="kr">type</span> <span class="kr">instance</span> <span class="kt">Key</span> <span class="p">(</span><span class="kt">Trie</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">ByteString</span></code></pre><p>This can be combined with a typeclass to provide a generic way to see if a data structure contains a given key:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">HasKey</span> <span class="n">a</span> <span class="kr">where</span>
<span class="n">hasKey</span> <span class="ow">::</span> <span class="kt">Key</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Bool</span>
<span class="kr">instance</span> <span class="kt">HasKey</span> <span class="p">(</span><span class="kt">Vector</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">hasKey</span> <span class="n">i</span> <span class="n">vec</span> <span class="ow">=</span> <span class="n">i</span> <span class="o">>=</span> <span class="mi">0</span> <span class="o">&&</span> <span class="n">i</span> <span class="o"><</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Vector</span><span class="o">.</span><span class="n">length</span> <span class="n">vec</span>
<span class="kr">instance</span> <span class="kt">HasKey</span> <span class="p">(</span><span class="kt">Map</span> <span class="n">k</span> <span class="n">v</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">hasKey</span> <span class="ow">=</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Map</span><span class="o">.</span><span class="n">member</span>
<span class="kr">instance</span> <span class="kt">HasKey</span> <span class="p">(</span><span class="kt">Trie</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">hasKey</span> <span class="ow">=</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Trie</span><span class="o">.</span><span class="n">member</span></code></pre><p>In this case, anyone could define their own data structure, define instances of <code>Key</code> and <code>HasKey</code> for their data structure, and use <code>hasKey</code> to see if it contains a given key, regardless of the structure of those keys. In fact, it’s so common for open type families and typeclasses to cooperate in this way that GHC provides the option to make the connection explicit by defining them together:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">HasKey</span> <span class="n">a</span> <span class="kr">where</span>
<span class="kr">type</span> <span class="kt">Key</span> <span class="n">a</span>
<span class="n">hasKey</span> <span class="ow">::</span> <span class="kt">Key</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Bool</span>
<span class="kr">instance</span> <span class="kt">HasKey</span> <span class="p">(</span><span class="kt">Vector</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="kr">type</span> <span class="kt">Key</span> <span class="p">(</span><span class="kt">Vector</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Int</span>
<span class="n">hasKey</span> <span class="n">i</span> <span class="n">vec</span> <span class="ow">=</span> <span class="n">i</span> <span class="o">>=</span> <span class="mi">0</span> <span class="o">&&</span> <span class="n">i</span> <span class="o"><</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Vector</span><span class="o">.</span><span class="n">length</span> <span class="n">vec</span>
<span class="kr">instance</span> <span class="kt">HasKey</span> <span class="p">(</span><span class="kt">Map</span> <span class="n">k</span> <span class="n">v</span><span class="p">)</span> <span class="kr">where</span>
<span class="kr">type</span> <span class="kt">Key</span> <span class="p">(</span><span class="kt">Map</span> <span class="n">k</span> <span class="n">v</span><span class="p">)</span> <span class="ow">=</span> <span class="n">k</span>
<span class="n">hasKey</span> <span class="ow">=</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Map</span><span class="o">.</span><span class="n">member</span>
<span class="kr">instance</span> <span class="kt">HasKey</span> <span class="p">(</span><span class="kt">Trie</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="kr">type</span> <span class="kt">Key</span> <span class="p">(</span><span class="kt">Trie</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">ByteString</span>
  <span class="n">hasKey</span> <span class="ow">=</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Trie</span><span class="o">.</span><span class="n">member</span></code></pre><p>An open family declared inside a typeclass like this is called an <em>associated type</em>. It works exactly the same way as the separate definitions of <code>Key</code> and <code>HasKey</code>; it just uses a different syntax. Note that although the <code>family</code> and <code>instance</code> keywords have disappeared from the declarations, that is only an abbreviation; the keywords are simply implicitly added (and explicitly writing them is still allowed, though most people do not).</p><p>Open type families and associated types are extremely useful for abstracting over similar types with slightly different structure, and libraries like <a href="https://hackage.haskell.org/package/mono-traversable"><code>mono-traversable</code></a> show how they can be used to full effect for that purpose. However, those use cases can’t really be classified as TMP; they just use typeclasses for their traditional purpose of operator overloading.</p><p>Still, that doesn’t mean open type families aren’t useful for TMP. In fact, one use case of TMP makes <em>heavy</em> use of open type families: datatype-generic programming.</p><h3><a name="example-2-datatype-generic-programming"></a>Example 2: Datatype-generic programming</h3><p><em>Datatype-generic programming</em> refers to a class of techniques for writing generic functions that operate on arbitrary data structures. Some useful applications of datatype-generic programming include</p><ul><li><p>equality, comparison, and hashing,</p></li><li><p>recursive traversal of self-similar data structures, and</p></li><li><p>serialization and deserialization,</p></li></ul><p>among other things. 
The idea is that by exploiting the structure of datatype definitions themselves, it’s possible for a datatype-generic function to provide implementations of functionality like the above on <em>any</em> datatype.</p><p>In Haskell, the most popular approach to datatype-generic programming leverages GHC generics, which is quite sophisticated. The <a href="https://hackage.haskell.org/package/base-4.14.1.0/docs/GHC-Generics.html">module documentation for <code>GHC.Generics</code></a> already includes a fairly lengthy explanation of how it works, so I will not regurgitate it here (that could fill a blog post of its own!), but I will show how to construct a simplified version of the system that highlights the key role of TMP.</p><h4><a name="generic-datatype-representations"></a>Generic datatype representations</h4><p>At the heart of the <code>Generic</code> class is a simple concept: all non-GADT Haskell datatypes can be represented as sums of products. For example, if we have</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Authentication</span>
<span class="ow">=</span> <span class="kt">AuthBasic</span> <span class="kt">Username</span> <span class="kt">Password</span>
<span class="o">|</span> <span class="kt">AuthSSH</span> <span class="kt">PublicKey</span></code></pre><p>then we have a type that is essentially equivalent to this one:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">Authentication</span> <span class="ow">=</span> <span class="kt">Either</span> <span class="p">(</span><span class="kt">Username</span><span class="p">,</span> <span class="kt">Password</span><span class="p">)</span> <span class="kt">PublicKey</span></code></pre><p>If we know how to define a function on a nested tree built out of <code>Either</code>s and pairs, then we know how to define it on <em>any</em> such datatype! This is where TMP comes in: recall the way we viewed <code>Flatten</code> as a mechanism for compile-time code generation based on type information. Could we use the same technique to generate implementations of equality, comparison, hashing, etc. from statically-known information about the structure of a datatype?</p><p>The answer to that question is <em>yes</em>. To start, let’s consider a particularly simple example: suppose we want to write a generic function that counts the number of fields stored in an arbitrary constructor. For example, <code>numFields (AuthBasic "alyssa" "pass1234")</code> would return <code>2</code>, while <code>numFields (AuthSSH "<key>")</code> would return <code>1</code>. Not a very useful function, admittedly, but it’s a simple example of what generic programming can do.</p><p>We’ll start by using TMP to implement a “generic” version of <code>numFields</code> that operates on trees of <code>Either</code>s and pairs as described above:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">GNumFields</span> <span class="n">a</span> <span class="kr">where</span>
<span class="n">gnumFields</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Natural</span>
<span class="c1">-- base case: leaf value</span>
<span class="kr">instance</span> <span class="kt">GNumFields</span> <span class="n">a</span> <span class="kr">where</span>
<span class="n">gnumFields</span> <span class="kr">_</span> <span class="ow">=</span> <span class="mi">1</span>
<span class="kr">instance</span> <span class="cm">{-# OVERLAPPING #-}</span> <span class="p">(</span><span class="kt">GNumFields</span> <span class="n">a</span><span class="p">,</span> <span class="kt">GNumFields</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">GNumFields</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">gnumFields</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="n">a</span> <span class="o">+</span> <span class="n">gnumFields</span> <span class="n">b</span>
<span class="kr">instance</span> <span class="cm">{-# OVERLAPPING #-}</span> <span class="p">(</span><span class="kt">GNumFields</span> <span class="n">a</span><span class="p">,</span> <span class="kt">GNumFields</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">GNumFields</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">gnumFields</span> <span class="p">(</span><span class="kt">Left</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="n">a</span>
<span class="n">gnumFields</span> <span class="p">(</span><span class="kt">Right</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="n">b</span></code></pre><p>Just like our <code>Flatten</code> class from earlier, <code>GNumFields</code> uses the type-level structure of its argument to choose what to do:</p><ul><li><p>If we find a pair, that corresponds to a product, so we recur into both sides and sum the results.</p></li><li><p>If we find <code>Left</code> or <code>Right</code>, that corresponds to the “spine” differentiating different constructors, so we simply recur into the contained value.</p></li><li><p>In the case of any other value, we’re at a “leaf” in the tree of <code>Either</code>s and pairs, which corresponds to a single field, so we just return <code>1</code>.</p></li></ul><p>Now if we call <code>gnumFields (Left ("alyssa", "pass1234"))</code>, we’ll get <code>2</code>, and if we call <code>gnumFields (Right "<key>")</code>, we’ll get <code>1</code>. All that’s left to do is write a bit of code that converts our <code>Authentication</code> type to a tree of <code>Either</code>s and pairs:</p><pre><code class="pygments"><span class="nf">genericizeAuthentication</span> <span class="ow">::</span> <span class="kt">Authentication</span> <span class="ow">-></span> <span class="kt">Either</span> <span class="p">(</span><span class="kt">Username</span><span class="p">,</span> <span class="kt">Password</span><span class="p">)</span> <span class="kt">PublicKey</span>
<span class="nf">genericizeAuthentication</span> <span class="p">(</span><span class="kt">AuthBasic</span> <span class="n">user</span> <span class="n">pass</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Left</span> <span class="p">(</span><span class="n">user</span><span class="p">,</span> <span class="n">pass</span><span class="p">)</span>
<span class="nf">genericizeAuthentication</span> <span class="p">(</span><span class="kt">AuthSSH</span> <span class="n">key</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Right</span> <span class="n">key</span>
<span class="nf">numFieldsAuthentication</span> <span class="ow">::</span> <span class="kt">Authentication</span> <span class="ow">-></span> <span class="kt">Natural</span>
<span class="nf">numFieldsAuthentication</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="o">.</span> <span class="n">genericizeAuthentication</span></code></pre><p>Now we get the results we want on our <code>Authentication</code> type using <code>numFieldsAuthentication</code>, but we’re not done yet, since it only works on <code>Authentication</code> values. Is there a way to define a generic <code>numFields</code> function that works on arbitrary datatypes that implement this conversion to sums-of-products? Yes, with another typeclass:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Generic</span> <span class="n">a</span> <span class="kr">where</span>
<span class="kr">type</span> <span class="kt">Rep</span> <span class="n">a</span>
<span class="n">genericize</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Rep</span> <span class="n">a</span>
<span class="kr">instance</span> <span class="kt">Generic</span> <span class="kt">Authentication</span> <span class="kr">where</span>
<span class="kr">type</span> <span class="kt">Rep</span> <span class="kt">Authentication</span> <span class="ow">=</span> <span class="kt">Either</span> <span class="p">(</span><span class="kt">Username</span><span class="p">,</span> <span class="kt">Password</span><span class="p">)</span> <span class="kt">PublicKey</span>
<span class="n">genericize</span> <span class="p">(</span><span class="kt">AuthBasic</span> <span class="n">user</span> <span class="n">pass</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Left</span> <span class="p">(</span><span class="n">user</span><span class="p">,</span> <span class="n">pass</span><span class="p">)</span>
<span class="n">genericize</span> <span class="p">(</span><span class="kt">AuthSSH</span> <span class="n">key</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Right</span> <span class="n">key</span>
<span class="nf">numFields</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">Generic</span> <span class="n">a</span><span class="p">,</span> <span class="kt">GNumFields</span> <span class="p">(</span><span class="kt">Rep</span> <span class="n">a</span><span class="p">))</span> <span class="ow">=></span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Natural</span>
<span class="nf">numFields</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="o">.</span> <span class="n">genericize</span></code></pre><p>Now <code>numFields (AuthBasic "alyssa" "pass1234")</code> returns <code>2</code>, as desired, and it will <em>also</em> work with any datatype that provides a <code>Generic</code> instance. If the above code makes your head spin, don’t worry: this is by far the most complicated piece of code in this blog post up to this point. Let’s break down how it works piece by piece:</p><ul><li><p>First, we define the <code>Generic</code> class, comprised of two parts:</p><ol><li><p>The <code>Rep a</code> associated type maps a type <code>a</code> onto its generic, sums-of-products representation, i.e. one built out of combinations of <code>Either</code> and pairs.</p></li><li><p>The <code>genericize</code> method converts an actual <em>value</em> of type <code>a</code> to the equivalent value using the sums-of-products representation.</p></li></ol></li><li><p>Next, we define a <code>Generic</code> instance for <code>Authentication</code>. <code>Rep Authentication</code> is the sums-of-products representation we described above, and <code>genericize</code> is likewise <code>genericizeAuthentication</code> from above.</p></li><li><p>Finally, we define <code>numFields</code> as a function with a <code>GNumFields (Rep a)</code> constraint. 
This is where all the magic happens:</p><ul><li><p>When we apply <code>numFields</code> to a datatype, <code>Rep</code> retrieves its generic, sums-of-products representation type.</p></li><li><p>The <code>GNumFields</code> class then uses the TMP techniques we’ve already described in this blog post to generate a <code>gnumFields</code> implementation on the fly from the structure of <code>Rep a</code>.</p></li><li><p>Finally, that generated <code>gnumFields</code> implementation is applied to the genericized term-level value, and the result is produced.</p></li></ul></li></ul><p>After all that, I suspect you might think this seems like a very convoluted way to define the (rather unhelpful) <code>numFields</code> operation. Surely just defining <code>numFields</code> on each type directly would be far easier? Indeed, if we were just considering <code>numFields</code>, you’d be right, but in fact we get much more than that. Using the same machinery, we can continue to define other generic operations—equality, comparison, etc.—the same way we defined <code>numFields</code>, and all of them would automatically work on <code>Authentication</code> because they all leverage the same <code>Generic</code> instance!</p><p>This is the basic value proposition of generic programming: we can do a little work up front to normalize our datatype to a generic representation <em>once</em>, then get a whole buffet of generic operations on it for free. In Haskell, the code generation capabilities of TMP are a key piece of that puzzle.</p><h4><a name="improving-our-definition-of-generic"></a>Improving our definition of <code>Generic</code></h4><p>You may note that the definition of <code>Generic</code> provided above does not match the one in <code>GHC.Generics</code>. Indeed, our naïve approach suffers from several flaws that the real version does not. 
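</p><p>As a baseline for the improvements that follow, here is the naïve version from the previous section collected into one file. It is a sketch under a few assumptions: the post never defines <code>Username</code>, <code>Password</code>, or <code>PublicKey</code>, so plain <code>String</code> synonyms stand in for them, and the extension pragmas are an assumed sufficient set rather than anything the post specifies:</p><pre><code class="pygments">{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}

import Numeric.Natural (Natural)

-- Assumption: stand-in definitions, since the post leaves these types abstract.
type Username  = String
type Password  = String
type PublicKey = String

data Authentication
  = AuthBasic Username Password
  | AuthSSH PublicKey

class GNumFields a where
  gnumFields :: a -> Natural

-- Base case: anything that is not a pair or an Either counts as one field.
instance GNumFields a where
  gnumFields _ = 1

-- Product: count the fields on both sides and add them.
instance {-# OVERLAPPING #-} (GNumFields a, GNumFields b) => GNumFields (a, b) where
  gnumFields (a, b) = gnumFields a + gnumFields b

-- Sum: follow whichever constructor is actually present.
instance {-# OVERLAPPING #-} (GNumFields a, GNumFields b) => GNumFields (Either a b) where
  gnumFields (Left a)  = gnumFields a
  gnumFields (Right b) = gnumFields b

-- The simplified Generic class from this post, not the one in GHC.Generics.
class Generic a where
  type Rep a
  genericize :: a -> Rep a

instance Generic Authentication where
  type Rep Authentication = Either (Username, Password) PublicKey
  genericize (AuthBasic user pass) = Left (user, pass)
  genericize (AuthSSH key)         = Right key

numFields :: (Generic a, GNumFields (Rep a)) => a -> Natural
numFields = gnumFields . genericize
</code></pre><p>With these definitions loaded in GHCi, <code>numFields (AuthBasic "alyssa" "pass1234")</code> evaluates to <code>2</code> and <code>numFields (AuthSSH "key")</code> evaluates to <code>1</code>.</p><p>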
This is not a <code>GHC.Generics</code> tutorial, so I will not discuss every detail of the full implementation, but I will highlight a few improvements relevant to the broader theme of TMP.</p><h5><a name="distinguishing-leaves-from-the-spine"></a>Distinguishing leaves from the spine</h5><p>One problem with our version of <code>Generic</code> is that it provides no way to tell when an <code>Either</code> or pair should be considered a “leaf” rather than part of the representation’s spine, as in a type like this:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Foo</span> <span class="ow">=</span> <span class="kt">A</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">Int</span> <span class="kt">String</span><span class="p">)</span> <span class="o">|</span> <span class="kt">B</span> <span class="p">(</span><span class="kt">Char</span><span class="p">,</span> <span class="kt">Bool</span><span class="p">)</span></code></pre><p>Given this type, <code>Rep Foo</code> should be <code>Either (Either Int String) (Char, Bool)</code>, and <code>numFields (B ('a', True))</code> will erroneously return <code>2</code> rather than <code>1</code>. To fix this, we can introduce a simple wrapper newtype that distinguishes leaves specifically:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">Leaf</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">Leaf</span> <span class="p">{</span> <span class="n">getLeaf</span> <span class="ow">::</span> <span class="n">a</span> <span class="p">}</span></code></pre><p>Now our <code>Generic</code> instances look like this:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">Generic</span> <span class="kt">Authentication</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">Rep</span> <span class="kt">Authentication</span> <span class="ow">=</span> <span class="kt">Either</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="kt">Username</span><span class="p">,</span> <span class="kt">Leaf</span> <span class="kt">Password</span><span class="p">)</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="kt">PublicKey</span><span class="p">)</span>
  <span class="n">genericize</span> <span class="p">(</span><span class="kt">AuthBasic</span> <span class="n">user</span> <span class="n">pass</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Left</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="n">user</span><span class="p">,</span> <span class="kt">Leaf</span> <span class="n">pass</span><span class="p">)</span>
  <span class="n">genericize</span> <span class="p">(</span><span class="kt">AuthSSH</span> <span class="n">key</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Right</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="n">key</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">Generic</span> <span class="kt">Foo</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">Rep</span> <span class="kt">Foo</span> <span class="ow">=</span> <span class="kt">Either</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">Int</span> <span class="kt">String</span><span class="p">))</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="p">(</span><span class="kt">Char</span><span class="p">,</span> <span class="kt">Bool</span><span class="p">))</span>
  <span class="n">genericize</span> <span class="p">(</span><span class="kt">A</span> <span class="n">x</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Left</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="n">x</span><span class="p">)</span>
  <span class="n">genericize</span> <span class="p">(</span><span class="kt">B</span> <span class="n">x</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Right</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="n">x</span><span class="p">)</span></code></pre><p>Since the <code>Leaf</code> constructor now distinguishes a leaf, rather than the absence of an <code>Either</code> or <code>(,)</code> constructor, we’ll have to update our <code>GNumFields</code> instances as well. However, this has the additional pleasant effect of eliminating the need for overlapping instances:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">GNumFields</span> <span class="p">(</span><span class="kt">Leaf</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">gnumFields</span> <span class="kr">_</span> <span class="ow">=</span> <span class="mi">1</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">GNumFields</span> <span class="n">a</span><span class="p">,</span> <span class="kt">GNumFields</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">GNumFields</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">gnumFields</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="n">a</span> <span class="o">+</span> <span class="n">gnumFields</span> <span class="n">b</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">GNumFields</span> <span class="n">a</span><span class="p">,</span> <span class="kt">GNumFields</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">GNumFields</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">gnumFields</span> <span class="p">(</span><span class="kt">Left</span> <span class="n">a</span><span class="p">)</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="n">a</span>
  <span class="n">gnumFields</span> <span class="p">(</span><span class="kt">Right</span> <span class="n">b</span><span class="p">)</span> <span class="ow">=</span> <span class="n">gnumFields</span> <span class="n">b</span></code></pre><p>This is a good example of why overlapping instances can be so seductive, even though they often have unintended consequences. Even when doing TMP, explicit tags are almost always preferable.</p><h5><a name="handling-empty-constructors"></a>Handling empty constructors</h5><p>Suppose we have a type with nullary data constructors, like the standard <code>Bool</code> type:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Bool</span> <span class="ow">=</span> <span class="kt">False</span> <span class="o">|</span> <span class="kt">True</span></code></pre><p>How do we write a <code>Generic</code> instance for <code>Bool</code>? Using just <code>Either</code>, <code>(,)</code>, and <code>Leaf</code>, we can’t, but if we are willing to add a case for <code>()</code>, we can use it to denote nullary constructors:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">GNumFields</span> <span class="nb">()</span> <span class="kr">where</span>
  <span class="n">gnumFields</span> <span class="kr">_</span> <span class="ow">=</span> <span class="mi">0</span>
<span class="kr">instance</span> <span class="kt">Generic</span> <span class="kt">Bool</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">Rep</span> <span class="kt">Bool</span> <span class="ow">=</span> <span class="kt">Either</span> <span class="nb">()</span> <span class="nb">()</span>
  <span class="n">genericize</span> <span class="kt">False</span> <span class="ow">=</span> <span class="kt">Left</span> <span class="nb">()</span>
  <span class="n">genericize</span> <span class="kt">True</span> <span class="ow">=</span> <span class="kt">Right</span> <span class="nb">()</span></code></pre><p>In a similar vein, we could use <code>Void</code> to represent datatypes that don’t have any constructors at all.</p><h4><a name="continuing-from-here"></a>Continuing from here</h4><p>The full version of <code>Generic</code> has a variety of further improvements useful for generic programming, including:</p><ul><li><p>Support for converting from <code>Rep a</code> to <code>a</code>.</p></li><li><p>Special indication of self-recursive datatypes, making generic tree traversals possible.</p></li><li><p>Type-level information about datatype constructor and record accessor names, allowing them to be used in serialization.</p></li><li><p>Fully automatic generation of <code>Generic</code> instances via <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/generics.html#extension-DeriveGeneric">the <code>DeriveGeneric</code> extension</a>, which reduces the per-type boilerplate to essentially nothing.</p></li></ul><p>The <a href="https://hackage.haskell.org/package/base-4.14.1.0/docs/GHC-Generics.html">module documentation for <code>GHC.Generics</code></a> discusses the full system in detail, and it provides an additional example that uses the same essential TMP techniques discussed here.</p><h2><a name="part-3-dependent-typing"></a>Part 3: Dependent typing</h2><p>It’s time for the third and final part of this blog post: an introduction to dependently typed programming in Haskell. A full treatment of dependently typed programming is far, far too vast to be contained in a single blog post, so I will not attempt one here. 
Rather, I will cover some basic idioms for doing dependent programming and highlight how TMP can be valuable when doing so.</p><h3><a name="datatype-promotion"></a>Datatype promotion</h3><p>In part 1, we used uninhabited datatypes like <code>Z</code> and <code>S a</code> to define new type-level constants. This works, but it is awkward. Imagine for a moment that we wanted to work with type-level booleans. Using our previous approach, we could define two empty datatypes, <code>True</code> and <code>False</code>:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">True</span>
<span class="kr">data</span> <span class="kt">False</span></code></pre><p>Now we could define type families to provide operations on these types, such as <code>Not</code>:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">family</span> <span class="kt">Not</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kt">Not</span> <span class="kt">True</span> <span class="ow">=</span> <span class="kt">False</span>
  <span class="kt">Not</span> <span class="kt">False</span> <span class="ow">=</span> <span class="kt">True</span></code></pre><p>However, this has some frustrating downsides:</p><ul><li><p>First, it’s simply inconvenient that we have to define these new <code>True</code> and <code>False</code> “dummy” types, which are completely distinct from the <code>Bool</code> type provided by the prelude.</p></li><li><p>More significantly, it means <code>Not</code> has a very unhelpful kind:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="kt">:</span><span class="n">kind</span> <span class="kt">Not</span>
<span class="kt">Not</span> <span class="ow">::</span> <span class="o">*</span> <span class="ow">-></span> <span class="o">*</span></code></pre><p>Even though <code>Not</code> is only <em>supposed</em> to be applied to <code>True</code> or <code>False</code>, its kind allows it to be applied to any type at all. You can see this in practice if you try to evaluate something like <code>Not Char</code>:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="kt">:</span><span class="n">kind</span><span class="o">!</span> <span class="kt">Not</span> <span class="kt">Char</span>
<span class="kt">Not</span> <span class="kt">Char</span> <span class="ow">::</span> <span class="o">*</span>
<span class="ow">=</span> <span class="kt">Not</span> <span class="kt">Char</span></code></pre><p>Rather than getting an error, GHC simply spits <code>Not Char</code> back at us. This is a somewhat unintuitive property of closed type families: if none of the clauses match, the type family just gets “stuck,” not reducing any further. This can lead to very confusing type errors later in the typechecking process.</p></li></ul><p>One way to think about <code>Not</code> is that it is largely <em>dynamically kinded</em> in the same way some languages are dynamically typed. That isn’t entirely true, as we technically <em>will</em> get a kind error if we try to apply <code>Not</code> to a type constructor rather than a type, such as <code>Maybe</code>:</p><pre><code>ghci> :kind! Not Maybe
<interactive>:1:5: error:
• Expecting one more argument to ‘Maybe’
Expected a type, but ‘Maybe’ has kind ‘* -> *’
</code></pre><p>…but <code>*</code> is still a very big kind, much bigger than we would like to permit for <code>Not</code>.</p><p>To help with both these problems, GHC provides <em>datatype promotion</em> via <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/data_kinds.html">the <code>DataKinds</code> language extension</a>. The idea is that for each normal, non-GADT type definition like</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Bool</span> <span class="ow">=</span> <span class="kt">False</span> <span class="o">|</span> <span class="kt">True</span></code></pre><p>then in addition to the normal type constructor and value constructors, GHC also defines several <em>promoted</em> constructors:</p><ul><li><p><code>Bool</code> is allowed as both a type and a kind.</p></li><li><p><code>'True</code> and <code>'False</code> are defined as new types of kind <code>Bool</code>.</p></li></ul><p>We can see this in action if we remove our <code>data True</code> and <code>data False</code> declarations and adjust our definition of <code>Not</code> to use promoted constructors:</p><pre><code class="pygments"><span class="cm">{-# LANGUAGE DataKinds #-}</span>
<span class="kr">type</span> <span class="kr">family</span> <span class="kt">Not</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kt">Not</span> <span class="kt">'True</span> <span class="ow">=</span> <span class="kt">'False</span>
  <span class="kt">Not</span> <span class="kt">'False</span> <span class="ow">=</span> <span class="kt">'True</span></code></pre><p>Now the inferred kind of <code>Not</code> is no longer <code>* -> *</code>:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="kt">:</span><span class="n">kind</span> <span class="kt">Not</span>
<span class="kt">Not</span> <span class="ow">::</span> <span class="kt">Bool</span> <span class="ow">-></span> <span class="kt">Bool</span></code></pre><p>Consequently, we will now get a kind error if we attempt to apply <code>Not</code> to anything other than <code>'True</code> or <code>'False</code>:</p><pre><code>ghci> :kind! Not Char
<interactive>:1:5: error:
• Expected kind ‘Bool’, but ‘Char’ has kind ‘*’
</code></pre><p>This is a nice improvement. We can make a similar change to our definitions involving type-level natural numbers:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Nat</span> <span class="ow">=</span> <span class="kt">Z</span> <span class="o">|</span> <span class="kt">S</span> <span class="kt">Nat</span>
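<span class="c1">-- Illustrative sketch (Plus is a hypothetical helper, not used elsewhere in</span>
<span class="c1">-- this post): type families benefit from promotion in the same way. This one</span>
<span class="c1">-- has kind Nat -> Nat -> Nat, so an application like Plus Char 'Z is a kind error.</span>
<span class="kr">type</span> <span class="kr">family</span> <span class="kt">Plus</span> <span class="p">(</span><span class="n">a</span> <span class="ow">::</span> <span class="kt">Nat</span><span class="p">)</span> <span class="p">(</span><span class="n">b</span> <span class="ow">::</span> <span class="kt">Nat</span><span class="p">)</span> <span class="ow">::</span> <span class="kt">Nat</span> <span class="kr">where</span>
  <span class="kt">Plus</span> <span class="kt">'Z</span> <span class="n">b</span> <span class="ow">=</span> <span class="n">b</span>
  <span class="kt">Plus</span> <span class="p">(</span><span class="kt">'S</span> <span class="n">a</span><span class="p">)</span> <span class="n">b</span> <span class="ow">=</span> <span class="kt">'S</span> <span class="p">(</span><span class="kt">Plus</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span>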
<span class="kr">class</span> <span class="kt">ReifyNat</span> <span class="p">(</span><span class="n">a</span> <span class="ow">::</span> <span class="kt">Nat</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">reifyNat</span> <span class="ow">::</span> <span class="kt">Natural</span>
<span class="kr">instance</span> <span class="kt">ReifyNat</span> <span class="kt">'Z</span> <span class="kr">where</span>
  <span class="n">reifyNat</span> <span class="ow">=</span> <span class="mi">0</span>
<span class="kr">instance</span> <span class="kt">ReifyNat</span> <span class="n">a</span> <span class="ow">=></span> <span class="kt">ReifyNat</span> <span class="p">(</span><span class="kt">'S</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">reifyNat</span> <span class="ow">=</span> <span class="mi">1</span> <span class="o">+</span> <span class="n">reifyNat</span> <span class="o">@</span><span class="n">a</span></code></pre><p>Note that we need to add an explicit kind signature on the definition of the <code>ReifyNat</code> typeclass: nothing in the types of the typeclass methods constrains the kind of <code>a</code>, so GHC would otherwise assume it has kind <code>*</code>. In addition to making it clearer that <code>Z</code> and <code>S</code> are related, this prevents someone from coming along and defining a nonsensical instance like <code>ReifyNat Char</code>, which previously would have been allowed but will now be rejected with a kind error.</p><p>Datatype promotion is not strictly required to do TMP, but it makes the process significantly less painful. It makes Haskell’s kind language extensible in the same way its type language is, which allows type-level programming to enjoy static typechecking (or more accurately, static kindchecking) in the same way term-level programming does.</p><h3><a name="gadts-and-proof-terms"></a>GADTs and proof terms</h3><p>So far in this blog post, we have discussed several different function-like things:</p><ul><li><p>Ordinary Haskell functions are functions from terms to terms.</p></li><li><p>Type families are functions from types to types.</p></li><li><p>Typeclasses are functions from types to terms.</p></li></ul><p>A curious reader may wonder about the existence of a fourth class of function:</p><ul><li><p><em>???</em> are functions from terms to types.</p></li></ul><p>To reason about what could go in the <em>???</em> above, we must consider what “a function from terms to types” would even mean. Functions from terms to terms and types to types are straightforward enough. Functions from types to terms are a little trickier, but they make intuitive sense: we use information known at compile-time to generate runtime behavior. 
But how could information possibly flow in the other direction? How could we possibly turn runtime information into compile-time information without being able to predict the future?</p><p>In general, we cannot. However, one feature of Haskell allows a restricted form of seemingly doing the impossible—turning runtime information into compile-time information—and that’s GADTs.</p><p>GADTs<sup><a href="#footnote-4" id="footnote-ref-4-1">4</a></sup> are <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/gadt.html">described in detail in the GHC User’s Guide</a>, but the key idea for our purposes is that <em>pattern-matching on a GADT constructor can refine type information</em>. Here’s a simple, silly example:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">WhatIsIt</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kt">ABool</span> <span class="ow">::</span> <span class="kt">WhatIsIt</span> <span class="kt">Bool</span>
  <span class="kt">AnInt</span> <span class="ow">::</span> <span class="kt">WhatIsIt</span> <span class="kt">Int</span>
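<span class="c1">-- Illustrative sketch (defaultValue is a hypothetical helper): refinement applies</span>
<span class="c1">-- to result types as well, so each equation below can return a different type.</span>
<span class="nf">defaultValue</span> <span class="ow">::</span> <span class="kt">WhatIsIt</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">defaultValue</span> <span class="kt">ABool</span> <span class="ow">=</span> <span class="kt">False</span>
<span class="nf">defaultValue</span> <span class="kt">AnInt</span> <span class="ow">=</span> <span class="mi">0</span>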
<span class="nf">doSomething</span> <span class="ow">::</span> <span class="kt">WhatIsIt</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">doSomething</span> <span class="kt">ABool</span> <span class="n">x</span> <span class="ow">=</span> <span class="n">not</span> <span class="n">x</span>
<span class="nf">doSomething</span> <span class="kt">AnInt</span> <span class="n">x</span> <span class="ow">=</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">1</span></code></pre><p>Here, <code>WhatIsIt</code> is a datatype with two nullary constructors, <code>ABool</code> and <code>AnInt</code>, similar to a normal, non-GADT datatype like this one:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">WhatIsIt</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">ABool</span> <span class="o">|</span> <span class="kt">AnInt</span></code></pre><p>What’s special about GADTs is that each constructor is given an explicit type signature. With the plain ADT definition above, <code>ABool</code> and <code>AnInt</code> would both have the type <code>forall a. WhatIsIt a</code>, but in the GADT definition, we explicitly fix <code>a</code> to <code>Bool</code> in the type of <code>ABool</code> and to <code>Int</code> in the type of <code>AnInt</code>.</p><p>This simple feature allows us to do very interesting things. The <code>doSomething</code> function is polymorphic in <code>a</code>, but on the right-hand side of the first equation, <code>x</code> has type <code>Bool</code>, while on the right-hand side of the second equation, <code>x</code> has type <code>Int</code>. This is because the <code>WhatIsIt a</code> argument effectively constrains the type of <code>a</code>, as we can see by experimenting with <code>doSomething</code> in GHCi:</p><pre><code>ghci> doSomething ABool True
False
ghci> doSomething AnInt 10
11
ghci> doSomething AnInt True
error:
• Couldn't match expected type ‘Int’ with actual type ‘Bool’
• In the second argument of ‘doSomething’, namely ‘True’
In the expression: doSomething AnInt True
In an equation for ‘it’: it = doSomething AnInt True
</code></pre><p>One way to think about GADTs is as “proofs” or “witnesses” of type equalities. The <code>ABool</code> constructor is a proof of <code>a ~ Bool</code>, while the <code>AnInt</code> constructor is a proof of <code>a ~ Int</code>. When you construct <code>ABool</code> or <code>AnInt</code>, you must be able to satisfy the equality, and it is in a sense “packed into” the constructor value. When code pattern-matches on the constructor, the equality is “unpacked from” the value, and the equality becomes available on the right-hand side of the pattern match.</p><p>GADTs can be much more sophisticated than our simple <code>WhatIsIt</code> type above. Just like normal ADTs, GADT constructors can have parameters, which makes it possible to write inductive datatypes that carry type equality proofs with them:</p><pre><code class="pygments"><span class="kr">infixr</span> <span class="mi">5</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span>
<span class="kr">data</span> <span class="kt">HList</span> <span class="n">as</span> <span class="kr">where</span>
  <span class="kt">HNil</span> <span class="ow">::</span> <span class="kt">HList</span> <span class="kt">'[]</span>
  <span class="kt">HCons</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="p">(</span><span class="n">a</span> <span class="kt">':</span> <span class="n">as</span><span class="p">)</span></code></pre><p>This type is a <em>heterogeneous list</em>, a list that can contain elements of different types:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="kt">:</span><span class="n">t</span> <span class="kt">True</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="s">"hello"</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="mi">42</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span>
<span class="kt">True</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="s">"hello"</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="mi">42</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span>
  <span class="ow">::</span> <span class="kt">Num</span> <span class="n">a</span> <span class="ow">=></span> <span class="kt">HList</span> <span class="kt">'[Bool, [Char]</span><span class="p">,</span> <span class="n">a</span><span class="p">]</span></code></pre><p>An <code>HList</code> is parameterized by a type-level list that keeps track of the types of its elements, which allows us to highlight another interesting property of GADTs: if we restrict that type information, the GHC pattern exhaustiveness checker will take the restriction into account. For example, we can write a completely total <code>head</code> function on <code>HList</code>s like this:</p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="kt">HList</span> <span class="p">(</span><span class="n">a</span> <span class="kt">':</span> <span class="n">as</span><span class="p">)</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">head</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kr">_</span><span class="p">)</span> <span class="ow">=</span> <span class="n">x</span></code></pre><p>Remarkably, GHC does not complain that this definition of <code>head</code> is non-exhaustive. Since we specified that the argument must be of type <code>HList (a ': as)</code> in the type signature for <code>head</code>, GHC knows that the argument <em>cannot</em> be <code>HNil</code> (which would have the type <code>HList '[]</code>), so it doesn’t ask us to handle that case.</p><p>These examples illustrate the way GADTs serve as a general-purpose construct for relating type- and term-level information. Information flows bidirectionally: type information refines the set of data constructors that can be matched on, and matching on data constructors exposes new type equalities.</p><h4><a name="proofs-that-work-together"></a>Proofs that work together</h4><p>This interplay is wonderfully compositional. Suppose we wanted to write a function that accepts an <code>HList</code> of exactly 1, 2, or 3 elements. There’s no easy way to express that in the type signature the way we did with <code>head</code>, so it might seem like all we can do is write an entirely new container datatype that has three constructors, one for each case.</p><p>However, a more interesting solution exists that takes advantage of the bidirectional nature of GADTs. We can start by writing a <em>proof term</em> that contains no values and simply encapsulates type equalities on a type-level list:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">OneToThree</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">as</span> <span class="kr">where</span>
  <span class="kt">One</span> <span class="ow">::</span> <span class="kt">OneToThree</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="kt">'[a]</span>
  <span class="kt">Two</span> <span class="ow">::</span> <span class="kt">OneToThree</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="kt">'[a, b]</span>
  <span class="kt">Three</span> <span class="ow">::</span> <span class="kt">OneToThree</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="kt">'[a, b, c]</span></code></pre><p>We call it a proof term because a value of type <code>OneToThree a b c as</code> constitutes a <em>proof</em> that <code>as</code> has exactly 1, 2, or 3 elements. Using <code>OneToThree</code>, we can write a function that accepts an <code>HList</code> accompanied by a proof term:</p><pre><code class="pygments"><span class="nf">sumUpToThree</span> <span class="ow">::</span> <span class="kt">OneToThree</span> <span class="kt">Int</span> <span class="kt">Int</span> <span class="kt">Int</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">Int</span>
<span class="nf">sumUpToThree</span> <span class="kt">One</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span><span class="p">)</span> <span class="ow">=</span> <span class="n">x</span>
<span class="nf">sumUpToThree</span> <span class="kt">Two</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">y</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span><span class="p">)</span> <span class="ow">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="nf">sumUpToThree</span> <span class="kt">Three</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">y</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">z</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span><span class="p">)</span> <span class="ow">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span> <span class="o">+</span> <span class="n">z</span></code></pre><p>As with <code>head</code>, this function is completely exhaustive, in this case because we take full advantage of the bidirectional nature of GADTs:</p><ul><li><p>When we match on the <code>OneToThree</code> proof term, information flows from the term level to the type level, refining the type of <code>as</code> in that branch.</p></li><li><p>The refined type of <code>as</code> then flows back down to the term level, restricting the shape the <code>HList</code> can take and refining the set of patterns we have to match.</p></li></ul><p>Of course, this example is not especially useful, but in general proof terms can encode any number of useful properties. For example, we can write a proof term that ensures an <code>HList</code> has an even number of elements:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Even</span> <span class="n">as</span> <span class="kr">where</span>
  <span class="kt">EvenNil</span> <span class="ow">::</span> <span class="kt">Even</span> <span class="kt">'[]</span>
  <span class="kt">EvenCons</span> <span class="ow">::</span> <span class="kt">Even</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">Even</span> <span class="p">(</span><span class="n">a</span> <span class="kt">':</span> <span class="n">b</span> <span class="kt">':</span> <span class="n">as</span><span class="p">)</span></code></pre><p>This is a proof which itself has inductive structure: <code>EvenCons</code> takes a proof that <code>as</code> has an even number of elements and produces a proof that adding two more elements preserves the evenness. We can combine this with a type family to write a function that “pairs up” elements in an <code>HList</code>:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">family</span> <span class="kt">PairUp</span> <span class="n">as</span> <span class="kr">where</span>
  <span class="kt">PairUp</span> <span class="kt">'[]</span> <span class="ow">=</span> <span class="kt">'[]</span>
  <span class="kt">PairUp</span> <span class="p">(</span><span class="n">a</span> <span class="kt">':</span> <span class="n">b</span> <span class="kt">':</span> <span class="n">as</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="kt">':</span> <span class="kt">PairUp</span> <span class="n">as</span>
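<span class="c1">-- Illustrative sketch (pairUpReduces is a hypothetical name): GHC reduces</span>
<span class="c1">-- PairUp '[Bool, Char] to '[(Bool, Char)], so id typechecks at this type.</span>
<span class="nf">pairUpReduces</span> <span class="ow">::</span> <span class="kt">HList</span> <span class="p">(</span><span class="kt">PairUp</span> <span class="kt">'[Bool, Char]</span><span class="p">)</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="kt">'[(Bool, Char)]</span>
<span class="nf">pairUpReduces</span> <span class="ow">=</span> <span class="n">id</span>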
<span class="nf">pairUp</span> <span class="ow">::</span> <span class="kt">Even</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="p">(</span><span class="kt">PairUp</span> <span class="n">as</span><span class="p">)</span>
<span class="nf">pairUp</span> <span class="kt">EvenNil</span> <span class="kt">HNil</span> <span class="ow">=</span> <span class="kt">HNil</span>
<span class="nf">pairUp</span> <span class="p">(</span><span class="kt">EvenCons</span> <span class="n">even</span><span class="p">)</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">y</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">xs</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">pairUp</span> <span class="n">even</span> <span class="n">xs</span></code></pre><p>Once again, this definition is completely exhaustive, and we can show that it works in GHCi:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">pairUp</span> <span class="p">(</span><span class="kt">EvenCons</span> <span class="o">$</span> <span class="kt">EvenCons</span> <span class="kt">EvenNil</span><span class="p">)</span>
<span class="p">(</span><span class="kt">True</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="sc">'a'</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="nb">()</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="s">"foo"</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span><span class="p">)</span>
<span class="p">(</span><span class="kt">True</span><span class="p">,</span><span class="sc">'a'</span><span class="p">)</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="p">(</span><span class="nb">()</span><span class="p">,</span><span class="s">"foo"</span><span class="p">)</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span></code></pre><p>This ability to capture properties of a type using auxiliary proof terms, rather than having to define an entirely new type, is one of the things that makes dependently typed programming so powerful.</p><h4><a name="proof-inference"></a>Proof inference</h4><p>While our definition of <code>pairUp</code> is interesting, you may be skeptical of its practical utility. It’s fiddly and inconvenient to have to pass the <code>Even</code> proof term explicitly, since it must be updated every time the length of the list changes. Fortunately, this is where TMP comes in.</p><p>Remember that typeclasses are functions from types to terms. As it happens, a value of type <code>Even as</code> can be mechanically produced from the structure of the type <code>as</code>. This suggests that we could use TMP to automatically generate <code>Even</code> proofs, and indeed, we can. In fact, it’s not at all complicated:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">IsEven</span> <span class="n">as</span> <span class="kr">where</span>
  <span class="n">evenProof</span> <span class="ow">::</span> <span class="kt">Even</span> <span class="n">as</span>
<span class="kr">instance</span> <span class="kt">IsEven</span> <span class="kt">'[]</span> <span class="kr">where</span>
  <span class="n">evenProof</span> <span class="ow">=</span> <span class="kt">EvenNil</span>
<span class="kr">instance</span> <span class="kt">IsEven</span> <span class="n">as</span> <span class="ow">=></span> <span class="kt">IsEven</span> <span class="p">(</span><span class="n">a</span> <span class="sc">'</span><span class="err">: b</span><span class="sc"> '</span><span class="kt">:</span> <span class="n">as</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">evenProof</span> <span class="ow">=</span> <span class="kt">EvenCons</span> <span class="n">evenProof</span></code></pre><p>We can now adjust our <code>pairUp</code> function to use <code>IsEven</code> instead of an explicit <code>Even</code> argument:</p><pre><code class="pygments"><span class="nf">pairUp</span> <span class="ow">::</span> <span class="kt">IsEven</span> <span class="n">as</span> <span class="ow">=></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="p">(</span><span class="kt">PairUp</span> <span class="n">as</span><span class="p">)</span>
<span class="nf">pairUp</span> <span class="ow">=</span> <span class="n">go</span> <span class="n">evenProof</span> <span class="kr">where</span>
  <span class="n">go</span> <span class="ow">::</span> <span class="kt">Even</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="p">(</span><span class="kt">PairUp</span> <span class="n">as</span><span class="p">)</span>
  <span class="n">go</span> <span class="kt">EvenNil</span> <span class="kt">HNil</span> <span class="ow">=</span> <span class="kt">HNil</span>
  <span class="n">go</span> <span class="p">(</span><span class="kt">EvenCons</span> <span class="n">even</span><span class="p">)</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">y</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">xs</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">go</span> <span class="n">even</span> <span class="n">xs</span></code></pre><p>This is essentially identical to its old definition, but by acquiring the proof via <code>IsEven</code> rather than passing it explicitly, we can call <code>pairUp</code> without having to construct a proof manually:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">pairUp</span> <span class="p">(</span><span class="kt">True</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="sc">'a'</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="nb">()</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="s">"foo"</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span><span class="p">)</span>
<span class="p">(</span><span class="kt">True</span><span class="p">,</span><span class="sc">'a'</span><span class="p">)</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="p">(</span><span class="nb">()</span><span class="p">,</span><span class="s">"foo"</span><span class="p">)</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span></code></pre><p>This is rather remarkable. Using TMP, we are able to get GHC to <em>automatically construct a proof that a list is even</em>, with no programmer guidance beyond writing the <code>IsEven</code> typeclass. This relies once more on the perspective that typeclasses are functions that accept types and generate term-level code: <code>IsEven</code> is a function that accepts a type-level list and generates an <code>Even</code> proof term.</p><p>From this perspective, <strong>typeclasses are a way of specifying a proof search algorithm</strong> to the compiler. In the case of <code>IsEven</code>, the proofs being generated are rather simple, so the proof search algorithm is quite mechanical. But in general, typeclasses can be used to perform proof search of significant complexity, given a sufficiently clever encoding into the type system.</p><h3><a name="aside-gadts-versus-type-families"></a>Aside: GADTs versus type families</h3><p>Before moving on, I want to explicitly call attention to the relationship between GADTs and type families. 
Though at first glance they may seem markedly different, there are some similarities between the two, and sometimes they may be used to accomplish similar things.</p><p>Consider again the type of the <code>pairUp</code> function above (without the typeclass for simplicity):</p><pre><code class="pygments"><span class="nf">pairUp</span> <span class="ow">::</span> <span class="kt">Even</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="p">(</span><span class="kt">PairUp</span> <span class="n">as</span><span class="p">)</span></code></pre><p>We used both a GADT, <code>Even</code>, and a type family, <code>PairUp</code>. But we could have, in theory, used <em>only</em> a GADT and eliminated the type family altogether. Consider this variation on the <code>Even</code> proof term:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">EvenPairs</span> <span class="n">as</span> <span class="n">bs</span> <span class="kr">where</span>
  <span class="kt">EvenNil</span> <span class="ow">::</span> <span class="kt">EvenPairs</span> <span class="kt">'[]</span> <span class="kt">'[]</span>
  <span class="kt">EvenCons</span> <span class="ow">::</span> <span class="kt">EvenPairs</span> <span class="n">as</span> <span class="n">bs</span> <span class="ow">-></span> <span class="kt">EvenPairs</span> <span class="p">(</span><span class="n">a</span> <span class="sc">'</span><span class="err">: b</span><span class="sc"> '</span><span class="kt">:</span> <span class="n">as</span><span class="p">)</span> <span class="p">((</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="sc">'</span><span class="err">: bs)</span></code></pre><p>This type has two type parameters rather than one, and though there’s no distinction between the two from GHC’s point of view, it can be useful to think of <code>as</code> as an “input” parameter and <code>bs</code> as an “output” parameter. The idea is that any <code>EvenPairs</code> proof relates both an even-length list type and its paired up equivalent:</p><ul><li><p><code>EvenNil</code> has type <code>EvenPairs '[] '[]</code>,</p></li><li><p><code>EvenCons EvenNil</code> has type <code>EvenPairs '[a, b] '[(a, b)]</code>,</p></li><li><p><code>EvenCons (EvenCons EvenNil)</code> has type <code>EvenPairs '[a, b, c, d] '[(a, b), (c, d)]</code>,</p></li><li><p>…and so on.</p></li></ul><p>This allows us to reformulate our <code>pairUp</code> type signature this way:</p><pre><code class="pygments"><span class="nf">pairUp</span> <span class="ow">::</span> <span class="kt">EvenPairs</span> <span class="n">as</span> <span class="n">bs</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="n">bs</span></code></pre><p>The definition is otherwise unchanged. The <code>PairUp</code> type family is completely gone, because now <code>EvenPairs</code> itself defines the relation. 
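</p><p>To make “otherwise unchanged” concrete, the following is a minimal, self-contained sketch of the GADT-only version. It restates <code>HList</code> the way it was defined earlier in this post; the explicit kind annotations, the exact set of language pragmas, and the binder name <code>ev</code> (chosen instead of <code>even</code> so it doesn’t shadow the Prelude function) are assumptions made for the sake of a standalone example:</p><pre><code>{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

import Data.Kind (Type)

-- Heterogeneous list, restated from earlier in the post.
data HList (as :: [Type]) where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)
infixr 5 `HCons`

-- A proof that relates an even-length list type to its paired-up form.
data EvenPairs (as :: [Type]) (bs :: [Type]) where
  EvenNil  :: EvenPairs '[] '[]
  EvenCons :: EvenPairs as bs -> EvenPairs (a ': b ': as) ((a, b) ': bs)

-- Matching on the proof refines both as and bs, so no type family is needed.
pairUp :: EvenPairs as bs -> HList as -> HList bs
pairUp EvenNil HNil = HNil
pairUp (EvenCons ev) (x `HCons` y `HCons` xs) = (x, y) `HCons` pairUp ev xs
</code></pre><p>Note that the result type <code>bs</code> is fixed entirely by the type of the proof term, with no type family involved. 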
In this way, GADTs can be used like type-level functions!</p><p>The inverse, however, is not true, at least not directly: we cannot eliminate the GADT altogether and exclusively use type families. One way to attempt doing so would be to define a type family that returns a constraint rather than a type:</p><pre><code class="pygments"><span class="kr">import</span> <span class="nn">Data.Kind</span> <span class="p">(</span><span class="kt">Constraint</span><span class="p">)</span>
<span class="kr">type</span> <span class="kr">family</span> <span class="kt">IsEvenTF</span> <span class="n">as</span> <span class="ow">::</span> <span class="kt">Constraint</span> <span class="kr">where</span>
  <span class="kt">IsEvenTF</span> <span class="kt">'[]</span> <span class="ow">=</span> <span class="nb">()</span>
  <span class="kt">IsEvenTF</span> <span class="p">(</span><span class="kr">_</span> <span class="sc">'</span><span class="err">: _</span><span class="sc"> '</span><span class="kt">:</span> <span class="n">as</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">IsEvenTF</span> <span class="n">as</span></code></pre><p>The idea here is that <code>IsEvenTF as</code> produces a constraint that can only be satisfied if <code>as</code> has an even number of elements, since that’s the only way it will eventually reduce to <code>()</code>, which in this case means the empty set of constraints, not the unit type (yes, the syntax for that is confusing). And in fact, it’s true that putting <code>IsEvenTF as =></code> in a type signature successfully restricts <code>as</code> to be an even-length list, but it doesn’t allow us to write <code>pairUp</code>. To see why, we can try the following definition:</p><pre><code class="pygments"><span class="nf">pairUp</span> <span class="ow">::</span> <span class="kt">IsEvenTF</span> <span class="n">as</span> <span class="ow">=></span> <span class="kt">HList</span> <span class="n">as</span> <span class="ow">-></span> <span class="kt">HList</span> <span class="p">(</span><span class="kt">PairUp</span> <span class="n">as</span><span class="p">)</span>
<span class="nf">pairUp</span> <span class="kt">HNil</span> <span class="ow">=</span> <span class="kt">HNil</span>
<span class="nf">pairUp</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">y</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">xs</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">pairUp</span> <span class="n">xs</span></code></pre><p>Unlike the version using the GADT, this version of <code>pairUp</code> is not considered exhaustive:</p><pre><code>warning: [-Wincomplete-patterns]
Pattern match(es) are non-exhaustive
In an equation for ‘pairUp’: Patterns not matched: HCons _ HNil
</code></pre><p>This is because type families don’t provide the same bidirectional flow of information that GADTs do; they’re only type-level functions. The constraint generated by <code>IsEvenTF</code> provides no term-level evidence about the shape of <code>as</code>, so we can’t branch on it the way we can branch on the <code>Even</code> GADT.<sup><a href="#footnote-5" id="footnote-ref-5-1">5</a></sup> (In a sense, <code>IsEvenTF</code> is doing <a href="/blog/2019/11/05/parse-don-t-validate/">validation, not parsing</a>.)</p><p>For this reason, I caution against overuse of type families. Their simplicity is seductive, but all too often you pay for that simplicity with inflexibility. GADTs combined with TMP for proof inference can provide the best of both worlds: complete control over the term-level proof that gets generated while still letting the compiler do most of the work for you.</p><h3><a name="guiding-type-inference"></a>Guiding type inference</h3><p>So far, this blog post has given relatively little attention to type inference. That is in some part a testament to the robustness of GHC’s type inference algorithm: even when fairly sophisticated TMP is involved, GHC often manages to propagate enough type information that type annotations are rarely needed.</p><p>However, when doing TMP, it would be irresponsible to not at least consider the type inference properties of programs. Type inference is what drives the whole typeclass resolution process to begin with, so poor type inference can easily make your fancy TMP construction next to useless. 
To take advantage of GHC to the fullest extent, programs should proactively guide the typechecker to help it infer as much as possible as often as possible.</p><p>To illustrate what that can look like, suppose we want to use TMP to generate an <code>HList</code> full of <code>()</code> values of an arbitrary length:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">UnitList</span> <span class="n">as</span> <span class="kr">where</span>
  <span class="n">unitList</span> <span class="ow">::</span> <span class="kt">HList</span> <span class="n">as</span>
<span class="kr">instance</span> <span class="kt">UnitList</span> <span class="kt">'[]</span> <span class="kr">where</span>
  <span class="n">unitList</span> <span class="ow">=</span> <span class="kt">HNil</span>
<span class="kr">instance</span> <span class="kt">UnitList</span> <span class="n">as</span> <span class="ow">=></span> <span class="kt">UnitList</span> <span class="p">(</span><span class="nb">()</span> <span class="sc">'</span><span class="err">: as) where</span>
  <span class="n">unitList</span> <span class="ow">=</span> <span class="nb">()</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">unitList</span></code></pre><p>Testing in GHCi, we can see it behaves as desired:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">unitList</span> <span class="ow">::</span> <span class="kt">HList</span> <span class="kt">'[(), (), ()]</span>
<span class="nb">()</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="nb">()</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="nb">()</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span></code></pre><p>Now suppose we write a function that accepts a list containing exactly one element and returns it:</p><pre><code class="pygments"><span class="nf">unsingleton</span> <span class="ow">::</span> <span class="kt">HList</span> <span class="kt">'[a]</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">unsingleton</span> <span class="p">(</span><span class="n">x</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="kt">HNil</span><span class="p">)</span> <span class="ow">=</span> <span class="n">x</span></code></pre><p>Naturally, we would expect these to compose without a hitch. If we write <code>unsingleton unitList</code>, our TMP should generate a list of length 1, and we should get back <code>()</code>. However, it may surprise you to learn that <em>isn’t</em>, in fact, what happens:<sup><a href="#footnote-6" id="footnote-ref-6-1">6</a></sup></p><pre><code>ghci> unsingleton unitList
error:
• Ambiguous type variable ‘a0’ arising from a use of ‘unitList’
prevents the constraint ‘(UnitList '[a0])’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
These potential instances exist:
instance UnitList as => UnitList (() : as)
</code></pre><p>What went wrong? The type error says that <code>a0</code> is ambiguous, but it only lists a single matching <code>UnitList</code> instance—the one we want—so how can it be ambiguous which one to select?</p><p>The problem stems from the way we defined <code>UnitList</code>. When we wrote the instance</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">UnitList</span> <span class="n">as</span> <span class="ow">=></span> <span class="kt">UnitList</span> <span class="p">(</span><span class="nb">()</span> <span class="sc">'</span><span class="err">: as) where</span></code></pre><p>we said the first element of the type-level list must be <code>()</code>, so there’s nothing stopping someone from coming along and defining another instance:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">UnitList</span> <span class="n">as</span> <span class="ow">=></span> <span class="kt">UnitList</span> <span class="p">(</span><span class="kt">Int</span> <span class="sc">'</span><span class="err">: as) where</span>
  <span class="n">unitList</span> <span class="ow">=</span> <span class="mi">0</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">unitList</span></code></pre><p>In that case, GHC would have no way to know which instance to pick. Nothing in the type of <code>unsingleton</code> forces the element in the list to have type <code>()</code>, so both instances are equally valid. To hedge against this future possibility, GHC rejects the program as ambiguous from the start.</p><p>Of course, this isn’t what we want. The <code>UnitList</code> class is supposed to <em>always</em> return a list of <code>()</code> values, so how can we force GHC to pick our instance anyway? The answer is to play a trick:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="p">(</span><span class="n">a</span> <span class="o">~</span> <span class="nb">()</span><span class="p">,</span> <span class="kt">UnitList</span> <span class="n">as</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">UnitList</span> <span class="p">(</span><span class="n">a</span> <span class="sc">'</span><span class="err">: as) where</span>
  <span class="n">unitList</span> <span class="ow">=</span> <span class="nb">()</span> <span class="p">`</span><span class="kt">HCons</span><span class="p">`</span> <span class="n">unitList</span></code></pre><p>Here we’ve changed the instance so that it has the shape <code>UnitList (a ': as)</code>, with a type variable in place of the <code>()</code>, but we also added an equality constraint that forces <code>a</code> to be <code>()</code>. Intuitively, you might think these two instances are completely identical, but in fact they are not! As proof, our example now typechecks:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">unsingleton</span> <span class="n">unitList</span>
<span class="nb">()</span></code></pre><p>To understand why, it’s important to understand how GHC’s typeclass resolution algorithm works. Let’s start by establishing some terminology. Note that every instance declaration has the following shape:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="o"><</span><span class="n">constraints</span><span class="o">></span> <span class="ow">=></span> <span class="kt">C</span> <span class="o"><</span><span class="n">types</span><span class="o">></span></code></pre><p>The part to the left of the <code>=></code> is known as the <em>instance context</em>, while the part to the right is known as the <em>instance head</em>. Now for the important bit: when GHC attempts to pick which typeclass instance to use to solve a typeclass constraint, <strong>only the instance head matters, and the instance context is completely ignored</strong>. Once GHC picks an instance, it commits to its choice, and only then does it consider the instance context.</p><p>This explains why our two <code>UnitList</code> instances behave differently:</p><ul><li><p>Given the instance head <code>UnitList (() ': as)</code>, GHC won’t select the instance unless it knows the first element of the list is <code>()</code>.</p></li><li><p>But given the instance head <code>UnitList (a ': as)</code>, GHC will pick the instance regardless of the type of the first element. All that matters is that the list is at least one element long.</p></li></ul><p>After the <code>UnitList (a ': as)</code> instance is selected, GHC attempts to solve the constraints in the instance context, including the <code>a ~ ()</code> constraint. This <em>forces</em> <code>a</code> to be <code>()</code>, resolving the ambiguity and allowing type inference to proceed.</p><p>This distinction might seem excessively subtle, but in practice it is enormously useful. 
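</p><p>As a recap, here is a minimal, self-contained sketch that puts the pieces of this section into one module. It restates <code>HList</code> from earlier in the post; the kind annotations, the pragma list, and the <code>demo</code> binding are assumptions added so the file loads on its own:</p><pre><code>{-# LANGUAGE DataKinds, FlexibleInstances, GADTs, KindSignatures, TypeOperators #-}

import Data.Kind (Type)

-- Heterogeneous list, restated from earlier in the post.
data HList (as :: [Type]) where
  HNil  :: HList '[]
  HCons :: a -> HList as -> HList (a ': as)
infixr 5 `HCons`

class UnitList (as :: [Type]) where
  unitList :: HList as

instance UnitList '[] where
  unitList = HNil

-- The instance head mentions only type variables, so it matches any
-- non-empty list; the context then forces the element type to be ().
instance (a ~ (), UnitList as) => UnitList (a ': as) where
  unitList = () `HCons` unitList

unsingleton :: HList '[a] -> a
unsingleton (x `HCons` HNil) = x

-- Hypothetical driver: nothing here mentions (), yet the element type
-- is inferred, so this prints "()".
demo :: IO ()
demo = print (unsingleton unitList)
</code></pre><p>The split between instance head and instance context is what lets <code>demo</code> typecheck without any annotation. 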
It means you, the programmer, have direct control over the type inference process:</p><ul><li><p>If you put a type in the instance head, you’re asking GHC to figure out how to make the types match up by some other means. Sometimes that’s very useful, since perhaps you want that type to inform which instance to pick.</p></li><li><p>But if you put an equality constraint in the instance context, the roles are reversed: you’re saying to the compiler “you don’t tell me, I’ll tell <em>you</em> what type this is,” effectively giving you a role in type inference itself.</p></li></ul><p>From this perspective, typeclass instances with equality constraints make GHC’s type inference algorithm extensible. You get to pick which decisions are made and when, and crucially, you can use knowledge of your own program structure to expose more information to the typechecker.</p><p>Given all of the above, consider again the definition of <code>IsEven</code> from earlier:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">IsEven</span> <span class="n">as</span> <span class="kr">where</span>
  <span class="n">evenProof</span> <span class="ow">::</span> <span class="kt">Even</span> <span class="n">as</span>
<span class="kr">instance</span> <span class="kt">IsEven</span> <span class="kt">'[]</span> <span class="kr">where</span>
  <span class="n">evenProof</span> <span class="ow">=</span> <span class="kt">EvenNil</span>
<span class="kr">instance</span> <span class="kt">IsEven</span> <span class="n">as</span> <span class="ow">=></span> <span class="kt">IsEven</span> <span class="p">(</span><span class="n">a</span> <span class="sc">'</span><span class="err">: b</span><span class="sc"> '</span><span class="kt">:</span> <span class="n">as</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">evenProof</span> <span class="ow">=</span> <span class="kt">EvenCons</span> <span class="n">evenProof</span></code></pre><p>Though it didn’t cause any problems in the examples we tried, this definition isn’t optimized for type inference. If GHC needed to solve an <code>IsEven (a ': b0)</code> constraint, where <code>b0</code> is an ambiguous type variable, it would get stuck, since it doesn’t know that someone won’t come along and define an <code>IsEven '[a]</code> instance in the future.</p><p>To fix this, we can apply the same trick we used for <code>UnitList</code>, just in a slightly different way:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="p">(</span><span class="n">as</span> <span class="o">~</span> <span class="p">(</span><span class="n">b</span> <span class="sc">'</span><span class="err">: bs), IsEven bs) => IsEven (a</span><span class="sc"> '</span><span class="kt">:</span> <span class="n">as</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">evenProof</span> <span class="ow">=</span> <span class="kt">EvenCons</span> <span class="n">evenProof</span></code></pre><p>Again, the idea is to move the type information we <em>learn</em> from picking this instance into the instance context, allowing it to guide type inference rather than making type inference figure it out from some other source. Consistently applying this transformation can <strong>dramatically</strong> improve type inference in programs that make heavy use of TMP.</p><h3><a name="example-3-subtyping-constraints"></a>Example 3: Subtyping constraints</h3><p>At last, we have reached the final example of this blog post. For this one, I have the pleasure of providing a real-world example from a production Haskell codebase: while I was working at <a href="https://hasura.io/">Hasura</a>, I had the opportunity to design an internal parser combinator library that captures aspects of the <a href="https://graphql.org/">GraphQL</a> type system. One such aspect of that type system is a form of subtyping; GraphQL essentially has two “kinds” of types (input types and output types), but some types can be used as both.</p><p>Haskell has no built-in support for subtyping, so most Haskell programs do their best to get away with parametric polymorphism instead. However, in our case, we actually need to distinguish (at runtime) types in the “both” category from those that are exclusively input or exclusively output types. Consequently, our <code>GQLKind</code> datatype has three cases:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">GQLKind</span>
  <span class="ow">=</span> <span class="kt">Both</span>
  <span class="o">|</span> <span class="kt">Input</span>
  <span class="o">|</span> <span class="kt">Output</span></code></pre><p>We use <code>DataKinds</code>-promoted versions of this <code>GQLKind</code> type as a parameter to a <code>GQLType</code> GADT:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">GQLType</span> <span class="n">k</span> <span class="kr">where</span>
  <span class="kt">TScalar</span> <span class="ow">::</span> <span class="kt">GQLType</span> <span class="kt">'Both</span>
  <span class="kt">TInputObject</span> <span class="ow">::</span> <span class="kt">InputObjectInfo</span> <span class="ow">-></span> <span class="kt">GQLType</span> <span class="kt">'Input</span>
  <span class="kt">TIObject</span> <span class="ow">::</span> <span class="kt">ObjectInfo</span> <span class="ow">-></span> <span class="kt">GQLType</span> <span class="kt">'Output</span>
  <span class="c1">-- ...and so on...</span></code></pre><p>This allows us to write functions that only accept input types or only accept output types, which is a wonderful property to be able to guarantee at compile-time! But there’s a problem: if we write a function that only accepts values of type <code>GQLType 'Input</code>, we can’t pass a <code>GQLType 'Both</code>, even though we really ought to be able to.</p><p>To fix this, we can use a little dependently typed programming. First, we’ll define a type to represent proof terms that witness a subkinding relationship:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">SubKind</span> <span class="n">k1</span> <span class="n">k2</span> <span class="kr">where</span>
  <span class="kt">KRefl</span> <span class="ow">::</span> <span class="kt">SubKind</span> <span class="n">k</span> <span class="n">k</span>
  <span class="kt">KBoth</span> <span class="ow">::</span> <span class="kt">SubKind</span> <span class="kt">'Both</span> <span class="n">k</span></code></pre><p>The first case, <code>KRefl</code>, states that every kind is trivially a subkind of itself. The second case, <code>KBoth</code>, states that <code>Both</code> is a subkind of any kind at all. (This is a particularly literal example of <a href="/blog/2020/08/13/types-as-axioms-or-playing-god-with-static-types/">using a type to define axioms</a>.) The next step is to use TMP to implement proof inference:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">IsSubKind</span> <span class="n">k1</span> <span class="n">k2</span> <span class="kr">where</span>
  <span class="n">subKindProof</span> <span class="ow">::</span> <span class="kt">SubKind</span> <span class="n">k1</span> <span class="n">k2</span>
<span class="kr">instance</span> <span class="kt">IsSubKind</span> <span class="kt">'Both</span> <span class="n">k</span> <span class="kr">where</span>
  <span class="n">subKindProof</span> <span class="ow">=</span> <span class="kt">KBoth</span>
<span class="kr">instance</span> <span class="p">(</span><span class="n">k</span> <span class="o">~</span> <span class="kt">'Input</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">IsSubKind</span> <span class="kt">'Input</span> <span class="n">k</span> <span class="kr">where</span>
  <span class="n">subKindProof</span> <span class="ow">=</span> <span class="kt">KRefl</span>
<span class="kr">instance</span> <span class="p">(</span><span class="n">k</span> <span class="o">~</span> <span class="kt">'Output</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">IsSubKind</span> <span class="kt">'Output</span> <span class="n">k</span> <span class="kr">where</span>
  <span class="n">subKindProof</span> <span class="ow">=</span> <span class="kt">KRefl</span></code></pre><p>These instances use the type equality trick described in the previous section to guide type inference, ensuring that if we ever need to prove that <code>k</code> is a superkind of <code>'Input</code> or <code>'Output</code>, type inference will force them to be equal.</p><p>Using <code>IsSubKind</code>, we can easily resolve the problem described above. Rather than write a function with a type like this:</p><pre><code class="pygments"><span class="nf">nullable</span> <span class="ow">::</span> <span class="kt">GQLParser</span> <span class="kt">'Input</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">GQLParser</span> <span class="kt">'Input</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">a</span><span class="p">)</span></code></pre><p>…we simply use an <code>IsSubKind</code> constraint, instead:</p><pre><code class="pygments"><span class="nf">nullable</span> <span class="ow">::</span> <span class="kt">IsSubKind</span> <span class="n">k</span> <span class="kt">'Input</span> <span class="ow">=></span> <span class="kt">GQLParser</span> <span class="n">k</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">GQLParser</span> <span class="n">k</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">a</span><span class="p">)</span></code></pre><p>Now both <code>'Input</code> and <code>'Both</code> kinds are accepted. In my experience, this caused no trouble at all for callers of these functions; everything worked completely automatically. <em>Consuming</em> the <code>SubKind</code> proofs was slightly more involved, but only ever so slightly.
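</p><p>To make the caller’s side concrete, here is a minimal, self-contained sketch rather than the real code. It repeats the <code>SubKind</code> and <code>IsSubKind</code> definitions from above so that it compiles on its own. The kind name <code>GQLKind</code>, the placeholder <code>Parser</code> type, and the <code>describeInput</code> function are all assumptions made up for illustration; they merely stand in for <code>GQLParser</code> and for functions like <code>nullable</code>.</p><pre><code class="pygments">{-# LANGUAGE DataKinds, FlexibleInstances, GADTs, KindSignatures, MultiParamTypeClasses #-}

-- Assumed stand-in for the kind that indexes GraphQL types.
data GQLKind = Both | Input | Output

data SubKind (k1 :: GQLKind) (k2 :: GQLKind) where
  KRefl :: SubKind k k
  KBoth :: SubKind 'Both k

class IsSubKind (k1 :: GQLKind) (k2 :: GQLKind) where
  subKindProof :: SubKind k1 k2
instance IsSubKind 'Both k where
  subKindProof = KBoth
instance (k ~ 'Input) => IsSubKind 'Input k where
  subKindProof = KRefl
instance (k ~ 'Output) => IsSubKind 'Output k where
  subKindProof = KRefl

-- Hypothetical parser type: only the kind index matters for this sketch.
newtype Parser (k :: GQLKind) a = Parser String

-- Hypothetical function that should accept 'Input and 'Both parsers only.
describeInput :: IsSubKind k 'Input => Parser k a -> String
describeInput (Parser name) = "input parser: " ++ name

inputOnly :: Parser 'Input Int
inputOnly = Parser "int"

both :: Parser 'Both Bool
both = Parser "bool"

outputOnly :: Parser 'Output ()
outputOnly = Parser "object"

-- Both calls typecheck without the caller ever mentioning a proof.
accepted :: [String]
accepted = [describeInput inputOnly, describeInput both]

-- Uncommenting the next line should be rejected by the typechecker,
-- because the 'Output instance demands that 'Input ~ 'Output.
-- rejected = describeInput outputOnly
</code></pre><p>The call sites in <code>accepted</code> never name <code>SubKind</code> at all: instance resolution picks the right proof for each parser. Only the implementations of such functions ever need to look at a proof.</p><p>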
For example, we have a type family that looks like this:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">family</span> <span class="kt">ParserInput</span> <span class="n">k</span> <span class="kr">where</span>
  <span class="kt">ParserInput</span> <span class="kt">'Both</span> <span class="ow">=</span> <span class="kt">InputValue</span>
  <span class="kt">ParserInput</span> <span class="kt">'Input</span> <span class="ow">=</span> <span class="kt">InputValue</span>
  <span class="kt">ParserInput</span> <span class="kt">'Output</span> <span class="ow">=</span> <span class="kt">SelectionSet</span></code></pre><p>This type family is used to determine what a <code>GQLParser k a</code> actually consumes as input, based on the kind of the GraphQL type it corresponds to. In some functions, we need to prove to GHC that <code>IsSubKind k 'Input</code> implies <code>ParserInput k ~ InputValue</code>.</p><p>Fortunately, that is very easy to do using <a href="https://hackage.haskell.org/package/base-4.14.1.0/docs/Data-Type-Equality.html">the <code>(:~:)</code> type from <code>Data.Type.Equality</code> in <code>base</code></a> to capture a term-level witness of a type equality. It’s an ordinary Haskell GADT that happens to have an infix type constructor, and this is its definition:</p><pre><code class="pygments"><span class="kr">data</span> <span class="n">a</span> <span class="kt">:~:</span> <span class="n">b</span> <span class="kr">where</span>
  <span class="kt">Refl</span> <span class="ow">::</span> <span class="n">a</span> <span class="kt">:~:</span> <span class="n">a</span></code></pre><p>Just as with any other GADT, <code>(:~:)</code> can be used to pack up type equalities and unpack them later; <code>a :~: b</code> just happens to be the GADT that corresponds precisely to the equality <code>a ~ b</code>. Using <code>(:~:)</code>, we can write a reusable proof that <code>IsSubKind k 'Input</code> implies <code>ParserInput k ~ InputValue</code>:</p><pre><code class="pygments"><span class="nf">inputParserInput</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">k</span><span class="o">.</span> <span class="kt">IsSubKind</span> <span class="n">k</span> <span class="kt">'Input</span> <span class="ow">=></span> <span class="kt">ParserInput</span> <span class="n">k</span> <span class="kt">:~:</span> <span class="kt">InputValue</span>
<span class="nf">inputParserInput</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">subKindProof</span> <span class="o">@</span><span class="n">k</span> <span class="o">@</span><span class="kt">'Input</span> <span class="kr">of</span>
  <span class="kt">KRefl</span> <span class="ow">-></span> <span class="kt">Refl</span>
  <span class="kt">KBoth</span> <span class="ow">-></span> <span class="kt">Refl</span></code></pre><p>This function is a very simple proof by cases, where <code>Refl</code> can be read as “Q.E.D.”:</p><ul><li><p>In the first case, matching on <code>KRefl</code> refines <code>k</code> to <code>'Input</code>, and <code>ParserInput 'Input</code> is <code>InputValue</code> by definition of <code>ParserInput</code>.</p></li><li><p>Likewise, in the second case, matching on <code>KBoth</code> refines <code>k</code> to <code>'Both</code>, and <code>ParserInput 'Both</code> is also <code>InputValue</code> by definition of <code>ParserInput</code>.</p></li></ul><p>This <code>inputParserInput</code> helper allows functions like <code>nullable</code>, which internally need <code>ParserInput k ~ InputValue</code>, to take the form</p><pre><code class="pygments"><span class="nf">nullable</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">k</span> <span class="n">a</span><span class="o">.</span> <span class="kt">IsSubKind</span> <span class="n">k</span> <span class="kt">'Input</span> <span class="ow">=></span> <span class="kt">GQLParser</span> <span class="n">k</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">GQLParser</span> <span class="n">k</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">a</span><span class="p">)</span>
<span class="nf">nullable</span> <span class="n">parser</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">inputParserInput</span> <span class="o">@</span><span class="n">k</span> <span class="kr">of</span>
  <span class="kt">Refl</span> <span class="ow">-></span> <span class="cm">{- ...implementation goes here... -}</span></code></pre><p>Overall, this burden is quite minimal, so the additional type safety is more than worth the effort. The same could not be said without <code>IsSubKind</code> doing work to infer the proofs at each use site, so in this case, TMP has certainly pulled its weight!</p><h2><a name="wrapping-up-and-closing-thoughts"></a>Wrapping up and closing thoughts</h2><p>So concludes my introduction to Haskell TMP. As seems to happen all too often with my blog posts, this one has grown rather long, so allow me to provide a summary of the most important points:</p><ul><li><p>Typeclass metaprogramming is a powerful technique for performing type-directed code generation, making it a form of “value inference” that infers values from types.</p></li><li><p>Unlike most other metaprogramming mechanisms, TMP has a wonderful synergy with type inference, which allows it to take advantage of information the programmer may not have even written explicitly.</p></li><li><p>Though I’ve called the technique “<em>typeclass</em> metaprogramming,” TMP really leverages the entirety of the modern GHC type system. Type families, GADTs, promoted types, and more all have their place in usefully applying type-level programming.</p></li><li><p>Finally, since TMP relies so heavily on type inference to do its job, it’s crucial to be thoughtful about how you design type-level code to give the typechecker as many opportunities to succeed as you possibly can.</p></li></ul><p>The individual applications of TMP covered in this blog post—type-level computation, generic programming, and dependent typing—are all useful in their own right, and this post does not linger on any of them long enough to do any of them justice. That is, perhaps, the cost one pays when trying to discuss such an abstract, general technique.
However, I hope that readers can see the forest for the trees and understand how TMP can be a set of techniques in their own right, applicable to the topics described above and more.</p><p>Readers may note that this blog post targets a slightly different audience than my other recent writing has been. That is a conscious choice: there is an unfortunate dearth of resources to help intermediate Haskell programmers become advanced Haskell programmers, in part because it’s hard to write them. The lack of resources makes tackling topics like this rather difficult, as too often it feels as though an entire web of concepts must be explained all at once, with no obvious incremental path that provides sufficient motivation every step of the way.</p><p>It remains to be seen whether my stab at the problem will be successful. But on the chance that it is, I suspect some readers will be curious about where to go next. Here are some ideas:</p><ul><li><p>As mentioned earlier in this blog post, <a href="https://hackage.haskell.org/package/base-4.14.1.0/docs/GHC-Generics.html">the <code>GHC.Generics</code> module documentation</a> is a great resource if you want to explore generic programming further, and generic programming is a great way to put TMP to practical use.</p></li><li><p>I have long believed that <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/">the GHC User’s Guide</a> is a criminally under-read and underappreciated piece of documentation. It is a treasure trove of knowledge, and I highly recommend reading through the sections on type-related language extensions if you want to get a better grasp of the mechanics of the Haskell type system.</p></li><li><p>Finally, if dependently typed programming in Haskell intrigues you, and you don’t mind staring into the sun, the <a href="https://hackage.haskell.org/package/singletons">singletons</a> library provides abstractions and design patterns that can considerably cut down on the boilerplate. 
(Also, <a href="https://cs.brynmawr.edu/~rae/papers/2012/singletons/paper.pdf">the accompanying paper</a> is definitely worth a read if you’d like to go down that route.)</p></li></ul><p>Even if you don’t decide to pursue type-level programming in Haskell, I hope this blog post helps make some of the concepts involved less mystical and intimidating. I, for one, think this stuff is worth the effort involved in understanding. After all, you never know when it might come in handy.</p><ol class="footnotes"><li id="footnote-1"><p>Not to be confused with C++’s <a href="https://en.wikipedia.org/wiki/Template_metaprogramming"><em>template</em> metaprogramming</a>, though there are significant similarities between the two techniques. <a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>There have been proposals to introduce ordered instances, known in the literature as <a href="https://homepage.cs.uiowa.edu/~jgmorrs/pubs/morris-icfp2010-instances.pdf"><em>instance chains</em></a>, but as of this writing, GHC does not implement them. <a href="#footnote-ref-2-1">↩</a></p></li><li id="footnote-3"><p>Note that this also preserves an important property of the Haskell type system, parametricity. A function like <code>id :: a -> a</code> shouldn’t be allowed to do different things depending on which type is chosen for <code>a</code>, which our first version of <code>guardUnit</code> tried to violate. Typeclasses, being functions on types, can naturally do different things given different types, so a typeclass constraint is precisely what gives us the power to violate parametricity. <a href="#footnote-ref-3-1">↩</a></p></li><li id="footnote-4"><p>Short for <em>generalized algebraic datatypes</em>, which is a rather unhelpful name for actually understanding what they are or what they’re for. 
<a href="#footnote-ref-4-1">↩</a></p></li><li id="footnote-5"><p>If GHC allowed lightweight existential quantification, we could make that term-level evidence available with a sufficiently clever definition for <code>IsEvenTF</code>:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">family</span> <span class="kt">IsEvenTF</span> <span class="n">as</span> <span class="ow">::</span> <span class="kt">Constraint</span> <span class="kr">where</span>
  <span class="kt">IsEvenTF</span> <span class="kt">'[]</span> <span class="ow">=</span> <span class="nb">()</span>
  <span class="kt">IsEvenTF</span> <span class="p">(</span><span class="n">a</span> <span class="kt">':</span> <span class="n">as</span><span class="p">)</span> <span class="ow">=</span> <span class="n">exists</span> <span class="n">b</span> <span class="n">as'</span><span class="o">.</span> <span class="p">(</span><span class="n">as</span> <span class="o">~</span> <span class="p">(</span><span class="n">b</span> <span class="kt">':</span> <span class="n">as'</span><span class="p">),</span> <span class="kt">IsEvenTF</span> <span class="n">as'</span><span class="p">)</span></code></pre><p>The type refinement provided by matching on <code>HCons</code> would be enough for the second case of <code>IsEvenTF</code> to be selected, which would provide an equality proof that <code>as</code> has at least two elements. Sadly, GHC does not support anything of this sort, and it’s unclear if it would be tractable to implement at all. <a href="#footnote-ref-5-1">↩</a></p></li><li id="footnote-6"><p>Actually, I’ve cheated a little bit here, because <code>unsingleton unitList</code> really does typecheck in GHCi under normal circumstances. That’s because <a href="https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/ghci.html#extension-ExtendedDefaultRules">the <code>ExtendedDefaultRules</code> extension</a> is enabled in GHCi by default, which defaults ambiguous type variables to <code>()</code>, which happens to be exactly what’s needed to make this contrived example typecheck. However, that doesn’t say anything very useful, since the same expression really would fail to typecheck inside a Haskell module, so I’ve turned <code>ExtendedDefaultRules</code> off to illustrate the problem. <a href="#footnote-ref-6-1">↩</a></p></li></ol></article>Names are not type safety2020-11-01T00:00:00Z2020-11-01T00:00:00ZAlexis King<article><p>Haskell programmers spend a lot of time talking about <em>type safety</em>.
The Haskell school of program construction advocates “capturing invariants in the type system” and “making illegal states unrepresentable,” both of which sound like compelling goals, but are rather vague on the techniques used to achieve them. Almost exactly one year ago, I published <a href="/blog/2019/11/05/parse-don-t-validate/">Parse, Don’t Validate</a> as an initial stab towards bridging that gap.</p><p>The ensuing discussions were largely productive and right-minded, but one particular source of confusion quickly became clear: Haskell’s <code>newtype</code> construct. The idea is simple enough—the <code>newtype</code> keyword declares a wrapper type, nominally distinct from but representationally equivalent to the type it wraps—and on the surface this <em>sounds</em> like a simple and straightforward path to type safety. For example, one might consider using a <code>newtype</code> declaration to define a type for an email address:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">EmailAddress</span> <span class="ow">=</span> <span class="kt">EmailAddress</span> <span class="kt">Text</span></code></pre><p>This technique can provide <em>some</em> value, and when coupled with a smart constructor and an encapsulation boundary, it can even provide some safety. But it is a meaningfully distinct <em>kind</em> of type safety from the one I highlighted a year ago, one that is far weaker. On its own, a newtype is just a name.</p><p>And names are not type safety.</p><h2><a name="intrinsic-and-extrinsic-safety"></a>Intrinsic and extrinsic safety</h2><p>To illustrate the difference between constructive data modeling (discussed at length in my <a href="/blog/2020/08/13/types-as-axioms-or-playing-god-with-static-types/">previous blog post</a>) and newtype wrappers, let’s consider an example. 
Suppose we want a type for “an integer between 1 and 5, inclusive.” The natural constructive modeling would be an enumeration with five cases:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">OneToFive</span>
  <span class="ow">=</span> <span class="kt">One</span>
  <span class="o">|</span> <span class="kt">Two</span>
  <span class="o">|</span> <span class="kt">Three</span>
  <span class="o">|</span> <span class="kt">Four</span>
  <span class="o">|</span> <span class="kt">Five</span></code></pre><p>We could then write some functions to convert between <code>Int</code> and our <code>OneToFive</code> type:</p><pre><code class="pygments"><span class="nf">toOneToFive</span> <span class="ow">::</span> <span class="kt">Int</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="kt">OneToFive</span>
<span class="nf">toOneToFive</span> <span class="mi">1</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="kt">One</span>
<span class="nf">toOneToFive</span> <span class="mi">2</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="kt">Two</span>
<span class="nf">toOneToFive</span> <span class="mi">3</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="kt">Three</span>
<span class="nf">toOneToFive</span> <span class="mi">4</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="kt">Four</span>
<span class="nf">toOneToFive</span> <span class="mi">5</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="kt">Five</span>
<span class="nf">toOneToFive</span> <span class="kr">_</span> <span class="ow">=</span> <span class="kt">Nothing</span>
<span class="nf">fromOneToFive</span> <span class="ow">::</span> <span class="kt">OneToFive</span> <span class="ow">-></span> <span class="kt">Int</span>
<span class="nf">fromOneToFive</span> <span class="kt">One</span> <span class="ow">=</span> <span class="mi">1</span>
<span class="nf">fromOneToFive</span> <span class="kt">Two</span> <span class="ow">=</span> <span class="mi">2</span>
<span class="nf">fromOneToFive</span> <span class="kt">Three</span> <span class="ow">=</span> <span class="mi">3</span>
<span class="nf">fromOneToFive</span> <span class="kt">Four</span> <span class="ow">=</span> <span class="mi">4</span>
<span class="nf">fromOneToFive</span> <span class="kt">Five</span> <span class="ow">=</span> <span class="mi">5</span></code></pre><p>This would be perfectly sufficient for achieving our stated goal, but you’d be forgiven for finding it odd: it would be rather awkward to work with in practice. Because we’ve invented an entirely new type, we can’t reuse any of the usual numeric functions Haskell provides. Consequently, many programmers would gravitate towards a newtype wrapper, instead:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">OneToFive</span> <span class="ow">=</span> <span class="kt">OneToFive</span> <span class="kt">Int</span></code></pre><p>Just as before, we can provide <code>toOneToFive</code> and <code>fromOneToFive</code> functions, with identical types:</p><pre><code class="pygments"><span class="nf">toOneToFive</span> <span class="ow">::</span> <span class="kt">Int</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="kt">OneToFive</span>
<span class="nf">toOneToFive</span> <span class="n">n</span>
  <span class="o">|</span> <span class="n">n</span> <span class="o">>=</span> <span class="mi">1</span> <span class="o">&&</span> <span class="n">n</span> <span class="o"><=</span> <span class="mi">5</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="o">$</span> <span class="kt">OneToFive</span> <span class="n">n</span>
  <span class="o">|</span> <span class="n">otherwise</span> <span class="ow">=</span> <span class="kt">Nothing</span>
<span class="nf">fromOneToFive</span> <span class="ow">::</span> <span class="kt">OneToFive</span> <span class="ow">-></span> <span class="kt">Int</span>
<span class="nf">fromOneToFive</span> <span class="p">(</span><span class="kt">OneToFive</span> <span class="n">n</span><span class="p">)</span> <span class="ow">=</span> <span class="n">n</span></code></pre><p>If we put these declarations in their own module and choose not to export the <code>OneToFive</code> constructor, these APIs might appear entirely interchangeable. Naïvely, it seems that the newtype version is both simpler and equally type-safe. However—perhaps surprisingly—this is not actually true.</p><p>To see why, suppose we write a function that consumes a <code>OneToFive</code> value as an argument. Under the constructive modeling, such a function need only pattern-match against each of the five constructors, and GHC will accept the definition as exhaustive:</p><pre><code class="pygments"><span class="nf">ordinal</span> <span class="ow">::</span> <span class="kt">OneToFive</span> <span class="ow">-></span> <span class="kt">Text</span>
<span class="nf">ordinal</span> <span class="kt">One</span> <span class="ow">=</span> <span class="s">"first"</span>
<span class="nf">ordinal</span> <span class="kt">Two</span> <span class="ow">=</span> <span class="s">"second"</span>
<span class="nf">ordinal</span> <span class="kt">Three</span> <span class="ow">=</span> <span class="s">"third"</span>
<span class="nf">ordinal</span> <span class="kt">Four</span> <span class="ow">=</span> <span class="s">"fourth"</span>
<span class="nf">ordinal</span> <span class="kt">Five</span> <span class="ow">=</span> <span class="s">"fifth"</span></code></pre><p>The same is not true given the newtype encoding. The newtype is opaque, so the only way to observe it is to convert it back to an <code>Int</code>—after all, it <em>is</em> an <code>Int</code>. An <code>Int</code> can of course contain many other values besides <code>1</code> through <code>5</code>, so we are forced to add an error case to satisfy the exhaustiveness checker:</p><pre><code class="pygments"><span class="nf">ordinal</span> <span class="ow">::</span> <span class="kt">OneToFive</span> <span class="ow">-></span> <span class="kt">Text</span>
<span class="nf">ordinal</span> <span class="n">n</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">fromOneToFive</span> <span class="n">n</span> <span class="kr">of</span>
  <span class="mi">1</span> <span class="ow">-></span> <span class="s">"first"</span>
  <span class="mi">2</span> <span class="ow">-></span> <span class="s">"second"</span>
  <span class="mi">3</span> <span class="ow">-></span> <span class="s">"third"</span>
  <span class="mi">4</span> <span class="ow">-></span> <span class="s">"fourth"</span>
  <span class="mi">5</span> <span class="ow">-></span> <span class="s">"fifth"</span>
  <span class="kr">_</span> <span class="ow">-></span> <span class="ne">error</span> <span class="s">"impossible: bad OneToFive value"</span></code></pre><p>In this highly contrived example, this may not seem like much of a problem to you. But it nonetheless illustrates a key difference in the guarantees afforded by the two approaches:</p><ul><li><p>The constructive datatype captures its invariants in such a way that they are <em>accessible</em> to downstream consumers. This frees our <code>ordinal</code> function from worrying about handling illegal values, as they have been made unutterable.</p></li><li><p>The newtype wrapper provides a smart constructor that <em>validates</em> the value, but the boolean result of that check is used only for control flow; it is not preserved in the function’s result. Accordingly, downstream consumers cannot take advantage of the restricted domain; they are functionally accepting <code>Int</code>s.</p></li></ul><p>Losing exhaustiveness checking might seem like small potatoes, but it absolutely is not: our use of <code>error</code> has punched a hole right through our type system. If we were to add another constructor to our <code>OneToFive</code> datatype,<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup> the version of <code>ordinal</code> that consumes a constructive datatype would be immediately detected non-exhaustive at compile-time, while the version that consumes a newtype wrapper would continue to compile yet fail at runtime, dropping through to the “impossible” case.</p><p>All of this is a consequence of the fact that the constructive modeling is <em>intrinsically</em> type-safe; that is, the safety properties are enforced by the type declaration itself. Illegal values truly are unrepresentable: there is simply no way to represent <code>6</code> using any of the five constructors.
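</p><p>A small sketch may help make the contrast tangible. It places both encodings in one module, so the newtype is renamed <code>OneToFiveN</code> here; that name, along with the helpers <code>next</code> and <code>nextN</code>, is hypothetical and exists only for this illustration.</p><pre><code class="pygments">-- Constructive encoding: the declaration lists every legal value.
data OneToFive = One | Two | Three | Four | Five
  deriving (Eq, Show)

-- Hypothetical helper: any OneToFive it returns has to be one of the
-- five constructors, so the author must decide what follows Five.
next :: OneToFive -> Maybe OneToFive
next One   = Just Two
next Two   = Just Three
next Three = Just Four
next Four  = Just Five
next Five  = Nothing

-- Newtype encoding, renamed OneToFiveN so both fit in one module.
newtype OneToFiveN = OneToFiveN Int
  deriving (Eq, Show)

toOneToFiveN :: Int -> Maybe OneToFiveN
toOneToFiveN n
  | n `elem` [1 .. 5] = Just (OneToFiveN n)
  | otherwise         = Nothing

-- Hypothetical helper living in the newtype's home module. It
-- typechecks, yet turns a wrapped 5 into a wrapped 6.
nextN :: OneToFiveN -> OneToFiveN
nextN (OneToFiveN n) = OneToFiveN (n + 1)
</code></pre><p>Nothing about <code>nextN</code> looks suspicious to the typechecker, and a later call to an <code>ordinal</code>-style function would land in its “impossible” branch. In the constructive half, no definition of <code>next</code> can hand back a sixth value, because the type has none to offer.</p><p>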
The same is not true of the newtype declaration, which has no intrinsic semantic distinction from that of an <code>Int</code>; its meaning is specified extrinsically via the <code>toOneToFive</code> smart constructor. Any semantic distinction intended by a newtype is thoroughly invisible to the type system; it exists only in the programmer’s mind.</p><h3><a name="revisiting-non-empty-lists"></a>Revisiting non-empty lists</h3><p>Our <code>OneToFive</code> datatype is rather artificial, but identical reasoning applies to other datatypes that are significantly more practical. Consider the <code>NonEmpty</code> datatype I’ve repeatedly highlighted in recent blog posts:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">=</span> <span class="n">a</span> <span class="kt">:|</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span></code></pre><p>It may be illustrative to imagine a version of <code>NonEmpty</code> represented as a newtype over ordinary lists. We can use the usual smart constructor strategy to enforce the desired non-emptiness property:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">NonEmpty</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span>
<span class="nf">nonEmpty</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="n">a</span><span class="p">)</span>
<span class="nf">nonEmpty</span> <span class="kt">[]</span> <span class="ow">=</span> <span class="kt">Nothing</span>
<span class="nf">nonEmpty</span> <span class="n">xs</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="o">$</span> <span class="kt">NonEmpty</span> <span class="n">xs</span>
<span class="kr">instance</span> <span class="kt">Foldable</span> <span class="kt">NonEmpty</span> <span class="kr">where</span>
  <span class="n">foldr</span> <span class="n">f</span> <span class="n">z</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="n">xs</span><span class="p">)</span> <span class="ow">=</span> <span class="n">foldr</span> <span class="n">f</span> <span class="n">z</span> <span class="n">xs</span>
  <span class="n">toList</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="n">xs</span><span class="p">)</span> <span class="ow">=</span> <span class="n">xs</span></code></pre><p>Just as with <code>OneToFive</code>, we quickly discover the consequences of failing to preserve this information in the type system. Our motivating use case for <code>NonEmpty</code> was the ability to write a safe version of <code>head</code>, but the newtype version requires another assertion:</p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">head</span> <span class="n">xs</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">toList</span> <span class="n">xs</span> <span class="kr">of</span>
  <span class="n">x</span><span class="kt">:</span><span class="kr">_</span> <span class="ow">-></span> <span class="n">x</span>
  <span class="kt">[]</span> <span class="ow">-></span> <span class="ne">error</span> <span class="s">"impossible: empty NonEmpty value"</span></code></pre><p>This might not seem like a big deal, since it seems unlikely such a case would ever happen. But that reasoning hinges entirely on trusting the correctness of the module that defines <code>NonEmpty</code>, while the constructive definition only requires trusting the GHC typechecker. As we generally trust that the typechecker works correctly, the latter is a much more compelling proof.</p><h2><a name="newtypes-as-tokens"></a>Newtypes as tokens</h2><p>If you are fond of newtypes, this whole argument may seem a bit troubling. It may seem like I’m implying newtypes are scarcely better than comments, albeit comments that happen to be meaningful to the typechecker. Fortunately, the situation is not quite that grim—newtypes <em>can</em> provide a sort of safety, just a weaker one.</p><p>The primary safety benefit of newtypes is derived from abstraction boundaries. If a newtype’s constructor is not exported, it becomes opaque to other modules. The module that defines the newtype—its “home module”—can take advantage of this to create a <em>trust boundary</em> where internal invariants are enforced by restricting clients to a safe API.</p><p>We can use the <code>NonEmpty</code> example from above to illustrate how this works. We refrain from exporting the <code>NonEmpty</code> constructor, and we provide <code>head</code> and <code>tail</code> operations that we trust to never actually fail:</p><pre><code class="pygments"><span class="kr">module</span> <span class="nn">Data.List.NonEmpty.Newtype</span>
  <span class="p">(</span> <span class="kt">NonEmpty</span>
  <span class="p">,</span> <span class="nf">cons</span>
  <span class="p">,</span> <span class="nf">nonEmpty</span>
  <span class="p">,</span> <span class="nf">head</span>
  <span class="p">,</span> <span class="nf">tail</span>
  <span class="p">)</span> <span class="kr">where</span>
<span class="kr">newtype</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">NonEmpty</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span>
<span class="nf">cons</span> <span class="ow">::</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">NonEmpty</span> <span class="n">a</span>
<span class="nf">cons</span> <span class="n">x</span> <span class="n">xs</span> <span class="ow">=</span> <span class="kt">NonEmpty</span> <span class="p">(</span><span class="n">x</span><span class="kt">:</span><span class="n">xs</span><span class="p">)</span>
<span class="nf">nonEmpty</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="n">a</span><span class="p">)</span>
<span class="nf">nonEmpty</span> <span class="kt">[]</span> <span class="ow">=</span> <span class="kt">Nothing</span>
<span class="nf">nonEmpty</span> <span class="n">xs</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="o">$</span> <span class="kt">NonEmpty</span> <span class="n">xs</span>
<span class="nf">head</span> <span class="ow">::</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">head</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="p">(</span><span class="n">x</span><span class="kt">:</span><span class="kr">_</span><span class="p">))</span> <span class="ow">=</span> <span class="n">x</span>
<span class="nf">head</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="kt">[]</span><span class="p">)</span> <span class="ow">=</span> <span class="ne">error</span> <span class="s">"impossible: empty NonEmpty value"</span>
<span class="nf">tail</span> <span class="ow">::</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">[</span><span class="n">a</span><span class="p">]</span>
<span class="nf">tail</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="p">(</span><span class="kr">_</span><span class="kt">:</span><span class="n">xs</span><span class="p">))</span> <span class="ow">=</span> <span class="n">xs</span>
<span class="nf">tail</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="kt">[]</span><span class="p">)</span> <span class="ow">=</span> <span class="ne">error</span> <span class="s">"impossible: empty NonEmpty value"</span></code></pre><p>Since the only way to construct or consume <code>NonEmpty</code> values is to use the functions in <code>Data.List.NonEmpty.Newtype</code>’s exported API, the above implementation makes it impossible for clients to violate the non-emptiness invariant. In a sense, values of opaque newtypes are like <em>tokens</em>: the implementing module issues tokens via its constructor functions, and those tokens have no intrinsic value. The only way to do anything useful with them is to “redeem” them to the issuing module’s accessor functions, in this case <code>head</code> and <code>tail</code>, to obtain the values contained within.</p><p>This approach is significantly weaker than using a constructive datatype, since it is theoretically possible to screw up and accidentally provide a means to construct an invalid <code>NonEmpty []</code> value. For this reason, the newtype approach to type safety does not on its own constitute a <em>proof</em> that a desired invariant holds. However, it restricts the “surface area” where an invariant violation can occur to the defining module, so reasonable confidence the invariant really does hold can be achieved by thoroughly testing the module’s API using fuzzing or property-based testing techniques.<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup></p><p>This tradeoff may not seem all that bad, and indeed, it is often a very good one! Guaranteeing invariants using constructive data modeling can, in general, be quite difficult, which often makes it impractical. However, it is easy to dramatically underestimate the care needed to avoid accidentally providing a mechanism that permits violating the invariant. 
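</p><p>The testing strategy mentioned above can be made concrete with a small sketch. The code below is an illustration, not part of the original module: it inlines the newtype API so it stands alone, and <code>smallLists</code>, <code>consRoundTrips</code>, <code>nonEmptyAgreesWithList</code>, and <code>allChecksPass</code> are hypothetical names. It enumerates short lists exhaustively to stay within <code>base</code>; a real project would more likely reach for a property-testing library such as QuickCheck.</p><pre><code class="pygments">import Prelude hiding (head, tail)
import Data.Maybe (listToMaybe)

-- The API from above, inlined so the sketch is self-contained.
newtype NonEmpty a = NonEmpty [a]

cons :: a -> [a] -> NonEmpty a
cons x xs = NonEmpty (x:xs)

nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)

head :: NonEmpty a -> a
head (NonEmpty (x:_)) = x
head (NonEmpty []) = error "impossible: empty NonEmpty value"

tail :: NonEmpty a -> [a]
tail (NonEmpty (_:xs)) = xs
tail (NonEmpty []) = error "impossible: empty NonEmpty value"

-- Hypothetical helper: every list of Bools with length 0 through 3.
smallLists :: [[Bool]]
smallLists = concatMap (\n -> sequence (replicate n [False, True])) [0 .. 3]

-- cons followed by head and tail gives back the original pieces.
consRoundTrips :: Bool
consRoundTrips = all check smallLists
  where
    check xs = all (\x -> and [head (cons x xs) == x, tail (cons x xs) == xs]) [False, True]

-- nonEmpty rejects exactly the empty list and otherwise keeps the first element.
nonEmptyAgreesWithList :: Bool
nonEmptyAgreesWithList = all (\xs -> fmap head (nonEmpty xs) == listToMaybe xs) smallLists

-- Conjunction of every check in this sketch.
allChecksPass :: Bool
allChecksPass = and [consRoundTrips, nonEmptyAgreesWithList]</code></pre><p>Loading this into GHCi and evaluating <code>allChecksPass</code> should yield <code>True</code>. Checks like these only exercise the functions their author thought to test, though, so a hole opened elsewhere in the module can slip past them. 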
For example, the programmer may choose to take advantage of GHC’s convenient typeclass deriving to derive a <code>Generic</code> instance for <code>NonEmpty</code>:</p><pre><code class="pygments"><span class="cm">{-# LANGUAGE DeriveGeneric #-}</span>
<span class="kr">import</span> <span class="nn">GHC.Generics</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">)</span>
<span class="kr">newtype</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">NonEmpty</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">)</span></code></pre><p>However, this innocuous line provides a trivial mechanism to circumvent the abstraction boundary:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="kt">GHC</span><span class="o">.</span><span class="kt">Generics</span><span class="o">.</span><span class="n">to</span> <span class="o">@</span><span class="p">(</span><span class="kt">NonEmpty</span> <span class="nb">()</span><span class="p">)</span> <span class="p">(</span><span class="kt">M1</span> <span class="o">$</span> <span class="kt">M1</span> <span class="o">$</span> <span class="kt">M1</span> <span class="o">$</span> <span class="kt">K1</span> <span class="kt">[]</span><span class="p">)</span>
<span class="kt">NonEmpty</span> <span class="kt">[]</span></code></pre><p>This is a particularly extreme example, since derived <code>Generic</code> instances are fundamentally abstraction-breaking, but this problem can crop up in less obvious ways, too. The same problem occurs with a derived <code>Read</code> instance:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">read</span> <span class="o">@</span><span class="p">(</span><span class="kt">NonEmpty</span> <span class="nb">()</span><span class="p">)</span> <span class="s">"NonEmpty []"</span>
<span class="kt">NonEmpty</span> <span class="kt">[]</span></code></pre><p>To some readers, these pitfalls may seem obvious, but safety holes of this sort are remarkably common in practice. This is especially true for datatypes with more sophisticated invariants, as it may not be easy to determine whether the invariants are actually upheld by the module’s implementation. Proper use of this technique demands caution and care:</p><ul><li><p>All invariants must be made clear to maintainers of the trusted module. For simple types, such as <code>NonEmpty</code>, the invariant is self-evident, but for more sophisticated types, comments are not optional.</p></li><li><p>Every change to the trusted module must be carefully audited to ensure it does not somehow weaken the desired invariants.</p></li><li><p>Discipline is needed to resist the temptation to add unsafe trapdoors that allow compromising the invariants if used incorrectly.</p></li><li><p>Periodic refactoring may be needed to ensure the trusted surface area remains small. It is all too easy for the responsibility of the trusted module to accumulate over time, dramatically increasing the likelihood of some subtle interaction causing an invariant violation.</p></li></ul><p>In contrast, datatypes that are correct by construction suffer none of these problems. The invariant cannot be violated without changing the datatype definition itself, which has rippling effects throughout the rest of the program to make the consequences immediately clear. Discipline on the part of the programmer is unnecessary, as the typechecker enforces the invariants automatically. There is no “trusted code” for such datatypes, since all parts of the program are equally beholden to the datatype-mandated constraints.</p><p>In libraries, the newtype-afforded notion of safety via encapsulation is useful, as libraries often provide the building blocks used to construct more complicated data structures. 
Such libraries generally receive more scrutiny and care than application code does, especially given that they change far less frequently. In application code, these techniques are still useful, but the churn of a production codebase tends to weaken encapsulation boundaries over time, so correctness by construction should be preferred whenever practical.</p><h2><a name="other-newtype-use-abuse-and-misuse"></a>Other newtype use, abuse, and misuse</h2><p>The previous section covers the primary means by which newtypes are useful. However, in practice, newtypes are routinely used in ways that do not fit the above pattern. Some such uses are reasonable:</p><ul><li><p>Haskell’s notion of typeclass coherence limits each type to a single instance of any given class. For types that permit more than one useful instance, newtypes are the traditional solution, and this can be used to good effect. For example, the <code>Sum</code> and <code>Product</code> newtypes from <code>Data.Monoid</code> provide useful <code>Monoid</code> instances for numeric types.</p></li><li><p>In a similar vein, newtypes can be useful for introducing or rearranging type parameters. 
The <code>Flip</code> newtype from <code>Data.Bifunctor.Flip</code> is a simple example, flipping the arguments of a <code>Bifunctor</code> so the <code>Functor</code> instance may operate on the other side:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">Flip</span> <span class="n">p</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">=</span> <span class="kt">Flip</span> <span class="p">{</span> <span class="n">runFlip</span> <span class="ow">::</span> <span class="n">p</span> <span class="n">b</span> <span class="n">a</span> <span class="p">}</span></code></pre><p>Newtypes are needed to do this sort of juggling, as Haskell does not (yet) support type-level lambdas.</p></li><li><p>More simply, transparent newtypes can be useful to discourage misuse when the value needs to be passed between distant parts of the program and the intermediate code has no reason to inspect the value. For example, a <code>ByteString</code> containing a secret key may be wrapped in a newtype (with a <code>Show</code> instance omitted) to discourage code from accidentally logging or otherwise exposing it.</p></li></ul><p>All of these applications are good ones, but they have little to do with <em>type safety.</em> The last bullet in particular is often confused for safety, and to be fair, it does in fact take advantage of the type system to help avoid logical mistakes. However, it would be a mischaracterization to claim such usage actually <em>prevents</em> misuse; any part of the program may inspect the value at any time.</p><p>Too often, this illusion of safety leads to outright newtype abuse. 
For example, here’s a definition from the very codebase I work on for a living:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">ArgumentName</span> <span class="ow">=</span> <span class="kt">ArgumentName</span> <span class="p">{</span> <span class="n">unArgumentName</span> <span class="ow">::</span> <span class="kt">GraphQL</span><span class="o">.</span><span class="kt">Name</span> <span class="p">}</span>
  <span class="kr">deriving</span> <span class="p">(</span> <span class="kt">Show</span><span class="p">,</span> <span class="kt">Eq</span><span class="p">,</span> <span class="kt">FromJSON</span><span class="p">,</span> <span class="kt">ToJSON</span><span class="p">,</span> <span class="kt">FromJSONKey</span><span class="p">,</span> <span class="kt">ToJSONKey</span>
           <span class="p">,</span> <span class="kt">Hashable</span><span class="p">,</span> <span class="kt">ToTxt</span><span class="p">,</span> <span class="kt">Lift</span><span class="p">,</span> <span class="kt">Generic</span><span class="p">,</span> <span class="kt">NFData</span><span class="p">,</span> <span class="kt">Cacheable</span> <span class="p">)</span></code></pre><p>This newtype is useless noise. Functionally, it is completely interchangeable with its underlying <code>Name</code> type, so much so that it derives a dozen typeclasses! In every location it’s used, it’s immediately unwrapped the instant it’s extracted from its enclosing record, so there is no type safety benefit whatsoever. Worse, there isn’t even any clarity added by labeling it an <code>ArgumentName</code>, since the enclosing field name already makes its role clear.</p><p>Newtypes like these seem to arise from a desire to use the type system as a taxonomy of the external world. An “argument name” is a more specific concept than a generic “name,” so surely it ought to have its own type. This makes some intuitive sense, but it’s rather misguided: taxonomies are useful for documenting a domain of interest, but not necessarily helpful for modeling it. When programming, we use types for a different end:</p><ul><li><p>Primarily, types distinguish <em>functional</em> differences between values. A value of type <code>NonEmpty a</code> is <em>functionally</em> distinct from a value of type <code>[a]</code>, since it is fundamentally structurally different and permits additional operations. In this sense, types are <em>structural</em>; they describe what values <em>are</em> in the internal world of the programming language.</p></li><li><p>Secondarily, we sometimes use types to help ourselves avoid making logical mistakes. 
We might use separate <code>Distance</code> and <code>Duration</code> types to avoid accidentally doing something nonsensical like adding them together, even though they’re both representationally real numbers.</p></li></ul><p>Note that both these uses are <em>pragmatic</em>; they look at the type system as a tool. This is a rather natural perspective to take, seeing as a static type system <em>is</em> a tool in a literal sense. Nevertheless, that perspective seems surprisingly unusual, even though the use of types to classify the world routinely yields unhelpful noise like <code>ArgumentName</code>.</p><p>If a newtype is completely transparent, and it is routinely wrapped and unwrapped at will, it is likely not very helpful. In this particular case, I would eliminate the distinction altogether and use <code>Name</code>, but in situations where the different label adds genuine clarity, one can always use a type alias:<sup><a href="#footnote-3" id="footnote-ref-3-1">3</a></sup></p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">ArgumentName</span> <span class="ow">=</span> <span class="kt">GraphQL</span><span class="o">.</span><span class="kt">Name</span></code></pre><p>Newtypes like these are security blankets. Forcing programmers to jump through a few hoops is not type safety—trust me when I say they will happily jump through them without a second thought.</p><h2><a name="final-thoughts-and-related-reading"></a>Final thoughts and related reading</h2><p>I’ve been wanting to write this blog post for a long time. Ostensibly, it’s a very specific critique of Haskell newtypes, and I’ve chosen to frame things this way because I write Haskell for a living and this is the way I encounter this problem in practice. Really, though, the core idea is much bigger than that.</p><p>Newtypes are one particular mechanism of defining <em>wrapper types</em>, a concept that exists in almost any language, even those that are dynamically typed. 
Even if you don’t write Haskell, much of the reasoning in this blog post is likely still relevant in your language of choice. More broadly, this is a continuation of a theme I’ve been trying to convey from different angles over the past year: type systems are tools, and we should be more conscious and intentional about what they actually do and how to use them effectively.</p><p>The catalyst that got me to finally sit down and write this was the recently-published <a href="https://tech.freckle.com/2020/10/26/tagged-is-not-a-newtype/">Tagged is not a Newtype</a>. It’s a good blog post, and I wholeheartedly agree with its general thrust, but I thought it was a missed opportunity to make a larger point. Indeed, <code>Tagged</code> <em>is</em> a newtype, definitionally, so the title of the blog post is something of a misdirection. The real problem is a little deeper.</p><p>Newtypes are useful when carefully applied, but their safety is not intrinsic, no more than the safety of a traffic cone is somehow contained within the plastic it’s made of. What matters is being placed in the right context—without that, newtypes are just a labeling scheme, a way of giving something a name.</p><p>And a name is not type safety.</p><ol class="footnotes"><li id="footnote-1"><p>Admittedly rather unlikely given its name, but bear with me through the contrived example. <a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>In theory, it is still possible to thoroughly prove the invariant holds using external verification techniques, such as by writing a pen-and-paper proof or by using program extraction in combination with a proof assistant/theorem prover. However, these techniques are extremely uncommon in general programming practice. <a href="#footnote-ref-2-1">↩</a></p></li><li id="footnote-3"><p>As it happens, I think type aliases are often also more harmful than helpful, so I would caution against overusing them, too, but that is outside the scope of this blog post. 
<a href="#footnote-ref-3-1">↩</a></p></li></ol></article>
Types as axioms, or: playing god with static types
2020-08-13T00:00:00Z
Alexis King
<article><p>Just what exactly <em>is</em> a type?</p><p>A common perspective is that types are <em>restrictions</em>. Static types restrict the set of values a variable may contain, capturing some subset of the space of “all possible values.” Under this worldview, a typechecker is sort of like an oracle, predicting which values will end up where when the program runs and making sure they satisfy the constraints the programmer wrote down in the type annotations. Of course, the typechecker can’t <em>really</em> predict the future, so when the typechecker gets it wrong (it can’t “figure out” what a value will be), static types can feel like self-inflicted shackles.</p><p>But that is not the <em>only</em> perspective. There is another way: a way that puts you, the programmer, back in the driver’s seat. You make the rules, you call the shots, you set the objectives. You need not be limited any longer by what the designers of your programming language decided the typechecker can and cannot prove. You do not serve the typechecker; the typechecker serves <em>you.</em></p><p>…no, I’m not trying to sell you a dubious self-help book for programmers who feel like they’ve lost control of their lives. If the above sounds too good to be true, well… I won’t pretend it’s all actually as easy as I make it sound. Nevertheless, it’s well within the reach of the working programmer, and most remarkably, all it takes is a change in perspective.</p><h2><a name="seeing-the-types-half-empty"></a>Seeing the types half-empty</h2><p>Let’s talk a little about TypeScript.</p><p>TypeScript is a <em>gradually-typed</em> language, which means it’s possible to mix statically- and dynamically-typed code. 
The original intended use case of gradual typing was to <em>gradually</em> add static types to an existing dynamically-typed codebase, which imposes some interesting design constraints. For one, a valid JavaScript program must also be a valid TypeScript program; for another, TypeScript must be accommodating of traditional JavaScript idioms.</p><p>Gradually typed languages like TypeScript are particularly good illustrations of the way type annotations can be viewed as constraints. A function with no explicit type declarations<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup> can accept <em>any</em> JavaScript value, so adding a type annotation fundamentally restricts the set of legal values.</p><p>Furthermore, languages like TypeScript tend to have subtyping. This makes it easy to classify certain types as “more restrictive” than others. For example, a type like <code>string | number</code> clearly includes more values than just <code>number</code>, so <code>number</code> is a more restrictive type—a <em>subtype</em>.</p><p>An exceptionally concrete way to illustrate this “types are restrictions” mentality is to write a function with an unnecessarily specific type. Here’s a TypeScript function that returns the first element in an array of numbers:</p><pre><code class="pygments"><span class="kd">function</span> <span class="nx">getFirst</span><span class="p">(</span><span class="nx">arr</span>: <span class="kt">number</span><span class="p">[])</span><span class="o">:</span> <span class="kt">number</span> <span class="o">|</span> <span class="kc">undefined</span> <span class="p">{</span>
  <span class="k">return</span> <span class="nx">arr</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="p">}</span></code></pre><p>If we ignore the type annotations and consider only the dynamic semantics of JavaScript, this function would work perfectly well given a list of strings. However, if we write <code>getFirst(["hello", "world"])</code>, the typechecker will complain. In this example, the restriction is thoroughly self-imposed—it would be easy to give this function a more generic type—but it’s not always that easy. For example, suppose we wrote a function where the return type depends upon the type of the argument:</p><pre><code class="pygments"><span class="kd">function</span> <span class="nx">emptyLike</span><span class="p">(</span><span class="nx">val</span>: <span class="kt">number</span> <span class="o">|</span> <span class="kt">string</span><span class="p">)</span><span class="o">:</span> <span class="kt">number</span> <span class="o">|</span> <span class="kt">string</span> <span class="p">{</span>
  <span class="k">if</span> <span class="p">(</span><span class="k">typeof</span> <span class="nx">val</span> <span class="o">===</span> <span class="s2">"number"</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="k">return</span> <span class="s2">""</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">}</span></code></pre><p>Now if we write <code>emptyLike(42) * 10</code>, the typechecker will once again complain, claiming the result might be a string—it can’t “figure out” that when we pass a number, we always get a number back.</p><p>When type systems are approached from this perspective, the result is often frustration. The programmer knows that the equivalent untyped JavaScript is perfectly well-behaved, so the typechecker comes off as being the highly unfortunate combination of stubborn yet dim-witted. What’s more, the programmer likely has little mental model of the typechecker’s internal operation, so when types like the above are inferred (not explicitly written), it can be unclear what solutions exist to make the error go away.</p><p>At this point, the programmer may give up. “Stupid typechecker,” they grumble, changing the return type of <code>emptyLike</code> to <code>any</code>. “If it can’t even figure this out, can it <em>really</em> be all that useful?”</p><p>Sadly, this relationship with the typechecker is all too common, and gradually-typed languages in particular tend to create a vicious cycle of frustration:</p><ul><li><p>Gradual type systems are intentionally designed to “just work” on idiomatic code as much as possible, so programmers may not think much about the types except when they get type errors.</p></li><li><p>Furthermore, many programmers using gradually-typed languages are already adept at programming in the underlying dynamically-typed language, so they have working mental models of program operation in terms of the dynamic semantics alone. 
They are much less likely to develop a rich mental model of the static semantics of the type system because they are used to reasoning without one.</p></li><li><p>Gradually typed languages must support idioms from their dynamically-typed heritage, so they often include ad-hoc special cases (such as special treatment of <code>typeof</code> checks) that obscure the rules the typechecker follows and make them seem semi-magical.</p></li><li><p>Builtin types are deeply blessed in the type system, strongly encouraging programmers to embrace their full flexibility, but leaving little recourse when they run up against their limits.</p></li><li><p>All this frustration breeds a readiness to override the typechecker using casts or <code>any</code>, which ultimately creates a self-fulfilling prophecy in which the typechecker rarely catches any interesting mistakes because it has been so routinely disabled.</p></li></ul><p>The end result of all of this is a defeatist attitude that views the typechecker as a minor tooling convenience at best (i.e. a fancy autocomplete provider) or an active impediment at worst. Who can really blame the programmers who hold it? The type system has (unintentionally, of course) been designed in a way that leads them into this dead end. The public perception of type systems settles into that of a strikingly literal nitpicker we endure rather than that of a tool we actively leverage.</p><h2><a name="taking-back-types"></a>Taking back types</h2><p>After everything I said above, it may be hard to imagine seeing types any other way. Indeed, through the lens of TypeScript, the “types are restrictions” mentality is incredibly natural, so much so that it seems self-evident. But let’s move away from TypeScript for a moment and focus on a different language, Haskell, which encourages a somewhat different perspective. 
If you aren’t familiar with Haskell, that’s alright—I’m going to try to keep the examples in this blog post as accessible as possible whether you’ve written any Haskell or not.</p><p>Though Haskell and TypeScript are both statically-typed—and both of their type systems are fairly sophisticated—Haskell’s type system is almost completely different philosophically:</p><ul><li><p>Haskell does not have subtyping,<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup> which means that every value belongs to exactly one type.</p></li><li><p>While JavaScript is built around a small handful of flexible builtin datatypes (booleans, numbers, strings, arrays, and objects), Haskell has essentially no blessed, built-in datatypes other than numbers. Key types such as booleans, lists, and tuples are ordinary datatypes defined in the standard library, no different from types users could define.<sup><a href="#footnote-3" id="footnote-ref-3-1">3</a></sup></p></li><li><p>In particular, Haskell is built around the idea that datatypes can be defined with multiple <em>cases</em>, and branching is done via pattern-matching (more on this shortly).</p></li></ul><p>Let’s look at a basic Haskell datatype declaration. Suppose we want to define a type that represents a season:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Season</span> <span class="ow">=</span> <span class="kt">Spring</span> <span class="o">|</span> <span class="kt">Summer</span> <span class="o">|</span> <span class="kt">Fall</span> <span class="o">|</span> <span class="kt">Winter</span></code></pre><p>If you are familiar with TypeScript, this may look rather similar to a union type; if you’re familiar with a C-family language, this may remind you more of an enum. 
Both are on the right track: this defines a new type named <code>Season</code> with four possible values, <code>Spring</code>, <code>Summer</code>, <code>Fall</code>, and <code>Winter</code>.</p><p>But what exactly <em>are</em> those values?</p><ul><li><p>In TypeScript, we’d represent this type with a union of strings, like this:</p><pre><code class="pygments"><span class="nx">type</span> <span class="nx">Season</span> <span class="o">=</span> <span class="s2">"spring"</span> <span class="o">|</span> <span class="s2">"summer"</span> <span class="o">|</span> <span class="s2">"fall"</span> <span class="o">|</span> <span class="s2">"winter"</span><span class="p">;</span></code></pre><p>Here, <code>Season</code> is a type that can be one of those four strings, but nothing else.</p></li><li><p>In C, we’d represent this type with an enum, like this:</p><pre><code class="pygments"><span class="k">enum</span> <span class="n">season</span> <span class="p">{</span> <span class="n">SPRING</span><span class="p">,</span> <span class="n">SUMMER</span><span class="p">,</span> <span class="n">FALL</span><span class="p">,</span> <span class="n">WINTER</span> <span class="p">};</span></code></pre><p>Here, <code>SPRING</code>, <code>SUMMER</code>, <code>FALL</code>, and <code>WINTER</code> are essentially defined to be global aliases for the integers <code>0</code>, <code>1</code>, <code>2</code>, and <code>3</code>, and the type <code>enum season</code> is essentially an alias for <code>int</code>.</p></li></ul><p>So in TypeScript, the values are strings, and in C, the values are numbers. What are they in Haskell? Well… they simply <em>are</em>.</p><p>The Haskell declaration invents four completely new constants out of thin air, <code>Spring</code>, <code>Summer</code>, <code>Fall</code>, and <code>Winter</code>. They aren’t aliases for numbers, nor are they symbols or strings. 
The compiler doesn’t expose anything about how it chooses to represent these values at runtime; that’s an implementation detail. In Haskell, <code>Spring</code> is now a value <em>distinct from all other values</em>, even if someone in a different module were to also use the name <code>Spring</code>. Haskell type declarations let us play god, creating something from nothing.</p><p>Since these values are totally unique, abstract constants, what can we actually do with them? The answer is one thing and <em>exactly</em> one thing: we can branch on them. For example, we can write a function that takes a <code>Season</code> as an argument and returns whether or not Christmas occurs during it:</p><pre><code class="pygments"><span class="nf">containsChristmas</span> <span class="ow">::</span> <span class="kt">Season</span> <span class="ow">-></span> <span class="kt">Bool</span>
<span class="nf">containsChristmas</span> <span class="n">season</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">season</span> <span class="kr">of</span>
  <span class="kt">Spring</span> <span class="ow">-></span> <span class="kt">False</span>
  <span class="kt">Summer</span> <span class="ow">-></span> <span class="kt">True</span> <span class="c1">-- southern hemisphere</span>
  <span class="kt">Fall</span> <span class="ow">-></span> <span class="kt">False</span>
  <span class="kt">Winter</span> <span class="ow">-></span> <span class="kt">True</span> <span class="c1">-- northern hemisphere</span></code></pre><p><code>case</code> expressions are, to a first approximation, a lot like C-style <code>switch</code> statements (though they can do a lot more than this simple example suggests). Using <code>case</code>, we can also define conversions from our totally unique <code>Season</code> constants to other types, if we want:</p><pre><code class="pygments"><span class="nf">seasonToString</span> <span class="ow">::</span> <span class="kt">Season</span> <span class="ow">-></span> <span class="kt">String</span>
<span class="nf">seasonToString</span> <span class="n">season</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">season</span> <span class="kr">of</span>
  <span class="kt">Spring</span> <span class="ow">-></span> <span class="s">"spring"</span>
  <span class="kt">Summer</span> <span class="ow">-></span> <span class="s">"summer"</span>
  <span class="kt">Fall</span> <span class="ow">-></span> <span class="s">"fall"</span>
  <span class="kt">Winter</span> <span class="ow">-></span> <span class="s">"winter"</span></code></pre><p>We can also go the other way around, converting a <code>String</code> to a <code>Season</code>, but if we try, we run into a problem: what do we return for a string like, say, <code>"cheesecake"</code>? In other languages, we might throw an error or return <code>null</code>, but Haskell does not have <code>null</code>, and errors are generally reserved for truly catastrophic failures. What can we do instead?</p><p>A particularly naïve solution would be to create a type called <code>MaybeASeason</code> that has two cases: it can be a valid <code>Season</code>, or it can be <code>NotASeason</code>:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">MaybeASeason</span> <span class="ow">=</span> <span class="kt">IsASeason</span> <span class="kt">Season</span> <span class="o">|</span> <span class="kt">NotASeason</span>
<span class="nf">stringToSeason</span> <span class="ow">::</span> <span class="kt">String</span> <span class="ow">-></span> <span class="kt">MaybeASeason</span>
<span class="nf">stringToSeason</span> <span class="n">seasonString</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">seasonString</span> <span class="kr">of</span>
  <span class="s">"spring"</span> <span class="ow">-></span> <span class="kt">IsASeason</span> <span class="kt">Spring</span>
  <span class="s">"summer"</span> <span class="ow">-></span> <span class="kt">IsASeason</span> <span class="kt">Summer</span>
  <span class="s">"fall"</span> <span class="ow">-></span> <span class="kt">IsASeason</span> <span class="kt">Fall</span>
  <span class="s">"winter"</span> <span class="ow">-></span> <span class="kt">IsASeason</span> <span class="kt">Winter</span>
  <span class="kr">_</span> <span class="ow">-></span> <span class="kt">NotASeason</span></code></pre><p>This shows a feature of Haskell datatypes that C-style enums do <em>not</em> have: they aren’t just constants, they can contain other values. A <code>MaybeASeason</code> can be one of five different values: <code>IsASeason Spring</code>, <code>IsASeason Summer</code>, <code>IsASeason Fall</code>, <code>IsASeason Winter</code>, or <code>NotASeason</code>.</p><p>In TypeScript, we’d write <code>MaybeASeason</code> more like this:</p><pre><code class="pygments"><span class="nx">type</span> <span class="nx">MaybeASeason</span> <span class="o">=</span> <span class="nx">Season</span> <span class="o">|</span> <span class="s2">"not-a-season"</span><span class="p">;</span></code></pre><p>This is kind of nice, because we don’t have to wrap all our <code>Season</code> values with <code>IsASeason</code> like we have to do in Haskell. But remember that Haskell doesn’t have subtyping (every value must belong to exactly one type), so the Haskell code needs the <code>IsASeason</code> wrapper to distinguish the value as a <code>MaybeASeason</code> rather than a <code>Season</code>.</p><p>Now, you may rightly point out that having to invent a type like <code>MaybeASeason</code> every time we need to create a variant of a type with a failure case is absurd, so fortunately we can define a type like <code>MaybeASeason</code> that works for <em>any</em> underlying type. In Haskell, it looks like this:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Maybe</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="n">a</span> <span class="o">|</span> <span class="kt">Nothing</span></code></pre><p>This defines a generic type, where the <code>a</code> in <code>Maybe a</code> is a stand-in for some other type, much like the <code>T</code> in <code>Array<T></code> in other languages. 
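</p><p>To sketch what that genericity buys us, here is the same <code>Maybe</code> wrapped around two types that have nothing to do with seasons. The functions <code>safeDivide</code> and <code>parseAnswer</code> are hypothetical helpers made up for this illustration, not standard library functions:</p><pre><code class="pygments">-- safeDivide and parseAnswer are hypothetical, for illustration only.

-- Maybe Int: division that reports failure instead of crashing.
safeDivide :: Int -> Int -> Maybe Int
safeDivide numerator denominator = case denominator of
  0 -> Nothing
  _ -> Just (numerator `div` denominator)

-- Maybe Bool: a yes/no parser written in the same style as stringToSeason.
parseAnswer :: String -> Maybe Bool
parseAnswer answerString = case answerString of
  "yes" -> Just True
  "no" -> Just False
  _ -> Nothing

main :: IO ()
main = do
  print (safeDivide 10 2)          -- prints: Just 5
  print (safeDivide 10 0)          -- prints: Nothing
  print (parseAnswer "yes")        -- prints: Just True
  print (parseAnswer "cheesecake") -- prints: Nothing</code></pre><p>Nothing about <code>Maybe</code> itself changed between these two functions; only the type filled in for <code>a</code> did.</p><p>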
We can change our <code>stringToSeason</code> function to use <code>Maybe</code>:</p><pre><code class="pygments"><span class="nf">stringToSeason</span> <span class="ow">::</span> <span class="kt">String</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="kt">Season</span>
<span class="nf">stringToSeason</span> <span class="n">seasonString</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">seasonString</span> <span class="kr">of</span>
  <span class="s">"spring"</span> <span class="ow">-></span> <span class="kt">Just</span> <span class="kt">Spring</span>
  <span class="s">"summer"</span> <span class="ow">-></span> <span class="kt">Just</span> <span class="kt">Summer</span>
  <span class="s">"fall"</span> <span class="ow">-></span> <span class="kt">Just</span> <span class="kt">Fall</span>
  <span class="s">"winter"</span> <span class="ow">-></span> <span class="kt">Just</span> <span class="kt">Winter</span>
  <span class="kr">_</span> <span class="ow">-></span> <span class="kt">Nothing</span></code></pre><p><code>Maybe</code> gets us something a lot like nullable types, but it isn’t built into the type system, it’s just an ordinary type defined in the standard library.</p><h3><a name="positive-versus-negative-space"></a>Positive versus negative space</h3><p>At this point, you may be wondering to yourself why I am talking about all of this, seeing as everything in the previous section is information you could find in a basic Haskell tutorial. But the point of this blog post is not to teach you Haskell, it’s to focus on a particular philosophical approach to modeling data.</p><p>In TypeScript, when we write a type declaration like</p><pre><code class="pygments"><span class="nx">type</span> <span class="nx">Season</span> <span class="o">=</span> <span class="s2">"summer"</span> <span class="o">|</span> <span class="s2">"spring"</span> <span class="o">|</span> <span class="s2">"fall"</span> <span class="o">|</span> <span class="s2">"winter"</span><span class="p">;</span></code></pre><p>we are defining a type that can be one of those four strings <em>and nothing else</em>. All the other strings that <em>aren’t</em> one of those four make up <code>Season</code>’s “negative space”: values that exist, but that we have intentionally excluded. In contrast, the Haskell type does not really have any “negative space” because we pulled four new values out of thin air.</p><p>Of course, I suspect you don’t really buy this argument. What makes a string like <code>"cheesecake"</code> “negative space” in TypeScript but not in Haskell? Well… nothing, really. The distinction I’m drawing here doesn’t really exist, it’s just a different perspective, and arguably a totally contrived and arbitrary one. 
But now that I’ve explained the premise and set up some context, let me provide a more compelling example.</p><p>Suppose you are writing a TypeScript program, and you want a function that only accepts <em>non-empty</em> arrays. What can you do? Your first instinct is that you need a way to somehow further restrict the function’s input type to exclude empty arrays. And indeed, there <em>is</em> a trick for doing that:</p><pre><code class="pygments"><span class="nx">type</span> <span class="nx">NonEmptyArray</span><span class="o"><</span><span class="nx">T</span><span class="o">></span> <span class="o">=</span> <span class="p">[</span><span class="nx">T</span><span class="p">,</span> <span class="p">...</span><span class="nx">T</span><span class="p">[]];</span></code></pre><p>Great! But what if the constraint was more complicated: what if you needed an array containing an even number of elements? Unfortunately, there isn’t really a trick for that one. At this point, you might start wishing the type system had support for something really fancy, like refinement types, so you could write something like this:</p><pre><code class="pygments"><span class="nx">type</span> <span class="nx">EvenArray</span><span class="o"><</span><span class="nx">T</span><span class="o">></span> <span class="o">=</span> <span class="nx">T</span><span class="p">[]</span> <span class="nx">satisfies</span> <span class="p">(</span><span class="nx">arr</span> <span class="o">=></span> <span class="nx">arr</span><span class="p">.</span><span class="nx">length</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">===</span> <span class="mi">0</span><span class="p">);</span></code></pre><p>But TypeScript doesn’t support anything like that, so for now you’re stuck. 
You need a way to restrict the function’s domain in a way the type system does not have any special support for, so your conclusion might be “I guess the type system just can’t do this.” People tend to call this “running up against the limits of the type system.”</p><p>But what if we took a different perspective? Recall that in Haskell, lists aren’t built-in datatypes, they’re just ordinary datatypes defined in the standard library:<sup><a href="#footnote-4" id="footnote-ref-4-1">4</a></sup></p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">List</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">Nil</span> <span class="o">|</span> <span class="kt">Cons</span> <span class="n">a</span> <span class="p">(</span><span class="kt">List</span> <span class="n">a</span><span class="p">)</span></code></pre><p>This type might be a bit confusing at first if you have not written any Haskell, since it’s <em>recursive</em>. All of these are valid values of type <code>List Int</code>:</p><ul><li><p><code>Nil</code></p></li><li><p><code>Cons 1 Nil</code></p></li><li><p><code>Cons 1 (Cons 2 Nil)</code></p></li><li><p><code>Cons 1 (Cons 2 (Cons 3 Nil))</code></p></li></ul><p>The recursive nature of <code>Cons</code> is what gives our user-defined datatype the ability to hold any number of values: we can have any number of nested <code>Cons</code>es we want before we terminate the list with a final <code>Nil</code>.</p><p>If we wanted to define an <code>EvenList</code> type in Haskell, we might end up thinking along the same lines we did before, that we need some fancy type system extension so we can restrict <code>List</code> to exclude lists with odd numbers of elements. But that’s focusing on the negative space of things we want to exclude… what if instead, we focused on the <em>positive</em> space of things we want to <em>include?</em></p><p>What do I mean by that? 
Well, we could define an entirely new type that’s just like <code>List</code>, but we make it <em>impossible</em> to ever include an odd number of elements:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">EvenList</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">EvenNil</span> <span class="o">|</span> <span class="kt">EvenCons</span> <span class="n">a</span> <span class="n">a</span> <span class="p">(</span><span class="kt">EvenList</span> <span class="n">a</span><span class="p">)</span></code></pre><p>Here are some valid values of type <code>EvenList Int</code>:</p><ul><li><p><code>EvenNil</code></p></li><li><p><code>EvenCons 1 2 EvenNil</code></p></li><li><p><code>EvenCons 1 2 (EvenCons 3 4 EvenNil)</code></p></li></ul><p>Lo and behold, a datatype that can only ever include even numbers of elements!</p><p>Now, at this point you might realize that this is kind of silly. We don’t need to invent an entirely new datatype for this! We could just create a list of pairs:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">EvenList</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">List</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">a</span><span class="p">)</span></code></pre><p>Now values like <code>Cons (1, 2) (Cons (3, 4) Nil)</code> would be valid values of type <code>EvenList Int</code>, and we wouldn’t have to reinvent lists. But again, this is an approach based on thinking not on which values we want to exclude, but rather how to structure our data such that those illegal values aren’t even <em>constructible.</em></p><p><strong>This is the essence of the Haskeller’s mantra, “Make illegal states unrepresentable,”</strong> and sadly it is often misinterpreted. 
It’s much easier to think “hm, I want to make these states illegal, how can I add some post-hoc restrictions to rule them out?” And indeed, this is why refinement types really <em>are</em> awesome, and when they’re available, by all means use them! But checking totally arbitrary properties at the type level is not tractable in general, and sometimes you need to think a little more outside the box.</p><h3><a name="types-as-axiom-schemas"></a>Types as axiom schemas</h3><p>So far in this blog post, I’ve repeatedly touched upon a handful of different ideas in a few different ways:</p><ul><li><p>Instead of thinking about how to <em>restrict</em>, it can be useful to think about how to <em>correctly construct</em>.</p></li><li><p>In Haskell, datatype declarations invent new values out of thin air.</p></li><li><p>We can represent a <em>lot</em> of different data structures using the incredibly simple framework of “datatypes with several possibilities.”</p></li></ul><p>Independently, those ideas might not seem deeply related, but in fact, they’re all essential to the Haskell school of data modeling. I want to now explore how we can unify them into a single framework that makes this seem less magical and more like an iterative design process.</p><p>In Haskell, when you define a datatype, you’re really defining a new, self-contained set of <em>axioms</em> and <em>inference rules.</em> That is rather abstract, so let’s make it more concrete. 
Consider the <code>List</code> type again:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">List</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">Nil</span> <span class="o">|</span> <span class="kt">Cons</span> <span class="n">a</span> <span class="p">(</span><span class="kt">List</span> <span class="n">a</span><span class="p">)</span></code></pre><p>Viewed as an axiom schema, this type has one axiom and one inference rule:</p><ul><li><p>The empty list is a list.</p></li><li><p>If you have a list, and you add an element to the beginning, the result is also a list.</p></li></ul><p>The axiom is <code>Nil</code>, and the inference rule is <code>Cons</code>. Every list<sup><a href="#footnote-5" id="footnote-ref-5-1">5</a></sup> is constructed by starting with the axiom, <code>Nil</code>, followed by some number of applications of the inference rule, <code>Cons</code>.</p><p>We can take a similar approach when designing the <code>EvenList</code> type. The axiom is the same:</p><ul><li><p>The empty list is a list with an even number of elements.</p></li></ul><p>But our inference rule must preserve the invariant that the list always contains an even number of elements. 
We can do this by always adding two elements at a time:</p><ul><li><p>If you have a list with an even number of elements, and you add two elements to the beginning, the result is also a list with an even number of elements.</p></li></ul><p>This corresponds precisely to our <code>EvenList</code> declaration:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">EvenList</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">EvenNil</span> <span class="o">|</span> <span class="kt">EvenCons</span> <span class="n">a</span> <span class="n">a</span> <span class="p">(</span><span class="kt">EvenList</span> <span class="n">a</span><span class="p">)</span></code></pre><p>We can also go through this same reasoning process to come up with a type that represents non-empty lists. That type has just one inference rule:</p><ul><li><p>If you have a list, and you add an element to the beginning, the result is a non-empty list.</p></li></ul><p>That inference rule corresponds to the following datatype:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">NonEmptyList</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">NonEmptyCons</span> <span class="n">a</span> <span class="p">(</span><span class="kt">List</span> <span class="n">a</span><span class="p">)</span></code></pre><p>Of course, it’s possible to do this with much more than just lists. A particularly classic example is the constructive definition of natural numbers:</p><ul><li><p>Zero is a natural number.</p></li><li><p>If you have a natural number, its successor (i.e. 
that number plus one) is also a natural number.</p></li></ul><p>These are two of the <a href="https://en.wikipedia.org/wiki/Peano_axioms">Peano axioms</a>, which can be represented in Haskell as the following datatype:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Natural</span> <span class="ow">=</span> <span class="kt">Zero</span> <span class="o">|</span> <span class="kt">Succ</span> <span class="kt">Natural</span></code></pre><p>Using this type, <code>Zero</code> represents 0, <code>Succ Zero</code> represents 1, <code>Succ (Succ Zero)</code> represents 2, and so on. Just as <code>EvenList</code> allowed us to represent any list with an even number of elements but made other values impossible to even express, this <code>Natural</code> type allows us to represent all natural numbers, while other numbers (such as, for example, negative integers) are impossible to express.</p><p>Now, of course, all this hinges on our interpretation of the values we’ve invented! We have chosen to interpret <code>Zero</code> as <code>0</code> and <code>Succ n</code> as <code>n + 1</code>, but that interpretation is not inherent to <code>Natural</code>’s definition—it’s all in our heads! We could choose to interpret <code>Succ n</code> as <code>n - 1</code> instead, in which case we would only be able to represent non-positive integers, or we could interpret <code>Zero</code> as <code>1</code> and <code>Succ n</code> as <code>n * 2</code>, in which case we could only represent powers of two.</p><p>I find that people sometimes find this approach troubling, or at least counterintuitive. Is <code>Succ (Succ Zero)</code> <em>really</em> 2? It certainly doesn’t look like a number we’re used to writing. When someone thinks “I need a datatype for a number greater than or equal to zero,” they’re going to reach for the type in their programming language called <code>number</code> or <code>int</code>, not think to invent a recursive datatype. 
And admittedly, the <code>Natural</code> type defined here is not very practical: it’s an incredibly inefficient representation of natural numbers.</p><p>But in less contrived situations, this approach <em>is</em> practical, and in fact it’s highly useful! The quibble that an <code>EvenList Int</code> isn’t “really” a <code>List Int</code> is rather meaningless, seeing as our definition of <code>List</code> was just as arbitrary. A great deal of our jobs as programmers is imbuing arbitrary symbols with meaning; at some point someone decided that the number 65 would correspond to the capital letter A, and it was no less arbitrary then.</p><p>So when you have a property you want to capture in your types, take a step back and think about it for a little bit. Is there a way you can structure your data so that, no matter how you build it, the result is always a valid value? In other words, don’t try to add post-hoc restrictions to exclude bad values, <strong>make your datatypes correct by construction</strong>.</p><h2><a name="but-what-if-i-don-t-write-haskell-and-other-closing-thoughts"></a>“But what if I don’t write Haskell?” And other closing thoughts</h2><p>I write Haskell for a living, and I wrote this blog post with both my coworkers and the broader Haskell community in mind, but if I had <em>only</em> written it with those people in mind, it wouldn’t make sense to have spent so much time explaining basic Haskell. These techniques can be used in almost any statically typed programming language, though it’s certainly easier in some than others.</p><p>I don’t want people to come away from this blog post with an impression that I think TypeScript is a bad language, or that I’m claiming Haskell can do things TypeScript can’t. In fact, TypeScript <em>can</em> do all the things I’ve talked about in this blog post! 
As proof, here are TypeScript definitions of both <code>EvenList</code> and <code>Natural</code>:</p><pre><code class="pygments"><span class="nx">type</span> <span class="nx">EvenList</span><span class="o"><</span><span class="nx">T</span><span class="o">></span> <span class="o">=</span> <span class="p">[]</span> <span class="o">|</span> <span class="p">[</span><span class="nx">T</span><span class="p">,</span> <span class="nx">T</span><span class="p">,</span> <span class="nx">EvenList</span><span class="o"><</span><span class="nx">T</span><span class="o">></span><span class="p">];</span>
<span class="nx">type</span> <span class="nx">Natural</span> <span class="o">=</span> <span class="s2">"zero"</span> <span class="o">|</span> <span class="p">{</span> <span class="nx">succ</span>: <span class="kt">Natural</span> <span class="p">};</span></code></pre><p>If anything, <strong>the real point of this blog post is that a type system does not have a well-defined list of things it “can prove” and “can’t prove.”</strong> Languages like TypeScript don’t really encourage this approach to data modeling, where you restructure your values in a certain way so as to guarantee certain properties. Rather, they prefer to add increasingly sophisticated constraints and type system features that can capture the properties people want to capture without having to change their data representation.</p><p>And in general, <em>that’s great!</em></p><p>Being able to reuse the same data representation is <em>hugely</em> beneficial. Functions like <code>map</code> and <code>filter</code> already exist for ordinary lists/arrays, but a home-grown <code>EvenList</code> type needs its own versions. Passing an <code>EvenList</code> to a function that expects a list requires explicitly converting between the two. All these things have both code complexity and performance costs, and type system features that make these issues just invisibly disappear are <em>obviously</em> a good thing.</p><p>But the danger of treating the type system this way is that it means you may find yourself unsure what to do when suddenly you have a new requirement that the type system doesn’t provide built-in support for. What then? Do you start punching holes through your type system? The more you do that, the less useful the type system becomes: type systems are great at detecting how changes in one part of a codebase can impact seemingly-unrelated areas in surprising ways, but every unsafe cast or use of <code>any</code> is a hard stop, a point past which the typechecker cannot propagate information. 
Do that once or twice in a leaf function, it’s okay, but do that even just a half dozen times in your application’s connective tissue, and your type system might not be able to catch those things anymore.</p><p>Even if it isn’t a technique you use every day, it’s worth getting comfortable tweaking your data representation to preserve those guarantees. It’s a magical experience having the typechecker teach you things about your domain you hadn’t even considered simply because you got a type error and started thinking through why. Yes, it’s extra work, but trust me: it’s a lot more pleasant to work for your typechecker when you know exactly how much your typechecker is working for you.</p><ol class="footnotes"><li id="footnote-1"><p>Sort of. TypeScript will try to infer type annotations based on how variables and functions are used, but by default, it falls back on the dynamic, unchecked <code>any</code> type if it can’t find a solution that makes the program typecheck. That behavior can be changed via a configuration option, but that isn’t relevant here: I’m just trying to illustrate a perspective, not make any kind of value judgment about TypeScript specifically. <a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>Sort of. Haskell does have a limited notion of subtyping when polymorphism is involved; for example, the type <code>forall a. a -> a</code> is a subtype of the type <code>Int -> Int</code>. But Haskell does not have anything resembling inheritance (e.g. there is no common <code>Number</code> supertype that includes both <code>Int</code> and <code>Double</code>) nor does it have untagged unions (e.g. the argument to a function cannot be something like <code>Int | String</code>, you must define a wrapper type like <code>data IntOrString = AnInt Int | AString String</code>). 
<a href="#footnote-ref-2-1">↩</a></p></li><li id="footnote-3"><p>Lists, tuples, and strings do technically have special <em>syntax</em>, which is built into the compiler, but there is truly nothing special about their semantics. They would work exactly the same way without the syntax, the code would just look less pretty. <a href="#footnote-ref-3-1">↩</a></p></li><li id="footnote-4"><p>Haskell programmers will notice that this is not actually the definition of the list type, since the real list type uses special syntax, but I wanted to keep things as simple as possible for this blog post. <a href="#footnote-ref-4-1">↩</a></p></li><li id="footnote-5"><p>Ignoring infinite lists, but the fact that infinite lists are representable in Haskell is outside the scope of this blog post. <a href="#footnote-ref-5-1">↩</a></p></li></ol></article>No, dynamic type systems are not inherently more open2020-01-19T00:00:00Z2020-01-19T00:00:00ZAlexis King<article><p>Internet debates about typing disciplines continue to be plagued by a pervasive myth that dynamic type systems are inherently better at modeling “open world” domains. The argument usually goes like this: the goal of static typing is to pin everything down as much as possible, but in the real world, that just isn’t practical. Real systems should be loosely coupled and worry about data representation as little as possible, so dynamic types lead to a more robust system in the large.</p><p>This story sounds compelling, but it isn’t true. The flaw is in the premise: static types are <em>not</em> about “classifying the world” or pinning down the structure of every value in a system. The reality is that static type systems allow specifying exactly how much a component needs to know about the structure of its inputs, and conversely, how much it doesn’t. 
Indeed, in practice static type systems excel at processing data with only a partially-known structure, as they can be used to ensure application logic doesn’t accidentally assume too much.</p><h2><a name="two-typing-fallacies"></a>Two typing fallacies</h2><p>I’ve wanted to write this blog post for a while, but what finally made me decide to do it were misinformed comments responding to <a href="/blog/2019/11/05/parse-don-t-validate/">my previous blog post</a>. Two comments in particular caught my eye, <a href="https://www.reddit.com/r/programming/comments/dt0w63/parse_dont_validate/f6ulpsy/">the first of which was posted on /r/programming</a>:</p><blockquote><p>Strongly disagree with the post […] it promotes a fundamentally entangled and static view of the world. It assumes that we can or should theorize about what is "valid" input at the edge between the program and the world, thus introducing a strong sense of coupling through the entire software, where failure to conform to some schema will automatically crash the program.</p><p>This is touted as a feature here but imagine if the internet worked like this. A server changes their JSON output, and we need to recompile and reprogram the entire internet. This is the static view that is promoted as a feature here. […] The "parser mentality" is fundamentally rigid and global, whereas robust system design should be decentralised and leave interpretation of data to the receiver.</p></blockquote><p>Given the argument being made in the blog post—that you should use precise types whenever possible—one can see where this misinterpretation comes from. How could a proxy server possibly be written in such a style, since it cannot anticipate the structure of its payloads? 
The commenter’s conclusion is that strict static typing is at odds with programs that don’t know the structure of their inputs ahead of time.</p><p><a href="https://news.ycombinator.com/item?id=21479933">The second comment was left on Hacker News</a>, and it is significantly shorter than the first one:</p><blockquote><p>What would be the type signature of, say, Python's <code>pickle.load()</code>?</p></blockquote><p>This is a different kind of argument, one that relies on the fact that the types of reflective operations may depend on runtime values, which makes them challenging to capture with static types. This argument suggests that static types limit expressiveness because they forbid such operations outright.</p><p>Both these arguments are fallacious, but in order to show why, we have to make explicit an implicit claim. The two comments focus primarily on illustrating how static type systems can’t process data of an unknown shape, but they simultaneously advance an implicit belief: that dynamically typed languages <em>can</em> process data of an unknown shape. As we’ll see, this belief is misguided; programs are not capable of processing data of a truly unknown shape regardless of typing discipline, and static type systems only make already-present assumptions explicit.</p><h2><a name="you-can-t-process-what-you-don-t-know"></a>You can’t process what you don’t know</h2><p>The claim is simple: in a static type system, you must declare the shape of data ahead of time, but in a dynamic type system, the type can be, well, dynamic! It sounds self-evident, so much so that Rich Hickey has practically built a speaking career upon its emotional appeal. The only problem is it isn’t true.</p><p>The hypothetical scenario usually goes like this. Say you have a distributed system, and services in the system emit events that can be consumed by any other service that might need them. 
Each event is accompanied by a payload, which listening services can use to inform further action. The payload itself is minimally-structured, schemaless data encoded using a generic interchange format such as JSON or <a href="https://github.com/edn-format/edn">EDN</a>.</p><p>As a simple example, a login service might emit an event like this one whenever a new user signs up:</p><pre><code class="pygments"><span class="p">{</span>
  <span class="nt">"event_type"</span><span class="p">:</span> <span class="s2">"signup"</span><span class="p">,</span>
  <span class="nt">"timestamp"</span><span class="p">:</span> <span class="s2">"2020-01-19T05:37:09Z"</span><span class="p">,</span>
  <span class="nt">"data"</span><span class="p">:</span> <span class="p">{</span>
    <span class="nt">"user"</span><span class="p">:</span> <span class="p">{</span>
      <span class="nt">"id"</span><span class="p">:</span> <span class="mi">42</span><span class="p">,</span>
      <span class="nt">"name"</span><span class="p">:</span> <span class="s2">"Alyssa"</span><span class="p">,</span>
      <span class="nt">"email"</span><span class="p">:</span> <span class="s2">"alyssa@example.com"</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span></code></pre><p>Some downstream services might listen for these <code>signup</code> events and take further action whenever they are emitted. For example, a transactional email service might send a welcome email whenever a new user signs up. If the service were written in JavaScript, the handler might look something like this:</p><pre><code class="pygments"><span class="kr">const</span> <span class="nx">handleEvent</span> <span class="o">=</span> <span class="p">({</span> <span class="nx">event_type</span><span class="p">,</span> <span class="nx">data</span> <span class="p">})</span> <span class="p">=></span> <span class="p">{</span>
  <span class="k">switch</span> <span class="p">(</span><span class="nx">event_type</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">case</span> <span class="s1">'login'</span><span class="o">:</span>
      <span class="cm">/* ... */</span>
      <span class="k">break</span>
    <span class="k">case</span> <span class="s1">'signup'</span><span class="o">:</span>
      <span class="nx">sendEmail</span><span class="p">(</span><span class="nx">data</span><span class="p">.</span><span class="nx">user</span><span class="p">.</span><span class="nx">email</span><span class="p">,</span> <span class="sb">`Welcome to Blockchain Emporium, </span><span class="si">${</span><span class="nx">data</span><span class="p">.</span><span class="nx">user</span><span class="p">.</span><span class="nx">name</span><span class="si">}</span><span class="sb">!`</span><span class="p">)</span>
      <span class="k">break</span>
  <span class="p">}</span>
<span class="p">}</span></code></pre><p>But what if this service were written in Haskell instead? Being good, reality-fearing Haskell programmers who <a href="/blog/2019/11/05/parse-don-t-validate/">parse, not validate</a>, the Haskell code might look something like this, instead:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Event</span> <span class="ow">=</span> <span class="kt">Login</span> <span class="kt">LoginPayload</span> <span class="o">|</span> <span class="kt">Signup</span> <span class="kt">SignupPayload</span>
<span class="kr">data</span> <span class="kt">LoginPayload</span> <span class="ow">=</span> <span class="kt">LoginPayload</span> <span class="p">{</span> <span class="n">userId</span> <span class="ow">::</span> <span class="kt">Int</span> <span class="p">}</span>
<span class="kr">data</span> <span class="kt">SignupPayload</span> <span class="ow">=</span> <span class="kt">SignupPayload</span>
  <span class="p">{</span> <span class="n">userId</span> <span class="ow">::</span> <span class="kt">Int</span>
  <span class="p">,</span> <span class="n">userName</span> <span class="ow">::</span> <span class="kt">Text</span>
  <span class="p">,</span> <span class="n">userEmail</span> <span class="ow">::</span> <span class="kt">Text</span> <span class="p">}</span>
<span class="kr">instance</span> <span class="kt">FromJSON</span> <span class="kt">Event</span> <span class="kr">where</span>
  <span class="n">parseJSON</span> <span class="ow">=</span> <span class="n">withObject</span> <span class="s">"Event"</span> <span class="nf">\</span><span class="n">obj</span> <span class="ow">-></span> <span class="kr">do</span>
    <span class="n">eventType</span> <span class="ow"><-</span> <span class="n">obj</span> <span class="o">.:</span> <span class="s">"event_type"</span>
    <span class="kr">case</span> <span class="n">eventType</span> <span class="kr">of</span>
      <span class="s">"login"</span> <span class="ow">-></span> <span class="kt">Login</span> <span class="o"><$></span> <span class="p">(</span><span class="n">obj</span> <span class="o">.:</span> <span class="s">"data"</span><span class="p">)</span>
      <span class="s">"signup"</span> <span class="ow">-></span> <span class="kt">Signup</span> <span class="o"><$></span> <span class="p">(</span><span class="n">obj</span> <span class="o">.:</span> <span class="s">"data"</span><span class="p">)</span>
      <span class="kr">_</span> <span class="ow">-></span> <span class="n">fail</span> <span class="o">$</span> <span class="s">"unknown event_type: "</span> <span class="o"><></span> <span class="n">eventType</span>
<span class="kr">instance</span> <span class="kt">FromJSON</span> <span class="kt">LoginPayload</span> <span class="kr">where</span> <span class="p">{</span> <span class="o">...</span> <span class="p">}</span>
<span class="kr">instance</span> <span class="kt">FromJSON</span> <span class="kt">SignupPayload</span> <span class="kr">where</span> <span class="p">{</span> <span class="o">...</span> <span class="p">}</span>
<span class="nf">handleEvent</span> <span class="ow">::</span> <span class="kt">JSON</span><span class="o">.</span><span class="kt">Value</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">handleEvent</span> <span class="n">payload</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">fromJSON</span> <span class="n">payload</span> <span class="kr">of</span>
  <span class="kt">Success</span> <span class="p">(</span><span class="kt">Login</span> <span class="kt">LoginPayload</span> <span class="p">{</span> <span class="n">userId</span> <span class="p">})</span> <span class="ow">-></span> <span class="cm">{- ... -}</span>
  <span class="kt">Success</span> <span class="p">(</span><span class="kt">Signup</span> <span class="kt">SignupPayload</span> <span class="p">{</span> <span class="n">userName</span><span class="p">,</span> <span class="n">userEmail</span> <span class="p">})</span> <span class="ow">-></span>
    <span class="n">sendEmail</span> <span class="n">userEmail</span> <span class="o">$</span> <span class="s">"Welcome to Blockchain Emporium, "</span> <span class="o"><></span> <span class="n">userName</span> <span class="o"><></span> <span class="s">"!"</span>
  <span class="kt">Error</span> <span class="n">message</span> <span class="ow">-></span> <span class="n">fail</span> <span class="o">$</span> <span class="s">"could not parse event: "</span> <span class="o"><></span> <span class="n">message</span></code></pre><p>It’s definitely more boilerplate, but some extra overhead for type definitions is to be expected (and is greatly exaggerated in such tiny examples), and the arguments we’re discussing aren’t about boilerplate, anyway. The <em>real</em> problem with this version of the code, according to the Reddit comment from earlier, is that the Haskell code has to be updated whenever a service adds a new event type! A new case has to be added to the <code>Event</code> datatype, and it must be given new parsing logic. And what about when new fields get added to the payload? What a maintenance nightmare.</p><p>In comparison, the JavaScript code is much more permissive. If a new event type is added, it will just fall through the <code>switch</code> and do nothing. If extra fields are added to the payload, the JavaScript code will just ignore them. Seems like a win for dynamic typing.</p><p>Except that no, it isn’t. The only reason the statically typed program fails if we don’t update the <code>Event</code> type is that we wrote <code>handleEvent</code> that way. We could just as easily have done the same thing in the JavaScript code, adding a default case that rejects unknown event types:</p><pre><code class="pygments"><span class="kr">const</span> <span class="nx">handleEvent</span> <span class="o">=</span> <span class="p">({</span> <span class="nx">event_type</span><span class="p">,</span> <span class="nx">data</span> <span class="p">})</span> <span class="p">=></span> <span class="p">{</span>
  <span class="k">switch</span> <span class="p">(</span><span class="nx">event_type</span><span class="p">)</span> <span class="p">{</span>
    <span class="cm">/* ... */</span>
    <span class="k">default</span><span class="o">:</span>
      <span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span><span class="p">(</span><span class="sb">`unknown event_type: </span><span class="si">${</span><span class="nx">event_type</span><span class="si">}</span><span class="sb">`</span><span class="p">)</span>
  <span class="p">}</span>
<span class="p">}</span></code></pre><p>We didn’t do that, since in this case it would clearly be silly. If a service receives an event it doesn’t know about, it should just ignore it. This is a case where being permissive is clearly the correct behavior, and we can easily implement that in the Haskell code too:</p><pre><code class="pygments"><span class="nf">handleEvent</span> <span class="ow">::</span> <span class="kt">JSON</span><span class="o">.</span><span class="kt">Value</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">handleEvent</span> <span class="n">payload</span> <span class="ow">=</span> <span class="kr">case</span> <span class="n">fromJSON</span> <span class="n">payload</span> <span class="kr">of</span>
  <span class="cm">{- ... -}</span>
  <span class="kt">Error</span> <span class="kr">_</span> <span class="ow">-></span> <span class="n">pure</span> <span class="nb">()</span></code></pre><p>This is still in the spirit of “parse, don’t validate” because we’re still parsing the values we <em>do</em> care about as early as possible, so we don’t fall into the double-validation trap. At no point do we take a code path that depends on a value being well-formed without first ensuring (with the help of the type system) that it is, in fact, actually well-formed. We don’t have to respond to an ill-formed value by raising an error! We just have to be explicit about ignoring it.</p><p>This illustrates an important point: the <code>Event</code> type in this Haskell code doesn’t describe “all possible events,” it describes all the events that the application cares about. Likewise, the code that parses those events’ payloads only worries about the fields the application needs, and it ignores extraneous ones. A static type system doesn’t require you to eagerly write a schema for the whole universe; it simply requires you to be up front about the things you need.</p><p>This turns out to have a lot of pleasant benefits even though knowledge about inputs is limited:</p><ul><li><p>It’s easy to discover the assumptions of the Haskell program just by looking at the type definitions. We know, for example, that this application doesn’t care about the <code>timestamp</code> field, since it never appears in any of the payload types. In the dynamically-typed program, we’d have to audit every code path to see whether or not it inspects that field, which would be a lot of error-prone work!</p></li><li><p>What’s more, it turns out the Haskell code doesn’t actually <em>use</em> the <code>userId</code> field inside the <code>SignupPayload</code> type, so that type is overly conservative. 
If we want to ensure it isn’t actually needed (since, for example, maybe we’re phasing out providing the user ID in that payload entirely), we need only delete that record field; if the code typechecks, we can be confident it really doesn’t depend on that field.</p></li><li><p>Finally, we neatly avoid all the gotchas related to shotgun parsing <a href="/blog/2019/11/05/parse-don-t-validate/#the-danger-of-validation">mentioned in the previous blog post</a>, since we still haven’t compromised on any of those principles.</p></li></ul><p>We’ve already invalidated the first half of the claim: that statically typed languages can’t deal with data where the structure isn’t completely known. Let’s now look at the other half, which states that dynamically typed languages can process data where the structure isn’t known at all. Maybe that still sounds right, but if you slow down and think about it more carefully, you’ll find it can’t be.</p><p>The above JavaScript code makes all the same assumptions our Haskell code does: it assumes event payloads are JSON objects with an <code>event_type</code> field, and it assumes <code>signup</code> payloads include <code>data.user.name</code> and <code>data.user.email</code> fields. It certainly can’t do anything useful with truly unknown input! If a new event payload is added, our JavaScript code can’t magically adapt to handle it simply because it is dynamically typed. 
Dynamic typing just means the types of values are carried alongside them at runtime and checked as the program executes; the types are still there, and this program still implicitly relies on them being particular things.</p><h2><a name="keeping-opaque-data-opaque"></a>Keeping opaque data opaque</h2><p>In the previous section, we debunked the idea that statically typed systems can’t process partially-known data, but if you have been paying close attention, you may have noticed it did not fully refute the original claim.</p><p>Although we were able to handle unknown data, we always simply discarded it, which would not fly if we were trying to implement some sort of proxying. For example, suppose we have a forwarding service that broadcasts events over a public network, attaching a signature to each payload to ensure it can’t be spoofed. We might implement this in JavaScript this way:</p><pre><code class="pygments"><span class="kr">const</span> <span class="nx">handleEvent</span> <span class="o">=</span> <span class="p">(</span><span class="nx">payload</span><span class="p">)</span> <span class="p">=></span> <span class="p">{</span>
  <span class="kr">const</span> <span class="nx">signedPayload</span> <span class="o">=</span> <span class="p">{</span> <span class="p">...</span><span class="nx">payload</span><span class="p">,</span> <span class="nx">signature</span><span class="o">:</span> <span class="nx">signature</span><span class="p">(</span><span class="nx">payload</span><span class="p">)</span> <span class="p">}</span>
  <span class="nx">retransmitEvent</span><span class="p">(</span><span class="nx">signedPayload</span><span class="p">)</span>
<span class="p">}</span></code></pre><p>In this case, we don’t care about the structure of the payload at all (the <code>signature</code> function just works on any valid JSON object), but we still have to preserve all the information. How could we do that in a statically typed language, since a statically-typed language would have to assign the payload a precise type?</p><p>Once again, the answer involves rejecting the premise: there’s no need to give data a type that’s any more precise than the application needs. The same logic could be written in a straightforward way in Haskell:</p><pre><code class="pygments"><span class="nf">handleEvent</span> <span class="ow">::</span> <span class="kt">JSON</span><span class="o">.</span><span class="kt">Value</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">handleEvent</span> <span class="p">(</span><span class="kt">Object</span> <span class="n">payload</span><span class="p">)</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="kr">let</span> <span class="n">signedPayload</span> <span class="ow">=</span> <span class="kt">Map</span><span class="o">.</span><span class="n">insert</span> <span class="s">"signature"</span> <span class="p">(</span><span class="n">signature</span> <span class="n">payload</span><span class="p">)</span> <span class="n">payload</span>
  <span class="n">retransmitEvent</span> <span class="n">signedPayload</span>
<span class="nf">handleEvent</span> <span class="n">payload</span> <span class="ow">=</span> <span class="n">fail</span> <span class="o">$</span> <span class="s">"event payload was not an object "</span> <span class="o"><></span> <span class="n">show</span> <span class="n">payload</span></code></pre><p>In this case, since we don’t care about the structure of the payload, we manipulate a value of type <code>JSON.Value</code> directly. This type is extremely imprecise compared to our <code>Event</code> type from earlier—it can hold any legal JSON value, of any shape—but in this case, we <em>want</em> it to be imprecise.</p><p>Thanks to that imprecision, the type system helped us here: it caught the fact that we’re assuming the payload is a JSON object, not some other JSON value, and it made us handle the non-object cases explicitly. In this case we chose to raise an error, but of course, as before, you could choose some other form of recovery if you wanted to. You just have to be explicit about it.</p><p>Once more, note that the assumption we were forced to make explicit in Haskell is <em>also</em> made by the JavaScript code! If our JavaScript <code>handleEvent</code> function were called with a string rather than an object, it’s unlikely the behavior would be desirable, since an object spread on a string results in the following surprise:</p><pre><code class="pygments"><span class="o">></span> <span class="p">{</span> <span class="p">...</span><span class="s2">"payload"</span><span class="p">,</span> <span class="nx">signature</span><span class="o">:</span> <span class="s2">"sig"</span> <span class="p">}</span>
<span class="p">{</span><span class="mi">0</span><span class="o">:</span> <span class="s2">"p"</span><span class="p">,</span> <span class="mi">1</span><span class="o">:</span> <span class="s2">"a"</span><span class="p">,</span> <span class="mi">2</span><span class="o">:</span> <span class="s2">"y"</span><span class="p">,</span> <span class="mi">3</span><span class="o">:</span> <span class="s2">"l"</span><span class="p">,</span> <span class="mi">4</span><span class="o">:</span> <span class="s2">"o"</span><span class="p">,</span> <span class="mi">5</span><span class="o">:</span> <span class="s2">"a"</span><span class="p">,</span> <span class="mi">6</span><span class="o">:</span> <span class="s2">"d"</span><span class="p">,</span> <span class="nx">signature</span><span class="o">:</span> <span class="s2">"sig"</span><span class="p">}</span></code></pre><p>Oops. Once again, the parsing style of programming has helped us out, since if we didn’t “parse” the JSON value into an object by matching on the <code>Object</code> case explicitly, our code would not compile, and if we left off the fallthrough case, we’d get a warning about inexhaustive patterns.</p><hr/><p>Let’s look at one more example of this phenomenon before moving on. Suppose we’re consuming an API that returns user IDs, and suppose those IDs happen to be UUIDs. A straightforward interpretation of “parse, don’t validate” might suggest we represent user IDs in our Haskell API client using a <code>UUID</code> type:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">UserId</span> <span class="ow">=</span> <span class="kt">UUID</span></code></pre><p>However, our Reddit commenter would likely take umbrage with this! Unless the API contract explicitly states that all user IDs will be UUIDs, this representation is overstepping our bounds. Although user IDs might be UUIDs today, perhaps they won’t be tomorrow, and then our code would break for no reason! 
Is this the fault of static type systems?</p><p>Again, the answer is no. This is a case of improper data modeling, but the static type system is not at fault—it has simply been misused. The appropriate way to represent a <code>UserId</code> is to define a new, opaque type:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">UserId</span> <span class="ow">=</span> <span class="kt">UserId</span> <span class="kt">Text</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Eq</span><span class="p">,</span> <span class="kt">FromJSON</span><span class="p">,</span> <span class="kt">ToJSON</span><span class="p">)</span></code></pre><p>Unlike the type alias defined above, which simply creates a new name for the existing <code>UUID</code> type, this declaration creates a totally new <code>UserId</code> type that is distinct from all other types, including <code>Text</code>. If we keep the datatype’s constructor private (that is, we don’t export it from the module that defines this type), then the <em>only</em> way to produce a <code>UserId</code> will be to go through its <code>FromJSON</code> parser. Dually, the only things you can do with a <code>UserId</code> are compare it with other <code>UserId</code>s for equality or serialize it using the <code>ToJSON</code> instance. Nothing else is permitted: the type system will prevent you from depending on the remote service’s internal representation of user IDs.</p><p>This illustrates another way that static type systems can provide strong, useful guarantees when manipulating completely opaque data. The runtime representation of a <code>UserId</code> is really just a string, but the type system does not allow you to accidentally use it like it’s a string, nor does it allow you to forge a new <code>UserId</code> out of thin air from an arbitrary string.<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup></p><p>The type system is not a ball and chain forcing you to describe the representation of every value that enters and leaves your program in exquisite detail. Rather, it’s a tool that you can use in whatever way best suits your needs.</p><h2><a name="reflection-is-not-special"></a>Reflection is not special</h2><p>We’ve now thoroughly debunked the claims made by the first commenter, but the question posed by the second commenter may still seem like a loophole in our logic. 
What <em>is</em> the type of Python’s <code>pickle.load()</code>? For those unfamiliar, <a href="https://docs.python.org/3/library/pickle.html">Python’s cutely-named <code>pickle</code> library</a> allows serializing and deserializing entire Python object graphs. Any object can be serialized and stored in a file using <code>pickle.dump()</code>, and it can be deserialized at a later point in time using <code>pickle.load()</code>.</p><p>What makes this appear challenging to our static type system is that the type of value produced by <code>pickle.load()</code> is difficult to predict—it depends entirely on whatever happened to be written to that file using <code>pickle.dump()</code>. This seems inherently dynamic, since we cannot possibly know what type of value it will produce at compile-time. At first blush, this is something a dynamically typed system can pull off, but a statically-typed one just can’t.</p><p>However, it turns out this situation is actually identical to the previous examples using JSON, and the fact that Python’s pickling serializes native Python objects directly does not change things. Why? Well, consider what happens <em>after</em> a program calls <code>pickle.load()</code>. Say you write the following function:</p><pre><code class="pygments"><span class="k">def</span> <span class="nf">load_value</span><span class="p">(</span><span class="n">f</span><span class="p">):</span>
    <span class="n">val</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
<span class="c1"># do something with `val`</span></code></pre><p>The trouble is that <code>val</code> can now be of <em>any</em> type, and just as you can’t do anything useful with truly unknown, unstructured input, you can’t do anything with a value unless you know at least something about it. If you call any method or access any field on the result, then you’ve already made an assumption about what sort of thing <code>pickle.load(f)</code> returned—and it turns out those assumptions <em>are</em> <code>val</code>’s type!</p><p>For example, imagine the only thing you do with <code>val</code> is call the <code>val.foo()</code> method and return its result, which is expected to be a string. If we were writing Java, then the expected type of <code>val</code> would be quite straightforward—we’d expect it to be an instance of the following interface:</p><pre><code class="pygments"><span class="kd">interface</span> <span class="nc">Foo</span> <span class="kd">extends</span> <span class="n">Serializable</span> <span class="o">{</span>
  <span class="n">String</span> <span class="nf">foo</span><span class="o">();</span>
<span class="o">}</span></code></pre><p>And indeed, it turns out a <code>pickle.load()</code>-like function can be given a perfectly reasonable type in Java:</p><pre><code class="pygments"><span class="kd">static</span> <span class="o"><</span><span class="n">T</span> <span class="kd">extends</span> <span class="n">Serializable</span><span class="o">></span> <span class="n">Optional</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="nf">load</span><span class="o">(</span><span class="n">InputStream</span> <span class="n">in</span><span class="o">,</span> <span class="n">Class</span><span class="o"><?</span> <span class="kd">extends</span> <span class="n">T</span><span class="o">></span> <span class="n">cls</span><span class="o">);</span></code></pre><p>Nitpickers will complain that this isn’t the same as <code>pickle.load()</code>, since you have to pass a <code>Class<T></code> token to choose what type of thing you want ahead of time. However, nothing is stopping you from passing <code>Serializable.class</code> and branching on the type later, after the object has been loaded. And that’s the key point: the instant you do <em>anything</em> with the object, you must know something about its type, even in a dynamically typed language! The statically-typed language just forces you to be more explicit about it, just as it did when we were talking about JSON payloads.</p><hr/><p>Can we do this in Haskell, too? Absolutely—we can use <a href="https://hackage.haskell.org/package/serialise">the <code>serialise</code> library</a>, which has a similar API to the Java one mentioned above. 
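</p><p>As a rough illustration, here is a minimal sketch of a <code>pickle.load()</code>-style loader built on <code>serialise</code>. The <code>Greeting</code> type, the <code>load</code> helper, and the <code>example</code> value are hypothetical names invented for this sketch, and it assumes the <code>Serialise</code> class, <code>serialise</code>, and <code>deserialiseOrFail</code> exported by <code>Codec.Serialise</code>:</p><pre><code class="pygments">{-# LANGUAGE DeriveGeneric #-}

import Codec.Serialise (Serialise, deserialiseOrFail, serialise)
import qualified Data.ByteString.Lazy as BL
import GHC.Generics (Generic)

-- Hypothetical type standing in for whatever the program expects to load.
data Greeting = Greeting String
  deriving (Eq, Show, Generic)

-- The encoder and decoder come from the class's Generic-based defaults.
instance Serialise Greeting

-- Decode a value of whichever type the caller asks for, returning
-- Nothing when the bytes do not describe a value of that type.
load :: Serialise a => BL.ByteString -> Maybe a
load bytes = either (const Nothing) Just (deserialiseOrFail bytes)

-- Round-trip a value through its serialised form; the type signature
-- is what tells `load` which decoder to run.
example :: Maybe Greeting
example = load (serialise (Greeting "hello"))</code></pre><p>Under those assumptions, <code>example</code> should hand back the original <code>Greeting</code>, and the type annotation plays the same role for <code>serialise</code> that the class token plays in the Java signature.</p><p>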
It also happens to have a very similar interface to <a href="https://hackage.haskell.org/package/aeson">the Haskell JSON library, aeson</a>, as it turns out the problem of dealing with unknown JSON data is not terribly different from dealing with an unknown Haskell value—at some point, you have to do a little bit of parsing to do anything with the value.</p><p>That said, while you <em>can</em> emulate the dynamic typing of <code>pickle.load()</code> if you really want to by deferring the type check until the last possible moment, the reality is that doing so is almost never actually useful. At some point, you have to make assumptions about the structure of the value in order to use it, and you know what those assumptions are because <em>you wrote the code</em>. While there are extremely rare exceptions to this that require true dynamic code loading (such as, say, implementing a REPL for your programming language), they do not occur in day-to-day programming, and programmers in statically-typed languages are perfectly happy to supply their assumptions up front.</p><p>This is one of the fundamental disconnects between the static typing camp and the dynamic typing camp. Programmers working in statically-typed languages are perplexed when a programmer suggests they can do something in a dynamically typed language that a statically-typed language “fundamentally” prevents, since a programmer in a statically-typed language may reply the value has simply not been given a sufficiently precise type. From the perspective of a programmer working in a dynamically-typed language, the type system restricts the space of legal behaviors, but from the perspective of a programmer working in a statically-typed language, the set of legal behaviors <em>is</em> a value’s type.</p><p>Neither of these perspectives are actually inaccurate, from the appropriate point of view. 
Static type systems <em>do</em> impose restrictions on program structure, as it is provably impossible to reject <em>all</em> bad programs in a Turing-complete language without also rejecting some good ones (this is <a href="https://en.wikipedia.org/wiki/Rice's_theorem">Rice’s theorem</a>). But it is simultaneously true that the impossibility of solving the general problem does not preclude solving a slightly more restricted version of the problem in a useful way, and a lot of the so-called “fundamental” inabilities of static type systems are not fundamental at all.</p><h2><a name="appendix-the-reality-behind-the-myths"></a>Appendix: the reality behind the myths</h2><p>The key thesis of this blog post has now been delivered: static type systems are not fundamentally worse than dynamic type systems at processing data with an open or partially-known structure. The sorts of claims made in the comments cited at the beginning of this blog post are not accurate depictions of what statically-typed program construction is like, and they misunderstand the limitations of static typing disciplines while exaggerating the capabilities of dynamically typed disciplines.</p><p>However, although greatly exaggerated, these myths do have some basis in reality. They appear to have developed at least in part from a misunderstanding about the differences between structural and nominal typing. This difference is unfortunately too big to address in this blog post, as it could likely fill several blog posts of its own. About six months ago I attempted to write a blog post on the subject, but I didn’t think it came out very compelling, so I scrapped it. Maybe someday I’ll find a better way to communicate the ideas.</p><p>Although I can’t give it the full treatment it deserves right now, I’d still like to touch on the idea briefly so that interested readers may be able to find other resources on the subject should they wish to do so. 
The key idea is that many dynamically typed languages idiomatically reuse simple data structures like hashmaps to represent what in statically-typed languages are often represented by bespoke datatypes (usually defined as classes or structs).</p><p>These two styles facilitate very different flavors of programming. A JavaScript or Clojure program may represent a record as a hashmap from string or symbol keys to values, written using object or hash literals and manipulated using ordinary functions from the standard library that manipulate keys and values in a generic way. This makes it straightforward to take two records and union their fields or to take an arbitrary (or even dynamic) subselection of fields from an existing record.</p><p>In contrast, most static type systems do not allow such free-form manipulation of records because records are not maps at all but unique types distinct from all other types. These types are uniquely identified by their (fully-qualified) name, hence the term <em>nominal typing</em>. If you wish to take a subselection of a struct’s fields, you must define an entirely new struct; doing this often creates an explosion of awkward boilerplate.</p><p>This is one of the main ideas that Rich Hickey has discussed in many of his talks that criticize static typing. He has advanced the idea that this ability to fluidly merge, separate, and transform records makes dynamic typing particularly suited to the domain of distributed, open systems. Unfortunately, this rhetoric has two significant flaws:</p><ol><li><p>It skirts too close to calling this a fundamental limitation of type systems, suggesting that it is not simply inconvenient but <em>impossible</em> to model such systems in a nominal, static type system. 
Not only is this not true (as this blog post has demonstrated), it misdirects people away from the point of his that actually has value: the practical, pragmatic advantage of a more structural approach to data modeling.</p></li><li><p>It confuses the structural/nominal distinction with the dynamic/static distinction, incorrectly creating the impression that the fluid merging and splitting of records represented as key-value maps is only possible in a dynamically typed language. In fact, not only can statically-typed languages support structural typing, many dynamically-typed languages also support nominal typing. These axes have historically loosely correlated, but they are theoretically orthogonal.</p></li></ol><p>For counterexamples to these claims, consider Python classes, which are quite nominal despite being dynamic, and TypeScript interfaces, which are structural despite being static. Indeed, modern statically-typed languages are increasingly acquiring native support for structurally-typed records. In these systems, record types work much like hashes in Clojure—they are not distinct, named types but rather anonymous collections of key-value pairs—and they support many of the same expressive manipulation operations that Clojure’s hashes do, all within a statically-typed framework.</p><p>If you are interested in exploring static type systems with strong support for structural typing, I would recommend taking a look at any of TypeScript, Flow, PureScript, Elm, OCaml, or Reason, all of which have some sort of support for structurally typed records. What I would <em>not</em> recommend for this purpose is Haskell, which has abysmal support for structural typing; Haskell is (for various reasons outside the scope of this blog post) aggressively nominal.<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup></p><p>Does this mean Haskell is bad, or that it cannot be practically used to solve these kinds of problems? 
No, certainly not; there are many ways to model these problems in Haskell that work well enough, though some of them suffer from significant boilerplate. The core thesis of this blog post applies just as much to Haskell as it does to any of the other languages I mentioned above. However, I would be remiss not to mention this distinction, as it may give programmers from a dynamically-typed background who have historically found statically-typed languages much more frustrating to work with a better understanding of the <em>real</em> reason they feel that way. (Essentially all mainstream, statically-typed OOP languages are even more nominal than Haskell!)</p><p>As closing thoughts: this blog post is not intended to start a flame war, nor is it intended to be an assault on dynamically typed programming. There are many patterns in dynamically-typed languages that are genuinely difficult to translate into a statically-typed context, and I think discussions of those patterns can be productive. The purpose of this blog post is to clarify why one particular discussion is <em>not</em> productive, so please: stop making these arguments. There are much more productive conversations to have about typing than this.</p><ol class="footnotes"><li id="footnote-1"><p>Technically, you could abuse the <code>FromJSON</code> instance to convert an arbitrary string to a <code>UserId</code>, but this would not be as easy as it sounds, since <code>fromJSON</code> can fail. This means you’d somehow have to handle that failure case, so this trick would be unlikely to get you very far unless you’re already in a context where you’re doing input parsing… at which point it would be easier to just do the right thing. 
So yes, the type system doesn’t prevent you from going out of your way to shoot yourself in the foot, but it guides you towards the right solution (and there is no safeguard in existence that can completely protect a programmer from making their own life miserable if they are determined to do so). <a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>I consider this to be Haskell’s most significant flaw at the time of this writing. <a href="#footnote-ref-2-1">↩</a></p></li></ol></article>Parse, don’t validate2019-11-05T00:00:00Z2019-11-05T00:00:00ZAlexis King<article><p>Historically, I’ve struggled to find a concise, simple way to explain what it means to practice type-driven design. Too often, when someone asks me “How did you come up with this approach?” I find I can’t give them a satisfying answer. I know it didn’t just come to me in a vision—I have an iterative design process that doesn’t require plucking the “right” approach out of thin air—yet I haven’t been very successful in communicating that process to others.</p><p>However, about a month ago, <a href="https://twitter.com/lexi_lambda/status/1182242561655746560">I was reflecting on Twitter</a> about the differences I experienced parsing JSON in statically- and dynamically-typed languages, and finally, I realized what I was looking for. Now I have a single, snappy slogan that encapsulates what type-driven design means to me, and better yet, it’s only three words long:</p><div style="text-align: center; font-size: larger"><strong>Parse, don’t validate.</strong></div><h2><a name="the-essence-of-type-driven-design"></a>The essence of type-driven design</h2><p>Alright, I’ll confess: unless you already know what type-driven design is, my catchy slogan probably doesn’t mean all that much to you. Fortunately, that’s what the remainder of this blog post is for. 
I’m going to explain precisely what I mean in gory detail—but first, we need to practice a little wishful thinking.</p><h3><a name="the-realm-of-possibility"></a>The realm of possibility</h3><p>One of the wonderful things about static type systems is that they can make it possible, and sometimes even easy, to answer questions like “is it possible to write this function?” For an extreme example, consider the following Haskell type signature:</p><pre><code class="pygments"><span class="nf">foo</span> <span class="ow">::</span> <span class="kt">Integer</span> <span class="ow">-></span> <span class="kt">Void</span></code></pre><p>Is it possible to implement <code>foo</code>? Trivially, the answer is <em>no</em>, as <code>Void</code> is a type that contains no values, so it’s impossible for <em>any</em> function to produce a value of type <code>Void</code>.<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup> That example is pretty boring, but the question gets much more interesting if we choose a more realistic example:</p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="n">a</span></code></pre><p>This function returns the first element from a list. Is it possible to implement? It certainly doesn’t sound like it does anything very complicated, but if we attempt to implement it, the compiler won’t be satisfied:</p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">head</span> <span class="p">(</span><span class="n">x</span><span class="kt">:</span><span class="kr">_</span><span class="p">)</span> <span class="ow">=</span> <span class="n">x</span></code></pre><pre><code>warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In an equation for ‘head’: Patterns not matched: []
</code></pre><p>This message is helpfully pointing out that our function is <em>partial</em>, which is to say it is not defined for all possible inputs. Specifically, it is not defined when the input is <code>[]</code>, the empty list. This makes sense, as it isn’t possible to return the first element of a list if the list is empty—there’s no element to return! So, remarkably, we learn this function isn’t possible to implement, either.</p><h3><a name="turning-partial-functions-total"></a>Turning partial functions total</h3><p>To someone coming from a dynamically-typed background, this might seem perplexing. If we have a list, we might very well want to get the first element in it. And indeed, the operation of “getting the first element of a list” isn’t impossible in Haskell, it just requires a little extra ceremony. There are two different ways to fix the <code>head</code> function, and we’ll start with the simplest one.</p><h4><a name="managing-expectations"></a>Managing expectations</h4><p>As established, <code>head</code> is partial because there is no element to return if the list is empty: we’ve made a promise we cannot possibly fulfill. Fortunately, there’s an easy solution to that dilemma: we can weaken our promise. Since we cannot guarantee the caller an element of the list, we’ll have to practice a little expectation management: we’ll do our best to return an element if we can, but we reserve the right to return nothing at all. 
In Haskell, we express this possibility using the <code>Maybe</code> type:</p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="n">a</span></code></pre><p>This buys us the freedom we need to implement <code>head</code>—it allows us to return <code>Nothing</code> when we discover we can’t produce a value of type <code>a</code> after all:</p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="n">a</span>
<span class="nf">head</span> <span class="p">(</span><span class="n">x</span><span class="kt">:</span><span class="kr">_</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">Just</span> <span class="n">x</span>
<span class="nf">head</span> <span class="kt">[]</span> <span class="ow">=</span> <span class="kt">Nothing</span></code></pre><p>Problem solved, right? For the moment, yes… but this solution has a hidden cost.</p><p>Returning <code>Maybe</code> is undoubtedly convenient when we’re <em>implementing</em> <code>head</code>. However, it becomes significantly less convenient when we want to actually use it! Since <code>head</code> always has the potential to return <code>Nothing</code>, the burden falls upon its callers to handle that possibility, and sometimes that passing of the buck can be incredibly frustrating. To see why, consider the following code:</p><pre><code class="pygments"><span class="nf">getConfigurationDirectories</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="p">[</span><span class="kt">FilePath</span><span class="p">]</span>
<span class="nf">getConfigurationDirectories</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">configDirsString</span> <span class="ow"><-</span> <span class="n">getEnv</span> <span class="s">"CONFIG_DIRS"</span>
  <span class="kr">let</span> <span class="n">configDirsList</span> <span class="ow">=</span> <span class="n">split</span> <span class="sc">','</span> <span class="n">configDirsString</span>
  <span class="n">when</span> <span class="p">(</span><span class="n">null</span> <span class="n">configDirsList</span><span class="p">)</span> <span class="o">$</span>
    <span class="n">throwIO</span> <span class="o">$</span> <span class="n">userError</span> <span class="s">"CONFIG_DIRS cannot be empty"</span>
  <span class="n">pure</span> <span class="n">configDirsList</span>
<span class="nf">main</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">main</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">configDirs</span> <span class="ow"><-</span> <span class="n">getConfigurationDirectories</span>
  <span class="kr">case</span> <span class="n">head</span> <span class="n">configDirs</span> <span class="kr">of</span>
    <span class="kt">Just</span> <span class="n">cacheDir</span> <span class="ow">-></span> <span class="n">initializeCache</span> <span class="n">cacheDir</span>
    <span class="kt">Nothing</span> <span class="ow">-></span> <span class="ne">error</span> <span class="s">"should never happen; already checked configDirs is non-empty"</span></code></pre><p>When <code>getConfigurationDirectories</code> retrieves a list of file paths from the environment, it proactively checks that the list is non-empty. However, when we use <code>head</code> in <code>main</code> to get the first element of the list, the <code>Maybe FilePath</code> result still requires us to handle a <code>Nothing</code> case that we know will never happen! This is terribly bad for several reasons:</p><ol><li><p>First, it’s just annoying. We already checked that the list is non-empty, so why do we have to clutter our code with another redundant check?</p></li><li><p>Second, it has a potential performance cost. Although the cost of the redundant check is trivial in this particular example, one could imagine a more complex scenario where the redundant checks could add up, such as if they were happening in a tight loop.</p></li><li><p>Finally, and worst of all, this code is a bug waiting to happen! What if <code>getConfigurationDirectories</code> were modified to stop checking that the list is non-empty, intentionally or unintentionally? The programmer might not remember to update <code>main</code>, and suddenly the “impossible” error becomes not only possible, but probable.</p></li></ol><p>The need for this redundant check has essentially forced us to punch a hole in our type system. If we could statically <em>prove</em> the <code>Nothing</code> case impossible, then a modification to <code>getConfigurationDirectories</code> that stopped checking if the list was empty would invalidate the proof and trigger a compile-time failure. However, as written, we’re forced to rely on a test suite or manual inspection to catch the bug.</p><h4><a name="paying-it-forward"></a>Paying it forward</h4><p>Clearly, our modified version of <code>head</code> leaves some things to be desired. 
Somehow, we’d like it to be smarter: if we already checked that the list was non-empty, <code>head</code> should unconditionally return the first element without forcing us to handle the case we know is impossible. How can we do that?</p><p>Let’s look at the original (partial) type signature for <code>head</code> again:</p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="n">a</span></code></pre><p>The previous section illustrated that we can turn that partial type signature into a total one by weakening the promise made in the return type. However, since we don’t want to do that, there’s only one thing left that can be changed: the argument type (in this case, <code>[a]</code>). Instead of weakening the return type, we can <em>strengthen</em> the argument type, eliminating the possibility of <code>head</code> ever being called on an empty list in the first place.</p><p>To do this, we need a type that represents non-empty lists. Fortunately, the existing <code>NonEmpty</code> type from <code>Data.List.NonEmpty</code> is exactly that. It has the following definition:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">=</span> <span class="n">a</span> <span class="kt">:|</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span></code></pre><p>Note that <code>NonEmpty a</code> is really just a tuple of an <code>a</code> and an ordinary, possibly-empty <code>[a]</code>. This conveniently models a non-empty list by storing the first element of the list separately from the list’s tail: even if the <code>[a]</code> component is <code>[]</code>, the <code>a</code> component must always be present. 
This makes <code>head</code> completely trivial to implement:<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup></p><pre><code class="pygments"><span class="nf">head</span> <span class="ow">::</span> <span class="kt">NonEmpty</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">head</span> <span class="p">(</span><span class="n">x</span><span class="kt">:|</span><span class="kr">_</span><span class="p">)</span> <span class="ow">=</span> <span class="n">x</span></code></pre><p>Unlike before, GHC accepts this definition without complaint—this definition is <em>total</em>, not partial. We can update our program to use the new implementation:</p><pre><code class="pygments"><span class="nf">getConfigurationDirectories</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="kt">FilePath</span><span class="p">)</span>
<span class="nf">getConfigurationDirectories</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">configDirsString</span> <span class="ow"><-</span> <span class="n">getEnv</span> <span class="s">"CONFIG_DIRS"</span>
  <span class="kr">let</span> <span class="n">configDirsList</span> <span class="ow">=</span> <span class="n">split</span> <span class="sc">','</span> <span class="n">configDirsString</span>
  <span class="kr">case</span> <span class="n">nonEmpty</span> <span class="n">configDirsList</span> <span class="kr">of</span>
    <span class="kt">Just</span> <span class="n">nonEmptyConfigDirsList</span> <span class="ow">-></span> <span class="n">pure</span> <span class="n">nonEmptyConfigDirsList</span>
    <span class="kt">Nothing</span> <span class="ow">-></span> <span class="n">throwIO</span> <span class="o">$</span> <span class="n">userError</span> <span class="s">"CONFIG_DIRS cannot be empty"</span>
<span class="nf">main</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">main</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">configDirs</span> <span class="ow"><-</span> <span class="n">getConfigurationDirectories</span>
  <span class="n">initializeCache</span> <span class="p">(</span><span class="n">head</span> <span class="n">configDirs</span><span class="p">)</span></code></pre><p>Note that the redundant check in <code>main</code> is now completely gone! Instead, we perform the check exactly once, in <code>getConfigurationDirectories</code>. It constructs a <code>NonEmpty a</code> from a <code>[a]</code> using the <code>nonEmpty</code> function from <code>Data.List.NonEmpty</code>, which has the following type:</p><pre><code class="pygments"><span class="nf">nonEmpty</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="n">a</span><span class="p">)</span></code></pre><p>The <code>Maybe</code> is still there, but this time, we handle the <code>Nothing</code> case very early in our program: right in the same place we were already doing the input validation. Once that check has passed, we now have a <code>NonEmpty FilePath</code> value, which preserves (in the type system!) the knowledge that the list really is non-empty. Put another way, you can think of a value of type <code>NonEmpty a</code> as being like a value of type <code>[a]</code>, plus a <em>proof</em> that the list is non-empty.</p><p>By strengthening the type of the argument to <code>head</code> instead of weakening the type of its result, we’ve completely eliminated all the problems from the previous section:</p><ul><li><p>The code has no redundant checks, so there can’t be any performance overhead.</p></li><li><p>Furthermore, if <code>getConfigurationDirectories</code> changes to stop checking that the list is non-empty, its return type must change, too. 
Consequently, <code>main</code> will fail to typecheck, alerting us to the problem before we even run the program!</p></li></ul><p>What’s more, it’s trivial to recover the old behavior of <code>head</code> from the new one by composing <code>head</code> with <code>nonEmpty</code>:</p><pre><code class="pygments"><span class="nf">head'</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="n">a</span>
<span class="nf">head'</span> <span class="ow">=</span> <span class="n">fmap</span> <span class="n">head</span> <span class="o">.</span> <span class="n">nonEmpty</span></code></pre><p>Note that the inverse is <em>not</em> true: there is no way to obtain the new version of <code>head</code> from the old one. All in all, the second approach is superior on all axes.</p><h3><a name="the-power-of-parsing"></a>The power of parsing</h3><p>You may be wondering what the above example has to do with the title of this blog post. After all, we only examined two different ways to validate that a list was non-empty—no parsing in sight. That interpretation isn’t wrong, but I’d like to propose another perspective: in my mind, the difference between validation and parsing lies almost entirely in how information is preserved. Consider the following pair of functions:</p><pre><code class="pygments"><span class="nf">validateNonEmpty</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">validateNonEmpty</span> <span class="p">(</span><span class="kr">_</span><span class="kt">:</span><span class="kr">_</span><span class="p">)</span> <span class="ow">=</span> <span class="n">pure</span> <span class="nb">()</span>
<span class="nf">validateNonEmpty</span> <span class="kt">[]</span> <span class="ow">=</span> <span class="n">throwIO</span> <span class="o">$</span> <span class="n">userError</span> <span class="s">"list cannot be empty"</span>
<span class="nf">parseNonEmpty</span> <span class="ow">::</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="p">(</span><span class="kt">NonEmpty</span> <span class="n">a</span><span class="p">)</span>
<span class="nf">parseNonEmpty</span> <span class="p">(</span><span class="n">x</span><span class="kt">:</span><span class="n">xs</span><span class="p">)</span> <span class="ow">=</span> <span class="n">pure</span> <span class="p">(</span><span class="n">x</span><span class="kt">:|</span><span class="n">xs</span><span class="p">)</span>
<span class="nf">parseNonEmpty</span> <span class="kt">[]</span> <span class="ow">=</span> <span class="n">throwIO</span> <span class="o">$</span> <span class="n">userError</span> <span class="s">"list cannot be empty"</span></code></pre><p>These two functions are nearly identical: they check if the provided list is empty, and if it is, they abort the program with an error message. The difference lies entirely in the return type: <code>validateNonEmpty</code> always returns <code>()</code>, the type that contains no information, but <code>parseNonEmpty</code> returns <code>NonEmpty a</code>, a refinement of the input type that preserves the knowledge gained in the type system. Both of these functions check the same thing, but <code>parseNonEmpty</code> gives the caller access to the information it learned, while <code>validateNonEmpty</code> just throws it away.</p><p>These two functions elegantly illustrate two different perspectives on the role of a static type system: <code>validateNonEmpty</code> obeys the typechecker well enough, but only <code>parseNonEmpty</code> takes full advantage of it. If you see why <code>parseNonEmpty</code> is preferable, you understand what I mean by the mantra “parse, don’t validate.” Still, perhaps you are skeptical of <code>parseNonEmpty</code>’s name. Is it really <em>parsing</em> anything, or is it merely validating its input and returning a result? While the precise definition of what it means to parse or validate something is debatable, I believe <code>parseNonEmpty</code> is a bona-fide parser (albeit a particularly simple one).</p><p>Consider: what is a parser? Really, a parser is just a function that consumes less-structured input and produces more-structured output. By its very nature, a parser is a partial function—some values in the domain do not correspond to any value in the range—so all parsers must have some notion of failure. 
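</p><p>That notion of failure does not have to be an exception. As a minimal sketch, here is a pure variant of the function above in which failure is an ordinary value; the name <code>parseNonEmptyEither</code> and the choice of <code>Either String</code> are assumptions made for illustration, not something defined elsewhere in this post:</p><pre><code>import Data.List.NonEmpty (NonEmpty (..))

-- Hypothetical pure variant of parseNonEmpty: it performs the same check,
-- but reports failure as a Left value instead of throwing in IO.
parseNonEmptyEither :: [a] -> Either String (NonEmpty a)
parseNonEmptyEither (x : xs) = Right (x :| xs)
parseNonEmptyEither [] = Left "list cannot be empty"</code></pre><p>Either way, the result type records what the check learned, and the caller decides what to do when the input is rejected.</p><p>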
Often, the input to a parser is text, but this is by no means a requirement, and <code>parseNonEmpty</code> is a perfectly cromulent parser: it parses lists into non-empty lists, signaling failure by terminating the program with an error message.</p><p>Under this flexible definition, parsers are an incredibly powerful tool: they allow discharging checks on input up-front, right on the boundary between a program and the outside world, and once those checks have been performed, they never need to be checked again! Haskellers are well-aware of this power, and they use many different types of parsers on a regular basis:</p><ul><li><p>The <a href="https://hackage.haskell.org/package/aeson">aeson</a> library provides a <code>Parser</code> type that can be used to parse JSON data into domain types.</p></li><li><p>Likewise, <a href="https://hackage.haskell.org/package/optparse-applicative">optparse-applicative</a> provides a set of parser combinators for parsing command-line arguments.</p></li><li><p>Database libraries like <a href="https://hackage.haskell.org/package/persistent">persistent</a> and <a href="https://hackage.haskell.org/package/postgresql-simple">postgresql-simple</a> have a mechanism for parsing values held in an external data store.</p></li><li><p>The <a href="https://hackage.haskell.org/package/servant">servant</a> ecosystem is built around parsing Haskell datatypes from path components, query parameters, HTTP headers, and more.</p></li></ul><p>The common theme between all these libraries is that they sit on the boundary between your Haskell application and the external world. That world doesn’t speak in product and sum types, but in streams of bytes, so there’s no getting around a need to do some parsing. 
Doing that parsing up front, before acting on the data, can go a long way toward avoiding many classes of bugs, some of which might even be security vulnerabilities.</p><p>One drawback to this approach of parsing everything up front is that it sometimes requires values be parsed long before they are actually used. In a dynamically-typed language, this can make keeping the parsing and processing logic in sync a little tricky without extensive test coverage, much of which can be laborious to maintain. However, with a static type system, the problem becomes marvelously simple, as demonstrated by the <code>NonEmpty</code> example above: if the parsing and processing logic go out of sync, the program will fail to even compile.</p><h3><a name="the-danger-of-validation"></a>The danger of validation</h3><p>Hopefully, by this point, you are at least somewhat sold on the idea that parsing is preferable to validation, but you may have lingering doubts. Is validation really so bad if the type system is going to force you to do the necessary checks eventually anyway? Maybe the error reporting will be a little bit worse, but a bit of redundant checking can’t hurt, right?</p><p>Unfortunately, it isn’t so simple. Ad-hoc validation leads to a phenomenon that the <a href="http://langsec.org">language-theoretic security</a> field calls <em>shotgun parsing</em>. 
In the 2016 paper, <a href="http://langsec.org/papers/langsec-cwes-secdev2016.pdf">The Seven Turrets of Babel: A Taxonomy of LangSec Errors and How to Expunge Them</a>, its authors provide the following definition:</p><blockquote><p>Shotgun parsing is a programming antipattern whereby parsing and input-validating code is mixed with and spread across processing code—throwing a cloud of checks at the input, and hoping, without any systematic justification, that one or another would catch all the “bad” cases.</p></blockquote><p>They go on to explain the problems inherent to such validation techniques:</p><blockquote><p>Shotgun parsing necessarily deprives the program of the ability to reject invalid input instead of processing it. Late-discovered errors in an input stream will result in some portion of invalid input having been processed, with the consequence that program state is difficult to accurately predict.</p></blockquote><p>In other words, a program that does not parse all of its input up front runs the risk of acting upon a valid portion of the input, discovering a different portion is invalid, and suddenly needing to roll back whatever modifications it already executed in order to maintain consistency. Sometimes this is possible—such as rolling back a transaction in an RDBMS—but in general it may not be.</p><p>It may not be immediately apparent what shotgun parsing has to do with validation—after all, if you do all your validation up front, you mitigate the risk of shotgun parsing. The problem is that validation-based approaches make it extremely difficult or impossible to determine if everything was actually validated up front or if some of those so-called “impossible” cases might actually happen. 
The entire program must assume that raising an exception anywhere is not only possible, it’s regularly necessary.</p><p>Parsing avoids this problem by stratifying the program into two phases—parsing and execution—where failure due to invalid input can only happen in the first phase. The set of remaining failure modes during execution is minimal by comparison, and they can be handled with the tender care they require.</p><h2><a name="parsing-not-validating-in-practice"></a>Parsing, not validating, in practice</h2><p>So far, this blog post has been something of a sales pitch. “You, dear reader, ought to be parsing!” it says, and if I’ve done my job properly, at least some of you are sold. However, even if you understand the “what” and the “why,” you might not feel especially confident about the “how.”</p><p>My advice: focus on the datatypes.</p><p>Suppose you are writing a function that accepts a list of tuples representing key-value pairs, and you suddenly realize you aren’t sure what to do if the list has duplicate keys. One solution would be to write a function that asserts there aren’t any duplicates in the list:</p><pre><code class="pygments"><span class="nf">checkNoDuplicateKeys</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadError</span> <span class="kt">AppError</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Eq</span> <span class="n">k</span><span class="p">)</span> <span class="ow">=></span> <span class="p">[(</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="p">)]</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span></code></pre><p>However, this check is fragile: it’s extremely easy to forget. Because its return value is unused, it can always be omitted, and the code that needs it would still typecheck. A better solution is to choose a data structure that disallows duplicate keys by construction, such as a <code>Map</code>. 
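</p><p>As a rough sketch of how a list of pairs might be turned into such a <code>Map</code>, consider the helper below. The name <code>fromListNoDuplicates</code> and the use of <code>Either</code> to report the offending key are assumptions made for illustration, and the constraint is <code>Ord k</code> rather than <code>Eq k</code> because <code>Data.Map</code> orders its keys:</p><pre><code>import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Hypothetical helper: build a Map from key-value pairs, returning the first
-- repeated key as a Left instead of silently keeping only one of its values.
fromListNoDuplicates :: Ord k => [(k, v)] -> Either k (Map k v)
fromListNoDuplicates = go Map.empty
  where
    go acc [] = Right acc
    go acc ((k, v) : rest)
      | Map.member k acc = Left k
      | otherwise = go (Map.insert k v acc) rest</code></pre><p>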
Adjust your function’s type signature to accept a <code>Map</code> instead of a list of tuples, and implement it as you normally would.</p><p>Once you’ve done that, the call site of your new function will likely fail to typecheck, since it is still being passed a list of tuples. If the caller was given the value via one of its arguments, or if it received it from the result of some other function, you can continue updating the type from list to <code>Map</code>, all the way up the call chain. Eventually, you will either reach the location the value is created, or you’ll find a place where duplicates actually ought to be allowed. At that point, you can insert a call to a modified version of <code>checkNoDuplicateKeys</code>:</p><pre><code class="pygments"><span class="nf">checkNoDuplicateKeys</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadError</span> <span class="kt">AppError</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Ord</span> <span class="n">k</span><span class="p">)</span> <span class="ow">=></span> <span class="p">[(</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="p">)]</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Map</span> <span class="n">k</span> <span class="n">v</span><span class="p">)</span></code></pre><p>Now the check <em>cannot</em> be omitted, since its result is actually necessary for the program to proceed!</p><p>This hypothetical scenario highlights two simple ideas:</p><ol><li><p><strong>Use a data structure that makes illegal states unrepresentable.</strong> Model your data using the most precise data structure you reasonably can. If ruling out a particular possibility is too hard using the encoding you are currently using, consider alternate encodings that can express the property you care about more easily. 
Don’t be afraid to refactor.</p></li><li><p><strong>Push the burden of proof upward as far as possible, but no further.</strong> Get your data into the most precise representation you need as quickly as you can. Ideally, this should happen at the boundary of your system, before <em>any</em> of the data is acted upon.<sup><a href="#footnote-3" id="footnote-ref-3-1">3</a></sup></p><p>If one particular code branch eventually requires a more precise representation of a piece of data, parse the data into the more precise representation as soon as the branch is selected. Use sum types judiciously to allow your datatypes to reflect and adapt to control flow.</p></li></ol><p>In other words, write functions on the data representation you <em>wish</em> you had, not the data representation you are given. The design process then becomes an exercise in bridging the gap, often by working from both ends until they meet somewhere in the middle. Don’t be afraid to iteratively adjust parts of the design as you go, since you may learn something new during the refactoring process!</p><p>Here are a handful of additional points of advice, arranged in no particular order:</p><ul><li><p><strong>Let your datatypes inform your code, don’t let your code control your datatypes.</strong> Avoid the temptation to just stick a <code>Bool</code> in a record somewhere because it’s needed by the function you’re currently writing. 
Don’t be afraid to refactor code to use the right data representation—the type system will ensure you’ve covered all the places that need changing, and it will likely save you a headache later.</p></li><li><p><strong>Treat functions that return <code>m ()</code> with deep suspicion.</strong> Sometimes these are genuinely necessary, as they may perform an imperative effect with no meaningful result, but if the primary purpose of that effect is raising an error, it’s likely there’s a better way.</p></li><li><p><strong>Don’t be afraid to parse data in multiple passes.</strong> Avoiding shotgun parsing just means you shouldn’t act on the input data before it’s fully parsed, not that you can’t use some of the input data to decide how to parse other input data. Plenty of useful parsers are context-sensitive.</p></li><li><p><strong>Avoid denormalized representations of data, <em>especially</em> if it’s mutable.</strong> Duplicating the same data in multiple places introduces a trivially representable illegal state: the places getting out of sync. Strive for a single source of truth.</p><ul><li><p><strong>Keep denormalized representations of data behind abstraction boundaries.</strong> If denormalization is absolutely necessary, use encapsulation to ensure a small, trusted module holds sole responsibility for keeping the representations in sync.</p></li></ul></li><li><p><strong>Use abstract datatypes to make validators “look like” parsers.</strong> Sometimes, making an illegal state truly unrepresentable is just plain impractical given the tools Haskell provides, such as ensuring an integer is in a particular range. In that case, use an abstract <code>newtype</code> with a smart constructor to “fake” a parser from a validator.</p></li></ul><p>As always, use your best judgement. 
It probably isn’t worth breaking out <a href="https://hackage.haskell.org/package/singletons">singletons</a> and refactoring your entire application just to get rid of a single <code>error "impossible"</code> call somewhere—just make sure to treat those situations like the radioactive substance they are, and handle them with the appropriate care. If all else fails, at least leave a comment to document the invariant for whoever needs to modify the code next.</p><h2><a name="recap-reflection-and-related-reading"></a>Recap, reflection, and related reading</h2><p>That’s all, really. Hopefully this blog post proves that taking advantage of the Haskell type system doesn’t require a PhD, and it doesn’t even require using the latest and greatest of GHC’s shiny new language extensions—though they can certainly sometimes help! Sometimes the biggest obstacle to using Haskell to its fullest is simply being aware of what options are available, and unfortunately, one downside of Haskell’s small community is a relative dearth of resources that document design patterns and techniques that have become tribal knowledge.</p><p>None of the ideas in this blog post are new. In fact, the core idea—“write total functions”—is conceptually quite simple. Despite that, I find it remarkably challenging to communicate actionable, practicable details about the way I write Haskell code. It’s easy to spend lots of time talking about abstract concepts—many of which are quite valuable!—without communicating anything useful about <em>process</em>. My hope is that this is a small step in that direction.</p><p>Sadly, I don’t know very many other resources on this particular topic, but I do know of one: I never hesitate to recommend Matt Parsons’s fantastic blog post <a href="https://www.parsonsmatt.org/2017/10/11/type_safety_back_and_forth.html">Type Safety Back and Forth</a>. If you want another accessible perspective on these ideas, including another worked example, I’d highly encourage giving it a read. 
For a significantly more advanced take on many of these ideas, I can also recommend Matt Noonan’s 2018 paper <a href="https://kataskeue.com/gdp.pdf">Ghosts of Departed Proofs</a>, which outlines a handful of techniques for capturing more complex invariants in the type system than I have described here.</p><p>As a closing note, I want to say that doing the kind of refactoring described in this blog post is not always easy. The examples I’ve given are simple, but real life is often much less straightforward. Even for those experienced in type-driven design, it can be genuinely difficult to capture certain invariants in the type system, so do not consider it a personal failing if you cannot solve something the way you’d like! Consider the principles in this blog post ideals to strive for, not strict requirements to meet. All that matters is to try.</p><ol class="footnotes"><li id="footnote-1"><p>Technically, in Haskell, this ignores “bottoms,” constructions that can inhabit <em>any</em> type. These aren’t “real” values (unlike <code>null</code> in some other languages)—they’re things like infinite loops or computations that raise exceptions—and in idiomatic Haskell, we usually try to avoid them, so reasoning that pretends they don’t exist still has value. But don’t take my word for it—I’ll let Danielsson et al. convince you that <a href="https://www.cs.ox.ac.uk/jeremy.gibbons/publications/fast+loose.pdf">Fast and Loose Reasoning is Morally Correct</a>. <a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>In fact, <code>Data.List.NonEmpty</code> already provides a <code>head</code> function with this type, but just for the sake of illustration, we’ll reimplement it ourselves. 
<a href="#footnote-ref-2-1">↩</a></p></li><li id="footnote-3"><p>Sometimes it is necessary to perform some kind of authorization before parsing user input to avoid denial of service attacks, but that’s okay: authorization should have a relatively small surface area, and it shouldn’t cause any significant modifications to the state of your system. <a href="#footnote-ref-3-1">↩</a></p></li></ol></article>Empathy and subjective experience in programming languages2019-10-19T00:00:00Z2019-10-19T00:00:00ZAlexis King<article><p>A stereotype about programmers is that they like to think in black and white. Programmers like things to be good or bad, moral or immoral, responsible or irresponsible. Perhaps there is something romantic in the idea that programmers like to be as binary as the computers they program. Reductionist? Almost certainly, but hey, laugh at yourself a bit: we probably deserve to be made fun of from time to time.</p><p>Personally, I have no idea if the trope of the nuance-challenged programmer is accurate, but whether it’s a property of programmers or just humans behind a keyboard, the intensity with which we disagree with one another never ceases to amaze. Ask any group of working programmers what their least favorite programming language is, and there’s a pretty good chance things are going to get heated real fast. Why? What is it about programming that makes us feel so strongly that we are right and others are wrong, even when our experiences contradict those of tens or hundreds of thousands of others?</p><p>I think about that question a lot.</p><h2><a name="2015-called-and-they-want-their-dress-back"></a>2015 called, and they want their dress back</h2><p>Humans have a knack for caring intensely about the most trivial of things. Name almost anything—cats versus dogs, the appropriate way to fasten a necktie, or even which day of the week comes first—and someone somewhere has probably written an essay about it on an internet forum. 
It would be easy to throw up our hands and give up trying to understand our peers, as sometimes they seem like aliens from another planet.</p><p>However, what interests me is how the littlest things seem to get people the most upset. Few people have shouting matches over the best interpretation of quantum mechanics, but friendships will be tested when someone says they just aren’t that into <em>Star Wars</em>. One explanation for this phenomenon is simple accessibility: most people aren’t equipped to understand quantum mechanics well enough to argue about it, but almost anyone can have an opinion on which direction the toilet paper is supposed to go.<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup></p><p>There is truth in that explanation, but personally, I don’t think it’s the whole story. Rather, I think we grow so used to the idea that our experiences are universal that discovering someone else experienced the exact same thing we did yet came to a different conclusion is not just frustrating: it’s incomprehensible.</p><p>Take 2015’s phenomenon of “<a href="https://en.wikipedia.org/wiki/The_dress">the dress</a>” as an example. Some people see black and blue, others white and gold, and frankly, whether you see one or the other has no impact on anything remotely meaningful. How did <em>this</em>—something so completely irrelevant—become a cross-cultural phenomenon reported on by major news outlets? My guess: people just aren’t used to the idea that vision—the primary way we sense the world—does not provide us with an objective, universal understanding of reality.</p><h3><a name="when-something-objective-isn-t"></a>When something objective isn’t</h3><p>Our culture and society works because, in spite of our differences, we’re still all humans. We eat food, we sleep, we like spending time with each other, and we like feeling connected to those around us. 
So when we watch a movie, and it tickles us in a way that makes us feel good, we can have an awfully hard time understanding how our best friend—who we largely agree with about everything—didn’t like it at all.</p><p>The truth, of course, is that very little of what we experience is in any way objective. Yes, we can be pretty confident that basic arithmetic is true anywhere in the universe, and that if we all agree a table is brown it probably is. There are even things we accept as subjective without a second thought, such as the kinds of food people like or the fashions they find attractive. It’s all the in-betweens that are so pernicious! “The dress” was so unbelievable to most people because, nine hundred and ninety-nine times out of a thousand, when two humans look at a picture, they at least mostly agree on the colors contained within. We do not consider that we are seeing through different lenses into the same objective reality; we simply think we are perceiving objective truths directly.</p><p>In the case of the dress, whether you <a href="https://en.wikipedia.org/wiki/Yanny_or_Laurel">heard “yanny” or “laurel,”</a> or whether you believe the <em>Sonic</em> games were ever any good, subjective disagreement is essentially harmless. But what about when it isn’t? Might incorrect beliefs that our experiences are universal cause genuine harm?</p><p>I think the answer is absolutely, unequivocally <em>yes</em>.</p><h2><a name="subjectivity-in-programming-and-in-programming-languages-specifically"></a>Subjectivity in programming, and in programming languages specifically</h2><p>Quick question: which is better, functional or imperative programming?</p><p>My guess is that, given the usual subject of my blog, most of my readers would pick the former. However, the actual answer you chose doesn’t matter: my guess is you feel like you have a pretty rational argument to back it up. It certainly isn’t simply a matter of taste… right?</p><p>Well, no, I hope not. 
I don’t think the world is so subjective that we cannot ever advocate for one thing over another—we tried that whole “everything is XML” thing for a while, and I think we agreed it really wasn’t a good idea. But if you truly believe your answer to the above question can be completely objectively justified (as many do), how does one explain the average Hacker News comment thread on just about any post about Haskell?</p><p>I generally try not to read Hacker News if I can help it, as I find doing it mostly just makes me angry,<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup> but I did happen to find a link to <a href="https://news.ycombinator.com/item?id=21282647">a recent discussion</a> on a blog post about using Haskell in production. Let’s take a look at a few comments, shall we?</p><p>In a <a href="https://news.ycombinator.com/item?id=21284383">branch of the discussion</a>, one user writes:</p><blockquote><blockquote><p>Haskell is great for business and great in production</p></blockquote><p>I disagree. It's a beautiful language but it lacks a lot of features to make it useable at scale. And in my experience Haskell engineers are extremely smart but the environment/culture they create makes it difficult to foster team spirit.</p><p>I've been in 2 companies in the last 4 years who initially used Haskell in production. One has transitioned to Go and the other is rewriting the codebase in Rust.</p></blockquote><p>The first paragraph is an assertion without many specifics, but it does sound like it could be reasonable. And although the last two sentences are entirely anecdotal, anecdotes are still better than hunches. Let’s see what someone else has to say in response:</p><blockquote><p>I’ve met some pretty damn solid engineers who started on Haskell and, even at a junior level in other languages, produce an elegant solution far more easily than a senior engineer in that language. 
You probably wouldn’t put the code in production verbatim but you can very easily see what’s going on and it isn’t haunted by spectre of early abstraction, which IMO is the biggest flaw of OOP at scale.</p><p>[…]</p><p>From my naive perspective it’s easy to make classes out of everything, and to hold state and put side-effects everywhere, but you don’t want to deal with the trouble of a monad until you need it. So you have an automatic inclination towards cleaner code when you start functional and move on.</p></blockquote><p>Also pretty vague and high-level, but also sounds reasonable. If you read either of these comments, and your first inclination was to grow frustrated and start crafting counter-arguments in your head, I encourage you to step outside your feelings momentarily (rational as they may be!) and try your very hardest to interpret them charitably. The discussion continues:</p><blockquote><p>Haskell gives one plenty of rope to hang himself on complexity.</p><p>So much that developers develop an aversion to it as deep as fear. It's unavoidable, the ones that didn't develop it are still buried at the working of their first Rube Goldberg machine and unavailable.</p></blockquote><p>Whether you think it’s accurate or not, there is definitely a perception held by a great many people that Haskell is a very complicated language. Surely at least some of them must have given it an honest shot, so have they just not “seen the light” yet? What do you think they’re missing? Perhaps a followup commenter can help elucidate things:</p><blockquote><p>Hi, I find that everything people here are complaining about (and they're valid complaints) has also been true of C++. 
C++ developed a lot of its complexity (particularly 15-20 yrs ago in the template space) after it got popular, so people were already wed to it.</p><p>[…]</p><p>The C++ community's really gotten good in the last 5 years or so about reigning in the bad impulses and getting people to write clean, clear, efficient code that has reasonable expressiveness.</p><p>Coming into Haskell from C++, I have the same instincts. Haskell's been a pure pleasure. The benefits are really there, and they're easy to get. You just have to think of the trade-offs.</p></blockquote><p>That argument seems reasonable, too. Everything in moderation, right? If you disagree, and you think Haskell is just not worth it, what does this person value that you don’t? What are they missing that you see?</p><h3><a name="the-unsatisfying-subjective-reality-of-programming-languages"></a>The unsatisfying subjective reality of programming languages</h3><p>You can probably see where I’m going with all this. These arguments are not built on hard, refutable facts or rigorous real-world evidence, they’re based in gut feelings and personal preferences. Does that mean they’re wrong, invalid, and worthless, and we should do studies to determine which language allows programmers to ship features the fastest and with the fewest bugs, then all agree to use that?</p><p><em><strong>No!</strong></em></p><p>These conversations are subjective because, for better or for worse, humans think in different ways and value different things, and programming languages are the medium in which we express ourselves. To many people who write Haskell (myself included), there is an effervescent joy in modeling a problem with the type system—like capturing something in amber—that others just don’t care about. What’s more, some people clearly loathe Haskell’s significant whitespace and plethora of infix operators, but I’ve never really minded. Is one of us wrong? 
If so, <em>why?</em> Talk about reliability all you want, but the few rigorous numbers we have don’t provide much evidence one way or the other.</p><p>While <a href="https://news.ycombinator.com/item?id=21284317">one commenter</a> in the aforementioned Hacker News thread described Haskell as nothing less than “pain and torture,” <a href="https://news.ycombinator.com/item?id=21284540">another</a> says they “did some Haskell in production and it was delightful.” People push excuses and rationalizations for these differences constantly—they point out that most people are exposed to imperative programming first, while others retort that Haskell is clearly not very widely used despite being around for an awfully long time—but none of their arguments ever seem to change people’s minds.</p><p>Often, people walk away from these conversations confused and incensed. To them, their point of view is so obviously apparent that it is hard to fathom anyone else seeing things differently. They rack their brains trying to figure out why their opponents just don’t <em>get it.</em> There must be some key point they’ve misunderstood, some joy they haven’t experienced, some sharp edge they haven’t yet been cut by. But no matter how much time they spend trying to reach these people, somehow, it’s never enough.</p><h3><a name="empathy-and-how-bad-results-come-from-good-intentions"></a>Empathy, and how bad results come from good intentions</h3><p>I’ll admit that these kinds of discussions aren’t <em>always</em> fruitless; sometimes they really do manage to change people’s minds or help them see some new idea they had not been able to grasp. When people manage to keep their cool and acknowledge the differences in their mindsets while still helping people learn, everyone benefits.</p><p>Sadly, in my experience, this rarely happens. We have a natural tendency to become angry if people don’t see things the way we do; it’s confusing and disorienting, and it can even disgust us. 
None of those emotions are conducive to empathy. When we fail to account for the ways in which others might think differently, we voluntarily reject any insights we might have otherwise gained from the conversation because we did not allow ourselves to embrace, even just temporarily, someone else’s strange and perhaps uncomfortable set of values and experiences. We refuse to accept that our perception of color might not be as universal as we thought, and we miss out on the amazing insights we could learn about the nature of light, color, and human vision.</p><p>Although failing to empathize with those we are arguing with is bad enough, in my mind, this failure to accept the potential subjectivity of one’s own views has even worse, indirect effects. Take this comment for example, again from the same thread:</p><blockquote><p>Sounds like you've barely programmed in Haskell and don't know what you're talking about. Haskell was the first language I learned. I didn't think this at all and I still don't. It doesn't strike me as any more difficult than learning Java or something.</p></blockquote><p>I have no doubts that this commenter meant what they said: they didn’t find Haskell difficult to learn. The comment they were replying to was vitriolic and combative, so one could almost feel they had a smackdown coming to them… but this isn’t a private conversation. How do you think someone feels when they are learning Haskell, scroll through this thread, and find a comment that tells them they ought to find it easy? If they’ve been struggling, even a little bit, what do you think they might think?</p><p>If I were in their place, I might feel a little stupid. I might wonder if I’m really cut out for Haskell or if I should just give up. I definitely wouldn’t feel encouraged and excited to keep trying.</p><p>Who knows why this commenter found Haskell straightforward. 
Maybe they were exposed to certain concepts already, maybe it just fit their style of thinking, perhaps they’re even exceptionally smart. I don’t know. But no matter what the answer is, insulting the intelligence of others, even indirectly in this way, belies a lack of empathy in the face of frustration, and although the intent may not have been to hurt, it can still be seriously harmful.</p><p>To be clear, I’m not saying the commenter should have pretended their experiences were different or even kept them to themselves. I don’t believe in being “fake nice”—in my experience, I am best equipped to reach people when speaking genuinely, from the heart. What I would have done is tell my story in a different way, perhaps by writing something like this:</p><blockquote><p>It’s true that a lot of people find Haskell challenging, and I totally accept that some people just don’t think it’s worth it. It’s fine if you don’t want to write Haskell. But personally, I really enjoy writing it, as do the people I work with, and I think we ship great software with it because it aligns naturally—even joyfully!—with the way we like to think about program construction.</p><p>Personally, I didn’t find Haskell as challenging to learn as I think some people have, but it was still work, and in some ways I was just exposed to it at the right time. Other people I know have struggled quite a lot at first, and reasonably so, but they’ve still managed to become great Haskell programmers, and they found it worthwhile. Our team dynamic just wouldn’t be the same in any other language.</p></blockquote><p>When I respond to comments I disagree with, I try to tell a personal story that provides a different perspective <strong>without</strong> invalidating their experiences. 
Sometimes the result is ungrateful snark anyway (or just no response at all), but you might be surprised how often talking from an emotional place about <em>your own</em> experiences—while being neither aggressive nor especially defensive—can go a long way. Perhaps you can even learn something if they return the favor and explain what they find frustrating, beyond the fundamental, subjective disagreements.</p><p>It’s okay to have opinions. It’s okay to like and dislike things. It’s okay to be frustrated that others don’t see things the way you do, and to advocate for the technologies and values you believe in. It’s just not okay to tell someone else their reality is wrong.</p><p>Learn to embrace the subjective differences between us all, and you won’t just be kinder. You’ll be <em>happier.</em></p><ol class="footnotes"><li id="footnote-1"><p>This is where I’m supposed to put a snarky footnote saying something like “obviously, the correct way is <em>blah</em>,” but you deserve better. So you, uh, get a <em>meta</em> snarky footnote instead. <a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>Which, to be entirely fair, may well be as subjective as anything else in this blog post. <a href="#footnote-ref-2-1">↩</a></p></li></ol></article>Demystifying MonadBaseControl2019-09-07T00:00:00Z2019-09-07T00:00:00ZAlexis King<article><p><a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#t:MonadBaseControl"><code>MonadBaseControl</code> from the <code>monad-control</code> package</a> is a confusing typeclass, and its methods have complicated types. For many people, it’s nothing more than scary, impossible-to-understand magic that is, for some reason, needed when lifting certain kinds of operations. 
Few resources exist that adequately explain how, why, and when it works, which sadly seems to have resulted in some <a href="https://en.wikipedia.org/wiki/Fear,_uncertainty,_and_doubt">FUD</a> about its use.</p><p>There’s no doubt that the machinery of <code>MonadBaseControl</code> is complex, and the role it plays in practice is often subtle. However, its essence is actually much simpler than it appears, and I promise it can be understood by mere mortals. In this blog post, I hope to provide a complete survey of <code>MonadBaseControl</code>—how it works, how it’s designed, and how it can go wrong—in a way that is accessible to anyone with a firm grasp of monads and monad transformers. To start, we’ll motivate <code>MonadBaseControl</code> by reinventing it ourselves.</p><h2><a name="the-higher-order-action-problem"></a>The higher-order action problem</h2><p>Say we have a function with the following type:<sup><a href="#footnote-0" id="footnote-ref-0-1">1</a></sup></p><pre><code class="pygments"><span class="nf">foo</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span></code></pre><p>If we have an action built from a transformer stack like</p><pre><code class="pygments"><span class="nf">bar</span> <span class="ow">::</span> <span class="kt">StateT</span> <span class="kt">X</span> <span class="kt">IO</span> <span class="kt">Y</span></code></pre><p>then we might wish to apply <code>foo</code> to <code>bar</code>, but that is ill-typed, since <code>IO</code> is not the same as <code>StateT X IO</code>. In cases like these, we often use <code>lift</code>, but it’s not good enough here: <code>lift</code> <em>adds</em> a new monad transformer to an action, but here we need to <em>remove</em> a transformer. 
So we need a function with a type like this:</p><pre><code class="pygments"><span class="nf">unliftState</span> <span class="ow">::</span> <span class="kt">StateT</span> <span class="kt">X</span> <span class="kt">IO</span> <span class="kt">Y</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="kt">Y</span></code></pre><p>However, if you think about that type just a little bit, it’s clear something’s wrong: it throws away information, namely the state. You may remember that a <code>StateT X IO Y</code> action is equivalent to a function of type <code>X -> IO (Y, X)</code>, so our hypothetical <code>unliftState</code> function has two problems:</p><ol><li><p>We have no <code>X</code> to use as the initial state.</p></li><li><p>We’ll lose any modifications <code>bar</code> made to the state, since the result type is just <code>Y</code>, not <code>(Y, X)</code>.</p></li></ol><p>Clearly, we’ll need something more sophisticated, but what?</p><h2><a name="a-na-ve-solution"></a>A naïve solution</h2><p>Given that <code>foo</code> doesn’t know anything about the state, we can’t easily thread it through <code>foo</code> itself. However, by using <code>runStateT</code> explicitly, we could do some of the state management ourselves:</p><pre><code class="pygments"><span class="nf">foo'</span> <span class="ow">::</span> <span class="kt">StateT</span> <span class="n">s</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">StateT</span> <span class="n">s</span> <span class="kt">IO</span> <span class="n">a</span>
<span class="nf">foo'</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">get</span>
  <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">s'</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">foo</span> <span class="p">(</span><span class="n">runStateT</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span>
  <span class="n">put</span> <span class="n">s'</span>
  <span class="n">pure</span> <span class="n">v</span></code></pre><p>Do you see what’s going on there? It’s not actually very complicated: we get the current state, then pass it as the initial state to <code>runStateT</code>. This produces an action <code>IO (a, s)</code> that has <em>closed over</em> the current state. We can pass that action to <code>foo</code> without issue, since <code>foo</code> is polymorphic in the action’s return type. Finally, all we have to do is <code>put</code> the modified state back into the enclosing <code>StateT</code> computation, and we can get on with our business.</p><p>That strategy works okay when we only have one monad transformer, but it gets hairy quickly as soon as we have two or more. For example, if we had <code>baz :: ExceptT X (StateT Y IO) Z</code>, then we <em>could</em> do the same trick by getting the underlying</p><pre><code class="pygments"><span class="kt">Y</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">X</span> <span class="kt">Z</span><span class="p">,</span> <span class="kt">Y</span><span class="p">)</span></code></pre><p>function, closing over the state, restoring it, and doing the appropriate case analysis to re-raise any <code>ExceptT</code> errors, but that’s a lot of work to do for every single function! What we’d like to do instead is somehow abstract over the pattern we used to write <code>foo'</code> in a way that scales to arbitrary monad transformers.</p><h2><a name="the-essence-of-monadbasecontrol"></a>The essence of <code>MonadBaseControl</code></h2><p>To build a more general solution for “unlifting” arbitrary monad transformers, we need to start thinking about monad transformer state. 
The technique we used to implement <code>foo'</code> followed this process:</p><ol><li><p>Capture the action’s input state and close over it.</p></li><li><p>Package up the action’s output state with its result and run it.</p></li><li><p>Restore the action’s output state into the enclosing transformer.</p></li><li><p>Return the action’s result.</p></li></ol><p>For <code>StateT s</code>, it turns out that the input state and output state are both <code>s</code>, but other monad transformers have state, too. Consider the input and output state for the following common monad transformers:</p><div class="table-wrapper">
<table class="no-line-wrapping">
<thead><tr>
<th>transformer</th>
<th>representation</th>
<th>input state</th>
<th>output state</th>
</tr></thead>
<tr>
<td><code>StateT s m a</code></td>
<td><code>s -> m (a, s)</code></td>
<td><code>s</code></td>
<td><code>s</code></td>
</tr>
<tr>
<td><code>ReaderT r m a</code></td>
<td><code>r -> m a</code></td>
<td><code>r</code></td>
<td><code>()</code></td>
</tr>
<tr>
<td><code>WriterT w m a</code></td>
<td><code>m (a, w)</code></td>
<td><code>()</code></td>
<td><code>w</code></td>
</tr>
</table>
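<p>The rows above can be checked by running each transformer with its <code>run</code> function. The following is only a minimal sketch, assuming the <code>transformers</code> package is available; the three example actions are hypothetical and exist purely for illustration:</p><pre><code class="pygments">import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State (StateT, modify, runStateT)
import Control.Monad.Trans.Writer (WriterT, runWriterT, tell)

-- Hypothetical example actions, one per row of the table.
stateExample :: StateT Int IO String
stateExample = modify (+ 1) >> pure "state"

readerExample :: ReaderT Int IO String
readerExample = do
  r <- ask
  pure ("reader " ++ show r)

writerExample :: WriterT [String] IO String
writerExample = tell ["logged"] >> pure "writer"

main :: IO ()
main = do
  -- StateT: the input state s goes in, and the output state s
  -- comes back alongside the result.
  (a, s) <- runStateT stateExample 0
  print (a, s)  -- ("state",1)
  -- ReaderT: the input state r goes in, and only the result comes back.
  b <- runReaderT readerExample 42
  print b       -- "reader 42"
  -- WriterT: there is no input state, and the output state w
  -- comes back alongside the result.
  (c, w) <- runWriterT writerExample
  print (c, w)  -- ("writer",["logged"])</code></pre>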
</div><p>Notice how the input state is whatever is to the left of the <code>-></code>, while the output state is whatever extra information gets produced alongside the result. Using the same reasoning, we can also deduce the input and output state for compositions of multiple monad transformers, such as the following:</p><div class="table-wrapper">
<table class="no-line-wrapping">
<thead><tr>
<th>transformer</th>
<th>representation</th>
<th>input state</th>
<th>output state</th>
</tr></thead>
<tr>
<td><code>ReaderT r (WriterT w m) a</code></td>
<td><code>r -> m (a, w)</code></td>
<td><code>r</code></td>
<td><code>w</code></td>
</tr>
<tr>
<td><code>StateT s (ReaderT r m) a</code></td>
<td><code>s -> r -> m (a, s)</code></td>
<td><code>(r, s)</code></td>
<td><code>s</code></td>
</tr>
<tr>
<td><code>WriterT w (StateT s m) a</code></td>
<td><code>s -> m ((a, w), s)</code></td>
<td><code>s</code></td>
<td><code>(w, s)</code></td>
</tr>
</table>
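<p>The same check works for a composed stack. This sketch, again assuming the <code>transformers</code> package, unwraps <code>StateT s (ReaderT r m) a</code> one layer at a time; <code>addEnv</code> and <code>runBoth</code> are hypothetical names introduced only for this example:</p><pre><code class="pygments">import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State (StateT, modify, runStateT)

-- Hypothetical action: adds the environment value to the state.
addEnv :: StateT Int (ReaderT Int IO) ()
addEnv = do
  r <- lift ask
  modify (+ r)

-- Unwrapping both layers exposes the composed input state (r, s)
-- and the output state s: the outer StateT layer is run first,
-- then the inner ReaderT layer.
runBoth :: StateT s (ReaderT r m) a -> (r, s) -> m (a, s)
runBoth m (r, s) = runReaderT (runStateT m s) r

main :: IO ()
main = do
  ((), s') <- runBoth addEnv (10, 1)
  print s'  -- 11</code></pre>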
</div><p>Notice that when monad transformers are composed, their states are composed, too. This is useful to keep in mind, since our goal is to capture the four steps above in a typeclass, polymorphic in the state of the monad transformers we need to lift through. At minimum, we need two new operations: one to capture the input state and close over it (step 1) and one to restore the output state (step 3). One class we might come up with could look like this:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">MonadBase</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">b</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">InputState</span> <span class="n">m</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="n">m</span>
  <span class="n">captureInputState</span> <span class="ow">::</span> <span class="n">m</span> <span class="p">(</span><span class="kt">InputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">closeOverInputState</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">InputState</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">b</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">restoreOutputState</span> <span class="ow">::</span> <span class="kt">OutputState</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span></code></pre><p>If we can write instances of that typeclass for various transformers, we can use the class’s operations to implement <code>foo'</code> in a generic way that works with any combination of them:</p><pre><code class="pygments"><span class="nf">foo'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">foo'</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">captureInputState</span>
  <span class="kr">let</span> <span class="n">m'</span> <span class="ow">=</span> <span class="n">closeOverInputState</span> <span class="n">m</span> <span class="n">s</span>
  <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">s'</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">liftBase</span> <span class="o">$</span> <span class="n">foo</span> <span class="n">m'</span>
  <span class="n">restoreOutputState</span> <span class="n">s'</span>
  <span class="n">pure</span> <span class="n">v</span></code></pre><p>So how do we implement those instances? Let’s start with <code>IO</code>, since that’s the base case:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="kt">IO</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">InputState</span> <span class="kt">IO</span> <span class="ow">=</span> <span class="nb">()</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="kt">IO</span> <span class="ow">=</span> <span class="nb">()</span>
  <span class="n">captureInputState</span> <span class="ow">=</span> <span class="n">pure</span> <span class="nb">()</span>
  <span class="n">closeOverInputState</span> <span class="n">m</span> <span class="nb">()</span> <span class="ow">=</span> <span class="n">m</span> <span class="o"><&></span> <span class="p">(,</span> <span class="nb">()</span><span class="p">)</span>
  <span class="n">restoreOutputState</span> <span class="nb">()</span> <span class="ow">=</span> <span class="n">pure</span> <span class="nb">()</span></code></pre><p>Not very exciting. The <code>StateT s</code> instance, on the other hand, is significantly more interesting:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">InputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="kt">InputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">captureInputState</span> <span class="ow">=</span> <span class="p">(,)</span> <span class="o"><$></span> <span class="n">get</span> <span class="o"><*></span> <span class="n">lift</span> <span class="n">captureInputState</span>
  <span class="n">closeOverInputState</span> <span class="n">m</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">ss</span><span class="p">)</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="p">((</span><span class="n">v</span><span class="p">,</span> <span class="n">s'</span><span class="p">),</span> <span class="n">ss'</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">closeOverInputState</span> <span class="p">(</span><span class="n">runStateT</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span> <span class="n">ss</span>
    <span class="n">pure</span> <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="p">(</span><span class="n">s'</span><span class="p">,</span> <span class="n">ss'</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">ss</span><span class="p">)</span> <span class="ow">=</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span> <span class="o">*></span> <span class="n">put</span> <span class="n">s</span></code></pre><p><strong>This instance alone includes most of the key ideas behind <code>MonadBaseControl</code>.</strong> There’s a lot going on, so let’s break it down, step by step:</p><ol><li><p>Start by examining the definitions of <code>InputState</code> and <code>OutputState</code>. Are they what you expected? You’d be forgiven for expecting the following:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">InputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="n">s</span>
<span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="n">s</span></code></pre><p>After all, that’s what we wrote in the table, isn’t it?</p><p>However, if you give it a try, you’ll find it doesn’t work. <code>InputState</code> and <code>OutputState</code> must capture the state of the <em>entire</em> monad, not just a single transformer layer, so we have to combine the <code>StateT s</code> state with the state of the underlying monad. In the simplest case we get</p><pre><code class="pygments"><span class="kt">InputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="kt">IO</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="nb">()</span><span class="p">)</span></code></pre><p>which is boring, but in a more complex case, we need to get something like this:</p><pre><code class="pygments"><span class="kt">InputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="kt">IO</span><span class="p">))</span> <span class="ow">=</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="p">(</span><span class="n">r</span><span class="p">,</span> <span class="nb">()</span><span class="p">))</span></code></pre><p>Therefore, <code>InputState (StateT s m)</code> combines <code>s</code> with <code>InputState m</code> in a tuple, and <code>OutputState</code> does the same.</p></li><li><p>Moving on, take a look at <code>captureInputState</code> and <code>closeOverInputState</code>. 
Just as <code>InputState</code> and <code>OutputState</code> capture the state of the entire monad, these functions need to be inductive in the same way.</p><p><code>captureInputState</code> acquires the current state using <code>get</code>, and it combines it with the remaining monadic state using <code>lift captureInputState</code>. <code>closeOverInputState</code> uses the captured state to peel off the outermost <code>StateT</code> layer, then calls <code>closeOverInputState</code> recursively to peel off the rest of them.</p></li><li><p>Finally, <code>restoreOutputState</code> restores the state of the underlying monad stack, then restores the <code>StateT</code> state, ensuring everything ends up back the way it’s supposed to be.</p></li></ol><p>Take the time to digest all that—work through it yourself if you need to—as it’s a dense piece of code. Once you feel comfortable with it, take a look at the instances for <code>ReaderT</code> and <code>WriterT</code> as well:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">InputState</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">r</span><span class="p">,</span> <span class="kt">InputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">OutputState</span> <span class="n">m</span>
  <span class="n">captureInputState</span> <span class="ow">=</span> <span class="p">(,)</span> <span class="o"><$></span> <span class="n">ask</span> <span class="o"><*></span> <span class="n">lift</span> <span class="n">captureInputState</span>
  <span class="n">closeOverInputState</span> <span class="n">m</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">ss</span><span class="p">)</span> <span class="ow">=</span> <span class="n">closeOverInputState</span> <span class="p">(</span><span class="n">runReaderT</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span> <span class="n">ss</span>
  <span class="n">restoreOutputState</span> <span class="n">ss</span> <span class="ow">=</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">InputState</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">InputState</span> <span class="n">m</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">captureInputState</span> <span class="ow">=</span> <span class="n">lift</span> <span class="n">captureInputState</span>
  <span class="n">closeOverInputState</span> <span class="n">m</span> <span class="n">ss</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="p">((</span><span class="n">v</span><span class="p">,</span> <span class="n">s'</span><span class="p">),</span> <span class="n">ss'</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">closeOverInputState</span> <span class="p">(</span><span class="n">runWriterT</span> <span class="n">m</span><span class="p">)</span> <span class="n">ss</span>
    <span class="n">pure</span> <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="p">(</span><span class="n">s'</span><span class="p">,</span> <span class="n">ss'</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">ss</span><span class="p">)</span> <span class="ow">=</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span> <span class="o">*></span> <span class="n">tell</span> <span class="n">s</span></code></pre><p>Make sure you understand these instances, too. It should be easier this time, since they share most of their structure with the <code>StateT</code> instance, but note the asymmetry that arises from the differing input and output states. (It may even help to try and write these instances yourself, focusing on the types whenever you get stuck.)</p><p>If you feel alright with them, then congratulations: you’re already well on your way to grokking <code>MonadBaseControl</code>!</p><h3><a name="hiding-the-input-state"></a>Hiding the input state</h3><p>So far, our implementation of <code>MonadBaseControl</code> works, but it’s actually slightly more complicated than it needs to be. As it happens, all valid uses of <code>MonadBaseControl</code> will always end up performing the following pattern:</p><pre><code class="pygments"><span class="nf">s</span> <span class="ow"><-</span> <span class="n">captureInputState</span>
<span class="kr">let</span> <span class="n">m'</span> <span class="ow">=</span> <span class="n">closeOverInputState</span> <span class="n">m</span> <span class="n">s</span></code></pre><p>That is, we close over the input state as soon as we capture it. We can therefore combine <code>captureInputState</code> and <code>closeOverInputState</code> into a single function:</p><pre><code class="pygments"><span class="nf">captureAndCloseOverInputState</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">b</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">))</span></code></pre><p>What’s more, we no longer need the <code>InputState</code> associated type at all! This is an improvement, since it simplifies the API and removes the possibility for any misuse of the input state, since it’s never directly exposed. On the other hand, it has a more complicated type: it produces a monadic action <em>that returns another monadic action</em>. 
This can be a little more difficult to grok, which is why I presented the original version first, but it may help to consider how the above type arises naturally from the following definition:</p><pre><code class="pygments"><span class="nf">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="n">closeOverInputState</span> <span class="n">m</span> <span class="o"><$></span> <span class="n">captureInputState</span></code></pre><p>Let’s update the <code>MonadBaseControl</code> class to incorporate this simplification:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">MonadBase</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">b</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="n">m</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">b</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="ow">::</span> <span class="kt">OutputState</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span></code></pre><p>We can then update all the instances to use the simpler API by simply fusing the definitions of <code>captureInputState</code> and <code>closeOverInputState</code> together:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="kt">IO</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="kt">IO</span> <span class="ow">=</span> <span class="nb">()</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="n">pure</span> <span class="p">(</span><span class="n">m</span> <span class="o"><&></span> <span class="p">(,</span> <span class="nb">()</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="nb">()</span> <span class="ow">=</span> <span class="n">pure</span> <span class="nb">()</span>
<span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">s</span> <span class="ow"><-</span> <span class="n">get</span>
    <span class="n">m'</span> <span class="ow"><-</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runStateT</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span>
    <span class="n">pure</span> <span class="o">$</span> <span class="kr">do</span>
      <span class="p">((</span><span class="n">v</span><span class="p">,</span> <span class="n">s'</span><span class="p">),</span> <span class="n">ss'</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">m'</span>
      <span class="n">pure</span> <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="p">(</span><span class="n">s'</span><span class="p">,</span> <span class="n">ss'</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">ss</span><span class="p">)</span> <span class="ow">=</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span> <span class="o">*></span> <span class="n">put</span> <span class="n">s</span>
<span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">OutputState</span> <span class="n">m</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">s</span> <span class="ow"><-</span> <span class="n">ask</span>
    <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runReaderT</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span>
  <span class="n">restoreOutputState</span> <span class="n">ss</span> <span class="ow">=</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">m'</span> <span class="ow"><-</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runWriterT</span> <span class="n">m</span><span class="p">)</span>
    <span class="n">pure</span> <span class="o">$</span> <span class="kr">do</span>
      <span class="p">((</span><span class="n">v</span><span class="p">,</span> <span class="n">s'</span><span class="p">),</span> <span class="n">ss'</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">m'</span>
      <span class="n">pure</span> <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="p">(</span><span class="n">s'</span><span class="p">,</span> <span class="n">ss'</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">ss</span><span class="p">)</span> <span class="ow">=</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span> <span class="o">*></span> <span class="n">tell</span> <span class="n">s</span></code></pre><p>This is already very close to a full <code>MonadBaseControl</code> implementation. The <code>captureAndCloseOverInputState</code> implementations are getting a little out of hand, but bear with me—they’ll get simpler before this blog post is over.</p><h3><a name="coping-with-partiality"></a>Coping with partiality</h3><p>Our <code>MonadBaseControl</code> class now works with <code>StateT</code>, <code>ReaderT</code>, and <code>WriterT</code>, but one transformer we haven’t considered is <code>ExceptT</code>. Let’s try to extend our table from before with a row for <code>ExceptT</code>:</p><div class="table-wrapper">
<table class="no-line-wrapping">
<thead><tr>
<th>transformer</th>
<th>representation</th>
<th>input state</th>
<th>output state</th>
</tr></thead>
<tr>
<td><code>ExceptT e m a</code></td>
<td><code>m (Either e a)</code></td>
<td><code>()</code></td>
<td><code>???</code></td>
</tr>
</table>
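<p>To make the representation column concrete, here is a minimal sketch, assuming the <code>transformers</code> package; <code>safeDiv</code> and <code>runSafeDiv</code> are hypothetical names used only for illustration. Running an <code>ExceptT</code> computation produces either an error or a result, never both:</p><pre><code>import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- Hypothetical action: division that fails on a zero divisor.
safeDiv :: Int -> Int -> ExceptT String IO Int
safeDiv _ 0 = throwE "division by zero"
safeDiv x y = pure (x `div` y)

-- runExceptT exposes the representation from the table: m (Either e a).
runSafeDiv :: Int -> Int -> IO (Either String Int)
runSafeDiv x y = runExceptT (safeDiv x y)
</code></pre><p>Here <code>runSafeDiv 6 3</code> yields <code>Right 2</code>, while <code>runSafeDiv 1 0</code> yields <code>Left "division by zero"</code> and no <code>Int</code> at all.</p>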
</div><p>Hmm… what <em>is</em> the output state for <code>ExceptT</code>?</p><p>The answer can’t be <code>e</code>, since we might not end up with an <code>e</code>—the computation might not fail. <code>Maybe e</code> would be closer… could that work?</p><p>Well, let’s try it. Let’s write a <code>MonadBaseControl</code> instance for <code>ExceptT</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">OutputState</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">e</span><span class="p">,</span> <span class="kt">OutputState</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">m'</span> <span class="ow"><-</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runExceptT</span> <span class="n">m</span><span class="p">)</span>
    <span class="n">pure</span> <span class="o">$</span> <span class="kr">do</span>
      <span class="p">((</span><span class="n">v</span><span class="p">,</span> <span class="n">s'</span><span class="p">),</span> <span class="n">ss'</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">m'</span>
      <span class="n">pure</span> <span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="p">(</span><span class="n">s'</span><span class="p">,</span> <span class="n">ss'</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">ss</span><span class="p">)</span> <span class="ow">=</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span> <span class="o">*></span> <span class="kr">case</span> <span class="n">s</span> <span class="kr">of</span>
    <span class="kt">Just</span> <span class="n">e</span> <span class="ow">-></span> <span class="n">throwError</span> <span class="n">e</span>
    <span class="kt">Nothing</span> <span class="ow">-></span> <span class="n">pure</span> <span class="nb">()</span></code></pre><p>Sadly, the above implementation doesn’t typecheck; it is rejected with the following type error:</p><pre><code>• Couldn't match type ‘Either e a’ with ‘(a, Maybe e)’
  Expected type: m (b ((a, Maybe e), OutputState m))
    Actual type: m (b (Either e a, OutputState m))
• In the second argument of ‘($)’, namely
    ‘captureAndCloseOverInputState (runExceptT m)’
  In a stmt of a 'do' block:
    m' <- lift $ captureAndCloseOverInputState (runExceptT m)
  In the expression:
    do m' <- lift $ captureAndCloseOverInputState (runExceptT m)
       return do ((v, s'), ss') <- m'
                 pure (v, (s', ss'))
</code></pre><p>We promised an <code>(a, Maybe e)</code>, but we have an <code>Either e a</code>, and there’s certainly no way to get the former from the latter. Are we stuck? (If you’d like, take a moment to think about how you’d solve this type error before moving on, as it may be helpful for understanding the following solution.)</p><p>The fundamental problem here is <em>partiality</em>. The type of the <code>captureAndCloseOverInputState</code> method always produces an action in the base monad that includes an <code>a</code> <em>in addition</em> to some other output state. But <code>ExceptT</code> is different: when an error is raised, it doesn’t produce an <code>a</code> at all—it only produces an <code>e</code>. Therefore, as written, it’s impossible to give <code>ExceptT</code> a <code>MonadBaseControl</code> instance.</p><p>Of course, we’d very much <em>like</em> to give <code>ExceptT</code> a <code>MonadBaseControl</code> instance, so that isn’t very satisfying. Somehow, we need to change <code>captureAndCloseOverInputState</code> so that it doesn’t always need to produce an <code>a</code>. There are a few ways we could accomplish that, but an elegant way to do it is this:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">MonadBase</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">b</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">b</span> <span class="p">(</span><span class="kt">WithOutputState</span> <span class="n">m</span> <span class="n">a</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="ow">::</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span></code></pre><p>We’ve replaced the old <code>OutputState</code> associated type with a new <code>WithOutputState</code> type, and the key difference between them is that <code>WithOutputState</code> describes the type of a <em>combination</em> of the result (of type <code>a</code>) and the output state, rather than describing the type of the output state alone. For total monad transformers like <code>StateT</code>, <code>ReaderT</code>, and <code>WriterT</code>, <code>WithOutputState m a</code> will just be a tuple of the result value and the output state, the same as before. For example, here’s an updated <code>MonadBaseControl</code> instance for <code>StateT</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">WithOutputState</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">)</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">s</span> <span class="ow"><-</span> <span class="n">get</span>
    <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runStateT</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span>
  <span class="n">restoreOutputState</span> <span class="n">ss</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">restoreOutputState</span> <span class="n">ss</span>
    <span class="n">put</span> <span class="n">s</span>
    <span class="n">pure</span> <span class="n">a</span></code></pre><p>Before we consider how this helps us with <code>ExceptT</code>, let’s pause for a moment and examine the revised <code>StateT</code> instance in detail, as there are some new things going on here:</p><ul><li><p>Take a close look at the definition of <code>WithOutputState (StateT s m) a</code>. Note that we’ve defined it to be <code>WithOutputState m (a, s)</code>, <em>not</em> <code>(WithOutputState m a, s)</code>. Consider, for a moment, the difference between these types. Can you see why we used the former, not the latter?</p><p>If it’s unclear to you, that’s okay—let’s illustrate the difference with an example. Consider two similar monad transformer stacks:</p><pre><code class="pygments"><span class="nf">m1</span> <span class="ow">::</span> <span class="kt">StateT</span> <span class="n">s</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="kt">IO</span><span class="p">)</span> <span class="n">a</span>
<span class="nf">m2</span> <span class="ow">::</span> <span class="kt">ExceptT</span> <span class="n">e</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="kt">IO</span><span class="p">)</span> <span class="n">a</span></code></pre><p>Both these stacks contain <code>StateT</code> and <code>ExceptT</code>, but they are layered in a different order. What’s the difference? Well, consider what <code>m1</code> and <code>m2</code> return once fully unwrapped:</p><pre><code class="pygments"><span class="nf">runExceptT</span> <span class="p">(</span><span class="n">runStateT</span> <span class="n">m1</span> <span class="n">s</span><span class="p">)</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">e</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">))</span>
<span class="nf">runStateT</span> <span class="p">(</span><span class="n">runExceptT</span> <span class="n">m2</span><span class="p">)</span> <span class="n">s</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">e</span> <span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">)</span></code></pre><p>These results are meaningfully different: in <code>m1</code>, the state is discarded if an error is raised, but in <code>m2</code>, the final state is always returned, even if the computation is aborted. What does this mean for <code>WithOutputState</code>?</p><p>Here’s the important detail: <strong>the state is discarded when <code>ExceptT</code> is “inside” <code>StateT</code>, not the other way around.</strong> This can be counterintuitive, since the <code>s</code> ends up <em>inside</em> the <code>Either</code> when the <code>StateT</code> constructor is on the <em>outside</em> and vice versa. This is really just a property of how monad transformers compose, not anything specific to <code>MonadBaseControl</code>, so an explanation of why this happens is outside the scope of this blog post, but the relevant insight is that the <code>m</code> in <code>StateT s m a</code> controls the eventual action’s output state.</p><p>If we had defined <code>WithOutputState (StateT s m) a</code> to be <code>(WithOutputState m a, s)</code>, we’d be in a pickle, since <code>m</code> would be unable to influence the presence of <code>s</code> in the output state. Therefore, we have no choice but to use <code>WithOutputState m (a, s)</code>. (If you are still confused by this, try it yourself; you’ll find that there’s no way to make the other definition typecheck.)</p></li><li><p>Now that we’ve developed an intuitive understanding of why <code>WithOutputState</code> must be defined the way it is, let’s look at things from another perspective. 
Consider the type of <code>runStateT</code> once more:</p><pre><code class="pygments"><span class="nf">runStateT</span> <span class="ow">::</span> <span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">s</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">)</span></code></pre><p>Note that the result type is <code>m (a, s)</code>, with the <code>m</code> on the outside. As it happens, this correspondence simplifies the definition of <code>captureAndCloseOverInputState</code>, since we no longer have to do any fiddling with its result—it’s already in the proper shape, so we can just return it directly.</p></li><li><p>Finally, this instance illustrates an interesting change to <code>restoreOutputState</code>. Since the <code>a</code> is now packed inside the <code>WithOutputState m a</code> value, the caller of <code>captureAndCloseOverInputState</code> needs some way to get the <code>a</code> back out! Conveniently, <code>restoreOutputState</code> can play that role, both restoring the output state and unpacking the result.</p><p>Even ignoring partial transformers like <code>ExceptT</code>, this is an improvement over the old API, as it conveniently prevents the programmer from forgetting to call <code>restoreOutputState</code>. However, as we’ll see shortly, it is much more than a convenience: once <code>ExceptT</code> comes into play, it is essential!</p></li></ul><p>With those details addressed, let’s return to <code>ExceptT</code>. 
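</p><p>As a quick check before writing that instance, the layering difference between <code>m1</code> and <code>m2</code> described in the list above can be observed directly. The following is a minimal, self-contained sketch that assumes only the <code>transformers</code> package bundled with GHC; the concrete types (<code>Int</code> state, <code>String</code> errors) and the bodies given to <code>m1</code> and <code>m2</code> are illustrative choices, not part of the original example:</p><pre><code class="pygments">module Main where

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, put, runStateT)

-- StateT on the outside, ExceptT on the inside (the shape of m1).
m1 :: StateT Int (ExceptT String IO) ()
m1 = do
  put 1
  lift (throwE "boom")

-- ExceptT on the outside, StateT on the inside (the shape of m2).
m2 :: ExceptT String (StateT Int IO) ()
m2 = do
  lift (put 1)
  throwE "boom"

main :: IO ()
main = do
  r1 <- runExceptT (runStateT m1 0)
  print r1 -- Left "boom": the updated state is lost along with the result
  r2 <- runStateT (runExceptT m2) 0
  print r2 -- (Left "boom",1): the updated state survives the error
</code></pre><p>In the first stack the <code>Either</code> wraps the whole <code>(a, s)</code> pair, so an error leaves no state to return; in the second, the pair wraps the <code>Either</code>, so the state is always available.</p><p>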
Using the new interface, writing an instance for <code>ExceptT</code> is not only possible, it’s actually rather easy:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">WithOutputState</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">e</span> <span class="n">a</span><span class="p">)</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span>
    <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runExceptT</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">restoreOutputState</span> <span class="n">ss</span> <span class="ow">=</span>
    <span class="n">either</span> <span class="n">throwError</span> <span class="n">pure</span> <span class="o">=<<</span> <span class="n">lift</span> <span class="p">(</span><span class="n">restoreOutputState</span> <span class="n">ss</span><span class="p">)</span></code></pre><p>This instance illustrates why it’s so crucial that <code>restoreOutputState</code> have the aforementioned dual role: it must handle the case where no <code>a</code> exists at all! In the case of <code>ExceptT</code>, it restores the state in the enclosing monad by re-raising an error.</p><p>Now all that’s left to do is update the other instances:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="kt">IO</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">WithOutputState</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">=</span> <span class="n">a</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="ow">=</span> <span class="n">pure</span>
  <span class="n">restoreOutputState</span> <span class="ow">=</span> <span class="n">pure</span>
<span class="kr">instance</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">WithOutputState</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">s</span> <span class="ow"><-</span> <span class="n">ask</span>
    <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runReaderT</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span>
  <span class="n">restoreOutputState</span> <span class="n">ss</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">restoreOutputState</span> <span class="n">ss</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">WithOutputState</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span>
    <span class="n">lift</span> <span class="o">$</span> <span class="n">captureAndCloseOverInputState</span> <span class="p">(</span><span class="n">runWriterT</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">restoreOutputState</span> <span class="n">ss</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">restoreOutputState</span> <span class="n">ss</span>
    <span class="n">tell</span> <span class="n">s</span>
    <span class="n">pure</span> <span class="n">a</span></code></pre><p>Finally, we can update our lifted variant of <code>foo</code> to use the new interface so it will work with transformer stacks that include <code>ExceptT</code>:</p><pre><code class="pygments"><span class="nf">foo'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">foo'</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">m'</span> <span class="ow"><-</span> <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span>
  <span class="n">restoreOutputState</span> <span class="o">=<<</span> <span class="n">liftBase</span> <span class="p">(</span><span class="n">foo</span> <span class="n">m'</span><span class="p">)</span></code></pre><p>At this point, it’s worth considering something: although getting the <code>MonadBaseControl</code> class and instances right was a lot of work, the resulting <code>foo'</code> implementation is actually incredibly simple. That’s a good sign, since we only have to write the <code>MonadBaseControl</code> instances once (in a library), but we have to write functions like <code>foo'</code> quite often.</p><h2><a name="scaling-to-the-real-monadbasecontrol"></a>Scaling to the real <code>MonadBaseControl</code></h2><p>The <code>MonadBaseControl</code> class we implemented in the previous section is complete. It is a working, useful class that is equivalent in power to <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#t:MonadBaseControl">the “real” <code>MonadBaseControl</code> class in the <code>monad-control</code> library</a>. However, if you compare the two, you’ll notice that the version in <code>monad-control</code> looks a little bit different. What gives?</p><p>Let’s compare the two classes side by side:</p><pre><code class="pygments"><span class="c1">-- ours</span>
<span class="kr">class</span> <span class="kt">MonadBase</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">b</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">captureAndCloseOverInputState</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">b</span> <span class="p">(</span><span class="kt">WithOutputState</span> <span class="n">m</span> <span class="n">a</span><span class="p">))</span>
  <span class="n">restoreOutputState</span> <span class="ow">::</span> <span class="kt">WithOutputState</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="c1">-- theirs</span>
<span class="kr">class</span> <span class="kt">MonadBase</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">b</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">StM</span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">liftBaseWith</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">RunInBase</span> <span class="n">m</span> <span class="n">b</span> <span class="ow">-></span> <span class="n">b</span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">restoreM</span> <span class="ow">::</span> <span class="kt">StM</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span></code></pre><p>Let’s start with the similarities, since those are easy:</p><ul><li><p>Our <code>WithOutputState</code> associated type is precisely equivalent to their <code>StM</code> associated type; they just use a (considerably) shorter name.</p></li><li><p>Likewise, our <code>restoreOutputState</code> method is precisely equivalent to their <code>restoreM</code> method, simply under a different name.</p></li></ul><p>That leaves <code>captureAndCloseOverInputState</code> and <code>liftBaseWith</code>. Those two methods both do similar things, but they aren’t identical, and that’s where all the differences lie. To understand <code>liftBaseWith</code>, let’s start by inlining the definition of the <code>RunInBase</code> type alias so we can see the fully-expanded type:</p><pre><code class="pygments"><span class="nf">liftBaseWith</span>
  <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span>
  <span class="ow">=></span> <span class="p">((</span><span class="n">forall</span> <span class="n">c</span><span class="o">.</span> <span class="n">m</span> <span class="n">c</span> <span class="ow">-></span> <span class="n">b</span> <span class="p">(</span><span class="kt">StM</span> <span class="n">m</span> <span class="n">c</span><span class="p">))</span> <span class="ow">-></span> <span class="n">b</span> <span class="n">a</span><span class="p">)</span>
  <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span></code></pre><p>That type is complicated! However, if we break it down, hopefully you’ll find it’s not as scary as it first appears. Let’s reimplement the <code>foo'</code> example from before using <code>liftBaseWith</code> to show how this version of <code>MonadBaseControl</code> works:</p><pre><code class="pygments"><span class="nf">foo'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">foo'</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span> <span class="n">foo</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">restoreM</span> <span class="n">s</span></code></pre><p>This is, in some ways, superficially similar to the version we wrote using our version of <code>MonadBaseControl</code>. Just like in our version, we capture the input state, apply <code>foo</code> in the <code>IO</code> monad, then restore the state. But what exactly is doing the state capturing, and what is <code>runInBase</code>?</p><p>Let’s start by adding a type annotation to <code>runInBase</code> to help make it a little clearer what’s going on:</p><pre><code class="pygments"><span class="nf">foo'</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">m</span> <span class="n">a</span><span class="o">.</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">foo'</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="p">(</span><span class="n">runInBase</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">b</span><span class="o">.</span> <span class="n">m</span> <span class="n">b</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="p">(</span><span class="kt">StM</span> <span class="n">m</span> <span class="n">b</span><span class="p">))</span> <span class="ow">-></span>
    <span class="n">foo</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">restoreM</span> <span class="n">s</span></code></pre><p>That type should look sort of recognizable. If we replace <code>StM</code> with <code>WithOutputState</code>, then we get a type that looks very similar to that of our original <code>closeOverInputState</code> function, except it doesn’t need to take the input state as an argument. How does that work?</p><p>Here’s the trick: <code>liftBaseWith</code> starts by capturing the input state, just as before. However, it then builds a function, <code>runInBase</code>, which is like <code>closeOverInputState</code> partially-applied to the input state it captured. It hands that function to us, and we’re free to apply it to <code>m</code>, which produces the <code>IO (StM m a)</code> action we need, and we can now pass that action to <code>foo</code>. The result is returned in the outer monad, and we restore the state using <code>restoreM</code>.</p><h3><a name="sharing-the-input-state"></a>Sharing the input state</h3><p>At first, this might seem needlessly complicated. When we first started, we separated capturing the input state and closing over it into two separate operations (<code>captureInputState</code> and <code>closeOverInputState</code>), but we eventually combined them so that we could keep the input state hidden. Why does <code>monad-control</code> split them back into two operations again?</p><p>As it turns out, when lifting <code>foo</code>, there’s no advantage to the more complicated API of <code>monad-control</code>. 
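</p><p>To make the comparison concrete, here is a minimal sketch of how a <code>liftBaseWith</code>-style class might be implemented for <code>IO</code> and <code>StateT</code>. It is an illustration under stated assumptions, not the code in <code>monad-control</code> (which builds its instances through <code>MonadTransControl</code>): the class below is a simplified local copy without the <code>MonadBase</code> superclass, <code>foo</code> is a stand-in for any <code>IO a -> IO a</code> function, and <code>runInner</code> is just an illustrative name. It needs nothing beyond the libraries bundled with GHC:</p><pre><code class="pygments">{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Control.Monad.Trans.State.Strict (StateT (..), modify)

-- Simplified local copy of the class (assumption: no MonadBase superclass).
class (Monad b, Monad m) => MonadBaseControl b m | m -> b where
  type StM m a
  liftBaseWith :: ((forall c. m c -> b (StM m c)) -> b a) -> m a
  restoreM :: StM m a -> m a

-- The base case: IO has no extra state, so running "in the base" is id.
instance MonadBaseControl IO IO where
  type StM IO a = a
  liftBaseWith f = f id
  restoreM = pure

instance MonadBaseControl b m => MonadBaseControl b (StateT s m) where
  type StM (StateT s m) a = StM m (a, s)
  liftBaseWith f = StateT (\s -> do
    -- The input state s is captured once here; the runner handed to f
    -- closes over it, so every call to the runner shares the same s.
    a <- liftBaseWith (\runInner -> f (\m -> runInner (runStateT m s)))
    pure (a, s))
  -- Restoring replaces the current state with the captured output state.
  restoreM st = StateT (\_ -> restoreM st)

-- Hypothetical stand-in for the post's foo.
foo :: IO a -> IO a
foo action = putStrLn "entering foo" >> action

foo' :: MonadBaseControl IO m => m a -> m a
foo' m = do
  s <- liftBaseWith (\runInBase -> foo (runInBase m))
  restoreM s

main :: IO ()
main = do
  r <- runStateT (foo' (modify (+ 1) >> pure "done")) (41 :: Int)
  print r -- ("done",42)
</code></pre><p>The state update made inside the lifted action survives because <code>restoreM</code> reinstates the output state that <code>runInBase</code> returned.</p><p>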
In fact, we could implement our <code>captureAndCloseOverInputState</code> operation in terms of <code>liftBaseWith</code>, and we could use that to implement <code>foo'</code> the same way we did before:</p><pre><code class="pygments"><span class="nf">captureAndCloseOverInputState</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">b</span> <span class="p">(</span><span class="kt">StM</span> <span class="n">m</span> <span class="n">a</span><span class="p">))</span>
<span class="nf">captureAndCloseOverInputState</span> <span class="n">m</span> <span class="ow">=</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span> <span class="n">pure</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">m</span><span class="p">)</span>
<span class="nf">foo'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">foo'</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">m'</span> <span class="ow"><-</span> <span class="n">captureAndCloseOverInputState</span> <span class="n">m</span>
  <span class="n">restoreM</span> <span class="o">=<<</span> <span class="n">liftBase</span> <span class="p">(</span><span class="n">foo</span> <span class="n">m'</span><span class="p">)</span></code></pre><p>However, that approach has a downside once we need to lift more complicated functions. <code>foo</code> is exceptionally simple, as it only accepts a single input argument, but what if we wanted to lift a more complicated function that took <em>two</em> monadic arguments, such as this one:</p><pre><code class="pygments"><span class="nf">bar</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span></code></pre><p>We could implement that by calling <code>captureAndCloseOverInputState</code> twice, like this:</p><pre><code class="pygments"><span class="nf">bar'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">bar'</span> <span class="n">ma</span> <span class="n">mb</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">ma'</span> <span class="ow"><-</span> <span class="n">captureAndCloseOverInputState</span> <span class="n">ma</span>
  <span class="n">mb'</span> <span class="ow"><-</span> <span class="n">captureAndCloseOverInputState</span> <span class="n">mb</span>
  <span class="n">restoreM</span> <span class="o">=<<</span> <span class="n">liftBase</span> <span class="p">(</span><span class="n">bar</span> <span class="n">ma'</span> <span class="n">mb'</span><span class="p">)</span></code></pre><p>However, that would capture the monadic state twice, which is rather inefficient. By using <code>liftBaseWith</code>, the state capturing is done just once, and it’s shared between all calls to <code>runInBase</code>:</p><pre><code class="pygments"><span class="nf">bar'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">bar'</span> <span class="n">ma</span> <span class="n">mb</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span>
    <span class="n">bar</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">ma</span><span class="p">)</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">mb</span><span class="p">)</span>
  <span class="n">restoreM</span> <span class="n">s</span></code></pre><p>By providing a “running” function (<code>runInBase</code>) instead of direct access to the input state, <code>liftBaseWith</code> allows sharing the captured input state between multiple actions without exposing it directly.</p><h3><a name="sidebar-continuation-passing-and-impredicativity"></a>Sidebar: continuation-passing and impredicativity</h3><p>One last point before we move on: although the above explains why <code>captureAndCloseOverInputState</code> is insufficient, you may be left wondering why <code>liftBaseWith</code> can’t just <em>return</em> <code>runInBase</code>. Why does it need to be given a continuation? After all, it would be nicer if we could just write this:</p><pre><code class="pygments"><span class="nf">bar'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">bar'</span> <span class="n">ma</span> <span class="n">mb</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">runInBase</span> <span class="ow"><-</span> <span class="n">askRunInBase</span>
  <span class="n">restoreM</span> <span class="o">=<<</span> <span class="n">liftBase</span> <span class="p">(</span><span class="n">bar</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">ma</span><span class="p">)</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">mb</span><span class="p">))</span></code></pre><p>To understand the problem with a hypothetical <code>askRunInBase</code> function, remember that the type of <code>runInBase</code> is polymorphic:</p><pre><code class="pygments"><span class="nf">runInBase</span> <span class="ow">::</span> <span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">b</span> <span class="p">(</span><span class="kt">StM</span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span></code></pre><p>This is important, since if you need to lift a function with a type like</p><pre><code class="pygments"><span class="nf">baz</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">b</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">c</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">b</span> <span class="n">c</span><span class="p">)</span></code></pre><p>then you’ll want to instantiate that <code>a</code> variable with two different types. 
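</p><p>A small sketch, deliberately unrelated to <code>monad-control</code>, shows what using one polymorphic function at two types looks like in continuation-passing style. The names <code>withRunner</code>, <code>run</code>, and <code>demo</code> are hypothetical, and <code>maybeToList</code> merely stands in for <code>runInBase</code>:</p><pre><code class="pygments">{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Maybe (maybeToList)

-- Hypothetical stand-in for liftBaseWith: it hands a polymorphic
-- "runner" to a continuation instead of returning it.
withRunner :: ((forall a. Maybe a -> [a]) -> r) -> r
withRunner k = k maybeToList

-- Inside the continuation, run is used at both Int and String.
demo :: ([Int], [String])
demo = withRunner (\run -> (run (Just 1), run (Just "x")))

main :: IO ()
main = print demo -- ([1],["x"])
</code></pre><p>If <code>run</code> were monomorphic, the second use would be a type error; the <code>forall</code> in the argument position is what allows both.</p><p>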
We’d need to retain that power in <code>askRunInBase</code>, so it would need to have the following type:</p><pre><code class="pygments"><span class="nf">askRunInBase</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="p">(</span><span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">b</span> <span class="p">(</span><span class="kt">StM</span> <span class="n">m</span> <span class="n">a</span><span class="p">))</span></code></pre><p>Sadly, that type is illegal in Haskell. Type constructors must be applied to monomorphic types, but in the above type signature, <code>m</code> is applied to a polymorphic type.<sup><a href="#footnote-1" id="footnote-ref-1-1">2</a></sup> The <code>RankNTypes</code> GHC extension introduces a single exception: the <code>(->)</code> type constructor is special and may be applied to polymorphic types. That’s why <code>liftBaseWith</code> is legal, but <code>askRunInBase</code> is not: since <code>liftBaseWith</code> is passed a higher-order function that receives <code>runInBase</code> as an argument, the polymorphic type appears immediately under an application of <code>(->)</code>, which is allowed.</p><p>The aforementioned restriction means we’re basically out of luck, but if you <em>really</em> want <code>askRunInBase</code>, there is a workaround. 
GHC is perfectly alright with a field of a datatype being polymorphic, so we can define a newtype that wraps a suitably-polymorphic function:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">RunInBase</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=</span> <span class="kt">RunInBase</span> <span class="p">(</span><span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">b</span> <span class="p">(</span><span class="kt">StM</span> <span class="n">m</span> <span class="n">a</span><span class="p">))</span></code></pre><p>We can now alter <code>askRunInBase</code> to return our newtype, and we can implement it in terms of <code>liftBaseWith</code>:<sup><a href="#footnote-2" id="footnote-ref-2-1">3</a></sup></p><pre><code class="pygments"><span class="nf">askRunInBase</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="n">b</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="p">(</span><span class="kt">RunInBase</span> <span class="n">b</span> <span class="n">m</span><span class="p">)</span>
<span class="nf">askRunInBase</span> <span class="ow">=</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span> <span class="n">pure</span> <span class="o">$</span> <span class="kt">RunInBase</span> <span class="n">runInBase</span></code></pre><p>To use <code>askRunInBase</code>, we have to pattern match on the <code>RunInBase</code> constructor, but it isn’t very noisy, since we can do it directly in a <code>do</code> binding. For example, we could implement a lifted version of <code>baz</code> this way:</p><pre><code class="pygments"><span class="nf">baz'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">b</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span>
<span class="nf">baz'</span> <span class="n">ma</span> <span class="n">mb</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="kt">RunInBase</span> <span class="n">runInBase</span> <span class="ow"><-</span> <span class="n">askRunInBase</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">liftBase</span> <span class="p">(</span><span class="n">baz</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">ma</span><span class="p">)</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">mb</span><span class="p">))</span>
  <span class="n">bitraverse</span> <span class="n">restoreM</span> <span class="n">restoreM</span> <span class="n">s</span></code></pre><p>As of version 1.0.2.3, <code>monad-control</code> does not provide a newtype like <code>RunInBase</code>, so it also doesn’t provide a function like <code>askRunInBase</code>. For now, you’ll have to use <code>liftBaseWith</code>, but it might be a useful future addition to the library.</p><h2><a name="pitfalls"></a>Pitfalls</h2><p>At this point in the blog post, we’ve covered the essentials of <code>MonadBaseControl</code>: how it works, how it’s designed, and how you might go about using it. However, so far, we’ve only considered situations where <code>MonadBaseControl</code> works well, and I’ve intentionally avoided examples where the technique breaks down. In this section, we’re going to take a look at the pitfalls and drawbacks of <code>MonadBaseControl</code>, plus some ways they can be mitigated.</p><h3><a name="no-polymorphism-no-lifting"></a>No polymorphism, no lifting</h3><p>All of the pitfalls of <code>MonadBaseControl</code> stem from the same root problem, and that’s the particular technique it uses to save and restore monadic state. We’ll start by considering one of the simplest ways that technique is thwarted, and that’s monomorphism. Consider the following two functions:</p><pre><code class="pygments"><span class="nf">poly</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span>
<span class="nf">mono</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="kt">X</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="kt">X</span></code></pre><p>Even after all we’ve covered, it may surprise you to learn that although <code>poly</code> can be easily lifted to <code>MonadBaseControl IO m => m a -> m a</code>, it’s <em>impossible</em> to lift <code>mono</code> to <code>MonadBaseControl IO m => m X -> m X</code>. It’s a little unintuitive, as we often think of polymorphic types as being more complicated (so surely lifting polymorphic functions ought to be harder), but in fact, it’s the flexibility of polymorphism that allows <code>MonadBaseControl</code> to work in the first place.</p><p>To understand the problem, remember that when we lift a function of type <code>forall a. b a -> b a</code> using <code>MonadBaseControl</code>, we actually instantiate <code>a</code> to <code>(StM m c)</code>. That produces a function of type <code>b (StM m c) -> b (StM m c)</code>, which is isomorphic to the <code>m c -> m c</code> type we want. The instantiation step is easily overlooked, but it’s crucial, since otherwise we have no way to thread the state through the otherwise opaque function we’re trying to lift!</p><p>In the case of <code>mono</code>, that’s exactly the problem we’re faced with. <code>mono</code> will not accept an <code>IO (StM m X)</code> as an argument, only precisely an <code>IO X</code>, so we can’t pass along the monadic state. For all its machinery, <code>MonadBaseControl</code> is no help at all if no polymorphism is involved. Trying to generalize <code>mono</code> without modifying its implementation is a lost cause.</p><h3><a name="the-dangers-of-discarded-state"></a>The dangers of discarded state</h3><p>Our inability to lift <code>mono</code> is frustrating, but at least it’s conclusively impossible. 
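</p><p>For contrast with the cases that follow, the well-behaved lifting of <code>poly</code> from the previous section can be sketched as below. This is only a minimal sketch under stated assumptions: the body of <code>poly</code> is a hypothetical stand-in (<code>id</code>) for an opaque operation, and the name <code>poly'</code> is introduced here purely for illustration.</p><pre><code class="pygments"><span class="cm">{-# LANGUAGE FlexibleContexts #-}</span>
<span class="kr">import</span> <span class="nn">Control.Monad.Trans.Control</span> <span class="p">(</span><span class="kt">MonadBaseControl</span><span class="p">,</span> <span class="n">liftBaseWith</span><span class="p">,</span> <span class="n">restoreM</span><span class="p">)</span>

<span class="c1">-- Hypothetical stand-in for an opaque, polymorphic IO operation.</span>
<span class="nf">poly</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span>
<span class="nf">poly</span> <span class="ow">=</span> <span class="n">id</span>

<span class="c1">-- Inside the lambda, poly is used at type IO (StM m a) -> IO (StM m a),</span>
<span class="c1">-- so the captured state comes back out and can be restored afterwards.</span>
<span class="nf">poly'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">poly'</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span> <span class="n">poly</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">m</span><span class="p">)</span>
  <span class="n">restoreM</span> <span class="n">s</span></code></pre><p>The detail to watch is the final call to <code>restoreM</code>, which is only possible because <code>poly</code> hands back a value of the same type it was given. 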
In practice, however, many functions lie in an insidious in-between: polymorphic enough to be lifted, but not without compromises. The simplest of these functions have types such as the following:</p><pre><code class="pygments"><span class="nf">sideEffect</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span></code></pre><p>Unlike <code>mono</code>, it’s entirely possible to lift <code>sideEffect</code>:</p><pre><code class="pygments"><span class="nf">sideEffect'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
<span class="nf">sideEffect'</span> <span class="n">m</span> <span class="ow">=</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span> <span class="n">sideEffect</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">m</span><span class="p">)</span></code></pre><p>This definition typechecks, but you may very well prefer it didn’t, since it has a serious problem: any changes made by <code>m</code> to the monadic state are completely discarded once <code>sideEffect'</code> returns! Since <code>sideEffect'</code> never calls <code>restoreM</code>, there’s no way the state of <code>m</code> can be any different from the original state, but it’s impossible to call <code>restoreM</code> since we don’t actually get an <code>StM m ()</code> result from <code>sideEffect</code>.</p><p>Sometimes this may be acceptable, since some monad transformers don’t actually have any output state anyway, such as <code>ReaderT r</code>. In other cases, however, <code>sideEffect'</code> could be a bug waiting to happen. One way to make <code>sideEffect'</code> safe would be to add a <code>StM m a ~ a</code> constraint to its context, since that guarantees the monad transformers being lifted through are stateless, and nothing is actually being discarded. Of course, that significantly restricts the set of monad transformers that can be lifted through.</p><h4><a name="rewindable-state"></a>Rewindable state</h4><p>One scenario where state discarding can actually be useful is operations with so-called rewindable or transactional state. 
The most common example of such an operation is <code>catch</code>:</p><pre><code class="pygments"><span class="nf">catch</span> <span class="ow">::</span> <span class="kt">Exception</span> <span class="n">e</span> <span class="ow">=></span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">(</span><span class="n">e</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span></code></pre><p>When lifted, state changes from the action <em>or</em> from the exception handler will be “committed,” but never both. If an exception is raised during the computation, any state changes the action made before the exception are discarded (“rewound”), giving <code>catch</code> a kind of backtracking semantics. This behavior arises naturally from the way a lifted version of <code>catch</code> must be implemented:</p><pre><code class="pygments"><span class="nf">catch'</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">Exception</span> <span class="n">e</span><span class="p">,</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">(</span><span class="n">e</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">catch'</span> <span class="n">m</span> <span class="n">f</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span>
    <span class="n">catch</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">m</span><span class="p">)</span> <span class="p">(</span><span class="n">runInBase</span> <span class="o">.</span> <span class="n">f</span><span class="p">)</span>
  <span class="n">restoreM</span> <span class="n">s</span></code></pre><p>If <code>m</code> raises an exception, it will never return an <code>StM m a</code> value, so there’s no way to get ahold of any of the state changes that happened before the exception. Therefore, the only option is to discard that state.</p><p>This behavior is actually quite useful, and it’s definitely not unreasonable. However, useful or not, it’s inconsistent with state changes to mutable values like <code>IORef</code>s or <code>MVar</code>s (they stay modified whether an exception is raised or not), so it can still be a gotcha. Either way, it’s worth being aware of.</p><h4><a name="partially-discarded-state"></a>Partially discarded state</h4><p>The next function we’re going to examine is <code>finally</code>:</p><pre><code class="pygments"><span class="nf">finally</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">b</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="n">a</span></code></pre><p>This function has a similar type to <code>catch</code>, and it even has similar semantics. Like <code>catch</code>, <code>finally</code> can be lifted, but unlike <code>catch</code>, its state <em>can’t</em> be given any satisfying treatment. The only way to implement a lifted version is</p><pre><code class="pygments"><span class="nf">finally'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">b</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">finally'</span> <span class="n">ma</span> <span class="n">mb</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">s</span> <span class="ow"><-</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span>
    <span class="n">finally</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">ma</span><span class="p">)</span> <span class="p">(</span><span class="n">runInBase</span> <span class="n">mb</span><span class="p">)</span>
  <span class="n">restoreM</span> <span class="n">s</span></code></pre><p>which always discards all state changes made by the second argument. This is clear just from looking at <code>finally</code>’s type: since <code>b</code> doesn’t appear anywhere in the return type, there’s simply no way to access that action’s result, and therefore no way to access its modified state.</p><p>However, don’t despair: there actually <em>is</em> a way to produce a lifted version of <code>finally</code> that preserves all state changes. It can’t be done by lifting <code>finally</code> directly, but if we reimplement <code>finally</code> in terms of simpler primitives that are more amenable to lifting, we can produce a lifted version of <code>finally</code> that preserves all the state:<sup><a href="#footnote-3" id="footnote-ref-3-1">4</a></sup></p><pre><code class="pygments"><span class="nf">finally'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">b</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">finally'</span> <span class="n">ma</span> <span class="n">mb</span> <span class="ow">=</span> <span class="n">mask'</span> <span class="o">$</span> <span class="nf">\</span><span class="n">restore</span> <span class="ow">-></span> <span class="kr">do</span>
  <span class="n">a</span> <span class="ow"><-</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span>
    <span class="n">try</span> <span class="p">(</span><span class="n">runInBase</span> <span class="p">(</span><span class="n">restore</span> <span class="n">ma</span><span class="p">))</span>
  <span class="kr">case</span> <span class="n">a</span> <span class="kr">of</span>
    <span class="kt">Left</span> <span class="n">e</span> <span class="ow">-></span> <span class="n">mb</span> <span class="o">*></span> <span class="n">liftBase</span> <span class="p">(</span><span class="n">throwIO</span> <span class="p">(</span><span class="n">e</span> <span class="ow">::</span> <span class="kt">SomeException</span><span class="p">))</span>
    <span class="kt">Right</span> <span class="n">s</span> <span class="ow">-></span> <span class="n">restoreM</span> <span class="n">s</span> <span class="o"><*</span> <span class="n">mb</span></code></pre><p>This illustrates an important (and interesting) point about <code>MonadBaseControl</code>: whether or not an operation can be made state-preserving is not a fundamental property of the operation’s type, but rather a property of the types of the exposed primitives. Given the right primitives and a bit of cleverness, there is sometimes a way to implement a state-preserving variant of an operation that might otherwise seem unliftable.</p><h4><a name="forking-state"></a>Forking state</h4><p>As a final example, I want to look at a case where the state may not actually be discarded <em>per se</em>, just inaccessible. Consider the type of <code>forkIO</code>:</p><pre><code class="pygments"><span class="nf">forkIO</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="nb">()</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="kt">ThreadId</span></code></pre><p>Although <code>forkIO</code> isn’t actually polymorphic in its argument, we can convert <em>any</em> <code>IO</code> action to one that produces <code>()</code> via <code>void</code>, so it might as well be. Therefore, we can lift <code>forkIO</code> in much the same way we did with <code>sideEffect</code>:</p><pre><code class="pygments"><span class="nf">forkIO'</span> <span class="ow">::</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="nb">()</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">ThreadId</span>
<span class="nf">forkIO'</span> <span class="n">m</span> <span class="ow">=</span> <span class="n">liftBaseWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">runInBase</span> <span class="ow">-></span> <span class="n">forkIO</span> <span class="p">(</span><span class="n">void</span> <span class="o">$</span> <span class="n">runInBase</span> <span class="n">m</span><span class="p">)</span></code></pre><p>As with <code>sideEffect</code>, we can’t recover the output state, but in this case, there’s a fundamental reason that goes deeper than the types: we’ve forked off a concurrent computation! We’ve therefore split the state in two, which might be what we want… but it also might not. <code>forkIO</code> is yet another illustration that it’s important to think about the state-preservation semantics when using <code>MonadBaseControl</code>, or you may end up with a bug!</p><h2><a name="monadbasecontrol-in-context"></a><code>MonadBaseControl</code> in context</h2><p>Congratulations: you’ve made it through most of this blog post. If you’ve followed everything so far, you now understand <code>MonadBaseControl</code>. All the tricky parts are over. However, before wrapping up, I’d like to add a little extra information about how <code>MonadBaseControl</code> relates to various other parts of the Haskell ecosystem. In practice, that information can be as important as understanding <code>MonadBaseControl</code> itself.</p><h3><a name="the-remainder-of-monad-control"></a>The remainder of <code>monad-control</code></h3><p>If you look at <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html">the documentation for <code>monad-control</code></a>, you’ll find that it provides more than just the <code>MonadBaseControl</code> typeclass. 
I’m not going to cover everything else in detail in this blog post, but I do want to touch upon it briefly.</p><p>First off, you should definitely take a look at the handful of helper functions provided by <code>monad-control</code>, such as <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#v:control"><code>control</code></a> and <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#v:liftBaseOp_"><code>liftBaseOp_</code></a>. These functions provide support for lifting common function types without having to use <code>liftBaseWith</code> directly. It’s useful to understand <code>liftBaseWith</code>, since it’s the most general way to use <code>MonadBaseControl</code>, but in practice, it is simpler and more readable to use the more specialized functions wherever possible. Many of the examples in this very blog post could be simplified using them, and I only stuck to <code>liftBaseWith</code> to introduce as few new concepts at a time as possible.</p><p>Second, I’d like to mention the related <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#t:MonadTransControl"><code>MonadTransControl</code></a> typeclass. You hopefully remember from earlier in the blog post how we defined <code>MonadBaseControl</code> instances inductively so that we could lift all the way down to the base monad. 
<code>MonadTransControl</code> is like <code>MonadBaseControl</code> if it intentionally did <em>not</em> do that—it allows lifting through a single transformer at a time, rather than through all of them at once.</p><p>Usually, <code>MonadTransControl</code> is not terribly useful to use directly (though I did use it once <a href="/blog/2017/04/28/lifts-for-free-making-mtl-typeclasses-derivable/#making-mtls-classes-derivable">in a previous blog post of mine</a> to help derive instances of mtl-style classes), but it <em>is</em> useful for implementing <code>MonadBaseControl</code> instances for your own transformers. If you define a <code>MonadTransControl</code> instance for your monad transformer, you can get a <code>MonadBaseControl</code> implementation for free using the provided <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#t:ComposeSt"><code>ComposeSt</code></a>, <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#v:defaultLiftBaseWith"><code>defaultLiftBaseWith</code></a>, and <a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#v:defaultRestoreM"><code>defaultRestoreM</code></a> bindings; see the documentation for more details.</p><h3><a name="lifted-base-and-lifted-async"></a><code>lifted-base</code> and <code>lifted-async</code></h3><p>If you’re going to use <code>MonadBaseControl</code>, the <a href="http://hackage.haskell.org/package/lifted-base"><code>lifted-base</code></a> and <a href="http://hackage.haskell.org/package/lifted-async"><code>lifted-async</code></a> packages are good to know about. As their names imply, they provide lifted versions of bindings in the <code>base</code> and <code>async</code> packages, so you can use them directly without needing to lift them yourself. 
For example, if you needed a lifted version of <code>mask</code> from <code>Control.Exception</code>, you could swap it for the <code>mask</code> export from <code>Control.Exception.Lifted</code>, and everything would mostly just work (though always be sure to check the documentation for any caveats on state discarding).</p><h3><a name="relationship-to-monadunliftio"></a>Relationship to <code>MonadUnliftIO</code></h3><p>Recently, FP Complete has developed the <a href="https://hackage.haskell.org/package/unliftio"><code>unliftio</code></a> package as an alternative to <code>monad-control</code>. It provides the <a href="https://hackage.haskell.org/package/unliftio-core-0.1.2.0/docs/Control-Monad-IO-Unlift.html#t:MonadUnliftIO"><code>MonadUnliftIO</code></a> typeclass, which is similar in spirit to <code>MonadBaseControl</code>, but heavily restricted: it is specialized to <code>IO</code> as the base monad, and it <em>only</em> allows instances for stateless monads, such as <code>ReaderT</code>. This is designed to encourage the so-called <a href="https://www.fpcomplete.com/blog/2017/06/readert-design-pattern"><code>ReaderT</code> design pattern</a>, which avoids ever using stateful monads like <code>ExceptT</code> or <code>StateT</code> over <code>IO</code>, encouraging the use of <code>IO</code> exceptions and mutable variables (e.g. <code>MVar</code>s or <code>TVar</code>s) instead.</p><p>I should be clear: I really like most of what FP Complete has done—to this day, I still use <code>stack</code> as my Haskell build tool of choice—and I think the suggestions given in the aforementioned “<code>ReaderT</code> design pattern” blog post have real weight to them. I have a deep respect for Michael Snoyman’s commitment to opinionated, user-friendly tools and libraries. 
But truthfully, I can’t stand <code>MonadUnliftIO</code>.</p><p><code>MonadUnliftIO</code> is designed to avoid all the complexity around state discarding that <code>MonadBaseControl</code> introduces, and on its own, that’s a noble goal. Safety first, after all. The problem is that <code>MonadUnliftIO</code> really is extremely limiting, and what’s more, it can actually be trivially encoded in terms of <code>MonadBaseControl</code> as follows:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">MonadUnliftIO</span> <span class="n">m</span> <span class="ow">=</span> <span class="p">(</span><span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="n">m</span><span class="p">,</span> <span class="n">forall</span> <span class="n">a</span><span class="o">.</span> <span class="kt">StM</span> <span class="n">m</span> <span class="n">a</span> <span class="o">~</span> <span class="n">a</span><span class="p">)</span></code></pre><p>This alias can be used to define safe, lifted functions that never discard state while still allowing functions that <em>can</em> be safely lifted through stateful transformers to do so. Indeed, the <a href="https://hackage.haskell.org/package/lifted-async-0.10.0.4/docs/Control-Concurrent-Async-Lifted-Safe.html"><code>Control.Concurrent.Async.Lifted.Safe</code></a> module from <code>lifted-async</code> does exactly that (albeit with a slightly different formulation than the above alias).</p><p>To be fair, the <code>unliftio</code> README does address this in its <a href="https://github.com/fpco/unliftio/tree/bb2e26e7fbbaebb15555f417ba9753a76b3218b2/unliftio#monad-control">comparison section</a>:</p><blockquote><p><code>monad-control</code> allows us to unlift both styles. In theory, we could write a variant of <code>lifted-base</code> that never does state discards […] In other words, this is an advantage of <code>monad-control</code> over <code>MonadUnliftIO</code>. 
We've avoided providing any such extra typeclass in this package though, for two reasons:</p><ul><li><p><code>MonadUnliftIO</code> is a simple typeclass, easy to explain. We don't want to complicated [sic] matters […]</p></li><li><p>Having this kind of split would be confusing in user code, when suddenly [certain operations are] not available to us.</p></li></ul></blockquote><p>In other words, the authors of <code>unliftio</code> felt that <code>MonadBaseControl</code> was simply not worth the complexity, and they could get away with <code>MonadUnliftIO</code>. Frankly, if you feel the same way, by all means, use <code>unliftio</code>. I just found it too limiting given the way I write Haskell, plain and simple.</p><h2><a name="recap"></a>Recap</h2><p>So ends another long blog post. As often seems to be the case, I set out to write something short, but I ended up writing well over 5,000 words. I suppose that means I learned something from this experience, too: <code>MonadBaseControl</code> is more complicated than I had anticipated! Maybe there’s something to take away from that.</p><p>In any case, it’s over now, so I’d like to briefly summarize what we’ve covered:</p><ul><li><p><a href="https://hackage.haskell.org/package/monad-control-1.0.2.3/docs/Control-Monad-Trans-Control.html#t:MonadBaseControl"><code>MonadBaseControl</code></a> allows us to lift higher-order monadic operations.</p></li><li><p>It operates by capturing the current monadic state and explicitly threading it through the action in the base monad before restoring it.</p></li><li><p>That technique works well for polymorphic operations of the type <code>forall a. 
b a -> b a</code>, but it can be tricky or even impossible for more complex operations, sometimes leading to discarded state.</p><p>This can sometimes be mitigated by restricting certain operations to stateless monads using a <code>StM m a ~ a</code> constraint, or by reimplementing the operation in terms of simpler primitives.</p></li><li><p>The <a href="http://hackage.haskell.org/package/lifted-base"><code>lifted-base</code></a> and <a href="http://hackage.haskell.org/package/lifted-async"><code>lifted-async</code></a> packages provide lifted versions of existing operations, avoiding the need to lift them yourself.</p></li></ul><p>As with many abstractions in Haskell, don’t worry too much if you don’t have a completely firm grasp of <code>MonadBaseControl</code> at first. Insight often comes with repeated experience, and <code>monad-control</code> can still be used in useful ways even without a perfect understanding. My hope is that this blog post has helped you build intuitions about <code>MonadBaseControl</code> even if some of the underlying machinery remains a little fuzzy, and I hope it can also serve as a reference for those who want or need to understand (or just be reminded of) all the little details.</p><p>Finally, I’ll admit <code>MonadBaseControl</code> isn’t especially elegant or beautiful as Haskell abstractions go. In fact, in many ways, it’s a bit of a kludge! Perhaps, in time, effect systems will evolve and mature so that it and its ilk are no longer necessary, and they may become distant relics of an inferior past. But in the meantime, it’s here, it’s useful, and I think it’s worth embracing. If you’ve shied away from it in the past, I hope I’ve illuminated it enough to make you consider giving it another try.</p><ol class="footnotes"><li id="footnote-0"><p>One example of a function with that type is <code>mask_</code>. 
<a href="#footnote-ref-0-1">↩</a></p></li><li id="footnote-1"><p>Types with polymorphic types under type constructors are called <em>impredicative</em>. GHC technically has limited support for impredicativity via the <code>ImpredicativeTypes</code> language extension, but as of GHC 8.8, it has been fairly broken for some time. A fix is apparently being worked on, but even if that effort is successful, I don’t know what impact it will have on type inference. <a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>Note that <code>askRunInBase = liftBaseWith (pure . RunInBase)</code> does <em>not</em> typecheck, as it would require impredicative polymorphism: it would require instantiating the type of <code>(.)</code> with polymorphic types. The version using <code>($)</code> works because GHC actually has special typechecking rules for <code>($)</code>! Effectively, <code>f $ x</code> is really syntax in GHC. <a href="#footnote-ref-2-1">↩</a></p></li><li id="footnote-3"><p>Assume that <code>mask'</code> is a suitably lifted version of <code>mask</code> (which can in fact be made state-preserving). <a href="#footnote-ref-3-1">↩</a></p></li></ol></article>Defeating Racket’s separate compilation guarantee2019-04-21T00:00:00Z2019-04-21T00:00:00ZAlexis King<article><p>Being a self-described <a href="https://felleisen.org/matthias/manifesto/sec_pl-pl.html">programming-language programming language</a> is an ambitious goal. To preserve predictability while permitting linguistic extension, Racket comes equipped with a module system carefully designed to accommodate <a href="https://www.cs.utah.edu/plt/publications/macromod.pdf">composable and compilable macros</a>. One of the module system’s foundational properties is its <a href="https://docs.racket-lang.org/reference/eval-model.html#%28part._separate-compilation%29"><em>separate compilation guarantee</em></a>, which imposes strong, unbreakable limits on the extent of compile-time side-effects. 
It is <em>essential</em> for preserving static guarantees in a world where compiling a module can execute arbitrary code, and despite numerous unsafe trapdoors that have crept into Racket since its birth as PLT Scheme, none have ever given the programmer the ability to cheat it.</p><p>Yet today, in this blog post, we’re going to do exactly that.</p><h2><a name="what-is-the-separate-compilation-guarantee"></a>What is the separate compilation guarantee?</h2><p>Before we get to the fun part (i.e. breaking things), let’s go over some fundamentals so we understand what we’re breaking. The authoritative source for the separate compilation guarantee is <a href="https://docs.racket-lang.org/reference/eval-model.html#%28part._separate-compilation%29">the Racket reference</a>, but it is dense, as authoritative sources tend to be. Although I enjoy reading technical manuals for sport, it is my understanding that not all the people who read this blog are as strange as I am, so let’s start with a quick primer, instead. (If you’re already an expert, feel free to <a href="#section:main-start">skip to the next section</a>.)</p><p>Racket is a macro-enabled programming language. In Racket, a macro is a user-defined, code-to-code transformation that occurs at compile-time. These transformations cannot make arbitrary changes to the program—in Racket, they are usually required to be <em>local</em>, affecting a single expression or definition at a time—but they may be implemented using arbitrary code. This means that a macro can, if it so desires, read the SSH keys off your filesystem and issue an HTTP request to send them someplace.</p><p>That kind of attack is bad, admittedly, but it’s also <em>uninteresting</em>: Racket allows you to do all that and then some, making no attempt to prevent it.<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup> Racket calls these “external effects,” things that affect state outside of the programming language. 
They sound scary, but in practice, <em>internal effects</em>—effects that mutate state inside the programming language—are a much bigger obstacle to practical programming. Let’s take a look at why.</p><p>Let’s say we have a module with some global, mutable state. Perhaps it is used to keep track of a set of delicious foods:</p><pre><code class="pygments"><span class="c1">;; foods.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">provide</span> <span class="n">delicious-food?</span> <span class="n">add-delicious-food!</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">delicious-foods</span> <span class="p">(</span><span class="nb">mutable-set</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">delicious-food?</span> <span class="n">food</span><span class="p">)</span>
<span class="p">(</span><span class="nb">set-member?</span> <span class="n">delicious-foods</span> <span class="n">food</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">add-delicious-food!</span> <span class="n">new-food</span><span class="p">)</span>
<span class="p">(</span><span class="nb">set-add!</span> <span class="n">delicious-foods</span> <span class="n">new-food</span><span class="p">))</span></code></pre><p>Using this interface, let’s write a program that checks if a particular food, given as a command-line argument, is delicious:</p><pre><code class="pygments"><span class="c1">;; check-food.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="s2">"foods.rkt"</span><span class="p">)</span>
<span class="p">(</span><span class="n">add-delicious-food!</span> <span class="s2">"pineapple"</span><span class="p">)</span>
<span class="p">(</span><span class="n">add-delicious-food!</span> <span class="s2">"sushi"</span><span class="p">)</span>
<span class="p">(</span><span class="n">add-delicious-food!</span> <span class="s2">"cheesecake"</span><span class="p">)</span>
<span class="p">(</span><span class="k">command-line</span>
<span class="kd">#:args</span> <span class="p">[</span><span class="n">food-to-check</span><span class="p">]</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">delicious-food?</span> <span class="n">food-to-check</span><span class="p">)</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s2">"~a is a delicious food.</span><span class="se">\n</span><span class="s2">"</span> <span class="n">food-to-check</span><span class="p">)</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s2">"~a is not delicious.</span><span class="se">\n</span><span class="s2">"</span> <span class="n">food-to-check</span><span class="p">)))</span></code></pre><pre><code class="pygments">$ racket check-food.rkt cheesecake
cheesecake is a delicious food.
$ racket check-food.rkt licorice
licorice is not delicious.</code></pre><p>Exhilarating. (Sorry, licorice fans.) But what if a <em>macro</em> were to call <code>add-delicious-food!</code>? What would happen? For example, what if we wrote a macro to add a lot of foods at once?<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup></p><pre><code class="pygments"><span class="p">(</span><span class="k">require</span> <span class="n">syntax/parse/define</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">add-food-combinations!</span> <span class="p">[</span><span class="n">fst:string</span> <span class="k">...</span><span class="p">]</span>
<span class="p">[</span><span class="n">snd:string</span> <span class="k">...</span><span class="p">])</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">for*</span> <span class="p">([</span><span class="n">fst-str</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="nb">syntax->datum</span> <span class="o">#'</span><span class="p">[</span><span class="n">fst</span> <span class="k">...</span><span class="p">]))]</span>
<span class="p">[</span><span class="n">snd-str</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="nb">syntax->datum</span> <span class="o">#'</span><span class="p">[</span><span class="n">snd</span> <span class="k">...</span><span class="p">]))])</span>
<span class="p">(</span><span class="n">add-delicious-food!</span> <span class="p">(</span><span class="nb">string-append</span> <span class="n">fst-str</span> <span class="s2">" "</span> <span class="n">snd-str</span><span class="p">)))]</span>
<span class="p">(</span><span class="nb">void</span><span class="p">))</span>
<span class="c1">; should add “fried chicken,” “roasted chicken,” “fried potato,” and “roasted potato”</span>
<span class="p">(</span><span class="n">add-food-combinations!</span> <span class="p">[</span><span class="s2">"fried"</span> <span class="s2">"roasted"</span><span class="p">]</span> <span class="p">[</span><span class="s2">"chicken"</span> <span class="s2">"potato"</span><span class="p">])</span></code></pre><p>Now, what do you think executing <code>racket check-food.rkt 'fried chicken'</code> will do?</p><p>Clearly, the program should print <code>fried chicken is a delicious food</code>, and indeed, many traditional Lisp systems would happily produce such a result. After all, running <code>racket check-food.rkt 'fried chicken'</code> must load the source code inside <code>check-food.rkt</code>, expand and compile it, then run the result. While the program is being expanded, the compile-time calls to <code>add-delicious-food!</code> should add new elements to the <code>delicious-foods</code> set, so when the program is executed, the string <code>"fried chicken"</code> ought to be in it.</p><p>But if you actually try this yourself, you will find that <em>isn’t</em> what happens. Instead, Racket rejects the program:</p><pre><code class="pygments">$ racket check-food.rkt <span class="s1">'fried chicken'</span>
check-food.rkt:12:11: add-delicious-food!: reference to an unbound identifier
at phase: <span class="m">1</span><span class="p">;</span> the transformer environment
in: add-delicious-food!</code></pre><p>Why does Racket reject this program? Well, consider that Racket allows programs to be pre-compiled using <code>raco make</code>, doing all the work of macroexpansion and compilation to bytecode ahead of time. Subsequent runs of the program will use the pre-compiled version, without having to run all the macros again. This is a problem, since expanding the <code>add-food-combinations!</code> macro had side-effects that our program depended on!</p><p>If Racket allowed the above program, it might do different things depending on whether it was pre-compiled. Running directly from source code might treat <code>'fried chicken'</code> as a delicious food, while running from pre-compiled bytecode might not. Racket considers this unacceptable, so it disallows the program entirely.</p><h3><a name="preserving-separate-compilation-via-phases"></a>Preserving separate compilation via phases</h3><p>Hopefully, you are now mostly convinced that the above program is a bad one, but you might have some lingering doubts. You might, for example, wonder if Racket disallows mutable compile-time state entirely. That is not the case—Racket really does allow everything that happens at runtime to happen at compile-time—but it does prevent compile-time and run-time state from ever <em>interacting</em>. Racket stratifies every program into a compile-time part and a run-time part, and it restricts communication between them to limited, well-defined channels (mainly via expanding to code that does something at run-time).</p><p>Racket calls this system of stratification <em>phases</em>. Code that executes at run-time belongs to the <em>run-time phase</em>, while code that executes at compile-time (i.e. macros) belongs to the <em>compile-time phase</em>. 
When a variable is defined, it is always defined in a particular phase, so bindings declared with <code>define</code> can only be used at run-time, while bindings declared with <code>define-for-syntax</code> can only be used at compile-time. Since <code>add-delicious-food!</code> was declared using <code>define</code>, it was not allowed (and in fact was not even visible) in the body of the <code>add-food-combinations!</code> macro.</p><p>While the whole macro system could work precisely as just described, such a strict stratification would be incredibly rigid. Since every definition would belong to either run-time or compile-time, but never both, reusing run-time code to implement macros would be impossible. While the example in the previous section might make it seem like that’s a good thing, it very often isn’t: imagine if general-purpose functions like <code>map</code> and <code>filter</code> all needed to be written twice!</p><p>To avoid this problem, Racket allows modules to be imported at both run-time and compile-time, so long as it’s done explicitly. Writing <code>(require "some-library.rkt")</code> requires <code>some-library.rkt</code> for run-time code, but writing <code>(require (for-syntax "some-library.rkt"))</code> requires it for compile-time code. Requiring a module <code>for-syntax</code> is sort of like implicitly adjusting all of its uses of <code>define</code> to be <code>define-for-syntax</code>, instead, effectively shifting all the code from run-time to compile-time. This kind of operation is therefore known as <em>phase shifting</em> in Racket terminology.</p><p>We can use phase shifting to make the program we wrote compile. 
If we adjust the <code>require</code> at the beginning of our program, then we can ensure <code>add-delicious-food!</code> is visible to both the run-time and compile-time parts of <code>check-food.rkt</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">require</span> <span class="s2">"foods.rkt"</span> <span class="p">(</span><span class="k">for-syntax</span> <span class="s2">"foods.rkt"</span><span class="p">))</span></code></pre><p>Now our program compiles. However, if you’ve been following everything carefully, you should be wondering why! According to the last section, sharing state between run-time and compile-time fundamentally can’t work without introducing inconsistencies between uncompiled and pre-compiled code. And that’s true—such a thing would cause all sorts of problems, and Racket doesn’t allow it. If you run the program, whether pre-compiled or not, you’ll find it always does the same thing:</p><pre><code class="pygments">$ racket check-food.rkt <span class="s1">'fried chicken'</span>
fried chicken is not delicious.</code></pre><p>This seems rather confusing. What happened to the calls to <code>add-delicious-food!</code> inside our <code>add-food-combinations!</code> macro? If we stick a <code>printf</code> inside <code>add-delicious-food!</code>, we’ll find that it really does get called:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">add-delicious-food!</span> <span class="n">new-food</span><span class="p">)</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s2">"Registering ~a as a delicious food.</span><span class="se">\n</span><span class="s2">"</span> <span class="n">new-food</span><span class="p">)</span>
<span class="p">(</span><span class="nb">set-add!</span> <span class="n">delicious-foods</span> <span class="n">new-food</span><span class="p">))</span></code></pre><pre><code class="pygments">$ racket check-food.rkt <span class="s1">'fried chicken'</span>
Registering fried chicken as a delicious food.
Registering fried potato as a delicious food.
Registering roasted chicken as a delicious food.
Registering roasted potato as a delicious food.
Registering pineapple as a delicious food.
Registering sushi as a delicious food.
Registering cheesecake as a delicious food.
fried chicken is not delicious.</code></pre><p>And in fact, if we pre-compile <code>check-food.rkt</code>, we’ll see that the first four registrations appear at compile-time, exactly as we expect:</p><pre><code class="pygments">$ raco make check-food.rkt
Registering fried chicken as a delicious food.
Registering fried potato as a delicious food.
Registering roasted chicken as a delicious food.
Registering roasted potato as a delicious food.
$ racket check-food.rkt <span class="s1">'fried chicken'</span>
Registering pineapple as a delicious food.
Registering sushi as a delicious food.
Registering cheesecake as a delicious food.
fried chicken is not delicious.</code></pre><p>The compile-time registrations really are happening, but Racket is automatically restricting the compile-time side-effects so they only apply at compile-time. After compilation has finished, Racket ensures that compile-time side-effects are thrown away, and the run-time code starts over with fresh, untouched state. This guarantees consistent behavior, since it becomes impossible to distinguish at run-time whether a module was just compiled on the fly or was pre-compiled long ago (possibly even on someone else’s computer).</p><p>This is the essence of the separate compilation guarantee. To summarize:</p><ul><li><p>Run-time and compile-time are distinct <em>phases</em> of execution, which cannot interact.</p></li><li><p>Modules can be required at multiple phases via <em>phase shifting</em>, but their state is kept separate. Each phase gets its own copy of the state.</p></li><li><p>Keeping the state separate ensures predictable program behavior, no matter when the program is compiled.</p></li></ul><p>This summary is a simplification of phases in Racket. The full Racket module system does not have only two phases, since macros can also be <em>used</em> at compile-time to implement other macros, effectively creating a separate “compile-time” for the compile-time code. Each compile-time pass is isolated to its own phase, creating a finite but arbitrarily large number of distinct program phases (all but one of which occur at compile-time).</p><p>Furthermore, the separate compilation guarantee does not just isolate the state of each phase from the state of other phases but also isolates all compile-time state from the compile-time state of other modules. 
This ensures that compilation is still deterministic even if modules are compiled in a different <em>order</em>, or if several modules are sometimes compiled individually while other times compiled together all at once.</p><p>If you want to learn more, the full details of the module system are described at length in the <a href="https://docs.racket-lang.org/guide/phases.html">General Phase Levels</a> section of the Racket Guide, but the abridged summary I’ve described is enough for the purposes of this blog post. If the bulleted list above mostly made sense to you, you’re ready to move on.</p><h2 id="section:main-start">How we’re going to break it</h2><p>The separate compilation guarantee is a sturdy opponent, but it is not without weaknesses. Although no API in Racket, safe or unsafe, allows arbitrarily disabling phase separation, a couple features of Racket are already known to allow limited forms of cross-phase communication.</p><p>The most significant of these, and the one we’ll be using as our vector of attack, is the logger. Unlike many logging systems, which are exclusively string-oriented, Racket’s logging interface allows structured logging by associating an arbitrary Racket value with each and every log message. Since it is possible to set up listeners within Racket that receive log messages sent to a particular “topic,” the logger can be used as a communication channel to send values between different parts of a program.</p><p>The following program illustrates how this works. One thread creates a listener for all log messages on the topic <code>'send-me-a-value</code> using <code>make-log-receiver</code>, then uses <code>sync</code> to block until a value is received. Meanwhile, a second thread sends values through the logger using <code>log-message</code>. Together, this creates a makeshift buffered, asynchronous channel:</p><pre><code class="pygments"><span class="c1">;; log-comm.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">define</span> <span class="n">t1</span>
<span class="p">(</span><span class="nb">thread</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="n">recv</span> <span class="p">(</span><span class="nb">make-log-receiver</span> <span class="p">(</span><span class="nb">current-logger</span><span class="p">)</span> <span class="o">'</span><span class="ss">debug</span> <span class="o">'</span><span class="ss">send-me-a-value</span><span class="p">))</span>
<span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">()</span>
<span class="p">(</span><span class="nb">println</span> <span class="p">(</span><span class="nb">sync</span> <span class="n">recv</span><span class="p">))</span>
<span class="p">(</span><span class="n">loop</span><span class="p">)))))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">t2</span>
<span class="p">(</span><span class="nb">thread</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">()</span>
<span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">([</span><span class="n">n</span> <span class="mi">0</span><span class="p">])</span>
<span class="p">(</span><span class="nb">log-message</span> <span class="p">(</span><span class="nb">current-logger</span><span class="p">)</span> <span class="o">'</span><span class="ss">debug</span> <span class="o">'</span><span class="ss">send-me-a-value</span> <span class="s2">""</span> <span class="n">n</span> <span class="no">#f</span><span class="p">)</span>
<span class="p">(</span><span class="nb">sleep</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">add1</span> <span class="n">n</span><span class="p">))))))</span>
<span class="p">(</span><span class="nb">thread-wait</span> <span class="n">t1</span><span class="p">)</span> <span class="c1">; wait forever</span></code></pre><pre><code>$ racket log-comm.rkt
'#(debug "" 1 send-me-a-value)
'#(debug "" 2 send-me-a-value)
'#(debug "" 3 send-me-a-value)
'#(debug "" 4 send-me-a-value)
^Cuser break
</code></pre><p>In this program, the value being sent through the logger is just a number, which isn’t very interesting. But the value really can be <em>any</em> value, even arbitrary closures or mutable data structures. It’s even possible to send a <a href="https://docs.racket-lang.org/guide/concurrency.html#%28part._.Channels%29">channel</a> through a logger, which can subsequently be used to communicate directly, without having to abuse the logger.</p><p>Generally, this feature of loggers isn’t very useful, since Racket has plenty of features for cross-thread communication. What’s special about the logger, however, is that it is global, and it is cross-phase.</p><p>The cross-phase nature of the logger makes some sense. If a Racket program creates a namespace (that is, a fresh environment for dynamic evaluation), then uses it to expand and compile a Racket module, the process of compilation might produce some log messages, and the calling thread might wish to receive them. It wouldn’t be a very useful logging system if log messages during compile-time were always lost. However, this convenience is a loophole in the phase separation system, since it allows values to flow—bidirectionally—between phases.</p><p>This concept forms the foundation of our exploit, but it alone is not a new technique, and I did not discover it. However, every existing technique I know of that uses the logger for cross-phase communication requires control of the parent namespace in which modules are being compiled, which means some code must exist “outside” the actual program. That technique does not work for ordinary programs run directly with <code>racket</code> or compiled directly with <code>raco make</code>, so to get there, we’ll need something more clever.</p><h3><a name="the-challenge"></a>The challenge</h3><p>Our goal, therefore, is to share state between phases <em>without</em> controlling the compilation namespace. 
More precisely, we want to be able to create an arbitrary module-level definition that is <em>cross-phase persistent</em>, which means it will be evaluated once and only once no matter how many times its enclosing module is re-instantiated (i.e. given a fresh, untouched state) at various phases. A phase-shifted <code>require</code> of the module that contains the definition should share state with an unshifted version of the module, breaking the separate compilation guarantee wide open.</p><p>To use the example from the previous section, we should be able to adjust <code>foods.rkt</code> very slightly…</p><pre><code class="pygments"><span class="c1">;; foods.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="s2">"define-cross-phase.rkt"</span><span class="p">)</span>
<span class="p">(</span><span class="k">provide</span> <span class="n">delicious-food?</span> <span class="n">add-delicious-food!</span><span class="p">)</span>
<span class="c1">; share across phases</span>
<span class="p">(</span><span class="n">define/cross-phase</span> <span class="n">delicious-foods</span> <span class="p">(</span><span class="nb">mutable-set</span><span class="p">))</span>
<span class="cm">#| ... |#</span></code></pre><p>…and the <code>delicious-foods</code> mutable state should magically become cross-phase persistent. When running <code>check-food.rkt</code> from source, we should see the side-effects persisted from the module’s compilation, while running from pre-compiled bytecode should give us the result with compile-time effects discarded.</p><p>We already know the logger is going to be part of our exploit, but implementing <code>define/cross-phase</code> on top of it is more subtle than it might seem. In our previous example that used <code>make-log-receiver</code>, we had well-defined sender and receiver threads, but who is the “sender” in our multi-phased world? And what exactly is the sender sending?</p><p>To answer those questions, allow me to outline the general idea of our approach:</p><ol><li><p>The first time our <code>foods.rkt</code> module is instantiated, at any phase, it evaluates the <code>(mutable-set)</code> expression to produce a new mutable set. It spawns a sender thread that sends this value via the logger to anyone who will listen, and that thread lingers in the background for the remaining duration of the program.</p></li><li><p>All subsequent instantiations of <code>foods.rkt</code> do <em>not</em> evaluate the <code>(mutable-set)</code> expression. Instead, they obtain the existing set by creating a log receiver and obtaining the value the sender thread is broadcasting. This ensures that a single value is shared across all instantiations of the module.</p></li></ol><p>This sounds deceptively simple, but the crux of the problem is how to determine whether <code>foods.rkt</code> has previously been instantiated or not. Since we can only communicate across phases via the logger, we cannot use any shared state to directly record the first time the module is instantiated. 
We can listen to a log receiver and wait to see if we get a response, but this introduces a race condition: how long do we wait until giving up and deciding we’re the first instantiation? Worse, what if two threads instantiate the module at the same time, and both threads end up spawning a new sender thread, duplicating the state?</p><p>The true challenge, therefore, is to develop a protocol by which we can be <em>certain</em> we are the first instantiation of a module, without relying on any unspecified behavior, and without introducing any race conditions. This is possible, but it isn’t obvious, and it requires combining loggers with some extra tools available to the Racket programmer.</p><h3><a name="the-key-idea"></a>The key idea</h3><p>It’s finally time to tackle the key idea at the heart of our exploit: garbage collection. In Racket, garbage collection is an observable effect, since Racket allows attaching finalizers to values via <a href="https://docs.racket-lang.org/reference/willexecutor.html">wills and executors</a>. Since a single heap is necessarily shared by the entire VM, behavior happening on other threads (even in other phases) can be indirectly observed by creating a unique value—a “canary”—then sending it to another thread, and waiting to see if it will be garbage collected or not (that is, whether or not the canary “dies”).</p><p>Remember that logs and log receivers are effectively buffered, multicast, asynchronous FIFO channels. Since they are buffered, if any thread is already listening to a logger topic when a value is sent, that value cannot possibly be garbage collected until either that thread reads and discards it or the receiver itself is garbage collected. It’s possible to use this mechanism to observe whether or not another thread is already listening on a topic, as the following program demonstrates:<sup><a href="#footnote-3" id="footnote-ref-3-1">3</a></sup></p><pre><code class="pygments"><span class="c1">;; check-receivers.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">check-receivers</span> <span class="n">topic</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">executor</span> <span class="p">(</span><span class="nb">make-will-executor</span><span class="p">))</span>
<span class="c1">; limit scope of `canary` so we don’t retain a reference</span>
<span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="n">canary</span> <span class="p">(</span><span class="nb">gensym</span> <span class="o">'</span><span class="ss">canary</span><span class="p">))</span>
<span class="p">(</span><span class="nb">will-register</span> <span class="n">executor</span> <span class="n">canary</span> <span class="nb">void</span><span class="p">)</span>
<span class="p">(</span><span class="nb">log-message</span> <span class="p">(</span><span class="nb">current-logger</span><span class="p">)</span> <span class="o">'</span><span class="ss">debug</span> <span class="n">topic</span> <span class="s2">""</span> <span class="n">canary</span> <span class="no">#f</span><span class="p">))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">collect-garbage</span><span class="p">)</span>
<span class="p">(</span><span class="nb">collect-garbage</span><span class="p">)</span>
<span class="p">(</span><span class="nb">collect-garbage</span><span class="p">)</span>
<span class="p">(</span><span class="nb">sync/timeout</span> <span class="mi">0</span> <span class="n">executor</span><span class="p">))</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s2">"no receivers for ~v</span><span class="se">\n</span><span class="s2">"</span> <span class="n">topic</span><span class="p">)</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s2">"receiver exists for ~v</span><span class="se">\n</span><span class="s2">"</span> <span class="n">topic</span><span class="p">)))</span>
<span class="c1">; add a receiver on topic 'foo</span>
<span class="p">(</span><span class="k">define</span> <span class="n">recv</span> <span class="p">(</span><span class="nb">make-log-receiver</span> <span class="p">(</span><span class="nb">current-logger</span><span class="p">)</span> <span class="o">'</span><span class="ss">debug</span> <span class="o">'</span><span class="ss">foo</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">t1</span> <span class="p">(</span><span class="nb">thread</span> <span class="p">(</span><span class="k">λ</span> <span class="p">()</span> <span class="p">(</span><span class="n">check-receivers</span> <span class="o">'</span><span class="ss">foo</span><span class="p">))))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">t2</span> <span class="p">(</span><span class="nb">thread</span> <span class="p">(</span><span class="k">λ</span> <span class="p">()</span> <span class="p">(</span><span class="n">check-receivers</span> <span class="o">'</span><span class="ss">bar</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">thread-wait</span> <span class="n">t1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">thread-wait</span> <span class="n">t2</span><span class="p">)</span></code></pre><pre><code>$ racket check-receivers.rkt
no receivers for 'bar
receiver exists for 'foo
</code></pre><p>However, this program has some problems. For one, it needs to call <code>collect-garbage</code> several times to be certain that the canary will be collected if there are no listeners, which can take a second or two, and it also assumes that three calls to <code>collect-garbage</code> will be enough to collect the canary, though there is no guarantee that will be true.</p><p>A bulletproof solution should be both reasonably performant and guaranteed to work. To get there, we have to combine this idea with something more. Here’s the trick: instead of sending the canary alone, send a <a href="https://docs.racket-lang.org/guide/concurrency.html#%28part._.Channels%29">channel</a> alongside it. Synchronize on both the canary’s executor <em>and</em> the channel so that the thread will unblock if either the canary is collected <em>or</em> the channel is received and sent a value using <code>channel-put</code>. Have the receiver listen for the channel on a separate thread, and when it receives it, send a value back to unblock the waiting thread as quickly as possible, without needing to rely on a timeout or a particular number of calls to <code>collect-garbage</code>.</p><p>Using that idea, we can revise the program:</p><pre><code class="pygments"><span class="c1">;; check-receivers.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">check-receivers</span> <span class="n">topic</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">chan</span> <span class="p">(</span><span class="nb">make-channel</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">executor</span> <span class="p">(</span><span class="nb">make-will-executor</span><span class="p">))</span>
<span class="c1">; limit scope of `canary` so we don’t retain a reference</span>
<span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="n">canary</span> <span class="p">(</span><span class="nb">gensym</span> <span class="o">'</span><span class="ss">canary</span><span class="p">))</span>
<span class="p">(</span><span class="nb">will-register</span> <span class="n">executor</span> <span class="n">canary</span> <span class="nb">void</span><span class="p">)</span>
<span class="p">(</span><span class="nb">log-message</span> <span class="p">(</span><span class="nb">current-logger</span><span class="p">)</span> <span class="o">'</span><span class="ss">debug</span> <span class="n">topic</span> <span class="s2">""</span>
<span class="c1">; send the channel + the canary</span>
<span class="p">(</span><span class="nb">vector-immutable</span> <span class="n">chan</span> <span class="n">canary</span><span class="p">)</span> <span class="no">#f</span><span class="p">))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">([</span><span class="n">n</span> <span class="mi">0</span><span class="p">])</span>
<span class="p">(</span><span class="nb">sleep</span><span class="p">)</span> <span class="c1">; yield to try to let the receiver thread work</span>
<span class="p">(</span><span class="k">match</span> <span class="p">(</span><span class="nb">sync/timeout</span> <span class="mi">0</span>
<span class="p">(</span><span class="nb">wrap-evt</span> <span class="n">chan</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">'</span><span class="ss">received</span><span class="p">))</span>
<span class="p">(</span><span class="nb">wrap-evt</span> <span class="n">executor</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">'</span><span class="ss">collected</span><span class="p">)))</span>
<span class="p">[</span><span class="o">'</span><span class="ss">collected</span> <span class="no">#t</span><span class="p">]</span>
<span class="p">[</span><span class="o">'</span><span class="ss">received</span> <span class="no">#f</span><span class="p">]</span>
<span class="p">[</span><span class="k">_</span> <span class="c1">; collect garbage and try again</span>
<span class="p">(</span><span class="nb">collect-garbage</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb"><</span> <span class="n">n</span> <span class="mi">3</span><span class="p">)</span> <span class="o">'</span><span class="ss">minor</span> <span class="o">'</span><span class="ss">major</span><span class="p">))</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">add1</span> <span class="n">n</span><span class="p">))]))</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s2">"no receivers for ~v</span><span class="se">\n</span><span class="s2">"</span> <span class="n">topic</span><span class="p">)</span>
<span class="p">(</span><span class="nb">printf</span> <span class="s2">"receiver exists for ~v</span><span class="se">\n</span><span class="s2">"</span> <span class="n">topic</span><span class="p">)))</span>
<span class="c1">; add a receiver on topic 'foo</span>
<span class="p">(</span><span class="k">define</span> <span class="n">recv</span> <span class="p">(</span><span class="nb">make-log-receiver</span> <span class="p">(</span><span class="nb">current-logger</span><span class="p">)</span> <span class="o">'</span><span class="ss">debug</span> <span class="o">'</span><span class="ss">foo</span><span class="p">))</span>
<span class="p">(</span><span class="nb">void</span> <span class="p">(</span><span class="nb">thread</span>
<span class="p">(</span><span class="k">λ</span> <span class="p">()</span>
<span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">()</span>
<span class="p">(</span><span class="k">match</span> <span class="p">(</span><span class="nb">sync</span> <span class="n">recv</span><span class="p">)</span>
<span class="p">[(</span><span class="nb">vector</span> <span class="k">_</span> <span class="k">_</span> <span class="p">(</span><span class="nb">vector</span> <span class="n">chan</span> <span class="k">_</span><span class="p">)</span> <span class="k">_</span><span class="p">)</span>
<span class="p">(</span><span class="nb">channel-put</span> <span class="n">chan</span> <span class="no">#t</span><span class="p">)</span>
<span class="p">(</span><span class="n">loop</span><span class="p">)])))))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">t1</span> <span class="p">(</span><span class="nb">thread</span> <span class="p">(</span><span class="k">λ</span> <span class="p">()</span> <span class="p">(</span><span class="n">check-receivers</span> <span class="o">'</span><span class="ss">foo</span><span class="p">))))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">t2</span> <span class="p">(</span><span class="nb">thread</span> <span class="p">(</span><span class="k">λ</span> <span class="p">()</span> <span class="p">(</span><span class="n">check-receivers</span> <span class="o">'</span><span class="ss">bar</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">thread-wait</span> <span class="n">t1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">thread-wait</span> <span class="n">t2</span><span class="p">)</span></code></pre><p>Now the program completes almost instantly. For this simple program, the explicit <code>(sleep)</code> call is effective enough at yielding that, on my machine, <code>(check-receivers 'foo)</code> returns without ever calling <code>collect-garbage</code>, and <code>(check-receivers 'bar)</code> returns after performing a single minor collection.</p><p>This is extremely close to a bulletproof solution, but there are two remaining subtle issues:</p><ol><li><p>There is technically a race condition between the <code>(sync recv)</code> in the receiver thread and the subsequent <code>channel-put</code>, since it’s possible for the canary to be received, discarded, and garbage collected before reaching the call to <code>channel-put</code>, which the sending thread would incorrectly interpret as indicating the topic has no receivers.</p><p>To fix that, the receiver thread can send the canary itself back through the channel, which fundamentally has to work, since the value cannot be collected until it has been received by the sending thread, at which point the <code>sync</code> has already chosen the channel.</p></li><li><p>It is possible for the receiver thread to receive the log message and call <code>channel-put</code>, but for the sending thread to somehow die in the meantime (which cannot be protected against in general in Racket, since <code>thread-kill</code> immediately and forcibly terminates a thread). If this were to happen, the sending thread would never obtain the value from the channel, blocking the receiving thread indefinitely.</p><p>A solution is to spawn a new thread for each <code>channel-put</code> instead of calling it directly from the receiving thread. 
Conveniently, this both ensures the receiving thread never gets stuck and avoids resource leaks, since the Racket runtime is smart enough to GC a thread blocked on a channel that has no other references and therefore can never be unblocked.</p></li></ol><p>With those fixes in place, the program is, to the best of my knowledge, bulletproof. It will always correctly determine whether or not a logger has a listener, with no race conditions or reliance upon unspecified behavior of the Racket runtime. It does, however, make a couple of assumptions.</p><p>First, it assumes that the value of <code>(current-logger)</code> is shared between the threads. It is true that <code>(current-logger)</code> can be changed, and sometimes is, but it’s usually done via <code>parameterize</code>, not mutation of the parameter directly. Therefore, this can largely be mitigated by storing the value of <code>(current-logger)</code> at module instantiation time.</p><p>Second, it assumes that no other receivers are listening on the same topic. Technically, even using a unique, uninterned key for the topic is insufficient to ensure that no receivers are listening to it, since a receiver can choose to listen to all topics. However, in practice, it is highly unlikely that any receiver would willfully choose to listen to all topics at the <code>'debug</code> level, since the receiver would be inundated with enormous amounts of useless information. Even if such a receiver were to be created, it is highly likely that it would dequeue the messages as quickly as possible and discard the accompanying payload, since doing otherwise would cause all messages to be retained in memory, leading to a significant memory leak.</p><p>Both these problems can be mitigated by using a logger other than the root logger, which is easy in this example. 
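</p><p>A minimal sketch of that idea, assuming a parentless logger created with <code>make-logger</code> and bound to the illustrative name <code>private-logger</code>, might look like the following. It also folds in the two fixes listed above, and it has <code>check-receivers</code> return a boolean rather than print, purely for the sake of the sketch:</p><pre><code class="pygments">;; check-receivers-private.rkt (illustrative file name)
#lang racket

;; Assumption: a logger with no parent, so only receivers created on this
;; exact object can ever see its events. `private-logger` is a made-up name.
(define private-logger (make-logger 'check-receivers #f))

;; Returns #t if some receiver answered on `topic`, #f if the canary was
;; garbage collected first (meaning nobody was listening).
(define (check-receivers topic)
  (define chan (make-channel))
  (define executor (make-will-executor))
  ; limit scope of `canary` so we do not retain a reference
  (let ()
    (define canary (gensym 'canary))
    (will-register executor canary void)
    (log-message private-logger 'debug topic ""
                 (vector-immutable chan canary) #f))
  (let loop ([n 0])
    (sleep) ; yield to give the receiver thread a chance to run
    (match (sync/timeout 0
                         (wrap-evt chan (λ (v) 'received))
                         (wrap-evt executor (λ (v) 'collected)))
      ['received #t]
      ['collected #f]
      [_ ; one cheap minor collection first, then major ones
       (collect-garbage (if (zero? n) 'minor 'major))
       (loop (add1 n))])))

; add a receiver on topic 'foo
(define recv (make-log-receiver private-logger 'debug 'foo))
(void
 (thread
  (λ ()
    (let loop ()
      (match (sync recv)
        [(vector _ _ (vector chan canary) _)
         ; fix 1: send the canary itself back, so it stays reachable
         ; until the sender has taken it off the channel
         ; fix 2: put on a fresh thread, so a dead sender cannot
         ; block this loop forever
         (thread (λ () (channel-put chan canary)))
         (loop)]
        [_ (loop)])))))</code></pre><p>With the receiver thread above listening on <code>'foo</code>, <code>(check-receivers 'foo)</code> is meant to return <code>#t</code> and <code>(check-receivers 'bar)</code> is meant to return <code>#f</code>. Since events on a parentless logger do not propagate to the root logger, neither a rebound <code>(current-logger)</code> nor a nosy receiver on the root logger should be able to interfere.</p><p>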
However, for the purpose of subverting the separate compilation guarantee, we would have no way to share the logger object itself across phases, defeating the whole purpose, so we are forced to use the root logger and hope the above two assumptions remain true (but they usually do).</p><h3><a name="preparing-the-exploit"></a>Preparing the exploit</h3><p>If you’ve made it here, congratulations! The most difficult part of this blog post is over. All that’s left is the fun part: performing the exploit.</p><p>The bulk of our implementation is a slightly adapted version of <code>check-receivers</code>:</p><pre><code class="pygments"><span class="c1">;; define-cross-phase.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">define</span> <span class="n">root-logger</span> <span class="p">(</span><span class="nb">current-logger</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">make-cross-phase</span> <span class="n">topic</span> <span class="k">thunk</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">receiver</span> <span class="p">(</span><span class="nb">make-log-receiver</span> <span class="n">root-logger</span> <span class="o">'</span><span class="ss">debug</span> <span class="n">topic</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">chan</span> <span class="p">(</span><span class="nb">make-channel</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">executor</span> <span class="p">(</span><span class="nb">make-will-executor</span><span class="p">))</span>
<span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="n">canary</span> <span class="p">(</span><span class="nb">gensym</span> <span class="o">'</span><span class="ss">canary</span><span class="p">))</span>
<span class="p">(</span><span class="nb">will-register</span> <span class="n">executor</span> <span class="n">canary</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">'</span><span class="ss">collected</span><span class="p">))</span>
<span class="p">(</span><span class="nb">log-message</span> <span class="n">root-logger</span> <span class="o">'</span><span class="ss">debug</span> <span class="n">topic</span> <span class="s2">""</span>
<span class="p">(</span><span class="nb">vector-immutable</span> <span class="n">canary</span> <span class="n">chan</span><span class="p">)</span> <span class="no">#f</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">()</span>
<span class="p">(</span><span class="k">match</span> <span class="p">(</span><span class="nb">sync</span> <span class="n">receiver</span><span class="p">)</span>
<span class="p">[(</span><span class="nb">vector</span> <span class="k">_</span> <span class="k">_</span> <span class="p">(</span><span class="nb">vector</span> <span class="k">_</span> <span class="p">(</span><span class="k">==</span> <span class="n">chan</span> <span class="nb">eq?</span><span class="p">))</span> <span class="k">_</span><span class="p">)</span>
<span class="p">(</span><span class="nb">void</span><span class="p">)]</span>
<span class="p">[</span><span class="k">_</span>
<span class="p">(</span><span class="n">loop</span><span class="p">)])))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">execute-evt</span> <span class="p">(</span><span class="nb">wrap-evt</span> <span class="n">executor</span> <span class="nb">will-execute</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">result</span> <span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">([</span><span class="n">n</span> <span class="mi">0</span><span class="p">])</span>
<span class="p">(</span><span class="nb">sleep</span><span class="p">)</span>
<span class="p">(</span><span class="k">or</span> <span class="p">(</span><span class="nb">sync/timeout</span> <span class="mi">0</span> <span class="n">chan</span> <span class="n">execute-evt</span><span class="p">)</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="nb">collect-garbage</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb"><</span> <span class="n">n</span> <span class="mi">3</span><span class="p">)</span> <span class="o">'</span><span class="ss">minor</span> <span class="o">'</span><span class="ss">major</span><span class="p">))</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">add1</span> <span class="n">n</span><span class="p">))))))</span>
<span class="p">(</span><span class="k">match</span> <span class="n">result</span>
<span class="p">[(</span><span class="nb">vector</span> <span class="k">_</span> <span class="n">value</span><span class="p">)</span>
<span class="n">value</span><span class="p">]</span>
<span class="p">[</span><span class="o">'</span><span class="ss">collected</span>
<span class="p">(</span><span class="k">define</span> <span class="n">value</span> <span class="p">(</span><span class="k">thunk</span><span class="p">))</span>
<span class="p">(</span><span class="nb">thread</span>
<span class="p">(</span><span class="k">λ</span> <span class="p">()</span>
<span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">()</span>
<span class="p">(</span><span class="k">match</span> <span class="p">(</span><span class="nb">sync</span> <span class="n">receiver</span><span class="p">)</span>
<span class="p">[(</span><span class="nb">vector</span> <span class="k">_</span> <span class="k">_</span> <span class="p">(</span><span class="nb">vector</span> <span class="n">canary</span> <span class="n">chan</span><span class="p">)</span> <span class="k">_</span><span class="p">)</span>
<span class="p">(</span><span class="nb">thread</span> <span class="p">(</span><span class="k">λ</span> <span class="p">()</span> <span class="p">(</span><span class="nb">channel-put</span> <span class="n">chan</span> <span class="p">(</span><span class="nb">vector-immutable</span> <span class="n">canary</span> <span class="n">value</span><span class="p">))))</span>
<span class="p">(</span><span class="n">loop</span><span class="p">)]))))</span>
<span class="n">value</span><span class="p">]))</span></code></pre><p>There are a few minor differences, which I’ll list:</p><ol><li><p>The most obvious difference is that <code>make-cross-phase</code> does the work of both checking if a receiver exists—which I’ll call the <em>manager thread</em>—and spawning it if it doesn’t. If it does end up spawning a manager thread, it evaluates the given thunk to produce a value, which becomes the cross-phase value that will be sent through the channel alongside the canary.</p></li><li><p>Once the manager thread is created, subsequent calls to <code>make-cross-phase</code> will receive the value through the channel and return it instead of re-invoking <code>thunk</code>. This is what ensures the right-hand side of each use of <code>define/cross-phase</code> is only ever evaluated once.</p></li><li><p>Since <code>make-cross-phase</code> needs to create a log receiver if no manager thread exists, it does so immediately, before sending the canary through the logger. This avoids a race condition between multiple threads that are simultaneously competing to become the manager thread, where both threads could send a canary through the logger before either was listening, both canaries would get GC’d, and both threads would spawn a new manager.<sup><a href="#footnote-4" id="footnote-ref-4-1">4</a></sup></p><p>Creating the receiver before sending the canary avoids this problem, but the thread now needs to receive its own canary and discard it before synchronizing on the channel and executor, since otherwise it will retain a reference to the canary. It’s possible that in between creating the receiver and sending the canary, another thread also sent a canary, so it needs to drop any log messages it finds that don’t include its own canary.</p><p>This ends up working out perfectly, since every thread drops all the messages received before the one containing its own canary, but retains all subsequent values. 
This means that only one thread can ever “win” and become the manager, since the first thread to send a canary is guaranteed to retain all subsequent canaries, yet is also guaranteed that its own canary will be GC’d. Other threads racing to become the manager will remain blocked until the manager thread is created, since their canaries will be retained by the manager-to-be until it dequeues them.</p><p>(This is the most subtle part of the process to get right, but conveniently, it mostly just works out without very much code. If you didn’t understand any of the above three paragraphs, it isn’t a big deal.)</p></li></ol><p>The final piece to this puzzle is to define the <code>define/cross-phase</code> macro that wraps <code>make-cross-phase</code>. The macro is actually slightly more involved than just generating a call to <code>make-cross-phase</code> directly, since we’d like to use an uninterned symbol for the topic instead of an interned one, just to ensure it is unique. Ordinarily, this might seem impossible, since an uninterned symbol is fundamentally a unique value that needs to be communicated across phases, and the whole problem we are solving is creating a communication channel that spans phases. However, Racket actually provides some built-in support for sharing uninterned symbols across phases (plus some other kinds of values, but they must always be immutable). To do this, we need to generate a <a href="https://docs.racket-lang.org/reference/eval-model.html#%28part._cross-phase._persistent-modules%29">cross-phase persistent submodule</a> that exports an uninterned symbol, then pass that symbol as the topic to <code>make-cross-phase</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="k">for-syntax</span> <span class="n">racket/syntax</span><span class="p">)</span>
<span class="n">syntax/parse/define</span><span class="p">)</span>
<span class="p">(</span><span class="k">provide</span> <span class="n">define/cross-phase</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">define/cross-phase</span> <span class="n">x:id</span> <span class="n">e:expr</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">topic-mod-name</span> <span class="p">(</span><span class="n">generate-temporary</span> <span class="o">'</span><span class="ss">cross-phase-topic-key</span><span class="p">)</span>
<span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="k">module</span> <span class="n">topic-mod-name</span> <span class="o">'</span><span class="ss">#%kernel</span>
<span class="p">(</span><span class="k">#%declare</span> <span class="kd">#:cross-phase-persistent</span><span class="p">)</span>
<span class="p">(</span><span class="k">#%provide</span> <span class="n">topic</span><span class="p">)</span>
<span class="p">(</span><span class="k">define-values</span> <span class="p">[</span><span class="n">topic</span><span class="p">]</span> <span class="p">(</span><span class="nb">gensym</span> <span class="s2">"cross-phase"</span><span class="p">)))</span>
<span class="p">(</span><span class="k">require</span> <span class="o">'</span><span class="ss">topic-mod-name</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">x</span> <span class="p">(</span><span class="n">make-cross-phase</span> <span class="n">topic</span> <span class="p">(</span><span class="k">λ</span> <span class="p">()</span> <span class="n">e</span><span class="p">)))))</span></code></pre><p>And that’s really it. We’re done.</p><h3><a name="executing-the-exploit"></a>Executing the exploit</h3><p>With our implementation of <code>define/cross-phase</code> in hand, all that’s left to do is run our original <code>check-food.rkt</code> program and see what happens:</p><pre><code class="pygments">$ racket check-food.rkt <span class="s1">'fried chicken'</span>
set-add!: contract violation:
expected: set?
given: <span class="o">(</span>mutable-set <span class="s2">"fried chicken"</span> <span class="s2">"roasted chicken"</span> <span class="s2">"roasted potato"</span> <span class="s2">"fried potato"</span><span class="o">)</span>
argument position: 1st
other arguments...:
x: <span class="s2">"pineapple"</span></code></pre><p>Well, I don’t know what you expected. Play stupid games, win stupid prizes.</p><p>This error actually makes sense, but it betrays one reason (of many) why this whole endeavor is probably a bad idea. Although we’ve managed to make our mutable set cross-phase persistent, our references to set operations like <code>set-add!</code> and <code>set-member?</code> are not, and every time <code>racket/set</code> is instantiated in a fresh phase, it creates an entirely new instance of the <code>set</code> structure type. This means that even though we have a bona fide mutable set, it isn’t actually the type of set that this phase’s <code>set-add!</code> understands!</p><p>Of course, this isn’t a problem that some liberal application of <code>define/cross-phase</code> can’t solve:</p><pre><code class="pygments"><span class="c1">;; foods.rkt</span>
<span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="s2">"define-cross-phase.rkt"</span><span class="p">)</span>
<span class="p">(</span><span class="k">provide</span> <span class="n">delicious-food?</span> <span class="n">add-delicious-food!</span><span class="p">)</span>
<span class="p">(</span><span class="n">define/cross-phase</span> <span class="n">cross:set-member?</span> <span class="nb">set-member?</span><span class="p">)</span>
<span class="p">(</span><span class="n">define/cross-phase</span> <span class="n">cross:set-add!</span> <span class="nb">set-add!</span><span class="p">)</span>
<span class="p">(</span><span class="n">define/cross-phase</span> <span class="n">delicious-foods</span> <span class="p">(</span><span class="nb">mutable-set</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">delicious-food?</span> <span class="n">food</span><span class="p">)</span>
<span class="p">(</span><span class="n">cross:set-member?</span> <span class="n">delicious-foods</span> <span class="n">food</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">add-delicious-food!</span> <span class="n">new-food</span><span class="p">)</span>
<span class="p">(</span><span class="n">cross:set-add!</span> <span class="n">delicious-foods</span> <span class="n">new-food</span><span class="p">))</span></code></pre><pre><code class="pygments">$ racket check-food.rkt <span class="s1">'fried chicken'</span>
fried chicken is a delicious food.
$ raco make check-food.rkt
$ racket check-food.rkt <span class="s1">'fried chicken'</span>
fried chicken is not delicious.</code></pre><p>And thus we find that another so-called “guarantee” isn’t.</p><h2><a name="reflection"></a>Reflection</h2><p>Now comes the time in the blog post when I have to step back and think about what I’ve done. Have mercy.</p><p>Everything in this blog post is a terrible idea. No, you should not use loggers for anything that isn’t logging, you shouldn’t use wills and executors for critical control flow, and obviously you should absolutely not intentionally break one of the most helpful guarantees the Racket module system affords you.</p><p>But I thought it was fun to do all that, anyway.</p><p>The meaningful takeaways from this blog post aren’t that the separate compilation guarantee can be broken, nor that any of the particular techniques I used hold, but that</p><ol><li><p>ensuring non-trivial guarantees is really hard,</p></li><li><p>despite that, the separate compilation guarantee is really, really hard to break,</p></li><li><p>the separate compilation guarantee is good, and you should appreciate the luxury it affords you while writing Racket macros,</p></li><li><p>avoiding races in a concurrent environment can be extremely subtle,</p></li><li><p>and Racket is totally <em>awesome</em> for giving me this much rope to hang myself with.</p></li></ol><p>If you want to hang yourself with Racket, too, <a href="https://gist.github.com/lexi-lambda/f173a84fc9727977bcea657b3bb0cd4f">runnable code from this blog post is available here</a>.</p><ol class="footnotes"><li id="footnote-1"><p>This isn’t <em>strictly</em> true, since Racket provides sandboxing mechanisms that can compile and execute untrusted code without file system or network access, but this is not the default compilation mode. Usually, it doesn’t matter nearly as much as it might sound: most of the time, if you’re compiling untrusted code, you’re also going to run it, and running untrusted code can do all those things, anyway. 
<a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>This is actually a <em>terrible</em> use case for a macro, since an ordinary function would do just fine, but I’m simplifying a little to keep the example small. <a href="#footnote-ref-2-1">↩</a></p></li><li id="footnote-3"><p>Racket actually provides this functionality directly via the <code>log-level?</code> procedure. However, since <code>log-level?</code> provides no way to determine how <em>many</em> receivers are listening to a topic, using it to guard against creating a receiver is vulnerable to a race condition that the garbage collection-based approach can avoid, as is discussed later. Furthermore, the GC technique is more likely to be resilient to nosy log receivers listening on all topics at the <code>'debug</code> level, since they will almost certainly dequeue and discard the value quickly (as otherwise they would leak large quantities of memory). <a href="#footnote-ref-3-1">↩</a></p></li><li id="footnote-4"><p>This race is the one that makes using <code>log-level?</code> untenable, since the receiver needs to be created before the topic is checked for listeners to avoid the race, which can’t be done with <code>log-level?</code> (since it would always return <code>#t</code>). <a href="#footnote-ref-4-1">↩</a></p></li></ol></article>Macroexpand anywhere with local-apply-transformer!2018-10-06T00:00:00Z2018-10-06T00:00:00ZAlexis King<article><p>Racket programmers are accustomed to the language’s incredible capacity for extension and customization. Writing useful macros that do complicated things is easy, and it’s simple to add new syntactic forms to meet domain-specific needs. 
However, it doesn’t take long before many budding macrologists bump into the realization that only <em>certain positions</em> in Racket code are subject to macroexpansion.</p><p>To illustrate, consider a macro that provides a Clojure-style <code>let</code> form:</p><pre><code class="pygments"><span class="p">(</span><span class="k">require</span> <span class="n">syntax/parse/define</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">clj-let</span> <span class="p">[{</span><span class="n">~seq</span> <span class="n">x:id</span> <span class="n">e:expr</span><span class="p">}</span> <span class="k">...</span><span class="p">]</span> <span class="n">body:expr</span> <span class="n">...+</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">x</span> <span class="n">e</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="k">...</span><span class="p">))</span></code></pre><p>This can be used anywhere an expression is expected, and it does as one would expect:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="n">clj-let</span> <span class="p">[</span><span class="n">x</span> <span class="mi">1</span>
<span class="n">y</span> <span class="mi">2</span><span class="p">]</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="n">y</span><span class="p">))</span>
<span class="mi">3</span></code></pre><p>However, a novice macro programmer might realize that <code>clj-let</code> really only modifies the syntax of <em>binding pairs</em> for a <code>let</code> form. Therefore, could one define a macro that only adjusts the binding pairs of some existing <code>let</code> form instead of expanding to an entire <code>let</code>? That is, could one write the above example like this:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">clj-binding-pairs</span> <span class="p">[{</span><span class="n">~seq</span> <span class="n">x:id</span> <span class="n">e:expr</span><span class="p">}</span> <span class="k">...</span><span class="p">])</span>
<span class="p">([</span><span class="n">x</span> <span class="n">e</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span>
<span class="nb">></span> <span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="n">clj-binding-pairs</span>
<span class="p">[</span><span class="n">x</span> <span class="mi">1</span>
<span class="n">y</span> <span class="mi">2</span><span class="p">])</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="n">y</span><span class="p">))</span>
<span class="mi">3</span></code></pre><p>The answer is <em>no</em>: the binding pairs of a <code>let</code> form are not subject to macroexpansion, so the above attempt fails with a syntax error. In this blog post, we will examine the reasons behind this limitation, then explain how to overcome it using a solution that allows macroexpansion <em>anywhere</em> in a Racket program.</p><h2><a name="why-only-some-positions-are-subject-to-macroexpansion"></a>Why only some positions are subject to macroexpansion</h2><p>To understand <em>why</em> the macroexpander refuses to touch certain positions in a program, we must first understand how the macro system operates. In Racket, a macro is defined as a compile-time function associated with a particular binding, and macros are given complete control over the syntax trees they are surrounded with. If we define a macro <em><code>mac</code></em> and then write the expression <code>(<em>mac</em> <em>form</em>)</code>, then <em><code>form</code></em> is provided as-is to <em><code>mac</code></em> as a syntax object. Its structure can be anything at all, since <em><code>mac</code></em> can be an arbitrary Racket function, and that function can use <em><code>form</code></em> however it pleases.</p><p>To give a concrete illustration, consider a macro that binds some identifiers to symbols in a local scope:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">let-symbols</span> <span class="p">(</span><span class="n">x:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="n">...+</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">x</span> <span class="o">'</span><span class="ss">x</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="k">...</span><span class="p">))</span>
<span class="nb">></span> <span class="p">(</span><span class="n">let-symbols</span> <span class="p">(</span><span class="n">hello</span> <span class="n">goodbye</span><span class="p">)</span>
<span class="p">(</span><span class="nb">list</span> <span class="n">hello</span> <span class="n">goodbye</span><span class="p">))</span>
<span class="o">'</span><span class="p">(</span><span class="ss">hello</span> <span class="ss">goodbye</span><span class="p">)</span></code></pre><p>It isn’t the most exciting macro in the world, but it illustrates a key point: the first subform to <code>let-symbols</code> is a list of identifiers that are eventually put in <em>binding</em> position. This means that <code>hello</code> and <code>goodbye</code> are bindings, not uses, and such bindings shadow any existing bindings that might have been in scope:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">foo</span> <span class="mi">42</span><span class="p">])</span>
<span class="p">(</span><span class="n">let-symbols</span> <span class="p">(</span><span class="n">foo</span><span class="p">)</span>
<span class="n">foo</span><span class="p">))</span>
<span class="o">'</span><span class="ss">foo</span></code></pre><p>This might not seem very interesting, but it’s critical to understand, since it means that the expander <em>can’t know</em> which sub-pieces of a use of <code>let-symbols</code> will eventually be expressions themselves until it expands the macro and discovers it produces a <code>let</code> form, so it can’t know where it’s safe to perform macroexpansion. To make this more explicit, imagine we define a macro under some name, then try and use that name with our <code>let-symbols</code> macro:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">hello</span> <span class="n">x:id</span><span class="p">)</span>
<span class="p">(</span><span class="n">x</span><span class="p">))</span>
<span class="nb">></span> <span class="p">(</span><span class="n">let-symbols</span> <span class="p">(</span><span class="n">hello</span> <span class="n">goodbye</span><span class="p">)</span>
<span class="n">hello</span><span class="p">)</span></code></pre><p>What should the above program do? If we treat the first use of <code>hello</code> in the <code>let-symbols</code> form as a macro application, then <code>(hello goodbye)</code> should be transformed into <code>(goodbye)</code>, and the use of <code>hello</code> in the body should be a syntax error. But if the first use of <code>hello</code> was instead intended to be a binder, then it should shadow the <code>hello</code> definition above, and the output of the program should be <code>'hello</code>.</p><p>To avoid the chaos that would ensue if defining a macro could completely break local reasoning about other macros, Racket chooses the second option, and the program produces <code>'hello</code>. The macroexpander has no way of knowing <em>how</em> each macro will inspect its constituent pieces, so it avoids touching anything until the macro expands. After it discovers the <code>let</code> form in the expansion of <code>let-symbols</code>, it can safely determine that the body expressions are, indeed, expressions, and it can recursively expand any macros they contain. To put things another way, a macro’s sub-forms are never expanded before the macro itself is expanded, only after.</p><h2><a name="forcing-sub-form-expansion"></a>Forcing sub-form expansion</h2><p>The above section explains why the expander must operate as it does, but it’s a little bit unsatisfying. What if we write a macro where we <em>want</em> certain sub-forms to be expanded before they are passed to us? 
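</p><p>To make that limitation concrete, here is a minimal sketch, separate from the running example, in which a macro simply quotes whatever sub-form it receives. The names <code>shout</code> and <code>quote-arg</code> are hypothetical, invented purely for illustration:</p><pre><code class="pygments">#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical helper macro: upcases the string its argument evaluates to.
(define-syntax-rule (shout str)
  (string-upcase str))

;; Hypothetical macro: quotes its sub-form rather than treating it as an expression.
(define-syntax (quote-arg stx)
  (syntax-case stx ()
    [(_ form) #'(quote form)]))

(quote-arg (shout "hi")) ; => '(shout "hi"): the sub-form arrived unexpanded
(list (shout "hi"))      ; => '("HI"): expanded, since list's arguments are expressions</code></pre><p>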
Fortunately, the Racket macro system provides an API to handle this use case, too.</p><p>It is true that the Racket macro system never <em>automatically</em> expands sub-forms before outer forms are expanded, but macro transformers can explicitly opt in to recursive expansion via the <a href="http://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._local-expand%29%29"><code>local-expand</code></a> function. This function effectively yields control back to the expander to expand some arbitrary piece of syntax as an expression, and when it returns, the macro transformer can inspect the expanded expression however it wishes. In theory, this can be used to implement extensible macros that allow macroexpansion in locations other than expression position.</p><p>To give an example of such a macro, consider the Racket <code>match</code> form, which implements an expressive pattern-matcher as a macro. One of the most interesting qualities of Racket’s <code>match</code> macro is that its pattern language is user-extensible, essentially allowing pattern-level macros. For example, a user might find they frequently match against natural numbers, and they wish to be able to write <code>(nat n)</code> as a shorthand for <code>(? exact-nonnegative-integer? n)</code>. Fortunately, this is easy using <code>define-match-expander</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-match-expander</span> <span class="n">nat</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">pat</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">?</span> <span class="nb">exact-nonnegative-integer?</span> <span class="n">pat</span><span class="p">)]))</span>
<span class="nb">></span> <span class="p">(</span><span class="k">match</span> <span class="o">'</span><span class="p">(</span><span class="mi">-5</span> <span class="mi">-2</span> <span class="mi">4</span> <span class="mi">-7</span><span class="p">)</span>
<span class="p">[(</span><span class="nb">list</span> <span class="k">_</span> <span class="k">...</span> <span class="p">(</span><span class="n">nat</span> <span class="n">n</span><span class="p">)</span> <span class="k">_</span> <span class="k">...</span><span class="p">)</span>
<span class="n">n</span><span class="p">])</span>
<span class="mi">4</span></code></pre><p>Clearly, <code>match</code> is somehow expanding the <code>nat</code> match expander as a part of its expansion. Is it using <code>local-expand</code>?</p><p>Well, no. While <a href="/blog/2018/04/15/reimplementing-hackett-s-type-language-expanding-to-custom-core-forms-in-racket/">a previous blog post of mine</a> has illustrated that it is possible to do such a thing with <code>local-expand</code> via some clever trickery, <code>local-expand</code> is really designed to expand <em>expressions</em>. This is a problem, since <code>(nat n)</code> is not an expression, it’s a pattern: it will expand into <code>(? exact-nonnegative-integer? n)</code>, which will lead to a syntax error, since <code>?</code> is not bound in the world of expressions. Instead, for a long while, <code>match</code> and forms like it have emulated how the expander performs macroexpansion in ad-hoc ways. Fortunately, as of Racket v7.0, the new <a href="http://docs.racket-lang.org/syntax/transformer-helpers.html#%28def._%28%28lib._syntax%2Fapply-transformer..rkt%29._local-apply-transformer%29%29"><code>local-apply-transformer</code></a> API provides a way to invoke recursive macroexpansion in a consistent way, and it doesn’t assume that what’s being expanded is an expression.</p><h3><a name="a-closer-look-at-local-apply-transformer"></a>A closer look at <code>local-apply-transformer</code></h3><p>If <code>local-apply-transformer</code> is the answer, what does it actually do? Well, <code>local-apply-transformer</code> allows explicitly invoking a transformer function on some piece of syntax and retrieving the result. 
In other words, <code>local-apply-transformer</code> allows expanding an arbitrary macro, but since it doesn’t make any assumptions about what the output will be, it only expands it <em>once</em>: just a single step of macro transformation.</p><p>To illustrate, we can write a macro that uses <code>local-apply-transformer</code> to invoke a transformer function and preserve the result using <code>quote-syntax</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="k">for-syntax</span> <span class="n">syntax/apply-transformer</span><span class="p">))</span>
<span class="p">(</span><span class="k">define-for-syntax</span> <span class="n">flip</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="n">a</span> <span class="n">b</span> <span class="n">more</span> <span class="k">...</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">b</span> <span class="n">a</span> <span class="n">more</span> <span class="k">...</span><span class="p">)]))</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">mac</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">result</span> <span class="p">(</span><span class="n">local-apply-transformer</span> <span class="n">flip</span> <span class="o">#'</span><span class="p">(([</span><span class="n">x</span> <span class="mi">1</span><span class="p">])</span> <span class="k">let</span> <span class="n">x</span><span class="p">)</span> <span class="o">'</span><span class="ss">expression</span><span class="p">)</span>
<span class="p">(</span><span class="k">quote-syntax</span> <span class="n">result</span><span class="p">))</span></code></pre><p>When we use <code>mac</code>, our <code>flip</code> function will be applied, as a macro, to the syntax object we provide:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="n">mac</span><span class="p">)</span>
<span class="n">#<syntax</span> <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="n">x</span> <span class="mi">1</span><span class="p">))</span> <span class="n">x</span><span class="p">)</span><span class="nb">></span></code></pre><p>Alright, so this works, but it raises some questions. Why is <code>flip</code> defined as a function at phase 1 (using <code>define-for-syntax</code>) instead of as a macro (using <code>define-syntax</code>)? What’s the deal with the <code>'expression</code> argument to <code>local-apply-transformer</code> given that <code>local-apply-transformer</code> is supposedly decoupled from expression expansion? And finally, how is this any different from just calling our <code>flip</code> function on the syntax object directly by writing <code>(flip #'(([x 1]) let x))</code>?</p><p>Let’s start with the first of those questions: why is <code>flip</code> defined as a function rather than as a macro? Well, <code>local-apply-transformer</code> is a fairly low-level operation: remember, it doesn’t assume <em>anything</em> about the argument it’s given! Therefore, it doesn’t take an expression containing a macro and expand it based on its structure, it needs to be explicitly provided the macro transformer function to apply. In practice, this might not seem very useful, since presumably we want to write our macros as macros, not as phase 1 functions. Fortunately, it’s possible to look up the function associated with a macro binding using the <a href="http://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._syntax-local-value%29%29"><code>syntax-local-value</code></a> function, so if we use that, we can define <code>flip</code> using <code>define-syntax</code> as usual:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">flip</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="n">a</span> <span class="n">b</span> <span class="n">more</span> <span class="k">...</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">b</span> <span class="n">a</span> <span class="n">more</span> <span class="k">...</span><span class="p">)]))</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">mac</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">result</span> <span class="p">(</span><span class="n">local-apply-transformer</span> <span class="p">(</span><span class="nb">syntax-local-value</span> <span class="o">#'</span><span class="n">flip</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(([</span><span class="n">x</span> <span class="mi">1</span><span class="p">])</span> <span class="k">let</span> <span class="n">x</span><span class="p">)</span>
<span class="o">'</span><span class="ss">expression</span><span class="p">)</span>
<span class="p">(</span><span class="k">quote-syntax</span> <span class="n">result</span><span class="p">))</span></code></pre><p>Now for the next question: what is the meaning of the <code>'expression</code> argument? This one is more of a historical artifact than anything else: when the expander applies a macro transformer, it does it in a “context”, which is accessible using the <a href="http://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._syntax-local-context%29%29"><code>syntax-local-context</code></a> function. This context can be one of a predefined enumeration of cases, including <code>'expression</code>, <code>'top-level</code>, <code>'module</code>, <code>'module-begin</code>, or a list representing a definition context. Whether or not any of those actually apply to our use case, we still have to pick one, but aside from how they affect the value returned by <code>syntax-local-context</code> (which some macros inspect), the value we choose is largely irrelevant. Using <code>'expression</code> will do, even if it’s a bit of a lie.</p><p>Finally, how does any of this differ from just applying the function we get directly? Well, the critical answer is all about <em>hygiene</em>. Racket’s macro system is hygienic, which, among other things, ensures bindings defined with the same name in different places do not unintentionally conflict. Racket’s hygiene mechanism is implemented in the macroexpander, when macro transformers are applied. If we just applied the <code>flip</code> transformer procedure to a syntax object directly, we would circumvent this hygiene mechanism, potentially causing all sorts of problems. By using <code>local-apply-transformer</code>, we ensure hygiene is preserved.</p><p>There is one small problem left with our program, however. Can you spot it? 
The key is to consider what would happen if we used <code>flip</code> as an ordinary macro, without using <code>local-apply-transformer</code>:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="n">flip</span> <span class="p">(([</span><span class="n">x</span> <span class="mi">1</span><span class="p">])</span> <span class="k">let</span> <span class="n">x</span><span class="p">))</span>
<span class="n">let:</span> <span class="n">bad</span> <span class="k">syntax</span>
<span class="n">in:</span> <span class="k">let</span></code></pre><p>What happened? Well, remember that when a macro in Racket is used, it receives the whole use site as a syntax object: in this case, <code>#'(flip (([x 1]) let x))</code>. This means that <code>flip</code> ought to be written to parse its argument slightly differently:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">flip</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="p">(</span><span class="n">a</span> <span class="n">b</span> <span class="n">more</span> <span class="k">...</span><span class="p">))</span>
<span class="o">#'</span><span class="p">(</span><span class="n">b</span> <span class="n">a</span> <span class="n">more</span> <span class="k">...</span><span class="p">)]))</span></code></pre><p>Indeed, now that we’ve properly restructured the macro, we can easily switch to using the convenient <code>define-simple-macro</code> shorthand:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">flip</span> <span class="p">(</span><span class="n">a</span> <span class="n">b</span> <span class="n">more</span> <span class="k">...</span><span class="p">))</span>
<span class="p">(</span><span class="n">b</span> <span class="n">a</span> <span class="n">more</span> <span class="k">...</span><span class="p">))</span></code></pre><p>This means we also need to update our definition of <code>mac</code> to provide the full syntax object the expander would:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">mac</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">result</span> <span class="p">(</span><span class="n">local-apply-transformer</span> <span class="p">(</span><span class="nb">syntax-local-value</span> <span class="o">#'</span><span class="n">flip</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">flip</span> <span class="p">(([</span><span class="n">x</span> <span class="mi">1</span><span class="p">])</span> <span class="k">let</span> <span class="n">x</span><span class="p">))</span>
<span class="o">'</span><span class="ss">expression</span><span class="p">)</span>
<span class="p">(</span><span class="k">quote-syntax</span> <span class="n">result</span><span class="p">))</span></code></pre><p>This might seem redundant, but remember, <code>local-apply-transformer</code> is very low-level! While the convention that <code>(<em>mac</em> . _)</code> is the syntax for a macro transformation might seem obvious, <code>local-apply-transformer</code> makes no assumptions. It just does what we tell it to do.</p><h3><a name="applying-local-apply-transformer"></a>Applying <code>local-apply-transformer</code></h3><p>So what does <code>local-apply-transformer</code> have to do with the problem at the beginning of this blog post? Well, as it happens, we can use <code>local-apply-transformer</code> to implement a macro that allows expansion <em>anywhere</em> using some simple tricks. While it’s true that we cannot magically divine which locations ought to be expanded, what we <em>can</em> do is explicitly annotate which places to expand.</p><p>To do this, we will implement a macro, <code>expand-inside</code>, that looks for subforms annotated with a special <code>$expand</code> identifier and performs macro transformation on those locations before proceeding with ordinary macroexpansion. Using the <code>clj-binding-pairs</code> example from the beginning of this blog post, our solution to that problem will look like this:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">clj-binding-pairs</span> <span class="p">[{</span><span class="n">~seq</span> <span class="n">x:id</span> <span class="n">e:expr</span><span class="p">}</span> <span class="k">...</span><span class="p">])</span>
<span class="p">([</span><span class="n">x</span> <span class="n">e</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span>
<span class="nb">></span> <span class="p">(</span><span class="n">expand-inside</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="n">$expand</span>
<span class="p">(</span><span class="n">clj-binding-pairs</span>
<span class="p">[</span><span class="n">x</span> <span class="mi">1</span>
<span class="n">y</span> <span class="mi">2</span><span class="p">]))</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="n">y</span><span class="p">)))</span>
<span class="mi">3</span></code></pre><p>Put another way, <code>expand-inside</code> will force eager expansion on any subform surrounded with an <code>$expand</code> annotation.</p><p>We’ll start by defining the <code>$expand</code> binding itself. This binding won’t mean anything at all outside of <code>expand-inside</code>, but we’d like it to be a unique binding so that users can rename it (using <code>rename-in</code>, for example) if they wish. To do this, we’ll use the usual trick of defining it as a macro that always produces an error if it’s ever used:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="p">(</span><span class="n">$expand</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="nb">raise-syntax-error</span> <span class="no">#f</span> <span class="s2">"illegal outside an ‘expand-inside’ form"</span> <span class="n">stx</span><span class="p">))</span></code></pre><p>Next, we’ll implement a syntax class that will form the bulk of our implementation of <code>expand-inside</code>. Since we need to find uses of <code>$expand</code> that might be deeply-nested inside the syntax object provided to <code>expand-inside</code>, we need to recursively look through the syntax object, find any instances of <code>$expand</code>, and put it all back together once we’re done. This can be done relatively cleanly using a recursive syntax class:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">do-expand-inside</span>
<span class="kd">#:literals</span> <span class="p">[</span><span class="n">$expand</span><span class="p">]</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">{</span><span class="n">~or</span> <span class="n">$expand</span> <span class="p">(</span><span class="n">$expand</span> <span class="o">.</span> <span class="k">_</span><span class="p">)}</span>
<span class="kd">#:with</span> <span class="n">:do-expand-inside</span> <span class="p">(</span><span class="n">do-$expand</span> <span class="n">this-syntax</span><span class="p">)]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">a:do-expand-inside</span> <span class="o">.</span> <span class="n">b:do-expand-inside</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">reassembled</span> <span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">a.expansion</span><span class="p">)</span>
<span class="p">(</span><span class="n">attribute</span> <span class="n">b.expansion</span><span class="p">))])</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">syntax?</span> <span class="n">this-syntax</span><span class="p">)</span>
<span class="p">(</span><span class="nb">datum->syntax</span> <span class="n">this-syntax</span> <span class="n">reassembled</span>
<span class="n">this-syntax</span> <span class="n">this-syntax</span><span class="p">)</span>
<span class="n">reassembled</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="k">_</span> <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]))</span></code></pre><p>There are some tricky details to get right in the reassembly of pairs, since syntax lists are actually composed of ordinary pairs rather than syntax pairs, but ultimately, the code for walking a syntax object is small. The key case of this syntax class is the call to <code>do-$expand</code> in the first clause, which we have not yet defined. This function will actually handle performing the expansion by invoking <code>local-apply-transformer</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">do-$expand</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="n">stx</span>
<span class="p">[(</span><span class="k">_</span> <span class="p">{</span><span class="n">~and</span> <span class="n">form</span> <span class="p">{</span><span class="n">~or</span> <span class="n">trans</span> <span class="p">(</span><span class="n">trans</span> <span class="o">.</span> <span class="k">_</span><span class="p">)}})</span>
<span class="kd">#:declare</span> <span class="n">trans</span> <span class="p">(</span><span class="n">static</span> <span class="p">(</span><span class="nb">disjoin</span> <span class="nb">procedure?</span> <span class="nb">set!-transformer?</span><span class="p">)</span>
<span class="s2">"syntax transformer"</span><span class="p">)</span>
<span class="p">(</span><span class="n">local-apply-transformer</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">trans.value</span><span class="p">)</span>
<span class="o">#'</span><span class="n">form</span>
<span class="o">'</span><span class="ss">expression</span><span class="p">)])))</span></code></pre><p>This uses the handy <code>static</code> syntax class that comes with <code>syntax/parse</code>, which implicitly handles the call to <code>syntax-local-value</code> and produces a nice error message if the value returned does not match a predicate. All we have to do is apply the transformer value bound to the <code>trans.value</code> attribute using <code>local-apply-transformer</code>, and now the <code>expand-inside</code> macro can be written in just a couple lines of code:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-syntax-parser</span> <span class="n">expand-inside</span>
<span class="kd">#:track-literals</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">form:do-expand-inside</span><span class="p">)</span> <span class="o">#'</span><span class="n">form.expansion</span><span class="p">])</span></code></pre><p>(Using the <code>#:track-literals</code> option, also new in Racket v7.0, ensures that Check Syntax will be able to recognize the uses of <code>$expand</code> that disappear after <code>expand-inside</code> is expanded.)</p><p>Putting everything together, our example from above really works:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">clj-binding-pairs</span> <span class="p">[{</span><span class="n">~seq</span> <span class="n">x:id</span> <span class="n">e:expr</span><span class="p">}</span> <span class="k">...</span><span class="p">])</span>
<span class="p">([</span><span class="n">x</span> <span class="n">e</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span>
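</code></pre><p>For anyone who wants to run this without reconstructing it from the snippets above, here is a sketch of the same code assembled into a single module. The <code>require</code> lines are an assumption about what the snippets need, since the post never shows them; the gist linked at the end of this post remains the authoritative version.</p><pre><code class="pygments">#lang racket/base

;; Assumed requires: not shown in the post, chosen to cover the names used below.
(require (for-syntax racket/base
                     racket/function          ; disjoin
                     syntax/apply-transformer ; local-apply-transformer
                     syntax/parse)            ; syntax-parse, static, syntax classes
         syntax/parse/define)                 ; define-simple-macro, define-syntax-parser

;; Marker binding; using it outside expand-inside is a syntax error.
(define-syntax ($expand stx)
  (raise-syntax-error #f "illegal outside an ‘expand-inside’ form" stx))

(begin-for-syntax
  ;; Applies the macro named inside ($expand form) to form, one step only.
  (define (do-$expand stx)
    (syntax-parse stx
      [(_ {~and form {~or trans (trans . _)}})
       #:declare trans (static (disjoin procedure? set!-transformer?)
                               "syntax transformer")
       (local-apply-transformer (attribute trans.value)
                                #'form
                                'expression)]))

  ;; Walks a syntax object, replacing each $expand use with its expansion.
  (define-syntax-class do-expand-inside
    #:literals [$expand]
    #:attributes [expansion]
    [pattern {~or $expand ($expand . _)}
             #:with :do-expand-inside (do-$expand this-syntax)]
    [pattern (a:do-expand-inside . b:do-expand-inside)
             #:attr expansion
             (let ([reassembled (cons (attribute a.expansion)
                                      (attribute b.expansion))])
               (if (syntax? this-syntax)
                   (datum->syntax this-syntax reassembled
                                  this-syntax this-syntax)
                   reassembled))]
    [pattern _ #:attr expansion this-syntax]))

(define-syntax-parser expand-inside
  #:track-literals
  [(_ form:do-expand-inside) #'form.expansion])

;; The example from the start of the post.
(define-simple-macro (clj-binding-pairs [{~seq x:id e:expr} ...])
  ([x e] ...))

(expand-inside
 (let ($expand (clj-binding-pairs [x 1 y 2]))
   (+ x y))) ; => 3</code></pre><p>With those definitions loaded, the interaction at the REPL looks like this:</p><pre><code class="pygments">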
<span class="nb">></span> <span class="p">(</span><span class="n">expand-inside</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="n">$expand</span>
<span class="p">(</span><span class="n">clj-binding-pairs</span>
<span class="p">[</span><span class="n">x</span> <span class="mi">1</span>
<span class="n">y</span> <span class="mi">2</span><span class="p">]))</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="n">y</span><span class="p">)))</span>
<span class="mi">3</span></code></pre><p>That’s it. All told, the entire implementation is only about 30 lines of code. For a full, compilable, working example, see <a href="https://gist.github.com/lexi-lambda/65d69043023b519694f50dfca2dc7d33">this gist</a>.</p><ol class="footnotes"></ol></article>Custom core forms in Racket, part II: generalizing to arbitrary expressions and internal definitions2018-09-13T00:00:00Z2018-09-13T00:00:00ZAlexis King<article><p>In my <a href="/blog/2018/04/15/reimplementing-hackett-s-type-language-expanding-to-custom-core-forms-in-racket/">previous blog post</a>, I covered the process involved in creating a small language with a custom set of core forms. Specifically, it discussed what was necessary to create Hackett’s type language, which involved expanding to custom expressions. While somewhat involved, Hackett’s type language was actually a relatively simple example to use, since it only made use of a subset of the linguistic features Racket supports. In this blog post, I’ll demonstrate how that same technique can be generalized to support runtime bindings and internal definitions, two key concepts useful if intending to develop a more featureful language than Hackett’s intentionally-restrictive type system.</p><h2><a name="what-are-internal-definitions"></a>What are internal definitions?</h2><p>This blog post is going to be largely focused on how to properly implement a form that handles the expansion of <em>internal definitions</em> in Racket. This is a tricky topic to get right, but before we can discuss internal definitions, we have to establish what definitions themselves are and how they relate to other binding forms.</p><p>In a traditional Lisp, there are two kinds of bindings: top-level bindings and local bindings. In Scheme and its descendants, this distinction is characterized by two different binding forms, <code>define</code> and <code>let</code>. 
To a first approximation, <code>define</code> is used for defining top-level, global bindings, and it resembles variable definitions in many mainstream languages in the sense that definitions using <code>define</code> are not really expressions. They don’t produce a value, they define a new binding. Definitions written with <code>define</code> look like this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="n">x</span> <span class="mi">42</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">y</span> <span class="s2">"hello"</span><span class="p">)</span></code></pre><p>Each definition is made up of two parts: the <em>binding identifier</em>, in this case <code>x</code> and <code>y</code>, and the <em>right hand side</em>, or RHS for short. Each RHS is a single expression that will be evaluated and used as the value for the introduced binding.</p><p>In Scheme and Racket, <code>define</code> also supports a shorthand form for defining functions in a natural syntax without the explicit need to write <code>lambda</code>, which looks like this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">double</span> <span class="n">x</span><span class="p">)</span>
<span class="p">(</span><span class="nb">*</span> <span class="n">x</span> <span class="mi">2</span><span class="p">))</span></code></pre><p>However, this is just syntactic sugar. The above form is really just a macro for the following equivalent, expanded version:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="n">double</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">*</span> <span class="n">x</span> <span class="mi">2</span><span class="p">)))</span></code></pre><p>Since we only care about fully-expanded programs, we’ll focus exclusively on the expanded version of <code>define</code> in this blog post, since if we handle that, we’ll also handle the function shorthand’s expansion.</p><p>In contrast to <code>define</code>, there is also <code>let</code>, which has a rather different shape. A <code>let</code> form <em>is</em> an expression, and it creates local bindings in a delimited scope:</p><pre><code class="pygments"><span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">x</span> <span class="mi">2</span><span class="p">]</span>
<span class="p">[</span><span class="n">y</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="n">y</span><span class="p">))</span></code></pre><p>The binding clauses of a <code>let</code> expression are known as the <em>binding pairs</em>, and the sequence of expressions afterwards are known as the <em>body</em> of the <code>let</code>. Each binding pair consists of a binding identifier and a RHS, just like a top-level definition created with <code>define</code>, but while <code>define</code> is a standalone form, the binding pairs cannot meaningfully exist outside of a <code>let</code>—they are recognized as part of the grammar of the <code>let</code> form itself.</p><p>Like other Lisps, Racket distinguishes between top-level—or, more precisely, <em>module-level</em>—bindings and local bindings. A module-level binding can be exported using <code>provide</code>, which will allow other modules to access the binding by importing the module with <code>require</code>. Such definitions are treated specially by the macroexpander, compiler, and runtime system alike. There is a pervasive, meaningful difference between module-level definitions and local definitions besides simply scope.</p><p>I am making an effort to make this as clear as possible before discussing internal definitions because without it, the following point can be rather confusing: internal definitions are written using <code>define</code>, but they are local bindings, <em>not</em> module-level ones! In Racket, <code>define</code> is allowed to appear in the body of virtually all block forms like <code>let</code>, so the following is a legal program:</p><pre><code class="pygments"><span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="n">x</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">y</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="n">y</span><span class="p">))</span></code></pre><p>This program is equivalent to the one expressed using <code>let</code>. In fact, when the Racket macroexpander expands these local uses of <code>define</code>, it actually translates them into uses of <code>letrec</code>. After expanding the above expression, it would look closer to the following:</p><pre><code class="pygments"><span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">letrec</span> <span class="p">([</span><span class="n">x</span> <span class="mi">2</span><span class="p">]</span>
<span class="p">[</span><span class="n">y</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="n">y</span><span class="p">)))</span></code></pre><p>In this sense, <code>define</code> is a form with a double life in Racket. When used at the module level, it creates module-level definitions, which remain in a fully-expanded program and can be imported by other modules. When used inside local blocks, it creates internal definitions, which do not remain in fully expanded programs, since they are translated into recursive local binding forms.</p><p>In this blog post, we will ignore module-level definitions. Like in the previous blog post, we will focus exclusively on expanding expressions, not whole modules. However, we will extend our language to allow internal definitions inside local binding forms, and we will translate them into <code>letrec</code> forms in the same way as the Racket macroexpander.</p><h2><a name="revisiting-and-generalizing-the-expression-expander"></a>Revisiting and generalizing the expression expander</h2><p>In the previous blog post, our expander expanded types, which were essentially expressions from the perspective of the Racket macroexpander. We wrote a syntax class that handled the parsing of a restricted type grammar that disallowed most Racket-level expression forms, like <code>begin</code>, <code>if</code>, <code>#%plain-lambda</code>, and <code>quote</code>. After all, Hackett is not dependently-typed, and it disallows explicit type abstraction to preserve type inference, so it would be a very bad thing if we allowed <code>if</code> or explicit lambda abstraction to appear in our types. For this blog post, however, we will restructure the type expander to handle the full grammar of expressions permitted by Racket.</p><p>While the syntax class approach used in the previous blog post was cute, this blog post will use ordinary functions defined at phase 1 instead of syntax classes. 
In practice, this provides superior error reporting, since syntax errors are reported in terms of the form that went wrong, not the form prior to expansion. Since we can still use <code>syntax-parse</code> to parse the arguments to these functions, we don’t lose any of the expressive power of its pattern language.</p><p>To start, we’ll extract the call to <code>local-expand</code> into its own function. This corresponds to the <code>type</code> syntax class from the previous blog post, but we’ll use phase 1 parameters to avoid threading so many explicit function arguments around:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="n">current-context</span> <span class="p">(</span><span class="nb">make-parameter</span> <span class="no">#f</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">current-stop-list</span> <span class="p">(</span><span class="nb">make-parameter</span> <span class="p">(</span><span class="nb">list</span> <span class="o">#'</span><span class="k">define-values</span> <span class="o">#'</span><span class="k">define-syntaxes</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">current-intdef-ctx</span> <span class="p">(</span><span class="nb">make-parameter</span> <span class="no">#f</span><span class="p">))</span>
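<span class="c1">;; Note: the three parameters above supply, in order, the context, stop-list,</span>
<span class="c1">;; and internal-definition-context arguments that local-expand takes after stx,</span>
<span class="c1">;; so callers can adjust any of them with parameterize before calling current-expand.</span>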
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">current-expand</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="nb">local-expand</span> <span class="n">stx</span>
<span class="p">(</span><span class="n">current-context</span><span class="p">)</span>
<span class="p">(</span><span class="n">current-stop-list</span><span class="p">)</span>
<span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">))))</span></code></pre><p>Due to the way <code>local-expand</code> implicitly extends the stop list, as discussed in the previous blog post, we can initialize the stop list to a list containing just <code>define-values</code> and <code>define-syntaxes</code>, and the other forms we care about will be included automatically.</p><p>Next, we’ll use this function to implement an <code>expand-expression</code> function, which will emulate the way the expander expands a single expression, as the name implies. We’ll ignore any custom core forms for now, so we’ll focus exclusively on the Racket core forms.</p><p>A few of Racket’s core forms are not actually subject to any expansion at all, and they expand to themselves. These forms are <code>quote</code>, <code>quote-syntax</code>, and <code>#%variable-reference</code>. Additionally, <code>#%top</code> is not something useful to handle ourselves, since it involves no recursive expansion, so we’ll treat it as if it expands to itself as well and allow the expander to raise any unbound identifier errors it produces. Here’s what the <code>expand-expression</code> function looks like when it handles only these forms:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">expand-expression</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-context</span> <span class="o">'</span><span class="ss">expression</span><span class="p">])</span>
<span class="p">(</span><span class="n">current-expand</span> <span class="n">stx</span><span class="p">))</span>
<span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">kernel-literals</span><span class="p">]</span>
<span class="p">[({</span><span class="n">~or</span> <span class="k">quote</span> <span class="ss">quote-syntax</span> <span class="k">#%top</span> <span class="k">#%variable-reference</span><span class="p">}</span> <span class="n">~!</span> <span class="o">.</span> <span class="k">_</span><span class="p">)</span>
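<span class="c1">;; Note: ~! is syntax-parse's cut. Once one of these heads has matched, no later</span>
<span class="c1">;; clause is tried, and the form is returned as-is via this-syntax.</span>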
<span class="n">this-syntax</span><span class="p">]))</span></code></pre><p>Another set of Racket core forms are simple expressions which contain subforms, all of which are themselves expressions. These forms include things like <code>#%expression</code>, <code>begin</code>, and <code>if</code>, and they can be expanded recursively. We’ll add another clause to handle these, which can be written with a straightforward recursive call to <code>expand-expression</code>:</p><pre><code class="pygments"><span class="p">[({</span><span class="n">~and</span> <span class="n">head</span> <span class="p">{</span><span class="n">~or</span> <span class="k">#%expression</span> <span class="k">#%plain-app</span> <span class="k">begin</span> <span class="k">begin0</span> <span class="k">if</span> <span class="k">with-continuation-mark</span><span class="p">}}</span> <span class="n">~!</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">form*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">form*</span> <span class="k">...</span><span class="p">))]</span></code></pre><p>Another easy form to handle is <code>set!</code>, since it also requires simple recursive expansion, but it can’t be handled in the same way as the above forms since one of its subforms (the variable to mutate) should not be expanded. It needs another small clause:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:set!</span> <span class="n">~!</span> <span class="n">x:id</span> <span class="n">rhs</span><span class="p">)</span>
<span class="p">(</span><span class="n">quasisyntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">x</span> <span class="o">#,</span><span class="p">(</span><span class="n">expand-expression</span> <span class="o">#'</span><span class="n">rhs</span><span class="p">)))]</span></code></pre><p>The remaining expressions are harder, since they are all binding forms. Fully-expanded Racket code has four local binding forms: <code>#%plain-lambda</code>, <code>case-lambda</code>, <code>let-values</code>, and <code>letrec-values</code>. Additionally, as discussed in the previous blog post, <code>local-expand</code> can also produce <code>letrec-syntaxes+values</code> forms, which result from local syntax bindings. In the type expander, we completely disallowed runtime bindings from appearing in the resulting program, so we completely removed <code>letrec-syntaxes+values</code> in our expansion, but in the case of handling arbitrary Racket programs, we actually want to leave a <code>letrec-values</code> form behind to hold any runtime bindings (i.e. the <code>values</code> part of <code>letrec-syntaxes+values</code>).</p><p>We’ll start with <code>#%plain-lambda</code>, which is the simplest of the five aforementioned binding forms. It binds a sequence of identifiers at runtime, and they are in scope within the body of the lambda expression. Just as we created and used an internal-definition context to hold the bindings of a <code>letrec-syntaxes+values</code> form in the previous blog post, we’ll do the same for Racket’s other binding forms as well:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:#%plain-lambda</span> <span class="n">~!</span> <span class="p">[</span><span class="n">x:id</span> <span class="k">...</span><span class="p">]</span> <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="p">[</span><span class="n">x</span> <span class="k">...</span><span class="p">])</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">body*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="p">[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="n">body*</span> <span class="k">...</span><span class="p">))]</span></code></pre><p>However, the above handling of <code>#%plain-lambda</code> isn’t <em>quite</em> right, since the argument list can also include a “rest argument” binding in addition to a sequence of positional arguments. To accommodate this, we can introduce a simple syntax class that handles the different permutations:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">plain-formals</span>
<span class="kd">#:description</span> <span class="s2">"formals"</span>
<span class="kd">#:attributes</span> <span class="p">[[</span><span class="n">id</span> <span class="mi">1</span><span class="p">]]</span>
<span class="kd">#:commit</span>
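<span class="c1">;; Note: the first pattern below matches a proper list of identifiers such as (a b c).</span>
<span class="c1">;; The second matches formals with a rest argument such as (a b . rest), and its</span>
<span class="c1">;; #:with clause folds the rest identifier into the same flat id attribute.</span>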
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">id:id</span> <span class="k">...</span><span class="p">)]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">id*:id</span> <span class="k">...</span> <span class="o">.</span> <span class="n">id**:id</span><span class="p">)</span> <span class="kd">#:with</span> <span class="p">[</span><span class="n">id</span> <span class="k">...</span><span class="p">]</span> <span class="o">#'</span><span class="p">[</span><span class="n">id*</span> <span class="k">...</span> <span class="n">id**</span><span class="p">]]))</span></code></pre><p>Now we can use this to adjust <code>#%plain-lambda</code> to handle rest arguments:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:#%plain-lambda</span> <span class="n">~!</span> <span class="n">formals:plain-formals</span> <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">formals.id</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="kd">#:with</span> <span class="n">formals*</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="n">formals</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">body*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">formals*</span> <span class="n">body*</span> <span class="k">...</span><span class="p">))]</span></code></pre><p>Next, we’ll handle <code>case-lambda</code>. As it turns out, expanding <code>case-lambda</code> is almost exactly the same as expanding <code>#%plain-lambda</code>, except that it has multiple clauses. Since each clause is expanded identically to the body of a <code>#%plain-lambda</code>, and it even has the same shape, the clauses can be extracted into a separate syntax class to share code between the two forms:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">lambda-clause</span>
<span class="kd">#:description</span> <span class="no">#f</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
<span class="kd">#:commit</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">[</span><span class="n">formals:plain-formals</span> <span class="n">body</span> <span class="k">...</span><span class="p">]</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">formals.id</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="kd">#:with</span> <span class="n">formals*</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="n">formals</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">body*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)))</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">[</span><span class="n">formals*</span> <span class="n">body*</span> <span class="k">...</span><span class="p">]]))</span></code></pre><p>Now, both <code>#%plain-lambda</code> and <code>case-lambda</code> can be handled in a few lines of code each:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:#%plain-lambda</span> <span class="n">~!</span> <span class="o">.</span> <span class="n">clause:lambda-clause</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="o">.</span> <span class="n">clause.expansion</span><span class="p">))]</span>
<span class="p">[(</span><span class="n">head:case-lambda</span> <span class="n">~!</span> <span class="n">clause:lambda-clause</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">clause.expansion</span> <span class="k">...</span><span class="p">))]</span></code></pre><p>Finally, we need to tackle the three <code>let</code> forms. None of these involve any fundamentally new ideas, but they are a little bit more involved than the variants of lambda due to the need to handle the RHSs. Each variant is slightly different, but not dramatically so: the bindings aren’t in scope when expanding the RHSs of <code>let-values</code>, but they are for <code>letrec-values</code> and <code>letrec-syntaxes+values</code>, and <code>letrec-syntaxes+values</code> creates transformer bindings and must evaluate some RHSs in phase 1 while <code>let-values</code> and <code>letrec-values</code> exclusively bind runtime bindings. It would be possible to implement these three forms in separate clauses, but since we’d ideally like to duplicate as little code as possible, we can write a rather elaborate <code>syntax/parse</code> pattern to handle all three binding forms all at once.</p><p>We’ll start by handling <code>let-values</code> alone to keep things simple:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:let-values</span> <span class="n">~!</span> <span class="p">([(</span><span class="n">x:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="nb">append*</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="kd">#:with</span> <span class="p">[[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="p">[[</span><span class="n">x</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">])</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">rhs*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">))</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">body*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span> <span class="k">...</span><span class="p">))]</span></code></pre><p>This isn’t dramatically different from the implementation of <code>#%plain-lambda</code>. The only difference is that we have to recursively invoke <code>expand-expression</code> on the RHSs in addition to expanding the body expressions. To handle <code>letrec-values</code> in the same clause, however, we’ll have to get a little more creative.</p><p>So far, we haven’t actually tapped very far into <code>syntax/parse</code>’s pattern language over the course of these two blog posts. The full language available to patterns is rather extensive, and we can take advantage of that to write a modification of the above clause that handles both <code>let-values</code> and <code>letrec-values</code> at once:</p><pre><code class="pygments"><span class="p">[({</span><span class="n">~or</span> <span class="p">{</span><span class="n">~and</span> <span class="n">head:let-values</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#f</span><span class="p">]}}</span>
<span class="p">{</span><span class="n">~and</span> <span class="n">head:letrec-values</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#t</span><span class="p">]}}}</span>
<span class="n">~!</span> <span class="p">([(</span><span class="n">x:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="nb">append*</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="kd">#:with</span> <span class="p">[[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="p">[[</span><span class="n">x</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">])</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">rhs*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rec?</span><span class="p">)</span>
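<span class="c1">;; Note: rec? is #t only for letrec-values, whose RHSs may refer to the new</span>
<span class="c1">;; bindings, so the first branch expands them with intdef-ctx installed. For</span>
<span class="c1">;; let-values, the second branch expands the RHSs in the enclosing context.</span>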
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">body*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span> <span class="k">...</span><span class="p">))]</span></code></pre><p>The <code>~bind</code> pattern allows us to explicitly control how attributes are bound as part of the pattern-matching process, which allows us to track when we want to enable the recursive binding behavior of <code>letrec-values</code> in our handler code. Since the vast majority of the logic is otherwise identical, this is a significant improvement over duplicating the clause.</p><p>Adding support for <code>letrec-syntaxes+values</code> is done in the same general way, but the pattern is even more involved. In addition to tracking whether or not the bindings are recursive, we have to track if any syntax bindings were present at all, and if they were, bind them with <code>syntax-local-bind-syntaxes</code>:</p><pre><code class="pygments"><span class="p">[({</span><span class="n">~or</span> <span class="p">{</span><span class="n">~or</span> <span class="p">{</span><span class="n">~and</span> <span class="n">head:let-values</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#f</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#f</span><span class="p">]}}</span>
<span class="p">{</span><span class="n">~and</span> <span class="n">head:letrec-values</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#t</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#f</span><span class="p">]}}}</span>
<span class="p">{</span><span class="n">~seq</span> <span class="n">head:letrec-syntaxes+values</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#t</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#t</span><span class="p">]}</span>
<span class="n">~!</span> <span class="p">([(</span><span class="n">x/s:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs/s</span><span class="p">]</span> <span class="k">...</span><span class="p">)}}</span>
<span class="p">([(</span><span class="n">x:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="nb">append*</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)</span>
<span class="p">(</span><span class="k">when</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">stxs?</span><span class="p">)</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">xs/s</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x/s</span><span class="p">))]</span>
<span class="p">[</span><span class="n">rhs/s</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs/s</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">xs/s</span> <span class="n">rhs/s</span> <span class="n">intdef-ctx</span><span class="p">)))]</span>
<span class="kd">#:with</span> <span class="p">[[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="p">[[</span><span class="n">x</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">])</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">rhs*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rec?</span><span class="p">)</span>
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">body*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">stxs?</span><span class="p">)</span>
<span class="p">(</span><span class="n">~></span> <span class="p">(</span><span class="k">syntax/loc</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="k">letrec-values</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span> <span class="k">...</span><span class="p">))</span>
<span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span> <span class="k">...</span><span class="p">)))]</span></code></pre><p>This behemoth clause handles all three varieties of <code>let</code> forms that can appear in the result of <code>local-expand</code>. Notably, in the <code>letrec-syntaxes+values</code> case, we expand into <code>letrec-values</code>, since the transformer bindings are effectively erased, and we use <code>syntax-track-origin</code> to record that the result originally came from a use of <code>letrec-syntaxes+values</code>.</p><p>With these five clauses, we’ve handled all the special forms that can appear in expression position in Racket’s kernel language. To tie things off, we just need to handle the cases of a variable reference, which is represented by a bare identifier not bound to syntax, or literal data, like numbers or strings. We can add one more clause at the end to handle those:</p><pre><code class="pygments"><span class="p">[</span><span class="k">_</span>
<span class="n">this-syntax</span><span class="p">]</span></code></pre><p>Putting them all together, our <code>expand-expression</code> function looks as follows:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">expand-expression</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-context</span> <span class="o">'</span><span class="ss">expression</span><span class="p">])</span>
<span class="p">(</span><span class="n">current-expand</span> <span class="n">stx</span><span class="p">))</span>
<span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">kernel-literals</span><span class="p">]</span>
<span class="p">[({</span><span class="n">~or</span> <span class="k">quote</span> <span class="ss">quote-syntax</span> <span class="k">#%top</span> <span class="k">#%variable-reference</span><span class="p">}</span> <span class="n">~!</span> <span class="o">.</span> <span class="k">_</span><span class="p">)</span>
<span class="n">this-syntax</span><span class="p">]</span>
<span class="p">[({</span><span class="n">~and</span> <span class="n">head</span> <span class="p">{</span><span class="n">~or</span> <span class="k">#%expression</span> <span class="k">#%plain-app</span> <span class="k">begin</span> <span class="k">begin0</span> <span class="k">if</span> <span class="k">with-continuation-mark</span><span class="p">}}</span> <span class="n">~!</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">form*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">form*</span> <span class="k">...</span><span class="p">))]</span>
<span class="p">[(</span><span class="n">head:#%plain-lambda</span> <span class="n">~!</span> <span class="o">.</span> <span class="n">clause:lambda-clause</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="o">.</span> <span class="n">clause.expansion</span><span class="p">))]</span>
<span class="p">[(</span><span class="n">head:case-lambda</span> <span class="n">~!</span> <span class="n">clause:lambda-clause</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">clause.expansion</span> <span class="k">...</span><span class="p">))]</span>
<span class="p">[({</span><span class="n">~or</span> <span class="p">{</span><span class="n">~or</span> <span class="p">{</span><span class="n">~and</span> <span class="n">head:let-values</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#f</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#f</span><span class="p">]}}</span>
<span class="p">{</span><span class="n">~and</span> <span class="n">head:letrec-values</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#t</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#f</span><span class="p">]}}}</span>
<span class="p">{</span><span class="n">~seq</span> <span class="n">head:letrec-syntaxes+values</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#t</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#t</span><span class="p">]}</span>
<span class="n">~!</span> <span class="p">([(</span><span class="n">x/s:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs/s</span><span class="p">]</span> <span class="k">...</span><span class="p">)}}</span>
<span class="p">([(</span><span class="n">x:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="nb">append*</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)</span>
<span class="p">(</span><span class="k">when</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">stxs?</span><span class="p">)</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">xs/s</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x/s</span><span class="p">))]</span>
<span class="p">[</span><span class="n">rhs/s</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs/s</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">xs/s</span> <span class="n">rhs/s</span> <span class="n">intdef-ctx</span><span class="p">)))]</span>
<span class="kd">#:with</span> <span class="p">[[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="p">[[</span><span class="n">x</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">])</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">rhs*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rec?</span><span class="p">)</span>
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">body*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)))</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">stxs?</span><span class="p">)</span>
<span class="p">(</span><span class="n">~></span> <span class="p">(</span><span class="k">syntax/loc</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="k">letrec-values</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span> <span class="k">...</span><span class="p">))</span>
<span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span> <span class="k">...</span><span class="p">)))]</span>
<span class="p">[</span><span class="k">_</span>
<span class="n">this-syntax</span><span class="p">])))</span></code></pre><p>If we try it out, we’ll see that it really does work! Even complicated local binding forms are handled properly by our expander:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="n">expand-expression</span>
<span class="o">#'</span><span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">x</span> <span class="mi">42</span><span class="p">])</span>
<span class="p">(</span><span class="k">letrec-syntax</span> <span class="p">([</span><span class="n">y</span> <span class="p">(</span><span class="nb">make-rename-transformer</span> <span class="o">#'</span><span class="n">z</span><span class="p">)]</span>
<span class="p">[</span><span class="n">z</span> <span class="p">(</span><span class="nb">make-rename-transformer</span> <span class="o">#'</span><span class="n">x</span><span class="p">)])</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">y</span> <span class="mi">3</span><span class="p">))))</span>
<span class="n">#<syntax</span> <span class="p">(</span><span class="k">let-values</span> <span class="p">(((</span><span class="n">x</span><span class="p">)</span> <span class="o">'</span><span class="mi">42</span><span class="p">))</span>
<span class="p">(</span><span class="k">letrec-values</span> <span class="p">()</span>
<span class="p">(</span><span class="k">#%plain-app</span> <span class="nb">+</span> <span class="n">x</span> <span class="o">'</span><span class="mi">3</span><span class="p">)))</span><span class="nb">></span></code></pre><p>We are now able to expand arbitrary Racket expressions in the same way that the expander does. While this might not seem immediately useful—after all, we haven’t actually gained anything here over just calling <code>local-expand</code> with an empty stop list—we can use this as the basis of an expander that can extensibly handle custom core forms, which I may cover in a future blog post.</p><h2><a name="adding-support-for-internal-definitions"></a>Adding support for internal definitions</h2><p>In the previous section, we defined an expander that could expand arbitrary Racket expressions, but our expander is still imperfect: we still do not support internal definitions. For all forms that have bodies, including <code>#%plain-lambda</code>, <code>case-lambda</code>, <code>let-values</code>, <code>letrec-values</code>, and <code>letrec-syntaxes+values</code>, Racket permits the use of internal definitions.</p><p>In practice, internal-definition contexts allow for an increased degree of modularity compared to traditional local binding forms, since they provide an <em>extensible</em> binding language. Users may mix many different binding forms within a single definition context, such as <code>define</code>, <code>define-syntax</code>, <code>match-define</code>, and even <code>struct</code>. However, this means the rewriting process described earlier in this blog post is not as simple as detecting the definitions and lifting them into a local binding form, since it’s not immediately apparent which forms are binding forms and which are expressions!</p><p>For this reason, expanding internal-definition contexts happens to be a nontrivial problem in itself. 
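</p><p>To make that mix of forms concrete, here is a minimal sketch. The function names <code>with-internal-definitions</code> and <code>with-local-binding-form</code> are invented for illustration, and the second function is a hand-written approximation of the rewriting, not the expander’s actual output. The first body interleaves a runtime definition, a syntax definition, and an expression; the second spells out roughly the same bindings with <code>letrec-syntaxes+values</code>:</p><pre><code class="pygments">#lang racket/base
(require (for-syntax racket/base))

;; A body that interleaves a runtime definition, a syntax definition,
;; and an expression in a single internal-definition context.
(define (with-internal-definitions)
  (define x 42)
  (define-syntax y (make-rename-transformer #'x))
  (+ y 3))

;; A hand-written approximation of the same body as a local binding form:
;; syntax bindings go in the first clause list, runtime bindings in the second.
(define (with-local-binding-form)
  (letrec-syntaxes+values ([(y) (make-rename-transformer #'x)])
                          ([(x) 42])
    (+ y 3)))

(with-internal-definitions)
(with-local-binding-form)
</code></pre><p>Both calls should produce the same result, since <code>y</code> is a rename transformer for <code>x</code> in each version. Nothing in the first body marks <code>define-syntax</code> as a binding form ahead of time, which is why an expander has to partially expand each form before it can tell definitions and expressions apart.</p><p>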
Expanding a body takes a little more care than expanding expressions does, since it requires using partial expansion to discover which forms are definitions and which forms are expressions. We must take care to never expand too much, but also to expand enough that we reveal all uses of <code>define-values</code> and <code>define-syntaxes</code> (which all definition forms eventually expand into). We also must handle the splicing behavior of <code>begin</code>, which is necessary to allow single forms to expand into multiple definitions.</p><p>We’ll start by writing an <code>expand-body</code> function, which operates similarly to our previous <code>expand-expression</code> function. Unlike <code>expand-expression</code>, <code>expand-body</code> will accept a <em>list</em> of syntax objects, which represents the sequence of forms that make up the body. Logically, each body will create a first-class definition context with <code>syntax-local-make-definition-context</code> to represent the sequence of definitions:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">expand-body</span> <span class="n">stxs</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-context</span> <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nb">gensym</span><span class="p">))]</span>
<span class="p">[</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">)))</span></code></pre><p>The bulk of our <code>expand-body</code> function will be a loop that partially expands body forms, adds definitions to the definition context as it discovers them, and returns the expressions and runtime definitions to be rewritten into binding pairs for a <code>letrec-values</code> form. The loop will also track so-called <em>disappeared uses</em> and <em>disappeared bindings</em>, which are attached to the expansion using syntax properties to allow tools like DrRacket to learn about the binding structure of phase 1 definitions that are erased as part of macroexpansion.</p><p>The skeleton of this loop is relatively straightforward to write. We will iterate over the syntax objects that make up the body, expand them, and process the expansion using <code>syntax-parse</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">expand-body</span> <span class="n">stxs</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-context</span> <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nb">gensym</span><span class="p">))]</span>
<span class="p">[</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="k">define-values</span> <span class="p">[</span><span class="n">binding-clauses</span> <span class="n">exprs</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">]</span>
<span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">([</span><span class="n">stxs</span> <span class="n">stxs</span><span class="p">]</span>
<span class="p">[</span><span class="n">binding-clauses</span> <span class="o">'</span><span class="p">()]</span>
<span class="p">[</span><span class="n">exprs</span> <span class="o">'</span><span class="p">()]</span>
<span class="p">[</span><span class="n">disappeared-uses</span> <span class="o">'</span><span class="p">()]</span>
<span class="p">[</span><span class="n">disappeared-bindings</span> <span class="o">'</span><span class="p">()])</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">empty?</span> <span class="n">stxs</span><span class="p">)</span>
<span class="p">(</span><span class="nb">values</span> <span class="p">(</span><span class="nb">reverse</span> <span class="n">binding-clauses</span><span class="p">)</span> <span class="p">(</span><span class="nb">reverse</span> <span class="n">exprs</span><span class="p">)</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="p">(</span><span class="n">current-expand</span> <span class="p">(</span><span class="nb">first</span> <span class="n">stxs</span><span class="p">))</span>
<span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">kernel-literals</span><span class="p">]</span>
<span class="p">)))))))</span></code></pre><p>The hard part, of course, is actually handling the potential results of that expansion. We need to handle three forms specially: <code>begin</code>, <code>define-values</code>, and <code>define-syntaxes</code>. All other results of partial expansion will be treated as expressions. We’ll start by handling <code>begin</code>, since it’s the simplest case; we only need to prepend the subforms to the list of body forms to be processed, then continue looping:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:begin</span> <span class="n">~!</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">append</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">)</span> <span class="n">stxs</span><span class="p">)</span> <span class="n">binding-clauses</span> <span class="n">exprs</span>
<span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span></code></pre><p>However, as is often the case, this isn’t quite perfect: it discards the fact that these forms came from a surrounding <code>begin</code>, which tools like DrRacket want to know. To solve this problem, the expander adjusts the <code>origin</code> property for all spliced forms, which we can mimic using <code>syntax-track-origin</code>:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:begin</span> <span class="n">~!</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">append</span> <span class="p">(</span><span class="k">for/list</span> <span class="p">([</span><span class="n">form</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="n">form</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">))</span>
<span class="n">stxs</span><span class="p">)</span>
<span class="n">binding-clauses</span> <span class="n">exprs</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span></code></pre><p>This is sufficient for <code>begin</code>, so we can move on to the actual definitions themselves. This actually isn’t too hard, since we just need to add the bindings we discover to the first-class definition context and preserve <code>define-values</code> bindings as binding pairs:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:define-values</span> <span class="n">~!</span> <span class="p">[</span><span class="n">x:id</span> <span class="k">...</span><span class="p">]</span> <span class="n">rhs</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="o">#'</span><span class="p">[(</span><span class="n">x</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="n">binding-clauses</span><span class="p">)</span> <span class="n">exprs</span>
<span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span></code></pre><p>This solution is missing one thing, however, which is the use of <code>syntax-local-identifier-as-binding</code> to remove any use-site scopes that were added to the binding identifier while expanding the binding form in the definition context. Explaining precisely why this is necessary is outside the scope of this blog post; it is best understood by reading <a href="http://www.cs.utah.edu/plt/scope-sets/pattern-macros.html#%28part._use-site%29">the section on use-site scopes</a> in <em>Binding as Sets of Scopes</em>, the paper that describes the theory behind Racket’s current macro system. In any case, the impact on our implementation is small:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:define-values</span> <span class="n">~!</span> <span class="p">[</span><span class="n">x:id</span> <span class="k">...</span><span class="p">]</span> <span class="n">rhs</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="nb">syntax-local-identifier-as-binding</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x*</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="o">#'</span><span class="p">[(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="n">binding-clauses</span><span class="p">)</span> <span class="n">exprs</span>
<span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span></code></pre><p>Finally, as with <code>begin</code>, we want to track that the binding pairs we generate actually came from a use of <code>define-values</code> (which in turn likely came from a use of some other definition form). Therefore, we’ll add another use of <code>syntax-track-origin</code> to copy and extend the necessary properties:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:define-values</span> <span class="n">~!</span> <span class="p">[</span><span class="n">x:id</span> <span class="k">...</span><span class="p">]</span> <span class="n">rhs</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="nb">syntax-local-identifier-as-binding</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x*</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="p">(</span><span class="n">loop</span>
<span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span>
<span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="o">#'</span><span class="p">[(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">)</span> <span class="n">binding-clauses</span><span class="p">)</span>
<span class="n">exprs</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span></code></pre><p>That’s it for <code>define-values</code>. All that’s left is to handle <code>define-syntaxes</code>, which is quite similar, but instead of storing the definition in a binding pair, its RHS is immediately evaluated and added to the definition context using <code>syntax-local-bind-syntaxes</code>:</p><pre><code class="pygments"><span class="p">[(</span><span class="n">head:define-syntaxes</span> <span class="n">~!</span> <span class="p">[</span><span class="n">x:id</span> <span class="k">...</span><span class="p">]</span> <span class="n">rhs</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="nb">syntax-local-identifier-as-binding</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x*</span><span class="p">)</span> <span class="o">#'</span><span class="n">rhs</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span> <span class="n">binding-clauses</span> <span class="n">exprs</span>
<span class="p">(</span><span class="nb">cons</span> <span class="o">#'</span><span class="n">head</span> <span class="n">disappeared-uses</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x*</span><span class="p">)</span> <span class="n">disappeared-bindings</span><span class="p">))]</span></code></pre><p>As the above snippet indicates, this is also where the disappeared uses and disappeared bindings come in. In previous cases, we’ve used <code>syntax-track-origin</code> to indicate that a piece of syntax was the result of expanding a different piece of syntax, but in this case, <code>define-syntaxes</code> doesn’t expand into anything at all; it’s simply removed from the expansion entirely. Therefore, we need to resort to tracking the information in syntax properties on the resulting <code>letrec-values</code> form, so we’ll save them for later.</p><p>Finally, to finish things up, we can add a catchall clause that handles all other forms, which are now guaranteed to be expressions:</p><pre><code class="pygments"><span class="p">[</span><span class="k">_</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span> <span class="n">binding-clauses</span> <span class="p">(</span><span class="nb">cons</span> <span class="n">this-syntax</span> <span class="n">exprs</span><span class="p">)</span>
<span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span></code></pre><p>This completes our loop that processes definition forms, so all that’s left to do is handle the results. The only significant remaining work is to actually expand the RHSs of the binding pairs we collected and the body expressions, which can be done by calling our own <code>expand-expression</code> function directly:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="n">expanded-binding-clauses</span>
<span class="p">(</span><span class="k">for/list</span> <span class="p">([</span><span class="n">binding-clause</span> <span class="p">(</span><span class="nb">in-list</span> <span class="n">binding-clauses</span><span class="p">)])</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="n">binding-clause</span>
<span class="p">[[(</span><span class="n">x</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span>
<span class="p">(</span><span class="n">quasisyntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">[(</span><span class="n">x</span> <span class="k">...</span><span class="p">)</span> <span class="o">#,</span><span class="p">(</span><span class="n">expand-expression</span> <span class="o">#'</span><span class="n">rhs</span><span class="p">)])])))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">expanded-exprs</span> <span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="n">exprs</span><span class="p">))</span></code></pre><p>Finally, we can assemble all the pieces together into a single local binding form with the appropriate syntax properties:</p><pre><code class="pygments"><span class="p">(</span><span class="n">~></span> <span class="o">#`</span><span class="p">(</span><span class="k">letrec-values</span> <span class="o">#,</span><span class="n">expanded-binding-clauses</span> <span class="o">#,@</span><span class="n">expanded-exprs</span><span class="p">)</span>
<span class="p">(</span><span class="nb">syntax-property</span> <span class="o">'</span><span class="ss">disappeared-uses</span> <span class="n">disappeared-uses</span><span class="p">)</span>
<span class="p">(</span><span class="nb">syntax-property</span> <span class="o">'</span><span class="ss">disappeared-bindings</span> <span class="n">disappeared-bindings</span><span class="p">))</span></code></pre><p>That’s it. We’ve now written an <code>expand-body</code> function that can process internal definition contexts in the same way that the macroexpander does. Overall, the whole function is just under 45 lines of code:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">expand-body</span> <span class="n">stxs</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-context</span> <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nb">gensym</span><span class="p">))]</span>
<span class="p">[</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="k">define-values</span> <span class="p">[</span><span class="n">binding-clauses</span> <span class="n">exprs</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">]</span>
<span class="p">(</span><span class="k">let</span> <span class="n">loop</span> <span class="p">([</span><span class="n">stxs</span> <span class="n">stxs</span><span class="p">]</span>
<span class="p">[</span><span class="n">binding-clauses</span> <span class="o">'</span><span class="p">()]</span>
<span class="p">[</span><span class="n">exprs</span> <span class="o">'</span><span class="p">()]</span>
<span class="p">[</span><span class="n">disappeared-uses</span> <span class="o">'</span><span class="p">()]</span>
<span class="p">[</span><span class="n">disappeared-bindings</span> <span class="o">'</span><span class="p">()])</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">empty?</span> <span class="n">stxs</span><span class="p">)</span>
<span class="p">(</span><span class="nb">values</span> <span class="p">(</span><span class="nb">reverse</span> <span class="n">binding-clauses</span><span class="p">)</span> <span class="p">(</span><span class="nb">reverse</span> <span class="n">exprs</span><span class="p">)</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="p">(</span><span class="n">current-expand</span> <span class="p">(</span><span class="nb">first</span> <span class="n">stxs</span><span class="p">))</span>
<span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">kernel-literals</span><span class="p">]</span>
<span class="p">[(</span><span class="n">head:begin</span> <span class="n">~!</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">append</span> <span class="p">(</span><span class="k">for/list</span> <span class="p">([</span><span class="n">form</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="n">form</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">))</span>
<span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">))</span>
<span class="n">binding-clauses</span> <span class="n">exprs</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span>
<span class="p">[(</span><span class="n">head:define-values</span> <span class="n">~!</span> <span class="p">[</span><span class="n">x:id</span> <span class="k">...</span><span class="p">]</span> <span class="n">rhs</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="nb">syntax-local-identifier-as-binding</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x*</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="p">(</span><span class="n">loop</span>
<span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span>
<span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="o">#'</span><span class="p">[(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">)</span> <span class="n">binding-clauses</span><span class="p">)</span>
<span class="n">exprs</span> <span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]</span>
<span class="p">[(</span><span class="n">head:define-syntaxes</span> <span class="n">~!</span> <span class="p">[</span><span class="n">x:id</span> <span class="k">...</span><span class="p">]</span> <span class="n">rhs</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="nb">syntax-local-identifier-as-binding</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x*</span><span class="p">)</span> <span class="o">#'</span><span class="n">rhs</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span> <span class="n">binding-clauses</span> <span class="n">exprs</span>
<span class="p">(</span><span class="nb">cons</span> <span class="o">#'</span><span class="n">head</span> <span class="n">disappeared-uses</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x*</span><span class="p">)</span> <span class="n">disappeared-bindings</span><span class="p">))]</span>
<span class="p">[</span><span class="k">_</span>
<span class="p">(</span><span class="n">loop</span> <span class="p">(</span><span class="nb">rest</span> <span class="n">stxs</span><span class="p">)</span> <span class="n">binding-clauses</span> <span class="p">(</span><span class="nb">cons</span> <span class="n">this-syntax</span> <span class="n">exprs</span><span class="p">)</span>
<span class="n">disappeared-uses</span> <span class="n">disappeared-bindings</span><span class="p">)]))))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">expanded-binding-clauses</span>
<span class="p">(</span><span class="k">for/list</span> <span class="p">([</span><span class="n">binding-clause</span> <span class="p">(</span><span class="nb">in-list</span> <span class="n">binding-clauses</span><span class="p">)])</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="n">binding-clause</span>
<span class="p">[[(</span><span class="n">x</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span>
<span class="p">(</span><span class="n">quasisyntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">[(</span><span class="n">x</span> <span class="k">...</span><span class="p">)</span> <span class="o">#,</span><span class="p">(</span><span class="n">expand-expression</span> <span class="o">#'</span><span class="n">rhs</span><span class="p">)])])))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">expanded-exprs</span> <span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="n">exprs</span><span class="p">))</span>
<span class="p">(</span><span class="n">~></span> <span class="o">#`</span><span class="p">(</span><span class="k">letrec-values</span> <span class="o">#,</span><span class="n">expanded-binding-clauses</span> <span class="o">#,@</span><span class="n">expanded-exprs</span><span class="p">)</span>
<span class="p">(</span><span class="nb">syntax-property</span> <span class="o">'</span><span class="ss">disappeared-uses</span> <span class="n">disappeared-uses</span><span class="p">)</span>
<span class="p">(</span><span class="nb">syntax-property</span> <span class="o">'</span><span class="ss">disappeared-bindings</span> <span class="n">disappeared-bindings</span><span class="p">)))))</span></code></pre><p>The next step is to actually use this function. We need to replace certain recursive calls to <code>expand-expression</code> with calls to <code>expand-body</code>, but if we do this naïvely, we’ll have some problems. Currently, when we expand body forms, they’re always immediately inside another definition context (i.e. the bindings introduced by lambda formals or by <code>let</code> binding pairs), but they haven’t actually been expanded in that context yet. When we call <code>expand-body</code>, we create a nested context, which will inherit the bindings, but won’t automatically add the parent context’s scope. Therefore, we need to manually call <code>internal-definition-context-introduce</code> on the body syntax objects before calling <code>expand-body</code>. We can write a small helper function to make this easier:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">expand-body/in-ctx</span> <span class="n">stxs</span> <span class="n">ctx</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">add-ctx-scope</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">ctx</span> <span class="n">stx</span> <span class="o">'</span><span class="ss">add</span><span class="p">))</span>
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">ctx</span><span class="p">])</span>
<span class="p">(</span><span class="n">add-ctx-scope</span> <span class="p">(</span><span class="n">expand-body</span> <span class="p">(</span><span class="nb">map</span> <span class="n">add-ctx-scope</span> <span class="n">stxs</span><span class="p">))))))</span></code></pre><p>Now we just need to replace the relevant calls to <code>expand-expression</code> with calls to <code>expand-body/in-ctx</code>, starting with a minor adjustment to our <code>lambda-clause</code> syntax class from earlier:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">lambda-clause</span>
<span class="kd">#:description</span> <span class="no">#f</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
<span class="kd">#:commit</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">[</span><span class="n">formals:plain-formals</span> <span class="n">body</span> <span class="k">...</span><span class="p">]</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">formals.id</span><span class="p">)</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)]</span>
<span class="kd">#:with</span> <span class="n">formals*</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="n">formals</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">body*</span> <span class="p">(</span><span class="n">expand-body/in-ctx</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)</span> <span class="n">intdef-ctx</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">[</span><span class="n">formals*</span> <span class="n">body*</span><span class="p">]]))</span></code></pre><p>The only other change is in the handling of the various <code>let</code> forms, where we similarly replace the call to <code>expand-expression</code> on the body with a call to <code>expand-body/in-ctx</code>:</p><pre><code class="pygments"><span class="p">[({</span><span class="n">~or</span> <span class="p">{</span><span class="n">~or</span> <span class="p">{</span><span class="n">~and</span> <span class="n">head:let-values</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#f</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#f</span><span class="p">]}}</span>
<span class="p">{</span><span class="n">~and</span> <span class="n">head:letrec-values</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#t</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#f</span><span class="p">]}}}</span>
<span class="p">{</span><span class="n">~seq</span> <span class="n">head:letrec-syntaxes+values</span> <span class="p">{</span><span class="n">~bind</span> <span class="p">[</span><span class="n">rec?</span> <span class="no">#t</span><span class="p">]</span> <span class="p">[</span><span class="n">stxs?</span> <span class="no">#t</span><span class="p">]}</span>
<span class="n">~!</span> <span class="p">([(</span><span class="n">x/s:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs/s</span><span class="p">]</span> <span class="k">...</span><span class="p">)}}</span>
<span class="p">([(</span><span class="n">x:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="p">(</span><span class="n">current-intdef-ctx</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="p">(</span><span class="nb">append*</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x</span><span class="p">))</span> <span class="no">#f</span> <span class="n">intdef-ctx</span><span class="p">)</span>
<span class="p">(</span><span class="k">when</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">stxs?</span><span class="p">)</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">xs/s</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">x/s</span><span class="p">))]</span>
<span class="p">[</span><span class="n">rhs/s</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs/s</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">xs/s</span> <span class="n">rhs/s</span> <span class="n">intdef-ctx</span><span class="p">)))]</span>
<span class="kd">#:with</span> <span class="p">[[</span><span class="n">x*</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">internal-definition-context-introduce</span> <span class="n">intdef-ctx</span> <span class="o">#'</span><span class="p">[[</span><span class="n">x</span> <span class="k">...</span><span class="p">]</span> <span class="k">...</span><span class="p">])</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">rhs*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rec?</span><span class="p">)</span>
<span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-intdef-ctx</span> <span class="n">intdef-ctx</span><span class="p">])</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">map</span> <span class="n">expand-expression</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">rhs</span><span class="p">)))</span>
<span class="kd">#:with</span> <span class="n">body*</span> <span class="p">(</span><span class="n">expand-body/in-ctx</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">body</span><span class="p">)</span> <span class="n">intdef-ctx</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">stxs?</span><span class="p">)</span>
<span class="p">(</span><span class="n">~></span> <span class="p">(</span><span class="k">syntax/loc</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="k">letrec-values</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span><span class="p">))</span>
<span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">))</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="p">([(</span><span class="n">x*</span> <span class="k">...</span><span class="p">)</span> <span class="n">rhs*</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body*</span><span class="p">)))]</span></code></pre><p>With these changes, we’ve now extended our expression expander with the ability to expand internal definitions. We can see this in action on a simple example:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="n">expand-expression</span>
<span class="o">#'</span><span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="n">x</span> <span class="mi">42</span><span class="p">)</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">y</span> <span class="p">(</span><span class="nb">make-rename-transformer</span> <span class="o">#'</span><span class="n">z</span><span class="p">))</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">z</span> <span class="p">(</span><span class="nb">make-rename-transformer</span> <span class="o">#'</span><span class="n">x</span><span class="p">))</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">y</span> <span class="mi">3</span><span class="p">)))</span>
<span class="n">#<syntax</span> <span class="p">(</span><span class="k">let-values</span> <span class="p">()</span>
<span class="p">(</span><span class="k">letrec-values</span> <span class="p">([(</span><span class="n">x</span><span class="p">)</span> <span class="o">'</span><span class="mi">42</span><span class="p">])</span>
<span class="p">(</span><span class="k">#%app</span> <span class="nb">+</span> <span class="n">x</span> <span class="o">'</span><span class="mi">3</span><span class="p">)))</span><span class="nb">></span></code></pre><p>Just as we’d like, the transformer bindings were expanded and subsequently eliminated, and the runtime binding was collected into a <code>letrec-values</code> form. The outer <code>let-values</code> is left over from the outer <code>let</code>, which is needed only to create an internal-definition context to hold our internal definitions.</p><h2><a name="putting-the-expression-expander-to-work"></a>Putting the expression expander to work</h2><p>So far, we’ve done a lot of work to emulate the behavior of Racket’s macroexpander, and as the above example demonstrates, we’ve been fairly successful in that goal. However, you might be wondering <em>why</em> we did any of this, as replicating the behavior of <code>local-expand</code> is not very useful on its own. As mentioned above, this can be used as the foundation of an expander for custom core forms that extends, rather than replaces, the built-in Racket core forms. It can also be used to “cheat” and expand through the behavior of the <code>local-expand</code> stop list, which implicitly adds the Racket core forms to any non-empty stop list. Hopefully, I’ll have a chance to cover some of these things more deeply in the future, but for now, I’ll just give a small taste of the latter.</p><p>By using the power of our <code>expand-expression</code> function, it’s actually possible to use this kind of expression expander to do genuinely nefarious things, such as hijack the behavior of arbitrary macros! 
For example, we could do something evil like make <code>for</code> loops run in reverse order by adding <code>for</code> to <code>current-stop-list</code>, then adding an additional special case to <code>expand-expression</code> for <code>for</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="n">current-stop-list</span> <span class="p">(</span><span class="nb">make-parameter</span> <span class="p">(</span><span class="nb">list</span> <span class="o">#'</span><span class="k">define-values</span> <span class="o">#'</span><span class="k">define-syntaxes</span> <span class="o">#'</span><span class="k">for</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">expand-expression</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="p">(</span><span class="k">parameterize</span> <span class="p">([</span><span class="n">current-context</span> <span class="o">'</span><span class="ss">expression</span><span class="p">])</span>
<span class="p">(</span><span class="n">current-expand</span> <span class="n">stx</span><span class="p">))</span>
<span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">kernel-literals</span><span class="p">]</span>
<span class="kd">#:literals</span> <span class="p">[</span><span class="k">for</span><span class="p">]</span>
<span class="c1">; ...</span>
<span class="p">[(</span><span class="n">head:for</span> <span class="p">([</span><span class="n">x:id</span> <span class="n">seq:expr</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="n">body</span> <span class="n">...+</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="p">([</span><span class="n">x</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="nb">reverse</span> <span class="p">(</span><span class="nb">sequence->list</span> <span class="n">seq</span><span class="p">)))]</span> <span class="k">...</span><span class="p">)</span>
<span class="n">body</span> <span class="k">...</span><span class="p">))]</span>
<span class="c1">; ...</span>
<span class="p">)))</span></code></pre><p>Amazingly, because we’ve taken complete control of the expansion process, this will rewrite uses of <code>for</code> <em>even if they are introduced by macroexpansion</em>. For example, we could write a small macro that expands into a use of <code>for</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">print-up-to</span> <span class="n">n</span><span class="p">)</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">i</span> <span class="p">(</span><span class="nb">in-range</span> <span class="n">n</span><span class="p">)])</span>
<span class="p">(</span><span class="nb">println</span> <span class="n">i</span><span class="p">)))</span>
<span class="nb">></span> <span class="p">(</span><span class="n">print-up-to</span> <span class="mi">5</span><span class="p">)</span>
<span class="mi">0</span>
<span class="mi">1</span>
<span class="mi">2</span>
<span class="mi">3</span>
<span class="mi">4</span></code></pre><p>If we write a wrapper macro that applies our evil version of <code>expand-expression</code> to its body, then wrap a use of our <code>print-up-to</code> macro with it, it will execute the loop in reverse order:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-syntax-parser</span> <span class="n">hijack-for-loops</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">form:expr</span><span class="p">)</span> <span class="p">(</span><span class="n">expand-expression</span> <span class="o">#'</span><span class="n">form</span><span class="p">)])</span>
<span class="nb">></span> <span class="p">(</span><span class="n">hijack-for-loops</span>
<span class="p">(</span><span class="n">print-up-to</span> <span class="mi">5</span><span class="p">))</span>
<span class="mi">4</span>
<span class="mi">3</span>
<span class="mi">2</span>
<span class="mi">1</span>
<span class="mi">0</span></code></pre><p>On its own, this is not that impressive, since we could have just used <code>local-expand</code> on the body directly to achieve this. However, what’s remarkable about <code>hijack-for-loops</code> is that it will work even if the <code>for</code> loop is buried deep inside some arbitrary expression:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="k">define</span> <span class="n">foo</span>
<span class="p">(</span><span class="n">hijack-for-loops</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="n">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">n</span> <span class="p">(</span><span class="nb">*</span> <span class="n">x</span> <span class="mi">2</span><span class="p">))</span>
<span class="p">(</span><span class="n">print-up-to</span> <span class="n">n</span><span class="p">))))</span>
<span class="nb">></span> <span class="p">(</span><span class="n">foo</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">5</span>
<span class="mi">4</span>
<span class="mi">3</span>
<span class="mi">2</span>
<span class="mi">1</span>
<span class="mi">0</span></code></pre><p>Of course, this example is rather contrived—mucking with <code>for</code> loops like this isn’t useful at all, and nobody would really write <code>print-up-to</code> as a macro, anyway—but there is potential for using this technique to do more interesting things.</p><h2><a name="closing-thoughts"></a>Closing thoughts</h2><p>The system outlined in this blog post is not something I would recommend using in any real macro. It is enormously complicated, requires knowledge well above that of your average working macrologist, and it involves doing rather horrible things to the macro system, things it was undoubtedly never designed to do. Still, I believe this blog post is useful, for a few different reasons:</p><ol><li><p>The technology outlined in this post, while perhaps not directly applicable to existing real-world problems, provides a framework for implementing various new kinds of syntax transformations in Racket <em>without</em> extending the macro system. It demonstrates the expressive power of the macro system, and it hopefully lays the foundation for a better, higher-level interface for users who wish to define their own languages with custom core forms.</p></li><li><p>This system provides insight into the way the Racket macroexpander operates, <em>in terms of the userspace syntax API</em>. The canonical existing model of hygienic macroexpansion, in the aforementioned <a href="http://www.cs.utah.edu/plt/scope-sets/">Bindings as Sets of Scopes</a> paper, does not explain the workings of internal definition contexts in detail, and it certainly doesn’t explain them in terms that a Racket programmer would already be familiar with. 
By reencoding those ideas within the macro system itself, an advanced macro writer may be able to more easily connect concepts in the macro system’s implementation to concepts they have already been exposed to.</p></li><li><p>The capability of the proof-of-concept outlined here demonstrates that the limitation imposed by the existing implementation of the stop list (namely, the way it is implicitly extended with additional identifiers) is essentially artificial, and it can be hacked around with sufficient (albeit significant) effort. This isn’t enormously important, but it is somewhat relevant to a recent debate in <a href="https://github.com/racket/racket/issues/2154">a GitHub issue</a> about the handling of the <code>local-expand</code> stop list.</p></li><li><p>Finally, for myself as much as anyone else, this implementation records in a concise way (perhaps overly concise at times) the collection of very subtle details I’ve learned over the past six months about how information is preserved and propagated during the expansion process.</p></li></ol><p>This blog post is not for everybody. If you made it to the end, give yourself a pat on the back. If you made it to the end <em>and</em> understood everything you read: congratulations, you are a certified expert in Racket macro programming. If not, do not fear, and do not lose hope—I plan for something significantly more mellow next time.</p><p>As always, I’d like to give thanks to the people who contributed significantly, if indirectly, to the contents of this blog post, namely <a href="http://www.cs.utah.edu/~mflatt/">Matthew Flatt</a>, <a href="http://mballantyne.net">Michael Ballantyne</a>, and <a href="http://www.ccs.neu.edu/home/ryanc/">Ryan Culpepper</a>. 
And finally, for those interested, all of the code in this blog post can be found in a runnable form <a href="https://gist.github.com/lexi-lambda/c4f4b91ac9c0a555447d72d02e18be7b">in this GitHub gist</a>.</p><ol class="footnotes"></ol></article>Reimplementing Hackett’s type language: expanding to custom core forms in Racket2018-04-15T00:00:00Z2018-04-15T00:00:00ZAlexis King<article><p>In the past couple of weeks, I <a href="https://github.com/lexi-lambda/hackett/commit/ba64193da38f63dab2523f42c1b7614cdfa8c935">completely rewrote the implementation of Hackett’s type language</a> to improve the integration between the type representation and Racket’s macro system. The new type language effectively implements a way to reuse as much of the Racket macroexpanding infrastructure as possible while expanding a completely custom language, which uses a custom set of core forms. The fundamental technique used to do so is not novel, and it seems to be rediscovered every so often, but it has never been published or documented anywhere, and getting it right involves understanding a great number of subtleties about the Racket macro system. While I cannot entirely eliminate the need to understand those subtleties, in this blog post, I hope to make the secret sauce considerably less secret.</p><p>This blog post is both a case study on how I implemented the expander for Hackett’s new type language and a discussion of how such a technique can apply more generally. Like <a href="/blog/2017/10/27/a-space-of-their-own-adding-a-type-namespace-to-hackett/">my previous blog post on Hackett</a>, which covered the implementation of its namespace system, the implementation section of this blog post is highly technical and probably requires significant experience with Racket’s macro system to completely comprehend. 
However, the surrounding material is written to be more accessible, so even if you are not a Racket programmer, you should hopefully be able to understand the big ideas behind this change.</p><h2><a name="what-are-core-forms"></a>What are core forms?</h2><p>Before we can get started writing <em>custom core forms</em>, we need to understand the meaning of Racket’s plain old <em>core forms</em>. What is a core form? In order to answer that question, we need to think about how Racket’s expansion and compilation model works.</p><p>To start, let’s consider a simple Racket program. Racket programs are organized into modules, which are usually written with a <code>#lang</code> line at the top. In this case, we’ll use <code>#lang racket</code> to keep things simple:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">add2</span> <span class="n">x</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="mi">2</span><span class="p">))</span>
<span class="p">(</span><span class="n">add2</span> <span class="mi">3</span><span class="p">)</span></code></pre><p>How does Racket see this program? Well, before it can do anything with it, it must parse the program text, which is known in Racket as <em>reading</em> the program. The <code>#lang</code> line controls how the program is read—some <code>#lang</code>s provide parsers that allow syntax that is very different from the parser used for <code>#lang racket</code>—but no matter which reader is used, the result is an s-expression (actually a syntax object, but essentially an s-expression) representing a module. In the case of the above program, the result looks like this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">module</span> <span class="n">m</span> <span class="n">racket</span>
<span class="p">(</span><span class="k">#%module-begin</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">add2</span> <span class="n">x</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="mi">2</span><span class="p">))</span>
<span class="p">(</span><span class="n">add2</span> <span class="mi">3</span><span class="p">)))</span></code></pre><p>Note the introduction of <code>#%module-begin</code>. Despite the fancy name, this is really just an ordinary macro provided by the <code>racket</code> language. By convention, the reader and expander cooperate to ensure the body of every module is wrapped with <code>#%module-begin</code>; as we’ll see shortly, this allows languages to add functionality that affects the entire contents of the module.</p><p>Once the program has been read, it is subsequently <em>expanded</em> by the macroexpander. As the name implies, this is the phase that expands all the macros in a module. What does the above module look like after expansion? Well, it doesn’t look unrecognizable, but it certainly does look different:</p><pre><code class="pygments"><span class="p">(</span><span class="k">module</span> <span class="n">m</span> <span class="n">racket</span>
<span class="p">(</span><span class="k">#%plain-module-begin</span>
<span class="p">(</span><span class="k">define-values</span> <span class="p">(</span><span class="n">add2</span><span class="p">)</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="k">#%plain-app</span> <span class="nb">+</span> <span class="n">x</span> <span class="o">'</span><span class="mi">2</span><span class="p">)))</span>
<span class="p">(</span><span class="k">#%plain-app</span> <span class="nb">call-with-values</span>
<span class="p">(</span><span class="k">lambda</span> <span class="p">()</span> <span class="p">(</span><span class="k">#%plain-app</span> <span class="n">add2</span> <span class="o">'</span><span class="mi">3</span><span class="p">))</span>
<span class="n">print-values</span><span class="p">)))</span></code></pre><p>Let’s note the things that changed:</p><ol><li><p><code>#%module-begin</code> was replaced with <code>#%plain-module-begin</code>. <code>#%plain-module-begin</code> is a binding that wraps the body of every expanded module, and all definitions of <code>#%module-begin</code> in any language must eventually expand to <code>#%plain-module-begin</code>. However, <code>#lang racket</code>’s <code>#%module-begin</code> doesn’t <em>just</em> expand to <code>#%plain-module-begin</code>, it also wraps bare expressions at the top level of a module so that their results are printed. This is why running the above program prints <code>5</code> even though there is no code related to printing in the original program!</p></li><li><p>The lambda shorthand used with <code>define</code> was converted to an explicit use of <code>lambda</code>, and it was expanded to <code>define-values</code>. In Racket, <code>define</code> and <code>define-syntax</code> are really just macros for <code>define-values</code> and <code>define-syntaxes</code> that only bind a single identifier.</p></li><li><p>All function applications were tagged explicitly with <code>#%plain-app</code>. This syntactically distinguishes function applications from uses of forms like <code>define-values</code> or <code>lambda</code>. It also allows languages to customize function application by providing their own macros named <code>#%app</code> (just like languages can provide their own macros named <code>#%module-begin</code> that expand to <code>#%plain-module-begin</code>), but that is outside the scope of this blog post.</p></li><li><p>All literals have been wrapped with <code>quote</code>, so <code>2</code> became <code>'2</code> and <code>3</code> became <code>'3</code>.</p></li></ol><p>Importantly, the resulting program contains <strong>no macros</strong>. 
Such programs are called <em>fully expanded</em>, since all macros have been eliminated and no further expansion can take place.</p><p>So what’s left behind? Well, some of the things in the program are literal data, like the numbers <code>2</code> and <code>3</code>. There are also some variable references, <code>x</code> and <code>add2</code>. Most of the program, however, is built out of primitives like <code>module</code>, <code>#%plain-module-begin</code>, <code>#%plain-app</code>, <code>define-values</code>, and <code>lambda</code>. These primitives are <em>core forms</em>—they are not variables, since they do not represent bindings that contain values at runtime, but they are also not macros, since they cannot be expanded any further.</p><p>In this sense, a fully-expanded program is just like a program in most languages that do not have macros. Core forms in Racket correspond to the syntax of other languages. We can imagine a JavaScript program similar to the above fully-expanded Racket program:</p><pre><code class="pygments"><span class="n">var</span> <span class="n">add2</span> <span class="o">=</span>
<span class="n">function</span> <span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">2</span><span class="p">;</span> <span class="p">};</span>
<span class="n">console</span><span class="o">.</span><span class="n">log</span><span class="p">(</span><span class="n">add2</span><span class="p">(</span><span class="mi">3</span><span class="p">));</span></code></pre><p>Just as this JavaScript program is internally transformed into an AST containing a definition node, a function abstraction node, and some function application nodes, a fully-expanded Racket program represents an AST ready to be sent off to be <em>compiled</em>. The Racket compiler has built-in rules for how to compile core forms like <code>define-values</code>, <code>lambda</code>, and <code>#%plain-app</code>, and the result is optimized Racket bytecode.</p><p>In the remainder of this blog post, as most discussions of macros do, we’ll ignore the <em>read</em> and <em>compile</em> steps of the Racket program pipeline and focus exclusively on the <em>expand</em> step. It’s useful, however, to keep the other steps in mind, since we’re going to be discussing what it means to implement custom core forms, and core forms really only make sense in the context of the subsequent compilation step that consumes them.</p><h3><a name="racket-s-default-core-forms"></a>Racket’s default core forms</h3><p>So, now that we know what core forms are in an abstract sense, what are they in practice? We’ve already encountered <code>module</code>, <code>#%plain-module-begin</code>, <code>#%plain-app</code>, <code>define-values</code>, <code>lambda</code>, and <code>quote</code>, but there are many more. The full list is available in the section of the Racket reference named <a href="http://docs.racket-lang.org/reference/syntax-model.html#%28part._fully-expanded%29">Fully Expanded Programs</a>, and I will not list all of them here. In general, they are more or less what you’d expect. 
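</p><p>A quick way to see a few of these forms for yourself, offered as a sketch rather than anything Hackett does, is to ask the expander directly. The snippet below calls <code>expand</code> in a fresh <code>racket/base</code> namespace and converts the results to plain s-expressions. The names <code>show-expansion</code>, <code>expanded-define</code>, and <code>expanded-let</code> are made up for illustration:</p><pre><code class="pygments">#lang racket/base

;; Expand a quoted form in a fresh racket/base namespace and return the
;; fully expanded result as a plain s-expression.
(define (show-expansion form)
  (parameterize ([current-namespace (make-base-namespace)])
    (syntax->datum (expand form))))

;; `define` with the function shorthand should become `define-values`.
(define expanded-define
  (show-expansion '(define (add2 x) (+ x 2))))

;; `let` should become the core form `let-values`.
(define expanded-let
  (show-expansion '(let ([y 1]) (if y 'yes 'no))))

(println expanded-define)
(println expanded-let)</code></pre><p>The printed datums may spell some core forms slightly differently from the listings above (for instance, you may see <code>#%app</code> where this post writes <code>#%plain-app</code>), but the overall shape should match: only core forms, variable references, and quoted literals remain.</p><p>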
The list of Racket’s core forms also includes things like <code>define-syntaxes</code>, <code>if</code>, <code>let-values</code>, <code>letrec-values</code>, <code>begin</code>, <code>quote-syntax</code>, and <code>set!</code>. Fundamentally, these correspond to the basic operations the Racket compiler understands, and they allow the remainder of Racket’s compilation pipeline to ignore the complexities of macroexpansion.</p><p>These forms are fairly versatile, and it’s easy to build high-level abstractions on top of them. For example, <code>#lang racket</code> implements <code>cond</code> as a macro that eventually expands into <code>if</code>, and it implements <code>syntax</code> as a macro that eventually expands into function calls and <code>quote-syntax</code>. The real power comes in the way new macros can be built out of other macros, not just core forms, so Racket’s <code>match</code> can expand into uses of <code>let</code> and <code>cond</code>, and it doesn’t need to concern itself with using <code>let-values</code> and <code>if</code>. For this reason, Racket’s core forms are quite capable of representing any language imaginable, since fully-expanded programs are essentially instructions for the Racket virtual machine, and macros are mini-compilers that can be mixed and matched.</p><h3><a name="the-need-for-custom-core-forms"></a>The need for custom core forms</h3><p>With that in mind, why might we wish to define <em>custom</em> core forms? In fact, what would such a thing even mean? By their very nature, <em>all</em> Racket programs eventually expand into Racket’s core forms; new core forms cannot be added because Racket’s underlying compiler infrastructure is not (currently) extensible. 
New forms can be added that are defined in terms of other forms, but adding new primitives doesn’t make any sense, since the compiler would not know what to do with them.</p><p>Despite this, there <em>are</em> at least two use cases in which a programmer might wish to customize the set of core forms produced by the macroexpander. Each situation is slightly different, but they both revolve around the same idea.</p><h4><a name="supporting-multiple-backends"></a>Supporting multiple backends</h4><p>The most commonly discussed use case for customizing the set of core forms is for languages that wish to use the Racket macroexpander, but target backends that are not the Racket compiler. For example, a user might implement a Racket <code>#lang</code> that describes electronic circuits, and they might even implement a way to execute such a program in Racket, but they might <em>also</em> wish to compile the result to a more traditional hardware description language. Like other languages in the Racket ecosystem, such a language would be made up of a tower of macros built on top of core forms; unlike other languages, the core forms might need to be more abstract than the ones provided by Racket to efficiently compile to other targets.</p><p>In the case of a hardware description language, the custom core forms might include things like <code>input</code> and <code>output</code> for declaring circuit inputs and outputs, and expressions might be built out of hardware operations rather than high-level things like function calls. The Racket macroexpander would expand the input program into the custom set of core forms, at which point an external compiler program could compile the resulting AST in a more traditional way. 
If the language author wished, they could <em>additionally</em> define implementations of these core forms as Racket macros that eventually expand into Racket, which would allow them to emulate their circuits in Racket at little cost, but this would be a wholly optional step.</p><p>Essentially, this use case stems from a desire to reuse Racket’s advanced language-development technology, such as the macroexpander, the module system, and editor tooling, without also committing to using Racket as a runtime, which is not always appropriate for all languages. This use case is not nearly as easy as it ought to be, but it is a common request, and it is possible that future improvements to the Racket toolchain will be designed specifically to address this problem.</p><h4><a name="compiling-an-extensible-embedded-language"></a>Compiling an extensible embedded language</h4><p>A second use case for custom core forms is less frequently discussed, but I think it might actually be significantly more common in practice were it available in a form accessible to working macro programmers. In this scenario, users might wish to remain within Racket, but still want to define a custom language that other macros can consume.</p><p>This concept is a little more vague and fuzzily-defined than the case of developing a separate backend, so allow me to propose an example. Imagine a Racket programmer decides to build an embedded DSL for asynchronously producing and consuming events, similar to first-order functional reactive programming. In this case, the DSL is designed to be used in larger Racket programs, so it <em>will</em> eventually expand to Racket’s core forms. 
However, it’s possible that such a language might wish to enforce static invariants about the network graph, and in doing so, it might be able to produce significantly more efficient Racket code via a compile-time analysis.</p><p>Performing such a compile-time analysis is essentially writing a custom optimizer as part of a macro, which has been done numerous times already within the Racket ecosystem. One of the most prominent examples of such a thing is the <code>match</code> macro, which parses users’ patterns into compile-time data structures, performs a fairly traditional optimization pass designed to efficiently compile pattern matching, and emits optimized Racket code as a result. This approach works well for fairly contained problems like pattern-matching, but it works less well for entirely new embedded languages that include everything from their own notion of evaluation to their own binding forms.</p><p>Existing DSLs of this type are rare, but they do exist. <code>syntax/parse</code> provides an expressive, specialized pattern-matching language designed specifically for matching syntax objects, and it uses a different model from <code>racket/match</code> to be more suitable for that task. It allows backtracking with cuts, an extensible pattern language, an abstraction language for defining reusable parsers that can accept inputs and produce outputs, and fine-grained control over both parsing and binding. While <code>match</code> is essentially just a traditional pattern-matcher, albeit an extensible one, <code>syntax-parse</code> is its own programming language, closer in some ways to Prolog than to Racket.</p><p>For this reason, <code>syntax/parse</code> has an extensive language to do everything from creating new bindings to controlling when and how parsing fails. 
This language is represented in two ways: an inline pattern language, and an alternate syntax known as <a href="http://docs.racket-lang.org/syntax/stxparse-specifying.html#%28part._.Pattern_.Directives%29"><em>pattern directives</em></a>. Here is an example of pattern directives in action, from my own <code>threading</code> library:</p><pre><code class="pygments"><span class="p">[(</span><span class="k">_</span> <span class="n">ex:expr</span> <span class="n">cl:clause</span> <span class="n">remaining:clause</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">call</span> <span class="p">(</span><span class="nb">syntax->list</span> <span class="o">#'</span><span class="n">cl.call</span><span class="p">))</span>
<span class="p">(</span><span class="k">define-values</span> <span class="p">(</span><span class="n">pre</span> <span class="n">post</span><span class="p">)</span>
<span class="p">(</span><span class="nb">split-at</span> <span class="n">call</span> <span class="p">(</span><span class="nb">add1</span> <span class="p">(</span><span class="k">or</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">cl.insertion-point</span><span class="p">)</span> <span class="mi">0</span><span class="p">))))]</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">pre</span> <span class="k">...</span><span class="p">]</span> <span class="n">pre</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">post</span> <span class="k">...</span><span class="p">]</span> <span class="n">post</span>
<span class="kd">#:with</span> <span class="n">app/ctx</span> <span class="p">(</span><span class="n">adjust-outer-context</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">pre</span> <span class="k">...</span> <span class="n">ex</span> <span class="n">post</span> <span class="k">...</span><span class="p">)</span> <span class="o">#'</span><span class="n">cl</span><span class="p">)</span>
<span class="p">(</span><span class="n">adjust-outer-context</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">~></span> <span class="n">app/ctx</span> <span class="n">remaining</span> <span class="k">...</span><span class="p">)</span> <span class="n">this-syntax</span><span class="p">)]</span></code></pre><p>Each directive is represented by a keyword, in this case <code>#:do</code> and <code>#:with</code>. Each directive has a corresponding keyword in the pattern language, in this case <code>~do</code> and <code>~parse</code>. Therefore, the above pattern could equivalently be written this way:</p><pre><code class="pygments"><span class="p">[{</span><span class="n">~and</span> <span class="p">(</span><span class="k">_</span> <span class="n">ex:expr</span> <span class="n">cl:clause</span> <span class="n">remaining:clause</span> <span class="k">...</span><span class="p">)</span>
<span class="p">{</span><span class="n">~do</span> <span class="p">(</span><span class="k">define</span> <span class="n">call</span> <span class="p">(</span><span class="nb">syntax->list</span> <span class="o">#'</span><span class="n">cl.call</span><span class="p">))</span>
<span class="p">(</span><span class="k">define-values</span> <span class="p">(</span><span class="n">pre</span> <span class="n">post</span><span class="p">)</span>
<span class="p">(</span><span class="nb">split-at</span> <span class="n">call</span> <span class="p">(</span><span class="nb">add1</span> <span class="p">(</span><span class="k">or</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">cl.insertion-point</span><span class="p">)</span> <span class="mi">0</span><span class="p">))))}</span>
<span class="p">{</span><span class="n">~parse</span> <span class="p">[</span><span class="n">pre</span> <span class="k">...</span><span class="p">]</span> <span class="n">pre</span><span class="p">}</span>
<span class="p">{</span><span class="n">~parse</span> <span class="p">[</span><span class="n">post</span> <span class="k">...</span><span class="p">]</span> <span class="n">post</span><span class="p">}</span>
<span class="p">{</span><span class="n">~parse</span> <span class="n">app/ctx</span> <span class="p">(</span><span class="n">adjust-outer-context</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">pre</span> <span class="k">...</span> <span class="n">ex</span> <span class="n">post</span> <span class="k">...</span><span class="p">)</span> <span class="o">#'</span><span class="n">cl</span><span class="p">)}}</span>
<span class="p">(</span><span class="n">adjust-outer-context</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">~></span> <span class="n">app/ctx</span> <span class="n">remaining</span> <span class="k">...</span><span class="p">)</span> <span class="n">this-syntax</span><span class="p">)]</span></code></pre><p>The transformation can go in the other direction, too—each syntax class annotation on each pattern variable can be extracted into the directive language using <code>#:declare</code>, so this is also equivalent:</p><pre><code class="pygments"><span class="p">[(</span><span class="k">_</span> <span class="n">ex</span> <span class="n">cl</span> <span class="n">remaining</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:declare</span> <span class="n">ex</span> <span class="n">expr</span>
<span class="kd">#:declare</span> <span class="n">cl</span> <span class="n">clause</span>
<span class="kd">#:declare</span> <span class="n">remaining</span> <span class="n">clause</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">call</span> <span class="p">(</span><span class="nb">syntax->list</span> <span class="o">#'</span><span class="n">cl.call</span><span class="p">))</span>
<span class="p">(</span><span class="k">define-values</span> <span class="p">(</span><span class="n">pre</span> <span class="n">post</span><span class="p">)</span>
<span class="p">(</span><span class="nb">split-at</span> <span class="n">call</span> <span class="p">(</span><span class="nb">add1</span> <span class="p">(</span><span class="k">or</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">cl.insertion-point</span><span class="p">)</span> <span class="mi">0</span><span class="p">))))]</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">pre</span> <span class="k">...</span><span class="p">]</span> <span class="n">pre</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">post</span> <span class="k">...</span><span class="p">]</span> <span class="n">post</span>
<span class="kd">#:with</span> <span class="n">app/ctx</span> <span class="p">(</span><span class="n">adjust-outer-context</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">pre</span> <span class="k">...</span> <span class="n">ex</span> <span class="n">post</span> <span class="k">...</span><span class="p">)</span> <span class="o">#'</span><span class="n">cl</span><span class="p">)</span>
<span class="p">(</span><span class="n">adjust-outer-context</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">~></span> <span class="n">app/ctx</span> <span class="n">remaining</span> <span class="k">...</span><span class="p">)</span> <span class="n">this-syntax</span><span class="p">)]</span></code></pre><p>This is very much a programming language, but it has very different semantics from programming in Racket! Failure to match against a <code>#:with</code> or <code>~parse</code> pattern causes pattern-matching to backtrack, and though it’s possible to escape to Racket using <code>#:do</code> or <code>~do</code>, practical uses of <code>syntax/parse</code> really do involve quite a lot of programming in its pattern DSL.</p><p>But the Racket programmer might not find this DSL wholly satisfying. Why? Well, it isn’t extensible! The pattern directives—<code>#:declare</code>, <code>#:do</code>, and <code>#:with</code>, among others—are essentially the core forms of <code>syntax/parse</code>’s pattern-matching language, but new ones cannot be defined. The desire to make this language easy to analyze statically in order to emit optimal pattern-matching code meant its author opted to define the language in terms of a specific grammar rather than a tower of macros.</p><p>But what if <code>syntax/parse</code> could define its own core forms? What if, instead of <code>#:do</code>, <code>#:declare</code>, and <code>#:with</code> being implemented as keyword options specially recognized by the <code>syntax-parse</code> grammar, it defined <code>do</code>, <code>declare</code>, and <code>with</code> as core forms for a new, macro-enabled language? A user of the language could then define a completely ordinary Racket macro and use it with this new language as long as it eventually expanded into the <code>syntax/parse</code> core forms. 
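</p><p>To make the idea concrete, here is a minimal sketch on a toy arithmetic language, not on <code>syntax/parse</code> itself. Everything in it is hypothetical: the core forms <code>#%lit</code> and <code>#%add</code>, the user macro <code>double</code>, the helper <code>compile-calc</code>, and the entry point <code>calc</code> are invented for illustration, and the sketch ignores many subtleties a real implementation has to handle. It relies on <code>local-expand</code> with a stop list, so that expansion halts as soon as the head of a form is one of the custom core forms:</p><pre><code class="pygments">#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; Hypothetical core forms of a toy arithmetic language. They are bound as
;; macros only so that they have bindings; using one outside `calc` is an error.
(define-syntax (#%lit stx)
  (raise-syntax-error #f "core form used outside of calc" stx))
(define-syntax (#%add stx)
  (raise-syntax-error #f "core form used outside of calc" stx))

;; A completely ordinary macro that happens to expand into the core forms.
(define-syntax (double stx)
  (syntax-parse stx
    [(_ e) #'(#%add e e)]))

(begin-for-syntax
  ;; Expand until the head of the form is a core form, then translate it
  ;; into plain Racket, recurring into the subforms by hand.
  (define (compile-calc stx)
    (syntax-parse (local-expand stx 'expression (list #'#%lit #'#%add))
      #:literals [#%lit #%add]
      [(#%lit n:number) #'n]
      [(#%add a b) #`(+ #,(compile-calc #'a) #,(compile-calc #'b))])))

;; Entry point: hand the body to the toy language's "compiler".
(define-syntax (calc stx)
  (syntax-parse stx
    [(_ e) (compile-calc #'e)]))

;; `double` is expanded by the ordinary macroexpander, so this use of
;; `calc` is translated to (+ 21 21).
(define answer (calc (double (#%lit 21))))</code></pre><p>The point of the sketch is that <code>double</code> is defined with plain <code>define-syntax</code>, yet the toy language can still inspect the fully expanded tree of its own core forms before emitting Racket code.</p><p>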
The implementation of <code>syntax/parse</code> could then invoke the macroexpander to request each clause be expanded into its core forms, perform its static analysis on the result, and finally emit optimized Racket code.</p><p>Now, to be fair, <code>syntax/parse</code> is not actually entirely inextensible. While new directives cannot be defined, new patterns can be added through a pattern-expander API that was added to the library after its initial design. However, pattern expanders are still not ideal because they are not ordinary Racket macros—users must explicitly define each pattern expander differently from how they would a macro—and they cannot use existing Racket forms, even ones that would theoretically be compatible with an arbitrary set of core forms.</p><p>The technique described in this blog post avoids all those problems. In the following sections, I’ll show that it’s possible to define an embedded language with a custom set of core forms that works well with the rest of the Racket ecosystem and still permits arbitrary static analysis.</p><h2><a name="the-need-for-a-custom-type-language-in-hackett"></a>The need for a custom type language in Hackett</h2><p>In the previous section, I described two use cases for custom core forms. Hackett, in fact, has uses for <em>both</em> of them:</p><ul><li><p>Hackett can definitely make use of custom core forms to compile to multiple backends. Eventually, it would be nice to compile Hackett to an intermediate language that can target both the Racket runtime and Haskell or GHC Core. This would allow Hackett to take advantage of GHC’s advanced optimizing compiler that already has decades of tuning for a pure, lazy, functional programming language, at the cost of not having access to the rest of Racket’s ecosystem of libraries at runtime.</p></li><li><p>Hackett can <em>also</em> make use of custom core forms for an embedded DSL. 
In this case, that embedded DSL is actually Hackett’s type language.</p></li></ul><p>The second of those two use cases is simpler, and it’s what I ended up implementing first, so it’s what I will focus on in this blog post. Hackett’s type language is fundamentally quite simple, so its set of custom core forms is small as well. Everything in the type language eventually compiles into only seven core forms:</p><ul><li><p><code>(#%type:con <em>id</em>)</code> — Type constructors, like <code>Integer</code> or <code>Maybe</code>. These are one of the fundamental building blocks of Hackett types.</p></li><li><p><code>(#%type:app <em>type</em> <em>type</em>)</code> — Type application, such as <code>(Maybe Integer)</code>. Types are curried, so type constructors that accept multiple arguments are represented by nested uses of <code>#%type:app</code>.</p></li><li><p><code>(#%type:forall <em>id</em> <em>type</em>)</code> — Universal quantification. This is essentially a binding form, which binds any uses of <code>(#%type:bound-var <em>id</em>)</code> in <code><em>type</em></code>.</p></li><li><p><code>(#%type:qual <em>type</em> <em>type</em>)</code> — Qualified types, aka types with typeclass constraints. Constraints in Hackett, like in GHC, are represented by types, so typeclass names like <code>Eq</code> are bound as type constructors.</p></li><li><p>Finally, Hackett types support three different varieties of type variables:</p><ul><li><p><code>(#%type:bound-var <em>id</em>)</code> — Bound type variables. These are only legal under a corresponding <code>#%type:forall</code>.</p></li><li><p><code>(#%type:wobbly-var <em>id</em>)</code> — Solver variables, which may unify with any other type as part of the typechecking process.</p></li><li><p><code>(#%type:rigid-var <em>id</em>)</code> — Rigid variables, aka skolem variables, which only unify with themselves. 
They represent a unique, anonymous type used to ensure types are suitably polymorphic.</p></li></ul></li></ul><p>To implement our custom core forms in Racket, we need to somehow define them, but how? Intentionally, these should never be expanded, since we want the expander to stop expanding whenever it encounters one of these identifiers. While we can’t encode this directly, we <em>can</em> bind them to macros that do nothing but raise an exception if something attempts to expand them:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntaxes</span> <span class="p">[</span><span class="n">#%type:con</span> <span class="n">#%type:app</span> <span class="n">#%type:forall</span> <span class="n">#%type:qual</span>
                  <span class="n">#%type:bound-var</span> <span class="n">#%type:wobbly-var</span> <span class="n">#%type:rigid-var</span><span class="p">]</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">type-literal</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">stx</span><span class="p">)</span> <span class="p">(</span><span class="nb">raise-syntax-error</span> <span class="no">#f</span> <span class="s2">"cannot be used as an expression"</span> <span class="n">stx</span><span class="p">))])</span>
    <span class="p">(</span><span class="nb">values</span> <span class="n">type-literal</span> <span class="n">type-literal</span> <span class="n">type-literal</span> <span class="n">type-literal</span>
            <span class="n">type-literal</span> <span class="n">type-literal</span> <span class="n">type-literal</span><span class="p">)))</span></code></pre><p>This will ensure our core forms are never accidentally expanded, and we’ll instruct the macroexpander to stop whenever it sees one of them via a separate mechanism.</p><h3><a name="expanding-types-in-our-type-language"></a>Expanding types in our type language</h3><p>We’ve now defined our core forms, but we’ve intentionally left them meaningless. How do we actually inform the expander about how our types ought to be expanded? While it’s true that we don’t want the core forms themselves to be eliminated, we <em>do</em> want to expand some of their subforms. For example, in the type <code>(#%type:app a b)</code>, we want to recursively expand <code>a</code> and <code>b</code>.</p><p>In order to do this, we’ll use the API made available by the expander for manually invoking macroexpansion from within another macro. This API is called <a href="http://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._local-expand%29%29"><code>local-expand</code></a>, and it has an option relevant to our needs: the stop list.</p><p>Often, <code>local-expand</code> is used to force the expander to completely, recursively expand a form. For example, by using <code>local-expand</code>, we can produce a fragment of a fully-expanded program from a piece of syntax that still includes macros:</p><pre><code class="pygments"><span class="p">(</span><span class="nb">local-expand</span> <span class="o">#'</span><span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">x</span> <span class="mi">1</span><span class="p">])</span> <span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="mi">2</span><span class="p">))</span> <span class="o">'</span><span class="ss">expression</span> <span class="o">'</span><span class="p">())</span>
<span class="c1">; => (let-values ([(x) '1]) (#%plain-app + x '2))</span></code></pre><p>The third argument to <code>local-expand</code> is the <em>stop list</em>, which controls how deep the expander ought to expand a given form. By providing an empty list, we ask for a complete, recursive expansion. In this case, however, we don’t want a complete expansion! We can inform the expander to stop whenever it sees any of our custom core forms by passing a list of our core form identifiers instead of an empty list:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
  <span class="p">(</span><span class="k">define</span> <span class="n">type-literal-ids</span>
    <span class="p">(</span><span class="nb">list</span> <span class="o">#'</span><span class="n">#%type:con</span> <span class="o">#'</span><span class="n">#%type:app</span> <span class="o">#'</span><span class="n">#%type:forall</span> <span class="o">#'</span><span class="n">#%type:qual</span>
          <span class="o">#'</span><span class="n">#%type:bound-var</span> <span class="o">#'</span><span class="n">#%type:wobbly-var</span> <span class="o">#'</span><span class="n">#%type:rigid-var</span><span class="p">))</span>
  <span class="p">(</span><span class="nb">local-expand</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:forall</span> <span class="n">x</span> <span class="n">t</span><span class="p">)</span> <span class="o">'</span><span class="ss">expression</span> <span class="n">type-literal-ids</span><span class="p">))</span>
<span class="c1">; => (#%type:forall x t)</span></code></pre><p>Of course, this isn’t very interesting, since it just gives us back exactly what we gave it. It spotted the <code>#%type:forall</code> identifier, which is in our stop list, and immediately halted expansion. It didn’t attempt to continue expanding <code>t</code> since the expander has no way of knowing which pieces of <code>(#%type:forall x t)</code> it should expand! In this case, we want it to recur to expand <code>t</code>, since it should be a type, but not <code>x</code>, since <code>#%type:forall</code> essentially puts <code>x</code> in binding position.</p><p>Therefore, we have to get more clever. We need to call <code>local-expand</code> to produce a type, then we have to pattern-match on it and subsequently call <code>local-expand</code> <em>again</em> on any of the pieces of syntax we want to keep expanding. Eventually, we’ll run out of things to expand, and our type will be fully-expanded.</p><p>One good way to do this is to use <code>syntax/parse</code> syntax classes, since they provide a convenient way for other macros to invoke the type expander. To implement our type expander, we’ll use two mutually recursive syntax classes: one to perform the actual expansion using <code>local-expand</code> and a second to pattern-match on the resulting expanded type. For example, here’s what these two classes would look like if they only handled <code>#%type:con</code> and <code>#%type:app</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
  <span class="p">(</span><span class="n">define-literal-set</span> <span class="n">type-literals</span>
    <span class="p">[</span><span class="n">#%type:con</span> <span class="n">#%type:app</span> <span class="n">#%type:forall</span> <span class="n">#%type:qual</span>
     <span class="n">#%type:bound-var</span> <span class="n">#%type:wobbly-var</span> <span class="n">#%type:rigid-var</span><span class="p">])</span>
  <span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">type</span>
    <span class="kd">#:description</span> <span class="s2">"type"</span>
    <span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="k">_</span> <span class="kd">#:with</span> <span class="n">:expanded-type</span>
             <span class="p">(</span><span class="nb">local-expand</span> <span class="n">this-syntax</span> <span class="o">'</span><span class="ss">expression</span> <span class="n">type-literal-ids</span><span class="p">)])</span>
  <span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">expanded-type</span>
    <span class="kd">#:description</span> <span class="no">#f</span>
    <span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
    <span class="kd">#:commit</span>
    <span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">type-literals</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:con</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:app</span> <span class="n">~!</span> <span class="n">a:type</span> <span class="n">b:type</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:app</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">)]))</span></code></pre><p>This blog post is definitely <em>not</em> a <code>syntax/parse</code> tutorial, so I will not explain in detail everything that’s going on here, but the gist of it is that the above code defines two syntax classes, both of which produce a single output attribute named <code>expansion</code>. This attribute contains the fully expanded version of the type currently being parsed. In the <code>#%type:con</code> case, <code>expansion</code> is just <code>this-syntax</code>, which holds the current piece of syntax being parsed. This makes sense, since uses of <code>#%type:con</code> just expand to themselves—expanding <code>(#%type:con Maybe)</code> should not perform any additional expansion on <code>Maybe</code>. This is one of Hackett’s atomic types.</p><p>In contrast, <code>#%type:app</code> <em>does</em> recursively expand its arguments. By annotating its two subforms with <code>:type</code>, the <code>type</code> syntax class will invoke <code>local-expand</code> on each subform, which will in turn use <code>expanded-type</code> to parse the resulting type. This is what implements the expansion loop that will eventually expand each type completely. Once <code>a</code> and <code>b</code> have been expanded, <code>#%type:app</code> reassembles them into a new syntax object using <code>#'(#%type:app a.expansion b.expansion)</code>, which replaces their unexpanded versions with their new, expanded versions.</p><p>We can see this behavior by writing a small <code>expand-type</code> function that will expand its argument:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
  <span class="p">(</span><span class="k">define</span> <span class="n">expand-type</span> <span class="p">(</span><span class="n">syntax-parser</span> <span class="p">[</span><span class="n">t:type</span> <span class="o">#'</span><span class="n">t.expansion</span><span class="p">])))</span></code></pre><p>Now we can use it to observe what happens when we try expanding a type using <code>#%type:app</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="n">expand-type</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:app</span> <span class="n">Maybe</span> <span class="n">Integer</span><span class="p">))</span>
<span class="c1">; => #%type:app: expected type</span>
<span class="c1">; at: Maybe</span>
<span class="c1">; in: (#%type:app Maybe Integer)</span></code></pre><p>Okay, it failed with an error, which is not ideal, but it makes sense. We haven’t actually defined <code>Maybe</code> or <code>Integer</code> anywhere. Let’s do so! We can define them as simple macros that expand into uses of <code>#%type:con</code>, which can be done easily using <a href="http://docs.racket-lang.org/syntax/transformer-helpers.html#%28def._%28%28lib._syntax%2Ftransformer..rkt%29._make-variable-like-transformer%29%29"><code>make-variable-like-transformer</code></a> from <code>syntax/transformer</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">Maybe</span> <span class="p">(</span><span class="n">make-variable-like-transformer</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:con</span> <span class="n">Maybe</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">Integer</span> <span class="p">(</span><span class="n">make-variable-like-transformer</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:con</span> <span class="n">Integer</span><span class="p">)))</span></code></pre><p>Now, if we try expanding that same type again:</p><pre><code class="pygments"><span class="p">(</span><span class="n">expand-type</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:app</span> <span class="n">Maybe</span> <span class="n">Integer</span><span class="p">))</span>
<span class="c1">; => (#%type:app (#%type:con Maybe) (#%type:con Integer))</span></code></pre><p>…it works! Neat. Now we just need to add the cases for the remaining forms in our type language:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
  <span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">expanded-type</span>
    <span class="kd">#:description</span> <span class="no">#f</span>
    <span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
    <span class="kd">#:commit</span>
    <span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">type-literals</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:con</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:app</span> <span class="n">~!</span> <span class="n">a:type</span> <span class="n">b:type</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:app</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">)]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:forall</span> <span class="n">~!</span> <span class="n">x:id</span> <span class="n">t:type</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:forall</span> <span class="n">x</span> <span class="n">t.expansion</span><span class="p">)]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:qual</span> <span class="n">~!</span> <span class="n">a:type</span> <span class="n">b:type</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:qual</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">)]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:bound-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:wobbly-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:rigid-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]))</span></code></pre><p>This is pretty good already, and to a first approximation, it’s done! However, it doesn’t actually work as well as we’d really like it to. One of the whole points of doing things this way is to allow other macros like <code>let-syntax</code> to work in types. For example, we ought to be able to create a local type binding with <code>let-syntax</code> and have it just work. Unfortunately, it doesn’t:</p><pre><code class="pygments"><span class="p">(</span><span class="n">expand-type</span> <span class="o">#'</span><span class="p">(</span><span class="k">let-syntax</span> <span class="p">([</span><span class="n">Bool</span> <span class="p">(</span><span class="n">make-variable-like-transformer</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:con</span> <span class="n">Bool</span><span class="p">))])</span>
                 <span class="p">(</span><span class="n">#%type:app</span> <span class="n">Maybe</span> <span class="n">Bool</span><span class="p">)))</span>
<span class="c1">; => let-syntax: expected one of these identifiers: `#%type:con', `#%type:app', `#%type:forall', `#%type:qual', `#%type:bound-var', `#%type:wobbly-var', or `#%type:rigid-var'</span>
<span class="c1">; at: letrec-syntaxes+values</span>
<span class="c1">; in: (let-syntax ((Bool (make-variable-like-transformer (syntax (#%type:con Bool))))) (#%type:app Maybe Bool))</span></code></pre><p>What went wrong? And why is it complaining about <code>letrec-syntaxes+values</code>? Well, if you read the documentation for <code>local-expand</code>, you’ll find that its behavior is a little more complicated than you might at first believe:</p><blockquote><p>If <em><code>stop-ids</code></em> is [a nonempty list containing more than just <code>module*</code>], then <code>begin</code>, <code>quote</code>, <code>set!</code>, <code>#%plain-lambda</code>, <code>case-lambda</code>, <code>let-values</code>, <code>letrec-values</code>, <code>if</code>, <code>begin0</code>, <code>with-continuation-mark</code>, <code>letrec-syntaxes+values</code>, <code>#%plain-app</code>, <code>#%expression</code>, <code>#%top</code>, and <code>#%variable-reference</code> are implicitly added to <em><code>stop-ids</code></em>. Expansion stops when the expander encounters any of the forms in <em><code>stop-ids</code></em>, and the result is the partially-expanded form.</p></blockquote><p>That’s a little strange, isn’t it? I am not completely sure why the behavior works quite this way, though I’m sure backwards compatibility plays a significant part, but while some of the behavior seems unnecessary, the issue with <code>letrec-syntaxes+values</code> (which <code>let-syntax</code> expands to) is a reasonable one. If the expander naïvely expanded <code>letrec-syntaxes+values</code> in the presence of a nonempty stop list, it could cause some significant problems!</p><p>Allow me to illustrate with an example. 
Let’s imagine we are the expander, and we are instructed to expand the following program:</p><pre><code class="pygments"><span class="p">(</span><span class="k">let-syntax</span> <span class="p">([</span><span class="n">Bool</span> <span class="p">(</span><span class="n">make-variable-like-transformer</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:con</span> <span class="n">Bool</span><span class="p">))])</span>
  <span class="p">(</span><span class="n">#%type:app</span> <span class="n">Maybe</span> <span class="n">Bool</span><span class="p">))</span></code></pre><p>We see <code>let-syntax</code>, so we start by evaluating the expression on the right-hand side of the <code>Bool</code> binding. This produces a transformer, so we bind <code>Bool</code> to the transformer in the local environment, then move on to expanding the body. At this point, the expander is looking at this:</p><pre><code class="pygments"><span class="c1">; local bindings:</span>
<span class="c1">; Bool -> #<variable-like-transformer></span>
<span class="p">(</span><span class="n">#%type:app</span> <span class="n">Maybe</span> <span class="n">Bool</span><span class="p">)</span></code></pre><p>Now, the identifier in application position is <code>#%type:app</code>, and <code>#%type:app</code> is in the stop list. Therefore, expansion must stop, and it does not attempt to expand any further. But what should the result of expansion be? Well, the <code>let-syntax</code> needs to go away when we expand it—local syntax bindings are erased as part of macroexpansion—so the logical thing to expand into is <code>(#%type:app Maybe Bool)</code>. But this is a problem, because when we then go to expand <code>Bool</code>, <code>Bool</code> isn’t in the local binding table anymore! The <code>let-syntax</code> was already erased, and <code>Bool</code> is unbound!</p><p>When expanding recursively, this isn’t a problem, since the entire expression is guaranteed to be expanded while the local binding is still in the expander’s environment. As soon as we introduce partial expansion, however, we run the risk of a binding getting erased too early. So we’re stuck: we can’t recursively expand, or we’ll expand too much, but we can’t partially expand, since we might expand too little.</p><p>Confronted with this problem, there is some good news and some bad news. The good news is that, while the macroexpander can’t help us, we can help the macroexpander by doing some of the necessary bookkeeping for it. We can do this using first-class definition contexts, which allow us to manually extend the local environment when we call <code>local-expand</code>. The bad news is that first-class definition contexts are <em>complicated</em>, and using them properly is a surprisingly subtle problem.</p><p>Fortunately, I’ve already spent a lot of time figuring out what needs to be done to properly manipulate the necessary definition contexts in this particular situation. 
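</p><p>As a rough illustration of the mechanism in isolation, here is a minimal sketch of a macro that extends the local environment by hand before calling <code>local-expand</code>. The names <code>expand-with-local-binding</code> and <code>one</code> are hypothetical and exist only for this example; none of this is taken from Hackett’s implementation:</p><pre><code class="pygments">#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical demo macro: expands `body` in a fresh definition context
;; in which the identifier `one` is bound as a macro that expands to 1.
(define-syntax (expand-with-local-binding stx)
  (syntax-case stx ()
    [(_ body)
     (let ([ctx (syntax-local-make-definition-context)])
       ;; Bind `one` in `ctx` only; it stays unbound everywhere else.
       (syntax-local-bind-syntaxes (list (datum->syntax #'body 'one))
                                   #'(λ (stx) #'1)
                                   ctx)
       ;; Supplying `ctx` as the fourth argument brings that binding into
       ;; scope while `body` is expanded.
       (local-expand #'body 'expression '() ctx))]))

(expand-with-local-binding (+ one one))
; => 2</code></pre><p>In this sketch the binding for <code>one</code> lives in the definition context rather than in a surrounding <code>let-syntax</code>, so it stays available for as long as we keep passing that context to <code>local-expand</code>.</p><p>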
The first step is to parameterize our <code>type</code> and <code>expanded-type</code> syntax classes so that we may thread a definition context around as we recursively expand:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
  <span class="p">(</span><span class="n">define-syntax-class</span> <span class="p">(</span><span class="n">type</span> <span class="p">[</span><span class="n">intdef-ctx</span> <span class="no">#f</span><span class="p">])</span>
    <span class="kd">#:description</span> <span class="s2">"type"</span>
    <span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="k">_</span> <span class="kd">#:with</span> <span class="p">{</span><span class="n">~var</span> <span class="n">||</span> <span class="p">(</span><span class="n">expanded-type</span> <span class="n">intdef-ctx</span><span class="p">)}</span>
             <span class="p">(</span><span class="nb">local-expand</span> <span class="n">this-syntax</span> <span class="o">'</span><span class="ss">expression</span> <span class="n">type-literal-ids</span> <span class="n">intdef-ctx</span><span class="p">)])</span>
  <span class="p">(</span><span class="n">define-syntax-class</span> <span class="p">(</span><span class="n">expanded-type</span> <span class="n">intdef-ctx</span><span class="p">)</span>
    <span class="kd">#:description</span> <span class="no">#f</span>
    <span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
    <span class="kd">#:commit</span>
    <span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">type-literals</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:con</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:app</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:app</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">)]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:forall</span> <span class="n">~!</span> <span class="n">x:id</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:forall</span> <span class="n">x</span> <span class="n">t.expansion</span><span class="p">)]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:qual</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:qual</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">)]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:bound-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:wobbly-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
    <span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:rigid-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
             <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]))</span></code></pre><p>Now, we can add an additional case to <code>expanded-type</code> to handle <code>letrec-syntaxes+values</code>, which will explicitly create a new definition context, add bindings to it, and use it when parsing the body:</p><pre><code class="pygments"><span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="k">letrec-syntaxes+values</span> <span class="n">~!</span> <span class="p">([(</span><span class="n">id:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">e:expr</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="p">()</span> <span class="n">t:expr</span><span class="p">)</span>
         <span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx*</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span><span class="p">))</span>
               <span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">ids</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">id</span><span class="p">))]</span>
                     <span class="p">[</span><span class="n">e</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">e</span><span class="p">))])</span>
                 <span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">ids</span> <span class="n">e</span> <span class="n">intdef-ctx*</span><span class="p">))]</span>
         <span class="kd">#:with</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t*</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx*</span><span class="p">)}</span> <span class="o">#'</span><span class="n">t</span>
         <span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="n">t*.expansion</span><span class="p">]</span></code></pre><p>But even this isn’t quite right. The problem with this implementation is that it throws away the existing <code>intdef-ctx</code> argument to <code>expanded-type</code>, which means those bindings will be lost as soon as we introduce a new set. To fix this, we have to make the new definition context a <em>child</em> of the previous definition context by passing the old context as an argument to <code>syntax-local-make-definition-context</code>. This will ensure the parent bindings are brought into scope when expanding using the child context:</p><pre><code class="pygments"><span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="k">letrec-syntaxes+values</span> <span class="n">~!</span> <span class="p">([(</span><span class="n">id:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">e:expr</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="p">()</span> <span class="n">t:expr</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx*</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="n">intdef-ctx</span><span class="p">))</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">ids</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">id</span><span class="p">))]</span>
<span class="p">[</span><span class="n">e</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">e</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">ids</span> <span class="n">e</span> <span class="n">intdef-ctx*</span><span class="p">))]</span>
<span class="kd">#:with</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t*</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx*</span><span class="p">)}</span> <span class="o">#'</span><span class="n">t</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="n">t*.expansion</span><span class="p">]</span></code></pre><p>With this in place, our example using <code>let-syntax</code> actually works!</p><pre><code class="pygments"><span class="p">(</span><span class="n">expand-type</span> <span class="o">#'</span><span class="p">(</span><span class="k">let-syntax</span> <span class="p">([</span><span class="n">Bool</span> <span class="p">(</span><span class="n">make-variable-like-transformer</span> <span class="o">#'</span><span class="p">(</span><span class="n">#%type:con</span> <span class="n">Bool</span><span class="p">))])</span>
<span class="p">(</span><span class="n">#%type:app</span> <span class="n">Maybe</span> <span class="n">Bool</span><span class="p">)))</span>
<span class="c1">; => (#%type:app (#%type:con Maybe) (#%type:con Bool))</span></code></pre><p>Pretty cool, isn’t it?</p><h3><a name="preserving-syntax-properties-and-source-locations"></a>Preserving syntax properties and source locations</h3><p>We’ve now essentially managed to implement an expander for our custom language by periodically yielding to the Racket macroexpander, and for the most part, it works. However, our implementation isn’t perfect. The real Racket macroexpander takes great care to preserve source locations and syntax properties on syntax objects wherever possible, which our implementation does not do. Normally we don’t have to worry so much about such things, since the macroexpander automatically copies properties when expanding macros, but since we’re circumventing the expander, we don’t get that luxury. In order to properly preserve this information, we’ll have to be a little more careful.</p><p>To start, we really ought to copy the identifier in application position into the output wherever we can. In addition to preserving source location information and syntax properties, this also preserves renamings, which are even more visible to users. For example, if a user imports <code>#%type:app</code> under a different name, like <code>#%type:apply</code>, we should expand to a piece of syntax that still has <code>#%type:apply</code> in application position instead of replacing it with <code>#%type:app</code>.</p><p>To do this, we just need to bind each of the identifiers in application position, then use that binding when we produce output. 
For example, we would adjust the <code>#%type:app</code> clause to the following:</p><pre><code class="pygments"><span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:app</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="p">(</span><span class="n">head</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">)]</span></code></pre><p>But even after doing this, some source locations and syntax properties are lost, since we’re still reconstructing the pair from scratch. To ensure we copy <em>everything</em>, we can define two helper macros, <code>syntax/loc/props</code> and <code>quasisyntax/loc/props</code>, which are like <code>syntax/loc</code> and <code>quasisyntax/loc</code> but copy properties in addition to source location information:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define-syntaxes</span> <span class="p">[</span><span class="n">syntax/loc/props</span> <span class="n">quasisyntax/loc/props</span><span class="p">]</span>
<span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">make-syntax/loc/props</span> <span class="n">name</span> <span class="n">syntax-id</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">from-stx-expr:expr</span> <span class="p">{</span><span class="n">~describe</span> <span class="s2">"template"</span> <span class="n">template</span><span class="p">})</span>
<span class="o">#`</span><span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">from-stx</span> <span class="n">from-stx-expr</span><span class="p">])</span>
<span class="p">(</span><span class="k">unless</span> <span class="p">(</span><span class="nb">syntax?</span> <span class="n">from-stx</span><span class="p">)</span>
<span class="p">(</span><span class="nb">raise-argument-error</span> <span class="o">'#,</span><span class="ss">name</span> <span class="s2">"syntax?"</span> <span class="n">from-stx</span><span class="p">))</span>
<span class="p">(</span><span class="k">let*</span> <span class="p">([</span><span class="n">stx</span> <span class="p">(</span><span class="o">#,</span><span class="n">syntax-id</span> <span class="n">template</span><span class="p">)]</span>
<span class="p">[</span><span class="n">stx*</span> <span class="p">(</span><span class="nb">syntax-disarm</span> <span class="n">stx</span> <span class="no">#f</span><span class="p">)])</span>
<span class="p">(</span><span class="nb">syntax-rearm</span> <span class="p">(</span><span class="nb">datum->syntax</span> <span class="n">stx*</span> <span class="p">(</span><span class="nb">syntax-e</span> <span class="n">stx*</span><span class="p">)</span> <span class="n">from-stx</span> <span class="n">from-stx</span><span class="p">)</span> <span class="n">stx</span><span class="p">)))]))</span>
<span class="p">(</span><span class="nb">values</span> <span class="p">(</span><span class="n">make-syntax/loc/props</span> <span class="o">'</span><span class="ss">syntax/loc/props</span> <span class="o">#'</span><span class="k">syntax</span><span class="p">)</span>
<span class="p">(</span><span class="n">make-syntax/loc/props</span> <span class="o">'</span><span class="ss">quasisyntax/loc/props</span> <span class="o">#'</span><span class="k">quasisyntax</span><span class="p">)))))</span></code></pre><p>Using <code>syntax/loc/props</code>, we can be truly thorough about ensuring all properties are preserved:</p><pre><code class="pygments"><span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:app</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">))]</span></code></pre><p>Applying this to the other relevant clauses, we get an updated version of the <code>expanded-type</code> syntax class:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="p">(</span><span class="n">expanded-type</span> <span class="n">intdef-ctx</span><span class="p">)</span>
<span class="kd">#:description</span> <span class="no">#f</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
<span class="kd">#:commit</span>
<span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">kernel-literals</span> <span class="n">type-literals</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="k">letrec-syntaxes+values</span> <span class="n">~!</span> <span class="p">([(</span><span class="n">id:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">e:expr</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="p">()</span> <span class="n">t:expr</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx*</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="n">intdef-ctx</span><span class="p">))</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">ids</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">id</span><span class="p">))]</span>
<span class="p">[</span><span class="n">e</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">e</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">ids</span> <span class="n">e</span> <span class="n">intdef-ctx*</span><span class="p">))]</span>
<span class="kd">#:with</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t*</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx*</span><span class="p">)}</span> <span class="o">#'</span><span class="n">t</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="o">#'</span><span class="n">t*.expansion</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:con</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:app</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:forall</span> <span class="n">~!</span> <span class="n">x:id</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">x</span> <span class="n">t.expansion</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:qual</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:bound-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:wobbly-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:rigid-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]))</span></code></pre><p>Now we’re getting closer, but if you can believe it, even <em>this</em> isn’t good enough. The real expander’s implementation of <code>letrec-syntaxes+values</code> does two things our implementation does not: it copies properties and updates the <code>'origin</code> property to indicate the syntax came from a use of <code>letrec-syntaxes+values</code>, and it adds a <code>'disappeared-binding</code> property to record the erased bindings for use by tools like DrRacket. We can apply <code>syntax-track-origin</code> and <code>internal-definition-context-track</code> to the resulting syntax to add the same properties the expander would:</p><pre><code class="pygments"><span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:letrec-syntaxes+values</span> <span class="n">~!</span> <span class="p">([(</span><span class="n">id:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">e:expr</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="p">()</span> <span class="n">t:expr</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx*</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="n">intdef-ctx</span><span class="p">))</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">ids</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">id</span><span class="p">))]</span>
<span class="p">[</span><span class="n">e</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">e</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">ids</span> <span class="n">e</span> <span class="n">intdef-ctx*</span><span class="p">))]</span>
<span class="kd">#:with</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t*</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx*</span><span class="p">)}</span> <span class="o">#'</span><span class="n">t</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">~></span> <span class="p">(</span><span class="n">internal-definition-context-track</span> <span class="n">intdef-ctx*</span> <span class="o">#'</span><span class="n">t*.expansion</span><span class="p">)</span>
<span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">))]</span></code></pre><p>Now we’ve <em>finally</em> dotted all our i’s and crossed all our t’s. While it does take a lot to properly emulate what the macroexpander is doing, the important thing is that it’s actually possible! The end result of all this definition context juggling and property copying is that we’ve effectively managed to move some of the macroexpander’s logic into userspace code, which allows us to manipulate it as we see fit.</p><h3><a name="connecting-our-custom-language-to-hackett"></a>Connecting our custom language to Hackett</h3><p>It took a lot of work, but we finally managed to write a custom type language, and while the code is not exactly simple, it’s not actually very long. The entire implementation of our custom type language is fewer than 80 lines of code:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket/base</span>
<span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="k">for-meta</span> <span class="mi">2</span> <span class="n">racket/base</span>
<span class="n">syntax/parse</span><span class="p">)</span>
<span class="p">(</span><span class="k">for-syntax</span> <span class="n">racket/base</span>
<span class="n">syntax/intdef</span>
<span class="n">threading</span><span class="p">)</span>
<span class="n">syntax/parse/define</span><span class="p">)</span>
<span class="c1">; Like syntax/loc and quasisyntax/loc, but these also copy syntax properties from the given syntax object.</span>
<span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define-syntaxes</span> <span class="p">[</span><span class="n">syntax/loc/props</span> <span class="n">quasisyntax/loc/props</span><span class="p">]</span>
<span class="p">(</span><span class="k">let</span> <span class="p">()</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">make-syntax/loc/props</span> <span class="n">name</span> <span class="n">syntax-id</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">from-stx-expr:expr</span> <span class="p">{</span><span class="n">~describe</span> <span class="s2">"template"</span> <span class="n">template</span><span class="p">})</span>
<span class="o">#`</span><span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">from-stx</span> <span class="n">from-stx-expr</span><span class="p">])</span>
<span class="p">(</span><span class="k">unless</span> <span class="p">(</span><span class="nb">syntax?</span> <span class="n">from-stx</span><span class="p">)</span>
<span class="p">(</span><span class="nb">raise-argument-error</span> <span class="o">'#,</span><span class="ss">name</span> <span class="s2">"syntax?"</span> <span class="n">from-stx</span><span class="p">))</span>
<span class="p">(</span><span class="k">let*</span> <span class="p">([</span><span class="n">stx</span> <span class="p">(</span><span class="o">#,</span><span class="n">syntax-id</span> <span class="n">template</span><span class="p">)]</span>
<span class="p">[</span><span class="n">stx*</span> <span class="p">(</span><span class="nb">syntax-disarm</span> <span class="n">stx</span> <span class="no">#f</span><span class="p">)])</span>
<span class="p">(</span><span class="nb">syntax-rearm</span> <span class="p">(</span><span class="nb">datum->syntax</span> <span class="n">stx*</span> <span class="p">(</span><span class="nb">syntax-e</span> <span class="n">stx*</span><span class="p">)</span> <span class="n">from-stx</span> <span class="n">from-stx</span><span class="p">)</span> <span class="n">stx</span><span class="p">)))]))</span>
<span class="p">(</span><span class="nb">values</span> <span class="p">(</span><span class="n">make-syntax/loc/props</span> <span class="o">'</span><span class="ss">syntax/loc/props</span> <span class="o">#'</span><span class="k">syntax</span><span class="p">)</span>
<span class="p">(</span><span class="n">make-syntax/loc/props</span> <span class="o">'</span><span class="ss">quasisyntax/loc/props</span> <span class="o">#'</span><span class="k">quasisyntax</span><span class="p">)))))</span>
<span class="c1">; The core type forms. Each is bound to a transformer that raises a syntax error if the form is used as an expression.</span>
<span class="p">(</span><span class="k">define-syntaxes</span> <span class="p">[</span><span class="n">#%type:con</span> <span class="n">#%type:app</span> <span class="n">#%type:forall</span> <span class="n">#%type:qual</span>
<span class="n">#%type:bound-var</span> <span class="n">#%type:wobbly-var</span> <span class="n">#%type:rigid-var</span><span class="p">]</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">type-literal</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">stx</span><span class="p">)</span> <span class="p">(</span><span class="nb">raise-syntax-error</span> <span class="no">#f</span> <span class="s2">"cannot be used as an expression"</span> <span class="n">stx</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">values</span> <span class="n">type-literal</span> <span class="n">type-literal</span> <span class="n">type-literal</span> <span class="n">type-literal</span>
<span class="n">type-literal</span> <span class="n">type-literal</span> <span class="n">type-literal</span><span class="p">)))</span>
<span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="n">type-literal-ids</span>
<span class="p">(</span><span class="nb">list</span> <span class="o">#'</span><span class="n">#%type:con</span> <span class="o">#'</span><span class="n">#%type:app</span> <span class="o">#'</span><span class="n">#%type:forall</span> <span class="o">#'</span><span class="n">#%type:qual</span>
<span class="o">#'</span><span class="n">#%type:bound-var</span> <span class="o">#'</span><span class="n">#%type:wobbly-var</span> <span class="o">#'</span><span class="n">#%type:rigid-var</span><span class="p">))</span>
<span class="p">(</span><span class="n">define-literal-set</span> <span class="n">type-literals</span>
<span class="p">[</span><span class="n">#%type:con</span> <span class="n">#%type:app</span> <span class="n">#%type:forall</span> <span class="n">#%type:qual</span>
<span class="n">#%type:bound-var</span> <span class="n">#%type:wobbly-var</span> <span class="n">#%type:rigid-var</span><span class="p">])</span>
<span class="c1">; Expands a type with local-expand, stopping at the core type forms, then parses the result as an expanded-type.</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="p">(</span><span class="n">type</span> <span class="p">[</span><span class="n">intdef-ctx</span> <span class="no">#f</span><span class="p">])</span>
<span class="kd">#:description</span> <span class="s2">"type"</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="k">_</span> <span class="kd">#:with</span> <span class="p">{</span><span class="n">~var</span> <span class="n">||</span> <span class="p">(</span><span class="n">expanded-type</span> <span class="n">intdef-ctx</span><span class="p">)}</span>
<span class="p">(</span><span class="nb">local-expand</span> <span class="n">this-syntax</span> <span class="o">'</span><span class="ss">expression</span> <span class="n">type-literal-ids</span> <span class="n">intdef-ctx</span><span class="p">)])</span>
<span class="c1">; Parses a type whose head is already a core form, recursively expanding any subforms that are themselves types.</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="p">(</span><span class="n">expanded-type</span> <span class="n">intdef-ctx</span><span class="p">)</span>
<span class="kd">#:description</span> <span class="no">#f</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">expansion</span><span class="p">]</span>
<span class="kd">#:commit</span>
<span class="kd">#:literal-sets</span> <span class="p">[</span><span class="n">kernel-literals</span> <span class="n">type-literals</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:letrec-syntaxes+values</span> <span class="n">~!</span> <span class="p">([(</span><span class="n">id:id</span> <span class="k">...</span><span class="p">)</span> <span class="n">e:expr</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span> <span class="p">()</span> <span class="n">t:expr</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define</span> <span class="n">intdef-ctx*</span> <span class="p">(</span><span class="nb">syntax-local-make-definition-context</span> <span class="n">intdef-ctx</span><span class="p">))</span>
<span class="p">(</span><span class="k">for</span> <span class="p">([</span><span class="n">ids</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">id</span><span class="p">))]</span>
<span class="p">[</span><span class="n">e</span> <span class="p">(</span><span class="nb">in-list</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">e</span><span class="p">))])</span>
<span class="p">(</span><span class="nb">syntax-local-bind-syntaxes</span> <span class="n">ids</span> <span class="n">e</span> <span class="n">intdef-ctx*</span><span class="p">))]</span>
<span class="kd">#:with</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t*</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx*</span><span class="p">)}</span> <span class="o">#'</span><span class="n">t</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">~></span> <span class="p">(</span><span class="n">internal-definition-context-track</span> <span class="n">intdef-ctx*</span> <span class="o">#'</span><span class="n">t*.expansion</span><span class="p">)</span>
<span class="p">(</span><span class="nb">syntax-track-origin</span> <span class="n">this-syntax</span> <span class="o">#'</span><span class="n">head</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:con</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:app</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:forall</span> <span class="n">~!</span> <span class="n">x:id</span> <span class="p">{</span><span class="n">~var</span> <span class="n">t</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">x</span> <span class="n">t.expansion</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">head:#%type:qual</span> <span class="n">~!</span> <span class="p">{</span><span class="n">~var</span> <span class="n">a</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)}</span> <span class="p">{</span><span class="n">~var</span> <span class="n">b</span> <span class="p">(</span><span class="n">type</span> <span class="n">intdef-ctx</span><span class="p">)})</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="p">(</span><span class="n">syntax/loc/props</span> <span class="n">this-syntax</span>
<span class="p">(</span><span class="n">head</span> <span class="n">a.expansion</span> <span class="n">b.expansion</span><span class="p">))]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:bound-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:wobbly-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">(</span><span class="n">#%type:rigid-var</span> <span class="n">~!</span> <span class="n">_:id</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="n">expansion</span> <span class="n">this-syntax</span><span class="p">])</span>
<span class="p">(</span><span class="k">define</span> <span class="n">expand-type</span> <span class="p">(</span><span class="n">syntax-parser</span> <span class="p">[</span><span class="n">t:type</span> <span class="o">#'</span><span class="n">t.expansion</span><span class="p">])))</span></code></pre><p>But what now? Just as Racket fully-expanded programs are useless without a compiler to turn them into something useful, our custom type language doesn’t do anything at all in isolation. As it happens, in the case of the type language, we don’t have a compiler at all—we have a <em>typechecker</em>. The Hackett typechecker consumes fully-expanded types as input and uses them to perform its typechecking process. The actual implementation of Hackett’s typechecker is outside the scope of this blog post, since it’s really an entirely separate problem, but you can probably imagine what such a thing might look like, in an extremely vague, handwavy sense.</p><p>But we don’t <em>just</em> need a typechecker. Just as the authors of Racket don’t expect users to write programs using the core forms directly, we also don’t expect users to write their types using the fully-expanded syntax. If we did, all this fancy expansion machinery would be pretty pointless! Hackett provides a custom <code>#%app</code> binding that converts n-ary type applications to nested uses of <code>#%type:app</code>, as well as a nicer <code>forall</code> macro that supports specifying multiple type variables and multiple typeclass constraints all at once. The best part, though, is that these macros can be defined in a completely straightforward way, just as any ordinary Racket macro would be written, and the machinery will work precisely as intended. 
It’s also perfectly okay to have two different versions of <code>#%app</code>—one for types and one for values—since <a href="/blog/2017/10/27/a-space-of-their-own-adding-a-type-namespace-to-hackett/">Hackett supports multiple namespaces</a>, and each can have its own <code>#%app</code> binding.</p><p>The real implementation of Hackett’s type language is a little bit longer than the one in this blog post because it includes some extra definitions to provide custom <code>syntax/parse</code> pattern expanders for matching types and some template metafunctions for producing them, which are used by the typechecker, but if you’d like to see the whole thing, <a href="https://github.com/lexi-lambda/hackett/blob/ba64193da38f63dab2523f42c1b7614cdfa8c935/hackett-lib/hackett/private/type-language.rkt">it’s available on GitHub here</a>.</p><h2><a name="evaluation-limitations-and-acknowledgements"></a>Evaluation, limitations, and acknowledgements</h2><p>Reimplementing Hackett’s type language took about a week and a half, about half of which was supplemented by the extra time I had before I started <a href="https://twitter.com/lexi_lambda/status/976533916596097024">my new job</a> this past week. A portion of that time was spent deciding what I actually wanted to do, and a lot of it was spent hunting down fiddly bugs. All told, the rewrite resulted in a net addition of 250 lines of code to the Hackett codebase. However, 350 of the added lines reside in a new, self-contained module dedicated to Hackett’s type language, so the change actually resulted in a net <em>removal</em> of 100 lines from the rest of the codebase, which I consider an organizational win.</p><p>As for whether or not the change will accomplish the goals I had in mind, I think signs currently point to a strong likelihood of the answer being yes. 
The very same night I finalized and merged the changes to the type language, I dusted off an old prototype of typeclass deriving I had not been able to get working due to insufficiencies of the old type representation. Not only was I <a href="https://twitter.com/lexi_lambda/status/985051504867446786">able to get it working</a> quickly and easily, I was able to do it in <a href="https://twitter.com/lexi_lambda/status/985052476473856000">no more than 20 lines of code</a>. While the implementation is not as robust as it should ideally be, nor is it safe or simple enough yet to be easy for Hackett users to write themselves, making the impossible possible is usually a sign of motion in the right direction.</p><p>Unfortunately, the technique outlined in this blog post is not completely flawless. Due to its reliance on the <code>local-expand</code> stop list, this technique is incompatible with macros that force recursive expansion using an empty stop list. In the upcoming reimplementation of the Racket macroexpander to be released in Racket 7, this includes <code>syntax-parameterize</code>, which unfortunately means syntax parameters don’t work in the type language. This is a problem, and while it’s not a dealbreaker, it is something that will almost certainly have to be fixed at some point. Fortunately, it isn’t intractable, and I’ve been discussing some potential approaches to fixing the problem, whether via changes to the macroexpander or by making macros like <code>syntax-parameterize</code> cooperate better with things like Hackett’s type language.</p><p>Finally, as seems to be the case more and more with my blog posts, I cannot express enough thanks to <a href="http://www.cs.utah.edu/~mflatt/">Matthew Flatt</a>, without whose help I would probably not have been able to get everything working (not to mention that the Racket macro system would not exist without Matthew inventing and implementing it nearly singlehandedly). 
Matthew does an almost unfathomable number of things for Racket already without me pestering him with questions, bug reports, and feature requests, but he’s always patient and helpful all the same. Also, once again, I’d like to thank <a href="http://www.ccs.neu.edu/home/ryanc/">Ryan Culpepper</a> for <a href="https://www2.ccs.neu.edu/racket/pubs/dissertation-culpepper.pdf">his incredible work on constructing tools for the working macro developer</a>, including writing the fantastic <code>syntax/parse</code> library that powers essentially everything I do. Thank you both.</p><ol class="footnotes"></ol></article>An opinionated guide to Haskell in 20182018-02-10T00:00:00Z2018-02-10T00:00:00ZAlexis King<article><p>For me, this month marks the end of an era in my life: as of February 2018, I am no longer employed writing Haskell. It’s been a fascinating two years, and while I am excitedly looking forward to what I’ll be doing next, it’s likely I will continue to write Haskell in my spare time. I’ll probably even write it again professionally in the future.</p><p>In the meantime, in the interest of both sharing with others the small amount of wisdom I’ve gained and preserving it for my future self, I’ve decided to write a long, rather dry overview of a few select parts of the Haskell workflow I developed and the ecosystem I settled into. This guide is, as the title notes, <em>opinionated</em>—it is what I used in my day-to-day work, nothing more—and I don’t claim that anything here is the only way to write Haskell, nor even the best way. It is merely what I found helpful and productive. Take from it as much or as little as you’d like.</p><h2><a name="build-tools-and-how-to-use-them"></a>Build tools and how to use them</h2><p>When it comes to building Haskell, you have options. And frankly, most of them are pretty good. 
There was a time when <code>cabal-install</code> had a (warranted) reputation for being nearly impossible to use and regularly creating dependency hell, but I don’t think that’s the case anymore (though you <em>do</em> need to be a little careful about how you use it). Sandboxed builds work all right, and <code>cabal new-build</code> and the other <code>cabal new-*</code> commands are even better. That said, the UX of <code>cabal-install</code> is still less than stellar, and it has sharp edges, especially for someone coming from an ecosystem without a heavyweight compilation process, such as JavaScript, Ruby, or Python.</p><p>Nix is an alternative way to manage Haskell dependencies, and it seems pretty cool. It has a reputation for being large and complicated, and that reputation does not seem especially unfair, but you get lots of benefits if you’re willing to pay the cost. Unfortunately, I have never used it (though I’ve read a lot about it), so I can’t comment much on it here. Perhaps I’ll try to go all-in with Nix when I purchase my next computer, but for now, my workflow works well enough that I don’t feel compelled to switch.</p><p>Personally, I use <code>stack</code> as my Haskell build tool. It’s easy to use, it works out of the box, and while it doesn’t enjoy the same amount of caching as <code>cabal new-build</code> or Nix, it caches most packages, and it also makes things like git-hosted sources incredibly easy, which (as far as I can tell) can’t be done with <code>cabal-install</code> alone.</p><p>This section is going to be a guide on how <em>I</em> use <code>stack</code>. If you use <code>cabal-install</code> with or without Nix, great! Those tools seem good, too. 
This is not an endorsement of <code>stack</code> over the other build tools, just a description of how I use it, the issues I ran into, and my solutions to them.</p><h3><a name="understanding-stack-s-model-and-avoiding-its-biggest-gotcha"></a>Understanding <code>stack</code>’s model and avoiding its biggest gotcha</h3><p>Before using <code>stack</code>, there are a few things every programmer should know:</p><ul><li><p><code>stack</code> is not a package manager, it is a build tool. It does not manage a set of “installed” packages; it simply builds targets and their dependencies.</p></li><li><p>The command to build a target is <code>stack build &lt;target&gt;</code>. Just using <code>stack build</code> on its own will build the current project’s targets.</p></li><li><p><strong>You almost certainly do not want to use <code>stack install</code>.</strong></p></li></ul><p>This is the biggest point of confusion I see among new users of <code>stack</code>. After all, when you want to install a package with <code>npm</code>, you type <code>npm install &lt;package&gt;</code>. So a new Haskeller decides to install <code>lens</code>, types <code>stack install lens</code>, and then later tries <code>stack uninstall lens</code>, only to discover that no such command exists. What happened?</p><p><code>stack install</code> is not like <code>npm install</code>. <code>stack install</code> is like <code>make install</code>. It is nothing more than an alias for <code>stack build --copy-bins</code>, and <em>all</em> it does is build the target and copy all of its executables into some relatively global location like <code>~/.local/bin</code>. This is usually not what you want.</p><p>This design decision is not unique to <code>stack</code>; <code>cabal-install</code> suffers from it as well. 
One can argue that it isn’t unintuitive because it really is just following what <code>make install</code> conventionally does, and the fact that it happens to conflict with things like <code>npm install</code> or even <code>apt-get install</code> is just a naming clash. I think that argument is a poor one, however, and I think the decision to even include a <code>stack install</code> command was a bad idea.</p><p>So, remember: don’t use <code>stack install</code>! <code>stack</code> works best when everything lives inside the current project’s <em>local</em> sandbox, and <code>stack install</code> copies executables into a <em>global</em> location by design. While it might sometimes appear to work, it’s almost always wrong. The <em>only</em> situation in which <code>stack install</code> is the right answer is when you want to install an executable for a use unrelated to Haskell development (that is, something like <code>pandoc</code>) that just so happens to be provided by a Haskell package. <strong>This means no running <code>stack install ghc-mod</code> or <code>stack install intero</code> either, no matter what READMEs might tell you!</strong> Don’t worry: I’ll cover the proper way to install those things later.</p><h3><a name="actually-building-your-project-with-stack"></a>Actually building your project with <code>stack</code></h3><p>Okay, so now that you know to never use <code>stack install</code>, what <em>do</em> you use? Well, <code>stack build</code> is probably all you need. Let’s cover some variations of <code>stack build</code> that I use most frequently.</p><p>Once you have a <code>stack</code> project, you can build it by simply running <code>stack build</code> within the project directory. However, for local development, this is usually unnecessarily slow because it runs the GHC optimizer. For faster development build times, pass the <code>--fast</code> flag to disable optimizations:</p><pre><code>$ stack build --fast
</code></pre><p>By default, <code>stack</code> builds dependencies with coarse-grained, package-level parallelism, but you can enable more fine-grained, module-level parallel builds by adding <code>--ghc-options=-j</code>. Unfortunately, there are conflicting accounts of whether this actually makes builds faster or slower in practice, and I haven’t tested it extensively myself, so I mostly leave it off.</p><p>Usually, you also want to build and run the tests along with your code, which you can enable with the <code>--test</code> flag. Additionally, <code>stack test</code> is an alias for <code>stack build --test</code>, so these two commands are equivalent:</p><pre><code>$ stack build --fast --test
$ stack test --fast
</code></pre><p>Also, it is useful to build documentation as well as code! You can do this by passing the <code>--haddock</code> flag, but unfortunately, I find Haddock sometimes takes an unreasonably long time to run. Therefore, since I mostly care about running Haddock on my dependencies, I usually pass the <code>--haddock-deps</code> flag instead, which avoids having to re-run Haddock every time you build:</p><pre><code>$ stack test --fast --haddock-deps
</code></pre><p>Finally, I usually want to build and test my project in the background whenever my code changes. Fortunately, this can be done easily by using the <code>--file-watch</code> flag, making it easy to incrementally change project code and immediately see results:</p><pre><code>$ stack test --fast --haddock-deps --file-watch
</code></pre><p>This is the command I usually use to develop my Haskell projects.</p><h3><a name="accessing-local-documentation"></a>Accessing local documentation</h3><p>While Haskell does not always excel on the documentation front, a small amount of documentation is almost always better than no documentation at all, and I find my dependencies’ documentation to be an invaluable resource while developing. I find many people just look at docs on Hackage or use the hosted instance of Hoogle, but this sometimes leads people astray: they might end up looking at the wrong version of the documentation! Fortunately, there’s an easy solution to this problem, which is to browse the documentation <code>stack</code> installs locally, which is guaranteed to match the version you are using in your current project.</p><p>The easiest way to open local documentation for a particular package is to use the <code>stack haddock --open</code> command. For example, to open the documentation for <code>lens</code>, you could use the following command:</p><pre><code>$ stack haddock --open lens
</code></pre><p>This will open the local documentation in your web browser, and you can browse it at your leisure. If you have already built the documentation using the <code>--haddock-deps</code> option I recommended in the previous section, this command should complete almost instantly, but if you haven’t built the documentation yet, you’ll have to wait as <code>stack</code> builds it for you on-demand.</p><p>While this is a good start, it isn’t perfect. Ideally, I want to have <em>searchable</em> documentation, and fortunately, this is possible to do by running Hoogle locally. This is easy enough with modern versions of <code>stack</code>, which have built-in Hoogle integration, but it still requires a little bit of per-project setup, since you need to build the Hoogle search index with the following command:</p><pre><code>$ stack hoogle -- generate --local
</code></pre><p>This will install Hoogle into the current project if it isn’t already installed, and it will index your dependencies’ documentation and generate a new Hoogle database. Once you’ve done that, you can start a web server that serves a local Hoogle search page with the following command:</p><pre><code>$ stack hoogle -- server --local --port=8080
</code></pre><p>Navigate to <code>http://localhost:8080</code> in your web browser, and you’ll have a fully-searchable index of all your Haskell packages’ documentation. Isn’t that neat?</p><p>Unfortunately, you <em>will</em> have to manually regenerate the Hoogle database when you install new packages and their documentation, which you can do by re-running <code>stack hoogle -- generate --local</code>. Fortunately, regenerating the database doesn’t take very long, as long as you’ve been properly rebuilding the documentation with <code>--haddock-deps</code>.</p><h3><a name="configuring-your-project"></a>Configuring your project</h3><p>Every project built with <code>stack</code> is configured with two separate files:</p><ul><li><p>The <code>stack.yaml</code> file, which controls which packages are built and what versions to pin your dependencies to.</p></li><li><p>The <code>&lt;project&gt;.cabal</code> file <em>or</em> <code>package.yaml</code> file, which specifies build targets, their dependencies, and which GHC options to apply, among other things.</p></li></ul><p>The <code>.cabal</code> file is, ultimately, what is used to build your project, but modern versions of <code>stack</code> generate projects that use hpack, which uses an alternate configuration file, the <code>package.yaml</code> file, to generate the <code>.cabal</code> file. This can get a little bit confusing, since it means you have <em>three</em> configuration files in your project, one of which is generated from another.</p><p>I happen to use and like hpack, so I use a <code>package.yaml</code> file and allow hpack to generate the <code>.cabal</code> file. 
I have no real love for YAML, and in fact I think custom configuration formats are completely fine, but the primary advantage of hpack is the ability to specify things like GHC options and default language extensions for all targets at once, instead of needing to duplicate them per-target.</p><p>You can think of the <code>.cabal</code> or <code>package.yaml</code> file as a specification for <em>how</em> your project is built and <em>what packages</em> it depends on, but the <code>stack.yaml</code> file is a specification of precisely <em>which version</em> of each package should be used and where it should be fetched from. Also, each <code>.cabal</code> file corresponds to precisely <em>one</em> Haskell package (though it may have any number of executable targets), but a <code>stack.yaml</code> file can specify multiple different packages to build, useful for multi-project builds that share a common library. The details here can be a little confusing, more than I am likely going to be able to explain in this blog post, but for the most part, you can get away with the defaults unless you’re doing something fancy.</p><h3><a name="setting-up-editor-integration"></a>Setting up editor integration</h3><p>Currently, I use Atom to write Haskell. Atom is not a perfect editor by any means, and it leaves a lot to be desired, but it’s easy to set up, and the Haskell editor integration is decent.</p><p>Atom’s editor integration is powered by <code>ghc-mod</code>, a program that uses the GHC API to provide tools to inspect Haskell programs. Installing <code>ghc-mod</code> must be done manually so that Atom’s <code>haskell-ghc-mod</code> package can find it, and this is where a lot of people get tripped up. They run <code>stack install ghc-mod</code>, it installs <code>ghc-mod</code> into <code>~/.local/bin</code>, they put that in their <code>PATH</code>, and things work! 
…except when a new version of GHC is released a few months later, everything stops working.</p><p>As mentioned above, <strong><code>stack install</code> is not what you want</strong>. Tools like <code>ghc-mod</code>, <code>hlint</code>, <code>hoogle</code>, <code>weeder</code>, and <code>intero</code> work best when installed as part of the sandbox, <em>not</em> globally, since that ensures they will match the current GHC version your project is using. This can be done per-project using the ordinary <code>stack build</code> command, so the easiest way to properly install <code>ghc-mod</code> into a <code>stack</code> project is with the following command:</p><pre><code>$ stack build ghc-mod
</code></pre><p>Unfortunately, this means you will need to run that command inside every single <code>stack</code> project individually in order to properly set it up so that <code>stack exec -- ghc-mod</code> will find the correct executable. One way to circumvent this is by using a recently-added <code>stack</code> flag designed for this explicit purpose, <code>--copy-compiler-tool</code>. This is like <code>--copy-bins</code>, but it copies the executables into a <em>compiler-specific location</em>, so a tool built for GHC 8.0.2 will be stored separately from the same tool built for GHC 8.2.2. <code>stack exec</code> arranges for the executables for the current compiler version to end up in the <code>PATH</code>, so you only need to build and install your tools once per compiler version.</p><p>Does this kind of suck? Yes, a little bit, but it sucks a whole lot less than all your editor integration breaking every time you switch to a project that uses a different version of GHC. I use the following command in a fresh sandbox when a Stackage LTS comes out for a new version of GHC:</p><pre><code>$ stack build --copy-compiler-tool ghc-mod hoogle weeder
</code></pre><p>This way, I only have to build those tools once, and I don’t worry about rebuilding them again until the next release of GHC. To verify that things are working properly, you should be able to create a fresh <code>stack</code> project, run a command like this one, and get a similar result:</p><pre><code>$ stack exec -- which ghc-mod
/Users/alexis/.stack/compiler-tools/x86_64-osx/ghc-8.2.2/bin/ghc-mod
</code></pre><p>Note that this path is scoped to my operating system and my compiler version, but nothing else—no LTS or anything like that.</p><h2><a name="warning-flags-for-a-safe-build"></a>Warning flags for a safe build</h2><p>Haskell is a relatively strict language as programming languages go, but in my experience, it isn’t quite strict enough. Many things are not errors that probably ought to be, like orphan instances and inexhaustive pattern matches. Fortunately, GHC provides <em>warnings</em> that catch these problems statically, which fill in the gaps. I recommend using the following flags on all projects to ensure everything is caught:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wall"><code>-Wall</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wcompat"><code>-Wcompat</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wincomplete-record-updates"><code>-Wincomplete-record-updates</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wincomplete-uni-patterns"><code>-Wincomplete-uni-patterns</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wredundant-constraints"><code>-Wredundant-constraints</code></a></p></li></ul><p>The <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wall"><code>-Wall</code></a> option turns on <em>most</em> warnings, but (ironically) not all of them. 
The <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Weverything"><code>-Weverything</code></a> flag truly turns on <em>all</em> warnings, but some of the warnings left disabled by <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wall"><code>-Wall</code></a> really are quite silly, like warning when type signatures on polymorphic local bindings are omitted. Some of them, however, are legitimately useful, so I recommend turning them on explicitly.</p><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wcompat"><code>-Wcompat</code></a> enables warnings that make your code more robust in the face of future backwards-incompatible changes. These warnings are trivial to fix and serve as free future-proofing, so I see no reason not to turn these warnings on.</p><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wincomplete-record-updates"><code>-Wincomplete-record-updates</code></a> and <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wincomplete-uni-patterns"><code>-Wincomplete-uni-patterns</code></a> are things I think ought to be enabled by <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wall"><code>-Wall</code></a> because they both catch what are essentially partial pattern-matches (and therefore runtime errors waiting to happen). 
The fact that <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wincomplete-uni-patterns"><code>-Wincomplete-uni-patterns</code></a> <em>isn’t</em> enabled by <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wall"><code>-Wall</code></a> is so surprising that it can lead to bugs being overlooked, since the extremely similar <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wincomplete-patterns"><code>-Wincomplete-patterns</code></a> <em>is</em> enabled by <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wall"><code>-Wall</code></a>.</p><p><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Wredundant-constraints"><code>-Wredundant-constraints</code></a> is a useful warning that helps to eliminate unnecessary typeclass constraints on functions, which can sometimes occur if a constraint was previously necessary but ends up becoming redundant due to a change in the function’s behavior.</p><p>I put all five of these flags in the <code>.cabal</code> file (or <code>package.yaml</code>), which enables them everywhere, but this alone is unlikely to enforce a warning-free codebase, since the build will still succeed even in the presence of warnings. Therefore, when building projects in CI, I pass the <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/using-warnings.html#ghc-flag--Werror"><code>-Werror</code></a> flag (using <code>--ghc-options=-Werror</code> for <code>stack</code>), which treats warnings as errors and halts the build if any warnings are found. 
Restricting <code>-Werror</code> to CI is useful, since it means warnings don’t halt the whole build while developing, making it possible to write some code that has warnings and still run the test suite, but it still enforces that pushed code be warning-free.</p><h2><a name="any-flavor-you-like"></a>Any flavor you like</h2><p>Haskell is both a language and a spectrum of languages. It is both a standard and a specific implementation. Haskell 98 and Haskell 2010 are good, small languages, and there are a few different implementations, but when people talk about “Haskell”, unqualified, they’re almost always talking about GHC.</p><p>GHC Haskell, in stark contrast to standard Haskell, is neither small nor particularly specific, since GHC ships with <em>dozens</em> of knobs and switches that can be used to configure the language. In theory, this is a little terrifying. How could anyone ever hope to talk about Haskell and agree upon how to write it if there are so many <em>different</em> Haskells, each a little bit distinct? Having a cohesive ecosystem would be completely hopeless.</p><p>Fortunately, in practice, this is not nearly as bad as it seems. The majority of GHC extensions are simple switches: a feature is either on or it is off. Turning a feature on rarely affects code that does not use it, so most extensions can be turned on by default, and programmers may simply avoid the features they do not wish to use, just as any programmer in any programming language likely picks a subset of their language’s features to use on a daily basis. Writing Haskell is no different in this regard; it differs only in that it does not allow all features to be used by default: everything from minor syntactic tweaks to entirely new facets of the type system is opt-in.</p><p>Frankly, I think the UX around this is terrible. 
I recognize the desire to implement a standard Haskell, and the old <code>-fglasgow-exts</code> was not an especially elegant solution for people wishing to use nonstandard Haskell, but having to insert <code>LANGUAGE</code> pragmas at the top of every module just to take advantage of the best features GHC has to offer is a burden, and it is unnecessarily intimidating. I think much of the Haskell community finds the use of <code>LANGUAGE</code> pragmas preferable to enabling extensions globally using the <code>default-extensions</code> list in the <code>.cabal</code> file, but I go against the grain on that issue <em>hard</em>. The vast majority of language extensions I use are extensions I want enabled all the time; a list of them at the top of a module is just distracting noise, and it only serves to bury the extensions I really do want to enable on a module-by-module basis. It also makes it tricky to communicate with a team which extensions are acceptable (or even preferable) and which are discouraged.</p><p>My <em><strong>strong</strong></em> recommendation if you decide to write GHC Haskell on a team is to agree as a group on a list of extensions the team is happy with enabling everywhere and to put those extensions in the <code>default-extensions</code> list in the <code>.cabal</code> file. This eliminates clutter, busywork, and the conceptual overhead of remembering which extensions are in favor and which are discouraged. This is a net win, and it isn’t at all difficult to look in the <code>.cabal</code> file when you want to know which extensions are in use.</p><p>Now, with that small digression out of the way, the question becomes precisely which extensions should go into that <code>default-extensions</code> list. I happen to like using most of the features GHC makes available, so I enable a whopping <strong>34</strong> language extensions <em>by default</em>. 
As of GHC 8.2, here is my list:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XBangPatterns"><code>BangPatterns</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XConstraintKinds"><code>ConstraintKinds</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDataKinds"><code>DataKinds</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDefaultSignatures"><code>DefaultSignatures</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveFoldable"><code>DeriveFoldable</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveFunctor"><code>DeriveFunctor</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveGeneric"><code>DeriveGeneric</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveLift"><code>DeriveLift</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveTraversable"><code>DeriveTraversable</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDerivingStrategies"><code>DerivingStrategies</code></a></p></li><li><p><a 
href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XEmptyCase"><code>EmptyCase</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XExistentialQuantification"><code>ExistentialQuantification</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleContexts"><code>FlexibleContexts</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleInstances"><code>FlexibleInstances</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFunctionalDependencies"><code>FunctionalDependencies</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGeneralizedNewtypeDeriving"><code>GeneralizedNewtypeDeriving</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XInstanceSigs"><code>InstanceSigs</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XKindSignatures"><code>KindSignatures</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XLambdaCase"><code>LambdaCase</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMultiParamTypeClasses"><code>MultiParamTypeClasses</code></a></p></li><li><p><a 
href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMultiWayIf"><code>MultiWayIf</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XNamedFieldPuns"><code>NamedFieldPuns</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XOverloadedStrings"><code>OverloadedStrings</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XPatternSynonyms"><code>PatternSynonyms</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XRankNTypes"><code>RankNTypes</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XScopedTypeVariables"><code>ScopedTypeVariables</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XStandaloneDeriving"><code>StandaloneDeriving</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTupleSections"><code>TupleSections</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeApplications"><code>TypeApplications</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilies"><code>TypeFamilies</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilyDependencies"><code>TypeFamilyDependencies</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeOperators"><code>TypeOperators</code></a></p></li></ul><p>This is a 
lot, and a few of them are likely to be more controversial than others. Since I do not imagine everyone will agree with everything in this list, I’ve broken it down into smaller chunks, arranged from what I think ought to be least controversial to most controversial, along with a little bit of justification why each extension is in each category. If you’re interested in coming up with your own list of extensions, the rest of this section is for you.</p><h3><a name="trivial-lifting-of-standards-imposed-limitations"></a>Trivial lifting of standards-imposed limitations</h3><p>A few extensions are tiny changes that lift limitations that really have no reason to exist, other than that they are mandated by the standard. I am not sure why these restrictions are in the standard to begin with, other than perhaps a misguided attempt at making the language simpler. These extensions include the following:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XEmptyCase"><code>EmptyCase</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleContexts"><code>FlexibleContexts</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleInstances"><code>FlexibleInstances</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XInstanceSigs"><code>InstanceSigs</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMultiParamTypeClasses"><code>MultiParamTypeClasses</code></a></p></li></ul><p>These extensions have no business <em>not</em> being turned on everywhere. 
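</p><p>To make the restrictions concrete, here is a minimal sketch that needs three of these extensions at once. The <code>Convert</code> class is a hypothetical example invented for illustration, not something from a real library:</p><pre><code class="pygments">{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE InstanceSigs          #-}
{-# LANGUAGE MultiParamTypeClasses #-}

module Main (main) where

-- MultiParamTypeClasses: a (hypothetical) class relating two types.
class Convert a b where
  convert :: a -> b

-- FlexibleInstances: String is a synonym for [Char], which is not of the
-- "type constructor applied to distinct type variables" form that
-- Haskell 2010 requires of instance heads.
instance Convert Int String where
  -- InstanceSigs: the method's type can be restated inside the instance.
  convert :: Int -> String
  convert = show

main :: IO ()
main = putStrLn (convert (42 :: Int))</code></pre><p>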
<a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleContexts"><code>FlexibleContexts</code></a> and <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleInstances"><code>FlexibleInstances</code></a> end up being turned on in almost any nontrivial Haskell module, since without them, the typeclass system is pointlessly and artificially limited.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XInstanceSigs"><code>InstanceSigs</code></a> is extremely useful, completely safe, and has zero downsides.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMultiParamTypeClasses"><code>MultiParamTypeClasses</code></a> are almost impossible to avoid, given how many libraries use them, and they are a completely obvious generalization of single-parameter typeclasses. Much like <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleContexts"><code>FlexibleContexts</code></a> and <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFlexibleInstances"><code>FlexibleInstances</code></a>, I see no real reason to ever leave these disabled.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XEmptyCase"><code>EmptyCase</code></a> is even stranger to me, since <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XEmptyDataDecls"><code>EmptyDataDecls</code></a> is in Haskell 2010, so it’s possible to define empty datatypes in standard Haskell but not exhaustively pattern-match on them! 
This is silly, and <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XEmptyCase"><code>EmptyCase</code></a> should be standard Haskell.</p><h3><a name="syntactic-conveniences"></a>Syntactic conveniences</h3><p>A few GHC extensions are little more than trivial, syntactic abbreviations. These things would be tiny macros in a Lisp, but they need to be extensions to the compiler in Haskell:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XLambdaCase"><code>LambdaCase</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMultiWayIf"><code>MultiWayIf</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XNamedFieldPuns"><code>NamedFieldPuns</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTupleSections"><code>TupleSections</code></a></p></li></ul><p>All of these extensions are only triggered by explicit use of new syntax, so existing programs will never change behavior when these extensions are introduced.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XLambdaCase"><code>LambdaCase</code></a> only saves a few characters, but it eliminates the need to come up with a fresh, unique variable name that will only be used once, which is sometimes hard to do and leads to worse names overall. Sometimes, it really is better to leave something unnamed.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMultiWayIf"><code>MultiWayIf</code></a> isn’t something I find I commonly need, but when I do, it’s nice to have. 
It’s far easier to read than nested <code>if...then...else</code> chains, and it uses the existing guard syntax already used with function declarations and <code>case...of</code>, so it’s easy to understand, even to those unfamiliar with the extension.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XNamedFieldPuns"><code>NamedFieldPuns</code></a> avoids headaches and clutter when using Haskell records without the <a href="https://www.reddit.com/r/haskell/comments/6jaa5f/recordwildcards_and_binary_parsing/djd5ugj/">accidental identifier capture issues</a> of <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XRecordWildCards"><code>RecordWildCards</code></a>. It’s a nice, safe compromise that brings some of the benefits of <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XRecordWildCards"><code>RecordWildCards</code></a> without any downsides.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTupleSections"><code>TupleSections</code></a> is a logical generalization of tuple syntax in the same vein as standard operator sections, and it’s quite useful when using applicative notation. I don’t see any reason to not enable it.</p><h3><a name="extensions-to-the-deriving-mechanism"></a>Extensions to the deriving mechanism</h3><p>GHC’s typeclass deriving mechanism is one of the things that makes Haskell so pleasant to write, and in fact I think Haskell would be nearly unpalatable to write without it. Boilerplate generation is a good thing, since it defines operations in terms of a single source of truth, and generated code is code you do not need to maintain. 
There is rarely any reason to write a typeclass instance by hand when the deriving mechanism will write it automatically.</p><p>These extensions give GHC’s typeclass deriving mechanism more power without any cost. Therefore, I see no reason <em>not</em> to enable them:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveFoldable"><code>DeriveFoldable</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveFunctor"><code>DeriveFunctor</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveGeneric"><code>DeriveGeneric</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveLift"><code>DeriveLift</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveTraversable"><code>DeriveTraversable</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDerivingStrategies"><code>DerivingStrategies</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGeneralizedNewtypeDeriving"><code>GeneralizedNewtypeDeriving</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XStandaloneDeriving"><code>StandaloneDeriving</code></a></p></li></ul><p>The first five of these simply extend the list of typeclasses GHC knows how to derive, something that will only ever be triggered if the user explicitly requests GHC derive one of those classes. 
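</p><p>As a small sketch of what that buys you, the following derives four of those classes for a hypothetical <code>Tree</code> type (the type and its constructors are made up for this example). <code>DeriveLift</code> is left out only because it needs the <code>template-haskell</code> package in addition to <code>base</code>:</p><pre><code class="pygments">{-# LANGUAGE DeriveFoldable    #-}
{-# LANGUAGE DeriveFunctor     #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE DeriveTraversable #-}

module Main (main) where

import GHC.Generics (Generic)

-- A hypothetical binary tree, used only for illustration.
data Tree a
  = Leaf
  | Node (Tree a) a (Tree a)
  deriving (Show, Functor, Foldable, Traversable, Generic)

main :: IO ()
main = do
  let t = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf) :: Tree Int
  print (fmap (* 2) t)   -- uses the derived Functor instance
  print (sum t)          -- uses the derived Foldable instance
  -- Uses the derived Traversable instance; the result is Nothing if any
  -- element is not positive.
  print (traverse (\x -> if x > 0 then Just x else Nothing) t)</code></pre><p>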
<a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGeneralizedNewtypeDeriving"><code>GeneralizedNewtypeDeriving</code></a> is quite possibly one of the most important extensions in all of Haskell, since it dramatically improves <code>newtype</code>s’ utility. Wrapper types can inherit instances they need without any boilerplate, and making increased type safety easier and more accessible is always a good thing in my book.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDerivingStrategies"><code>DerivingStrategies</code></a> is new to GHC 8.2, but it finally presents the functionality of GHC’s <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveAnyClass"><code>DeriveAnyClass</code></a> extension in a useful way. <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveAnyClass"><code>DeriveAnyClass</code></a> is useful when used with certain libraries that use <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDefaultSignatures"><code>DefaultSignatures</code></a> (discussed later) with <code>GHC.Generics</code> to derive instances of classes without the deriving being baked into GHC. Unfortunately, enabling <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveAnyClass"><code>DeriveAnyClass</code></a> essentially disables the far more useful <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGeneralizedNewtypeDeriving"><code>GeneralizedNewtypeDeriving</code></a>, so I do <em>not</em> recommend enabling <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDeriveAnyClass"><code>DeriveAnyClass</code></a>. 
Fortunately, with <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDerivingStrategies"><code>DerivingStrategies</code></a>, it’s possible to opt into the <code>anyclass</code> deriving strategy on a case-by-case basis, getting some nice boilerplate reduction in the process.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XStandaloneDeriving"><code>StandaloneDeriving</code></a> is useful when GHC’s deriving algorithms aren’t <em>quite</em> clever enough to deduce the instance context automatically, so it allows specifying it manually. This is only useful in a few small situations, but it’s nice to have, and there are no downsides to enabling it, so it ought to be turned on.</p><h3><a name="lightweight-syntactic-adjustments"></a>Lightweight syntactic adjustments</h3><p>A couple extensions tweak Haskell’s syntax in more substantial ways than things like <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XLambdaCase"><code>LambdaCase</code></a>, but not in a significant enough way for them to really be at all surprising:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XBangPatterns"><code>BangPatterns</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XKindSignatures"><code>KindSignatures</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeOperators"><code>TypeOperators</code></a></p></li></ul><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XBangPatterns"><code>BangPatterns</code></a> mirror strictness annotations on datatypes, so they are unlikely to be confusing, and they provide a much more pleasant notation for annotating the strictness of 
bindings than explicit uses of <code>seq</code>.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XKindSignatures"><code>KindSignatures</code></a> are also fairly self-explanatory: they’re just like type annotations, but for types instead of values. Writing kind signatures explicitly is usually unnecessary, but they can be helpful for clarity or for annotating phantom types when <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XPolyKinds"><code>PolyKinds</code></a> is not enabled. Enabling <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XKindSignatures"><code>KindSignatures</code></a> doesn’t have any adverse effects, so I see no reason not to enable it everywhere.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeOperators"><code>TypeOperators</code></a> adjusts the syntax of types slightly, allowing operators to be used as type constructors and written infix, which is technically backwards-incompatible, but I’m a little suspicious of anyone using <code>(!@#$)</code> as a type variable (especially since standard Haskell does not allow them to be written infix). 
This extension is useful with some libraries like <code>natural-transformations</code> that provide infix type constructors, and it makes the type language more consistent with the value language.</p><h3><a name="polymorphic-string-literals"></a>Polymorphic string literals</h3><p>I’m putting this extension in a category all of its own, mostly because I don’t think any other Haskell extensions have quite the same set of tradeoffs:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XOverloadedStrings"><code>OverloadedStrings</code></a></p></li></ul><p>For me, <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XOverloadedStrings"><code>OverloadedStrings</code></a> is not optional. Haskell’s infamous “string problem” (discussed in more detail at the end of this blog post) means that <code>String</code> is a linked list of characters, and all code that cares about performance actually uses <code>Text</code>. Manually invoking <code>pack</code> on every single string literal in a program is just noise, and <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XOverloadedStrings"><code>OverloadedStrings</code></a> solves that noise.</p><p>That said, I actually find I don’t use the polymorphism of string literals very often, and I’d be alright with monomorphic literals if I could make them <em>all</em> have type <code>Text</code>. 
Unfortunately, there isn’t a way to do this, so <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XOverloadedStrings"><code>OverloadedStrings</code></a> is the next best thing, even if it sometimes causes some unnecessary ambiguities that require type annotations to resolve.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XOverloadedStrings"><code>OverloadedStrings</code></a> is an extension that I use so frequently, in so many modules (especially in my test suites) that I would rather keep it on everywhere so I don’t have to care about whether or not it’s enabled in the module I’m currently writing. On the other hand, it certainly isn’t my favorite language extension, either. I wouldn’t go as far as to call it a necessary evil, since I don’t think it’s truly “evil”, but it does seem to be necessary.</p><h3><a name="simple-extensions-to-aid-type-annotation"></a>Simple extensions to aid type annotation</h3><p>The following two extensions significantly round out Haskell’s language for referring to types, making it much easier to insert type annotations where necessary (for removing ambiguity or for debugging type errors):</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XScopedTypeVariables"><code>ScopedTypeVariables</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeApplications"><code>TypeApplications</code></a></p></li></ul><p>That the behavior of <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XScopedTypeVariables"><code>ScopedTypeVariables</code></a> is <em>not</em> the default is actually one of the most common gotchas for new Haskellers. 
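</p><p>A minimal sketch of the gotcha, using a hypothetical helper named <code>pairWithReverse</code>: the local signature for <code>ys</code> only refers to the outer <code>a</code> because the extension is on and the outer signature uses an explicit <code>forall</code>.</p><pre><code class="pygments">{-# LANGUAGE ScopedTypeVariables #-}

module Main (main) where

-- The explicit forall brings a into scope for the whole definition,
-- including the where clause.
pairWithReverse :: forall a. [a] -> [(a, a)]
pairWithReverse xs = zip xs ys
  where
    -- This a is the same a as in the signature above. Without the extension
    -- and the explicit forall, it would be a fresh, unrelated type variable,
    -- and GHC would reject the binding.
    ys :: [a]
    ys = reverse xs

main :: IO ()
main = print (pairWithReverse "abc")</code></pre><p>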
Sadly, it can theoretically adjust the behavior of existing Haskell programs, so I cannot include it in the list of trivial changes, but I would argue such programs were probably confusing to begin with, and I have never seen a program in practice that was impacted by that problem. I think leaving <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XScopedTypeVariables"><code>ScopedTypeVariables</code></a> off is much, much more likely to be confusing than turning it on.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeApplications"><code>TypeApplications</code></a> is largely unrelated, but I include it in this category because it’s quite useful and cooperates well with <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XScopedTypeVariables"><code>ScopedTypeVariables</code></a>. Use of <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeApplications"><code>TypeApplications</code></a> makes instantiation much more lightweight than full-blown type annotations, and once again, it has no downsides if it is enabled and unused (since it is a syntactic addition). I recommend enabling it.</p><h3><a name="simple-extensions-to-the-haskell-type-system"></a>Simple extensions to the Haskell type system</h3><p>A few extensions tweak the Haskell type system in ways that I think are simple enough to be self-explanatory, even to people who might not have known they existed. 
These are as follows:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XConstraintKinds"><code>ConstraintKinds</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XRankNTypes"><code>RankNTypes</code></a></p></li></ul><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XConstraintKinds"><code>ConstraintKinds</code></a> is largely just used to define typeclass aliases, which is both useful and self-explanatory. Unifying the type and constraint language also has the effect of allowing type-level programming with constraints, which is sometimes useful, but far rarer in practice than the aforementioned use case.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XRankNTypes"><code>RankNTypes</code></a> are uncommon, looking at the average type in a Haskell program, but they’re certainly nice to have when you need them. 
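</p><p>For readers who have not seen one, here is a minimal sketch of a rank-2 type. The function <code>applyToBoth</code> is a hypothetical example; the point is only that the <code>forall</code> sits inside the argument type, so the caller must supply a function that works for every element type:</p><pre><code class="pygments">{-# LANGUAGE RankNTypes #-}

module Main (main) where

-- The argument f is itself polymorphic, so it can be applied to lists of
-- two different element types within the same function body.
applyToBoth :: (forall a. [a] -> Int) -> ([Bool], String) -> (Int, Int)
applyToBoth f (xs, ys) = (f xs, f ys)

main :: IO ()
main = print (applyToBoth length ([True, False], "abc"))  -- prints (2,3)</code></pre><p>Without the inner <code>forall</code>, <code>f</code> would be fixed to a single element type and could not be applied to both a <code>[Bool]</code> and a <code>String</code>.</p><p>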
The idea of pushing <code>forall</code>s further into a type to adjust how variables are quantified is something that, in my experience, people find fairly intuitive, especially after seeing them used once or twice, and higher-rank types do crop up regularly, if infrequently.</p><h3><a name="intermediate-syntactic-adjustments"></a>Intermediate syntactic adjustments</h3><p>Three syntactic extensions to Haskell are a little bit more advanced than the ones I’ve already covered, and none of them are especially related:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDefaultSignatures"><code>DefaultSignatures</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XPatternSynonyms"><code>PatternSynonyms</code></a></p></li></ul><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a> is, on the surface, simple. It changes <code>do</code> notation to use <code>Applicative</code> operations where possible, which allows using <code>do</code> notation with applicative functors that are not monads, and it also makes operations potentially more performant when <code>(<*>)</code> can be implemented more efficiently than <code>(>>=)</code>. In theory, it sounds like there are no downsides to enabling this everywhere. However, there are a few drawbacks that lead me to put it so low on this list:</p><ol><li><p>It considerably complicates the desugaring of <code>do</code> blocks, to the point where the algorithm cannot even be easily syntactically documented. 
In fact, an additional compiler flag, <code>-foptimal-applicative-do</code>, is a way to <em>opt into</em> optimal solutions for <code>do</code> block expansions, tweaking the desugaring algorithm to have an <em>O</em>(<em>n</em><sup>3</sup>) time complexity! This means that the default behavior is guided by a heuristic, and desugaring isn’t even especially predictable. This isn’t necessarily so bad, since it’s really only intended as an optimization when some <code>Monad</code> operations are still necessary, but it does dramatically increase the complexity of one of Haskell’s core forms.</p></li><li><p>The desugaring, despite being <em>O</em>(<em>n</em><sup>2</sup>) by default, isn’t even especially clever. It relies on a rather disgusting hack that recognizes <code>return e</code>, <code>return $ e</code>, <code>pure e</code>, or <code>pure $ e</code> expressions <em>syntactically</em>, and it completely gives up if an expression with precisely that shape is not the final statement in a <code>do</code> block. This is a bit awkward, since it effectively turns <code>return</code> and <code>pure</code> into syntax when before they were merely functions, but that isn’t all. It also means that the following <code>do</code> block is <em>not</em> desugared using <code>Applicative</code> operations:</p><pre><code class="pygments"><span class="kr">do</span> <span class="n">foo</span> <span class="n">a</span> <span class="n">b</span>
<span class="n">bar</span> <span class="n">s</span> <span class="n">t</span>
<span class="n">baz</span> <span class="n">y</span> <span class="n">z</span></code></pre><p>This will use the normal, monadic desugaring, despite the fact that it is trivially desugared into <code>Applicative</code> operations as <code>foo a b *> bar s t *> baz y z</code>. In order to get <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a> to trigger here, the <code>do</code> block must be contorted into the following:</p><pre><code class="pygments"><span class="kr">do</span> <span class="n">foo</span> <span class="n">a</span> <span class="n">b</span>
<span class="n">bar</span> <span class="n">s</span> <span class="n">t</span>
<span class="n">r</span> <span class="ow"><-</span> <span class="n">baz</span> <span class="n">y</span> <span class="n">z</span>
<span class="n">pure</span> <span class="n">r</span></code></pre><p>This seems like an odd oversight.</p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTemplateHaskell"><code>TemplateHaskell</code></a> doesn’t seem able to cope with <code>do</code> blocks when <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a> is enabled. I reported this as <a href="https://ghc.haskell.org/trac/ghc/ticket/14471">an issue on the GHC bug tracker</a>, but it hasn’t received any attention, so it’s not likely to get fixed unless someone takes the initiative to do so.</p></li><li><p>Enabling <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a> can cause problems with code that may have assumed <code>do</code> would always be monadic, and sometimes, that can cause code that typechecks to lead to an infinite loop at runtime. Specifically, if <code>do</code> notation is used to define <code>(<*>)</code> in terms of <code>(>>=)</code>, enabling <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a> will cause the definition of <code>(<*>)</code> to become self-referential and therefore divergent. Fortunately, this issue can be easily mitigated by simply writing <code>(<*>) = ap</code> instead, which is clearer and shorter than the equivalent code using <code>do</code>.</p></li></ol><p>Given all these things, it seems <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XApplicativeDo"><code>ApplicativeDo</code></a> is a little too new in a few places, and it isn’t quite baked. Still, I keep it enabled by default. Why? 
Well, <em>usually</em> it works fine without any problems, and when I run into issues, I can disable it on a per-module basis by writing <code>{-# LANGUAGE NoApplicativeDo #-}</code>. I still find that keeping it enabled by default is fine the vast majority of the time; I just sometimes need to work around the bugs.</p><p>In contrast, <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDefaultSignatures"><code>DefaultSignatures</code></a> isn’t buggy at all, as far as I can tell; it’s just not usually useful without fairly advanced features like <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a> (for type equalities) or <code>GHC.Generics</code>. I mostly use it for <a href="/blog/2017/04/28/lifts-for-free-making-mtl-typeclasses-derivable/">making lifting instances for <code>mtl</code>-style typeclasses easier to write</a>, which I’ve found to be a tiny bit tricky to explain (mostly due to the use of type equalities in the context), but it works well. I don’t see any real reason to leave this disabled, but if you don’t think you’re going to use it anyway, it doesn’t really matter one way or the other.</p><p>Finally, <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XPatternSynonyms"><code>PatternSynonyms</code></a> allows users to extend the pattern language just as they are allowed to extend the value language. Bidirectional pattern synonyms are isomorphisms, and it’s quite useful to allow those isomorphisms to be used with Haskell’s usual pattern-matching syntax. I think this extension is actually quite benign, but I put it so low on this list because it seems infrequently used, and I get the sense most people consider it fairly advanced. 
I would argue, however, that it’s a very pleasant, useful extension, and it’s no more complicated than a number of the features in Haskell 98.</p><h3><a name="intermediate-extensions-to-the-haskell-type-system"></a>Intermediate extensions to the Haskell type system</h3><p>Now we’re getting into the meat of things. Everything up to this point has been, in my opinion, completely self-evident in its usefulness and simplicity. As far as I’m concerned, the extensions in the previous six sections have no business ever being left disabled. Starting in this section, however, I could imagine a valid argument being made either way.</p><p>The following three extensions add some complexity to the Haskell type system in return for some added expressive power:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XExistentialQuantification"><code>ExistentialQuantification</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFunctionalDependencies"><code>FunctionalDependencies</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a></p></li></ul><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XExistentialQuantification"><code>ExistentialQuantification</code></a> and <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a> are related, given that the former is subsumed by the latter, but <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a> also enables an alternative syntax. 
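</p><p>As a rough sketch of what the two syntaxes look like side by side (the <code>Showable</code> and <code>Showable'</code> types and the <code>render</code> functions are hypothetical names invented purely for this illustration), the same existential wrapper can be declared either way:</p>

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}

module Main where

-- Hypothetical example type, written with ExistentialQuantification syntax.
-- The type variable `a` does not appear in the result type, so it is hidden.
data Showable = forall a. Show a => MkShowable a

-- The same idea written with GADT syntax: the constructor's full type
-- signature is spelled out, including the `Show a` constraint.
data Showable' where
  MkShowable' :: Show a => a -> Showable'

-- Matching on the constructor makes the hidden `Show a` instance available,
-- which is what allows `show x` to typecheck here.
render :: Showable -> String
render (MkShowable x) = show x

render' :: Showable' -> String
render' (MkShowable' x) = show x

main :: IO ()
main = do
  putStrLn (unwords (map render [MkShowable (1 :: Int), MkShowable True]))
  putStrLn (unwords (map render' [MkShowable' "hi", MkShowable' 'c']))
```

<p>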
Both syntaxes allow packing away a typeclass dictionary or equality constraint that is brought into scope upon a successful pattern-match against a data constructor, something that is sometimes quite useful but certainly a departure from Haskell’s simple ADTs.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFunctionalDependencies"><code>FunctionalDependencies</code></a> extend multi-parameter typeclasses, and they are almost unavoidable, given their use in the venerable <code>mtl</code> library. Like <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a>, <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFunctionalDependencies"><code>FunctionalDependencies</code></a> add an additional layer of complexity to the typeclass system in order to express certain things that would otherwise be difficult or impossible.</p><p>All of these extensions involve a tradeoff. Enabling <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a> also implies <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMonoLocalBinds"><code>MonoLocalBinds</code></a>, which disables let generalization, one of the most likely ways a program that used to typecheck might subsequently fail to do so. 
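</p><p>To make that tradeoff a little more concrete, here is a small sketch of the kind of local binding that is affected. The names <code>tagBoth</code> and <code>pair</code> are made up for this example. Because <code>pair</code> mentions <code>x</code>, which is bound by the enclosing function, <code>MonoLocalBinds</code> does not generalize it automatically, so the explicit signature is what lets it be used at two different types:</p>

```haskell
{-# LANGUAGE MonoLocalBinds #-}

module Main where

-- Hypothetical example function. `pair` is used at both Bool and Char.
tagBoth :: Bool -> Char -> ((Bool, Bool), (Bool, Char))
tagBoth x c = (pair True, pair c)
  where
    -- `pair` refers to `x` from the enclosing scope, so with MonoLocalBinds
    -- it is not generalized on its own. Without this signature, its inferred
    -- type would be monomorphic and the two uses above would conflict.
    pair :: a -> (Bool, a)
    pair y = (x, y)

main :: IO ()
main = print (tagBoth False 'c')
```

<p>Without <code>MonoLocalBinds</code>, the signature on <code>pair</code> could be dropped and the definition would still typecheck. 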
Some might argue that this is a good reason to enable <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a> on a per-module basis, but I disagree: I actually want my language to be fairly consistent, and given that I know I am likely going to want to use <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XGADTs"><code>GADTs</code></a> <em>somewhere</em>, I want <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XMonoLocalBinds"><code>MonoLocalBinds</code></a> enabled <em>everywhere</em>, not inconsistently and sporadically.</p><p>That aside, all these extensions are relatively safe. They are well understood, and they are fairly self-contained extensions to the Haskell type system. I think these extensions have a very good power-to-cost ratio, and I find myself using them regularly (especially <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XFunctionalDependencies"><code>FunctionalDependencies</code></a>), so I keep them enabled globally.</p><h3><a name="advanced-extensions-to-the-haskell-type-system"></a>Advanced extensions to the Haskell type system</h3><p>Finally, we arrive at the last set of extensions in this list. 
These are the most advanced features Haskell’s type system currently has to offer, and they are likely to be the most controversial to enable globally:</p><ul><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDataKinds"><code>DataKinds</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilies"><code>TypeFamilies</code></a></p></li><li><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilyDependencies"><code>TypeFamilyDependencies</code></a></p></li></ul><p>All of these extensions exist exclusively for the purpose of type-level programming. <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDataKinds"><code>DataKinds</code></a> allows datatype promotion, creating types that are always uninhabited and therefore can only be used as phantom types. <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilies"><code>TypeFamilies</code></a> allows the definition of type-level functions that map types to other types. Both of these are minor extensions to Haskell’s surface area, but they have rather significant ramifications for the sort of programming that can be done and the way GHC’s typechecker must operate.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilies"><code>TypeFamilies</code></a> is an interesting extension because it comes in so many flavors: associated type synonyms, associated datatypes, open and closed type synonym families, and (always open) datatype families. Associated types tend to be easier to grok and easier to use, though they can also be replaced by functional dependencies. Open type families are also quite similar to classes and instances, so they aren’t <em>too</em> tricky to understand. 
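</p><p>For a sense of how the first two flavors read in practice, here is a minimal sketch. Everything in it (<code>Container</code>, <code>Elem</code>, <code>Unwrapped</code>, <code>orDefault</code>) is a hypothetical name chosen for illustration, not something from an existing library:</p>

```haskell
{-# LANGUAGE TypeFamilies #-}

module Main where

-- An associated type synonym: each instance of the (hypothetical) Container
-- class chooses its own element type.
class Container f where
  type Elem f
  empty  :: f
  insert :: Elem f -> f -> f
  elems  :: f -> [Elem f]

instance Container [a] where
  type Elem [a] = a
  empty  = []
  insert = (:)
  elems  = id

-- An open type family: like class instances, new equations can be added
-- later, even from other modules.
type family Unwrapped a
type instance Unwrapped (Maybe a) = a
type instance Unwrapped [a] = a

-- `Unwrapped (Maybe a)` reduces to `a`, so this is just `a -> Maybe a -> a`.
orDefault :: Unwrapped (Maybe a) -> Maybe a -> a
orDefault def Nothing  = def
orDefault _   (Just x) = x

main :: IO ()
main = do
  print (elems (insert 1 (insert 2 (empty :: [Int]))))
  print (orDefault 'x' Nothing)
```

<p>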
Closed type families, on the other hand, are a rather different beast, and they can be used to do fairly advanced things, <em>especially</em> in combination with <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XDataKinds"><code>DataKinds</code></a>.</p><p>I happen to appreciate GHC’s support for these features, and while I’m hopeful that an eventual <code>DependentHaskell</code> will alleviate many of the existing infelicities with dependently typed programming in GHC, in the meantime, it’s often useful to enjoy what exists where practically applicable. Therefore, I have little problem keeping them enabled, since, like the vast majority of extensions on this list, these extensions merely lift restrictions, not adjust semantics of the language without the extensions enabled. When I am going to write a type family, I am going to turn on <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilies"><code>TypeFamilies</code></a>; I see no reason to annotate the modules in which I decide to do so. I do not write an annotation at the top of each module in which I define a typeclass or a datatype, so why should I do so with type families?</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilyDependencies"><code>TypeFamilyDependencies</code></a> is a little bit different, since it’s a very new extension, and it doesn’t seem to always work as well as I would hope. 
Still, when it doesn’t work, it fails with a very straightforward error message, and when it works, it is legitimately useful, so I don’t see any real reason to leave it off if <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTypeFamilies"><code>TypeFamilies</code></a> is enabled.</p><h3><a name="extensions-intentionally-left-off-this-list"></a>Extensions intentionally left off this list</h3><p>Given what I’ve said so far, it may seem like I would advocate flipping on absolutely every lever GHC has to offer, but that isn’t actually true. There are a few extensions I quite intentionally do <em>not</em> enable.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XUndecidableInstances"><code>UndecidableInstances</code></a> is something I turn on semi-frequently, since GHC’s termination heuristic is not terribly advanced, but I turn it on per-module, since it’s useful to know when it’s necessary (and in application code, it rarely is). <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XOverlappingInstances"><code>OverlappingInstances</code></a> and <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XIncoherentInstances"><code>IncoherentInstances</code></a>, in contrast, are completely banned—not only are they almost always a bad idea, GHC has a better, more fine-grained way to opt into overlapping instances, using the <code>{-# OVERLAPPING #-}</code>, <code>{-# OVERLAPPABLE #-}</code>, and <code>{-# INCOHERENT #-}</code> pragmas.</p><p><a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTemplateHaskell"><code>TemplateHaskell</code></a> and <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XQuasiQuotes"><code>QuasiQuotes</code></a> are tricky ones. 
Anecdotes seem to suggest that enabling <a href="https://downloads.haskell.org/~ghc/8.2.2/docs/html/users_guide/glasgow_exts.html#ghc-flag--XTemplateHaskell"><code>TemplateHaskell</code></a> everywhere leads to worse compile times, but after trying this on a few projects and measuring, I wasn’t able to detect any meaningful difference. Unless I find evidence that these extensions actually slow down compile times just by being <em>enabled</em>, even if they aren’t used, I may add them to my list of globally enabled extensions, since I use them regularly.</p><p>Other extensions I haven’t mentioned are probably things I just don’t use very often and therefore haven’t felt the need to include on this list. The list certainly isn’t exhaustive, and I add to it all the time, so I expect I will continue to do so in the future. This is just what I have for now, and if your favorite extension isn’t included, it probably isn’t a negative judgement against that extension. I just didn’t think to mention it.</p><h2><a name="libraries-a-field-guide"></a>Libraries: a field guide</h2><p>Now that you’re able to build a Haskell project and have chosen which handpicked flavor of Haskell you are going to write, it’s time to decide which libraries to use. Haskell is an expressive programming language, and the degree to which different libraries can shape the way you structure your code is significant. Picking the right libraries can lead to clean code that’s easy to understand and maintain, but picking the wrong ones can lead to disaster.</p><p>Of course, there are <em>thousands</em> of Haskell libraries on Hackage alone, so I cannot hope to cover all of the ones I have ever found useful, and I certainly cannot cover ones that would be useful but I did not have the opportunity to try (of which there are certainly many). 
This blog post is long enough already, so I’ll just cover a few categories of libraries that I think I can offer interesting commentary on; most libraries can generally speak for themselves.</p><h3><a name="having-an-effect"></a>Having an effect</h3><p>One of the first questions Haskell programmers bump into when they begin working on a large application is how they’re going to model effects. Few practical programming languages are pure, but Haskell is one of them, so there’s no getting away from coming up with a way to manage side-effects.</p><p>For some applications, Haskell’s built-in solution might be enough: <code>IO</code>. This can work decently for data processing programs that do very little I/O and perform only a few kinds of side-effects. For these applications, most of the logic is likely to be pure, which means it’s already easy to reason about and easy to test. For other things, like web applications, it’s more likely that a majority of the program logic is going to be side-effectful by its nature: it may involve making HTTP requests to other services, interacting with a database, and writing to logfiles.</p><p>Figuring out how to structure these effects in a type-safe, decoupled, composable way can be tricky, especially since Haskell has so many different solutions. I could not bring myself to choose just one, but I did choose two: the so-called “<code>mtl</code> style” and freer monads.</p><p><code>mtl</code> style is so named because it is inspired by a technique for modeling effects with constraints, built from interlocking monadic typeclasses and lifting instances, that is used in the <a href="https://hackage.haskell.org/package/mtl"><code>mtl</code></a> library. 
Here is a small code example of what <code>mtl</code> style typeclasses and handlers look like:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="kr">where</span>
<span class="n">readFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">String</span>
<span class="n">writeFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
<span class="kr">default</span> <span class="n">readFile</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadFileSystem</span> <span class="n">m'</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m'</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">String</span>
<span class="n">readFile</span> <span class="n">a</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">readFile</span> <span class="n">a</span>
<span class="kr">default</span> <span class="n">writeFile</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadFileSystem</span> <span class="n">m'</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m'</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
<span class="n">writeFile</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">$</span> <span class="n">writeFile</span> <span class="n">a</span> <span class="n">b</span>
<span class="kr">instance</span> <span class="kt">MonadFileSystem</span> <span class="kt">IO</span> <span class="kr">where</span>
<span class="n">readFile</span> <span class="ow">=</span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">readFile</span>
<span class="n">writeFile</span> <span class="ow">=</span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">writeFile</span>
<span class="kr">instance</span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">MaybeT</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">newtype</span> <span class="kt">InMemoryFileSystemT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">InMemoryFileSystemT</span> <span class="p">(</span><span class="kt">StateT</span> <span class="p">[(</span><span class="kt">FilePath</span><span class="p">,</span> <span class="kt">String</span><span class="p">)]</span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span>
<span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">,</span> <span class="kt">MonadError</span> <span class="n">e</span><span class="p">,</span> <span class="kt">MonadReader</span> <span class="n">r</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">w</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">InMemoryFileSystemT</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">readFile</span> <span class="n">path</span> <span class="ow">=</span> <span class="kt">InMemoryFileSystemT</span> <span class="o">$</span> <span class="kr">do</span>
<span class="n">vfs</span> <span class="ow"><-</span> <span class="n">get</span>
<span class="kr">case</span> <span class="n">lookup</span> <span class="n">path</span> <span class="n">vfs</span> <span class="kr">of</span>
<span class="kt">Just</span> <span class="n">contents</span> <span class="ow">-></span> <span class="n">pure</span> <span class="n">contents</span>
<span class="kt">Nothing</span> <span class="ow">-></span> <span class="ne">error</span> <span class="p">(</span><span class="s">"readFile: no such file "</span> <span class="o">++</span> <span class="n">path</span><span class="p">)</span>
<span class="n">writeFile</span> <span class="n">path</span> <span class="n">contents</span> <span class="ow">=</span> <span class="kt">InMemoryFileSystemT</span> <span class="o">$</span> <span class="n">modify</span> <span class="o">$</span> <span class="nf">\</span><span class="n">vfs</span> <span class="ow">-></span>
<span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">contents</span><span class="p">)</span> <span class="kt">:</span> <span class="n">delete</span> <span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">contents</span><span class="p">)</span> <span class="n">vfs</span></code></pre><p>This is the most prevalent way to abstract over effects in Haskell, and it’s been around for a long time. Due to the way it uses the typeclass system, it’s also very fast, since GHC can often specialize and inline the typeclass dictionaries to avoid runtime dictionary passing. The main drawbacks are the amount of boilerplate required and the conceptual difficulty of understanding exactly how monad transformers, monadic typeclasses, and lifting instances all work together to discharge <code>mtl</code> style constraints.</p><p>There are various alternatives to <code>mtl</code>’s direct approach to effect composition, most of which are built around the idea of reifying a computation as a data structure and subsequently interpreting it. The most popular of these is the <code>Free</code> monad, a clever technique for deriving a monad from a functor that happens to be useful for modeling programs. Personally, I think <code>Free</code> is overhyped. It’s a cute, mathematically elegant technique, but it involves a lot of boilerplate, and composing effect algebras is still a laborious process. The additional expressive power of <code>Free</code>, namely its ability to choose an interpreter dynamically, at runtime, is rarely necessary or useful, and it adds complexity and reduces performance for few benefits. 
(And in fact, this is still possible to do with <code>mtl</code> style; it’s just uncommon because there is rarely any need to do so.)</p><p>A 2017 blog post entitled <a href="https://markkarpov.com/post/free-monad-considered-harmful.html">Free monad considered harmful</a> discussed <code>Free</code> in comparison with <code>mtl</code> style, and unsurprisingly cast <code>Free</code> in a rather unflattering light. I largely agree with everything outlined in that blog post, so I will not retread its arguments here. I do, however, think that there is another abstraction that <em>is</em> quite useful: the so-called “freer monad” used to implement extensible effects.</p><p>Freer moves even further away from worrying about functors and monads, since its effect algebras do not even need to be functors. Instead, freer’s effect algebras are ordinary GADTs, and reusable, composable effect handlers are easily written to consume elements of these datatypes. Unfortunately, the way this works means that GHC is still not clever enough to optimize freer monads as efficiently as <code>mtl</code> style, since it can’t easily detect when the interpreter is chosen statically and use that information to specialize and inline effect implementations. Still, the cost difference is significantly reduced, and I’ve found that in real application code, the vast majority of the cost does not come from the extra overhead introduced by a more expensive <code>(>>=)</code>.</p><p>There are a few different implementations of freer monads, but sadly, I was not satisfied with any of them, so I decided to contribute to the problem by creating yet another one. My implementation is called <a href="https://hackage.haskell.org/package/freer-simple"><code>freer-simple</code></a>, and it includes a streamlined API with <a href="https://hackage.haskell.org/package/freer-simple-1.0.1.1/docs/Control-Monad-Freer.html">more documentation than any other freer implementation</a>. 
Writing the above <code>mtl</code> style example using <code>freer-simple</code> is more straightforward:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">FileSystem</span> <span class="n">r</span> <span class="kr">where</span>
<span class="kt">ReadFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">FileSystem</span> <span class="kt">String</span>
<span class="kt">WriteFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="kt">FileSystem</span> <span class="nb">()</span>
<span class="nf">readFile</span> <span class="ow">::</span> <span class="kt">Member</span> <span class="kt">FileSystem</span> <span class="n">r</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">Eff</span> <span class="n">r</span> <span class="kt">String</span>
<span class="nf">readFile</span> <span class="n">a</span> <span class="ow">=</span> <span class="n">send</span> <span class="o">$</span> <span class="kt">ReadFile</span> <span class="n">a</span>
<span class="nf">writeFile</span> <span class="ow">::</span> <span class="kt">Member</span> <span class="kt">FileSystem</span> <span class="n">r</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="kt">Eff</span> <span class="n">r</span> <span class="nb">()</span>
<span class="nf">writeFile</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">=</span> <span class="n">send</span> <span class="o">$</span> <span class="kt">WriteFile</span> <span class="n">a</span> <span class="n">b</span>
<span class="nf">runFileSystemIO</span> <span class="ow">::</span> <span class="kt">LastMember</span> <span class="kt">IO</span> <span class="n">r</span> <span class="ow">=></span> <span class="kt">Eff</span> <span class="p">(</span><span class="kt">FileSystem</span> <span class="sc">'</span><span class="err">: r) ~> Eff r</span>
<span class="nf">runFileSystemIO</span> <span class="ow">=</span> <span class="n">interpretM</span> <span class="o">$</span> <span class="nf">\</span><span class="kr">case</span>
<span class="kt">ReadFile</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">readFile</span> <span class="n">a</span>
<span class="kt">WriteFile</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">-></span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">writeFile</span> <span class="n">a</span> <span class="n">b</span>
<span class="nf">runFileSystemInMemory</span> <span class="ow">::</span> <span class="p">[(</span><span class="kt">FilePath</span><span class="p">,</span> <span class="kt">String</span><span class="p">)]</span> <span class="ow">-></span> <span class="kt">Eff</span> <span class="p">(</span><span class="kt">FileSystem</span> <span class="sc">'</span><span class="err">: effs) ~> Eff effs</span>
<span class="nf">runFileSystemInMemory</span> <span class="n">initVfs</span> <span class="ow">=</span> <span class="n">evalState</span> <span class="n">initVfs</span> <span class="o">.</span> <span class="n">fsToState</span> <span class="kr">where</span>
<span class="n">fsToState</span> <span class="ow">::</span> <span class="kt">Eff</span> <span class="p">(</span><span class="kt">FileSystem</span> <span class="sc">'</span><span class="err">: effs) ~> Eff (State [(FilePath, String)]</span><span class="sc"> '</span><span class="kt">:</span> <span class="n">effs</span><span class="p">)</span>
<span class="n">fsToState</span> <span class="ow">=</span> <span class="n">reinterpret</span> <span class="o">$</span> <span class="nf">\</span><span class="kr">case</span>
<span class="kt">ReadFile</span> <span class="n">path</span> <span class="ow">-></span> <span class="n">get</span> <span class="o">>>=</span> <span class="nf">\</span><span class="n">vfs</span> <span class="ow">-></span> <span class="kr">case</span> <span class="n">lookup</span> <span class="n">path</span> <span class="n">vfs</span> <span class="kr">of</span>
<span class="kt">Just</span> <span class="n">contents</span> <span class="ow">-></span> <span class="n">pure</span> <span class="n">contents</span>
<span class="kt">Nothing</span> <span class="ow">-></span> <span class="ne">error</span> <span class="p">(</span><span class="s">"readFile: no such file "</span> <span class="o">++</span> <span class="n">path</span><span class="p">)</span>
<span class="kt">WriteFile</span> <span class="n">path</span> <span class="n">contents</span> <span class="ow">-></span> <span class="n">modify</span> <span class="o">$</span> <span class="nf">\</span><span class="n">vfs</span> <span class="ow">-></span>
<span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">contents</span><span class="p">)</span> <span class="kt">:</span> <span class="n">delete</span> <span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">contents</span><span class="p">)</span> <span class="n">vfs</span></code></pre><p>(It could be simplified further with a little bit of Template Haskell to generate the <code>readFile</code> and <code>writeFile</code> function definitions, but I haven’t gotten around to writing that.)</p><p>So which effect system do I recommend? I used to recommend <code>mtl</code> style, but as of only two months ago, I now recommend <code>freer-simple</code>. It’s easier to understand, involves less boilerplate, achieves “good enough” performance, and generally gets out of the way wherever possible. Its API is designed to make it easy to do the sorts of things you most commonly need to do, and it provides a core set of effects that can be used to build a real-world application.</p><p>That said, freer is undeniably relatively new and relatively untested. It has success stories, but <code>mtl</code> style is still the approach used by the majority of the ecosystem. <code>mtl</code> style has more library support, its performance characteristics are better understood, and it is a tried-and-true way to structure effects in a Haskell application. If you understand it well enough to use it, and you are happy with it in your application, my recommendation is to stick with it. If you find it confusing, however, or you end up running up against its limits, give <code>freer-simple</code> a try.</p><h3><a name="through-the-looking-glass-to-lens-or-not-to-lens"></a>Through the looking glass: to lens or not to lens</h3><p>There’s no getting around it: <a href="https://hackage.haskell.org/package/lens"><code>lens</code></a> is a behemoth of a library. 
For a long time, I wrote Haskell without it, and honestly, it worked out alright. I just wasn’t doing a whole lot of work that involved complicated, deeply-nested data structures, and I didn’t feel the need to bring in a library with such a reputation for having impenetrable operators and an almost equally impenetrable learning curve.</p><p>But, after some time, I decided I wanted to take the plunge. So I braced myself for the worst, pulled out my notebook, and started writing some code. To my surprise… it wasn’t that hard. It made sense. Sure, I still don’t know how it works on the inside, and I never did learn the majority of the exports in <code>Control.Lens.Operators</code>, but I had no need to. Lenses were useful in the way I had expected them to be, and so were prisms. One thing led to another, and before long, I understood the relationship between the various optics, the most notable additions to my toolkit being folds and traversals. Sure, the type errors were completely opaque much of the time, but I was able to piece things together with ample type annotations and time spent staring at ill-typed expressions. Before long, I had developed an intuition for <code>lens</code>.</p><p>After using it for a while, I retrospected on whether or not I liked it, and honestly, I still can’t decide. 
Some lensy expressions were straightforward to read and were a pleasant simplification, like this one:</p><pre><code class="pygments"><span class="nf">paramSpecs</span> <span class="o">^..</span> <span class="n">folded</span><span class="o">.</span><span class="n">_Required</span></code></pre><p>Others were less obviously improvements, such as this beauty:</p><pre><code class="pygments"><span class="kt">M</span><span class="o">.</span><span class="n">fromList</span> <span class="o">$</span> <span class="n">paramSpecs</span> <span class="o">^..</span> <span class="n">folded</span><span class="o">.</span><span class="n">_Optional</span><span class="o">.</span><span class="n">filtered</span> <span class="p">(</span><span class="n">has</span> <span class="o">$</span> <span class="n">_2</span><span class="o">.</span><span class="n">_UsePreviousValue</span><span class="p">)</span></code></pre><p>But operator soup aside, there was something deeper about <code>lens</code> that bothered me, and I just wasn’t sure what. I didn’t know how to articulate my vague feelings until I read a 2014 blog post entitled <a href="https://ro-che.info/articles/2014-04-24-lens-unidiomatic">Lens is unidiomatic Haskell</a>, which includes a point that I think is spot-on:</p><blockquote><p>Usually, types in Haskell are rigid. This leads to a distinctive style of composing programs: look at the types and see what fits where. This is impossible with <code>lens</code>, which takes overloading to the level mainstream Haskell probably hasn’t seen before.</p><p>We have to learn the new language of the <code>lens</code> combinators and how to compose them, instead of enjoying our knowledge of how to compose Haskell functions. 
Formally, <code>lens</code> types are Haskell function types, but while with ordinary Haskell functions you immediately see from types whether they can be composed, with <code>lens</code> functions this is very hard in practice.</p><p>[…]</p><p>Now let me clarify that this doesn’t necessarily mean that <code>lens</code> is a bad library. It’s an <em>unusual</em> library. It’s almost a separate language, with its own idioms, embedded in Haskell.</p></blockquote><p>The way <code>lens</code> structures its types deliberately introduces a sort of subtyping relationship—for example, all lenses are traversals and all traversals are folds, but not vice versa—and indeed, knowing this subtyping relationship is essential to working with the library and understanding how to use it. It is helpfully documented with a large diagram on <a href="https://hackage.haskell.org/package/lens">the <code>lens</code> package overview page</a>, and that diagram was most definitely an invaluable resource for me when I was learning how to use the library.</p><p>On the surface, this isn’t unreasonable. Subtyping is an enormously useful concept! The only reason Haskell dispenses with it entirely is because it makes type inference notoriously difficult. The subtyping relation between optics is one of the things that makes them so useful, since it allows you to easily compose a lens with a prism and get a traversal out. Unfortunately, the downside of all this is that Haskell does not truly have subtyping, so all of <code>lens</code>’s “types” really must be type aliases for types of roughly the same shape, namely functions. 
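</p><p>To make the “type aliases” point concrete, here is a minimal sketch of simplified optic aliases. It assumes only <code>base</code> and the <code>RankNTypes</code> extension; the names <code>first'</code>, <code>each'</code>, <code>firstOfEach</code>, <code>toListOf'</code>, and <code>over'</code> are hypothetical stand-ins invented for illustration, and the real <code>lens</code> definitions are considerably more general.</p><pre><code>{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Simplified aliases (an assumption for illustration, not the library's
-- actual definitions): both are just function types that differ only in
-- the class constraint placed on f.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s
type Traversal s a = forall f. Applicative f => (a -> f a) -> s -> f s

-- A lens focusing on the first component of a pair.
first' :: Lens (a, c) a
first' f (a, c) = fmap (\a' -> (a', c)) (f a)

-- A traversal visiting every element of a list.
each' :: Traversal [a] a
each' = traverse

-- Ordinary function composition of a traversal with a lens yields a
-- traversal. The "subtyping" is nothing more than the fact that every
-- Applicative is also a Functor.
firstOfEach :: Traversal [(a, c)] a
firstOfEach = each' . first'

-- Collect every target by instantiating f to Const [a].
toListOf' :: Traversal s a -> s -> [a]
toListOf' t = getConst . t (\a -> Const [a])

-- Modify every target by instantiating f to Identity.
over' :: Traversal s a -> (a -> a) -> s -> s
over' t g = runIdentity . t (Identity . g)</code></pre><p>Under these assumptions, <code>toListOf' firstOfEach [(1, 'a'), (2, 'b')]</code> evaluates to <code>[1, 2]</code>, and <code>first'</code> can be passed directly to <code>over'</code> even though its signature says <code>Lens</code>. Nothing distinguishes a “lens” from a “traversal” here except the class constraint inside an alias, and GHC is free to expand that alias whenever it reports a type.</p><p>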
This makes type errors completely <em>baffling</em>, since the errors do not mention the aliases, only the fully-expanded types (which are often rather complicated, and their meaning is not especially clear without knowing how <code>lens</code> works under the hood).</p><p>So the above quote is correct: working with <code>lens</code> really <em>is</em> like working in a separate embedded language, but I’m usually okay with that. Embedded, domain-specific languages are good! Unfortunately, in this case, the host language is not very courteous to its guest. Haskell does not appear to be a powerful enough language for <code>lens</code> to be a language in its own right, so it must piggyback on top of Haskell’s error reporting mechanisms, which are insufficient for <code>lens</code> to be a cohesive linguistic abstraction. Just as debugging code by stepping through the assembly it produces (or, perhaps more relevant in 2018, debugging a compile-to-JS language by looking at the emitted JavaScript instead of the source code) makes for an unacceptably leaky language, so does debugging <code>lens</code> code by reading fully-expanded types. We would never stand for such a thing in our general-purpose language tooling, and we should demand better even in our embedded languages.</p><p>That said, <code>lens</code> is just too useful to ignore. It is a hopelessly leaky abstraction, but it’s still an abstraction, and a powerful one at that. Given my selection of default extensions as evidence, I think it’s clear I have zero qualms with “advanced” Haskell; I will happily use even <code>singletons</code> where it makes sense. Haskell’s various language extensions are sometimes confusing in their own right, but their complexity is usually fundamental to the expressive power they bring. <code>lens</code> has some fundamental complexity, too, but it is mostly difficult for the wrong reasons. 
Still, while it is not the first library I reach for on every new Haskell project, manipulating nested data without <code>lens</code> is just too unpleasant after tasting the nectar, so I can’t advise against it in good faith.</p><p>Sadly, this means I’m a bit wishy-washy when it comes to using <code>lens</code>, but I do have at least one recommendation: if you decide to use <code>lens</code>, it’s better to go all-in. Don’t generate lenses for just a handful of datatypes, do it for <em>all</em> of them. You can definitely stick to a subset of the <code>lens</code> library’s features, but don’t apply it in some functions but not others. Having too many different, equally valid ways of doing things leads to confusion and inconsistency, and inconsistency minimizes code reuse and leads to duplication and spaghetti. Commit to using <code>lens</code>, or don’t use it at all.</p><h3><a name="mitigating-the-string-problem"></a>Mitigating the string problem</h3><p>Finally, Haskell has a problem with strings. Namely, <code>String</code> is a type alias for <code>[Char]</code>, a lazy, singly linked list of characters, which is an awful representation of text. Fortunately, the answer to this problem is simple: ban <code>String</code> in your programs.</p><p>Use <code>Text</code> everywhere. I don’t really care if you pick strict <code>Text</code> or lazy <code>Text</code>, but pick one and stick to it. Don’t ever use <code>String</code>, and <em>especially</em> don’t ever, <em>ever</em>, <em><strong>ever</strong></em> use <code>ByteString</code> to represent text! There are enormously few legitimate cases for using <code>ByteString</code> in a program that is not explicitly about reading or writing raw data, and even at that level, <code>ByteString</code> should only be used at program boundaries. 
In that sense, I treat <code>ByteString</code> much the same way I treat <code>IO</code>: push it to the boundaries of your program.</p><p>One of Haskell’s core tenets is making illegal states unrepresentable. Strings are not especially useful datatypes for this, since they are sequences of arbitrary length made up of atoms that can be an enormously large number of different things. Still, string types enforce a very useful invariant, a notion of a sequence of human-readable characters. In the presence of Unicode, this is a more valuable abstraction than it might seem, and the days of treating strings as little different from sequences of bytes are over. While strings make a poor replacement for enums, they are quite effective at representing the incredible amount of text humans produce in a staggeringly large number of languages, and they are the right type for that job.</p><p><code>ByteString</code>, on the other hand, is essentially never the right type for any job. If a type classifies a set of values, <code>ByteString</code> is no different from <code>Any</code>. It is the structureless type, the all-encompassing blob of bits. A <code>ByteString</code> could hold anything at all—some text, an image, an executable program—and the type system certainly isn’t going to help to answer that question. The only use case I can possibly imagine for passing around a <code>ByteString</code> in your program rather than decoding it into a more precise type is if it truly holds opaque data, e.g. some sort of token or key provided by a third party with no structure guaranteed whatsoever. Still, even this should be wrapped in a <code>newtype</code> so that the type system enforces this opaqueness.</p><p>Troublingly, <code>ByteString</code> shows up in many libraries’ APIs where it has no business being. In many cases, this seems to be things where ASCII text is expected, but this is hardly a good reason to willingly accept absolutely anything and everything! 
Make an <code>ASCII</code> type that forbids non-ASCII characters, and provide a <code>ByteString -> Maybe ASCII</code> function. Alternatively, think harder about your problem in question to properly support Unicode as you almost certainly ought to.</p><p>Other places <code>ByteString</code> appears are similarly unfortunate. Base-64 encoding, for example, could be given the wonderfully illustrative type <code>ByteString -> Text</code>, or even <code>ByteString -> ASCII</code>! Such a type makes it immediately clear why base-64 is useful: it allows transforming arbitrary binary data into a reliable textual encoding. If we consider that <code>ByteString</code> is essentially <code>Any</code>, this function has the type <code>Any -> ASCII</code>, which is amazingly powerful! We can convert <em>anything</em> to ASCII text!</p><p>Existing libraries, however, just provide the boring, disappointingly inaccurate type <code>ByteString -> ByteString</code>, which is one of the most useless types there is. It is essentially <code>Any -> Any</code>, the meaningless function type. It conveys nothing about what it does, other than that it is pure. Giving a function this type is scarcely better than dynamic typing. Its mere existence is a failure of Haskell library design.</p><p>But wait, it gets worse! <code>Data.Text.Encoding</code> exports a function called <code>decodeUtf8</code>, which has type <code>ByteString -> Text</code>. What an incredible function with a captivating type! Whatever could it possibly do? Again, this function’s type is basically <code>Any -> Text</code>, which is remarkable in the power it gives us. Let’s try it out, shall we?</p><pre><code>ghci> decodeUtf8 "\xc3\x28"
"*** Exception: Cannot decode byte '\x28': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
</code></pre><p>Oh. Well, that’s a disappointment.</p><p>Haskell’s string problem goes deeper than <code>String</code> versus <code>Text</code>; it seems to have wound its way around the collective consciousness of the Haskell community and made it temporarily forget that it cares about types and totality. This isn’t that hard, I swear! I can only express complete befuddlement at how many of these APIs are just completely worthless.</p><p>Fortunately, there is a way out, and that way out is <a href="https://hackage.haskell.org/package/text-conversions"><code>text-conversions</code></a>. It is the first Haskell library I ever wrote. It provides <em>type safe</em>, <em>total</em> conversions between <code>Text</code> and various other types, and it is encoding aware. It provides appropriately-typed base-16 and base-64 conversion functions, and is guaranteed to never raise any exceptions. Use it, and apply the Haskell philosophy to your strings, just as you already do for everything else in your program.</p><h2><a name="closing-thoughts"></a>Closing thoughts</h2><p><em>Phew.</em></p><p>When I started writing this blog post, it used the phrase “short overview” in the introduction. It is now over ten thousand words long. I think that’s all I have it in me to say for now.</p><p>Haskell is a wonderful language built by a remarkable group of people. Its community is often fraught with needlessly inflammatory debates about things like the value of codes of conduct, the evils of Hackage revisions, and precisely how much or how little people ought to care about the monad laws. These flame wars frustrate me to no end, and they sometimes go so far as to make me ashamed to call myself a part of the Haskell community. 
Many on the “outside” seem to view Haskellers as an elitist, mean-spirited cult, more interested in creating problems for itself than solving them.</p><p>That perception is categorically wrong.</p><p>I have never been in a community of programmers so dedicated and passionate about applying thought and rigor to building software, then going out and <em>actually doing it</em>. I don’t know anywhere else where a cutting-edge paper on effect systems is discussed by the very same people who are figuring out how to reliably deploy distributed services to AWS. Some people view the Haskell community as masturbatory, and to some extent, they are probably right. One of my primary motivators for writing Haskell is that it is fun and it challenges me intellectually in ways that other languages don’t. But that challenge is not a sign of uselessness, it is a sign that Haskell is <em>so close</em> to letting me do the right thing, to solving the problem the right way, to letting me work without compromises. When I write in most programming languages, I must constantly accept that my program will never be robust in all the ways I want it to be, and I might as well give up before I even start. Haskell’s greatest weakness is that it tempts me to try.</p><p>Haskell is imperfect, as it will always be. I doubt I will ever be satisfied by any language or any ecosystem. There will always be more to learn, more to discover, better tools and abstractions to develop. Many of them will not look anything like Haskell; they may not involve formal verification or static types or effect systems at all. Perhaps live programming, structural editors, and runtime hotswapping will finally take over the world, and we will find that the problems we thought we were solving were irrelevant to begin with. 
I can’t predict the future, and while I’ve found great value in the Haskell school of program construction, I dearly hope that we do not develop such tunnel vision that we cannot see that there may be other ways to solve these problems. Many of the solutions are things we likely have not even begun to think about. Still, whether that happens or not, it is clear to me that Haskell is a point in the design space unlike any other, and we learn almost as much from the things it gets wrong as we do from the things it gets right.</p><p>It’s been a wonderful two years, Haskell. I won’t be a stranger.</p><ol class="footnotes"></ol></article>A space of their own: adding a type namespace to Hackett2017-10-27T00:00:00Z2017-10-27T00:00:00ZAlexis King<article><p>As previously discussed on this blog, <a href="https://github.com/lexi-lambda/hackett">my programming language, Hackett</a>, is a fusion of two languages, Haskell and Racket. What happens when two distinctly different programming languages collide? Hackett recently faced that very problem when it came to the question of namespacing: Haskell has two namespaces, one for values and another for types, but Racket is a staunch Lisp-1 with a single namespace for all bindings. Which convention should Hackett adopt?</p><p>For now, at least, the answer is that Hackett will emulate Haskell: <strong>Hackett now has two namespaces</strong>. Of course, Hackett is embedded in Racket, so what did it take to add an entirely new namespace to a language that possesses only one? The answer was a little more than I had hoped, but it was still remarkably simple given the problem: after two weeks of hacking, I’ve managed to get something working.</p><h2><a name="why-two-namespaces"></a>Why two namespaces?</h2><p>Before delving into the mechanics of how multi-namespace Hackett is implemented, it’s important to understand what Hackett’s namespaces actually are and why they exist in the first place. 
Its host language, Racket, is a descendant of Scheme, a Lisp derivative that famously chose to only use a single namespace. This means everything—from values to functions to classes—lives in a single namespace in Racket.</p><p>This is in stark contrast to Common Lisp, which opts to divide bindings into many namespaces, most notably pushing functions into a separate namespace from other variables. You can see this difference most strikingly when applying higher-order functions. In Racket, Clojure, and Scheme, functions can be passed freely as values:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="nb">map</span> <span class="nb">first</span> <span class="o">'</span><span class="p">((</span><span class="mi">1</span> <span class="ss">a</span><span class="p">)</span> <span class="p">(</span><span class="mi">2</span> <span class="ss">b</span><span class="p">)</span> <span class="p">(</span><span class="mi">3</span> <span class="ss">c</span><span class="p">)))</span>
<span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span></code></pre><p>In Common Lisp and other languages with two namespaces, functions may still be passed as values, but the programmer must explicitly <em>annotate</em> when they wish to use a value from a different namespace:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="nb">mapcar</span> <span class="nf">#'</span><span class="nb">car</span> <span class="o">'</span><span class="p">((</span><span class="mi">1</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="mi">2</span> <span class="nv">b</span><span class="p">)</span> <span class="p">(</span><span class="mi">3</span> <span class="nv">c</span><span class="p">)))</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span></code></pre><p>The Common Lisp <code>#'x</code> reader abbreviation is equivalent to <code>(function x)</code>, and <code>function</code> is a special form that references a value in the function namespace.</p><p>While this distinction is somewhat arbitrary, it is generally my belief that the Scheme approach was, indeed, the right one. Runtime values are values, whether they are numbers, strings, or functions, and they ought to all be treated as equal citizens. After all, if a programmer wishes to define their own function-like thing, they should not be forced to make their abstraction a second-class citizen merely because it is slightly different from the built-in notion of a function. Higher-order functional programming encourages treating functions as ordinary values, and an arbitrary stratification of the namespace is antithetical to that mental model.</p><p>However, Hackett is a little different from all of the aforementioned languages because Hackett has <em>types</em>. Types are rather different from runtime values because they do not exist at all at runtime. 
One cannot use a type where a value is expected, nor can one use a value where a type is expected, so this distinction is <em>always</em> syntactically unambiguous.<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup> Even if types and values live in separate namespaces, there is no need for a <code>type</code> form a la CL’s <code>function</code> because it can always be determined implicitly.</p><p>For this reason, it makes a great deal of sense for Hackett to have separate type and value namespaces, permitting declarations such as the following:</p><pre><code class="pygments"><span class="p">(</span><span class="n">data</span> <span class="p">(</span><span class="n">Tuple</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="n">Tuple</span> <span class="n">a</span> <span class="n">b</span><span class="p">))</span></code></pre><p>This defines a binding named <code>Tuple</code> at the type level, which is a <em>type constructor</em> of two arguments that produces a type of kind <code>*</code>,<sup><a href="#footnote-2" id="footnote-ref-2-1">2</a></sup> and another binding named <code>Tuple</code> at the value level, which is a <em>value constructor</em> of two arguments that produces a value of type <code>(Tuple a b)</code>.</p><p>But why do we want to overload names in this way, anyway? How hard would it really be to just name the value constructor <code>tuple</code> instead of <code>Tuple</code>? Well, it wouldn’t be hard at all, if it weren’t for the unpleasant ambiguity such a naming convention introduces when pattern-matching. Consider the following code snippet:</p><pre><code class="pygments"><span class="p">(</span><span class="n">data</span> <span class="n">Foo</span> <span class="n">bar</span> <span class="p">(</span><span class="n">baz</span> <span class="n">Integer</span><span class="p">))</span>
<span class="p">(</span><span class="n">defn</span> <span class="n">foo->integer</span> <span class="n">:</span> <span class="p">{</span><span class="n">Foo</span> <span class="k">-></span> <span class="n">Integer</span><span class="p">}</span>
<span class="p">[[</span><span class="n">bar</span> <span class="p">]</span> <span class="mi">0</span><span class="p">]</span>
<span class="p">[[(</span><span class="n">baz</span> <span class="n">y</span><span class="p">)]</span> <span class="n">y</span><span class="p">])</span></code></pre><p>This works fine. But what happens if the programmer decides to change the name of the <code>bar</code> value?</p><pre><code class="pygments"><span class="p">(</span><span class="n">data</span> <span class="n">Foo</span> <span class="n">qux</span> <span class="p">(</span><span class="n">baz</span> <span class="n">Integer</span><span class="p">))</span>
<span class="p">(</span><span class="n">defn</span> <span class="n">foo->integer</span> <span class="n">:</span> <span class="p">{</span><span class="n">Foo</span> <span class="k">-></span> <span class="n">Integer</span><span class="p">}</span>
<span class="p">[[</span><span class="n">bar</span> <span class="p">]</span> <span class="mi">0</span><span class="p">]</span>
<span class="p">[[(</span><span class="n">baz</span> <span class="n">y</span><span class="p">)]</span> <span class="n">y</span><span class="p">])</span></code></pre><p>Can you spot the bug? Disturbingly, this code <em>still compiles</em>! Even though <code>bar</code> is not a member of <code>Foo</code> anymore, it’s still a valid pattern, since names used as patterns match anything, just as the <code>y</code> pattern matches against any integer inside the <code>baz</code> constructor. If Hackett had a pattern redundancy checker, it could at least hopefully catch this mistake, but as things are, this code would silently compile and do the wrong thing: <code>(foo->integer (baz 42))</code> will still produce <code>0</code>, not <code>42</code>, since the first case always matches.</p><p>Haskell escapes this flaw by syntactically distinguishing between patterns and ordinary bindings by requiring all constructors start with an uppercase letter. This means that programmers often want to define data constructors and type constructors with the same name, such as the <code>Tuple</code> example above, which is illegal if a programming language only supports a single namespace.</p><p>Although Hackett now supports two namespaces, it does not currently enforce this naming convention, but it seems like an increasingly good idea. Separating the namespaces is the biggest hurdle to implementing such a feature, and happily, it is now complete. The <code>Tuple</code> example from above is perfectly legal Hackett.</p><h2><a name="adding-namespaces-to-a-language"></a>Adding namespaces to a language</h2><p>Hopefully, we now agree that it would be nice if Hackett had two namespaces, but that doesn’t really get us any closer to being able to <em>implement</em> such a feature. At its core, Hackett is still a Racket language, and Racket’s binding structure has no notion of namespaces. 
How can it possibly support a language with more than one namespace?</p><p>Fortunately, Racket is no ordinary language—it is a language with a highly formalized notion of lexical scope, and many of its low-level scope control features are accessible to ordinary programmers. Before we get into the details, however, a forewarning: <strong>the remainder of this blog post is <em>highly technical</em>, and some of it involves some of the more esoteric corners of Racket’s macro system</strong>. This blog post is <em>not</em> representative of most macros written in Racket, nor is it at all necessary to understand these things to be a working Racket or Hackett macrologist. It is certainly not a tutorial on any of these concepts, so if you find it intimidating, there is no shame in skipping the rest of this post! If, however, you think you can handle it, or if you simply want to stare into the sun, by all means, read on.</p><h3><a name="namespaces-as-scopes"></a>Namespaces as scopes</h3><p>With that disclaimer out of the way, let’s begin. As of this writing, the current Racket macroexpander uses a scoping model known as <a href="https://www.cs.utah.edu/plt/scope-sets/"><em>sets of scopes</em></a>, which characterizes the binding structure of a program by annotating identifiers with sets of opaque markers known as “scopes”. The details of Racket’s macro system are well outside the scope of this blog post, but essentially, two identifiers with the same name can be made to refer to different bindings by adding a unique scope to each identifier.</p><p>Using this system of scopes, it is surprisingly simple to create a system of two namespaces: we only need to arrange for all identifiers in a value position to have a particular scope, which we will call the <em>value scope</em>, and all identifiers in type position must have a different scope, which we will call the <em>type scope</em>. How do we create these scopes and apply them to identifiers? 
In Racket, we use a function called <a href="https://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._make-syntax-introducer%29%29"><code>make-syntax-introducer</code></a>, which produces a function that encapsulates a fresh scope. This function can be applied to any syntax object (Racket’s structured representation of code that includes lexical binding information) to do one of three things: it can <em>add</em> the scope to all pieces of the syntax object, <em>remove</em> the scope, or <em>flip</em> the scope (that is, add it to pieces of the syntax object that do not have it and remove it from pieces that do have it). In practice, this means we need to call <code>make-syntax-introducer</code> once for each namespace:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="n">value-introducer</span> <span class="p">(</span><span class="nb">make-syntax-introducer</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">type-introducer</span> <span class="p">(</span><span class="nb">make-syntax-introducer</span><span class="p">)))</span></code></pre><p>We define these in a <code>begin-for-syntax</code> block because these definitions will be used in our compile-time macros (aka “phase 1”), not in runtime code (aka “phase 0”). Now, we can write some macros that use these introducer functions to apply the proper scopes to their contents:</p><pre><code class="pygments"><span class="p">(</span><span class="k">require</span> <span class="n">syntax/parse/define</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">begin/value</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">form*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">stx</span><span class="p">)</span> <span class="p">(</span><span class="n">value-introducer</span> <span class="n">stx</span> <span class="o">'</span><span class="ss">add</span><span class="p">))</span>
<span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))</span>
<span class="p">(</span><span class="k">begin</span> <span class="n">form*</span> <span class="k">...</span><span class="p">))</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">begin/type</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">form*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">stx</span><span class="p">)</span> <span class="p">(</span><span class="n">type-introducer</span> <span class="n">stx</span> <span class="o">'</span><span class="ss">add</span><span class="p">))</span>
<span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))</span>
<span class="p">(</span><span class="k">begin</span> <span class="n">form*</span> <span class="k">...</span><span class="p">))</span></code></pre><p>Each of these two forms is like <code>begin</code>, which is a Racket form that is, for our purposes, essentially a no-op, but it applies <code>value-introducer</code> or <code>type-introducer</code> to add the appropriate scope. We can test that this works by writing a program that uses the two namespaces:</p><pre><code class="pygments"><span class="p">(</span><span class="n">begin/value</span>
<span class="p">(</span><span class="k">define</span> <span class="n">x</span> <span class="o">'</span><span class="ss">value-x</span><span class="p">))</span>
<span class="p">(</span><span class="n">begin/type</span>
<span class="p">(</span><span class="k">define</span> <span class="n">x</span> <span class="o">'</span><span class="ss">type-x</span><span class="p">))</span>
<span class="p">(</span><span class="n">begin/value</span>
<span class="p">(</span><span class="nb">println</span> <span class="n">x</span><span class="p">))</span>
<span class="p">(</span><span class="n">begin/type</span>
<span class="p">(</span><span class="nb">println</span> <span class="n">x</span><span class="p">))</span></code></pre><p>This program produces the following output:</p><pre><code>'value-x
'type-x
</code></pre><p>It works! Normally, if you try to define two bindings with the same name in Racket, it will produce a compile-time error, but by assigning them different scopes, we have essentially managed to create two separate namespaces.</p><p>However, although this is close, it isn’t <em>quite</em> right. What happens if we nest the two inside each other?</p><pre><code class="pygments"><span class="p">(</span><span class="n">begin/value</span>
<span class="p">(</span><span class="n">begin/type</span>
<span class="p">(</span><span class="nb">println</span> <span class="n">x</span><span class="p">)))</span></code></pre><pre><code>x: identifier's binding is ambiguous
context...:
#(189267 module) #(189268 module anonymous-module 0) #(189464 use-site)
#(189465 use-site) #(190351 use-site) #(190354 use-site) #(190358 local)
#(190359 intdef)
matching binding...:
#<module-path-index:()>
#(189267 module) #(189268 module anonymous-module 0) #(189464 use-site)
matching binding...:
#<module-path-index:()>
#(189267 module) #(189268 module anonymous-module 0) #(189465 use-site)
</code></pre><p>Oh no! That didn’t work at all. The error is a bit of a scary one, but the top of the error message is essentially accurate: the use of <code>x</code> is <em>ambiguous</em> because it has both scopes on it, so it could refer to either binding. What we really want is for nested uses of <code>begin/value</code> or <code>begin/type</code> to <em>override</em> outer ones, ensuring that a use can only be in a single namespace at a time.</p><p>To do this, we simply need to adjust <code>begin/value</code> and <code>begin/type</code> to remove the other scope in addition to adding the appropriate one:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">begin/value</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">form*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="n">type-introducer</span> <span class="p">(</span><span class="n">value-introducer</span> <span class="n">stx</span> <span class="o">'</span><span class="ss">add</span><span class="p">)</span> <span class="o">'</span><span class="ss">remove</span><span class="p">))</span>
<span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))</span>
<span class="p">(</span><span class="k">begin</span> <span class="n">form*</span> <span class="k">...</span><span class="p">))</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">begin/type</span> <span class="n">form</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="p">[</span><span class="n">form*</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">stx</span><span class="p">)</span>
<span class="p">(</span><span class="n">value-introducer</span> <span class="p">(</span><span class="n">type-introducer</span> <span class="n">stx</span> <span class="o">'</span><span class="ss">add</span><span class="p">)</span> <span class="o">'</span><span class="ss">remove</span><span class="p">))</span>
<span class="p">(</span><span class="n">attribute</span> <span class="n">form</span><span class="p">))</span>
<span class="p">(</span><span class="k">begin</span> <span class="n">form*</span> <span class="k">...</span><span class="p">))</span></code></pre><p>Now our nested program runs, and it produces <code>'type-x</code>, which is exactly what we want—the “nearest” scope wins.</p><p>With just a few lines of code, we’ve managed to implement the two-namespace system Hackett needs: we simply maintain two scopes, one for each namespace, and arrange for all the types to have the type scope applied and everything else to have the value scope applied. Easy, right? Well, not quite. Things start to get a lot more complicated once our programs span more than a single module.</p><h3><a name="namespaces-that-cross-module-boundaries"></a>Namespaces that cross module boundaries</h3><p>The system of using two syntax introducers to manage scopes is wonderfully simple as long as all of our programs are contained within a single module, but obviously, that is never true in practice. It is critical that users are able to export both values and types from one module and import them into another, as that is a pretty fundamental feature of any language. This is, unfortunately, where we start to run into problems.</p><p>Racket’s notion of hygiene is pervasive, but it is still essentially scoped to a single module. This makes sense, since each module conceptually has its own “module scope”, and it wouldn’t be very helpful to inject a binding from a different module with the <em>other</em> module’s scope—it would be impossible to reference the binding in the importing module. Instead, Racket’s modules essentially export <em>symbols</em>, not identifiers (which, in Racket terminology, are symbols packaged together with their lexical scope). When a Racket module provides a binding named <code>foo</code>, there is no other information attached to that binding. 
It does not have any scopes attached to it, since it is the <code>require</code> form’s job to attach the correct scopes to imported identifiers.</p><p>This completely makes sense for all normal uses of the Racket binding system, but it has unfortunate implications for our namespace system: Racket modules cannot export more than one binding with a given symbolic name!<sup><a href="#footnote-3" id="footnote-ref-3-1">3</a></sup> This won’t work at all, since a Hackett programmer might very well want to export a type and value with the same name from a single module. Indeed, this capability is one of the primary <em>points</em> of having multiple namespaces.</p><p>What to do? Sadly, Racket does not have nearly as elegant a solution for this problem, at least not at the time of this writing. Fortunately, hope is not lost. While far from perfect, we can get away with a relatively simple name-mangling scheme to prefix types upon export and unprefix them upon import. Since Racket’s <code>require</code> and <code>provide</code> forms are extensible, it’s even possible to implement this mangling in a completely invisible way.</p><p>Currently, the scheme that Hackett uses is to prefix <code>#%hackett-type:</code> onto the beginning of any type exports. This can be defined in terms of a <a href="https://docs.racket-lang.org/reference/stxtrans.html#%28tech._provide._pre._transformer%29"><em>provide pre-transformer</em></a>, which is essentially a macro that cooperates with Racket’s <code>provide</code> form to control the export process. In this case, we can define our <code>type-out</code> provide pre-transformer in terms of <a href="https://docs.racket-lang.org/reference/require.html#%28form._%28%28lib._racket%2Fprivate%2Fbase..rkt%29._prefix-out%29%29"><code>prefix-out</code></a>, a form built into Racket that allows prefixing the names of exports:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">type-out</span>
<span class="p">(</span><span class="n">make-provide-pre-transformer</span>
<span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">stx</span> <span class="n">modes</span><span class="p">)</span>
<span class="p">(</span><span class="n">syntax-parse</span> <span class="n">stx</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">provide-spec</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="n">pre-expand-export</span>
<span class="o">#`</span><span class="p">(</span><span class="k">prefix-out</span> <span class="n">#%hackett-type:</span>
<span class="o">#,</span><span class="p">(</span><span class="n">type-introducer</span>
<span class="o">#'</span><span class="p">(</span><span class="k">combine-out</span> <span class="n">provide-spec</span> <span class="k">...</span><span class="p">)))</span>
<span class="n">modes</span><span class="p">)]))))</span></code></pre><p>Note that we call <code>type-introducer</code> in this macro! That’s because we want to ensure that, when a user writes <code>(provide (type-out Foo))</code>, we look for <code>Foo</code> in the module’s type namespace. Of course, once it is provided, all that scoping information is thrown away, but we still need it around so that <code>provide</code> knows <em>which</em> <code>Foo</code> is being provided.</p><p>Once we have referenced the correct binding, the use of <code>prefix-out</code> will appropriately add the <code>#%hackett-type:</code> prefix, so the exporting side is already done. Users do need to explicitly write <code>(type-out ....)</code> if they are exporting a particular type-level binding, but this is rarely necessary, since most users use <code>data</code> or <code>class</code> to export datatypes or typeclasses respectively, which can be modified to use <code>type-out</code> internally. Very little user code actually needs to change to support this adjustment.</p><p>Handling imports is, comparatively, tricky. When exporting, we can just force the user to annotate which exports are types, but we don’t have that luxury when importing, since it is merely whether or not a binding has the <code>#%hackett-type:</code> prefix that indicates which namespace it should be imported into. This means we’ll need to explicitly iterate through every imported binding and check if it has the prefix or not. If it does, we need to strip it off and add the type namespace; otherwise, we just pass it through unchanged.</p><p>Just as we extended <code>provide</code> with a provide pre-transformer, we can extend <code>require</code> using a <a href="https://docs.racket-lang.org/reference/stxtrans.html#%28tech._require._transformer%29"><em>require transformer</em></a>. 
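</p><p>Before looking at the transformer itself, it may help to see what an importing module gets when nothing unmangles the names. The following is only a minimal sketch in plain Racket: the <code>shapes</code> submodule, its bindings, and the shortened <code>type:</code> prefix are made up for illustration and are not part of Hackett.</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="c1">; hypothetical names throughout: shapes, circle, Circle, and the type: prefix</span>
<span class="p">(</span><span class="k">module</span> <span class="n">shapes</span> <span class="n">racket</span>
  <span class="p">(</span><span class="k">provide</span> <span class="n">circle</span> <span class="p">(</span><span class="k">prefix-out</span> <span class="n">type:</span> <span class="n">Circle</span><span class="p">))</span>
  <span class="p">(</span><span class="k">define</span> <span class="n">circle</span> <span class="o">'</span><span class="ss">value-level-circle</span><span class="p">)</span>
  <span class="p">(</span><span class="k">define</span> <span class="n">Circle</span> <span class="o">'</span><span class="ss">type-level-Circle</span><span class="p">))</span>
<span class="c1">; the enclosing module imports the submodule declared above</span>
<span class="p">(</span><span class="k">require</span> <span class="o">'</span><span class="ss">shapes</span><span class="p">)</span>
<span class="p">(</span><span class="nb">println</span> <span class="n">circle</span><span class="p">)</span>      <span class="c1">; exported under its own name</span>
<span class="p">(</span><span class="nb">println</span> <span class="n">type:Circle</span><span class="p">)</span> <span class="c1">; only reachable under the prefixed name</span></code></pre><p>In this sketch the importer can reach the second export only under its mangled name, <code>type:Circle</code>, which is the renaming a require transformer has to undo.</p><p>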
In code, this entire process looks like this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">unmangle-type-name</span> <span class="n">name</span><span class="p">)</span>
<span class="p">(</span><span class="n">and~></span> <span class="p">(</span><span class="nb">regexp-match</span> <span class="sr">#rx"^#%hackett-type:(.+)$"</span> <span class="n">name</span><span class="p">)</span> <span class="nb">second</span><span class="p">)))</span>
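<span class="c1">; Illustration only, not part of the original listing: assuming and~> is the</span>
<span class="c1">; short-circuiting threading form from the threading library, the helper gives</span>
<span class="c1">;   (unmangle-type-name "#%hackett-type:Maybe")  evaluates to "Maybe"</span>
<span class="c1">;   (unmangle-type-name "Maybe")                 evaluates to #f</span>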
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">unmangle-types-in</span>
<span class="p">(</span><span class="n">make-require-transformer</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">require-spec</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:do</span> <span class="p">[(</span><span class="k">define-values</span> <span class="p">[</span><span class="n">imports</span> <span class="n">sources</span><span class="p">]</span>
<span class="p">(</span><span class="n">expand-import</span> <span class="o">#'</span><span class="p">(</span><span class="k">combine-in</span> <span class="n">require-spec</span> <span class="k">...</span><span class="p">)))]</span>
<span class="p">(</span><span class="nb">values</span>
<span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="k">match-lambda</span>
<span class="p">[(</span><span class="k">and</span> <span class="n">i</span> <span class="p">(</span><span class="k">import</span> <span class="n">local-id</span> <span class="n">src-sym</span> <span class="n">src-mod-path</span> <span class="n">mode</span> <span class="n">req-mode</span> <span class="n">orig-mode</span> <span class="n">orig-stx</span><span class="p">))</span>
<span class="p">(</span><span class="k">let*</span> <span class="p">([</span><span class="n">local-name</span> <span class="p">(</span><span class="nb">symbol->string</span> <span class="p">(</span><span class="nb">syntax-e</span> <span class="n">local-id</span><span class="p">))]</span>
<span class="p">[</span><span class="n">unmangled-type-name</span> <span class="p">(</span><span class="n">unmangle-type-name</span> <span class="n">local-name</span><span class="p">)])</span>
<span class="p">(</span><span class="k">if</span> <span class="n">unmangled-type-name</span>
<span class="p">(</span><span class="k">let*</span> <span class="p">([</span><span class="n">unmangled-id</span>
<span class="p">(</span><span class="nb">datum->syntax</span> <span class="n">local-id</span>
<span class="p">(</span><span class="nb">string->symbol</span> <span class="n">unmangled-type-name</span><span class="p">)</span>
<span class="n">local-id</span>
<span class="n">local-id</span><span class="p">)])</span>
<span class="p">(</span><span class="k">import</span> <span class="p">(</span><span class="n">type-introducer</span> <span class="n">unmangled-id</span><span class="p">)</span>
<span class="n">src-sym</span> <span class="n">src-mod-path</span> <span class="n">mode</span> <span class="n">req-mode</span> <span class="n">orig-mode</span> <span class="n">orig-stx</span><span class="p">))</span>
<span class="n">i</span><span class="p">))])</span>
<span class="n">imports</span><span class="p">)</span>
<span class="n">sources</span><span class="p">)])))</span></code></pre><p>This is a little intimidating if you are not familiar with the intricacies of Racket’s low-level macro system, but the bulk of the code isn’t as scary as it may seem. It essentially does three things:</p><ol><li><p>It iterates over each import and calls <code>unmangle-type-name</code> on the imported symbol. If the result is <code>#f</code>, that means the import does not have the <code>#%hackett-type:</code> prefix, and it can be safely passed through unchanged.</p></li><li><p>If <code>unmangle-type-name</code> does <em>not</em> return <code>#f</code>, then it returns the unprefixed name, which is then provided to <code>datum->syntax</code>, which allows users to forge new identifiers in an <em>unhygienic</em> (or “hygiene-bending”) way. In this case, we want to forge a new identifier with the name we get back from <code>unmangle-type-name</code>, but with the lexical context of the original identifier.</p></li><li><p>Finally, we pass the new identifier to <code>type-introducer</code> to properly add the type scope, injecting the fresh binding into the type namespace.</p></li></ol><p>With this in place, we now have a way for Hackett users to import and export type bindings, but while it is not much of a burden to write <code>type-out</code> when exporting types, it is unlikely that users will want to write <code>unmangle-types-in</code> around each and every import in their program. For that reason, we can define a slightly modified version of <code>require</code> that implicitly wraps all of its subforms with <code>unmangle-types-in</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="k">provide</span> <span class="p">(</span><span class="k">rename-out</span> <span class="p">[</span><span class="n">require/unmangle</span> <span class="k">require</span><span class="p">]))</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">require/unmangle</span> <span class="n">require-spec</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="n">unmangle-types-in</span> <span class="n">require-spec</span><span class="p">)</span> <span class="k">...</span><span class="p">))</span></code></pre><p>…and we’re done. Now, Hackett modules can properly import and export type-level bindings.</p><h3><a name="namespaces-plus-submodules-the-devil-s-in-the-details"></a>Namespaces plus submodules: the devil’s in the details</h3><p>Up until this point, adding namespaces has required some understanding of the nuances of Racket’s macro system, but it hasn’t been particularly difficult to implement. However, getting namespaces right is a bit trickier than it appears. One area where namespaces are less than straightforward is Racket’s system of <em>submodules</em>.</p><p>Submodules are a Racket feature that allows the programmer to arbitrarily nest modules. Each file always corresponds to a single outer module, but that module can contain an arbitrary number of submodules. Each submodule can have its own “module language”, which even allows different languages to be mixed within a single file.</p><p>Submodules in Racket come in two flavors: <code>module</code> and <code>module*</code>. The difference is what order, semantically, they are defined in. Submodules defined with <code>module</code> are essentially defined <em>before</em> their enclosing module, so they cannot import their enclosing module, but their enclosing module can import them. Modules defined with <code>module*</code> are the logical dual to this: they are defined after their enclosing module, so they can import their enclosing module, but the enclosing module cannot import them.</p><p>How do submodules interact with namespaces? Well, for the most part, they work totally fine. 
This is because submodules are really, for the most part, treated like any other module, so the same machinery that works for ordinary Racket modules works fine with submodules.</p><p>However, there is <a href="https://docs.racket-lang.org/guide/Module_Syntax.html#%28part._submodules%29">a special sort of <code>module*</code> submodule that uses <code>#f</code> in place of a module language</a>, which gives a module access to <em>all</em> of its enclosing module’s bindings, even ones that aren’t exported! This is commonly used to create a <code>test</code> submodule that contains unit tests, and functions can be tested in such a submodule even if they are not part of the enclosing module’s public API:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="c1">; not provided</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">private-add1</span> <span class="n">x</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="n">x</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="k">module*</span> <span class="n">test</span> <span class="no">#f</span>
<span class="p">(</span><span class="k">require</span> <span class="n">rackunit</span><span class="p">)</span>
<span class="p">(</span><span class="n">check-equal?</span> <span class="p">(</span><span class="n">private-add1</span> <span class="mi">41</span><span class="p">)</span> <span class="mi">42</span><span class="p">))</span></code></pre><p>It would be nice to be able to use these sorts of submodules in Hackett, too, but if we try, we’ll find that types from the enclosing module mysteriously can’t be referenced by the submodule. Why? Well, the issue is in how we naïvely create our type and value introducers:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="n">value-introducer</span> <span class="p">(</span><span class="nb">make-syntax-introducer</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">type-introducer</span> <span class="p">(</span><span class="nb">make-syntax-introducer</span><span class="p">)))</span></code></pre><p>Remember that <code>make-syntax-introducer</code> is generative: each time it is called, it produces a function that operates on a fresh scope. This is a problem, since those definitions will be re-evaluated on every module <a href="https://docs.racket-lang.org/reference/eval-model.html#%28tech._instantiate%29">instantiation</a>, as ensured by Racket’s <a href="https://docs.racket-lang.org/reference/eval-model.html#%28part._separate-compilation%29">separate compilation guarantee</a>. This means that each module gets its <em>own</em> pair of scopes, so the body of a <code>module*</code> submodule will have different scopes from its enclosing module, and the enclosing module’s bindings will not be accessible.</p><p>Fortunately, there is a way to circumvent this. While we cannot directly preserve syntax introducers across module instantiations, we <em>can</em> preserve syntax objects by embedding them in the expanded program, and we can attach scopes to syntax objects. Using <a href="https://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._make-syntax-delta-introducer%29%29"><code>make-syntax-delta-introducer</code></a>, we can create a syntax introducer that adds or removes the <em>difference</em> between scopes on two syntax objects. Pairing this with a little bit of clever indirection, we can arrange for <code>value-introducer</code> and <code>type-introducer</code> to always operate on the same scopes on each module instantiation:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">define-value/type-introducers</span>
<span class="n">value-introducer:id</span> <span class="n">type-introducer:id</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">scopeless-id</span> <span class="p">(</span><span class="nb">datum->syntax</span> <span class="no">#f</span> <span class="o">'</span><span class="ss">introducer-id</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">value-id</span> <span class="p">((</span><span class="nb">make-syntax-introducer</span><span class="p">)</span> <span class="o">#'</span><span class="n">scopeless-id</span><span class="p">)</span>
<span class="kd">#:with</span> <span class="n">type-id</span> <span class="p">((</span><span class="nb">make-syntax-introducer</span><span class="p">)</span> <span class="o">#'</span><span class="n">scopeless-id</span><span class="p">)</span>
<span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">define</span> <span class="n">value-introducer</span>
<span class="p">(</span><span class="nb">make-syntax-delta-introducer</span> <span class="o">#'</span><span class="n">value-id</span> <span class="o">#'</span><span class="n">scopeless-id</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="n">type-introducer</span>
<span class="p">(</span><span class="nb">make-syntax-delta-introducer</span> <span class="o">#'</span><span class="n">type-id</span> <span class="o">#'</span><span class="n">scopeless-id</span><span class="p">))))</span>
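<span class="c1">; Note added for illustration: make-syntax-delta-introducer takes the syntax</span>
<span class="c1">; object carrying the extra scopes first and the base syntax object second, so</span>
<span class="c1">; each introducer above operates on exactly the one scope that value-id or</span>
<span class="c1">; type-id has and scopeless-id lacks.</span>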
<span class="p">(</span><span class="n">define-value/type-introducers</span> <span class="n">value-introducer</span> <span class="n">type-introducer</span><span class="p">)</span></code></pre><p>The way this trick works is subtle, but to understand it, it’s important to understand that when a module is compiled, its macro uses are only evaluated once. Subsequent imports of the same module will not re-expand the module. <em>However</em>, code inside <code>begin-for-syntax</code> blocks is still re-evaluated every time the module is instantiated! This means we are <em>not</em> circumventing that re-evaluation directly, we are merely arranging for each re-evaluation to always produce the same result.</p><p>We still use <code>make-syntax-introducer</code> to create our two scopes, but critically, we only call <code>make-syntax-introducer</code> inside the <code>define-value/type-introducers</code> macro, which is, again, only run once (when the module is expanded). The resulting compiled module embeds <code>value-id</code> and <code>type-id</code> as syntax objects in the fully-expanded program, so they never change on each module instantiation, and they already contain the appropriate scopes. 
We can use <code>make-syntax-delta-introducer</code> to convert the “inert” scopes into introducer functions that we can use to apply the scopes to other syntax objects as we see fit.</p><p>By guaranteeing each namespace’s scope is always the same, even for different modules, <code>module*</code> submodules now work properly, and they are able to refer to bindings inherited from their enclosing module as desired.</p><h3><a name="the-final-stretch-making-scribble-documentation-namespace-aware"></a>The final stretch: making Scribble documentation namespace-aware</h3><p>As discussed in <a href="/blog/2017/08/28/hackett-progress-report-documentation-quality-of-life-and-snake/">my previous blog post</a>, Hackett has comprehensive documentation powered by Racket’s excellent documentation tool, Scribble. Fortunately for Hackett, Scribble is incredibly flexible, and it can absolutely cope with a language with multiple namespaces. Less fortunately, it is clear that Scribble’s built-in documentation forms were not at all designed with multiple namespaces in mind.</p><p>In general, documenting such a language is tricky, assuming one wishes all identifiers to be properly hyperlinked to their appropriate definition (which, of course, I do). However, documentation is far more ambiguous than code when attempting to determine which identifiers belong in which namespace. When actually writing Hackett code, forms can always syntactically deduce the appropriate namespace for their subforms and annotate them accordingly, but this is not true in documentation. Indeed, it’s entirely possible that a piece of documentation might include intentionally incorrect code, which cannot be expanded at all!</p><p>Haskell’s documentation tool, Haddock, does not appear to attempt to tackle this problem at all—when given an identifier that exists in both namespaces, it will generate a hyperlink to the type, not the value. 
I do not know if there is a way around this, but if there is, it isn’t documented. This works alright for Haddock because Haskell’s documentation generally contains fewer examples, and Haskell programmers do not expect all examples to be appropriately hyperlinked, so a best-effort approach is accepted. Racket programmers, however, are used to a very high standard of documentation, and incorrectly hyperlinked docs are unacceptable.</p><p>To work around this problem, Hackett’s documentation requires that users explicitly annotate which identifiers belong to the type namespace. Identifiers in the type namespace are prefixed with <code>t:</code> upon import, and they are bound to Scribble <a href="https://docs.racket-lang.org/scribble/scheme.html#%28tech._element._transformer%29"><em>element transformers</em></a> that indicate they should be typeset without the <code>t:</code> prefix. Fortunately, Scribble’s documentation forms <em>do</em> understand Racket’s model of lexical scope (mostly), so they can properly distinguish between two identifiers with the same name but different lexical context.</p><p>In practice, this means Hackett documentation must now include a proliferation of <code>t:</code> prefixes. For example, here is the code for a typeset REPL interaction:</p><pre><code class="pygments"><span class="n">@</span><span class="p">(</span><span class="n">hackett-examples</span>
<span class="p">(</span><span class="n">defn</span> <span class="n">square</span> <span class="n">:</span> <span class="p">(</span><span class="n">t:-></span> <span class="n">t:Integer</span> <span class="n">t:Integer</span><span class="p">)</span>
<span class="p">[[</span><span class="n">x</span><span class="p">]</span> <span class="p">{</span><span class="n">x</span> <span class="nb">*</span> <span class="n">x</span><span class="p">}])</span>
<span class="p">(</span><span class="n">square</span> <span class="mi">5</span><span class="p">))</span></code></pre><p>Note the use of <code>t:-></code> and <code>t:Integer</code> instead of <code>-></code> and <code>Integer</code>. When the documentation is rendered and the example is evaluated, the prefixes are stripped, resulting in properly-typeset Hackett code.</p><p>This also means Hackett’s documentation forms have been updated to understand multiple namespaces. Hackett now provides <code>deftype</code> and <code>deftycon</code> forms for documenting types and type constructors, respectively, which will use the additional lexical information attached to <code>t:</code>-prefixed identifiers to properly index documented forms. Similarly, <code>defdata</code> and <code>defclass</code> have been updated with an understanding of types.</p><p>The implementation details of these changes are less interesting than the ones made to the code itself, since they mostly just involved tweaking Racket’s implementation of <code>defform</code> slightly to cooperate with the prefixed identifiers. To summarize, Hackett defines a notion of “type binding transformers” that include information about both prefixed and unprefixed versions of types, and Hackett provides documentation forms that consume that information when typesetting. A require transformer converts imported bindings into <code>t:</code>-prefixed ones and attaches the necessary compile-time information to them. It isn’t especially elegant, but it works.</p><h2><a name="analysis-and-unsolved-problems"></a>Analysis and unsolved problems</h2><p>When laid out from top to bottom in this blog post, the amount of code it takes to actually implement multiple namespaces in Racket is surprisingly small. In hindsight, it does not feel like two weeks’ worth of effort, but it would be disingenuous to suggest that any of this was obvious. 
I tried a variety of different implementation strategies and spent a great deal of time staring at opaque error messages and begging <a href="http://www.cs.utah.edu/~mflatt/">Matthew Flatt</a> for help before I got things working properly. Fortunately, with everything in place, the implementation seems reliable, predictable, and useful for Hackett’s users (or, as the case may be, users-to-be).</p><p>For the most part, all the machinery behind multiple namespaces is invisible to the average Hackett programmer, and it seems to “just work”. For completeness, however, I must mention one unfortunate exception: remember the work needed to unmangle type names? While it’s true that all imports into Hackett modules are automatically unmangled by the custom <code>require</code> form, types provided by a module’s <em>language</em> are not automatically unmangled. This is because Racket does not currently provide a hook to customize how bindings from a module language are introduced, unlike <code>require</code>’s require transformers.</p><p>To circumvent this restriction, <code>#lang hackett</code>’s reader includes a somewhat ad-hoc solution that actually inserts a <code>require</code> into users’ programs that unmangles and imports all the types provided by the module. This mostly works, but due to the way Racket’s imports work, it isn’t possible for Racket programmers to import different types with the same names as Hackett core types; the two bindings will conflict, and there is no way for users to hide these implicitly imported bindings. Whether or not this is actually a common problem remains to be seen. If it is rare, it might be sufficient to introduce an ad-hoc mechanism to hide certain type imports, but it might be better to extend Racket in some way to better support this use-case.</p><p>That issue aside, multi-namespace Hackett is now working smoothly. 
It’s worth noting that I did not have to do <em>any</em> special work to help Racket’s tooling, such as DrRacket’s Check Syntax tool, understand the binding structure of Hackett programs. Since other tools, such as racket-mode for Emacs, use the same mechanisms under the hood, Racket programmers’ existing tools will be able to properly locate the distinct definition sites for types and values with the same name, another example of how Racket successfully <a href="http://www.ccs.neu.edu/home/matthias/manifesto/sec_intern.html">internalizes extra-linguistic mechanisms</a>.</p><p>As closing notes, even if the majority of this blog post was gibberish to you, do note that Hackett has come quite a long way in just the past two months, adding much more than just a separate type namespace. I might try and give a more comprehensive update at a later date, but here’s a quick summary of the meaningful changes for those interested:</p><ul><li><p><strong>Multi-parameter typeclasses</strong> are implemented, along with <strong>default typeclass method implementations</strong>.</p></li><li><p>Pattern-matching performs basic <strong>exhaustiveness checking</strong>, so unmatched cases are a compile-time error.</p></li><li><p>Hackett ships with a <strong>larger standard library</strong>, including an <code>Either</code> type and appropriate functions, an <code>Identity</code> type, a <code>MonadTrans</code> typeclass, and the <code>ReaderT</code> and <code>ErrorT</code> monad transformers.</p></li><li><p><strong>More things are documented</strong>, and parts of the documentation are slightly improved. 
Additionally, <strong>Hackett’s internals are much more heavily commented</strong>, hopefully making the project more accessible to new contributors.</p></li><li><p><strong>Parts of the typechecker are dramatically simplified</strong>, improving the mechanisms behind dictionary elaboration and clearing the way for a variety of additional long-term improvements, including multiple compilation targets and a type-aware optimizer.</p></li><li><p>As always, various bug fixes.</p></li></ul><p>Finally, special mention to two new contributors to Hackett, <a href="https://github.com/iitalics">Milo Turner</a> and <a href="https://github.com/Shamrock-Frost">Brendan Murphy</a>. Also special thanks to <a href="http://www.cs.utah.edu/~mflatt/">Matthew Flatt</a> and <a href="https://github.com/michaelballantyne">Michael Ballantyne</a> for helping me overcome two of the trickiest macro-related problems I’ve encountered in Hackett to date. It has now been just over a year since Hackett’s original conception and roughly six months since the first commit of its current implementation, and the speed at which I’ve been able to work would not have been possible without the valuable help of the wonderful Racket community. Here’s hoping this is only the beginning.</p><ol class="footnotes"><li id="footnote-1"><p>“But what about dependent types?” you may ask. Put simply, Hackett is not dependently typed, and it is not going to be dependently typed. Dependent types are currently being bolted onto Haskell, but Haskell does not have <code>#lang</code>. Racket does. It seems likely that a dependently-typed language would be much more useful as a separate <code>#lang</code>, not a modified version of Hackett, so Hackett can optimize its user experience for what it <em>is</em>, not what it might be someday. 
<a href="#footnote-ref-1-1">↩</a></p></li><li id="footnote-2"><p>Hackett does not actually have a real kind system yet, but pleasantly, this same change will allow <code>*</code> to be used to mean “type” at the kind level and “multiply” at the value level. <a href="#footnote-ref-2-1">↩</a></p></li><li id="footnote-3"><p>This isn’t strictly true: as readers familiar with Racket’s macro system are likely aware, Racket modules export bindings at different “phase levels”, where phase levels above 0 correspond to compile-time macroexpansion phases. Racket modules are allowed to export a single binding per name, <em>per phase</em>, so the same symbolic name can be bound to different things at different phases. This isn’t meaningfully relevant for Hackett, however: types and values are both exported at phase 0 (and there are reasons that must be the case), so this phase separation does not make the problem any simpler. <a href="#footnote-ref-3-1">↩</a></p></li></ol></article>Hackett progress report: documentation, quality of life, and snake2017-08-28T00:00:00Z2017-08-28T00:00:00ZAlexis King<article><p>Three months ago, <a href="/blog/2017/05/27/realizing-hackett-a-metaprogrammable-haskell/">I wrote a blog post describing my new, prototype implementation of my programming language, Hackett</a>. At the time, some things looked promising—the language already included algebraic datatypes, typeclasses, laziness, and even a mini, proof of concept web server. It was, however, clearly still rather rough around the edges—error messages were poor, features were sometimes brittle, the REPL experience was less than ideal, and there was no documentation to speak of. 
In the time since, while the language is still experimental, I have tackled a handful of those issues, and I am excited to announce <a href="https://pkg-build.racket-lang.org/doc/hackett@hackett-doc/"><strong>the first (albeit quite incomplete) approach to Hackett’s documentation</strong></a>.</p><p>I’d recommend clicking that link above and at least skimming around before reading the rest of this blog post, as its remainder will describe some of the pieces that didn’t end up in the documentation: the development process, the project’s status, a small demo, and some other details from behind the scenes.</p><h2><a name="a-philosophy-of-documentation"></a>A philosophy of documentation</h2><p>Racket, as a project, has always had <a href="http://docs.racket-lang.org">wonderful documentation</a>. There are many reasons for this—Racket’s educational origins almost certainly play a part, and it helps that the core packages set the bar high—but one of the biggest reasons is undoubtedly <a href="http://docs.racket-lang.org/scribble/index.html">Scribble, the Racket documentation tool</a>. Scribble is, in many ways, the embodiment of the Racket philosophy: it is a user-extensible, fully-featured, domain-specific programming language designed for typesetting, with <a href="http://docs.racket-lang.org/scribble/plt-manuals.html">a powerful library for documenting Racket code</a>. Like the Racket language itself, Scribble comes with a hygienic macro system, and in fact, all Racket libraries are trivially usable from within Scribble documents, if desired. The macro system is used to great effect to provide typesetting forms tailored to the various sorts of things a Racket programmer might wish to document, such as procedures, structures, and macros.</p><p>Scribble documents are decoupled from a rendering backend, so a single Scribble document can be rendered to plain text, a PDF, or HTML, but the HTML backend is the most useful for writing docs. 
Scribble documents themselves use a syntax inspired by (La)TeX’s syntax, but Scribble uses an <code>@</code> character instead of <code>\</code>. It also generalizes and regularizes TeX in many ways, creating a much more uniform language without nearly so much magic or complexity. Since Scribble’s “at-expressions” are merely an alternate syntax for Racket’s more traditional s-expressions, Scribble documents can be built out of ordinary Racket macros. For example, to document a procedure in Racket, one would use <a href="http://docs.racket-lang.org/scribble/doc-forms.html#%28form._%28%28lib._scribble%2Fmanual..rkt%29._defproc%29%29">the provided <code>defproc</code> form</a>:</p><pre><code class="pygments"><span class="n">@defproc</span><span class="p">[(</span><span class="nb">add1</span> <span class="p">[</span><span class="n">z</span> <span class="nb">number?</span><span class="p">])</span> <span class="nb">number?</span><span class="p">]{</span>
<span class="n">Returns</span> <span class="n">@racket</span><span class="p">[(</span><span class="nb">+</span> <span class="n">z</span> <span class="mi">1</span><span class="p">)]</span><span class="o">.</span><span class="p">}</span></code></pre><p>This syntax may look alien to someone more familiar with traditional, Javadoc-style documentation comments, but the results are quite impressive. The above snippet renders into <a href="http://docs.racket-lang.org/reference/generic-numbers.html#%28def._%28%28quote._~23~25kernel%29._add1%29%29">something like this</a>:</p><p><a href="http://docs.racket-lang.org/reference/generic-numbers.html#%28def._%28%28quote._~23~25kernel%29._add1%29%29"></a></p><p>The fact that Scribble documents are fully-fledged <em>programs</em> equips the programmer with a lot of power. One of the most remarkable tools Scribble provides is <a href="http://docs.racket-lang.org/scribble/eval.html">the <code>scribble/example</code> module</a>, a library that performs sandboxed evaluation as part of the rendering process. This allows Scribble documents to include REPL-style examples inline, automatically generated as part of typesetting, always kept up to date from a single source of truth: the implementation. It even provides a special <code>eval:check</code> form that enables <a href="https://docs.python.org/3/library/doctest.html">doctest</a>-like checking, which allows documentation to serve double duty as a test suite.</p><p>Of course, Hackett is not Racket, though it shares many similarities. Fortunately, all of Racket is <em>designed</em> with the goal of supporting many different programming languages, and Scribble is no exception. Things like <a href="http://docs.racket-lang.org/scribble/eval.html"><code>scribble/example</code></a> essentially work out of the box with Hackett, and most of <a href="http://docs.racket-lang.org/scribble/plt-manuals.html"><code>scribble/manual</code></a> can be reused. 
However, what about documenting algebraic datatypes? What about documenting typeclasses? Well, remember: Scribble is extensible. The <code>defproc</code> and <code>defstruct</code> forms are hardly builtins; they are defined as part of the <code>scribble/manual</code> library in terms of Scribble primitives, and <a href="https://github.com/lexi-lambda/hackett/blob/f472859cfc03086d39563e5c0eb81dcb2ceb49dc/hackett-doc/scribble/manual/hackett.rkt">we can do the same</a>.</p><p>Hackett’s documentation already defines three new forms, <code>defdata</code>, <code>defclass</code>, and <code>defmethod</code>, for documenting algebraic datatypes, typeclasses, and typeclass methods, respectively. They typeset documentation custom-tailored to Hackett’s needs, so Hackett’s documentation need not be constrained by Racket’s design decisions. For example, one could document the <code>Functor</code> typeclass using <code>defclass</code> like this:</p><pre><code class="pygments"><span class="n">@defclass</span><span class="p">[(</span><span class="n">Functor</span> <span class="n">f</span><span class="p">)</span>
<span class="p">[</span><span class="nb">map</span> <span class="n">:</span> <span class="p">(</span><span class="n">forall</span> <span class="p">[</span><span class="n">a</span> <span class="n">b</span><span class="p">]</span> <span class="p">{(</span><span class="n">a</span> <span class="k">-></span> <span class="n">b</span><span class="p">)</span> <span class="k">-></span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">)</span> <span class="k">-></span> <span class="p">(</span><span class="n">f</span> <span class="n">b</span><span class="p">)})]]{</span>
<span class="n">A</span> <span class="k">class</span> <span class="n">of</span> <span class="n">types</span> <span class="n">that</span> <span class="n">are</span> <span class="n">@deftech</span><span class="p">{</span><span class="n">functors</span><span class="p">}</span><span class="o">,</span> <span class="n">essentially</span> <span class="n">types</span> <span class="n">that</span> <span class="k">provide</span> <span class="n">a</span>
<span class="n">mapping</span> <span class="k">or</span> <span class="n">“piercing”</span> <span class="n">operation.</span> <span class="n">The</span> <span class="n">@racket</span><span class="p">[</span><span class="nb">map</span><span class="p">]</span> <span class="n">function</span> <span class="n">can</span> <span class="n">be</span> <span class="n">viewed</span> <span class="n">in</span>
<span class="n">different</span> <span class="n">ways:</span>
<span class="k">...</span><span class="p">}</span></code></pre><p>With only a little more than the above code, <a href="http://docs.racket-lang.org/hackett/reference-typeclasses.html#%28def._%28%28lib._hackett%2Fmain..rkt%29._.Functor%29%29">Hackett’s documentation includes a beautifully-typeset definition of the <code>Functor</code> typeclass</a>, including examples and rich prose:</p><p><a href="http://docs.racket-lang.org/hackett/reference-typeclasses.html#%28def._%28%28lib._hackett%2Fmain..rkt%29._.Functor%29%29"></a></p><p>Scribble makes Hackett’s documentation shine.</p><h3><a name="a-tale-of-two-users"></a>A tale of two users</h3><p>For a programming language, documentation is critical. Once we have grown comfortable with a language, it’s easy to take for granted our ability to work within it, but there is always a learning period, no matter how simple or familiar the language may be. When learning a new language, we often relate the language’s concepts and features to those which we already know, which is why having a broad vocabulary of languages makes picking up new ones so much easier.</p><p>A new user of a language needs a gentle introduction to its features, structured in a logical way, encouraging this period of discovery and internalization. Such an introduction should come equipped with plenty of examples, and it shouldn’t worry itself with being an authoritative reference. Some innocent simplifications are often conducive to learning, and it is unlikely to be helpful to force the full power of a language onto a user all at once.</p><p>However, for experienced users, an authoritative reference is <em>exactly</em> what they need. While learners want tutorial-style documentation that encourages experimentation and exploration, working users of a language need something closer to a dictionary or encyclopedia: a way to look up forms and functions by name and find precise definitions, complete explanations, and hopefully a couple of examples. 
Such a user does not want information to be scattered across multiple chapters of explanatory text; they simply need a focused, targeted, one-stop shop for the information they’re looking for.</p><p>This dichotomy is rarely well-served by existing programming language documentation. Most programming languages suffer from either failing entirely to serve both types of users, or doing so in a way that enforces too strong a separation between the styles of documentation. For example:</p><ul><li><p>Java ships with a quintessential example of a documentation generator: Javadoc. Java is a good case study because, although its documentation is not particularly good, it still manages to be considerably better than most languages’ docs.</p><p><a href="https://docs.oracle.com/javase/8/docs/api/">Java’s API documentation</a> documents its standard library, but it doesn’t document the language. Reference-style language documentation is largely relegated to the Java Language Specification, which is highly technical and rather low-level. It is more readable than the standards for some other languages, but it’s still mostly only useful to language lawyers. For Java, this ends up being mostly okay, largely because Java is a fairly <em>small</em> language that does not often change.</p><p>On the other hand, Java’s reference documentation is inconsistent, rarely provides any examples, and certainly does not do a good job of serving new users. Java <em>does</em> provide guide-style documentation in the form of the <a href="https://docs.oracle.com/javase/tutorial/">Java Tutorials</a>, but they are of inconsistent quality.</p><p>More importantly, while the Java tutorials link to the API docs, the reverse is <strong>not</strong> true, which is a real disservice. One of the most beautiful things about the web is how information can be extensively cross-linked, and exploring links is many times easier than turning pages of a physical book. 
Anyone who’s explored topics on Wikipedia for an hour (or more) at a time knows how amazing this can be.</p><p>Language documentation isn’t quite the same as an encyclopedia, but it’s a shame that Java’s documentation does not lend itself as easily to curious, open-ended learning. If the API docs frequently linked to relevant portions of the tutorials, then a user could open the Javadoc for a class or method they are using, then quickly jump to the relevant guide. As the documentation is currently organized, this is nearly impossible, and tutorials are only discovered when explicitly looking for them.</p></li><li><p>Other languages, such as JavaScript, are in even worse boats than Java when it comes to documentation. For whatever reason, structured documentation of any kind doesn’t seem to have caught on in the JavaScript world, probably largely because no documentation tool ships with the language, and no such tool ever became standard. Whatever the reason, JavaScript libraries’ documentation largely resides in markdown documents spread across version control repositories and various webpages.</p><p>The closest thing that JavaScript has to official language documentation, aside from the (largely incomprehensible) language standard, is <a href="https://developer.mozilla.org/en-US/">MDN</a>. MDN’s docs are actually quite good, and they tend to mix lots of examples together with reference-style documentation. They’re indexed and searchable, and they have a great Google search ranking. MDN is easily my go-to place to read about core JavaScript functions.</p><p>The trouble, of course, is that MDN only houses documentation for the standard library, and while new standards make it bigger than ever, huge amounts of critical functionality are often offloaded to separate packages. 
These libraries all have their own standards and styles of documentation, and virtually none of them even compare to MDN.</p><p>This means that documentation for JavaScript libraries, even the most popular ones, tends to be all over the map. <a href="http://ramdajs.com/docs/">Ramda’s documentation is nothing but a reference</a>, which makes it easy to look up information about a specific function, but nearly impossible to find anything if you don’t have a specific name to look for. In contrast, <a href="http://passportjs.org/docs">Passport’s docs are essentially <em>only</em> a set of tutorials</a>, which is great for learners, but enormously frustrating if I just want to look up what the heck a specific function or method <em>does</em>. Fortunately, <a href="https://facebook.github.io/react/docs/hello-world.html">there are some libraries, like React</a>, that absolutely <em>nail</em> this, and they have both styles of documentation that are <strong>actually cross-referenced</strong>. Unfortunately, those are mostly the exceptions, not the norm.</p></li><li><p><a href="https://docs.python.org/3/index.html">Python’s documentation is interesting</a>, since it includes a set of tutorials alongside the API reference, and it <em>also</em> ships a language reference written for ordinary users. In many ways, it does everything right, but disappointingly, it generally doesn’t link back to the tutorials from the API docs, even though the reverse is true. For example, the section in the tutorial on <code>if</code> links to the section in the reference about <code>if</code>, but nothing goes in the other direction, which is something of a missed opportunity.</p></li><li><p><a href="https://hackage.haskell.org/package/base">Haskell manages to be especially bad here</a> (maybe even notoriously bad) despite having a ubiquitous documentation generator, Haddock. 
Unfortunately, Haddock’s format makes writing prose and examples somewhat unpleasant, and very few packages provide any sort of tutorial. For those that do, the tutorial is often not included in the API docs, a common theme at this point.</p><p>It’s generally a bad sign when your documentation tool isn’t even powerful enough to document itself, and <a href="https://www.haskell.org/haddock/">Haddock’s docs are pretty impressively bad, though mostly serviceable if you’re willing to look</a>.</p></li></ul><p>The takeaway here is that I just don’t think most languages’ documentation is particularly good, and programmers seem to have gotten so used to this state of affairs that the bar is set disappointingly low. Fortunately, this is another area where Racket delivers. Racket, like Python, ships with <em>two</em> pieces of documentation: the <a href="http://docs.racket-lang.org/guide/index.html">Racket Guide</a> and the <a href="http://docs.racket-lang.org/reference/index.html">Racket Reference</a>. The guide includes over <strong>one hundred thousand</strong> words of explanations and examples, and the reference includes roughly <strong>half a million</strong>. Racket’s documentation is impressive on its own, but what’s equally impressive is how carefully and methodically cross-linked it is. Margin notes often provide links to corresponding sections in the relevant companion manual, so it’s easy to look up a form or function by name, then quickly jump to the section of the guide explaining it.</p><p>Hackett is obviously not going to have hundreds of thousands of words’ worth of documentation in its first few months of existence, but it already has nearly ten thousand, and that’s not nothing. 
More importantly, it is structured the same way that Racket’s docs are: it’s split into the <a href="http://docs.racket-lang.org/hackett/guide.html">Hackett Guide</a> and the <a href="http://docs.racket-lang.org/hackett/reference.html">Hackett Reference</a>, and the two are cross-referenced as much as possible. Haskell is a notoriously difficult language to learn, but my hope is that this does not necessarily <em>need</em> to be the case. Documentation cannot make the language trivial, but it can make it a <em>lot</em> more accessible without making it any less useful for power users.</p><h2><a name="rounding-hackett-s-library-sanding-its-edges"></a>Rounding Hackett’s library, sanding its edges</h2><p>One of the best things about sitting down and writing documentation—whether it’s for a tool, a library, or a language—is how it forces you, the author, to think about how someone else might perceive the project when seeing it for the first time. This encompasses everything: error messages, ease of installation, completeness of a standard library, friendliness of tooling, etc. Writing Hackett’s documentation forced me to make a <em>lot</em> of improvements, and while very few of them are flashy features, they make Hackett feel much less like a toy and more like a tool.</p><p>Hackett currently has no formal changelog because it is considered alpha quality, and its API is still unstable. There is no guarantee that things won’t change at any moment. Still, it’s useful to put together an ad-hoc list of changes made in the past few months. 
Here’s a very brief summary:</p><ul><li><p>Hackett includes a <a href="http://docs.racket-lang.org/hackett/reference-datatypes.html#%28form._%28%28lib._hackett%2Fmain..rkt%29._.Double%29%29"><code>Double</code></a> type for working with IEEE 754 double-precision floating-point numbers.</p></li><li><p>Local definitions are supported via the <a href="http://docs.racket-lang.org/hackett/reference-syntactic-forms.html#%28form._%28%28lib._hackett%2Fmain..rkt%29._let%29%29"><code>let</code></a> and <a href="http://docs.racket-lang.org/hackett/reference-syntactic-forms.html#%28form._%28%28lib._hackett%2Fmain..rkt%29._letrec%29%29"><code>letrec</code></a> forms.</p></li><li><p>The prelude includes many more functions, especially <a href="http://docs.racket-lang.org/hackett/reference-datatypes.html#%28part._reference-lists%29">functions on lists</a>.</p></li><li><p>The Hackett reader has been adjusted to support using <code>.</code> as a bare symbol, since <a href="http://docs.racket-lang.org/hackett/reference-datatypes.html#%28def._%28%28lib._hackett%2Fmain..rkt%29._..%29%29"><code>.</code> is the function composition operator</a>.</p></li><li><p>The Hackett REPL supports many more forms, including <a href="http://docs.racket-lang.org/hackett/reference-datatypes.html#%28form._%28%28lib._hackett%2Fmain..rkt%29._data%29%29">ADT</a>, <a href="http://docs.racket-lang.org/hackett/reference-typeclasses.html#%28form._%28%28lib._hackett%2Fmain..rkt%29._class%29%29">class</a>, and <a href="http://docs.racket-lang.org/hackett/reference-typeclasses.html#%28form._%28%28lib._hackett%2Fmain..rkt%29._instance%29%29">instance</a> definitions. Additionally, the REPL now uses <a href="http://docs.racket-lang.org/hackett/reference-typeclasses.html#%28def._%28%28lib._hackett%2Fmain..rkt%29._.Show%29%29"><code>Show</code></a> instances to display the results of expressions. 
To compensate for the inability to print non-<a href="http://docs.racket-lang.org/hackett/reference-typeclasses.html#%28def._%28%28lib._hackett%2Fmain..rkt%29._.Show%29%29"><code>Show</code></a>able things, a new <code>(#:type expr)</code> syntax is permitted to print the type of <em>any</em> expression.</p></li><li><p>Missing instance errors are dramatically improved and now correctly highlight the source location of the expressions that led to the error.</p></li></ul><p>Alongside these changes are a variety of internal code improvements that make the Hackett code simpler, more readable, and hopefully more accessible to contributors. Many of the trickiest functions are now <a href="https://github.com/lexi-lambda/hackett/blob/f472859cfc03086d39563e5c0eb81dcb2ceb49dc/hackett-lib/hackett/private/base.rkt#L77-L189">heavily commented</a> with the hope that the codebase won’t be so intimidating to people unfamiliar with Racket or the techniques behind Hackett’s typechecker. I will continue to document the internals of Hackett as I change different parts of the codebase, and I have even considered writing a separate Scribble document describing the Hackett internals. It certainly wouldn’t hurt.</p><p>One of the most exciting things about documenting Hackett has been realizing just <em>how much</em> already exists. Seriously, if you have gotten to this point in the blog post but haven’t read <a href="https://pkg-build.racket-lang.org/doc/hackett@hackett-doc/">the actual documentation</a> yet, I would encourage you to do so. No longer does the idea of writing real programs in this language feel out of reach; indeed, aside from potential performance problems, the language is likely extremely close to being usable for very simple things. After all, that’s the goal, isn’t it? 
As I’ve mentioned before, I’m writing Hackett for other people, but I’m also very much writing it for <em>me</em>: it’s a language I’d like to use.</p><p>Still, writing a general-purpose programming language is a lot of work, and I’ve known from the start that it isn’t something I can accomplish entirely on my own. While this iteration of work on Hackett is a sort of “documentation release”, it might be more accurate to call it an “accessibility release”. If you’re interested in contributing, I finally feel comfortable encouraging you to get involved!</p><h2><a name="a-demo-with-pictures"></a>A demo with pictures</h2><p>Now, if you’re like me, all of this documentation stuff is already pretty exciting. Still, even I view documentation as simply a means to an end, not an end in itself. Documentation is successful when it gets out of the way and makes it possible to write good code that does cool things. Let’s write some, shall we?</p><p>Hackett ships with a special package of demo libraries in the aptly-named <code>hackett-demo</code> package, which are essentially simple, lightweight bindings to existing, dynamically-typed Racket libraries. In <a href="/blog/2017/05/27/realizing-hackett-a-metaprogrammable-haskell/">the previous Hackett blog post</a>, I demonstrated the capabilities of <code>hackett/demo/web-server</code>. In this blog post, we’re going to use <code>hackett/demo/pict</code> and <code>hackett/demo/pict/universe</code>, which make it possible to write interactive, graphical programs in Hackett with just a few lines of code!</p><p>As always, we’ll start with <code>#lang hackett</code>, and we’ll import the necessary libraries:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">hackett</span>
<span class="p">(</span><span class="k">require</span> <span class="n">hackett/demo/pict</span>
<span class="n">hackett/demo/pict/universe</span><span class="p">)</span></code></pre><p>With that, we can start immediately with a tiny example. Just to see how <code>hackett/demo/pict</code> works, let’s start by rendering a red square. We can do this by writing a <code>main</code> action that calls <code>print-pict</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="n">main</span> <span class="p">(</span><span class="n">print-pict</span> <span class="p">(</span><span class="n">colorize</span> <span class="n">red</span> <span class="p">(</span><span class="n">filled-square</span> <span class="mf">50.0</span><span class="p">))))</span></code></pre><p>If you run the above program in DrRacket, you should see a 50 pixel red square printed into the interactions window!</p><p></p><p>Using the REPL, we can inspect the type of <code>print-pict</code>:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="kd">#:type</span> <span class="n">print-pict</span><span class="p">)</span>
<span class="n">:</span> <span class="p">(</span><span class="k">-></span> <span class="n">Pict</span> <span class="p">(</span><span class="n">IO</span> <span class="n">Unit</span><span class="p">))</span></code></pre><p>Unsurprisingly, displaying a picture to the screen needs <code>IO</code>. However, what’s interesting is that the rest of the expression is totally pure. Take a look at the type of <code>filled-square</code>:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="kd">#:type</span> <span class="n">filled-square</span><span class="p">)</span>
<span class="n">:</span> <span class="p">(</span><span class="k">-></span> <span class="n">Double</span> <span class="n">Pict</span><span class="p">)</span></code></pre><p>No <code>IO</code> to be seen! This is because “picts” are entirely <em>pure</em> values that represent images built out of simple shapes, and they can be put together to make more complex images. For example, we can put two squares next to one another:</p><pre><code class="pygments"><span class="p">(</span><span class="n">main</span> <span class="p">(</span><span class="n">print-pict</span> <span class="p">{(</span><span class="n">colorize</span> <span class="n">red</span> <span class="p">(</span><span class="n">filled-square</span> <span class="mf">50.0</span><span class="p">))</span>
<span class="n">hc-append</span>
<span class="p">(</span><span class="n">colorize</span> <span class="n">blue</span> <span class="p">(</span><span class="n">filled-square</span> <span class="mf">50.0</span><span class="p">))}))</span></code></pre><p>This code will print out a red square to the left of a blue one.</p><p></p><p>Again, <code>hc-append</code> is a simple, pure function, a binary composition operator that places two picts side by side to produce a new one:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="kd">#:type</span> <span class="n">hc-append</span><span class="p">)</span>
<span class="n">:</span> <span class="p">(</span><span class="k">-></span> <span class="n">Pict</span> <span class="p">(</span><span class="k">-></span> <span class="n">Pict</span> <span class="n">Pict</span><span class="p">))</span></code></pre><p>Using the various features of this toolkit, not only can we make interesting pictures and diagrams, we can even create a foundation for a game!</p><h3><a name="implementing-a-snake-clone"></a>Implementing a snake clone</h3><p>This blog post is not a Hackett tutorial; it is merely a demo. For that reason, I am not going to spend much time explaining how the following program is built. This section is closer to annotated source code than a guide to the <code>pict</code> or <code>universe</code> libraries. Hopefully it’s still illustrative.</p><p>We’ll start by writing some type definitions. We’ll need a type to represent 2D points on a grid, as well as a type to represent a cardinal direction (to keep track of which direction the player is moving, for example). We’ll also want an <code>Eq</code> instance for our points.</p><pre><code class="pygments"><span class="p">(</span><span class="n">data</span> <span class="n">Point</span> <span class="p">(</span><span class="n">point</span> <span class="n">Integer</span> <span class="n">Integer</span><span class="p">))</span>
<span class="p">(</span><span class="n">data</span> <span class="n">Direction</span> <span class="n">d:left</span> <span class="n">d:right</span> <span class="n">d:up</span> <span class="n">d:down</span><span class="p">)</span>
<span class="p">(</span><span class="n">instance</span> <span class="p">(</span><span class="n">Eq</span> <span class="n">Point</span><span class="p">)</span>
<span class="p">[</span><span class="k">==</span> <span class="p">(</span><span class="k">λ</span> <span class="p">[(</span><span class="n">point</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="n">point</span> <span class="n">c</span> <span class="n">d</span><span class="p">)]</span> <span class="p">{{</span><span class="n">a</span> <span class="k">==</span> <span class="n">c</span><span class="p">}</span> <span class="n">&&</span> <span class="p">{</span><span class="n">b</span> <span class="k">==</span> <span class="n">d</span><span class="p">}})])</span></code></pre><p>With these two datatypes, we can implement a <code>move</code> function that accepts a point and a direction and produces a new point for an adjacent tile:</p><pre><code class="pygments"><span class="p">(</span><span class="n">defn</span> <span class="n">move</span> <span class="n">:</span> <span class="p">{</span><span class="n">Direction</span> <span class="k">-></span> <span class="n">Point</span> <span class="k">-></span> <span class="n">Point</span><span class="p">}</span>
<span class="p">[[</span><span class="n">d:left</span> <span class="p">(</span><span class="n">point</span> <span class="n">x</span> <span class="n">y</span><span class="p">)]</span> <span class="p">(</span><span class="n">point</span> <span class="p">{</span><span class="n">x</span> <span class="nb">-</span> <span class="mi">1</span><span class="p">}</span> <span class="n">y</span><span class="p">)]</span>
<span class="p">[[</span><span class="n">d:right</span> <span class="p">(</span><span class="n">point</span> <span class="n">x</span> <span class="n">y</span><span class="p">)]</span> <span class="p">(</span><span class="n">point</span> <span class="p">{</span><span class="n">x</span> <span class="nb">+</span> <span class="mi">1</span><span class="p">}</span> <span class="n">y</span><span class="p">)]</span>
<span class="p">[[</span><span class="n">d:up</span> <span class="p">(</span><span class="n">point</span> <span class="n">x</span> <span class="n">y</span><span class="p">)]</span> <span class="p">(</span><span class="n">point</span> <span class="n">x</span> <span class="p">{</span><span class="n">y</span> <span class="nb">-</span> <span class="mi">1</span><span class="p">})]</span>
<span class="p">[[</span><span class="n">d:down</span> <span class="p">(</span><span class="n">point</span> <span class="n">x</span> <span class="n">y</span><span class="p">)]</span> <span class="p">(</span><span class="n">point</span> <span class="n">x</span> <span class="p">{</span><span class="n">y</span> <span class="nb">+</span> <span class="mi">1</span><span class="p">})])</span></code></pre><p>The next step is to define a type for our world state. The <code>big-bang</code> library operates using a game loop, with a function to update the state that’s called each “tick”. Our state will need to hold all the information about our game, which, in this case, is just three things:</p><pre><code class="pygments"><span class="p">(</span><span class="n">data</span> <span class="n">World-State</span> <span class="p">(</span><span class="n">world-state</span>
<span class="n">Direction</span> <span class="c1">; snake direction</span>
<span class="p">(</span><span class="n">List</span> <span class="n">Point</span><span class="p">)</span> <span class="c1">; snake blocks</span>
<span class="p">(</span><span class="n">List</span> <span class="n">Point</span><span class="p">)</span> <span class="c1">; food blocks</span>
<span class="p">))</span></code></pre><p>It will also be useful to have a functional setter for the direction, which we’ll have to write ourselves, since Hackett does not (currently) have anything like Haskell’s record syntax:</p><pre><code class="pygments"><span class="p">(</span><span class="n">defn</span> <span class="n">set-ws-direction</span> <span class="p">[[</span><span class="n">d</span> <span class="p">(</span><span class="n">world-state</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span><span class="p">)]</span> <span class="p">(</span><span class="n">world-state</span> <span class="n">d</span> <span class="n">b</span> <span class="n">c</span><span class="p">)])</span></code></pre><p>Next, we’ll write some top-level constants that we’ll use in our rendering function, such as the number of tiles in the game board, the size of each tile in pixels, and some simple picts that represent the tiles we’ll use to draw our game:</p><pre><code class="pygments"><span class="p">(</span><span class="n">def</span> <span class="n">board-width</span> <span class="mi">50</span><span class="p">)</span>
<span class="p">(</span><span class="n">def</span> <span class="n">board-height</span> <span class="mi">30</span><span class="p">)</span>
<span class="p">(</span><span class="n">def</span> <span class="n">tile->absolute</span> <span class="p">{(</span><span class="n">d*</span> <span class="mf">15.0</span><span class="p">)</span> <span class="o">.</span> <span class="n">integer->double</span><span class="p">})</span>
<span class="p">(</span><span class="n">def</span> <span class="n">empty-board</span> <span class="p">(</span><span class="n">blank-rect</span> <span class="p">(</span><span class="n">tile->absolute</span> <span class="n">board-width</span><span class="p">)</span> <span class="p">(</span><span class="n">tile->absolute</span> <span class="n">board-height</span><span class="p">)))</span>
<span class="p">(</span><span class="n">def</span> <span class="n">block</span> <span class="p">(</span><span class="n">filled-square</span> <span class="mf">13.0</span><span class="p">))</span>
<span class="p">(</span><span class="n">def</span> <span class="n">food-block</span> <span class="p">(</span><span class="n">colorize</span> <span class="n">red</span> <span class="n">block</span><span class="p">))</span>
<span class="p">(</span><span class="n">def</span> <span class="n">snake-block</span> <span class="p">(</span><span class="n">colorize</span> <span class="n">black</span> <span class="n">block</span><span class="p">))</span></code></pre><p>Now we can write our actual <code>render</code> function. To do this, we simply need to render each <code>Point</code> in our <code>World-State</code>’s two lists as a block on an <code>empty-board</code>. We’ll write a helper function, <code>render-on-board</code>, which does exactly that:</p><pre><code class="pygments"><span class="p">(</span><span class="n">defn</span> <span class="n">render-on-board</span> <span class="n">:</span> <span class="p">{</span><span class="n">Pict</span> <span class="k">-></span> <span class="p">(</span><span class="n">List</span> <span class="n">Point</span><span class="p">)</span> <span class="k">-></span> <span class="n">Pict</span><span class="p">}</span>
<span class="p">[[</span><span class="n">pict</span> <span class="n">points</span><span class="p">]</span>
<span class="p">(</span><span class="nb">foldr</span> <span class="p">(</span><span class="k">λ</span> <span class="p">[(</span><span class="n">point</span> <span class="n">x</span> <span class="n">y</span><span class="p">)</span> <span class="n">acc</span><span class="p">]</span>
<span class="p">(</span><span class="n">pin-over</span> <span class="n">acc</span> <span class="p">(</span><span class="n">tile->absolute</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="n">tile->absolute</span> <span class="n">y</span><span class="p">)</span> <span class="n">pict</span><span class="p">))</span>
<span class="n">empty-board</span> <span class="n">points</span><span class="p">)])</span></code></pre><p>This function uses <code>foldr</code> to collect each point and place the provided pict at the right location using <code>pin-over</code> on an empty board. Using <code>render-on-board</code>, we can write the <code>render</code> function in just a couple of lines:</p><pre><code class="pygments"><span class="p">(</span><span class="n">defn</span> <span class="n">render</span> <span class="n">:</span> <span class="p">{</span><span class="n">World-State</span> <span class="k">-></span> <span class="n">Pict</span><span class="p">}</span>
<span class="p">[[(</span><span class="n">world-state</span> <span class="k">_</span> <span class="n">snake-points</span> <span class="n">food-points</span><span class="p">)]</span>
<span class="p">(</span><span class="n">pin-over</span> <span class="p">(</span><span class="n">render-on-board</span> <span class="n">snake-block</span> <span class="n">snake-points</span><span class="p">)</span>
<span class="mf">0.0</span> <span class="mf">0.0</span>
<span class="p">(</span><span class="n">render-on-board</span> <span class="n">food-block</span> <span class="n">food-points</span><span class="p">))])</span></code></pre><p>Next, we’ll need to handle the update logic. On each tick, the snake should advance by a single tile in the direction it’s currently moving. If it runs into a food tile, it should grow one tile larger, and we need to generate a new food tile elsewhere on the board. To help with that last part, the <code>big-bang</code> library provides a <code>random-integer</code> function, which we can use to write a <code>random-point</code> action:</p><pre><code class="pygments"><span class="p">(</span><span class="n">def</span> <span class="n">random-point</span> <span class="n">:</span> <span class="p">(</span><span class="n">IO</span> <span class="n">Point</span><span class="p">)</span>
<span class="p">{</span><span class="n">point</span> <span class="n"><$></span> <span class="p">(</span><span class="n">random-integer</span> <span class="mi">0</span> <span class="n">board-width</span><span class="p">)</span>
<span class="n"><*></span> <span class="p">(</span><span class="n">random-integer</span> <span class="mi">0</span> <span class="n">board-height</span><span class="p">)})</span></code></pre><p>Hackett supports applicative notation using infix operators, so <code>random-point</code> looks remarkably readable. It also runs in <code>IO</code>, since the result is, obviously, random. Fortunately, the <code>on-tick</code> function runs in <code>IO</code> as well (unlike <code>render</code>, which must be completely pure), so we can use <code>random-point</code> when necessary to generate a new food block:</p><pre><code class="pygments"><span class="p">(</span><span class="n">def</span> <span class="n">init!</span> <span class="n">:</span> <span class="p">(</span><span class="n">forall</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="p">{(</span><span class="n">List</span> <span class="n">a</span><span class="p">)</span> <span class="k">-></span> <span class="p">(</span><span class="n">List</span> <span class="n">a</span><span class="p">)})</span>
<span class="p">{</span><span class="nb">reverse</span> <span class="o">.</span> <span class="n">tail!</span> <span class="o">.</span> <span class="nb">reverse</span><span class="p">})</span>
<span class="p">(</span><span class="n">defn</span> <span class="n">on-tick</span> <span class="n">:</span> <span class="p">{</span><span class="n">World-State</span> <span class="k">-></span> <span class="p">(</span><span class="n">IO</span> <span class="n">World-State</span><span class="p">)}</span>
<span class="p">[[(</span><span class="n">world-state</span> <span class="n">dir</span> <span class="n">snake-points</span> <span class="n">food-points</span><span class="p">)]</span>
<span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">new-snake-point</span> <span class="p">(</span><span class="n">move</span> <span class="n">dir</span> <span class="p">(</span><span class="n">head!</span> <span class="n">snake-points</span><span class="p">))])</span>
<span class="p">(</span><span class="k">if</span> <span class="p">{</span><span class="n">new-snake-point</span> <span class="n">elem?</span> <span class="n">food-points</span><span class="p">}</span>
<span class="p">(</span><span class="k">do</span> <span class="p">[</span><span class="n">new-food-point</span> <span class="n"><-</span> <span class="n">random-point</span><span class="p">]</span>
<span class="p">(</span><span class="n">pure</span> <span class="p">(</span><span class="n">world-state</span> <span class="n">dir</span> <span class="p">{</span><span class="n">new-snake-point</span> <span class="n">::</span> <span class="n">snake-points</span><span class="p">}</span>
<span class="p">{</span><span class="n">new-food-point</span> <span class="n">::</span> <span class="p">(</span><span class="n">delete</span> <span class="n">new-snake-point</span> <span class="n">food-points</span><span class="p">)})))</span>
<span class="p">(</span><span class="n">pure</span> <span class="p">(</span><span class="n">world-state</span> <span class="n">dir</span> <span class="p">{</span><span class="n">new-snake-point</span> <span class="n">::</span> <span class="p">(</span><span class="n">init!</span> <span class="n">snake-points</span><span class="p">)}</span>
<span class="n">food-points</span><span class="p">))))])</span></code></pre><p>This function is the most complicated one in the whole program, but it’s still not terribly complex. It figures out what the snake’s next location is and binds it to <code>new-snake-point</code>, then checks if there is a food block at that location. If there is, it generates a <code>new-food-point</code>, then puts it in the new world state. Otherwise, it removes the last snake point and continues as usual.</p><p>The game is already almost completely written. The next step is just to handle key events, which are obviously important for allowing the player to control the snake. Fortunately, this is easy, since we can just use our <code>set-ws-direction</code> function that we wrote earlier:</p><pre><code class="pygments"><span class="p">(</span><span class="n">defn</span> <span class="n">on-key</span> <span class="n">:</span> <span class="p">{</span><span class="n">KeyEvent</span> <span class="k">-></span> <span class="n">World-State</span> <span class="k">-></span> <span class="p">(</span><span class="n">IO</span> <span class="n">World-State</span><span class="p">)}</span>
<span class="p">[[</span><span class="n">ke:left</span> <span class="p">]</span> <span class="p">{</span><span class="n">pure</span> <span class="o">.</span> <span class="p">(</span><span class="n">set-ws-direction</span> <span class="n">d:left</span><span class="p">)}]</span>
<span class="p">[[</span><span class="n">ke:right</span><span class="p">]</span> <span class="p">{</span><span class="n">pure</span> <span class="o">.</span> <span class="p">(</span><span class="n">set-ws-direction</span> <span class="n">d:right</span><span class="p">)}]</span>
<span class="p">[[</span><span class="n">ke:up</span> <span class="p">]</span> <span class="p">{</span><span class="n">pure</span> <span class="o">.</span> <span class="p">(</span><span class="n">set-ws-direction</span> <span class="n">d:up</span><span class="p">)}]</span>
<span class="p">[[</span><span class="n">ke:down</span> <span class="p">]</span> <span class="p">{</span><span class="n">pure</span> <span class="o">.</span> <span class="p">(</span><span class="n">set-ws-direction</span> <span class="n">d:down</span><span class="p">)}]</span>
<span class="p">[[</span><span class="k">_</span> <span class="p">]</span> <span class="p">{</span><span class="n">pure</span> <span class="o">.</span> <span class="n">id</span><span class="p">}])</span></code></pre><p>The <code>on-key</code> function runs in <code>IO</code>, but we don’t actually need that power, since all of our keypress update logic is completely pure, so we just wrap everything in <code>pure</code>.</p><p>We’re almost done now—all we need to do is set up the <em>initial</em> state when the game begins. We’ll write a small binding that creates a world state with the snake in the middle of the board and some random food locations scattered about:</p><pre><code class="pygments"><span class="p">(</span><span class="n">def</span> <span class="n">initial-state</span>
<span class="p">(</span><span class="k">do</span> <span class="p">[</span><span class="n">initial-food</span> <span class="n"><-</span> <span class="p">(</span><span class="n">sequence</span> <span class="p">(</span><span class="nb">take</span> <span class="mi">5</span> <span class="p">(</span><span class="n">repeat</span> <span class="n">random-point</span><span class="p">)))]</span>
<span class="p">(</span><span class="n">pure</span> <span class="p">(</span><span class="n">world-state</span> <span class="n">d:right</span>
<span class="p">{(</span><span class="n">point</span> <span class="mi">25</span> <span class="mi">15</span><span class="p">)</span> <span class="n">::</span> <span class="p">(</span><span class="n">point</span> <span class="mi">24</span> <span class="mi">15</span><span class="p">)</span> <span class="n">::</span> <span class="p">(</span><span class="n">point</span> <span class="mi">23</span> <span class="mi">15</span><span class="p">)</span> <span class="n">::</span> <span class="n">nil</span><span class="p">}</span>
<span class="n">initial-food</span><span class="p">))))</span></code></pre><p>Notably, we can use the <code>repeat</code> function to create an infinite list of <code>random-point</code> actions, <code>take</code> the first five of them, then call <code>sequence</code> to execute them from left to right. Now, all we have to do is put the pieces together in a <code>main</code> block:</p><pre><code class="pygments"><span class="p">(</span><span class="n">main</span> <span class="p">(</span><span class="k">do</span> <span class="p">[</span><span class="n">state</span> <span class="n"><-</span> <span class="n">initial-state</span><span class="p">]</span>
<span class="p">(</span><span class="n">big-bang</span> <span class="n">state</span>
<span class="kd">#:to-draw</span> <span class="n">render</span>
<span class="kd">#:on-tick</span> <span class="n">on-tick</span> <span class="mf">0.2</span>
<span class="kd">#:on-key</span> <span class="n">on-key</span><span class="p">)))</span></code></pre><p>And that’s it! We haven’t implemented any win or loss conditions, but the basics are all there. In 80 lines of code, we’ve implemented a working snake game in Hackett.</p><h2><a name="contributing-to-hackett"></a>Contributing to Hackett</h2><p>If you are excited enough about Hackett to be interested in contributing, your first question is very likely “What can I do?” or “Where do I start?” My answer to that is (perhaps a little unhelpfully): it depends! My general recommendation is to try and write something with Hackett, and if you run into anything that prevents you from accomplishing your goal, look into what would need to be changed to support your program. Having a use case is a great way to come up with useful improvements.</p><p>On the other hand, you might not have anything in mind, or you might find Hackett’s scope a little too overwhelming to just jump right in and start contributing. Fortunately, <a href="https://github.com/lexi-lambda/hackett/issues">Hackett has an issue tracker</a>, so feel free to take a look and pick something that looks interesting and achievable. Alternatively, the standard library can always use fleshing out, and quite a lot of that can be written without ever even touching the scary Hackett internals.</p><p>Additionally, if you have any questions, please don’t hesitate to ask them! 
If you have a question about the codebase, get stuck implementing something, or just don’t know where to start, feel free to <a href="https://github.com/lexi-lambda/hackett/issues">open an issue on GitHub</a>, send me a message on the <code>#racket</code> IRC channel on Freenode, or ping me on <a href="http://racket-slack.herokuapp.com">the Racket Slack team</a>.</p><h2><a name="acknowledgements"></a>Acknowledgements</h2><p>Speaking of contributors, I’m excited to say that this is the first time I can truly say Hackett includes code written by someone other than me! I want to call attention to <a href="https://github.com/gelisam">Samuel Gélineau, aka gelisam</a>, who is officially the second contributor to Hackett. He helped to implement the new approach the Hackett REPL uses for printing expressions, which ended up being quite useful when implementing some of the other REPL improvements.</p><p>Additionally, I want to specially thank <a href="http://www.cs.utah.edu/~mflatt/">Matthew Flatt</a>, <a href="http://eecs.northwestern.edu/~robby/">Robby Findler</a>, and <a href="http://www.ccs.neu.edu/home/samth/">Sam Tobin-Hochstadt</a> for being especially responsive and helpful to my many questions about Scribble and the Racket top level. Racket continues to be extremely impressive, both as a project and as a community.</p><p>Finally, many thanks to the various people who have expressed interest in the project and continue to push me and ask questions. Working on Hackett is a lot of work—both time and effort—and it’s your continued enthusiasm that inspires me to put in the hours.</p><ol class="footnotes"></ol></article>User-programmable infix operators in Racket2017-08-12T00:00:00Z2017-08-12T00:00:00ZAlexis King<article><p>Lisps are not known for infix operators, quite the opposite; infix operators generally involve more syntax and parsing than Lispers are keen to support. 
However, in <a href="https://github.com/lexi-lambda/hackett">Hackett</a>, all functions are curried, and variable-arity functions do not exist. Infix operators are almost necessary for that to be palatable, and though there are other reasons to want them, it may not be obvious how to support them without making the reader considerably more complex.</p><p>Fortunately, if we require users to syntactically specify where they wish to use infix expressions, support for infix operators is not only possible, but can be done <em>without</em> modifying the stock <code>#lang racket</code> reader. Furthermore, the resulting technique makes it possible for fixity information to be specified locally in a way that cooperates nicely with the Racket macro system, allowing the parsing of infix expressions to be manipulated at compile-time by users’ macros.</p><h2><a name="our-mission"></a>Our mission</h2><p>Before we embark, let’s clarify our goal. We want to support infix operators in Racket, of course, but that could mean a lot of different things! Let’s start with what we <em>do</em> want:</p><ul><li><p>Infix operators should be user-extensible, not limited to a special set of built-in operators.</p></li><li><p>Furthermore, operators’ names should not be restricted to a separate “operator” character set. Any valid Lisp identifier should be usable as an infix operator.</p></li><li><p>We want to be able to support fixity/associativity annotations. Some operators should associate to the left, like subtraction, but others should associate to the right, like <code>cons</code>. This allows <code>5 - 1 - 2</code> to be parsed as <code>(- (- 5 1) 2)</code>, but <code>5 :: 1 :: nil</code> to be parsed as <code>(:: 5 (:: 1 nil))</code>.</p></li></ul><p>These are nice goals, but we also won’t be too ambitious. 
In order to keep things simple and achievable, we’ll keep the following restrictions:</p><ul><li><p>We will <strong>not</strong> permit infix expressions in arbitrary locations, since that would be impossible to parse given that we want to allow users to pick any names they wish for operators. Instead, infix expressions must be wrapped in curly braces, e.g. replacing <code>(+ 1 2)</code> with <code>{1 + 2}</code>.</p></li><li><p>Our implementation will <strong>not</strong> support any notion of operator precedence; all operators will have equal precedence, and it will be illegal to mix operators of different associativity in the same expression. Precedence is entirely possible to implement in theory, but it would be considerably more work, so this blog post does not include it.</p></li><li><p>All operators will be binary, and we will <strong>not</strong> support unary or mixfix operators. My intuition is that this technique could be generalized to both of those things, but it would be considerably more complicated.</p></li></ul><p>With those points in mind, what would the interface for our infix operator library look like for our users? Ideally, something like this:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="k">prefix-in</span> <span class="n">racket/base/</span> <span class="n">racket/base</span><span class="p">)</span>
<span class="s2">"infix.rkt"</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">-</span> <span class="n">racket/base/-</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="n">::</span> <span class="nb">cons</span> <span class="kd">#:fixity</span> <span class="n">right</span><span class="p">)</span>
<span class="p">{{</span><span class="mi">2</span> <span class="nb">-</span> <span class="mi">1</span><span class="p">}</span> <span class="n">::</span> <span class="p">{</span><span class="mi">10</span> <span class="nb">-</span> <span class="mi">3</span><span class="p">}</span> <span class="n">::</span> <span class="o">'</span><span class="p">()}</span>
<span class="c1">; => '(1 7)</span></code></pre><p>Let’s get started.</p><h2><a name="implementing-infix-operators"></a>Implementing infix operators</h2><p>Now that we know what we want, how do we get there? Well, there are a few pieces to this puzzle. We’ll need to solve two main problems:</p><ol><li><p>How do we “hook into” expressions wrapped with curly braces so that we can perform a desugaring pass?</p></li><li><p>How can we associate fixity information with certain operators?</p></li></ol><p>We’ll start by tackling the first problem, since its solution will inform the answer to the second. Since we won’t have any fixity information to start with, we’ll just assume that all operators associate left by default.</p><p>So, how <em>do</em> we detect if a Racket expression is surrounded by curly braces? Normally, in <code>#lang racket</code>, parentheses, square brackets, and curly braces are all interchangeable. Indeed, if you use curly braces in the REPL, you will find that they are treated <em>exactly</em> the same as parentheses:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">{</span><span class="nb">+</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">}</span>
<span class="mi">3</span></code></pre><p>If they are treated identically, giving them special behavior might seem hopeless, but don’t despair! Racket is no ordinary programming language, and it provides some tools to help us out here.</p><p>Someone who has worked with Lisps before is likely already aware that Lisp source code is a very direct representation of its AST, composed mostly of lists, pairs, symbols, numbers, and strings. In Racket, this is also true, but Racket also wraps these datums in boxes known as <a href="http://docs.racket-lang.org/reference/syntax-model.html#%28tech._syntax._object%29"><em>syntax objects</em></a>. Syntax objects contain extra metadata about the code, most notably its lexical context, necessary for Racket’s hygiene system. However, syntax objects can also contain arbitrary metadata, known as <a href="http://docs.racket-lang.org/reference/stxprops.html#%28tech._syntax._property%29"><em>syntax properties</em></a>. Macros can attach arbitrary values to the syntax objects they produce using syntax properties, and other macros can inspect them. Racket’s <a href="http://docs.racket-lang.org/guide/Pairs__Lists__and_Racket_Syntax.html#%28tech._reader%29"><em>reader</em></a> (the syntax parser that turns program text into Racket syntax objects) also attaches certain syntax properties as part of its parsing process. One of those is named <a href="http://docs.racket-lang.org/reference/reader.html#%28idx._%28gentag._30._%28lib._scribblings%2Freference%2Freference..scrbl%29%29%29"><code>'paren-shape</code></a>.</p><p>This syntax property, as the name implies, keeps track of the shape of parentheses in syntax objects. 
You can see that for yourself by inspecting the property’s value for different syntax objects in the REPL:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="nb">syntax-property</span> <span class="o">#'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span> <span class="o">'</span><span class="ss">paren-shape</span><span class="p">)</span>
<span class="no">#f</span>
<span class="nb">></span> <span class="p">(</span><span class="nb">syntax-property</span> <span class="o">#'</span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span> <span class="o">'</span><span class="ss">paren-shape</span><span class="p">)</span>
<span class="sc">#\[</span>
<span class="nb">></span> <span class="p">(</span><span class="nb">syntax-property</span> <span class="o">#'</span><span class="p">{</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">}</span> <span class="o">'</span><span class="ss">paren-shape</span><span class="p">)</span>
<span class="sc">#\{</span></code></pre><p>This syntax property gives us the capability to distinguish between syntax objects that use curly braces and those that don’t, which is a step in the right direction, but it still doesn’t give us any hook with which we can change the behavior of certain expressions. Fortunately, there’s something else that can.</p><h3><a name="customizing-application"></a>Customizing application</h3><p>Racket is a language <em>designed</em> to be extended, and it provides a variety of hooks in the language for the purposes of tweaking pieces in minor ways. One such hook is named <a href="http://docs.racket-lang.org/reference/application.html#%28form._%28%28lib._racket%2Fprivate%2Fbase..rkt%29._~23~25app%29%29"><code>#%app</code></a>, which is automatically introduced by the macroexpander whenever it encounters a function application. That means it effectively turns this:</p><pre><code class="pygments"><span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span></code></pre><p>…into this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">#%app</span> <span class="nb">+</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span></code></pre><p>What’s special about <code>#%app</code> is that the macroexpander will use whichever <code>#%app</code> is in scope in the expression’s lexical context, so if we write our own version of <code>#%app</code>, it will be used instead of the one from <code>#lang racket</code>. This is what we will use to hook into ordinary Racket expressions.</p><p>To write our custom version of <code>#%app</code>, we will use the usual tool: Racket’s industrial-strength macro-authoring DSL, <a href="http://docs.racket-lang.org/syntax/stxparse.html"><code>syntax/parse</code></a>. 
We’ll also use a helper library that provides some tools for pattern-matching on syntax objects with the <code>'paren-shape</code> syntax property, <a href="http://docs.racket-lang.org/syntax-classes/index.html#%28mod-path._syntax%2Fparse%2Fclass%2Fparen-shape%29"><code>syntax/parse/class/paren-shape</code></a>. Using these, we can transform expressions that are surrounded in curly braces differently from how we would transform expressions surrounded by parentheses:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="k">for-syntax</span> <span class="n">syntax/parse/class/paren-shape</span><span class="p">)</span>
<span class="p">(</span><span class="k">prefix-in</span> <span class="n">racket/base/</span> <span class="n">racket/base</span><span class="p">)</span>
<span class="n">syntax/parse/define</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-syntax-parser</span> <span class="k">#%app</span>
<span class="p">[{</span><span class="n">~braces</span> <span class="k">_</span> <span class="n">arg</span> <span class="k">...</span><span class="p">}</span>
<span class="o">#'</span><span class="p">(</span><span class="n">#%infix</span> <span class="n">arg</span> <span class="k">...</span><span class="p">)]</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">arg</span> <span class="k">...</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">racket/base/#%app</span> <span class="n">arg</span> <span class="k">...</span><span class="p">)])</span></code></pre><p>This code will transform any application surrounded by curly braces into one that starts with <code>#%infix</code> instead of <code>#%app</code>, so <code>{1 + 2}</code> will become <code>(#%infix 1 + 2)</code>, for example. The identifier <code>#%infix</code> isn’t special in any way; it just has a funny name. We haven’t defined it yet, though, so we need to do that next!</p><p>To start, we’ll just handle the simplest case: an infix expression with precisely three subexpressions, like <code>{1 + 2}</code>, should be converted into the equivalent prefix expression, in this case <code>(+ 1 2)</code>. We can do this with a simple macro:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-syntax-parser</span> <span class="n">#%infix</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">racket/base/#%app</span> <span class="n">op</span> <span class="n">a</span> <span class="n">b</span><span class="p">)])</span></code></pre><p>Due to the way Racket propagates syntax properties, we explicitly indicate that the resulting expansion should use the <code>#%app</code> from <code>racket/base</code>, which will avoid any accidental infinite recursion between our <code>#%app</code> and <code>#%infix</code>. With this in place, we can now try our code out in the REPL, and believe it or not, we now support infix expressions with just those few lines of code:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">3</span>
<span class="nb">></span> <span class="p">{</span><span class="mi">1</span> <span class="nb">+</span> <span class="mi">2</span><span class="p">}</span>
<span class="mi">3</span></code></pre><p>That’s pretty cool!</p><p>Of course, we probably want to support infix applications with more than just a single binary operator, such as <code>{1 + 2 + 3}</code>. We can implement that just by adding another case to <code>#%infix</code> that handles more subforms:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-syntax-parser</span> <span class="n">#%infix</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">racket/base/#%app</span> <span class="n">op</span> <span class="n">a</span> <span class="n">b</span><span class="p">)]</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span> <span class="n">more</span> <span class="k">...</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">#%infix</span> <span class="p">(</span><span class="n">#%infix</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span><span class="p">)</span> <span class="n">more</span> <span class="k">...</span><span class="p">)])</span></code></pre><p>…and now, just by adding those two lines, we support arbitrarily-large sequences of infix operators:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">{</span><span class="mi">1</span> <span class="nb">+</span> <span class="mi">2</span> <span class="nb">+</span> <span class="mi">3</span><span class="p">}</span>
<span class="mi">6</span>
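<span class="c1">; Not from the original post: a small hedged check using subtraction, which is</span>
<span class="c1">; not associative, so the left-to-right grouping becomes visible. Under the</span>
<span class="c1">; two-clause #%infix above, this should expand to (- (- 10 2) 3).</span>
<span class="nb">></span> <span class="p">{</span><span class="mi">10</span> <span class="nb">-</span> <span class="mi">2</span> <span class="nb">-</span> <span class="mi">3</span><span class="p">}</span>
<span class="mi">5</span>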
<span class="nb">></span> <span class="p">{</span><span class="mi">1</span> <span class="nb">+</span> <span class="mi">2</span> <span class="nb">+</span> <span class="mi">3</span> <span class="nb">+</span> <span class="mi">4</span><span class="p">}</span>
<span class="mi">10</span></code></pre><p>I don’t know about you, but I think being able to do this in fewer than 20 lines of code is pretty awesome. We can even mix different operators in the same expression, though since there is no notion of precedence, they are simply grouped from left to right:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">{</span><span class="mi">1</span> <span class="nb">+</span> <span class="mi">2</span> <span class="nb">*</span> <span class="mi">3</span> <span class="nb">-</span> <span class="mi">4</span><span class="p">}</span>
<span class="mi">5</span></code></pre><p>Of course, all of our infix expressions currently assume that all operators associate left, as was our plan. In general, though, there are lots of useful operators that associate right, such as <code>cons</code>, nested <code>-></code> types or contracts for curried functions, and <code>expt</code>, the exponentiation operator.</p><h3><a name="tracking-operator-fixity"></a>Tracking operator fixity</h3><p>Clearly, we need some way to associate operator fixity with certain identifiers, and we need to be able to do it at compile-time. Fortunately, Racket has a very robust mechanism for creating compile-time values. Unfortunately, simply associating metadata with an identifier is a little less convenient than it could be, but there is a general technique that can be done with little boilerplate.</p><p>Essentially, Racket (like Scheme) uses a <code>define-syntax</code> form to define macros, which is what <code>define-syntax-parser</code> eventually expands into. However, unlike Scheme, Racket’s <code>define-syntax</code> is not <em>just</em> for defining macros—it’s for defining arbitrary bindings with compile-time (aka “phase 1”) values. Using this, we can define bindings that have entirely arbitrary values at compile-time, including plain data like numbers or strings:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">foo</span> <span class="mi">3</span><span class="p">)</span></code></pre><p>Once a binding has been defined using <code>define-syntax</code>, a macro can look up the value associated with it by using the <a href="http://docs.racket-lang.org/reference/stxtrans.html#%28def._%28%28quote._~23~25kernel%29._syntax-local-value%29%29"><code>syntax-local-value</code></a> function, which returns the compile-time value associated with an identifier:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="nb">println</span> <span class="p">(</span><span class="nb">syntax-local-value</span> <span class="o">#'</span><span class="n">foo</span><span class="p">)))</span>
<span class="c1">; => 3</span></code></pre><p>The cool thing is that <code>syntax-local-value</code> gets the value associated with a specific <em>binding</em>, not a specific name. This means a macro can look up the compile-time value associated with an identifier provided to it as a subform. This is close to what we want, since we could use <code>syntax-local-value</code> to look up something associated with our infix operator bindings, but the trouble is that they would then cease to be usable as ordinary functions. For example, if you try and use the <code>foo</code> binding from the above example as an expression, Racket will complain about an “illegal use of syntax”, which makes sense, because <code>foo</code> is not bound to anything at runtime.</p><p>To solve this problem, we can use something of a trick: any compile-time binding that happens to have a procedure as its value will be treated like a macro—that is, using it as an expression will cause the macroexpander to invoke the procedure with a syntax object representing the macro invocation, and the procedure is expected to produce a new syntax object as output. Additionally, Racket programmers can make custom datatypes valid procedures by using the <a href="http://docs.racket-lang.org/reference/procedures.html#%28def._%28%28lib._racket%2Fprivate%2Fbase..rkt%29._prop~3aprocedure%29%29"><code>prop:procedure</code></a> structure type property.</p><p>If you are not familiar with the Racket macro system, this probably sounds rather complicated, but in practice, it’s not as confusing as it might seem. The trick here is to create a custom structure type at compile-time that we can use to track operator fixity alongside its runtime binding:</p><pre><code class="pygments"><span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="k">for-syntax</span> <span class="n">syntax/transformer</span><span class="p">))</span>
<span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">infix-operator</span> <span class="p">(</span><span class="n">runtime-binding</span> <span class="n">fixity</span><span class="p">)</span>
<span class="kd">#:property</span> <span class="nb">prop:procedure</span>
<span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">operator</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">((</span><span class="nb">set!-transformer-procedure</span>
<span class="p">(</span><span class="n">make-variable-like-transformer</span>
<span class="p">(</span><span class="n">infix-operator-runtime-binding</span> <span class="n">operator</span><span class="p">)))</span>
<span class="n">stx</span><span class="p">))))</span></code></pre><p>This is quite the magical incantation, and all the details of what is going on here are outside the scope of this blog post. Essentially, though, we can use values of this structure as a compile-time binding that will act just like the identifier provided for <code>runtime-binding</code>, but we can also include a value of our choosing for <code>fixity</code>. Here’s an example:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">::</span> <span class="p">(</span><span class="n">infix-operator</span> <span class="o">#'</span><span class="nb">cons</span> <span class="o">'</span><span class="ss">right</span><span class="p">))</span></code></pre><p>This new <code>::</code> binding will act, in every way, just like <code>cons</code>. If we use it in the REPL, you can see that it acts exactly the same:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">(</span><span class="n">::</span> <span class="mi">1</span> <span class="o">'</span><span class="p">())</span>
<span class="o">'</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span></code></pre><p>However, we can also use <code>syntax-local-value</code> to extract this binding’s fixity at compile-time, and that’s what makes it interesting:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="nb">println</span> <span class="p">(</span><span class="n">infix-operator-fixity</span> <span class="p">(</span><span class="nb">syntax-local-value</span> <span class="o">#'</span><span class="n">::</span><span class="p">))))</span>
<span class="c1">; => 'right</span></code></pre><p>Using this extra compile-time information, we can adjust our <code>#%infix</code> macro to inspect bindings and determine their fixity, then use that to make decisions about parsing. Just like we used <code>syntax/parse/class/paren-shape</code> to make decisions based on the <code>'paren-shape</code> syntax property, we can use <a href="http://docs.racket-lang.org/syntax-classes/index.html#%28mod-path._syntax%2Fparse%2Fclass%2Flocal-value%29"><code>syntax/parse/class/local-value</code></a> to pattern-match on bindings with a particular compile-time value. We’ll wrap this in a syntax class of our own to make the code easier to read:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">infix-op</span>
<span class="kd">#:description</span> <span class="s2">"infix operator"</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">fixity</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">{</span><span class="n">~var</span> <span class="n">op</span> <span class="p">(</span><span class="n">local-value</span> <span class="n">infix-operator?</span><span class="p">)}</span>
<span class="kd">#:attr</span> <span class="n">fixity</span> <span class="p">(</span><span class="n">infix-operator-fixity</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">op.local-value</span><span class="p">))]))</span></code></pre><p>Now, we can update <code>#%infix</code> to use our new <code>infix-op</code> syntax class:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-syntax-parser</span> <span class="n">#%infix</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">a</span> <span class="n">op:infix-op</span> <span class="n">b</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">racket/base/#%app</span> <span class="n">op</span> <span class="n">a</span> <span class="n">b</span><span class="p">)]</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">a</span> <span class="n">op:infix-op</span> <span class="n">b</span> <span class="n">more</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:when</span> <span class="p">(</span><span class="nb">eq?</span> <span class="o">'</span><span class="ss">left</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">op.fixity</span><span class="p">))</span>
<span class="o">#'</span><span class="p">(</span><span class="n">#%infix</span> <span class="p">(</span><span class="n">#%infix</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span><span class="p">)</span> <span class="n">more</span> <span class="k">...</span><span class="p">)]</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">more</span> <span class="k">...</span> <span class="n">a</span> <span class="n">op:infix-op</span> <span class="n">b</span><span class="p">)</span>
<span class="kd">#:when</span> <span class="p">(</span><span class="nb">eq?</span> <span class="o">'</span><span class="ss">right</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">op.fixity</span><span class="p">))</span>
<span class="o">#'</span><span class="p">(</span><span class="n">#%infix</span> <span class="n">more</span> <span class="k">...</span> <span class="p">(</span><span class="n">#%infix</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span><span class="p">))])</span></code></pre><p>Notably, we now require all operators to be bound to compile-time infix operator values, and we include two conditions via <code>#:when</code> clauses. These clauses check to ensure that the operator in question has the expected fixity before committing to that clause; if the condition fails, then parsing backtracks. Using this new definition of <code>#%infix</code>, we can successfully use <code>::</code> in an infix expression, and it will be parsed with the associativity that we expect:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">{</span><span class="mi">1</span> <span class="n">::</span> <span class="mi">2</span> <span class="n">::</span> <span class="mi">3</span> <span class="n">::</span> <span class="o">'</span><span class="p">()}</span>
<span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span></code></pre><p>Exciting!</p><h3><a name="a-nicer-interface-for-defining-infix-operators"></a>A nicer interface for defining infix operators</h3><p>We currently have to define infix operators by explicitly using <code>define-syntax</code>, but this is not a very good interface. Users of infix syntax probably don’t want to have to understand the internal workings of the infix operator implementation, so we just need to define one final macro to consider this done: the <code>define-infix-operator</code> form from the example at the very beginning of this blog post.</p><p>Fortunately, this macro is absolutely trivial to write. In fact, we can do it in a mere three lines of code, since it’s very minor sugar over the <code>define-syntax</code> definitions we were already writing:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">define-infix-operator</span> <span class="n">op:id</span> <span class="n">value:id</span>
<span class="kd">#:fixity</span> <span class="p">{</span><span class="n">~and</span> <span class="n">fixity</span> <span class="p">{</span><span class="n">~or</span> <span class="p">{</span><span class="n">~datum</span> <span class="n">left</span><span class="p">}</span> <span class="p">{</span><span class="n">~datum</span> <span class="n">right</span><span class="p">}}})</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">op</span> <span class="p">(</span><span class="n">infix-operator</span> <span class="o">#'</span><span class="n">value</span> <span class="o">'</span><span class="ss">fixity</span><span class="p">)))</span></code></pre><p>With this in hand, we can define some infix operators with a much nicer syntax:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">+</span> <span class="n">racket/base/+</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">-</span> <span class="n">racket/base/-</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">*</span> <span class="n">racket/base/*</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">/</span> <span class="n">racket/base//</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="n">^</span> <span class="nb">expt</span> <span class="kd">#:fixity</span> <span class="n">right</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="n">::</span> <span class="nb">cons</span> <span class="kd">#:fixity</span> <span class="n">right</span><span class="p">)</span></code></pre><p>With these simple definitions, we can write some very nice mathematical expressions that use infix syntax, in ordinary <code>#lang racket</code>:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">{</span><span class="mi">1</span> <span class="nb">+</span> <span class="mi">2</span> <span class="nb">-</span> <span class="mi">4</span><span class="p">}</span>
<span class="mi">-1</span>
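<span class="c1">; Not from the original post: a hedged sketch showing that the operators defined</span>
<span class="c1">; above should still behave like their runtime bindings, both in ordinary prefix</span>
<span class="c1">; position and when passed around as values.</span>
<span class="nb">></span> <span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">6</span>
<span class="nb">></span> <span class="p">(</span><span class="nb">map</span> <span class="n">^</span> <span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span> <span class="o">'</span><span class="p">(</span><span class="mi">2</span> <span class="mi">2</span> <span class="mi">2</span><span class="p">))</span>
<span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">4</span> <span class="mi">9</span><span class="p">)</span>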
<span class="nb">></span> <span class="p">{</span><span class="mi">2</span> <span class="n">^</span> <span class="mi">2</span> <span class="n">^</span> <span class="mi">3</span><span class="p">}</span>
<span class="mi">256</span>
<span class="nb">></span> <span class="p">{{</span><span class="mi">2</span> <span class="n">^</span> <span class="mi">2</span><span class="p">}</span> <span class="n">^</span> <span class="mi">3</span><span class="p">}</span>
<span class="mi">64</span></code></pre><p>And you know what’s most amazing about this? The entire thing is <strong>only 50 lines of code</strong>. Here is the entire implementation of infix operators from this blog post in a single code block, with absolutely nothing hidden or omitted:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="p">(</span><span class="k">for-syntax</span> <span class="n">syntax/parse/class/local-value</span>
<span class="n">syntax/parse/class/paren-shape</span>
<span class="n">syntax/transformer</span><span class="p">)</span>
<span class="p">(</span><span class="k">prefix-in</span> <span class="n">racket/base/</span> <span class="n">racket/base</span><span class="p">)</span>
<span class="n">syntax/parse/define</span><span class="p">)</span>
<span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">infix-operator</span> <span class="p">(</span><span class="n">runtime-binding</span> <span class="n">fixity</span><span class="p">)</span>
<span class="kd">#:property</span> <span class="nb">prop:procedure</span>
<span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">operator</span> <span class="n">stx</span><span class="p">)</span>
<span class="p">((</span><span class="nb">set!-transformer-procedure</span>
<span class="p">(</span><span class="n">make-variable-like-transformer</span>
<span class="p">(</span><span class="n">infix-operator-runtime-binding</span> <span class="n">operator</span><span class="p">)))</span>
<span class="n">stx</span><span class="p">)))</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">infix-op</span>
<span class="kd">#:description</span> <span class="s2">"infix operator"</span>
<span class="kd">#:attributes</span> <span class="p">[</span><span class="n">fixity</span><span class="p">]</span>
<span class="p">[</span><span class="n">pattern</span> <span class="p">{</span><span class="n">~var</span> <span class="n">op</span> <span class="p">(</span><span class="n">local-value</span> <span class="n">infix-operator?</span><span class="p">)}</span>
<span class="kd">#:attr</span> <span class="n">fixity</span> <span class="p">(</span><span class="n">infix-operator-fixity</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">op.local-value</span><span class="p">))]))</span>
<span class="p">(</span><span class="n">define-syntax-parser</span> <span class="k">#%app</span>
<span class="p">[{</span><span class="n">~braces</span> <span class="k">_</span> <span class="n">arg</span> <span class="k">...</span><span class="p">}</span>
<span class="o">#'</span><span class="p">(</span><span class="n">#%infix</span> <span class="n">arg</span> <span class="k">...</span><span class="p">)]</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">arg</span> <span class="k">...</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">racket/base/#%app</span> <span class="n">arg</span> <span class="k">...</span><span class="p">)])</span>
<span class="p">(</span><span class="n">define-syntax-parser</span> <span class="n">#%infix</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">a</span> <span class="n">op:infix-op</span> <span class="n">b</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="n">racket/base/#%app</span> <span class="n">op</span> <span class="n">a</span> <span class="n">b</span><span class="p">)]</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">a</span> <span class="n">op:infix-op</span> <span class="n">b</span> <span class="n">more</span> <span class="k">...</span><span class="p">)</span>
<span class="kd">#:when</span> <span class="p">(</span><span class="nb">eq?</span> <span class="o">'</span><span class="ss">left</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">op.fixity</span><span class="p">))</span>
<span class="o">#'</span><span class="p">(</span><span class="n">#%infix</span> <span class="p">(</span><span class="n">#%infix</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span><span class="p">)</span> <span class="n">more</span> <span class="k">...</span><span class="p">)]</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">more</span> <span class="k">...</span> <span class="n">a</span> <span class="n">op:infix-op</span> <span class="n">b</span><span class="p">)</span>
<span class="kd">#:when</span> <span class="p">(</span><span class="nb">eq?</span> <span class="o">'</span><span class="ss">right</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">op.fixity</span><span class="p">))</span>
<span class="o">#'</span><span class="p">(</span><span class="n">#%infix</span> <span class="n">more</span> <span class="k">...</span> <span class="p">(</span><span class="n">#%infix</span> <span class="n">a</span> <span class="n">op</span> <span class="n">b</span><span class="p">))])</span>
<span class="p">(</span><span class="n">define-simple-macro</span> <span class="p">(</span><span class="n">define-infix-operator</span> <span class="n">op:id</span> <span class="n">value:id</span>
<span class="kd">#:fixity</span> <span class="p">{</span><span class="n">~and</span> <span class="n">fixity</span> <span class="p">{</span><span class="n">~or</span> <span class="p">{</span><span class="n">~datum</span> <span class="n">left</span><span class="p">}</span> <span class="p">{</span><span class="n">~datum</span> <span class="n">right</span><span class="p">}}})</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">op</span> <span class="p">(</span><span class="n">infix-operator</span> <span class="o">#'</span><span class="n">value</span> <span class="o">'</span><span class="ss">fixity</span><span class="p">)))</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">+</span> <span class="n">racket/base/+</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">-</span> <span class="n">racket/base/-</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">*</span> <span class="n">racket/base/*</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="nb">/</span> <span class="n">racket/base//</span> <span class="kd">#:fixity</span> <span class="n">left</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="n">^</span> <span class="nb">expt</span> <span class="kd">#:fixity</span> <span class="n">right</span><span class="p">)</span>
<span class="p">(</span><span class="n">define-infix-operator</span> <span class="n">::</span> <span class="nb">cons</span> <span class="kd">#:fixity</span> <span class="n">right</span><span class="p">)</span></code></pre><p>Racket is a hell of a programming language.</p><h2><a name="applications-limitations-and-implications"></a>Applications, limitations, and implications</h2><p>This blog post has outlined a complete, useful model for infix operators, and it is now hopefully clear how they work, but many of the most interesting properties of this implementation are probably not obvious. As far as I can make out, this embedding of infix operators into a macro system is novel, and I am <em>almost certain</em> that the way this implementation tracks fixity information is unique. One of the most interesting capabilities gained from this choice of implementation is the ability for macros to define infix operators and control their fixity, even <em>locally</em>.</p><p>What does this mean? Well, remember that infix operators are just special syntax bindings. Racket includes a variety of forms for binding or adjusting macros locally, such as <code>let-syntax</code> and <code>syntax-parameterize</code>. Using these tools, it would be entirely possible to implement a <code>with-fixity</code> macro that could adjust the fixity of an operator within a syntactic block. This could be used, for example, to make <code>/</code> right associative within a block of code:</p><pre><code class="pygments"><span class="nb">></span> <span class="p">{</span><span class="mi">1</span> <span class="nb">/</span> <span class="mi">2</span> <span class="nb">/</span> <span class="mi">3</span><span class="p">}</span>
<span class="m">1/6</span>
<span class="nb">></span> <span class="p">(</span><span class="n">with-fixity</span> <span class="p">([</span><span class="nb">/</span> <span class="n">right</span><span class="p">])</span>
<span class="p">{</span><span class="mi">1</span> <span class="nb">/</span> <span class="mi">2</span> <span class="nb">/</span> <span class="mi">3</span><span class="p">})</span>
<span class="mi">1</span> <span class="m">1/2</span></code></pre><p>In fact, this macro is hardly theoretical, since it could be implemented in a trivial 7 lines, simply expanding to uses of <code>splicing-let</code> and <code>splicing-let-syntax</code> from <code>racket/splicing</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-simple-macro</span>
  <span class="p">(</span><span class="n">with-fixity</span> <span class="p">([</span><span class="n">op:id</span> <span class="p">{</span><span class="n">~and</span> <span class="n">fixity</span> <span class="p">{</span><span class="n">~or</span> <span class="p">{</span><span class="n">~datum</span> <span class="n">left</span><span class="p">}</span> <span class="p">{</span><span class="n">~datum</span> <span class="n">right</span><span class="p">}}}]</span> <span class="k">...</span><span class="p">)</span>
    <span class="n">body</span> <span class="k">...</span><span class="p">)</span>
  <span class="kd">#:with</span> <span class="p">[</span><span class="n">op-tmp</span> <span class="k">...</span><span class="p">]</span> <span class="p">(</span><span class="nb">generate-temporaries</span> <span class="o">#'</span><span class="p">[</span><span class="n">op</span> <span class="k">...</span><span class="p">])</span>
  <span class="p">(</span><span class="n">splicing-let</span> <span class="p">([</span><span class="n">op-tmp</span> <span class="n">op</span><span class="p">]</span> <span class="k">...</span><span class="p">)</span>
    <span class="p">(</span><span class="n">splicing-let-syntax</span> <span class="p">([</span><span class="n">op</span> <span class="p">(</span><span class="n">infix-operator</span> <span class="o">#'</span><span class="n">op-tmp</span> <span class="o">'</span><span class="ss">fixity</span><span class="p">)]</span> <span class="k">...</span><span class="p">)</span>
      <span class="n">body</span> <span class="k">...</span><span class="p">)))</span></code></pre><p>This is not especially useful given the current set of infix operator features, but it’s easy to imagine how useful it could be in a system that also supported a notion of precedence. It is not entirely uncommon to encounter certain expressions that could be more cleanly expressed with a local set of operator precedence rules, perhaps described as a set of relations <em>between</em> operators rather than a global table of magic precedence numbers. With traditional approaches to infix operators, parsing such code would be difficult without a very rigid syntactic structure, but this technique makes it easy.</p><p>As mentioned at the beginning of this blog post, this technique is also not merely a novelty—as of now, I am actively using this in <a href="https://github.com/lexi-lambda/hackett">Hackett</a> to support infix operators with all of the features outlined here. The Hackett implementation is a little bit fancier than the one in this blog post, since it works harder to produce better error messages. It explicitly disallows mixing left associative and right associative operators in the same expression, so it does some additional validation as part of expansion, and it arranges for source location information to be copied onto the result. 
It also makes a different design decision to allow <em>any</em> expression to serve as an infix operator, assuming left associativity if no fixity annotation is available.</p><p>If you’re interested in the code behind the additional steps Hackett takes to make infix operators more usable and complete, take a look at <a href="https://github.com/lexi-lambda/hackett/blob/0d177d00a9ee96f30dd76761f1cb86f15830779f/hackett-lib/hackett/private/infix.rkt">this file for the definition of infix bindings</a>, as well as <a href="https://github.com/lexi-lambda/hackett/blob/0d177d00a9ee96f30dd76761f1cb86f15830779f/hackett-lib/hackett/private/kernel.rkt#L80-L101">this file for the definition of infix application</a>. My hope is to eventually add support for some sort of precedence information, though who knows—maybe infix operators will be easier to reason about if the rules are kept extremely simple. I am also considering adding support for so-called “operator sections” at some point, which would allow things like <code>{_ - 1}</code> to serve as a shorthand for <code>(lambda [x] {x - 1})</code>, but I haven’t yet decided if I like the tradeoffs involved.</p><p>It’s possible that this implementation of infix operators might also be useful in languages in the Racket ecosystem besides Hackett. However, I’m not sure it makes a ton of sense in <code>#lang racket</code> without modifications, as variadic functions subsume many of the cases where infix operators are needed in Haskell. If there is a clamoring for this capability, I would be happy to consider extracting the functionality into a library, but as of right now, I don’t have any plans to do so.</p><p>Finally, the main point of this blog post is to showcase how easy it is to do things in Racket that would be impossible in most languages and difficult even in most Lisps. 
It also helps to show off how Hackett is already benefitting from those capabilities: while this particular feature is built into <code>#lang hackett</code>, there’s no reason something similar but more powerful couldn’t be built as a separate library by a <em>user</em> of Hackett. Even as Hackett’s author, I think that’s exciting, since it makes it possible for users to experiment with improvements to the language on their own. Some of those improvements may eventually be rolled into the core language or standard library, but many of them can likely live effectively in separate libraries, accessible on-demand to those who need them. After all, that’s one of Racket’s most important promises—languages as libraries—and it’s why Hackett is a part of the Racket ecosystem.</p><ol class="footnotes"></ol></article>Unit testing effectful Haskell with monad-mock2017-06-29T00:00:00Z2017-06-29T00:00:00ZAlexis King<article><p>Nearly eight months ago, <a href="/blog/2016/10/03/using-types-to-unit-test-in-haskell/">I wrote a blog post about unit testing effectful Haskell code</a> using a library called test-fixture. That library has served us well, but it wasn’t as easy to use as I would have liked, and it worked better with certain patterns than others. Since then, I’ve learned more about Haskell and more about testing, and I’m pleased to announce that I am releasing an entirely new testing library, <a href="https://hackage.haskell.org/package/monad-mock">monad-mock</a>.</p><h2><a name="a-first-glance-at-monad-mock"></a>A first glance at monad-mock</h2><p>The monad-mock library is, first and foremost, designed to be <em>easy</em>. It doesn’t ask much from you, and it requires almost zero boilerplate.</p><p>The first step is to write an mtl-style interface that encodes an effect you want to mock. 
For example, you might want to test some code that interacts with the filesystem:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">String</span>
  <span class="n">writeFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span></code></pre><p>Now you just have to write your code as normal. For demonstration purposes, here’s a function that defines copying a file in terms of <code>readFile</code> and <code>writeFile</code>:</p><pre><code class="pygments"><span class="nf">copyFile</span> <span class="ow">::</span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
<span class="nf">copyFile</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">contents</span> <span class="ow"><-</span> <span class="n">readFile</span> <span class="n">a</span>
  <span class="n">writeFile</span> <span class="n">b</span> <span class="n">contents</span></code></pre><p>Making this function work on the real filesystem is trivial, since we just need to define an instance of <code>MonadFileSystem</code> for <code>IO</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadFileSystem</span> <span class="kt">IO</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="ow">=</span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">readFile</span>
  <span class="n">writeFile</span> <span class="ow">=</span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">writeFile</span></code></pre><p>But how do we test this? Well, we <em>could</em> run some real code in <code>IO</code>, which might not be so bad for such a simple function, but this seems like a bad idea. For one thing, a bad implementation of <code>copyFile</code> could do some pretty horrible things if it misbehaved and decided to overwrite important files, and if you’re constantly running a test suite whenever a file changes, it’s easy to imagine causing a lot of damage. Running tests against the real filesystem also makes tests slower and harder to parallelize, and it only gets much worse once you are doing more complex effects than interacting with the filesystem.</p><p>Using monad-mock, we can test this function in just a couple of lines of code:</p><pre><code class="pygments"><span class="kr">import</span> <span class="nn">Control.Exception</span> <span class="p">(</span><span class="nf">evaluate</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Control.Monad.Mock</span>
<span class="kr">import</span> <span class="nn">Control.Monad.Mock.TH</span>
<span class="kr">import</span> <span class="nn">Data.Function</span> <span class="p">((</span><span class="o">&</span><span class="p">))</span>
<span class="kr">import</span> <span class="nn">Test.Hspec</span>
<span class="nf">makeMock</span> <span class="s">"FileSystemAction"</span> <span class="p">[</span><span class="n">ts</span><span class="o">|</span> <span class="kt">MonadFileSystem</span> <span class="o">|</span><span class="p">]</span>
<span class="nf">spec</span> <span class="ow">=</span> <span class="n">describe</span> <span class="s">"copyFile"</span> <span class="o">$</span>
  <span class="n">it</span> <span class="s">"reads a file and writes its contents to another file"</span> <span class="o">$</span>
    <span class="n">evaluate</span> <span class="o">$</span> <span class="n">copyFile</span> <span class="s">"foo.txt"</span> <span class="s">"bar.txt"</span>
      <span class="o">&</span> <span class="n">runMock</span> <span class="p">[</span> <span class="kt">ReadFile</span> <span class="s">"foo.txt"</span> <span class="kt">:-></span> <span class="s">"contents"</span>
                <span class="p">,</span> <span class="kt">WriteFile</span> <span class="s">"bar.txt"</span> <span class="s">"contents"</span> <span class="kt">:-></span> <span class="nb">()</span> <span class="p">]</span></code></pre><p>That’s it!</p><p>The last two lines of the above snippet are the really interesting bits: they specify the actions that are expected to be executed and couple each one with its result. You will find that if you tweak the list in any way, such as reordering the actions, eliminating one or both of them, or adding an additional action to the end, the test will fail. We could even turn this into a property-based test that generated arbitrary file paths and file contents.</p><p>Admittedly, in this trivial example, the mock is a little silly, since converting this into a property-based test would demonstrate how much we’ve basically just reimplemented the function in our test. However, once our function starts to do somewhat more complicated things, then our tests become more meaningful. Here’s a similar function that only copies a file if it is nonempty:</p><pre><code class="pygments"><span class="nf">copyNonemptyFile</span> <span class="ow">::</span> <span class="kt">MonadFileSystem</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
<span class="nf">copyNonemptyFile</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">contents</span> <span class="ow"><-</span> <span class="n">readFile</span> <span class="n">a</span>
  <span class="n">unless</span> <span class="p">(</span><span class="n">null</span> <span class="n">contents</span><span class="p">)</span> <span class="o">$</span>
    <span class="n">writeFile</span> <span class="n">b</span> <span class="n">contents</span></code></pre><p>This function has some logic which is very clearly <em>not</em> expressed in its type, and it would be difficult to encode that information into the type in a safe way. Fortunately, we can guarantee that it works by writing some tests:</p><pre><code class="pygments"><span class="nf">describe</span> <span class="s">"copyNonemptyFile"</span> <span class="o">$</span> <span class="kr">do</span>
  <span class="n">it</span> <span class="s">"copies a file with contents"</span> <span class="o">$</span>
    <span class="n">evaluate</span> <span class="o">$</span> <span class="n">copyNonemptyFile</span> <span class="s">"foo.txt"</span> <span class="s">"bar.txt"</span>
      <span class="o">&</span> <span class="n">runMock</span> <span class="p">[</span> <span class="kt">ReadFile</span> <span class="s">"foo.txt"</span> <span class="kt">:-></span> <span class="s">"contents"</span>
                <span class="p">,</span> <span class="kt">WriteFile</span> <span class="s">"bar.txt"</span> <span class="s">"contents"</span> <span class="kt">:-></span> <span class="nb">()</span> <span class="p">]</span>
  <span class="n">it</span> <span class="s">"does nothing with an empty file"</span> <span class="o">$</span>
    <span class="n">evaluate</span> <span class="o">$</span> <span class="n">copyNonemptyFile</span> <span class="s">"foo.txt"</span> <span class="s">"bar.txt"</span>
      <span class="o">&</span> <span class="n">runMock</span> <span class="p">[</span> <span class="kt">ReadFile</span> <span class="s">"foo.txt"</span> <span class="kt">:-></span> <span class="s">""</span> <span class="p">]</span></code></pre><p>These tests are much more useful, and they have some actual value to them. Imagine we had accidentally written <code>when</code> instead of <code>unless</code>, an easy typo to make. Our tests would fail with some useful error messages:</p><pre><code>1) copyNonemptyFile copies a file with contents
     uncaught exception: runMockT: expected the following unexecuted actions to be run:
       WriteFile "bar.txt" "contents"
2) copyNonemptyFile does nothing with an empty file
     uncaught exception: runMockT: expected end of program, called writeFile
       given action: WriteFile "bar.txt" ""
</code></pre><p>You now know enough to write tests with monad-mock.</p><h2><a name="why-unit-test"></a>Why unit test?</h2><p>When the issue of testing is brought up in Haskell, it is often treated with a certain distaste by a portion of the community. There are some points I’ve seen a number of times, and though they take different forms, they boil down to two ideas:</p><ol><li><p>“Haskell code does not need tests because the type system can prove correctness.”</p></li><li><p>“Testing in Haskell is trivial because it is a pure language, and testing pure functions is easy.”</p></li></ol><p>I’ve been writing Haskell professionally for over a year now, and I can happily say that there <em>is</em> some truth to both of those things! When my Haskell code typechecks, I feel a confidence in it that I would not feel were I using a language with a less powerful type system. Furthermore, Haskell encourages a “pure core, impure shell” approach to system design that makes testing many things pleasant and straightforward, and it completely eliminates the worry of subtle nondeterminism leaking into tests.</p><p>That said, Haskell is not a proof assistant, and its type system cannot guarantee everything, especially for code that operates on the boundaries of what Haskell can control. For much the same reason, I find that my pure code is the code I am <em>least</em> likely to need to test, since it is also the code with the strongest type safety guarantees, operating on types in my application’s domain. In contrast, the effectful code is often what I find the most value in extensively testing, since it often contains the most subtle complexity, and it is frequently difficult or even impossible to encode into types.</p><p>Haskell has the power to provide remarkably strong correctness guarantees with a surprisingly small amount of effort by using a combination of tests and types, using each to accommodate for the other’s weaknesses and playing to each technique’s strengths. 
Some code is test-driven, other code is type-driven. Most code ends up being a mix of both. Testing is just a tool like any other, and it’s nice to feel confident in one’s ability to effectively structure code in a decoupled, testable manner.</p><h2><a name="why-mock"></a>Why mock?</h2><p>Even if you accept that testing is good, the question of whether or not to <em>mock</em> is a subtler issue. To some people, “unit testing” is synonymous with mocks. This is emphatically not true, and in fact, overly aggressive mocking is one of the best ways to make your test suite completely worthless. The monad-mock approach to mocking is a bit more principled than mocking in many dynamic, object-oriented languages, but it comes with many of the same drawbacks: mocks couple your tests to your implementation in ways that make them less valuable and less meaningful.</p><p>For the <code>MonadFileSystem</code> example above, I would actually probably <em>not</em> use a mock. Instead, I would use a <strong>fake</strong>, in-memory filesystem implementation:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">FakeFileSystemT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">FakeFileSystemT</span> <span class="p">(</span><span class="kt">StateT</span> <span class="p">[(</span><span class="kt">FilePath</span><span class="p">,</span> <span class="kt">String</span><span class="p">)]</span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">)</span>
<span class="nf">fakeFileSystemT</span> <span class="ow">::</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="p">[(</span><span class="kt">FilePath</span><span class="p">,</span> <span class="kt">String</span><span class="p">)]</span>
                <span class="ow">-></span> <span class="kt">FakeFileSystemT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="p">[(</span><span class="kt">FilePath</span><span class="p">,</span> <span class="kt">String</span><span class="p">)])</span>
<span class="nf">fakeFileSystemT</span> <span class="n">fs</span> <span class="p">(</span><span class="kt">FakeFileSystemT</span> <span class="n">x</span><span class="p">)</span> <span class="ow">=</span> <span class="n">second</span> <span class="n">sort</span> <span class="o"><$></span> <span class="n">runStateT</span> <span class="n">x</span> <span class="n">fs</span>
<span class="kr">instance</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">FakeFileSystemT</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="n">path</span> <span class="ow">=</span> <span class="kt">FakeFileSystemT</span> <span class="o">$</span> <span class="n">get</span> <span class="o">>>=</span> <span class="nf">\</span><span class="n">fs</span> <span class="ow">-></span> <span class="n">lookup</span> <span class="n">path</span> <span class="n">fs</span> <span class="o">&</span>
    <span class="n">maybe</span> <span class="p">(</span><span class="n">fail</span> <span class="o">$</span> <span class="s">"readFile: no such file ‘"</span> <span class="o">++</span> <span class="n">path</span> <span class="o">++</span> <span class="s">"’"</span><span class="p">)</span> <span class="n">return</span>
  <span class="n">writeFile</span> <span class="n">path</span> <span class="n">contents</span> <span class="ow">=</span> <span class="kt">FakeFileSystemT</span> <span class="o">.</span> <span class="n">modify</span> <span class="o">$</span> <span class="nf">\</span><span class="n">fs</span> <span class="ow">-></span>
    <span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">contents</span><span class="p">)</span> <span class="kt">:</span> <span class="n">filter</span> <span class="p">((</span><span class="o">/=</span> <span class="n">path</span><span class="p">)</span> <span class="o">.</span> <span class="n">fst</span><span class="p">)</span> <span class="n">fs</span></code></pre><p>The above snippet demonstrates how easy it is to define a <code>MonadFileSystem</code> implementation in terms of <code>StateT</code>, and while this may seem like a lot of boilerplate, it really isn’t. You have to write a fake <em>once</em> per interface, and the above block is a minuscule twelve lines of code. With this technique, you are still able to write tests that depend on the state of the filesystem before and after running the implementation, but you decouple yourself from the precise process of getting there:</p><pre><code class="pygments"><span class="nf">describe</span> <span class="s">"copyNonemptyFile"</span> <span class="o">$</span> <span class="kr">do</span>
  <span class="n">it</span> <span class="s">"copies a file with contents"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="p">(</span><span class="nb">()</span><span class="p">,</span> <span class="n">fs</span><span class="p">)</span> <span class="ow">=</span> <span class="n">runIdentity</span> <span class="o">$</span> <span class="n">copyNonemptyFile</span> <span class="s">"foo.txt"</span> <span class="s">"bar.txt"</span>
          <span class="o">&</span> <span class="n">fakeFileSystemT</span> <span class="p">[</span> <span class="p">(</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">"contents"</span><span class="p">)</span> <span class="p">]</span>
    <span class="n">fs</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span> <span class="p">(</span><span class="s">"bar.txt"</span><span class="p">,</span> <span class="s">"contents"</span><span class="p">),</span> <span class="p">(</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">"contents"</span><span class="p">)</span> <span class="p">]</span>
  <span class="n">it</span> <span class="s">"does nothing with an empty file"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="p">(</span><span class="nb">()</span><span class="p">,</span> <span class="n">fs</span><span class="p">)</span> <span class="ow">=</span> <span class="n">runIdentity</span> <span class="o">$</span> <span class="n">copyNonemptyFile</span> <span class="s">"foo.txt"</span> <span class="s">"bar.txt"</span>
          <span class="o">&</span> <span class="n">fakeFileSystemT</span> <span class="p">[</span> <span class="p">(</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span> <span class="p">]</span>
    <span class="n">fs</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span> <span class="p">(</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span> <span class="p">]</span></code></pre><p>This is better than using a mock, and I would highly recommend doing it if you can! However, a lot of real applications have to interact with services of much greater complexity than an idealized filesystem, and creating that sort of in-memory fake is not always practical. One such situation might be interacting with AWS CloudFormation, for example:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadAWS</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">createStack</span> <span class="ow">::</span> <span class="kt">StackName</span> <span class="ow">-></span> <span class="kt">StackTemplate</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">AWSError</span> <span class="kt">StackId</span><span class="p">)</span>
  <span class="n">listStacks</span> <span class="ow">::</span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">AWSError</span> <span class="p">[</span><span class="kt">StackSummaries</span><span class="p">])</span>
  <span class="n">describeStack</span> <span class="ow">::</span> <span class="kt">StackId</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">AWSError</span> <span class="kt">StackInfo</span><span class="p">)</span>
  <span class="c1">-- and so on...</span></code></pre><p>AWS is a very complex system, and it can do dozens of different things (and fail in dozens of different ways) based on an equally complex set of inputs. For example, in the above API, <code>createStack</code> needs to parse its template, which can be YAML or JSON, in order to determine which of many possible errors and behaviors can be produced, both on the initial call and on subsequent ones.</p><p>Creating a fake implementation of <em>AWS</em> is hardly feasible, and this is where a mock can be useful. By simply writing <code>makeMock "AWSAction" [ts| MonadAWS |]</code>, we can test functions that interact with AWS in a pure way without necessarily needing to replicate all of its complexity.</p><h3><a name="isolating-mocks"></a>Isolating mocks</h3><p>Of course, tests that use mocks provide less value than tests that use “smarter” fakes, since they are far more tightly coupled to the implementation, and it’s dramatically more likely that you will need to change the tests when you change the logic. To avoid this, it can be helpful to create multiple interfaces to the same thing: a high-level interface and a low-level one. If our above <code>MonadAWS</code> is a low-level interface, we could create a high-level counterpart that does precisely what our application needs:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadDeploy</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">executeDeployment</span> <span class="ow">::</span> <span class="kt">Deployment</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">DeployError</span> <span class="nb">()</span><span class="p">)</span></code></pre><p>When running our application “for real”, we would use <code>MonadAWS</code> to implement <code>MonadDeploy</code>:</p><pre><code class="pygments"><span class="nf">executeDeploymentImpl</span> <span class="ow">::</span> <span class="kt">MonadAWS</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">Deployment</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">DeployError</span> <span class="nb">()</span><span class="p">)</span>
<span class="nf">executeDeploymentImpl</span> <span class="ow">=</span> <span class="o">...</span></code></pre><p>The nice thing about this is we can actually test <code>executeDeploymentImpl</code> using a <code>MonadAWS</code> mock, so we can still have unit test coverage of the code on the boundaries of our system! Additionally, by containing the mock to a single place, we can test the rest of our code using a smarter fake implementation of <code>MonadDeploy</code>, helping to decouple our code from AWS’s complex API and improve the reliability and usefulness of our test suite.</p><p>The key point here is that mocking is just a small piece of the larger testing puzzle in <em>any</em> language, and that is just as true in Haskell. An overemphasis on mocking is an easy way to end up with a test suite that feels useless, probably because it is. Use mocks as a technique to insulate your application from the complexity in others’ APIs, then use more domain-specific testing techniques and type-level assertions to ensure the correctness of your logic.</p><h2><a name="how-monad-mock-works"></a>How monad-mock works</h2><p>If you’ve read this far and are convinced that monad-mock is useful, you may safely stop reading now. 
However, if you are interested in the details of what it actually does and what makes it tick, the rest of this blog post is going to focus on how the implementation works and how it compares to other techniques.</p><p>The centerpiece of monad-mock’s API is its monad transformer, <code>MockT</code>, which is a type constructor that accepts three arguments:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">MockT</span> <span class="p">(</span><span class="n">f</span> <span class="ow">::</span> <span class="o">*</span> <span class="ow">-></span> <span class="o">*</span><span class="p">)</span> <span class="p">(</span><span class="n">m</span> <span class="ow">::</span> <span class="o">*</span> <span class="ow">-></span> <span class="o">*</span><span class="p">)</span> <span class="p">(</span><span class="n">a</span> <span class="ow">::</span> <span class="o">*</span><span class="p">)</span></code></pre><p>The <code>m</code> and <code>a</code> type variables obviously correspond to the usual monad transformer arguments, which represent the underlying monad and the result of the monadic computation, respectively. The <code>f</code> variable is more interesting, since it’s what makes <code>MockT</code> work at all, and it isn’t even a type: it’s a type constructor with kind <code>* -> *</code>. 
What does it mean?</p><p>Looking at the type signature of <code>runMockT</code> gives us a little bit more information about what that <code>f</code> actually represents:</p><pre><code class="pygments"><span class="nf">runMockT</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">Action</span> <span class="n">f</span><span class="p">,</span> <span class="kt">Monad</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=></span> <span class="p">[</span><span class="kt">WithResult</span> <span class="n">f</span><span class="p">]</span> <span class="ow">-></span> <span class="kt">MockT</span> <span class="n">f</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span></code></pre><p>This type signature provides two pieces of key information:</p><ol><li><p>The <code>f</code> parameter is constrained by the <code>Action f</code> constraint.</p></li><li><p>Running a mocked computation requires supplying a list of <code>WithResult f</code> values. This list corresponds to the list of expectations provided to <code>runMock</code> in earlier examples.</p></li></ol><p>To understand both of these things, it helps to examine the definition of an actual datatype that can have an <code>Action</code> instance. For the filesystem example, the action datatype looks like this:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">FileSystemAction</span> <span class="n">r</span> <span class="kr">where</span>
  <span class="kt">ReadFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">FileSystemAction</span> <span class="kt">String</span>
  <span class="kt">WriteFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="kt">FileSystemAction</span> <span class="nb">()</span></code></pre><p>Notice how each constructor clearly corresponds to one of the methods of <code>MonadFileSystem</code>, with a type to match. Now the purpose of the type provided to the <code>FileSystemAction</code> constructor (in this case <code>r</code>) should hopefully become clear: it represents the type of the value <em>produced</em> by each method. Also note that the type is completely phantom—it does not appear in negative position in any of the constructors.</p><p>With this in mind, we can take a look at the definition of <code>WithResult</code>:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">WithResult</span> <span class="n">f</span> <span class="kr">where</span>
<span class="p">(</span><span class="kt">:-></span><span class="p">)</span> <span class="ow">::</span> <span class="n">f</span> <span class="n">r</span> <span class="ow">-></span> <span class="n">r</span> <span class="ow">-></span> <span class="kt">WithResult</span> <span class="n">f</span></code></pre><p>This is what defines the <code>(:->)</code> constructor from earlier in the blog post, and you can see that it effectively just represents a tuple of an action and a value of its associated result. It’s completely type-safe, since it ensures the result matches the type argument to the action.</p><p>Finally, this brings us to the <code>Action</code> class, which is not complex, but is unfortunately necessary:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Action</span> <span class="n">f</span> <span class="kr">where</span>
<span class="n">eqAction</span> <span class="ow">::</span> <span class="n">f</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">f</span> <span class="n">b</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="p">(</span><span class="n">a</span> <span class="kt">:~:</span> <span class="n">b</span><span class="p">)</span>
<span class="n">showAction</span> <span class="ow">::</span> <span class="n">f</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">String</span></code></pre><p>Notice that these methods are effectively just <code>(==)</code> and <code>show</code>, lifted to type constructors of kind <code>* -> *</code>. One significant difference is that <code>eqAction</code> produces <code>Maybe (a :~: b)</code> instead of <code>Bool</code>, where <code>(:~:)</code> is from <code>Data.Type.Equality</code>. This is a type equality witness, which means a successful equality between two values allows the compiler to be sure that the two <em>types</em> are equal. This is necessary for the implementation of <code>runMockT</code> due to the phantom type in actions—in order to convince GHC that we can properly return the result of a mocked action, we need to assure it that the value we’re going to return is actually of the proper type.</p><p>Implementing this typeclass is not particularly burdensome, but it’s entirely boilerplate, so even if you want to define your own action type (that is, you don’t want to use <code>makeMock</code>), you can use the <code>deriveAction</code> function from <code>Control.Monad.Mock.TH</code> to derive an <code>Action</code> instance on an existing datatype.</p><h3><a name="connecting-the-mock-to-its-class"></a>Connecting the mock to its class</h3><p>Now that we have an action with which to mock a class, we need to actually define an instance of that class for <code>MockT</code>. 
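</p><p>As a brief aside before doing so, here is a rough sketch of what a handwritten <code>Action</code> instance for <code>FileSystemAction</code> might look like, to make the boilerplate mentioned above concrete. The class and datatype are redeclared locally so the example compiles on its own; in real code they would come from monad-mock and the earlier definitions. The <code>lookupResult</code> helper is hypothetical and exists only to show how the <code>Refl</code> witness gets used, and the instance that <code>deriveAction</code> generates may well differ in its details, such as how <code>showAction</code> formats its output.</p><pre><code class="pygments">{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeOperators #-}

import Data.Type.Equality ((:~:)(Refl))

-- Local copy of the class shown above, so this sketch is self-contained.
class Action f where
  eqAction :: f a -> f b -> Maybe (a :~: b)
  showAction :: f a -> String

-- Local copy of the action datatype from the filesystem example.
data FileSystemAction r where
  ReadFile :: FilePath -> FileSystemAction String
  WriteFile :: FilePath -> String -> FileSystemAction ()

-- Handwritten instance (an assumption about what the boilerplate looks like).
-- Two actions are equal when they use the same constructor with equal
-- arguments; matching the same constructor on both sides fixes both result
-- types to the same type, which is why returning Refl typechecks.
instance Action FileSystemAction where
  eqAction (ReadFile a) (ReadFile b)
    | a == b = Just Refl
  eqAction (WriteFile a x) (WriteFile b y)
    | (a, x) == (b, y) = Just Refl
  eqAction _ _ = Nothing

  showAction (ReadFile a) = "ReadFile " ++ show a
  showAction (WriteFile a x) = "WriteFile " ++ show a ++ " " ++ show x

-- Hypothetical helper: given the action that was requested, plus a recorded
-- action and its result, hand back the result only if the actions match.
-- Matching on Refl is what convinces GHC that a result of type b is an a.
lookupResult :: Action f => f a -> f b -> b -> Maybe a
lookupResult wanted recorded result =
  case eqAction wanted recorded of
    Just Refl -> Just result
    Nothing -> Nothing

main :: IO ()
main = do
  putStrLn (showAction (WriteFile "out.txt" "hello"))
  -- Same action: the recorded result is returned.
  print (lookupResult (ReadFile "a.txt") (ReadFile "a.txt") "contents")
  -- Different path: no match, so no result.
  print (lookupResult (ReadFile "a.txt") (ReadFile "b.txt") "contents")</code></pre><p>With an instance like this available, whether handwritten or derived, what remains is the <code>MockT</code> instance itself.</p><p>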
For this process, monad-mock provides a <code>mockAction</code> function with the following type:</p><pre><code class="pygments"><span class="nf">mockAction</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">Action</span> <span class="n">f</span><span class="p">,</span> <span class="kt">Monad</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">String</span> <span class="ow">-></span> <span class="n">f</span> <span class="n">r</span> <span class="ow">-></span> <span class="kt">MockT</span> <span class="n">f</span> <span class="n">m</span> <span class="n">r</span></code></pre><p>This function accepts two arguments: the name of the method being mocked and the action that represents the current call. This is easier to illustrate with an actual instance of <code>MonadFileSystem</code> using <code>MockT</code> and our <code>FileSystemAction</code> type:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFileSystem</span> <span class="p">(</span><span class="kt">MockT</span> <span class="kt">FileSystemAction</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">readFile</span> <span class="n">a</span> <span class="ow">=</span> <span class="n">mockAction</span> <span class="s">"readFile"</span> <span class="p">(</span><span class="kt">ReadFile</span> <span class="n">a</span><span class="p">)</span>
<span class="n">writeFile</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">=</span> <span class="n">mockAction</span> <span class="s">"writeFile"</span> <span class="p">(</span><span class="kt">WriteFile</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span></code></pre><p>This allows <code>readFile</code> and <code>writeFile</code> to defer to the mock, and providing the names of the functions as strings helps monad-mock produce useful error messages upon failure. Internally, <code>MockT</code> is a <code>StateT</code> that keeps track of a list of <code>WithResult f</code> values as its state. Each call to the mock checks the action against the internal list of calls, and if they match, it returns the associated result. Otherwise, it throws an exception.</p><p>This scheme is simple, but it seems to work remarkably well. There are some obvious enhancements that will probably become necessary eventually, like allowing action results that run in the underlying monad <code>m</code> in order to support things like <code>throwError</code> from <code>MonadError</code>, but so far, we haven’t needed them for what we’ve been using it for. Certain tricky signatures defy this simple technique, such as signatures where a monadic action appears in a negative position (that is, the kind of signatures that require something like <a href="https://hackage.haskell.org/package/monad-control">monad-control</a> or <a href="https://hackage.haskell.org/package/monad-unlift">monad-unlift</a>), but we’ve found that most of our effects don’t have any reason to include such signatures.</p><h2><a name="a-brief-comparison-with-free-r-monads"></a>A brief comparison with free(r) monads</h2><p>At this point, astute readers will likely be thinking about free monads, which parts of this technique greatly resemble. 
The representation of actions as GADTs is especially close to that of <a href="https://hackage.haskell.org/package/freer">freer</a>, which takes a very similar approach. Indeed, you can think of this technique as something that combines a freer-style representation with mtl-style classes. Given that freer already does this, you might ask yourself what the point is.</p><p>If you are already sold on free monads, monad-mock may very well be uninteresting to you. From the perspective of theoretical novelty, monad-mock is not anything new or different. However, there are a variety of practical reasons to prefer mtl over free, and it’s nice to see how easy it is to enjoy the testing benefits of free without too much extra effort.</p><p>An in-depth comparison between mtl and free is well outside the scope of this blog post. However, the key point is that this technique <em>only</em> affects test code, so the real runtime implementation will not be affected in any way. This means you can take advantage of the performance benefits and ecosystem support of mtl without sacrificing simple, expressive testing.</p><h2><a name="conclusion"></a>Conclusion</h2><p>To cap things off, I want to emphasize monad-mock’s role as a single part of a larger effort we’ve been making for the better part of the past eighteen months. Haskell is a language with ever-evolving techniques and style, and it’s sometimes dizzying to figure out how to use all the pieces together to develop robust, maintainable applications. While monad-mock might not be anything drastically different from existing testing techniques, my hope is that it can provide an opinionated mechanism to make testing easy and accessible, even for complex interactions with other services and systems.</p><p>I’ve made an effort to make it abundantly clear in this blog post that monad-mock is <em>not</em> a silver bullet for testing, and in fact, I would prefer other techniques for ensuring correctness whenever possible. 
Even so, mocking is a nice tool to have in your toolbox, and it’s a good fallback to get even the worst APIs under test coverage.</p><p>If you want to try out monad-mock for yourself, <a href="https://hackage.haskell.org/package/monad-mock">take a look at the documentation on Hackage</a> and start playing around! It’s still early software, so it’s not the most proven or featureful, but we’ve managed to get mileage out of it already, all the same. If you find any problems, have a use case it does not support, or just find something about it unclear, please do not hesitate to <a href="https://github.com/cjdev/monad-mock">open an issue on the GitHub repository</a>—we obviously can’t fix issues we don’t know about.</p><p>Thanks as always to the many people who have contributed ideas that have shaped my philosophy and approach to testing and have helped provide the tools that make this library work. Happy testing!</p><ol class="footnotes"></ol></article>Realizing Hackett, a metaprogrammable Haskell2017-05-27T00:00:00Z2017-05-27T00:00:00ZAlexis King<article><p><a href="/blog/2017/01/02/rascal-a-haskell-with-more-parentheses/">Almost five months ago, I wrote a blog post about my new programming language, Hackett</a>, a fanciful sketch of a programming language from a far-off land with Haskell’s type system and Racket’s macros. At that point in time, I had a little prototype that barely worked, that I barely understood, and was a little bit of a technical dead-end. People saw the post, they got excited, but development sort of stopped.</p><p>Then, almost two months ago, I took a second stab at the problem in earnest. I read a lot, I asked a lot of people for help, and eventually I got something sort of working. 
Suddenly, <a href="https://github.com/lexi-lambda/hackett">Hackett is not only real, it’s working, and you can try it out yourself</a>!</p><h2><a name="a-first-look-at-hackett"></a>A first look at Hackett</h2><p>Hackett is still very new, very experimental, and an enormous work in progress. However, that doesn’t mean it’s useless! Hackett is already a remarkably capable programming language. Let’s take a quick tour.</p><p>As Racket law decrees it, every Hackett program must begin with <code>#lang</code>. We can start with the appropriate incantation:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">hackett</span></code></pre><p>If you’re using DrRacket or racket-mode with background expansion enabled, then congratulations: the typechecker is online. We can begin by writing a well-typed, albeit boring program:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">hackett</span>
<span class="p">(</span><span class="n">main</span> <span class="p">(</span><span class="nb">println</span> <span class="s2">"Hello, world!"</span><span class="p">))</span></code></pre><p>In Hackett, a use of <code>main</code> at the top level indicates that running the module as a program should execute some <code>IO</code> action. In this case, <code>println</code> is a function of type <code>{String -> (IO Unit)}</code>. Just like Haskell, Hackett is pure, and the runtime will figure out how to actually run an <code>IO</code> value. If you run the above program, you will notice that it really does print out <code>Hello, world!</code>, exactly as we would like.</p><p>Of course, hello world programs are boring—so imperative! We are functional programmers, and we have our <em>own</em> class of equally boring programs we must write when learning a new language. How about some Fibonacci numbers?</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">hackett</span>
<span class="p">(</span><span class="n">def</span> <span class="n">fibs</span> <span class="n">:</span> <span class="p">(</span><span class="n">List</span> <span class="n">Integer</span><span class="p">)</span>
<span class="p">{</span><span class="mi">0</span> <span class="n">::</span> <span class="mi">1</span> <span class="n">::</span> <span class="p">(</span><span class="n">zip-with</span> <span class="nb">+</span> <span class="n">fibs</span> <span class="p">(</span><span class="n">tail!</span> <span class="n">fibs</span><span class="p">))})</span>
<span class="p">(</span><span class="n">main</span> <span class="p">(</span><span class="nb">println</span> <span class="p">(</span><span class="n">show</span> <span class="p">(</span><span class="nb">take</span> <span class="mi">10</span> <span class="n">fibs</span><span class="p">))))</span></code></pre><p>Again, Hackett is just like Haskell in that it is <em>lazy</em>, so we can construct an infinite list of Fibonacci numbers, and the runtime will happily do nothing at all. When we call <code>take</code>, we realize the first ten numbers in the list, and when you run the program, you should see them printed out, clear as day!</p><p>But these programs are boring. Printing strings and laziness may have been novel when you first learned about them, but if you’re reading this blog post, my bet is that you probably <em>aren’t</em> new to programming. How about something more interesting, <strong>like a web server</strong>?</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">hackett</span>
<span class="p">(</span><span class="k">require</span> <span class="n">hackett/demo/web-server</span><span class="p">)</span>
<span class="p">(</span><span class="n">data</span> <span class="n">Greeting</span> <span class="p">(</span><span class="n">greeting</span> <span class="n">String</span><span class="p">))</span>
<span class="p">(</span><span class="n">instance</span> <span class="p">(</span><span class="n">->Body</span> <span class="n">Greeting</span><span class="p">)</span>
<span class="p">[</span><span class="n">->body</span> <span class="p">(</span><span class="k">λ</span> <span class="p">[(</span><span class="n">greeting</span> <span class="n">name</span><span class="p">)]</span> <span class="p">{</span><span class="s2">"Hello, "</span> <span class="n">++</span> <span class="n">name</span> <span class="n">++</span> <span class="s2">"!"</span><span class="p">})])</span>
<span class="p">(</span><span class="n">defserver</span> <span class="n">run-server</span>
<span class="p">[</span><span class="n">GET</span> <span class="s2">"/"</span> <span class="k">-></span> <span class="n">String</span> <span class="k">=></span> <span class="s2">"Hello, world!"</span><span class="p">]</span>
<span class="p">[</span><span class="n">GET</span> <span class="s2">"greet"</span> <span class="k">-></span> <span class="n">String</span> <span class="k">-></span> <span class="n">Greeting</span> <span class="k">=></span> <span class="n">greeting</span><span class="p">])</span>
<span class="p">(</span><span class="n">main</span> <span class="p">(</span><span class="k">do</span> <span class="p">(</span><span class="nb">println</span> <span class="s2">"Running server on port 8080."</span><span class="p">)</span>
<span class="p">(</span><span class="n">run-server</span> <span class="mi">8080</span><span class="p">)))</span></code></pre><pre><code class="pygments">$ racket my-server.rkt
Running server on port <span class="m">8080</span>.
^Z
$ <span class="nb">bg</span>
$ curl <span class="s1">'http://localhost:8080/greet/Alexis'</span>
Hello, Alexis!</code></pre><p><strong>Welcome to Hackett.</strong></p><h2><a name="what-is-hackett"></a>What is Hackett?</h2><p>Excited yet? I hope so. I certainly am.</p><p>Before you get a little <em>too</em> excited, however, let me make a small disclaimer: the above program, while quite real, is a demo. It is certainly not a production web framework, and it actually just uses the Racket web server under the hood. It does not handle very many things right now. You cannot use it to build your super awesome webapp, and even if you could, I would not recommend attempting to do so.</p><p>All that said, it is a <em>real</em> tech demo, and it shows off the potential for Hackett to do some pretty cool things. While the server implementation is just reusing Racket’s dynamically typed web server, the Hackett interface to it is 100% statically typed, and the above example shows off a host of features:</p><ul><li><p><strong>Algebraic datatypes.</strong> Hackett has support for basic ADTs, including recursive datatypes (though not yet mutually recursive datatypes).</p></li><li><p><strong>Typeclasses.</strong> The demo web server uses a <code>->Body</code> typeclass to render server responses, and this module implements a <code>->Body</code> instance for the custom <code>Greeting</code> datatype.</p></li><li><p><strong>Macros.</strong> The <code>defserver</code> macro provides a concise, readable, <em>type safe</em> way to define a simple, RESTful web server. It defines two endpoints, a homepage and a greeting, and the latter parses a segment from the URL.</p></li><li><p><strong>Static typechecking.</strong> Obviously. If you try and change the homepage endpoint to produce a number instead of a string, you will get a type error! 
Alternatively, try removing the <code>->Body</code> instance and see what happens.</p></li><li><p><strong>Infix operators.</strong> In Hackett, <code>{</code> curly braces <code>}</code> enter <em>infix mode</em>, which permits arbitrary infix operators. Most Lisps have variadic functions, so infix operators are not strictly necessary, but Hackett only supports curried, single-argument functions, so infix operators are some especially sweet sugar.</p></li><li><p><strong>Pure, monadic I/O.</strong> The <code>println</code> and <code>run-server</code> functions both produce <code>(IO Unit)</code>, and <code>IO</code> is a monad. <code>do</code> notation is provided as a macro, and it works with any type that implements the <code>Monad</code> typeclass.</p></li></ul><p>All these features are already implemented, and they really work! Of course, you might look at this list and be a little confused: sure, there are macros, but all these other things are firmly Haskellisms. If you thought that, you’d be quite right! <strong>Hackett is much closer to Haskell than to Racket, even though it is syntactically a Lisp.</strong> Keep this guiding principle in mind as you read this blog post or explore Hackett. Where Haskell and Racket conflict, Hackett usually prefers Haskell.</p><p>For a bit more information about what Hackett is and what it aims to be, <a href="/blog/2017/01/02/rascal-a-haskell-with-more-parentheses/">check out my blog post from a few months ago</a>, back when Hackett was called Rascal. 
I won’t reiterate everything I said there, but I do want to give a bit of a status update, explain what I’ve been working on, and hopefully give you some idea about where Hackett is going.</p><h2><a name="the-story-so-far-and-getting-to-hackett-0-1"></a>The story so far, and getting to Hackett 0.1</h2><p>In September of 2016, I attended <a href="http://con.racket-lang.org/2016/">(sixth RacketCon)</a>, where I saw a <a href="https://www.youtube.com/watch?v=j5Hauz6cewM">pretty incredible and extremely exciting talk</a> about implementing type systems as macros. Finally, I could realize my dream of having an elegant Lisp with a safe, reliable macro system and a powerful, expressive type system! Unfortunately, reality ensued, and I remembered I didn’t actually know any type theory.</p><p>Therefore, in October, I started to learn about type systems, and I began to read through Pierce’s Types and Programming Languages, then tried to learn the things I would need to understand Haskell’s type system. I learned about Hindley-Milner and basic typeclasses, and I tried to apply these things to the Type Systems as Macros approach. Throughout October, I hacked and I hacked, and by the end of the month, I stood back and admired my handiwork!</p><p>…it <em>sort of</em> worked?</p><p>The trouble was that I found myself stuck. I wasn’t sure how to proceed. My language had bugs, programs sometimes did things I didn’t understand, the typechecker was clearly unsound, and there didn’t seem to be an obvious path forward. Other things in my life became distracting or difficult, and I didn’t have the energy to work on it anymore, so I stopped. I put Hackett (then Rascal) on the shelf for a couple months, only to finally return to it in late December.</p><p>At the beginning of January, I decided it would be helpful to be public about what I was working on, so I wrote a blog post! 
Feedback was positive, overwhelmingly so, and while it was certainly encouraging, I suddenly felt nervous about expectations I had not realized I was setting. Could I really build this? Did I have the knowledge or the time? At that point, I didn’t really, so work stalled.</p><p>Fortunately, in early April, some things started to become clear. I took another look at Hackett, and I knew I needed to reimplement it from the ground up. I also knew that I needed a different technique, but this time, I knew a bit more about where to find it. I got some help from <a href="http://www.ccs.neu.edu/home/samth/">Sam Tobin-Hochstadt</a> and put together <a href="https://gist.github.com/lexi-lambda/045ba782c8a0d915bd8abf97167d3bb5">an implementation of Pierce and Turner’s Local Type Inference</a>. Unfortunately, it didn’t really provide the amount of type inference I was looking for, but fortunately, implementing it helped me figure out how to understand the rather more complicated (though very impressive) <a href="http://www.cs.cmu.edu/~joshuad/papers/bidir/">Complete and Easy Bidirectional Typechecking for Higher-Rank Polymorphism</a>. After that, things just sort of started falling into place:</p><ul><li><p>First, I <a href="https://github.com/lexi-lambda/higher-rank">implemented the Complete and Easy paper in Haskell</a>, including building a little parser and interpreter. That helped me actually understand the paper, and Haskell really is a rather wonderful language for doing such a thing.</p></li><li><p>Three days later, I <a href="https://github.com/lexi-lambda/racket-higher-rank">ported the Haskell implementation to Racket</a>, using (and somewhat abusing) the Type Systems as Macros techniques. It wasn’t the prettiest, but it seemed to work, and that was rather encouraging.</p></li><li><p>After that, however, I got a little stuck again, as I wasn’t sure how to generalize what I had. 
I was also incredibly busy with my day job, and I wasn’t able to really make progress for a few weeks. In early May, however, I decided to <a href="https://twitter.com/lexi_lambda/status/865026650487967744">take a vacation</a> for a week, and with some time to focus, I <a href="https://github.com/lexi-lambda/higher-rank/tree/algebraic">souped up the Haskell implementation with products and sums</a>. This was progress!</p></li><li><p>The <em>following day</em> I managed to make <a href="https://github.com/lexi-lambda/racket-higher-rank/tree/type-constructors">similar changes to the Racket implementation</a>, but rather than add anonymous products and sums, I added arbitrary type constructors.</p></li><li><p>A couple days later and with more than a bit of help from <a href="http://functorial.com">Phil Freeman</a>, I <a href="https://github.com/lexi-lambda/hackett/commit/1fd7fc905b93f68e39b9d01fedc4fb52aa44c4c4">rebranded the Racket implementation as Hackett, Mk II</a>, and I started working towards turning it into a real programming language.</p></li></ul><p><em>Less than three weeks later</em>, and I have a programming language with everything from laziness and typeclasses to a tiny, proof-of-concept web server with <a href="https://twitter.com/lexi_lambda/status/867617563206758400">editor support</a>. The future of Hackett looks bright, and though there’s a <em>lot</em> of work left before I will be even remotely satisfied with it, I am excited and reassured that it already seems to be bearing some fruit.</p><p>So what’s left? Is Hackett ready for an initial release? Can you start writing programs in it today? Well, unfortunately, the answer is mostly <strong>no</strong>, at least if you want those programs to be at all reliable in a day or two. If everything looks so cheery, though, what’s left? What is Hackett still missing?</p><h3><a name="what-hackett-still-isn-t"></a>What Hackett still <em>isn’t</em></h3><p>I have a laundry list of features I want for Hackett. 
I want GADTs, indexed type families, newtype deriving, and a compiler that can target multiple backends. These things, however, are not essential. You can probably imagine writing useful software without any of them. Before I can try to tackle those, I first need to tackle some of the bits of the foundation that simply don’t exist yet (or have at least been badly neglected).</p><p>Fortunately, these things are not insurmountable, nor are they necessarily especially hard. They’re things like default class methods, static detection and prevention of orphan instances, exhaustiveness checking for pattern-matching, and a real kind system. That’s right—right now, Hackett’s type system is effectively dynamically typed, and even though you can write a higher-kinded type, there is no such thing as a “kind error”.</p><p>Other things are simply necessary quality of life improvements before Hackett can become truly usable. Type errors are currently rather atrocious, though they could certainly be worse. Additionally, typechecking currently just halts whenever it encounters a type error, and it makes no attempt to generate more than one type error at a time. Derivation of simple instances like <code>Show</code> and <code>Eq</code> is important, and it will also likely pave the way for a more general form of typeclass deriving (since it can most certainly be implemented via macros), so it’s uncharted territory that still needs to be explored.</p><p>Bits of plumbing are still exposed in places, whether it’s unexpected behavior when interoperating with Racket or errors sometimes reported in terms of internal forms. Local bindings are, if you can believe it, still entirely unimplemented, so <code>let</code> and <code>letrec</code> need to be written up. The standard library needs fleshing out, and certain bits of code need to be cleaned up and slotted into the right place.</p><p>Oh, and of course, <strong>the whole thing needs to be documented</strong>. 
That in and of itself is probably a pretty significant project, especially since there’s a good chance I’ll want to figure out how to best make use of Scribble for a language that’s a little bit different from Racket.</p><p>All in all, there’s a lot of work to be done! I am eager to make it happen, but I also work a full-time job, and I don’t have it in me to continue at the pace I’ve been working at for the past couple of weeks. Still, if you’re interested in the project, stay tuned and keep an eye on it—if all goes as planned, I hope to make it truly useful before too long.</p><h2><a name="answering-some-questions"></a>Answering some questions</h2><p>It’s possible that this blog post does not seem like much; after all, it’s not terribly long. However, if you’re anything like me, there’s a good chance you are interested enough to have some questions! Obviously, I cannot anticipate all your questions and answer them here in advance, but I will try my best.</p><h3><a name="can-i-try-hackett"></a>Can I try Hackett?</h3><p>Yes! With the caveat that it’s alpha software in every sense of the word: undocumented, not especially user friendly, and completely unstable. However, if you <em>do</em> want to give it a try, it isn’t difficult: just install Racket, then run <code>raco pkg install hackett</code>. Open DrRacket and write <code>#lang hackett</code> at the top of the module, then start playing around.</p><p>Also, note that the demo web server used in the example at the top of this blog post is <em>not</em> included when you install the <code>hackett</code> package. If you want to try that out, you’ll have to run <code>raco pkg install hackett-demo</code> to install the demo package as well.</p><h3><a name="are-there-any-examples-of-hackett-code"></a>Are there any examples of Hackett code?</h3><p>Unfortunately, not a lot right now, aside from the tiny examples in this blog post. 
However, if you are already familiar with Haskell, the syntax likely won’t be hard to pick up. Reading the Hackett source code is not especially recommended, given that it is filled with implementation details. However, if you are interested, reading the module where most of the prelude is defined isn’t so bad. You can <a href="https://github.com/lexi-lambda/hackett/blob/6ceeac05e3d2a4b2dacd39163744baf239cf65a4/hackett-lib/hackett/private/prim/base.rkt">find it on GitHub here</a>, or you can open the <code>hackett/private/prim/base</code> module on a local installation.</p><h3><a name="how-can-i-learn-more-ask-questions-about-hackett"></a>How can I learn more / ask questions about Hackett?</h3><p>Feel free to ping me and ask me questions! I may not always be able to get back to you immediately, but if you hang around, I will eventually send you a response. The best ways to contact me are via the #racket IRC channel on Freenode, the snek Slack community (<a href="http://snek.jneen.net">which you can sign up for here</a>), sending me <a href="https://twitter.com/lexi_lambda">a DM on Twitter</a>, opening <a href="https://github.com/lexi-lambda/hackett/issues">an issue on the GitHub repo</a>, or even just <a href="mailto:lexi.lambda@gmail.com">sending me an email</a> (though I’m usually a bit slower to respond to email).</p><h3><a name="how-can-i-help"></a>How can I help?</h3><p>Probably the easiest way to help out is to try Hackett for yourself and <a href="https://github.com/lexi-lambda/hackett/issues">report any bugs or infelicities you run into</a>. Of course, many issues are already known; there’s just so much to do that I haven’t had the chance to clean everything up. 
For that reason, the most effective way to contribute is probably to pick an existing issue and try and implement it yourself, but I wouldn’t be surprised if most people found the existing implementation a little intimidating.</p><p>If you <em>are</em> interested in helping out, I’d be happy to give you some pointers and answer some questions, since it would be extremely nice to have some help. Please feel free to contact me using any of the methods mentioned in the previous section, and I’ll try and help you find something you could work on.</p><h3><a name="how-does-hackett-compare-to-x-why-doesn-t-hackett-support-y"></a>How does Hackett compare to <em>X</em> / why doesn’t Hackett support <em>Y</em>?</h3><p>These tend to be complex questions, and I don’t always have comprehensive answers for them, especially since the language is evolving so quickly. Still, if you want to ask me about this, feel free to just send the question to me directly. In my experience, it’s usually better to have a conversation about this sort of thing rather than just answering in one big comparison, since there’s usually a fair amount of nuance.</p><h3><a name="when-will-hackett-be-ready-for-me-to-use"></a>When will Hackett be ready for me to use?</h3><p>I don’t know.</p><p>Obviously, there is a lot left to implement, that is certainly true, but there’s more to it than that. If all goes well, I don’t see any reason why Hackett can’t be early beta quality by the end of this year, even if it doesn’t support all of the goodies necessary to achieve perfection (which, of course, it never really can).</p><p>However, there are other things to consider, too. The Racket package system is currently flawed in ways that make rapidly iterating on Hackett hard, since it is extremely difficult (if not impossible) to make backwards-incompatible changes without potentially breaking someone’s program (even if they don’t update anything about their dependencies)! 
This is a solvable problem, but it would take some work modifying various elements of the package system and build tools, so that might need to get done before I can recommend Hackett in good faith.</p><h2><a name="appendix"></a>Appendix</h2><p>It would be unfair not to mention all the people that have made Hackett possible. I cannot list them all here, but I want to give special thanks to <a href="http://www.ccs.neu.edu/home/stchang/">Stephen Chang</a>, <a href="http://www.cs.ubc.ca/~joshdunf/">Joshua Dunfield</a>, <a href="http://eecs.northwestern.edu/~robby/">Robby Findler</a>, <a href="http://www.cs.utah.edu/~mflatt/">Matthew Flatt</a>, <a href="http://functorial.com">Phil Freeman</a>, <a href="http://www.ccs.neu.edu/home/types/">Ben Greenman</a>, <a href="https://github.com/AlexKnauth">Alex Knauth</a>, <a href="http://www.cl.cam.ac.uk/~nk480/">Neelakantan Krishnaswami</a>, and <a href="http://www.ccs.neu.edu/home/samth/">Sam Tobin-Hochstadt</a>. I’d also like to thank everyone involved in the Racket and Haskell projects as a whole, as well as everyone who has expressed interest and encouragement about what I’ve been working on.</p><p>As a final point, just for fun, I thought I’d keep track of all the albums I’ve been listening to while working on Hackett, just in the past few weeks. It is <a href="/blog/2017/01/05/rascal-is-now-hackett-plus-some-answers-to-questions/#whats-in-a-name">on theme with the name</a>, after all. 
This list is not completely exhaustive, as I’m sure some slipped through the cracks, but you can thank the following artists for helping me power through a few of the hills in Hackett’s implementation:</p><ul><li><p>The Beach Boys — Pet Sounds</p></li><li><p>Boards of Canada — Music Has The Right To Children, Geogaddi</p></li><li><p>Bruce Springsteen — Born to Run</p></li><li><p>King Crimson — In the Court of the Crimson King, Larks’ Tongues in Aspic, Starless and Bible Black, Red, Discipline</p></li><li><p>Genesis — Nursery Cryme, Foxtrot, Selling England by the Pound, The Lamb Lies Down on Broadway, A Trick of the Tail</p></li><li><p>Mahavishnu Orchestra — Birds of Fire</p></li><li><p>Metric — Fantasies, Synthetica, Pagans in Vegas</p></li><li><p>Muse — Origin of Symmetry, Absolution, The Resistance</p></li><li><p>Peter Gabriel — Peter Gabriel I, II, III, IV / Security, Us, Up</p></li><li><p>Pink Floyd — Wish You Were Here</p></li><li><p>Supertramp — Breakfast In America</p></li><li><p>The Protomen — The Protomen, Act II: The Father of Death</p></li><li><p>Talking Heads — Talking Heads: 77, More Songs About Buildings and Food, Fear of Music, Remain in Light</p></li><li><p>Yes — Fragile, Relayer, Going For The One</p></li></ul><p>And of course, <em>Voyage of the Acolyte</em>, by <strong>Steve Hackett</strong>.</p><ol class="footnotes"></ol></article>Lifts for free: making mtl typeclasses derivable2017-04-28T00:00:00Z2017-04-28T00:00:00ZAlexis King<article><p>Perhaps the most important abstraction a Haskell programmer must understand to effectively write modern Haskell code, beyond the level of the monad, is the <em>monad transformer</em>, a way to compose monads together in a limited fashion. One frustrating downside to monad transformers is a proliferation of <code>lift</code>s, which explicitly indicate which monad in a transformer “stack” a particular computation should run in. 
Fortunately, the venerable <a href="https://hackage.haskell.org/package/mtl">mtl</a> provides typeclasses that make this lifting mostly automatic, using typeclass machinery to insert <code>lift</code> where appropriate.</p><p>Less fortunately, the mtl approach does not actually eliminate <code>lift</code> entirely; it simply moves it from use sites to instances. This requires a small zoo of extraordinarily boilerplate-y instances, most of which simply implement each typeclass method using <code>lift</code>. While we cannot eliminate the instances entirely without somewhat dangerous techniques like <a href="https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/glasgow_exts.html#overlapping-instances">overlapping instances</a>, we <em>can</em> automatically derive them using features of modern GHC, eliminating the truly unnecessary boilerplate.</p><h2><a name="the-problem-with-mtl-style-typeclasses"></a>The problem with mtl-style typeclasses</h2><p>To understand exactly what problem we’re trying to solve, we first need to take a look at an actual mtl-style typeclass. I am going to start with an mtl-<em>style</em> typeclass, rather than an actual typeclass in the mtl, due to slight complications with mtl’s actual typeclasses that we’ll get into later. So let’s start with a somewhat boring typeclass, which we’ll call <code>MonadExit</code>:</p><pre><code class="pygments"><span class="kr">import</span> <span class="nn">System.Exit</span> <span class="p">(</span><span class="kt">ExitCode</span><span class="p">)</span>
<span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">::</span> <span class="kt">ExitCode</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span></code></pre><p>This is a simple typeclass that abstracts over the concept of early exit, given an exit code. The most obvious implementation of this typeclass is over <code>IO</code>, which will actually exit the program:</p><pre><code class="pygments"><span class="kr">import</span> <span class="k">qualified</span> <span class="nn">System.Exit</span> <span class="k">as</span> <span class="n">IO</span> <span class="p">(</span><span class="n">exitWith</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="kt">IO</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="kt">IO</span><span class="o">.</span><span class="n">exitWith</span></code></pre><p>One of the cool things about these typeclasses, though, is that we don’t have to have just one implementation. We could also write a pure implementation of <code>MonadExit</code>, which would simply short-circuit the current computation and return the <code>ExitCode</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">ExitCode</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="kt">Left</span></code></pre><p>Instead of simply having an instance on a concrete monad, though, we probably want to be able to use this in a larger monad stack, so we can define an <code>ExitT</code> monad transformer that can be inserted into any monad transformer stack:</p><pre><code class="pygments"><span class="cm">{-# LANGUAGE GeneralizedNewtypeDeriving #-}</span>
<span class="kr">import</span> <span class="nn">Control.Monad.Except</span> <span class="p">(</span><span class="kt">ExceptT</span><span class="p">,</span> <span class="nf">runExceptT</span><span class="p">,</span> <span class="nf">throwError</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Control.Monad.Trans</span> <span class="p">(</span><span class="kt">MonadTrans</span><span class="p">)</span>
<span class="kr">newtype</span> <span class="kt">ExitT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">ExitT</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="kt">ExitCode</span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">,</span> <span class="kt">MonadTrans</span><span class="p">)</span>
<span class="nf">runExitT</span> <span class="ow">::</span> <span class="kt">ExitT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">ExitCode</span> <span class="n">a</span><span class="p">)</span>
<span class="nf">runExitT</span> <span class="p">(</span><span class="kt">ExitT</span> <span class="n">x</span><span class="p">)</span> <span class="ow">=</span> <span class="n">runExceptT</span> <span class="n">x</span>
<span class="kr">instance</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ExitT</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="kt">ExitT</span> <span class="o">.</span> <span class="n">throwError</span></code></pre><p>With this in place, we can write actual programs using our <code>ExitT</code> monad transformer:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">runExitT</span> <span class="o">$</span> <span class="kr">do</span>
        <span class="n">lift</span> <span class="o">$</span> <span class="n">putStrLn</span> <span class="s">"hello"</span>
        <span class="n">exitWith</span> <span class="p">(</span><span class="kt">ExitFailure</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">lift</span> <span class="o">$</span> <span class="n">putStrLn</span> <span class="s">"world"</span>
<span class="nf">hello</span>
<span class="kt">Left</span> <span class="p">(</span><span class="kt">ExitFailure</span> <span class="mi">1</span><span class="p">)</span></code></pre><p>This is pretty cool! Unfortunately, experienced readers will see the rather large problem with what we have so far. Specifically, it won’t actually work if we try and wrap <code>ExitT</code> in another monad transformer:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">logIn</span> <span class="n">password</span> <span class="ow">=</span> <span class="n">runExitT</span> <span class="o">$</span> <span class="n">flip</span> <span class="n">runReaderT</span> <span class="n">password</span> <span class="o">$</span> <span class="kr">do</span>
        <span class="n">password</span> <span class="ow"><-</span> <span class="n">ask</span>
        <span class="n">unless</span> <span class="p">(</span><span class="n">password</span> <span class="o">==</span> <span class="s">"password1234"</span><span class="p">)</span> <span class="o">$</span> <span class="c1">-- super secure password</span>
          <span class="n">exitWith</span> <span class="p">(</span><span class="kt">ExitFailure</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">return</span> <span class="s">"access granted"</span>
<span class="nf">ghci</span><span class="o">></span> <span class="n">logIn</span> <span class="s">"not the right password"</span>
<span class="o"><</span><span class="n">interactive</span><span class="o">>:</span> <span class="ne">error</span><span class="kt">:</span>
<span class="err">•</span> <span class="kt">No</span> <span class="kr">instance</span> <span class="n">for</span> <span class="p">(</span><span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="p">[</span><span class="kt">Char</span><span class="p">]</span> <span class="p">(</span><span class="kt">ExitT</span> <span class="n">m0</span><span class="p">)))</span>
<span class="n">arising</span> <span class="n">from</span> <span class="n">a</span> <span class="n">use</span> <span class="kr">of</span> <span class="err">‘</span><span class="n">it</span><span class="err">’</span>
<span class="err">•</span> <span class="kt">In</span> <span class="n">a</span> <span class="n">stmt</span> <span class="kr">of</span> <span class="n">an</span> <span class="n">interactive</span> <span class="kt">GHCi</span> <span class="n">command</span><span class="kt">:</span> <span class="n">print</span> <span class="n">it</span></code></pre><p>The error message is relatively self-explanatory if you are familiar with mtl error messages: there is no <code>MonadExit</code> instance for <code>ReaderT</code>. This makes sense, since we only defined a <code>MonadExit</code> instance for <em><code>ExitT</code></em>, nothing else. Fortunately, the instance for <code>ReaderT</code> is completely trivial, since we just need to use <code>lift</code> to delegate to the next monad in the stack:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span></code></pre><p>Now that the delegating instance is set up, we can actually use our <code>logIn</code> function:</p><pre><code class="pygments"><span class="nf">ghci</span><span class="o">></span> <span class="n">logIn</span> <span class="s">"not the right password"</span>
<span class="kt">Left</span> <span class="p">(</span><span class="kt">ExitFailure</span> <span class="mi">1</span><span class="p">)</span>
<span class="nf">ghci</span><span class="o">></span> <span class="n">logIn</span> <span class="s">"password1234"</span>
<span class="kt">Right</span> <span class="s">"access granted"</span></code></pre><h3><a name="an-embarrassment-of-instances"></a>An embarrassment of instances</h3><p>We’ve managed to make our program work properly now, but we’ve still only defined the delegating instance for <code>ReaderT</code>. What if someone wants to use <code>ExitT</code> with <code>WriterT</code>? Or <code>StateT</code>? Or any of <code>ExceptT</code>, <code>RWST</code>, or <code>ContT</code>? Well, we have to define instances for each and every one of them, and as it turns out, the instances are all identical!</p><pre><code class="pygments"><span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadExit</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span>
<span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadExit</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">RWST</span> <span class="n">r</span> <span class="n">w</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span>
<span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span>
<span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ContT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span></code></pre><p>This is bad enough on its own, but this is actually the <em>simplest</em> case: a typeclass with a single method which is trivially lifted through any other monad transformer. Another thing we’ve glossed over is actually defining all the delegating instances for the <em>other</em> mtl typeclasses on <code>ExitT</code> itself. Fortunately, we can derive these ones with <code>GeneralizedNewtypeDeriving</code>, since <code>ExceptT</code> has already done most of the work for us:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">ExitT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">ExitT</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="kt">ExitCode</span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span> <span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">,</span> <span class="kt">MonadIO</span> <span class="c1">-- base</span>
           <span class="p">,</span> <span class="kt">MonadBase</span> <span class="kt">IO</span> <span class="c1">-- transformers-base</span>
           <span class="p">,</span> <span class="kt">MonadTrans</span><span class="p">,</span> <span class="kt">MonadReader</span> <span class="n">r</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">w</span><span class="p">,</span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="c1">-- mtl</span>
           <span class="p">,</span> <span class="kt">MonadThrow</span><span class="p">,</span> <span class="kt">MonadCatch</span><span class="p">,</span> <span class="kt">MonadMask</span> <span class="c1">-- exceptions</span>
           <span class="p">,</span> <span class="kt">MonadTransControl</span><span class="p">,</span> <span class="kt">MonadBaseControl</span> <span class="kt">IO</span> <span class="c1">-- monad-control</span>
           <span class="p">)</span></code></pre><p>Unfortunately, we have to write the <code>MonadError</code> instance manually if we want it, since we don’t want to pick up the instance from <code>ExceptT</code>, but rather wish to defer to the underlying monad. This means writing some truly horrid delegation code:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="p">(</span><span class="kt">ExitT</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">throwError</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">throwError</span>
  <span class="n">catchError</span> <span class="p">(</span><span class="kt">ExitT</span> <span class="n">x</span><span class="p">)</span> <span class="n">f</span> <span class="ow">=</span> <span class="kt">ExitT</span> <span class="o">.</span> <span class="kt">ExceptT</span> <span class="o">$</span> <span class="n">catchError</span> <span class="p">(</span><span class="n">runExceptT</span> <span class="n">x</span><span class="p">)</span> <span class="o">$</span> <span class="nf">\</span><span class="n">e</span> <span class="ow">-></span>
    <span class="kr">let</span> <span class="p">(</span><span class="kt">ExitT</span> <span class="n">x'</span><span class="p">)</span> <span class="ow">=</span> <span class="n">f</span> <span class="n">e</span> <span class="kr">in</span> <span class="n">runExceptT</span> <span class="n">x'</span></code></pre><p>(Notably, this is so awful because <code>catchError</code> is more complex than the simple <code>exitWith</code> method we’ve studied so far, which is why we’re starting with a simpler typeclass. We’ll get more into this later, as promised.)</p><p>This huge number of instances is sometimes referred to as the “n<sup>2</sup> instances” problem, since it requires every monad transformer to have an instance of every single mtl-style typeclass. Fortunately, in practice, this proliferation is often less horrible than it might seem, mostly because deriving helps a lot. However, remember that if <code>ExitT</code> <em>weren’t</em> a simple wrapper around an existing monad transformer, we wouldn’t be able to derive the instances at all! Instead, we’d have to write them all out by hand, just like we did with all the <code>MonadExit</code> instances.</p><p>It’s a shame that these typeclass instances can’t be derived in a more general way, allowing derivation for arbitrary monad transformers instead of simply requiring the newtype deriving machinery. As it turns out, with clever use of modern GHC features, we actually <strong>can</strong>. It’s not even all that hard.</p><h2><a name="default-instances-with-default-signatures"></a>Default instances with default signatures</h2><p>It’s not hard to see that our <code>MonadExit</code> instances are all exactly the same: just <code>lift . exitWith</code>. Why is that, though? Well, every instance is an instance on a monad transformer over a monad that is already an instance of <code>MonadExit</code>. In fact, we can express this in a type signature, and we can extract <code>lift . 
exitWith</code> into a separate function:</p><pre><code class="pygments"><span class="nf">defaultExitWith</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadExit</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">ExitCode</span> <span class="ow">-></span> <span class="n">t</span> <span class="n">m</span> <span class="nb">()</span>
<span class="nf">defaultExitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span></code></pre><p>Writing <code>defaultExitWith</code> really isn’t any easier than writing <code>lift . exitWith</code>, so this deduplication doesn’t buy us anything by itself. However, it <em>does</em> indicate that we could write a default implementation of <code>exitWith</code> if we could require just a little bit more from the implementing type. With <a href="https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/glasgow_exts.html#default-method-signatures">GHC’s <code>DefaultSignatures</code> extension</a>, we can do precisely that.</p><p>The idea is that we can write a separate type signature for a default implementation of <code>exitWith</code>, which can be more specific than the type signature for <code>exitWith</code> in general. This allows us to use our <code>defaultExitWith</code> implementation more or less directly:</p><pre><code class="pygments"><span class="cm">{-# LANGUAGE DefaultSignatures #-}</span>
<span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">::</span> <span class="kt">ExitCode</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="kr">default</span> <span class="n">exitWith</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadExit</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">ExitCode</span> <span class="ow">-></span> <span class="n">t</span> <span class="n">m1</span> <span class="nb">()</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span></code></pre><p>We have to use <code>m1</code> instead of <code>m</code>, since type variables bound in the class head are always in scope within the class body, and the names would conflict. However, this creates another problem, since our specialized type signature replaces <code>m</code> with <code>t m1</code>, which won’t quite work (as GHC can’t automatically figure out they should be the same). Instead, we can use <code>m</code> in the type signature, then just add a type equality constraint ensuring that <code>m</code> and <code>t m1</code> must be the same type:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">exitWith</span> <span class="ow">::</span> <span class="kt">ExitCode</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="kr">default</span> <span class="n">exitWith</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadExit</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">ExitCode</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="n">exitWith</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">exitWith</span></code></pre><p>Now we can write all of our simple instances without even needing to write a real implementation! All of the instance bodies can be empty:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadExit</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadExit</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">RWST</span> <span class="n">r</span> <span class="n">w</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span>
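-- Illustrative addition, not part of the original list: any other transformer
-- with a MonadTrans instance can take the same empty body. This sketch assumes
-- MaybeT has been imported from Control.Monad.Trans.Maybe (transformers).
-- The default applies because MaybeT m has the shape `t m1`, with t ~ MaybeT
-- and m1 ~ m, so the constraints on the default signature can all be solved.
instance MonadExit m => MonadExit (MaybeT m)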
<span class="kr">instance</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadExit</span> <span class="p">(</span><span class="kt">ContT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span></code></pre><p>While this doesn’t completely alleviate the pain of writing instances, it’s definitely an improvement over what we had before. With <a href="https://downloads.haskell.org/~ghc/8.2.1-rc1/docs/html/users_guide/glasgow_exts.html#deriving-strategies">GHC 8.2’s new <code>DerivingStrategies</code> extension</a>, it becomes especially beneficial when defining entirely new transformers that should also have <code>MonadExit</code> instances, since they can be derived with <code>DeriveAnyClass</code>:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">ParserT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">ParserT</span> <span class="p">(</span><span class="kt">Text</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="p">(</span><span class="kt">Text</span><span class="p">,</span> <span class="n">a</span><span class="p">)))</span>
  <span class="kr">deriving</span> <span class="n">anyclass</span> <span class="p">(</span><span class="kt">MonadExit</span><span class="p">)</span></code></pre><p>This is pretty wonderful.</p><p>Given that only <code>MonadExit</code> supports being derived in this way, we sadly still need to implement the other, more standard mtl-style typeclasses ourselves, like <code>MonadIO</code>, <code>MonadBase</code>, <code>MonadReader</code>, <code>MonadWriter</code>, etc. However, what if all of those classes provided the same convenient default signatures that our <code>MonadExit</code> does? If that were the case, then we could write something like this:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">ParserT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">ParserT</span> <span class="p">(</span><span class="kt">Text</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="p">(</span><span class="kt">Text</span><span class="p">,</span> <span class="n">a</span><span class="p">)))</span>
  <span class="kr">deriving</span> <span class="n">anyclass</span> <span class="p">(</span> <span class="kt">MonadIO</span><span class="p">,</span> <span class="kt">MonadBase</span> <span class="n">b</span>
                    <span class="p">,</span> <span class="kt">MonadReader</span> <span class="n">r</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">w</span><span class="p">,</span> <span class="kt">MonadState</span> <span class="n">s</span>
                    <span class="p">,</span> <span class="kt">MonadThrow</span><span class="p">,</span> <span class="kt">MonadCatch</span><span class="p">,</span> <span class="kt">MonadMask</span>
                    <span class="p">,</span> <span class="kt">MonadExit</span>
                    <span class="p">)</span></code></pre><p>Compared to having to write all those instances by hand, this would be a pretty enormous difference. Unfortunately, many of these typeclasses are not quite as simple as our <code>MonadExit</code>, and we’d have to be a bit more clever to make them derivable.</p><h2><a name="making-mtl-s-classes-derivable"></a>Making mtl’s classes derivable</h2><p>Our <code>MonadExit</code> class was extremely simple, since it only had a single method with a particularly simple type signature. For reference, this was the type of our generic <code>exitWith</code>:</p><pre><code class="pygments"><span class="nf">exitWith</span> <span class="ow">::</span> <span class="kt">MonadExit</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">ExitCode</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span></code></pre><p>Let’s now turn our attention to <code>MonadReader</code>. At first blush, this typeclass should not be any trickier to implement than <code>MonadExit</code>, since the types of <code>ask</code> and <code>reader</code> are both quite simple:</p><pre><code class="pygments"><span class="nf">ask</span> <span class="ow">::</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">r</span>
<span class="nf">reader</span> <span class="ow">::</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span></code></pre><p>However, the type of the other method, <code>local</code>, throws a bit of a wrench in our plans. It has the following type signature:</p><pre><code class="pygments"><span class="nf">local</span> <span class="ow">::</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">r</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span></code></pre><p>Why is this so much more complicated? Well, the key is in the second argument, which has the type <code>m a</code>. That’s not something that can be simply <code>lift</code>ed away! Try it yourself: try to write a <code>MonadReader</code> instance for some monad transformer. It’s not as easy as it looks!</p><p>We can illustrate the problem by creating our own version of <code>MonadReader</code> and implementing it for something like <code>ExceptT</code> ourselves. We can start with the trivial methods first:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">r</span> <span class="kr">where</span>
  <span class="n">ask</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">r</span>
  <span class="n">local</span> <span class="ow">::</span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">r</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">reader</span> <span class="ow">::</span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="kr">instance</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">ask</span> <span class="ow">=</span> <span class="n">lift</span> <span class="n">ask</span>
  <span class="n">reader</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">reader</span></code></pre><p>However, implementing <code>local</code> is harder. Let’s specialize the type signature to <code>ExceptT</code> to make it clearer why:</p><pre><code class="pygments"><span class="nf">local</span> <span class="ow">::</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">r</span><span class="p">)</span> <span class="ow">-></span> <span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span> <span class="n">a</span></code></pre><p>Our base monad, <code>m</code>, implements <code>local</code>, but we first have to convert the second argument from <code>ExceptT e m a</code> into <code>m (Either e a)</code>, run it through <code>local</code> in <code>m</code>, then wrap it back up in <code>ExceptT</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">ask</span> <span class="ow">=</span> <span class="n">lift</span> <span class="n">ask</span>
  <span class="n">reader</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">reader</span>
  <span class="n">local</span> <span class="n">f</span> <span class="n">x</span> <span class="ow">=</span> <span class="kt">ExceptT</span> <span class="o">$</span> <span class="n">local</span> <span class="n">f</span> <span class="p">(</span><span class="n">runExceptT</span> <span class="n">x</span><span class="p">)</span></code></pre><p>This operation is actually a mapping operation of sorts, since we’re mapping <code>local f</code> over <code>x</code>. For that reason, this can be rewritten using the <code>mapExceptT</code> function provided by <code>Control.Monad.Except</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">ask</span> <span class="ow">=</span> <span class="n">lift</span> <span class="n">ask</span>
  <span class="n">reader</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">reader</span>
  <span class="n">local</span> <span class="ow">=</span> <span class="n">mapExceptT</span> <span class="o">.</span> <span class="n">local</span></code></pre><p>If you implement <code>MonadReader</code> instances for other transformers, like <code>StateT</code> and <code>WriterT</code>, you’ll find that the instances are exactly the same <em>except</em> for <code>mapExceptT</code>, which is replaced with <code>mapStateT</code> and <code>mapWriterT</code>, respectively. This is sort of obnoxious, given that we want to figure out how to create a generic version of <code>local</code> that works with any monad transformer, but this requires concrete information about which transformer we’re in. Obviously, the power <code>MonadTrans</code> gives us is not enough to make this generic. Fortunately, there is a typeclass that is powerful enough: <a href="http://hackage.haskell.org/package/monad-control-1.0.1.0/docs/Control-Monad-Trans-Control.html#t:MonadTransControl"><code>MonadTransControl</code></a> from the <code>monad-control</code> package.</p><p>Using <code>MonadTransControl</code>, we can write a generic <code>mapT</code> function that maps over an arbitrary monad transformer with a <code>MonadTransControl</code> instance:</p><pre><code class="pygments"><span class="nf">mapT</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">Monad</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monad</span> <span class="p">(</span><span class="n">t</span> <span class="n">m</span><span class="p">),</span> <span class="kt">MonadTransControl</span> <span class="n">t</span><span class="p">)</span>
     <span class="ow">=></span> <span class="p">(</span><span class="n">m</span> <span class="p">(</span><span class="kt">StT</span> <span class="n">t</span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">StT</span> <span class="n">t</span> <span class="n">b</span><span class="p">))</span>
     <span class="ow">-></span> <span class="n">t</span> <span class="n">m</span> <span class="n">a</span>
     <span class="ow">-></span> <span class="n">t</span> <span class="n">m</span> <span class="n">b</span>
<span class="nf">mapT</span> <span class="n">f</span> <span class="n">x</span> <span class="ow">=</span> <span class="n">liftWith</span> <span class="p">(</span><span class="nf">\</span><span class="n">run</span> <span class="ow">-></span> <span class="n">f</span> <span class="p">(</span><span class="n">run</span> <span class="n">x</span><span class="p">))</span> <span class="o">>>=</span> <span class="n">restoreT</span> <span class="o">.</span> <span class="n">return</span></code></pre><p>This type signature may look complicated (and, well, it is), but the idea is that the <code>StT</code> associated type family encapsulates the monadic state that <code>t</code> introduces. For example, for <code>ExceptT</code>, <code>StT (ExceptT e) a</code> is <code>Either e a</code>. For <code>StateT</code>, <code>StT (StateT s) a</code> is <code>(a, s)</code>. Some transformers, like <code>ReaderT</code>, have no state, so <code>StT (ReaderT r) a</code> is just <code>a</code>.</p><p>I will not go into the precise mechanics of how <code>MonadTransControl</code> works in this blog post, but it doesn’t matter significantly; the point is that we can now use <code>mapT</code> to create a generic implementation of <code>local</code> for use with <code>DefaultSignatures</code>:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">r</span> <span class="kr">where</span>
  <span class="n">ask</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">r</span>
  <span class="kr">default</span> <span class="n">ask</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">r</span>
  <span class="n">ask</span> <span class="ow">=</span> <span class="n">lift</span> <span class="n">ask</span>
  <span class="n">local</span> <span class="ow">::</span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">r</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="kr">default</span> <span class="n">local</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTransControl</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">r</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">local</span> <span class="ow">=</span> <span class="n">mapT</span> <span class="o">.</span> <span class="n">local</span>
  <span class="n">reader</span> <span class="ow">::</span> <span class="p">(</span><span class="n">r</span> <span class="ow">-></span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">reader</span> <span class="n">f</span> <span class="ow">=</span> <span class="n">f</span> <span class="o"><$></span> <span class="n">ask</span></code></pre><p>Once more, we now get instances of our typeclass, in this case <code>MonadReader</code>, <strong>for free</strong>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span>
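<span class="c1">-- A hedged sketch, not taken from mtl: assuming MaybeT (Control.Monad.Trans.Maybe) and</span>
<span class="c1">-- its MonadTransControl instance from monad-control are in scope, the same empty</span>
<span class="c1">-- instance declaration should pick up the defaults for it as well.</span>
<span class="kr">instance</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="p">(</span><span class="kt">MaybeT</span> <span class="n">m</span><span class="p">)</span>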
<span class="kr">instance</span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadReader</span> <span class="n">r</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span></code></pre><p>It’s also worth noting that we <em>don’t</em> get a <code>ContT</code> instance for free, even though <code>ContT</code> has a <code>MonadReader</code> instance in mtl. Unlike the other monad transformers mtl provides, <code>ContT</code> does not have a <code>MonadTransControl</code> instance because it cannot be generally mapped over. While a <code>mapContT</code> function does exist, its signature is more restricted:</p><pre><code class="pygments"><span class="nf">mapContT</span> <span class="ow">::</span> <span class="p">(</span><span class="n">m</span> <span class="n">r</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">r</span><span class="p">)</span> <span class="ow">-></span> <span class="kt">ContT</span> <span class="n">r</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">ContT</span> <span class="n">r</span> <span class="n">m</span> <span class="n">a</span></code></pre><p>It happens that <code>local</code> can still be implemented for <code>ContT</code>, so it can still have a <code>MonadReader</code> instance, but it cannot be derived in the same way as it can for the other transformers. 
Still, in practice, I’ve found that most user-defined transformers do not have such complex control flow, so they can safely be instances of <code>MonadTransControl</code>, and they get this deriving for free.</p><h3><a name="extending-this-technique-to-other-mtl-typeclasses"></a>Extending this technique to other mtl typeclasses</h3><p>The default instances for the other mtl typeclasses are slightly different from the one for <code>MonadReader</code>, but for the most part, the same general technique applies. Here’s a derivable <code>MonadError</code>:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">e</span> <span class="kr">where</span>
  <span class="n">throwError</span> <span class="ow">::</span> <span class="n">e</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="kr">default</span> <span class="n">throwError</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">e</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">throwError</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">throwError</span>
  <span class="n">catchError</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">(</span><span class="n">e</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="kr">default</span> <span class="n">catchError</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTransControl</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">(</span><span class="n">e</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">catchError</span> <span class="n">x</span> <span class="n">f</span> <span class="ow">=</span> <span class="n">liftWith</span> <span class="p">(</span><span class="nf">\</span><span class="n">run</span> <span class="ow">-></span> <span class="n">catchError</span> <span class="p">(</span><span class="n">run</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="n">run</span> <span class="o">.</span> <span class="n">f</span><span class="p">))</span> <span class="o">>>=</span> <span class="n">restoreT</span> <span class="o">.</span> <span class="n">return</span>
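<span class="c1">-- A hedged usage sketch: `attempt` is a hypothetical helper, not a method of this class.</span>
<span class="c1">-- It is written once against the class above, so it should work unchanged in any</span>
<span class="c1">-- transformer stack whose layers pick up the default methods.</span>
<span class="nf">attempt</span> <span class="ow">::</span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">e</span> <span class="n">a</span><span class="p">)</span>
<span class="nf">attempt</span> <span class="n">x</span> <span class="ow">=</span> <span class="n">catchError</span> <span class="p">(</span><span class="kt">Right</span> <span class="o"><$></span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="n">return</span> <span class="o">.</span> <span class="kt">Left</span><span class="p">)</span>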
<span class="kr">instance</span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadError</span> <span class="n">e</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadError</span> <span class="n">e</span> <span class="p">(</span><span class="kt">RWST</span> <span class="n">r</span> <span class="n">w</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span></code></pre><p>The <code>MonadState</code> interface turns out to be extremely simple, so it doesn’t even need <code>MonadTransControl</code> at all:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">s</span> <span class="kr">where</span>
  <span class="n">get</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">s</span>
  <span class="kr">default</span> <span class="n">get</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">s</span>
  <span class="n">get</span> <span class="ow">=</span> <span class="n">lift</span> <span class="n">get</span>
  <span class="n">put</span> <span class="ow">::</span> <span class="n">s</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="kr">default</span> <span class="n">put</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">s</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="n">put</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">put</span>
  <span class="n">state</span> <span class="ow">::</span> <span class="p">(</span><span class="n">s</span> <span class="ow">-></span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">))</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">state</span> <span class="n">f</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">s</span> <span class="ow"><-</span> <span class="n">get</span>
    <span class="kr">let</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s'</span><span class="p">)</span> <span class="ow">=</span> <span class="n">f</span> <span class="n">s</span>
    <span class="n">put</span> <span class="n">s'</span>
    <span class="n">return</span> <span class="n">a</span>
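<span class="c1">-- A hedged usage sketch: `tick` is a hypothetical helper, not part of mtl. It only uses</span>
<span class="c1">-- `state`, which falls back on `get` and `put`, so it needs nothing beyond the lifted</span>
<span class="c1">-- defaults above. It returns the current counter and stores its successor.</span>
<span class="nf">tick</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">Num</span> <span class="n">s</span><span class="p">,</span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">s</span>
<span class="nf">tick</span> <span class="ow">=</span> <span class="n">state</span> <span class="p">(</span><span class="nf">\</span><span class="n">n</span> <span class="ow">-></span> <span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span>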
<span class="kr">instance</span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span> <span class="n">m</span><span class="p">)</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">MonadState</span> <span class="n">s</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Monoid</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadState</span> <span class="n">s</span> <span class="p">(</span><span class="kt">WriterT</span> <span class="n">w</span> <span class="n">m</span><span class="p">)</span></code></pre><p>Everything seems to be going well! However, not everything is quite so simple.</p><h3><a name="a-monadwriter-diversion"></a>A <code>MonadWriter</code> diversion</h3><p>Unexpectedly, <code>MonadWriter</code> turns out to be by far the trickiest of the bunch. It’s not too hard to create default implementations for most of the methods of the typeclass:</p><pre><code class="pygments"><span class="kr">class</span> <span class="p">(</span><span class="kt">Monoid</span> <span class="n">w</span><span class="p">,</span> <span class="kt">Monad</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">MonadWriter</span> <span class="n">w</span> <span class="n">m</span> <span class="o">|</span> <span class="n">m</span> <span class="ow">-></span> <span class="n">w</span> <span class="kr">where</span>
  <span class="n">writer</span> <span class="ow">::</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="kr">default</span> <span class="n">writer</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">w</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
  <span class="n">writer</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">writer</span>
  <span class="n">tell</span> <span class="ow">::</span> <span class="n">w</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="kr">default</span> <span class="n">tell</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTrans</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">w</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">w</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="n">tell</span> <span class="ow">=</span> <span class="n">lift</span> <span class="o">.</span> <span class="n">tell</span>
  <span class="n">listen</span> <span class="ow">::</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span>
  <span class="kr">default</span> <span class="n">listen</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTransControl</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">w</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">m</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span>
  <span class="n">listen</span> <span class="n">x</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span> <span class="ow"><-</span> <span class="n">liftWith</span> <span class="p">(</span><span class="nf">\</span><span class="n">run</span> <span class="ow">-></span> <span class="n">listen</span> <span class="p">(</span><span class="n">run</span> <span class="n">x</span><span class="p">))</span>
    <span class="n">y'</span> <span class="ow"><-</span> <span class="n">restoreT</span> <span class="p">(</span><span class="n">return</span> <span class="n">y</span><span class="p">)</span>
    <span class="n">return</span> <span class="p">(</span><span class="n">y'</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span></code></pre><p>However, <code>MonadWriter</code> has a fourth method, <code>pass</code>, which has a particularly tricky type signature:</p><pre><code class="pygments"><span class="nf">pass</span> <span class="ow">::</span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span> <span class="ow">-></span> <span class="n">w</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span></code></pre><p>As far as I can tell, this cannot be generalized using <code>MonadTransControl</code> alone, since it would require inspection of the result of the monadic argument (that is, it would require a function of type <code>StT t (a, b) -> (StT t a, b)</code>), which is not possible in general. My gut feeling is that this could likely also be generalized with a slightly more powerful abstraction than <code>MonadTransControl</code>, but it is not immediately obvious to me what that abstraction should be.</p><p>One extremely simple way to make this possible would be to design something to serve this specific use case:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">RunSplit</span> <span class="n">t</span> <span class="ow">=</span> <span class="n">forall</span> <span class="n">m</span> <span class="n">a</span> <span class="n">b</span><span class="o">.</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="n">t</span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">StT</span> <span class="n">t</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Maybe</span> <span class="n">b</span><span class="p">)</span>
<span class="kr">class</span> <span class="kt">MonadTransControl</span> <span class="n">t</span> <span class="ow">=></span> <span class="kt">MonadTransSplit</span> <span class="n">t</span> <span class="kr">where</span>
  <span class="n">liftWithSplit</span> <span class="ow">::</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="p">(</span><span class="kt">RunSplit</span> <span class="n">t</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span> <span class="ow">-></span> <span class="n">t</span> <span class="n">m</span> <span class="n">a</span></code></pre><p>Instances of <code>MonadTransSplit</code> would basically just provide a way to pull out bits of the result, if possible:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadTransSplit</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">r</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">liftWithSplit</span> <span class="n">f</span> <span class="ow">=</span> <span class="n">liftWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">run</span> <span class="ow">-></span> <span class="n">f</span> <span class="p">(</span><span class="n">fmap</span> <span class="n">split</span> <span class="o">.</span> <span class="n">run</span><span class="p">)</span>
    <span class="kr">where</span> <span class="n">split</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="kt">Just</span> <span class="n">y</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadTransSplit</span> <span class="p">(</span><span class="kt">ExceptT</span> <span class="n">e</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">liftWithSplit</span> <span class="n">f</span> <span class="ow">=</span> <span class="n">liftWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">run</span> <span class="ow">-></span> <span class="n">f</span> <span class="p">(</span><span class="n">fmap</span> <span class="n">split</span> <span class="o">.</span> <span class="n">run</span><span class="p">)</span>
<span class="kr">where</span> <span class="n">split</span> <span class="p">(</span><span class="kt">Left</span> <span class="n">e</span><span class="p">)</span> <span class="ow">=</span> <span class="p">(</span><span class="kt">Left</span> <span class="n">e</span><span class="p">,</span> <span class="kt">Nothing</span><span class="p">)</span>
<span class="n">split</span> <span class="p">(</span><span class="kt">Right</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">))</span> <span class="ow">=</span> <span class="p">(</span><span class="kt">Right</span> <span class="n">x</span><span class="p">,</span> <span class="kt">Just</span> <span class="n">y</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">MonadTransSplit</span> <span class="p">(</span><span class="kt">StateT</span> <span class="n">s</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">liftWithSplit</span> <span class="n">f</span> <span class="ow">=</span> <span class="n">liftWith</span> <span class="o">$</span> <span class="nf">\</span><span class="n">run</span> <span class="ow">-></span> <span class="n">f</span> <span class="p">(</span><span class="n">fmap</span> <span class="n">split</span> <span class="o">.</span> <span class="n">run</span><span class="p">)</span>
<span class="kr">where</span> <span class="n">split</span> <span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">),</span> <span class="n">s</span><span class="p">)</span> <span class="ow">=</span> <span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">s</span><span class="p">),</span> <span class="kt">Just</span> <span class="n">y</span><span class="p">)</span></code></pre><p>Then, using this, it would be possible to write a generic version of <code>pass</code>:</p><pre><code class="pygments"><span class="kr">default</span> <span class="n">pass</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">MonadTransSplit</span> <span class="n">t</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">w</span> <span class="n">m1</span><span class="p">,</span> <span class="n">m</span> <span class="o">~</span> <span class="n">t</span> <span class="n">m1</span><span class="p">)</span> <span class="ow">=></span> <span class="n">m</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">w</span> <span class="ow">-></span> <span class="n">w</span><span class="p">)</span> <span class="ow">-></span> <span class="n">m</span> <span class="n">a</span>
<span class="nf">pass</span> <span class="n">m</span> <span class="ow">=</span> <span class="kr">do</span>
<span class="n">r</span> <span class="ow"><-</span> <span class="n">liftWithSplit</span> <span class="o">$</span> <span class="nf">\</span><span class="n">run</span> <span class="ow">-></span> <span class="n">pass</span> <span class="o">$</span> <span class="n">run</span> <span class="n">m</span> <span class="o">>>=</span> <span class="nf">\</span><span class="kr">case</span>
<span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="kt">Just</span> <span class="n">f</span><span class="p">)</span> <span class="ow">-></span> <span class="n">return</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">f</span><span class="p">)</span>
<span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="kt">Nothing</span><span class="p">)</span> <span class="ow">-></span> <span class="n">return</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">id</span><span class="p">)</span>
<span class="n">restoreT</span> <span class="p">(</span><span class="n">return</span> <span class="n">r</span><span class="p">)</span></code></pre><p>However, this seems pretty overkill for just one particular method, given that I have no idea if <code>MonadTransSplit</code> would be useful <em>anywhere</em> else. One interesting thing about going down this rabbit hole, though, is that I learned that <code>pass</code> has some somewhat surprising behavior when mixed with transformers like <code>ExceptT</code> or <code>MaybeT</code>, if you don’t carefully consider how it works. It’s a strange method with a somewhat strange interface, so I don’t think I have a satisfactory conclusion about <code>MonadWriter</code> yet.</p><h2><a name="regrouping-and-stepping-back"></a>Regrouping and stepping back</h2><p>Alright, that was a lot of fairly intense, potentially confusing code. What the heck did we actually accomplish? Well, we got a couple of things:</p><ol><li><p>First, we developed a technique for writing simple mtl-style typeclasses that are derivable using <code>DeriveAnyClass</code> (or simply writing an empty instance declaration). We used a <code>MonadExit</code> class as a proof of concept, but really, the technique is applicable to most mtl-style typeclasses that represent simple effects (including, for example, <code>MonadIO</code>).</p><p>This technique is useful in isolation, even if you completely disregard the rest of the blog post. For an example where I recently applied it in real code, see <a href="https://github.com/cjdev/monad-persist/blob/1ce8568d881da3171f8689dd65f4f2df5f6dd313/library/Control/Monad/Persist.hs#L226-L271">the default signatures provided with <code>MonadPersist</code> from the <code>monad-persist</code> library</a>, which make <a href="https://github.com/cjdev/monad-persist/blob/1ce8568d881da3171f8689dd65f4f2df5f6dd313/library/Control/Monad/Persist.hs#L506-L513">defining instances completely trivial</a>. 
If you use mtl-style typeclasses in your own application to model effects, I don’t see much of a reason <em>not</em> to use this technique.</p></li><li><p>After <code>MonadExit</code>, we applied the same technique to the mtl-provided typeclasses <code>MonadReader</code>, <code>MonadError</code>, and <code>MonadState</code>. These are a bit trickier, since the first two need <code>MonadTransControl</code> in addition to the usual <code>MonadTrans</code>.</p><p>Whether or not this sort of thing should actually be added to mtl itself probably remains to be seen. For the simplest typeclass, <code>MonadState</code>, it seems like there probably aren’t many downsides, but given the difficulty implementing it for <code>MonadWriter</code> (or, heaven forbid, <code>MonadCont</code>, which I didn’t even seriously take a look at for this blog post), it doesn’t seem like an obvious win. Consistency is important.</p><p>Another downside that I sort of glossed over is possibly even more significant from a practical point of view: adding default signatures to <code>MonadReader</code> would require the removal of the default implementation of <code>ask</code> that is provided by the existing library (which implements <code>ask</code> in terms of <code>reader</code>). This would be backwards-incompatible, so it’d be difficult to change, even if people wanted to do it. Still, it’s interesting to consider what these typeclasses might look like if they were designed today.</p></li></ol><p>Overall, these techniques are not a silver bullet for deriving mtl-style typeclasses, nor do they eliminate the n<sup>2</sup> instances problem that mtl style suffers from. 
That said, they <em>do</em> significantly reduce boilerplate and clutter in the simplest cases, and they demonstrate how modern Haskell’s hierarchy of typeclasses provides a lot of power, both to describe quite abstract concepts and to alleviate the need to write code by hand.</p><p>I will continue to experiment with the ideas described in this blog post, and I’m sure some more pros and cons will surface as I explore the design space. If you have any suggestions for how to deal with “the <code>MonadWriter</code> problem”, I’d be very interested to hear them! In the meantime, consider using the technique in your application code when writing effectful, monadic typeclasses.</p><ol class="footnotes"></ol></article>Rascal is now Hackett, plus some answers to questions2017-01-05T00:00:00Z2017-01-05T00:00:00ZAlexis King<article><p>Since I published <a href="/blog/2017/01/02/rascal-a-haskell-with-more-parentheses/">my blog post introducing Rascal</a>, I’ve gotten some <em>amazing</em> feedback, more than I had ever anticipated! One of the things that was pointed out, though, is that <a href="http://www.rascal-mpl.org">Rascal is a language that already exists</a>. Given that the name “Rascal” came from a mixture of “Racket” and “Haskell”, I always had an alternative name planned, and that’s “Hackett”. So, to avoid confusion as much as possible, <a href="https://github.com/lexi-lambda/hackett"><strong>Rascal is now known as Hackett</strong></a>.</p><p>With that out of the way, I also want to answer some of the other questions I received, both to hopefully clear up some confusion and to have something I can point to if I get the same questions in the future.</p><h2><a name="what-s-in-a-name"></a>What’s in a name?</h2><p>First, a little trivia.</p><p>I’ve already mentioned that the old “Rascal” name was based on the names “Racket” and “Haskell”, which is true. 
However, it had a slightly deeper meaning, too: the name fit a tradition of naming languages in the Scheme family after somewhat nefarious things, such as “Gambit”, “Guile”, “Larceny”, and “Racket” itself. The tradition goes back a little bit further, to the Planner programming language; Scheme was originally called Schemer, but it was (no joke) shortened due to filename length restrictions.</p><p>Still, my language isn’t really a Scheme, so the weak connection wasn’t terribly relevant. Curious readers might be wondering if there’s any deeper meaning to the name “Hackett” than a mixture of the two language names. In fact, there is. Hackett is affectionately named after the <a href="https://en.wikipedia.org/wiki/Steve_Hackett">Genesis progressive rock guitarist, Steve Hackett</a>, one of my favorite musicians. The fact that the name is a homophone with “hack-it” is another convenient coincidence.</p><p>Perhaps not the most interesting thing in this blog post, but there it is.</p><h2><a name="why-racket-why-not-haskell"></a>Why Racket? Why <em>not</em> Haskell?</h2><p>One of the most common questions I received is why I used Racket as the implementation language instead of Haskell. This is a decent question, and I think it likely stems at least in part from an item of common confusion: <strong>Racket is actually two things, a programming language and a programming language platform</strong>. The fact that the two things have the same name is probably not ideal, but it’s what we’ve got.</p><p>Racket-the-language is obviously the primary language used on the Racket platform, but there’s actually surprisingly little need for that to be the case; it’s simply the language that is worked on the most. Much of the Racket tooling, including the compiler, macroexpander, and IDE, is actually totally language agnostic. 
If someone came along and wrote a language that got more popular than <code>#lang racket</code>, then there wouldn’t really be anything hardcoded into any existing tooling that would give the impression that <code>#lang racket</code> was ever the more “dominant” language, aside from the name.</p><p>For this reason, Racket is ideal for implementing new programming languages, more so than pretty much any other platform out there. The talk I linked to in the previous blog post, <a href="https://www.youtube.com/watch?v=TfehOLha-18">Languages in an Afternoon</a>, describes this unique capability. It’s short, only ~15 minutes, but if you’re not into videos, I can try and explain why Racket is so brilliant for this sort of thing.</p><p>By leveraging the Racket platform instead of implementing my language from scratch, I get the following things pretty much for free:</p><ol><li><p>I get a JIT compiler for my code, and I don’t have to implement a compiler myself.</p></li><li><p>I also get a package manager that can cooperate with Hackett code to deliver Hackett modules.</p></li><li><p>I get a documentation system that is fully indexed and automatically locally installed when you install Hackett or any package written in Hackett, and that documentation is automatically integrated with the editor.</p></li><li><p>The DrRacket IDE can be used out of the box with Hackett code. It automatically does syntax highlighting and indenting, and it even provides interactive tools for inspecting bindings (something that I demo in my aforementioned talk).</p></li><li><p>If you don’t want to use DrRacket, you can use the <a href="https://github.com/greghendershott/racket-mode">racket-mode</a> major mode for Emacs, which uses the same sets of tools that DrRacket uses under the hood, so you get most of the same DrRacket goodies without sacrificing Emacs’s power of customization.</p></li></ol><p>Reimplementing all of that in another language would take years of work, and I haven’t even 
mentioned Racket’s module system and macroexpander, which are the underpinnings of Hackett. GHC’s typechecker is likely roughly as complex as Racket’s macroexpander combined with its module system, but I am not currently implementing GHC’s typechecker, since I do not need all of OutsideIn(X)’s features, just Haskell 98 + some extensions.</p><p>In contrast, I truly do need all of the Racket macroexpander to implement Hackett, since the <em>Type Systems as Macros</em> paper uses pretty much every trick the Racket macro system has to offer to implement typechecking as macroexpansion. For those reasons, implementing the Racket macroexpander <strong>alone</strong> in Haskell would likely be monumentally more work than implementing a Hindley-Milner typechecker in Racket, so it doesn’t really make sense to use Haskell for that job.</p><h3><a name="actually-running-hackett-code"></a>Actually running Hackett code</h3><p>Now, it’s worth noting that GHC is much more efficient as a compiler than Racket is, for a whole host of reasons. However, since typechecking and macroexpansion are inherently strictly compile-time phases, it turns out to be totally feasible to run the typechecker/macroexpander in Racket (since in Hackett, the two things are one and the same), then compile the resulting fully-expanded, well-typed code to GHC Core. That could then be handed off to GHC itself and compiled using the full power of the GHC optimizer and compiler toolchain.</p><p>This would be no small amount of work, but it seems theoretically possible, so eventually it’s something I’d love to look into. There are various complexities to making it work, but I think it would let me get the best of both worlds without reinventing the wheel, so it’s something I want long-term.</p><p>There’s also the question of how “native” Hackett code would be, were it compiled to GHC Core. Would Hackett code be able to use Haskell libraries, and vice versa? My guess is that the answer is “yes, with some glue”. 
It probably wouldn’t be possible to do it completely seamlessly, because Hackett provides type information at macroexpansion time that likely wouldn’t exist in the same form in GHC. It might be possible to do some incredibly clever bridging to be able to use Haskell libraries in Hackett almost directly, but the inverse might not be true if a library’s interface depends on macros.</p><h2><a name="how-do-template-haskell-quasiquoters-compete-with-macros"></a>How do Template Haskell quasiquoters compete with macros?</h2><p>Quasiquoters have a number of drawbacks, but the two main ones are complexity and lack of composition.</p><p>S-expressions happen to be simple, and this means s-expression macros have two lovely properties: they’re easy to write, given good libraries (Racket has <a href="http://docs.racket-lang.org/syntax/stxparse.html"><code>syntax/parse</code></a>), and they’re easy for tools to understand. Quasiquoters force implementors to write their own parsers from raw strings of characters, which is quite a heavy burden, and it usually means those syntaxes are confusing and brittle. To give a good example, consider <a href="http://www.yesodweb.com/book/persistent#persistent_code_generation">persistent’s quasiquoters</a>: they look <em>sort of</em> like Haskell data declarations, but they’re not really, and I honestly have no idea what their actual syntax really is. It feels pretty finicky, though. In contrast, an s-expression based version of the same syntax would basically look just like the usual datatype declaration form, plus perhaps some extra goodies.</p><p>Additionally, s-expression macros <em>compose</em>, and this should probably be valued more than anything else. If you’re writing code that doesn’t compose, it’s usually a bad sign. So much of functional programming is about writing small, reusable pieces of code that can be composed together, and macros are no different. 
Racket’s <code>match</code>, for example, is an expression, and it contains expressions, so <code>match</code> can be nested within itself, as well as within other arbitrary macros that produce expressions. Similarly, many Racket macros can be extended, which is possible due to having such uniform syntax.</p><p>Making macros “stand out” is an issue of some subjectivity, but in my experience such a fear of macros tends to stem from a familiarity with bad macro systems (which, to be fair, is almost all of them) and poor tooling. I’ve found that, in practice, the main reason people want to know “is this a macro??” is that macros are scary black boxes and they want to know which things to be suspicious of.</p><p>Really, though, what makes macros complicated isn’t knowing which things are macros; it’s knowing <em>which identifiers are uses and which identifiers are bindings</em>, and things like that. Just knowing that something is a macro use doesn’t actually help at all there—the syntax won’t tell you. <a href="http://i.imgur.com/HvYee19.png">Solve that problem with tools that address the problem head on, not by making a syntax that makes macros second-class citizens.</a> One of the reasons I used the phrase “syntactic abstractions” in my previous blog post is that you specifically want them to be <strong>abstractions</strong>. If you have to think of a macro in terms of the thing it expands to, then it isn’t a very watertight abstraction. You don’t think about Haskell pattern-matching in terms of what the patterns compile to; you just use them. Macros should be (and can be) just as fluid.</p><h2><a name="how-can-i-help"></a>How can I help?</h2><p>Right now, what I really need is someone who understands type system implementation. You don’t need to be up to date on what’s cutting edge—I’m not implementing anything nearly as complicated as GADTs or dependent types yet—you just need to understand how to implement Haskell 98. 
If you have that knowledge and you’re interested in helping, even if it just means answering some of my questions, please contact me via email, IRC (the #racket channel on Freenode is a good place for now), or Slack (I’m active in the snek Slack community, <a href="http://snek.jneen.net">which you can sign up for here</a>).</p><p>If you aren’t familiar with those things, but you’re still interested in helping out, there’s definitely plenty of work that needs doing. If you want to find somewhere you can pitch in, contacting me via any of the above means is totally fine, and I can point you in the right direction. Even if you just want to be a guinea pig, that’s useful.</p><ol class="footnotes"></ol></article>Rascal: a Haskell with more parentheses2017-01-02T00:00:00Z2017-01-02T00:00:00ZAlexis King<article><blockquote><p><strong>Note</strong>: since the writing of this blog post, Rascal has been renamed to Hackett. You can read about why in <a href="/blog/2017/01/05/rascal-is-now-hackett-plus-some-answers-to-questions/">the followup blog post</a>.</p></blockquote><p>“Hey! You got your Haskell in my Racket!”</p><p>“No, you got <em>your</em> Racket in <em>my</em> Haskell!”</p><p>Welcome to the <a href="https://github.com/lexi-lambda/hackett">Rascal</a> programming language.</p><h2><a name="why-rascal"></a>Why Rascal?</h2><p>Why yet <em>another</em> programming language? Anyone who knows me knows that I already have two programming languages that I <em>really</em> like: Haskell and Racket. Really, I think they’re both great! Each brings some things to the table that aren’t really available in any other programming language I’ve ever used.</p><p>Haskell, in many ways, is a programming language that fits my mental model of how to structure programs better than any other programming language I’ve used. Some people would vehemently disagree, and it seems that there is almost certainly some heavy subjectivity in how people think about programming. 
I think Haskell’s model is awesome once you get used to it, but this blog post is not really going to try to convince you that you should care about Haskell (though that <em>is</em> something I want to write at some point). What you <em>should</em> understand, though, is that to me, Haskell is pretty close to what I want in a programming language.</p><p>At the same time, though, Haskell has problems, and many of them revolve around its story for metaprogramming. “Metaprogramming” is another M word that people seem to be very afraid of, and for good reason: most metaprogramming systems are ad-hoc, unsafe, unpredictable footguns that require delicate care to use properly, and <em>even then</em> the resulting code is brittle and difficult to understand. Haskell doesn’t suffer from this problem as much as some languages, but it isn’t perfect by any means: Haskell has at least two different metaprogramming systems (generics and Template Haskell) that are designed for different tasks, but they’re both limited in scope and both tend to be pretty complicated to use.</p><p>Discussing the merits and drawbacks of Haskell’s various metaprogramming capabilities is also outside the scope of this blog post, but there’s one <em>fact</em> that I want to bring up, which is that <strong>Haskell does not provide any mechanism for adding syntactic abstractions to the language</strong>. What do I mean by this? Well, in order to understand what a “syntactic abstraction” is and why you should care about it, I want to shift gears a little and take a look at why Racket is so amazing.</p><h3><a name="a-programmable-programming-language-theory-and-practice"></a>A programmable programming language: theory and practice</h3><p>I feel confident in saying that Racket has <em>the</em> most advanced macro system in the world, and it is pretty much unparalleled in that space. 
There are many languages with powerful type systems, but Racket is more or less alone in many of the niches it occupies. Racket has a large number of innovations that I don’t know of in any other programming language, and a significant portion of them focus on making Racket a <a href="http://www.ccs.neu.edu/home/matthias/manifesto/">programmable programming language, a language for building languages</a>.</p><p>This lofty goal is backed up by decades of research, providing Racket with an unparalleled toolkit for creating languages that can communicate, be extended, and even cooperate with tooling to provide introspection and error diagnostics. Working in Haskell feels like carefully designing a mould that cleanly and precisely fits your domain, carefully carving, cutting, and whittling. In contrast, working with Racket feels like moulding your domain until it looks the way <em>you</em> want it to look, poking and prodding at a pliable substrate. The sheer <em>ease</em> of it all is impossible for me to convey in words, so <a href="https://twitter.com/andmkent_/status/724036694773628930">you will have to see it for yourself</a>.</p><p>All this stuff is super abstract, though. What does it mean for practical programming, and why should you care? Well, I’m not going to try and sell you if you’re extremely skeptical, but if you’re interested, <a href="https://www.youtube.com/watch?v=TfehOLha-18">I gave a talk on some of Racket’s linguistic capabilities last year called <em>Languages in an Afternoon</em></a>. If you’re curious, give it a watch, and you might find yourself (hopefully) a little impressed. 
If you prefer reading, well, I have some <a href="/blog/2015/12/21/adts-in-typed-racket-with-macros/">blog posts</a> on this very blog that <a href="/blog/2015/08/30/managing-application-configuration-with-envy/">demonstrate what Racket can do</a>.</p><p>The basic idea, though, is that by having a simple syntax and a powerful macro system with a formalization of lexical scope, users can effectively invent entirely new language constructs as ordinary libraries, constructs that would have to be core forms in other programming languages. For example, Racket supports pattern-matching, but it isn’t built into the compiler: it’s simply implemented in the <code>racket/match</code> module distributed with Racket. Not only is it defined in ordinary Racket code, it’s actually <em>extensible</em>, so users can add their own pattern-matching forms that cooperate with <code>match</code>.</p><p>This is the power of a macro system to produce “syntactic abstractions”, things that can transform the way a user thinks of the code they’re writing. Racket has the unique capability of making these abstractions both easy to write and watertight, so instead of being a scary tool you have to handle with extreme care, you can easily whip up a powerful, user-friendly embedded domain specific language in a matter of <em>minutes</em>, and it’ll be safe, provide error reporting for misuse, and cooperate with existing tooling pretty much out of the box.</p><h3><a name="fusing-haskell-and-racket"></a>Fusing Haskell and Racket</h3><p>So, let’s assume that we <em>do</em> want Haskell’s strong type system and that we <em>also</em> want a powerful metaprogramming model that permits syntactic extensions. What would that look like? Well, one way we could do it is to put one in front of the other: macro expansion is, by nature, a compile-time pass, so we could stick a macroexpander in front of the typechecker. 
This leads to a simple technique: first, macroexpand the program to erase the macros, then typecheck it and erase the types, then send the resulting code off to be compiled. This technique has the following properties:</p><ol><li><p>First of all, <strong>it’s easy to implement</strong>. Racket’s macroexpander, while complex, is well-documented in academic literature and works extremely well in practice. In fact, this strategy has already been implemented! Typed Racket, the gradually-typed sister language of Racket, expands every program before typechecking. It would be possible to effectively create a “Lisp-flavored Haskell” by using this technique, and it might not even be that hard.</p></li><li><p>Unfortunately, there’s a huge problem with this approach: <strong>type information is not available at macroexpansion time</strong>. This is the real dealbreaker with the “expand, then typecheck” model, since static type information is some of the most useful information possibly available to a macro writer. In an ideal world, macros should not only have access to type information, they should be able to manipulate it and metaprogram the typechecker as necessary, but if macroexpansion is a separate phase from typechecking, then that information simply doesn’t exist yet.</p></li></ol><p>For me, the second property is unacceptable. I am <em>not</em> satisfied by a “Lisp-flavored Haskell”; I want my types and macros to be able to cooperate and communicate with each other. The trouble, though, is that solving that problem is really, really hard! For a couple of years now, I’ve been wishing this ideal language existed, but I’ve had no idea how to make it actually work. Template Haskell implements a highly restricted system of interweaving typechecking and splice evaluation, but it effectively does it by running the typechecker and the splice expander alternately, splitting the source into chunks and typechecking them one at a time. 
This works okay for Template Haskell, but for the more powerful macro system I am looking for, it wouldn’t scale.</p><p>There’s something a little bit curious, though, about the problem as I just described it. The processes of “macroexpanding the program to erase the macros” and “typechecking the program to erase the types” sound awfully similar. It seems like maybe these are two sides of the same coin, and it would be wonderful if we could encode one in terms of the other, effectively turning the two passes into a single, unified pass. Unfortunately, while this sounds great, I had no idea how to do this (and it didn’t help that I really had no idea how existing type systems were actually implemented).</p><p>Fortunately, last year, Stephen Chang, Alex Knauth, and Ben Greenman put together a rather exciting paper called <a href="http://www.ccs.neu.edu/home/stchang/popl2017/"><em>Type Systems as Macros</em></a>, which does precisely what I just described, and it delivers it all in a remarkably simple and elegant presentation. The idea is to “distribute” the task of typechecking over the individual forms of the language, leveraging existing macro communication facilities available in the Racket macroexpander to propagate type information as macros are expanded. To me, it was exactly what I was looking for, and I almost immediately started playing with it and seeing what I could do with it.</p><p>The result is <a href="https://github.com/lexi-lambda/hackett"><em>Rascal</em></a>, a programming language built in the Racket ecosystem that attempts to implement a Haskell-like type system.</p><h2><a name="a-first-peek-at-rascal"></a>A first peek at Rascal</h2><p>Rascal is a very new programming language I’ve only been working on over the past few months. It is extremely experimental, riddled with bugs, half-baked, and may turn your computer into scrambled eggs. 
Still, while I might not recommend that you actually <em>use</em> it just yet, I want to try and share what it is I’m working on, since I’d bet at least a few other people will find it interesting, too.</p><p>First, let me say this up front: <strong>Rascal is probably a lot closer to Haskell than Racket</strong>. That might come as a surprise, given that Rascal has very Lisp-y syntax, it’s written in Racket, and it runs on the Racket platform, but semantically, Rascal is mostly just Haskell 98. This is important, because it may come as a surprise, given that there are so few statically typed Lisps, but there’s obviously no inherent reason that Lisps need to be dynamically typed. They just seem to have mostly evolved that way.</p><p>Taking a look at a snippet of Rascal code, it’s easy to see that the language doesn’t work quite like a traditional Lisp, though:<sup><a href="#footnote-1" id="footnote-ref-1-1">1</a></sup></p><pre><code>(def+ map-every-other : (forall [a] {{a -> a} -> (List a) -> (List a)})
  [_ nil -> nil]
  [_ {x :: nil} -> {x :: nil}]
  [f {x :: y :: ys} -> {x :: (f y) :: (map-every-other f ys)}])
</code></pre><p>This is a Lisp with all the goodies you would expect out of Haskell: static types, parametric polymorphism, automatically curried functions, algebraic datatypes, pattern-matching, infix operators, and of course, <em>typeclasses</em>. Yes, with Rascal you can have your monads in all their statically dispatched glory:</p><pre><code>(data (Maybe a)
  (just a)
  nothing)

(instance (Monad Maybe)
  [join (case-lambda
          [(just (just x)) (just x)]
          [_ nothing])])
</code></pre><p>So far, though, this really <em>is</em> just “Haskell with parentheses”. As alluded to above, however, Rascal is a bit more than that.</p><h3><a name="core-forms-can-be-implemented-as-derived-concepts"></a>Core forms can be implemented as derived concepts</h3><p>Rascal’s type system is currently very simple, being nothing more than Hindley-Milner plus ad-hoc polymorphism in the form of typeclasses. Something interesting to note about it is that it does not implement ADTs or pattern-matching anywhere in the core! In fact, ADTs are defined as two macros <code>data</code> and <code>case</code>, in an entirely separate module, which can be imported just like any other library.</p><p>The main <code>rascal</code> language provides ADTs by default, of course, but it would be perfectly possible to produce a <code>rascal/kernel</code> language which does not include them at all. In this particular case, it seems unlikely that Rascal programmers would want their own implementation of ADTs, but it’s an interesting proof of concept, and it hints at other “core” features that could be implemented using macros.</p><p>Simple syntactic transformations are, of course, trivially defined as macros. Haskell <code>do</code> notation is defined as <a href="https://github.com/lexi-lambda/hackett/blob/87d001a82c86fb66544d25c37ffba9be1ac63464/rascal-lib/rascal/monad.rkt#L48-L58">an eleven-line macro in <code>rascal/monad</code></a>, and GHC’s useful <code>LambdaCase</code> extension is also possible to implement without modifying Rascal at all. This is useful, because there are many syntactic shorthands that are extremely useful to implement, but don’t make any sense to be in GHC because they are specific to certain libraries or applications. 
Racket’s macro system makes those not only possible, but actually pretty easy.</p><p>While the extent of what is possible to implement as derived forms remains to be seen, many useful GHC features seem quite possible to implement without touching the core language, including things like <code>GeneralizedNewtypeDeriving</code> and other generic deriving mechanisms like <code>GHC.Generics</code>, <code>DeriveGeneric</code>, and <code>DeriveAnyClass</code>.</p><h3><a name="the-language-is-not-enough"></a>The language is not enough</h3><p>No language is perfect. Most people would agree with this, but I would take it a step further: no language is even sufficient! This makes a lot of sense, given that general-purpose programming languages are designed to do <em>everything</em>, and it’s impossible to do everything well.</p><p>Haskell programmers know this, and they happily endorse the creation of embedded domain specific languages. These are fantastic, and we need more of them. Things like <a href="http://hackage.haskell.org/package/servant">servant</a> let me write a third of the code I might otherwise need to, and the most readable code is the code you didn’t have to write in the first place. DSLs are good.</p><p>Unfortunately, building DSLs is traditionally difficult, in large part because building embedded DSLs means figuring out a way to encode your domain into your host language of choice. Sometimes, your domain simply does not elegantly map to your host language’s syntax or semantics, and you have to come up with a compromise. This is easy to see with servant, which, while it does a remarkably good job, still has to resort to some very clever type magic to create some semblance of an API description in Haskell types:</p><pre><code>type UserAPI = "users" :> Get '[JSON] [User]
          :<|> "users" :> ReqBody '[JSON] User :> Post '[JSON] User
          :<|> "users" :> Capture "userid" Integer
                       :> Get '[JSON] User
          :<|> "users" :> Capture "userid" Integer
                       :> ReqBody '[JSON] User
                       :> Put '[JSON] User
</code></pre><p>The above code is <em>remarkably</em> readable for what it is, but what if we didn’t have to worry about working within the constraints of Haskell’s syntax? What if we could design a syntax that was truly the best for the job? Perhaps we would come up with something like this:</p><pre><code>(define-api User-API
  #:content-types [JSON]
  [GET "users" => (List User)]
  [POST "users" => User -> User]
  [GET "users" [userid : Integer] => User]
  [PUT "users" [userid : Integer] => User -> User])
</code></pre><p>This would be extremely easy to write with Racket’s macro-writing utilities, and it could even be made extensible. This could also avoid having to do the complicated typeclass trickery servant has to perform to then generate code from the above specification, since it would be much easier to just generate the necessary code directly (while still maintaining type safety).</p><p>In addition to the type-level hacks that Haskell programmers often have to pull in order to make these kinds of fancy DSLs work, free monads tend to be used to create domain-specific languages. This works okay for some DSLs, but remember that when you use a free monad, you are effectively writing a <em>runtime interpreter</em> for your language! Macros, on the other hand, are compiled, and you get the ability to <em>compile</em> your DSL to code that can be optimized by all the existing facilities of the compiler toolchain.</p><h2><a name="rascal-is-embryonic"></a>Rascal is embryonic</h2><p>I’m pretty excited about Rascal. I think that it could have the potential to do some pretty interesting things, and I have some ideas in my head for how having macros in a Haskell-like language could change things. I also think that, based on what I’ve seen so far, having both macros and a Haskell-like type system could give rise to <em>completely</em> different programming paradigms than exist in either Haskell or Racket today. My gut tells me that this is a case where the whole might actually be greater than the sum of its parts.</p><p>That said, Rascal doesn’t really exist yet. Yes, <a href="https://github.com/lexi-lambda/hackett">there is a GitHub repository</a>, and it has some code in it that does… something. 
Unfortunately, the code is also currently extremely buggy, to the point of being borderline broken, and it’s also in such early stages that you can’t really do <em>anything</em> interesting with it, aside from some tiny toy programs.</p><p>As I have worked on Rascal, I’ve come to a somewhat unfortunate conclusion, which is that I really have almost zero interest in implementing type systems. I felt that way before I started the project, but I was hoping that maybe once I got into them, I would find them more interesting. Unfortunately, as much as I love working with powerful type systems (and really, I adore working with Haskell and using all the fancy features GHC provides), I find implementing the software that makes them tick completely dull.</p><p>Still, I’m willing to invest the time to get something that I can use. Even so, resources for practical type system implementation are scarce. I want to thank <a href="https://web.cecs.pdx.edu/~mpj/">Mark P Jones</a> for his wonderful resource <a href="https://web.cecs.pdx.edu/~mpj/thih/">Typing Haskell in Haskell</a>, without which getting to where I am now would likely have been impossible. I also want to thank <a href="http://www.stephendiehl.com">Stephen Diehl</a> for his wonderful <a href="http://dev.stephendiehl.com/fun/">Write You a Haskell</a> series, which was also wonderfully useful to study, even if it is unfinished and doesn’t cover anything beyond ML just yet.</p><p>Even with these wonderful resources, I’ve come to the realization that <strong>I probably can’t do all of this on my own</strong>. I consider myself pretty familiar with macros and macro expanders at this point, but I don’t know much about type systems (at least not their implementation), and I could absolutely use some help. 
So if you’re interested in Rascal and think you might be able to pitch in, please: I would appreciate even the littlest bits of help or guidance!</p><p>In the meantime, I will try to keep picking away at Rascal in the small amount of free time I currently have. Thanks, as always, to all the amazing people who have contributed to the tools I’ve been using for this project: special thanks to the authors of <em>Type Systems as Macros</em> for their help as well as the people I mentioned just above, and also to all of the people who have built Racket and Haskell and made them what they are today. Without them, Rascal would most definitely not exist.</p><ol class="footnotes"><li id="footnote-1"><p>Note that most of the Rascal code in this blog post probably doesn’t actually work on the current Rascal implementation. Pretty much all of it can be implemented in the current implementation, the syntax just isn’t quite as nice yet. <a href="#footnote-ref-1-1">↩</a></p></li></ol></article>Using types to unit-test in Haskell2016-10-03T00:00:00Z2016-10-03T00:00:00ZAlexis King<article><p>Object-oriented programming languages make unit testing easy by providing obvious boundaries between units of code in the form of classes and interfaces. These boundaries make it easy to stub out parts of a system to test functionality in isolation, which makes it possible to write fast, deterministic test suites that are robust in the face of change. When writing Haskell, it can be unclear how to accomplish the same goals: even inside pure code, it can become difficult to test a particular code path without also testing all its collaborators.</p><p>Fortunately, by taking advantage of Haskell’s expressive type system, it’s possible to not only achieve parity with object-oriented testing techniques, but also to provide stronger static guarantees as well. 
Furthermore, it’s all possible without resorting to extra-linguistic hacks that static object-oriented languages sometimes use for mocking, such as dynamic bytecode generation.</p><h2><a name="first-an-aside-on-testing-philosophy"></a>First, an aside on testing philosophy</h2><p>Testing methodology is a controversial topic within the larger programming community, and there are a multitude of different approaches. This blog post is about <em>unit testing</em>, an already nebulous term with a number of different definitions. For the purposes of this post, I will define a unit test as a test that stubs out collaborators of the code under test in some way. Accomplishing that in Haskell is what this is primarily about.</p><p>I want to be clear that I do not think that unit tests are the only way to write tests, nor the best way, nor even always an applicable way. Depending on your domain, rigorous unit testing might not even make sense, and other forms of tests (end-to-end, integration, benchmarks, etc.) might fulfill your needs.</p><p>In practice, though, implementing those other kinds of tests seems to be well-documented in Haskell compared to pure, object-oriented style unit testing. As my Haskell applications have grown, I have found myself wanting a more fine-grained testing tool that allows me to both test a piece of my codebase in isolation and also use my domain-specific types. This blog post is about that.</p><p>With that disclaimer out of the way, let’s talk about testing in Haskell.</p><h2><a name="drawing-seams-using-types"></a>Drawing seams using types</h2><p>One of the primary attributes of unit tests in object-oriented languages, especially statically-typed ones, is the concept of “seams” within a codebase. These are internal boundaries between components of a system. Some boundaries are obvious—interactions with a database, manipulation of the file system, and performing I/O over the network, to name a few examples—but others are more subtle. 
Especially in larger codebases, it can be helpful to isolate two related but distinct pieces of functionality as much as possible, which makes them easier to reason about, even if they’re actually part of the same codebase.</p><p>In OO languages, these seams are often marked using interfaces, whether explicitly (in the case of static languages) or implicitly (in the case of dynamic ones). By programming to an interface, it’s possible to create “fake” implementations of that interface for use in unit tests, effectively making it possible to stub out code that isn’t directly relevant to the code being tested.</p><p>In Haskell, representing these seams is a lot less obvious. Consider a fairly trivial function that reverses a file’s contents on the file system:</p><pre><code class="pygments"><span class="nf">reverseFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">reverseFile</span> <span class="n">path</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">contents</span> <span class="ow"><-</span> <span class="n">readFile</span> <span class="n">path</span>
  <span class="n">writeFile</span> <span class="n">path</span> <span class="p">(</span><span class="n">reverse</span> <span class="n">contents</span><span class="p">)</span></code></pre><p>This function is impossible to test without testing against a real file system. It simply performs I/O directly, and there’s no way to “mock out” the file system for testing purposes. Now, admittedly, this function is so trivial that a unit test might not seem worth the cost, but consider a slightly more complicated function that interacts with a database:</p><pre><code class="pygments"><span class="nf">renderUserProfile</span> <span class="ow">::</span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="kt">HTML</span>
<span class="nf">renderUserProfile</span> <span class="n">userId</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">user</span> <span class="ow"><-</span> <span class="n">fetchUser</span> <span class="n">userId</span>
  <span class="n">posts</span> <span class="ow"><-</span> <span class="n">fetchRecentPosts</span> <span class="n">userId</span>
  <span class="n">return</span> <span class="o">$</span> <span class="n">div</span>
    <span class="p">[</span> <span class="n">h1</span> <span class="p">(</span><span class="n">userName</span> <span class="n">user</span> <span class="o"><></span> <span class="s">"’s Profile"</span><span class="p">)</span>
    <span class="p">,</span> <span class="n">h2</span> <span class="s">"Recent Posts"</span>
    <span class="p">,</span> <span class="n">ul</span> <span class="p">(</span><span class="n">map</span> <span class="p">(</span><span class="n">li</span> <span class="o">.</span> <span class="n">postTitle</span><span class="p">)</span> <span class="n">posts</span><span class="p">)</span>
    <span class="p">]</span></code></pre><p>It might now be a bit more clear that it could be useful to test the above function without running a real database and doing all the necessary context setup before each test case. Indeed, it would be nice if a test could just provide stubbed implementations for <code>fetchUser</code> and <code>fetchRecentPosts</code>, then make assertions about the output.</p><p>One way to solve this problem is to pass the results of those two functions to <code>renderUserProfile</code> as arguments, turning it into a pure function that could be easily tested. This becomes obnoxious for functions of even just slightly more complexity, though (it is not unreasonable to imagine needing a handful of different queries to render a user’s profile page), and it requires significantly restructuring code simply because the tests need it.</p><p>The above code is not only difficult to test, however—it has another problem, too. Specifically, both functions return <code>IO</code> values, which means they can effectively do <em>anything</em>. Haskell has a very strong type system for typing terms, but it doesn’t provide any guarantees about effects beyond a simple yes/no answer about function purity. Even though the <code>renderUserProfile</code> function should really only need to interact with the database, it could theoretically delete files, send emails, make HTTP requests, or do any number of other things.</p><p>Fortunately, it’s possible to solve <em>both</em> problems—a lack of testability and a lack of type safety—using the same general technique. 
This approach is reminiscent of the interface-based seams of object-oriented languages, but unlike most object-oriented approaches, it provides additional type safety guarantees without the need to explicitly modify the code to support some kind of dependency injection.</p><h3><a name="making-implicit-interfaces-explicit"></a>Making implicit interfaces explicit</h3><p>Statically typed, object-oriented languages provide interfaces as a language construct to encode certain kinds of contracts into the type system, and Haskell has something similar. Typeclasses are, in many ways, an analog to OO interfaces, and they can be used in a similar way. In the above case, let’s write down interfaces that the <code>reverseFile</code> and <code>renderUserProfile</code> functions can use:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFS</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">String</span>
  <span class="n">writeFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>

<span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadDB</span> <span class="n">m</span> <span class="kr">where</span>
  <span class="n">fetchUser</span> <span class="ow">::</span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">User</span>
  <span class="n">fetchRecentPosts</span> <span class="ow">::</span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">[</span><span class="kt">Post</span><span class="p">]</span></code></pre><p>The really nice thing about these interfaces is that our function implementations don’t have to change <em>at all</em> to take advantage of them. In fact, all we have to change is their types:</p><pre><code class="pygments"><span class="nf">reverseFile</span> <span class="ow">::</span> <span class="kt">MonadFS</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
<span class="nf">reverseFile</span> <span class="n">path</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">contents</span> <span class="ow"><-</span> <span class="n">readFile</span> <span class="n">path</span>
  <span class="n">writeFile</span> <span class="n">path</span> <span class="p">(</span><span class="n">reverse</span> <span class="n">contents</span><span class="p">)</span>

<span class="nf">renderUserProfile</span> <span class="ow">::</span> <span class="kt">MonadDB</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">HTML</span>
<span class="nf">renderUserProfile</span> <span class="n">userId</span> <span class="ow">=</span> <span class="kr">do</span>
  <span class="n">user</span> <span class="ow"><-</span> <span class="n">fetchUser</span> <span class="n">userId</span>
  <span class="n">posts</span> <span class="ow"><-</span> <span class="n">fetchRecentPosts</span> <span class="n">userId</span>
  <span class="n">return</span> <span class="o">$</span> <span class="n">div</span>
    <span class="p">[</span> <span class="n">h1</span> <span class="p">(</span><span class="n">userName</span> <span class="n">user</span> <span class="o"><></span> <span class="s">"’s Profile"</span><span class="p">)</span>
    <span class="p">,</span> <span class="n">h2</span> <span class="s">"Recent Posts"</span>
    <span class="p">,</span> <span class="n">ul</span> <span class="p">(</span><span class="n">map</span> <span class="p">(</span><span class="n">li</span> <span class="o">.</span> <span class="n">postTitle</span><span class="p">)</span> <span class="n">posts</span><span class="p">)</span>
    <span class="p">]</span></code></pre><p>This is pretty neat, since we haven’t had to alter our code at all, but we’ve managed to completely decouple ourselves from <code>IO</code>. This has the direct effect of both making our code more abstract (we no longer rely on the “real” file system or a “real” database, which makes our code easier to test) and restricting what our functions can do (just from looking at the type signatures, we know what side-effects they can perform).</p><p>Of course, since we’re now coding against an interface, our code doesn’t actually do much of anything. If we want to actually use the functions we’ve written, we’ll have to define instances of <code>MonadFS</code> and <code>MonadDB</code>. When actually running our code, we’ll probably still use <code>IO</code> (or some monad transformer stack with <code>IO</code> at the bottom), so we can define trivial instances for that existing use case:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadFS</span> <span class="kt">IO</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="ow">=</span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">readFile</span>
  <span class="n">writeFile</span> <span class="ow">=</span> <span class="kt">Prelude</span><span class="o">.</span><span class="n">writeFile</span>

<span class="kr">instance</span> <span class="kt">MonadDB</span> <span class="kt">IO</span> <span class="kr">where</span>
  <span class="n">fetchUser</span> <span class="ow">=</span> <span class="kt">SQL</span><span class="o">.</span><span class="n">fetchUser</span>
  <span class="n">fetchRecentPosts</span> <span class="ow">=</span> <span class="kt">SQL</span><span class="o">.</span><span class="n">fetchRecentPosts</span></code></pre><p>Even if we go no further, <strong>this is already incredibly useful</strong>. By restricting the sorts of effects our functions can perform at the type level, it becomes a lot easier to see which code is interacting with what. This can be invaluable when working in a part of a moderately large codebase that you are unfamiliar with. Even if the only instance of these typeclasses is <code>IO</code>, the benefits are immediately apparent.</p><p>Of course, this blog post is about testing, so we’re going to go further and take advantage of these seams we’ve now drawn. The question is: how?</p><h2><a name="testing-with-typeclasses-an-initial-attempt"></a>Testing with typeclasses: an initial attempt</h2><p>Given that we now have functions depending on an interface instead of <code>IO</code>, we can create separate instances of our typeclasses for use in tests. Let’s start with the <code>renderUserProfile</code> function. We’ll create a simple wrapper around the <code>Identity</code> type, since we don’t actually care much about the “effects” of our <code>MonadDB</code> methods:</p><pre><code class="pygments"><span class="kr">import</span> <span class="nn">Data.Functor.Identity</span>

<span class="c1">-- Deriving these instances through the newtype needs the GeneralizedNewtypeDeriving extension.</span>
<span class="kr">newtype</span> <span class="kt">TestM</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">TestM</span> <span class="p">(</span><span class="kt">Identity</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">)</span>

<span class="nf">unTestM</span> <span class="ow">::</span> <span class="kt">TestM</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">unTestM</span> <span class="p">(</span><span class="kt">TestM</span> <span class="p">(</span><span class="kt">Identity</span> <span class="n">x</span><span class="p">))</span> <span class="ow">=</span> <span class="n">x</span></code></pre><p>Now, we’ll create a trivial instance of <code>MonadDB</code> for <code>TestM</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadDB</span> <span class="kt">TestM</span> <span class="kr">where</span>
  <span class="n">fetchUser</span> <span class="kr">_</span> <span class="ow">=</span> <span class="n">return</span> <span class="kt">User</span> <span class="p">{</span> <span class="n">userName</span> <span class="ow">=</span> <span class="s">"Alyssa"</span> <span class="p">}</span>
  <span class="n">fetchRecentPosts</span> <span class="kr">_</span> <span class="ow">=</span> <span class="n">return</span>
    <span class="p">[</span> <span class="kt">Post</span> <span class="p">{</span> <span class="n">postTitle</span> <span class="ow">=</span> <span class="s">"Metacircular Evaluator"</span> <span class="p">}</span> <span class="p">]</span></code></pre><p>With this instance, it’s now possible to write a simple unit test of the <code>renderUserProfile</code> function that doesn’t need a real database running at all:</p><pre><code class="pygments"><span class="nf">spec</span> <span class="ow">=</span> <span class="n">describe</span> <span class="s">"renderUserProfile"</span> <span class="o">$</span> <span class="kr">do</span>
  <span class="n">it</span> <span class="s">"shows the user’s name"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">result</span> <span class="ow">=</span> <span class="n">unTestM</span> <span class="p">(</span><span class="n">renderUserProfile</span> <span class="p">(</span><span class="n">intToId</span> <span class="mi">1234</span><span class="p">))</span>
    <span class="n">result</span> <span class="p">`</span><span class="n">shouldContainElement</span><span class="p">`</span> <span class="n">h1</span> <span class="s">"Alyssa’s Profile"</span>
  <span class="n">it</span> <span class="s">"shows a list of the user’s posts"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">result</span> <span class="ow">=</span> <span class="n">unTestM</span> <span class="p">(</span><span class="n">renderUserProfile</span> <span class="p">(</span><span class="n">intToId</span> <span class="mi">1234</span><span class="p">))</span>
    <span class="n">result</span> <span class="p">`</span><span class="n">shouldContainElement</span><span class="p">`</span> <span class="n">ul</span> <span class="p">[</span> <span class="n">li</span> <span class="s">"Metacircular Evaluator"</span> <span class="p">]</span></code></pre><p>This is pretty nice, and running the above tests reveals a nice property of these kinds of isolated test cases: the test suite runs <em>really, really fast</em>. Communicating with a database, even in extremely simple ways, takes a measurable amount of time, especially with dozens of tests. In contrast, even with hundreds of tests, our unit test suite runs in less than a tenth of a second.</p><p>This all seems to be successful, so let’s try and apply the same testing technique to <code>reverseFile</code>.</p><h3><a name="testing-side-effectful-code"></a>Testing side-effectful code</h3><p>Looking at the type signature for <code>reverseFile</code>, we have a small problem:</p><pre><code class="pygments"><span class="nf">reverseFile</span> <span class="ow">::</span> <span class="kt">MonadFS</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span></code></pre><p>Specifically, the return type is <code>()</code>. Making any assertions against the result of this function would be completely worthless, given that it’s guaranteed to be the same exact thing each time. Instead, <code>reverseFile</code> is inherently side-effectful, so we want to be able to test that it properly interacts with the file system in the correct way.</p><p>In order to do this, a simple wrapper around <code>Identity</code> won’t be enough, but we can replace it with something more powerful: <code>Writer</code>. Specifically, we can use a writer monad to “log” what gets called in order to test side-effects. 
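</p><p>For readers who have not used <code>Writer</code> before, the following is a minimal, standalone sketch of the logging idea, not part of the test suite itself. It assumes the <code>mtl</code> package for <code>Control.Monad.Writer</code>, and the <code>greet</code> action is a made-up name used only for illustration:</p><pre><code>import Control.Monad.Writer (Writer, execWriter, tell)

-- A hypothetical action that records each step it takes.
-- `tell` appends its argument to the accumulated log.
greet :: String -> Writer [String] ()
greet name = do
  tell ["looked up " ++ name]
  tell ["greeted " ++ name]

-- execWriter runs the action and returns only the log.
main :: IO ()
main = print (execWriter (greet "Alyssa"))
-- prints ["looked up Alyssa","greeted Alyssa"]</code></pre><p>Here <code>execWriter</code> throws away the action’s result and keeps only the accumulated log, which is the part a test of side effects wants to inspect.</p><p>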
We’ll start by creating a new <code>TestM</code> type, just like last time:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">TestM</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">TestM</span> <span class="p">(</span><span class="kt">Writer</span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="p">[</span><span class="kt">String</span><span class="p">])</span>

<span class="nf">logTestM</span> <span class="ow">::</span> <span class="kt">TestM</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span>
<span class="nf">logTestM</span> <span class="p">(</span><span class="kt">TestM</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=</span> <span class="n">execWriter</span> <span class="n">w</span></code></pre><p>Using this slightly more powerful type, we can write a useful instance of <code>MonadFS</code> that will track the argument given to <code>writeFile</code>:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">MonadFS</span> <span class="kt">TestM</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="kr">_</span> <span class="ow">=</span> <span class="n">return</span> <span class="s">"hello"</span>
  <span class="n">writeFile</span> <span class="kr">_</span> <span class="n">contents</span> <span class="ow">=</span> <span class="n">tell</span> <span class="p">[</span><span class="n">contents</span><span class="p">]</span></code></pre><p>Again, the instance is quite simple, but it now enables us to write a straightforward unit test for <code>reverseFile</code>:</p><pre><code class="pygments"><span class="nf">spec</span> <span class="ow">=</span> <span class="n">describe</span> <span class="s">"reverseFile"</span> <span class="o">$</span>
  <span class="n">it</span> <span class="s">"reverses a file’s contents on the filesystem"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">calls</span> <span class="ow">=</span> <span class="n">logTestM</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span>
    <span class="n">calls</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"olleh"</span><span class="p">]</span></code></pre><p>Again, quite simple to both implement and use, and the test itself is blindingly fast. There’s another problem, though, which is that we have technically left part of <code>reverseFile</code> untested: we’ve completely ignored the <code>path</code> argument.</p><p>In this contrived example, it may seem silly to test something so trivial, but in real code, it’s quite possible that one would care very much about testing multiple different aspects about a single function. When testing <code>renderUserProfile</code>, this was not hard, since we could reuse the same <code>TestM</code> type and <code>MonadDB</code> instance for both test cases, but in the <code>reverseFile</code> example, we’ve ignored the path entirely.</p><p>We <em>could</em> adjust our <code>MonadFS</code> instance to also track the path provided to each method, but this has a few problems. First, it means every test case would depend on all the various properties we are testing, which would mean updating every test case when we add a new one. It would also be simply impossible if we needed to track multiple types—in this particular case, it turns out that <code>String</code> and <code>FilePath</code> are actually the same type, but in practice, there may be a handful of disparate, incompatible types.</p><p>Both of the above issues could be fixed by creating a sum type and manually filtering out the relevant elements in each test case, but a much more intuitive approach would be to simply have a separate instance for each case. Unfortunately, in Haskell, creating a new instance means creating an entirely new type. 
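</p><p>Before turning to that route, it may help to see roughly what the sum-type alternative could look like. This is only a sketch under assumptions: the <code>FSCall</code> type, the <code>stubReadFile</code> and <code>stubWriteFile</code> functions, and the <code>writtenContents</code> filter are hypothetical names standing in for a real <code>MonadFS</code> instance, and <code>Control.Monad.Writer</code> is assumed to come from <code>mtl</code>:</p><pre><code>import Control.Monad.Writer (Writer, execWriter, tell)

-- Hypothetical log entry: one constructor per kind of recorded fact.
data FSCall
  = ReadPath FilePath
  | WritePath FilePath
  | WriteContents String
  deriving (Eq, Show)

-- Stand-ins for the stubbed methods; they log everything they see.
stubReadFile :: FilePath -> Writer [FSCall] String
stubReadFile path = tell [ReadPath path] >> return "hello"

stubWriteFile :: FilePath -> String -> Writer [FSCall] ()
stubWriteFile path contents = tell [WritePath path, WriteContents contents]

-- Each test must filter the shared log down to the entries it cares about.
writtenContents :: [FSCall] -> [String]
writtenContents = concatMap pick
  where
    pick (WriteContents c) = [c]
    pick _ = []

main :: IO ()
main = do
  -- Mirrors reverseFile: read, reverse, write back to the same path.
  let calls = execWriter (stubReadFile "foo.txt" >>= (stubWriteFile "foo.txt" . reverse))
  print (writtenContents calls)
-- prints ["olleh"]</code></pre><p>Every test that shares this log has to filter it, which is the boilerplate a separate instance per test case would avoid, at the price of a new type for each one.</p><p>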
To illustrate how much duplication that would entail, we could create the following type and instance for testing proper propagation of the <code>path</code> argument:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">TestM'</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">TestM'</span> <span class="p">(</span><span class="kt">Writer</span> <span class="p">[</span><span class="kt">FilePath</span><span class="p">]</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">,</span> <span class="kt">MonadWriter</span> <span class="p">[</span><span class="kt">FilePath</span><span class="p">])</span>

<span class="nf">logTestM'</span> <span class="ow">::</span> <span class="kt">TestM'</span> <span class="n">a</span> <span class="ow">-></span> <span class="p">[</span><span class="kt">FilePath</span><span class="p">]</span>
<span class="nf">logTestM'</span> <span class="p">(</span><span class="kt">TestM'</span> <span class="n">w</span><span class="p">)</span> <span class="ow">=</span> <span class="n">execWriter</span> <span class="n">w</span>

<span class="kr">instance</span> <span class="kt">MonadFS</span> <span class="kt">TestM'</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="n">path</span> <span class="ow">=</span> <span class="n">tell</span> <span class="p">[</span><span class="n">path</span><span class="p">]</span> <span class="o">>></span> <span class="n">return</span> <span class="s">""</span>
  <span class="n">writeFile</span> <span class="n">path</span> <span class="kr">_</span> <span class="ow">=</span> <span class="n">tell</span> <span class="p">[</span><span class="n">path</span><span class="p">]</span></code></pre><p>Now it’s possible to add an extra test case that asserts that the proper path is provided to the two filesystem functions:</p><pre><code class="pygments"><span class="nf">spec</span> <span class="ow">=</span> <span class="n">describe</span> <span class="s">"reverseFile"</span> <span class="o">$</span> <span class="kr">do</span>
  <span class="n">it</span> <span class="s">"reverses a file’s contents on the filesystem"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">calls</span> <span class="ow">=</span> <span class="n">logTestM</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span>
    <span class="n">calls</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"olleh"</span><span class="p">]</span>

  <span class="n">it</span> <span class="s">"operates on the file at the provided path"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">paths</span> <span class="ow">=</span> <span class="n">logTestM'</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span>
    <span class="n">paths</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">"foo.txt"</span><span class="p">]</span></code></pre><p>This works, but it’s ultimately unacceptably complicated. Our test harness code is now significantly larger than the actual tests themselves, and the amount of boilerplate is frustrating. Verbose test suites are especially bad, since forcing programmers to jump through hoops just to implement a single test reduces the likelihood that people will actually write good tests, if they write tests at all. In contrast, if writing tests is easy, then people will naturally write more of them.</p><p>The above approach to writing tests is not good enough, but it does reveal a particular problem: in Haskell, typeclass instances are not first-class values that can be manipulated and abstracted over; they are static constructs that can only be managed by the compiler, and users do not have a direct way to modify them. With some cleverness, however, we can actually create an approximation of first-class typeclass dictionaries, which will allow us to dramatically simplify the above testing mechanism.</p><h2><a name="creating-first-class-typeclass-instances"></a>Creating first-class typeclass instances</h2><p>In order to provide an easy way to construct instances, we need a way to represent instances as ordinary Haskell values. This is not terribly difficult, given that instances are conceptually just records containing a collection of functions. For example, we could create a datatype that represents an instance of the <code>MonadFS</code> typeclass:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">MonadFSInst</span> <span class="n">m</span> <span class="ow">=</span> <span class="kt">MonadFSInst</span>
  <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">String</span>
  <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>
  <span class="p">}</span></code></pre><p>To avoid namespace clashes with the actual method identifiers, the record fields are prefixed with an underscore, but otherwise, the translation is remarkably straightforward. Using this record type, we can easily create values that represent the two instances we defined above:</p><pre><code class="pygments"><span class="nf">contentInst</span> <span class="ow">::</span> <span class="kt">MonadWriter</span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFSInst</span> <span class="n">m</span>
<span class="nf">contentInst</span> <span class="ow">=</span> <span class="kt">MonadFSInst</span>
  <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="kr">_</span> <span class="ow">-></span> <span class="n">return</span> <span class="s">"hello"</span>
  <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="kr">_</span> <span class="n">contents</span> <span class="ow">-></span> <span class="n">tell</span> <span class="p">[</span><span class="n">contents</span><span class="p">]</span>
  <span class="p">}</span>

<span class="nf">pathInst</span> <span class="ow">::</span> <span class="kt">MonadWriter</span> <span class="p">[</span><span class="kt">FilePath</span><span class="p">]</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">MonadFSInst</span> <span class="n">m</span>
<span class="nf">pathInst</span> <span class="ow">=</span> <span class="kt">MonadFSInst</span>
  <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="n">path</span> <span class="ow">-></span> <span class="n">tell</span> <span class="p">[</span><span class="n">path</span><span class="p">]</span> <span class="o">>></span> <span class="n">return</span> <span class="s">""</span>
  <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="n">path</span> <span class="kr">_</span> <span class="ow">-></span> <span class="n">tell</span> <span class="p">[</span><span class="n">path</span><span class="p">]</span>
  <span class="p">}</span></code></pre><p>These two values represent two different implementations of <code>MonadFS</code>, but since they’re ordinary Haskell values, they can be manipulated and even <em>extended</em> like any other records. This can be extremely useful, since it makes it possible to create a sort of “base” instance, then have individual test cases override individual pieces of functionality piecemeal.</p><p>Of course, although we’ve written these two instances, we have no way to actually use them. After all, Haskell does not offer a way to pass typeclass dictionaries explicitly. Fortunately, we can create a sort of “proxy” type that will use a reader to thread the dictionary around explicitly, and its <code>MonadFS</code> instance can defer to the dictionary’s implementation.</p><h3><a name="creating-an-instance-proxy"></a>Creating an instance proxy</h3><p>To represent our proxy type, we’ll use a combination of a <code>Writer</code> and a <code>ReaderT</code>: the former to implement the logging used by instances, and the latter to actually thread the dictionary around. Our type will look like this:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">TestM</span> <span class="n">log</span> <span class="n">a</span> <span class="ow">=</span>
    <span class="kt">TestM</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="p">(</span><span class="kt">MonadFSInst</span> <span class="p">(</span><span class="kt">TestM</span> <span class="n">log</span><span class="p">))</span> <span class="p">(</span><span class="kt">Writer</span> <span class="n">log</span><span class="p">)</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span> <span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span>
           <span class="p">,</span> <span class="kt">MonadReader</span> <span class="p">(</span><span class="kt">MonadFSInst</span> <span class="p">(</span><span class="kt">TestM</span> <span class="n">log</span><span class="p">))</span>
           <span class="p">,</span> <span class="kt">MonadWriter</span> <span class="n">log</span>
           <span class="p">)</span>
<span class="nf">logTestM</span> <span class="ow">::</span> <span class="kt">MonadFSInst</span> <span class="p">(</span><span class="kt">TestM</span> <span class="n">log</span><span class="p">)</span> <span class="ow">-></span> <span class="kt">TestM</span> <span class="n">log</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">log</span>
<span class="nf">logTestM</span> <span class="n">inst</span> <span class="p">(</span><span class="kt">TestM</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=</span> <span class="n">execWriter</span> <span class="p">(</span><span class="n">runReaderT</span> <span class="n">m</span> <span class="n">inst</span><span class="p">)</span></code></pre><p>This might look rather complicated, and it is, but let’s break down exactly what it’s doing.</p><ol><li><p>The <code>TestM</code> type includes two type parameters. The first is the type of value that will be logged (hence the name <code>log</code>), which corresponds to the argument to <code>Writer</code> from previous incarnations of <code>TestM</code>. Unlike those types, though, we want this version to work with any <code>Monoid</code>, so we’ll make it a type parameter. The second parameter is simply the type of the current monadic value, as before.</p></li><li><p>The type itself is defined as a wrapper around a small monad transformer stack, the outermost layer of which is <code>ReaderT</code>. The state threaded around by the reader is, in this case, the instance dictionary, which is <code>MonadFSInst</code>.</p><p>However, recall that <code>MonadFSInst</code> accepts a type variable (the type of a monad itself), so we must provide <code>TestM log</code> as an argument to <code>MonadFSInst</code>. This slight bit of indirection allows us to tie the knot between the mutually dependent instance dictionary and proxy type.</p></li><li><p>The base monad in the transformer stack is <code>Writer</code>, which is used to actually implement the logging functionality, just like in prior cases.
The only difference is that the <code>log</code> type parameter now determines what the writer actually produces.</p></li><li><p>Finally, as before, we use <code>GeneralizedNewtypeDeriving</code> to derive all the relevant <code>mtl</code> classes, adding the somewhat wordy <code>MonadReader</code> class to the list.</p></li></ol><p>Using this single type, we can now implement a <code>MonadFS</code> instance that defers to the dictionary carried around within <code>TestM</code>’s reader state:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">Monoid</span> <span class="n">log</span> <span class="ow">=></span> <span class="kt">MonadFS</span> <span class="p">(</span><span class="kt">TestM</span> <span class="n">log</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">readFile</span> <span class="n">path</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">f</span> <span class="ow"><-</span> <span class="n">asks</span> <span class="n">_readFile</span>
    <span class="n">f</span> <span class="n">path</span>
  <span class="n">writeFile</span> <span class="n">path</span> <span class="n">contents</span> <span class="ow">=</span> <span class="kr">do</span>
    <span class="n">f</span> <span class="ow"><-</span> <span class="n">asks</span> <span class="n">_writeFile</span>
    <span class="n">f</span> <span class="n">path</span> <span class="n">contents</span></code></pre><p>This may seem somewhat boilerplate-y, and it is to some extent, but the important consideration is that this boilerplate only needs to be written <em>once</em>. With this in place, it’s now possible to write an arbitrary number of first-class instances that use the above mechanism without extending the mechanism at all.</p><p>To see what actually using this code would look like, let’s update the <code>reverseFile</code> tests to use the new <code>TestM</code> implementation, as well as the <code>contentInst</code> and <code>pathInst</code> dictionaries from earlier:</p><pre><code class="pygments"><span class="nf">spec</span> <span class="ow">=</span> <span class="n">describe</span> <span class="s">"reverseFile"</span> <span class="o">$</span> <span class="kr">do</span>
  <span class="n">it</span> <span class="s">"reverses a file’s contents on the filesystem"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">calls</span> <span class="ow">=</span> <span class="n">logTestM</span> <span class="n">contentInst</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span>
    <span class="n">calls</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"olleh"</span><span class="p">]</span>

  <span class="n">it</span> <span class="s">"operates on the file at the provided path"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">paths</span> <span class="ow">=</span> <span class="n">logTestM</span> <span class="n">pathInst</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span>
    <span class="n">paths</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">"foo.txt"</span><span class="p">]</span></code></pre><p>We can do a little bit better, though. Really, the definitions of <code>contentInst</code> and <code>pathInst</code> are specific to each test case. Ordinary typeclass instances cannot be scoped to any particular block, but since <code>MonadFSInst</code> is just an ordinary Haskell datatype, we can manipulate its values just like any other Haskell values. Therefore, we can just inline those instances’ definitions into the test cases themselves to keep them closer to the actual tests.</p><pre><code class="pygments"><span class="nf">spec</span> <span class="ow">=</span> <span class="n">describe</span> <span class="s">"reverseFile"</span> <span class="o">$</span> <span class="kr">do</span>
  <span class="n">it</span> <span class="s">"reverses a file’s contents on the filesystem"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">contentInst</span> <span class="ow">=</span> <span class="kt">MonadFSInst</span>
          <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="kr">_</span> <span class="ow">-></span> <span class="n">return</span> <span class="s">"hello"</span>
          <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="kr">_</span> <span class="n">contents</span> <span class="ow">-></span> <span class="n">tell</span> <span class="p">[</span><span class="n">contents</span><span class="p">]</span>
          <span class="p">}</span>
    <span class="kr">let</span> <span class="n">calls</span> <span class="ow">=</span> <span class="n">logTestM</span> <span class="n">contentInst</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span>
    <span class="n">calls</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"olleh"</span><span class="p">]</span>

  <span class="n">it</span> <span class="s">"operates on the file at the provided path"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">pathInst</span> <span class="ow">=</span> <span class="kt">MonadFSInst</span>
          <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="n">path</span> <span class="ow">-></span> <span class="n">tell</span> <span class="p">[</span><span class="n">path</span><span class="p">]</span> <span class="o">>></span> <span class="n">return</span> <span class="s">""</span>
          <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="n">path</span> <span class="kr">_</span> <span class="ow">-></span> <span class="n">tell</span> <span class="p">[</span><span class="n">path</span><span class="p">]</span>
          <span class="p">}</span>
    <span class="kr">let</span> <span class="n">paths</span> <span class="ow">=</span> <span class="n">logTestM</span> <span class="n">pathInst</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span>
    <span class="n">paths</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">"foo.txt"</span><span class="p">]</span></code></pre><p>This is pretty good. We’re now able to create inline instances of our <code>MonadFS</code> typeclass, which allows us to write extremely concise unit tests using ordinary Haskell typeclasses as system seams. We’ve managed to cut down on the boilerplate considerably, though we still have a couple of problems. For one, this example only uses a single typeclass containing only two methods. A real <code>MonadFS</code> typeclass would likely have at least a dozen methods for performing various filesystem operations, and writing out a dictionary entry for every single method, even the ones that aren’t used within the code under test, would be frustratingly verbose.</p><p>This problem is solvable, though. Since instances are just ordinary Haskell records, we can create a “base” instance that just throws an exception whenever any of its methods is called:</p><pre><code class="pygments"><span class="nf">baseInst</span> <span class="ow">::</span> <span class="kt">MonadFSInst</span> <span class="n">m</span>
<span class="nf">baseInst</span> <span class="ow">=</span> <span class="kt">MonadFSInst</span>
  <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="ne">error</span> <span class="s">"unimplemented instance method ‘_readFile’"</span>
  <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">=</span> <span class="ne">error</span> <span class="s">"unimplemented instance method ‘_writeFile’"</span>
  <span class="p">}</span></code></pre><p>Then a test for code that only uses <code>readFile</code> needs to override only that particular method, for example:</p><pre><code class="pygments"><span class="kr">let</span> <span class="n">myInst</span> <span class="ow">=</span> <span class="n">baseInst</span> <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="o">...</span> <span class="p">}</span></code></pre><p>Normally, of course, this would be a terrible idea. However, since this is all just test code, it can be extremely useful for quickly figuring out which methods need to be stubbed out for a particular test case. Since all the code actually gets run at test time, attempts to use unimplemented instance methods will immediately raise an error, informing the programmer which methods need to be implemented to make the test pass. This can also help to significantly cut down on the amount of effort it takes to implement each test.</p><p>Another problem is that our approach is specialized exclusively to <code>MonadFS</code>. What about functions that use both <code>MonadFS</code> <em>and</em> <code>MonadDB</code>, for example? Fortunately, that is not hard to solve, either. We can adapt the <code>MonadFSInst</code> type to include fields for all of the typeclasses relevant to our system, turning it into a generic test fixture of sorts:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">FixtureInst</span> <span class="n">m</span> <span class="ow">=</span> <span class="kt">FixtureInst</span>
  <span class="p">{</span> <span class="c1">-- MonadFS</span>
    <span class="n">_readFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">String</span>
  <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">::</span> <span class="kt">FilePath</span> <span class="ow">-></span> <span class="kt">String</span> <span class="ow">-></span> <span class="n">m</span> <span class="nb">()</span>

    <span class="c1">-- MonadDB</span>
  <span class="p">,</span> <span class="n">_fetchUser</span> <span class="ow">::</span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">User</span>
  <span class="p">,</span> <span class="n">_fetchRecentPosts</span> <span class="ow">::</span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">[</span><span class="kt">Post</span><span class="p">]</span>
  <span class="p">}</span></code></pre><p>Updating <code>TestM</code> to use <code>FixtureInst</code> instead of <code>MonadFSInst</code> is trivial, and all the rest of the infrastructure still works. However, this means that every time a new typeclass is added, three things need to be updated:</p><ol><li><p>Its methods need to be added to the <code>FixtureInst</code> record.</p></li><li><p>Those methods need to be given error-raising defaults in the <code>baseInst</code> value.</p></li><li><p>An actual instance of the typeclass needs to be written for <code>TestM</code> that defers to the <code>FixtureInst</code> value.</p></li></ol><p>Furthermore, most of this manual manipulation of methods is required every time a particular typeclass changes, whether that means adding a method, removing a method, renaming a method, or changing a method’s type. This is especially frustrating given that all this code is really just mechanical boilerplate that could all be derived from the set of typeclasses being tested.</p><p>That last point is especially important: aside from the instances themselves, every piece of boilerplate above can obviously be generated from existing types alone. With that piece of information in mind, we can do even better: we can use Template Haskell.</p><h2><a name="removing-the-boilerplate-using-test-fixture"></a>Removing the boilerplate using <code>test-fixture</code></h2><p>The above code was not only rather boilerplate-heavy, it was also pretty complicated. Fortunately, you don’t actually have to write it. Enter the library <a href="http://hackage.haskell.org/package/test-fixture"><code>test-fixture</code></a>:</p><pre><code class="pygments"><span class="kr">import</span> <span class="nn">Control.Monad.TestFixture</span>
<span class="kr">import</span> <span class="nn">Control.Monad.TestFixture.TH</span>
<span class="nf">mkFixture</span> <span class="s">"FixtureInst"</span> <span class="p">[</span><span class="kt">''MonadFS</span><span class="p">,</span> <span class="kt">''MonadDB</span><span class="p">]</span>
<span class="nf">spec</span> <span class="ow">=</span> <span class="n">describe</span> <span class="s">"reverseFile"</span> <span class="o">$</span> <span class="kr">do</span>
  <span class="n">it</span> <span class="s">"reverses a file’s contents on the filesystem"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">contentInst</span> <span class="ow">=</span> <span class="n">def</span>
          <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="kr">_</span> <span class="ow">-></span> <span class="n">return</span> <span class="s">"hello"</span>
          <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="kr">_</span> <span class="n">contents</span> <span class="ow">-></span> <span class="n">log</span> <span class="n">contents</span>
          <span class="p">}</span>
    <span class="kr">let</span> <span class="n">calls</span> <span class="ow">=</span> <span class="n">logTestFixture</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span> <span class="n">contentInst</span>
    <span class="n">calls</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"olleh"</span><span class="p">]</span>

  <span class="n">it</span> <span class="s">"operates on the file at the provided path"</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="kr">let</span> <span class="n">pathInst</span> <span class="ow">=</span> <span class="n">def</span>
          <span class="p">{</span> <span class="n">_readFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="n">path</span> <span class="ow">-></span> <span class="n">log</span> <span class="n">path</span> <span class="o">>></span> <span class="n">return</span> <span class="s">""</span>
          <span class="p">,</span> <span class="n">_writeFile</span> <span class="ow">=</span> <span class="nf">\</span><span class="n">path</span> <span class="kr">_</span> <span class="ow">-></span> <span class="n">log</span> <span class="n">path</span>
          <span class="p">}</span>
    <span class="kr">let</span> <span class="n">paths</span> <span class="ow">=</span> <span class="n">logTestFixture</span> <span class="p">(</span><span class="n">reverseFile</span> <span class="s">"foo.txt"</span><span class="p">)</span> <span class="n">pathInst</span>
    <span class="n">paths</span> <span class="p">`</span><span class="n">shouldBe</span><span class="p">`</span> <span class="p">[</span><span class="s">"foo.txt"</span><span class="p">,</span> <span class="s">"foo.txt"</span><span class="p">]</span></code></pre><p><strong>That’s it.</strong> The above code automatically generates everything you need to write fast, simple, deterministic unit tests in Haskell. The <code>mkFixture</code> function is a Template Haskell macro that expands into a definition quite similar to the <code>FixtureInst</code> type we wrote by hand, but since it’s automatically generated from the typeclass definitions, it never needs to be updated.</p><p>The <code>logTestFixture</code> function replaces the <code>logTestM</code> function we wrote by hand, and it works the same way, except that, as in the tests above, it takes the fixture as its second argument rather than its first. The <code>Control.Monad.TestFixture</code> module also exports a <code>log</code> function that is a synonym for <code>tell . singleton</code>, but using <code>tell</code> directly still works if you prefer.</p><p>The <code>mkFixture</code> function also generates a <code>Default</code> instance, which replaces the <code>baseInst</code> value defined earlier.
It functions the same way, though, producing useful error messages that name any typeclass methods that have not been stubbed out.</p><p>This blog post is not a <code>test-fixture</code> tutorial (indeed, it is much more complicated than a <code>test-fixture</code> tutorial would be, since it covers what the library is really doing under the hood), but if you’re interested, I would highly recommend you take a look at the <a href="http://hackage.haskell.org/package/test-fixture"><code>test-fixture</code> documentation on Hackage</a>.</p><h2><a name="conclusion-credits-and-similar-techniques"></a>Conclusion, credits, and similar techniques</h2><p>This blog post came about as the result of a need my coworkers and I found when writing Haskell code; we wanted a way to write unit tests quickly and easily, but we didn’t find much advice from the rest of the Haskell ecosystem. The <code>test-fixture</code> library is the result of that exploratory work, and we currently use it to test a significant portion of our Haskell code.</p><p>It would be extremely unfair to suggest that I was the inventor of this technique or the inventor of the library. Two of my coworkers, <a href="https://github.com/jxv">Joe Vargas</a> and <a href="https://github.com/aztecrex">Greg Wiley</a>, came up with the general approach and wrote <code>Control.Monad.TestFixture</code>, and I simply wrote the Template Haskell macro to eliminate the boilerplate. With that in mind, I think I can say with some fairness that this technique is a joy to use when unit testing is a desirable goal, and I would definitely recommend it if you are interested in doing isolated testing in Haskell.</p><p>The general technique of using typeclasses to emulate effects was in part inspired by the well-known <code>mtl</code> library.
An alternate approach to writing unit-testable Haskell code is using free monads, but overall, I prefer this approach over free monads because the typeclass constraints add type safety in ways that free monads do not (at least not without additional boilerplate), and this approach also lends itself well to static analysis-based boilerplate reduction techniques. It has its own tradeoffs, though, so if you’ve had success with free monads, then I certainly make no claim this is a superior approach, just one that I’ve personally found pleasant.</p><p>As a final note, if you <em>do</em> check out <code>test-fixture</code>, feel free to leave feedback by opening issues on <a href="https://github.com/cjdev/test-fixture/issues">the GitHub issue tracker</a>—even things like confusing documentation are worth a bug report.</p><ol class="footnotes"></ol></article>Understanding the npm dependency model2016-08-24T00:00:00Z2016-08-24T00:00:00ZAlexis King<article><p>Currently, <a href="https://www.npmjs.com">npm</a> is <em>the</em> package manager for the frontend world. Sure, there are alternatives, but for the time being, npm seems to have won. Even tools like <a href="https://bower.io">Bower</a> are being pushed to the wayside in favor of the One True Package Manager, but what’s most interesting to me is npm’s relatively novel approach to dependency management. Unfortunately, in my experience, it is actually not particularly well understood, so consider this an attempt to clarify how exactly it works and how it affects <strong>you</strong> as a user or package developer.</p><h2><a name="first-the-basics"></a>First, the basics</h2><p>At a high level, npm is not too dissimilar from other package managers for programming languages: packages depend on other packages, and they express those dependencies with <em>version ranges</em>. 
npm happens to use the <a href="http://semver.org">semver</a> versioning scheme to express those ranges, but the way it performs version resolution is mostly immaterial; what matters is that packages can depend on ranges rather than specific versions of packages.</p><p>This is rather important in any ecosystem, since locking a library to a specific set of dependencies could cause significant problems, but it’s actually much less of a problem in npm’s case compared to other, similar package systems. Indeed, it is often safe for a library author to pin a dependency to a specific version without affecting dependent packages or applications. The tricky bit is determining <em>when</em> this is safe and when it’s not, and this is what I so frequently find that people get wrong.</p><h2><a name="dependency-duplication-and-the-dependency-tree"></a>Dependency duplication and the dependency tree</h2><p>Most users of npm (or at least most package authors) eventually learn that, unlike other package managers, npm installs a <em>tree</em> of dependencies. That is, every package installed gets its own set of dependencies rather than forcing every package to share the same canonical set of packages. Obviously, virtually every single package manager in existence has to model a dependency tree at some point, since that’s how dependencies are expressed by programmers.</p><p>For example, consider two packages, <code>foo</code> and <code>bar</code>. Each of them has its own set of dependencies, which can be represented as a tree:</p><pre><code>foo
├── hello ^0.1.2
└── world ^1.0.7
bar
├── hello ^0.2.8
└── goodbye ^3.4.0
</code></pre><p>Imagine an application that depends on <em>both</em> <code>foo</code> and <code>bar</code>. Obviously, the <code>world</code> and <code>goodbye</code> dependencies are totally unrelated, so how npm handles them is relatively uninteresting. However, consider the case of <code>hello</code>: both packages require conflicting versions.</p><p>Most package managers (including RubyGems/Bundler, pip, and Cabal) would simply barf here, reporting a version conflict. This is because, in most package management models, <strong>only one version of any particular package can be installed at a time</strong>. In that sense, one of the package manager’s primary responsibilities is to figure out a set of package versions that will satisfy every version constraint simultaneously.</p><p>In contrast, npm has a somewhat easier job: it’s totally okay with installing different versions of the same package because each package gets its own set of dependencies. In the aforementioned example, the resulting directory structure would look something like this:</p><pre><code>node_modules/
├── foo/
│ └── node_modules/
│ ├── hello/
│ └── world/
└── bar/
└── node_modules/
├── hello/
└── goodbye/
</code></pre><p>Notably, the directory structure very closely mirrors the actual dependency tree. The above diagram is something of a simplification: in practice, each transitive dependency would have its own <code>node_modules</code> directory and so on, but the directory structure can get pretty messy pretty quickly. (Furthermore, npm 3 performs some optimizations to attempt to share dependencies when it can, but those are ultimately unnecessary to actually understanding the model.)</p><p>This model is, of course, extremely simple. The obvious effect is that every package gets its own little sandbox, which works absolutely marvelously for utility libraries like <code>ramda</code>, <code>lodash</code>, or <code>underscore</code>. If <code>foo</code> depends on <code>ramda@^0.19.0</code> but <code>bar</code> depends on <code>ramda@^0.22.0</code>, they can both coexist completely peacefully without any problems.</p><p>At first blush, this system is <em>obviously</em> better than the alternative, flat model, so long as the underlying runtime supports the required module loading scheme. However, it is not without drawbacks.</p><p>The most apparent downside is a significant increase in code size, given the potential for many, many copies of the same package, all with different versions. An increase in code size can often mean more than just a larger program—it can have a significant impact on performance. Larger programs just don’t fit into CPU caches as easily, and merely having to page a program in and out can significantly slow things down. 
That’s mostly just a tradeoff, though, since you’re sacrificing performance, not program correctness.</p><p>The more insidious problem (and the one that I see crop up quite a lot in the npm ecosystem without much thought) is how dependency isolation can affect cross-package communication.</p><h2><a name="dependency-isolation-and-values-that-pass-package-boundaries"></a>Dependency isolation and values that pass package boundaries</h2><p>The earlier example of using <code>ramda</code> is a place where npm’s default dependency management scheme really shines, given that Ramda just provides a bunch of plain ol’ functions. Passing these around is totally harmless. In fact, mixing functions from two different versions of Ramda would be totally okay! Unfortunately, not all cases are nearly that simple.</p><p>Consider, for a moment, <code>react</code>. React components are very much <em>not</em> plain old data; they are complex values that can be extended, instantiated, and rendered in a variety of ways. React represents component structure and state using an internal, private format, using a mixture of carefully arranged keys and values and some of the more powerful features of JavaScript’s object system. This internal structure might very well change between React versions, so a React component defined with <code>react@0.3.0</code> likely won’t work quite right with <code>react@15.3.1</code>.</p><p>With that in mind, consider two packages that define their own React components and export them for consumers to use. Looking at their dependency tree, we might see something like this:</p><pre><code>awesome-button
└── react ^0.3.0
amazing-modal
└── react ^15.3.1
</code></pre><p>Given that these two packages use wildly different versions of React, npm would give each of them its own copy of React, as requested, and the packages would happily install. However, if you tried to use these components together, they wouldn’t work at all! A newer version of React simply cannot understand an old version’s component, so you would get a (likely confusing) runtime error.</p><p>What went wrong? Well, dependency isolation works great when a package’s dependencies are purely implementation details, never observable from outside of a package. However, as soon as a package’s dependency becomes exposed as part of its <em>interface</em>, dependency isolation is not only subtly wrong, it can cause complete failure at runtime. These are cases where traditional package managers are much better—they will tell you as soon as you attempt to install two packages that they just don’t work together, rather than waiting for you to figure that out for yourself.</p><p>This might not sound <em>too</em> bad—after all, JavaScript is a very dynamic language, so static guarantees are mostly few and far between, and your tests should catch these problems should they arise—but it can cause unnecessary issues when two packages <em>can</em> theoretically work together fine, but because npm assigned each one its own copy of a particular package (that is, it wasn’t quite smart enough to figure out it could give them both the same copy), things break down.</p><p>Looking outside of npm specifically and considering this model when applied to other languages, it becomes increasingly clear that this won’t do. 
This blog post was inspired by <a href="https://www.reddit.com/r/haskell/comments/4zc6y3/why_doesnt_cabal_use_a_model_like_that_of_npm/?ref=share&ref_source=link">a Reddit thread discussing the npm model applied to Haskell</a>, and this flaw was touted as a reason why it couldn’t possibly work for such a static language.</p><p>Due to the way the JavaScript ecosystem has evolved, it’s true that most people can often get away with this subtle potential for incorrect behavior without any problems. Specifically, JavaScript tends to rely on duck typing rather than more restrictive checks like <code>instanceof</code>, so objects that satisfy the same protocol will still be compatible, even if their implementations aren’t <em>quite</em> the same. However, npm actually provides a robust solution to this problem that allows package authors to explicitly express these “cross-interface” dependencies.</p><h3><a name="peer-dependencies"></a>Peer dependencies</h3><p>Normally, npm package dependencies are listed under a <code>"dependencies"</code> key in the package’s <code>package.json</code> file. There is, however, another, less-used key called <code>"peerDependencies"</code>, which has the same format as the ordinary dependencies list. The difference shows up in how npm performs dependency resolution: rather than getting its own copy of a peer dependency, a package expects that dependency to be provided by its dependent.</p><p>This means that peer dependencies are effectively resolved using the “traditional” dependency resolution mechanism that tools like Bundler and Cabal use: there must be one canonical version that satisfies everyone’s constraint. Since npm 3, things are a little bit less straightforward (specifically, peer dependencies are not automatically installed unless a dependent package explicitly depends on the peer package itself), but the basic idea is the same. 
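</p><p>As a rough sketch of what the two keys look like side by side, a <code>package.json</code> for the hypothetical <code>amazing-modal</code> package from earlier might read as follows. The version numbers and the choice of <code>ramda</code> as an ordinary dependency are assumptions made for illustration, not taken from any real package:</p><pre><code>{
  "name": "amazing-modal",
  "version": "1.0.0",
  "dependencies": {
    "ramda": "^0.22.0"
  },
  "peerDependencies": {
    "react": "^15.3.1"
  }
}
</code></pre><p>Under a manifest like this, <code>ramda</code> is installed privately for <code>amazing-modal</code>, while <code>react</code> is expected to come from whichever application or package depends on <code>amazing-modal</code>. 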
This means that package authors must make a choice for each dependency they install: should it be a normal dependency or a peer dependency?</p><p>This is where I think people tend to get a little lost, even those familiar with the peer dependency mechanism. Fortunately, the answer is relatively simple: is the dependency in question visible in <em>any place</em> in the package’s interface?</p><p>This is sometimes hard to see in JavaScript because the “types” are invisible; that is, they are dynamic and rarely explicitly written out. However, just because the types are dynamic does not mean they are not there at runtime (and in the heads of various programmers), so the rule still holds: if the type of a function in a package’s public interface somehow depends on a dependency, it should be a peer dependency.</p><p>To make this a little more concrete, let’s look at a couple of examples. First off, let’s take a look at some simple cases, starting with some uses of <code>ramda</code>:</p><pre><code class="pygments"><span class="kr">import</span> <span class="p">{</span> <span class="nx">merge</span><span class="p">,</span> <span class="nx">add</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">'ramda'</span>
<span class="kr">export</span> <span class="kr">const</span> <span class="nx">withDefaultConfig</span> <span class="o">=</span> <span class="p">(</span><span class="nx">config</span><span class="p">)</span> <span class="p">=></span>
<span class="nx">merge</span><span class="p">({</span> <span class="nx">path</span><span class="o">:</span> <span class="s1">'.'</span> <span class="p">},</span> <span class="nx">config</span><span class="p">)</span>
<span class="kr">export</span> <span class="kr">const</span> <span class="nx">add5</span> <span class="o">=</span> <span class="nx">add</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span></code></pre><p>The first example here is pretty obvious: in <code>withDefaultConfig</code>, <code>merge</code> is used purely as an implementation detail, so it’s safe, and it’s not part of the module’s interface. In <code>add5</code>, the example is a little trickier: the result of <code>add(5)</code> is a partially-applied function created by Ramda, so technically, a Ramda-created value is a part of this module’s interface. However, the contract <code>add5</code> has with the outside world is simply that it is a JavaScript function that adds five to its argument, and it doesn’t depend on any Ramda-specific functionality, so <code>ramda</code> can safely be a non-peer dependency.</p><p>Now let’s look at another example using the <code>jpeg</code> image library:</p><pre><code class="pygments"><span class="kr">import</span> <span class="p">{</span> <span class="nx">Jpeg</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">'jpeg'</span>
<span class="kr">export</span> <span class="kr">const</span> <span class="nx">createSquareBuffer</span> <span class="o">=</span> <span class="p">(</span><span class="nx">size</span><span class="p">,</span> <span class="nx">cb</span><span class="p">)</span> <span class="p">=></span>
<span class="nx">createSquareJpeg</span><span class="p">(</span><span class="nx">size</span><span class="p">).</span><span class="nx">encode</span><span class="p">(</span><span class="nx">cb</span><span class="p">)</span>
<span class="kr">export</span> <span class="kr">const</span> <span class="nx">createSquareJpeg</span> <span class="o">=</span> <span class="p">(</span><span class="nx">size</span><span class="p">)</span> <span class="p">=></span>
<span class="k">new</span> <span class="nx">Jpeg</span><span class="p">(</span><span class="nx">Buffer</span><span class="p">.</span><span class="nx">alloc</span><span class="p">(</span><span class="nx">size</span> <span class="o">*</span> <span class="nx">size</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span> <span class="nx">size</span><span class="p">,</span> <span class="nx">size</span><span class="p">)</span></code></pre><p>In this case, the <code>createSquareBuffer</code> function invokes a callback with an ordinary Node.js <code>Buffer</code> object, so the <code>jpeg</code> library is an implementation detail. If that were the only function exposed by this module, <code>jpeg</code> could safely be a non-peer dependency. However, the <code>createSquareJpeg</code> function violates that rule: it returns a <code>Jpeg</code> object, which is an opaque value with a structure defined exclusively by the <code>jpeg</code> library. Therefore, a package with the above module <em>must</em> list <code>jpeg</code> as a peer dependency.</p><p>This sort of restriction works in reverse, too. For example, consider the following module:</p><pre><code class="pygments"><span class="kr">import</span> <span class="p">{</span> <span class="nx">writeFile</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">'fs'</span>
<span class="kr">export</span> <span class="kr">const</span> <span class="nx">writeJpeg</span> <span class="o">=</span> <span class="p">(</span><span class="nx">filename</span><span class="p">,</span> <span class="nx">jpeg</span><span class="p">,</span> <span class="nx">cb</span><span class="p">)</span> <span class="p">=></span>
<span class="nx">jpeg</span><span class="p">.</span><span class="nx">encode</span><span class="p">((</span><span class="nx">image</span><span class="p">)</span> <span class="p">=></span> <span class="nx">writeFile</span><span class="p">(</span><span class="nx">filename</span><span class="p">,</span> <span class="nx">image</span><span class="p">,</span> <span class="nx">cb</span><span class="p">))</span></code></pre><p>The above module does not even <em>import</em> the <code>jpeg</code> package, yet it implicitly depends on the <code>encode</code> method of the <code>Jpeg</code> interface. Therefore, despite not even explicitly using it anywhere in the code, a package containing the above module should include <code>jpeg</code> as a peer dependency.</p><p>The key is to carefully consider what contract your modules have with their dependents. If those contracts involve other packages in any way, those packages should be peer dependencies. If they don’t, they should be ordinary dependencies.</p><h2><a name="applying-the-npm-model-to-other-programming-languages"></a>Applying the npm model to other programming languages</h2><p>The npm model of package management is more complicated than that of other languages, but it provides a real advantage: implementation details are kept as implementation details. In other systems, it’s quite possible to find yourself in “dependency hell”, when you personally know that the version conflict reported by your package manager is not a real problem, but because the package system must pick a single canonical version, there’s no way to make progress without adjusting code in your dependencies. This is extremely frustrating.</p><p>This sort of dependency isolation is not the most advanced form of package management in existence—indeed, far from it—but it’s definitely more powerful than most other mainstream systems out there. 
Of course, most other languages could not adopt the npm model simply by changing the package manager: having a global package namespace can prevent multiple versions of the same package being installed at a <em>runtime</em> level. The reason npm is able to do what it does is because Node itself supports it.</p><p>That said, the dichotomy between peer and non-peer dependencies is a little confusing, especially to people who aren’t package authors. Figuring out which packages need to go in which group is not always obvious or trivial. Fortunately, other languages might be able to help.</p><p>Returning to Haskell, its strong static type system would potentially allow this distinction to be detected entirely automatically, and Cabal could actually report an error when a package used in an exposed interface was not listed as a peer dependency (much like how it currently prevents importing a transitive dependency without explicitly depending on it). This would allow helper function packages to keep on being implementation details while still maintaining strong interface safety. This would likely take a lot of work to get just right—managing the global nature of typeclass instances would likely make this much more complicated than a naïve approach would accommodate—but it would add a nice layer of flexibility that does not currently exist.</p><p>From the perspective of JavaScript, npm has demonstrated that it can be a capable package manager, despite the monumental burden placed upon it by the ever-growing, ever-changing JS ecosystem. 
As a package author myself, I would implore other users to carefully consider the peer dependencies feature and work hard to encode their interfaces’ contracts using it—it’s a commonly misunderstood gem of the npm model, and I hope this blog post helped to shed at least a little more light upon it.</p><ol class="footnotes"></ol></article>Climbing the infinite ladder of abstraction2016-08-11T00:00:00Z2016-08-11T00:00:00ZAlexis King<article><p>I started programming in elementary school.</p><p>When I was young, I was fascinated by the idea of automation. I loathed doing the same repetitive task over and over again, and I always yearned for a way to <a href="https://xkcd.com/974/">solve the general problem</a>. When I learned about programming, I was immediately hooked: it was <em>so easy</em> to turn repetitive tasks into automated pipelines that would free me from ever having to do the same dull, frustrating exercise ever again.</p><p>Of course, one of the first things I found out once I’d started was that nothing is ever quite so simple. Before long, my solutions to eliminate repetition grew repetitive, and it became clear I spent a lot of time typing out the same things, over and over again, creating the very problem I had initially set out to destroy. It was through this that I grew interested in functions, classes, and other repetition-reducing aids, and soon enough, I discovered the wonderful world of <strong>abstraction</strong>.</p><h2><a name="the-brick-wall-of-inexpressiveness"></a>The brick wall of inexpressiveness</h2><p>When I started programming, I was mostly playing with ActionScript and Java, just tinkering with things and seeing what I could come up with. I had quite a lot of fun, and the joy of solving problems hooked me almost immediately, but I also ran into frustrations pretty quickly. 
Specifically, I started writing a lot of code that looked like this:</p><pre><code class="pygments"><span class="kd">public</span> <span class="n">String</span> <span class="nf">getName</span><span class="o">()</span> <span class="o">{</span>
<span class="k">return</span> <span class="k">this</span><span class="o">.</span><span class="na">name</span><span class="o">;</span>
<span class="o">}</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">setName</span><span class="o">(</span><span class="n">String</span> <span class="n">name</span><span class="o">)</span> <span class="o">{</span>
<span class="k">this</span><span class="o">.</span><span class="na">name</span> <span class="o">=</span> <span class="n">name</span><span class="o">;</span>
<span class="o">}</span></code></pre><p>This is a bit of a cheap example, given that Java getters and setters are something of a programming language punching bag at this point, but I really did write them, and I really did get frustrated by them! I learned object-oriented design patterns, and I pored over books, forum threads, blog posts, and Stack Overflow questions about how to structure code to prevent spaghetti, but no matter how hard I tried, I kept having to type things that looked suspiciously similar to each other.</p><p>It was really quite frustrating, because no matter how I approached the problem, I ended up with a boilerplate-heavy mess. The <em>whole reason</em> I got started programming was to avoid this sort of thing, so what could I do? Well, it became increasingly obvious to me that Java had to go, and I needed to try something else. I started learning two very different programming languages, JavaScript and Objective-C, and I liked them both, for different reasons.</p><p>When I learned JavaScript, I discovered the closure, the first-class function, and I was entranced by it. Through jQuery, I learned of its power to design APIs that could be fun to use, dropping the boring, “heavy” feeling that Java carried around everywhere. 
With Objective-C, on the other hand, I learned about the power of a more dynamic object system, something with interesting syntax and the ability to handle “message passing” at a far higher level than Java ever could.</p><p>Both of these languages were flawed, as all languages are, but they opened my mind to the idea that <em>programming languages</em> could drastically influence the way I thought about problem solving, and they set me on a quest to find the programming language that would eliminate boilerplate once and for all.</p><h2><a name="discovering-lisp"></a>Discovering Lisp</h2><p>Over the next few years, I grew to appreciate JavaScript’s small, simple core, despite rather disliking its object system and poor faculties for user-friendly data modeling. I pored over its history, and I found out that its design was heavily influenced by an obscure little language called Scheme, as well as an even more obscure language called Self, and a part of me started to wonder what it would be like to incorporate those languages’ ideas without some of the compromises JavaScript had made.</p><p>This idea lingered in the back of my head for a couple years, and while I tried to play with Scheme a couple times, it was simply too inaccessible for me. I was used to languages with powerful, easy to use IDEs, and when I found myself with nothing more than a command-line executable and rather scarce documentation, I was at a loss for how to begin. Even if I could do math in the REPL, where could I go from there? I’d started programming by building games, then websites. What could I possibly do with Scheme?</p><p>The language (or rather, its lack of an ecosystem) proved too intimidating for me at that young age, but the idea of Lisp’s homoiconicity stuck with me. Eventually, I started to design my very own programming language, a <a href="https://github.com/lexi-lambda/libsol">highly dynamic Lisp with a prototypal object system called Sol</a>. 
I worked on it for about a year, and when I was done with it, it had a not-too-shabby complement of features: it had lambdas, macros, a fully-featured object model, and a CommonJS-esque module system, complete with the ability to dynamically import arbitrary C extensions. It was by far the largest project I’d ever worked on, and when I was done, I was pretty pleased.</p><p>Unfortunately, it was also abysmally slow.</p><p>I turned to a local college to find some people who could give me feedback and maybe point me in the right direction, and someone told me about another obscure programming language called <a href="http://racket-lang.org">Racket</a>. At about the same time, someone pointed me to a totally different language called <a href="https://www.haskell.org">Haskell</a>. This was uncharted territory for me, and for a while, I didn’t really explore either of those languages further. Eventually, though, I dove into them in earnest, and what I found has dramatically altered my perspective on programming since then.</p><h2><a name="a-journey-into-complexity"></a>A journey into complexity</h2><p>Fast forward about three years, and today, I am employed writing Haskell, and I spend most of my free time writing Racket. These languages left a mark on me, and while I’ve learned <em>so much more</em> since then, I find myself continually bucking the mainstream and coming back to functional programming, hygienic macros, and possibly the most powerful type system in existence in a production-ready programming language.</p><p>I’ve also started realizing something else, though: <strong>the languages I’ve settled into are <em>really complicated</em>.</strong></p><p>When I started programming, I thought about things like numbers, text, and shapes on a screen. Before long, I learned about functions, then classes, then message-passing and lambdas. 
I dove into macros and typeclasses, and now I speak in functors and monads, sets of scopes and internal definition contexts, and parser combinators and domain specific languages.</p><p>Why?</p><p>Sometimes I talk to fellow programmers, and they are horrified by the types of terms I fling around. “Why would you ever need something called a ‘monad’?” they ask, completely perplexed. “Macros are confusing,” they argue. “Being explicit is better.”</p><p>Obviously, I disagree, but why? What have I given up? If my fellow programmers cannot understand what I’m writing, is it actually worth it?</p><p>I’ve searched for years to find a programming language that will eliminate boilerplate, that will allow me to express my ideas succinctly and cleanly, that will let me turn hard problems into trivial ones, and I’ve discovered two completely different approaches to tackling those issues. Racket has macros, and Haskell has its fancy type system. Both of these things are lightyears ahead of where I was nearly a decade ago, writing dozens of lines of repetitive Java that ultimately did very little, but I’m still dealing with the same problems.</p><p>Racket knows too little about my program—it can’t figure out what I mean based on the type of thing I’m operating on because it is (mostly) dynamically typed. I <em>still</em> have to clarify myself and write things that feel redundant because the computer isn’t smart enough to figure out the “obvious”. Similarly, Haskell is too limiting—the compiler cannot deduce constraints I can solve in my head in seconds, and its syntax is not extensible like Racket’s is. Every day, I peer into piles upon piles of monadic computation, and really, what have I gained?</p><h3><a name="improvement-but-never-mastery"></a>Improvement, but never mastery</h3><p>Like almost anything in life, programming is not really a perfectable art. 
There’s always some unlearned skill or undiscovered technique, and this potential for perpetual self-improvement is one of the things that I find so attractive about the field. That said, I think it is reasonable to say that certain languages have higher ceilings than others.</p><p>For example, I am pretty confident that I <em>get</em> JavaScript. The language has lots of nooks and crannies that I don’t completely understand, but I feel pretty confident that I understand its semantics well enough to be able to grasp any piece of JavaScript code without too much incredulity. Now, that’s not to say that JavaScript is a simplistic language—far from it—but most of the ways I improve my JavaScripting abilities involve learning new techniques <em>within</em> the language, not entirely new linguistic constructs.</p><p>On the other hand, languages like Haskell and Racket tend to blur the line. I feel like I have a good grasp of Haskell’s core, but do I have a good intuition for laziness? Do I completely grok type families? What about <code>TypeInType</code>? Ultimately, I have to come to the conclusion that I do not fully understand Haskell, much less a lot of the advanced category theory that composes some of its most powerful libraries. Racket manages to blur the line between language and library even further, and while I consider myself a decent Racketeer, I absolutely do <em>not</em> have a good grasp on all the intricacies of Racket’s macro system.</p><p>This is especially obvious to me at work, given that I write Haskell in a team setting. Just like back when I was writing Java, I end up with solutions that don’t satisfy me, and I reach for increasingly powerful constructs to help alleviate my qualms. 
Sometimes, I find myself cracking out <code>DataKinds</code>, and it might even help with my problem, but there’s a cost: my coworkers are sometimes confused.</p><p>Every time I climb to the next rung on the ladder of abstraction, those only a couple rungs below me (even if we’re all hundreds of rungs up!) find themselves perplexed. In the worst case, people may even blame their confusion on their own inadequacy or lack of skill. This is <em>terrible</em>, especially when I know that, by the time they’ve caught up, I’ll be off playing with some new toy: comonads or type families or classy lenses. The cycle continues, and nobody is ever truly satisfied—I always want to find a new abstraction that will make things simpler, and those just a couple steps behind me struggle to keep up.</p><p>Of course, I experience it from the opposite perspective just as often: I delve into Edward Kmett’s fancier libraries or Phil Freeman’s blog posts about category theory, and I recognize that I am rather lost. Sometimes, I find myself understanding things, but just as often, I cannot wrap my head around the concepts being discussed. I may figure them out eventually, sure, but by then everyone else has moved on to even <em>more</em> advanced things, and still, none of them truly solve my problems.</p><h2><a name="ultimately-it-all-has-at-least-a-little-value"></a>Ultimately, it all has (at least a little) value</h2><p>It would be nice to think about all that and say, well, “Let’s finally break the cycle. Let’s stop deluding ourselves into thinking our solutions to our self-made problems are actually solving anything.” It would be great if I could tell myself that, but I unfortunately really can’t.</p><p>The scariest part of all is that I think it’s completely worthwhile.</p><p>So many of these more and more complicated abstractions are trying to do the same basic thing: come up with a better way of modeling the problem. 
In some sense, that’s all programming really is, modeling a domain in a way that can be leveraged by a digital computer. Our increasingly complicated DSLs <em>seem</em> unnecessarily complicated, they <em>seem</em> increasingly removed from reality, but that’s only because we’re getting better at creating languages that are closer to our domains without the baggage of preconceptions that came before us.</p><p>The downside is that, without an understanding of those preconceptions, a lot of what we come up with seems like patent gibberish to those unaware of our languages’ history.</p><p>Most programmers, even those who have never seen BASIC before, can figure out what this snippet does:</p><pre><code class="pygments"><span class="nl">10</span><span class="w"> </span><span class="kr">INPUT</span><span class="w"> </span><span class="s2">"What is your name: "</span><span class="p">;</span><span class="w"> </span><span class="vg">U$</span>
<span class="nl">20</span><span class="w"> </span><span class="kr">PRINT</span><span class="w"> </span><span class="s2">"Hello "</span><span class="p">;</span><span class="w"> </span><span class="vg">U$</span></code></pre><p>On the other hand, very few would probably understand this one:</p><pre><code class="pygments"><span class="c1">-- | A class for categories.</span>
<span class="c1">-- id and (.) must form a monoid.</span>
<span class="kr">class</span> <span class="kt">Category</span> <span class="n">cat</span> <span class="kr">where</span>
  <span class="c1">-- | the identity morphism</span>
  <span class="n">id</span> <span class="ow">::</span> <span class="n">cat</span> <span class="n">a</span> <span class="n">a</span>
  <span class="c1">-- | morphism composition</span>
  <span class="p">(</span><span class="o">.</span><span class="p">)</span> <span class="ow">::</span> <span class="n">cat</span> <span class="n">b</span> <span class="n">c</span> <span class="ow">-></span> <span class="n">cat</span> <span class="n">a</span> <span class="n">b</span> <span class="ow">-></span> <span class="n">cat</span> <span class="n">a</span> <span class="n">c</span></code></pre><p>Yet very few new programs are being written in BASIC, and lots are being written in Haskell.</p><p>Even one of the most popular, fastest-growing programming languages in the world, JavaScript, a language considered relatively accessible compared to things like Haskell, would likely be incomprehensible to a programmer not familiar with its syntax:</p><pre><code class="pygments"><span class="kr">export</span> <span class="kr">const</span> <span class="nx">composeWithProps</span> <span class="o">=</span> <span class="nx">curry</span><span class="p">((</span><span class="nx">a</span><span class="p">,</span> <span class="nx">parentProps</span><span class="p">,</span> <span class="nx">b</span><span class="p">)</span> <span class="p">=></span> <span class="p">{</span>
  <span class="kr">const</span> <span class="nx">composed</span> <span class="o">=</span> <span class="nx">childProps</span> <span class="p">=></span>
    <span class="nx">createElement</span><span class="p">(</span><span class="nx">a</span><span class="p">,</span> <span class="nx">parentProps</span><span class="p">,</span> <span class="nx">createElement</span><span class="p">(</span><span class="nx">b</span><span class="p">,</span> <span class="nx">omit</span><span class="p">([</span><span class="s1">'children'</span><span class="p">],</span> <span class="nx">childProps</span><span class="p">),</span> <span class="nx">childProps</span><span class="p">.</span><span class="nx">children</span><span class="p">));</span>
  <span class="c1">// give the composed component a pretty display name for debugging</span>
  <span class="nx">composed</span><span class="p">.</span><span class="nx">displayName</span> <span class="o">=</span> <span class="sb">`Composed(</span><span class="si">${</span><span class="nx">getDisplayName</span><span class="p">(</span><span class="nx">a</span><span class="p">)</span><span class="si">}</span><span class="sb">, </span><span class="si">${</span><span class="nx">getDisplayName</span><span class="p">(</span><span class="nx">b</span><span class="p">)</span><span class="si">}</span><span class="sb">)`</span><span class="p">;</span>
  <span class="k">return</span> <span class="nx">composed</span><span class="p">;</span>
<span class="p">});</span></code></pre><p>Moving towards increasingly specialized syntaxes is not inherently bad—it can often be indicative of a more streamlined, domain-specific way of thinking—but while it may dramatically increase the productivity of a seasoned programmer, it can be nothing short of baffling to a newcomer.</p><p>That, specifically, is the crux of my fear: are we always aware of who we are optimizing for? I do not have a moral problem with writing code to optimize concision for seasoned programmers; after all, brevity is one of the primary ways code is made more readable (verbosity is the enemy of understanding). However, when that concision comes at the cost of beginners’ understanding, the picture becomes a bit more grey. It is not wrong to write things that are highly optimized for one’s own knowledge and understanding, and establishing a group of such people can make for an <em>extremely</em> productive team. It’s just also important to understand that others will likely be confused, and without being willing to invest the time and money into education, smart, diligent people will still fail to grasp the concepts, and they will likely be wholly uninterested in them.</p><h3><a name="reactionary-anti-intellectualism-and-the-search-for-moderation"></a>Reactionary anti-intellectualism and the search for moderation</h3><p>I have noticed lately that people close to my circles have started regularly slinging insults at people who work in highly specialized notation. Math, including things like category and type theory, has become an especially acceptable punching bag. <a href="https://twitter.com/lexi_lambda/status/763111451691134976">I recently tweeted a picture of some rather dense mathematics from a paper I’d read</a>, and I was frankly disturbed at some of the vitriolic responses. 
Academia is sometimes described as “masturbatory”, and honestly, that is both offensive and hypocritical.</p><p>Mathematical notation is not perfect, no more than dense Haskell, heavily metaprogrammed Ruby, or IIFE-packed JavaScript. Still, it serves a purpose, and sometimes spelling things out is neither practically feasible nor a theoretical improvement. Programmers would not take kindly to being asked to write all their code out as prose, nor would they like being told that using higher-order functions like <code>map</code> should be banned because they are too confusing and not immediately self-explanatory.</p><p>I am glad that people are focusing on usability and accessibility more than ever, and I think that’s one of the areas I’m the most interested in. I want to get the best of both worlds: I aim to write code in a highly concise, precise style, but I try and produce intuitive interfaces with human-readable errors upon failure. To me, a user-hostile yet technically functional library is a buggy one, and I would happily file a bug report about a confusing API or error message.</p><p>Abstraction is what seems to make programming possible, and indeed, it’s what makes most modern <em>technology</em> possible. It’s what allows people to drive a car without knowing how an internal combustion engine works, and it’s what allows people to browse the web without having a deep understanding of internet protocol. In programming, abstraction serves a similar purpose. Of course, just like all tools, abstractions can have rather different goals: the average user will not pick up Photoshop in a day, but a power user is not going to be satisfied with Paint.</p><p>Programmers are professionals, and we work in a technical domain. 
I am absolutely of the belief that programming, like any other field, is not always about what comes easiest: sometimes it’s important to sit down and study for a while to grok a particularly complicated concept, and other times, it’s simply important to learn by trying, failing, and asking questions. I strive to find that blend of accessible, concise, and robust, and just like everything else, that target shifts depending on the situation and people I’m working with.</p><p>I honestly don’t know if Racket and Haskell are worth their costs in complexity. At the end of the day, maybe what really matters is writing simple, consistent things that other people can understand. I really hope that there is a place for more powerful languages within a team, but there’s something to be said about which languages tend to get the most popular.</p><p>Ultimately, though, I am just trying to be aware of the tradeoffs I’m making, the benefits I’m getting, and the impact on those I’m working with. I will continue to search for abstractions that can better fit my needs, and I am sure I will keep on climbing the ladder of abstraction for years to come—I just really hope I’m not wasting my time.</p><ol class="footnotes"></ol></article>Four months with Haskell2016-06-12T00:00:00Z2016-06-12T00:00:00ZAlexis King<article><p>At the end of January of this year, I switched to a new job, almost exclusively because I was enticed by the idea of being able to write Haskell. The concept of using such an interesting programming language every day instead of what I’d been doing before (mostly Rails and JavaScript) was very exciting, and I’m pleased to say that the switch seems to have been well worth it.</p><p>Haskell was a language I had played with in the past but never really used for anything terribly practical, but lately I think I can confidently say that it really is an <em>incredible</em> programming language. 
At the same time, it has some significant drawbacks, too, though probably not the ones people expect. I certainly wasn’t prepared for some of the areas where Haskell would blow me away, nor was I capable of realizing which parts would leave me hopelessly frustrated until I actually sat down and started writing lots and lots of code.</p><h2><a name="dispelling-some-myths"></a>Dispelling some myths</h2><p>Before moving on and discussing my experiences in depth, I want to take a quick detour to dispel some frequent rumors I hear about why Haskell is at least potentially problematic. These are things I hear a <em>lot</em>, and nothing in my experience so far would lead me to believe these are actually true. Ultimately, I don’t want to spend too much time on these—I think that, for the most part, they are nitpicks that people complain about to avoid understanding the deeper and more insidious problems with the language—but I think it’s important to at least mention them.</p><h3><a name="hiring-haskell-developers-is-not-hard"></a>Hiring Haskell developers is not hard</h3><p>I am on the first Haskell team in my company, and I am among the first Haskell developers we ever hired. Not only were we hiring without much experience with Haskell at all, we explicitly <em>did not</em> want to hire remote. Debate all you like about whether or not permitting remote work is a good idea, but I don’t think anyone would dispute that this constraint makes hiring much harder. We didn’t have any trouble finding a very large stream of qualified applicants, and it definitely seems to have dispelled any fears that we would have trouble finding new candidates in the future.</p><h3><a name="performing-i-o-in-haskell-is-easy"></a>Performing I/O in Haskell is easy</h3><p>Haskell’s purity is a point of real contention, and it’s one of the most frustrating complaints I often hear about Haskell. 
It is surprisingly common to hear concerns along the lines of “I don’t want to use Haskell because its academic devotion to purity sounds like it would make it very hard to get anything done”. There are very valid reasons to avoid Haskell, but in practice, I/O is not one of them. In fact, I found that isolating I/O in Haskell was much the same as isolating I/O in every other language, which I need to do anyway to permit unit testing.</p><p>...you <em>do</em> write deterministic unit tests for your impure logic, right?</p><h3><a name="working-with-lots-of-monads-is-not-very-difficult"></a>Working with lots of monads is not very difficult</h3><p>The “M word” has ended up being a running joke <em>about</em> Haskell that actually ends up coming up fairly rarely <em>within</em> the Haskell community. To be clear, there is <em>no doubt</em> in my mind that monads make Haskell intimidating and provide a steep learning curve for new users. The proliferation of the joke that monads are impossible to explain, to the point of becoming mythologized, is absolutely indicative of a deeper problem about Haskell’s accessibility. However, once people learn the basics about monads, I’ve found that applying them is just as natural as applying any other programming pattern.</p><p>Monads are used to assist the programmer, not impede them, and they really do pay off in practice. When something has a monadic interface, there’s a decent chance I already know what that interface is going to do, and that makes working with lots of different monads surprisingly easy. Admittedly, I do rely very, very heavily on tooling to help me out here, but with things like mouseover type tooltips, I’ve actually found that working with a variety of different monads and monad transformers is actually quite pleasant, and it makes things very readable!</p><h2><a name="haskell-the-good-parts"></a>Haskell: the good parts</h2><p>With the disclaimers out of the way, I really just want to gush for a little bit. 
This is not going to be an objective, reasoned survey of why Haskell is good. I am not even really going to touch upon why types are so great and why purity is so wonderful—I’d love to discuss those in depth, but that’s for a different blog post. For now, I just want to touch upon the real surprises, the real things that made me <em>excited</em> about Haskell in ways I didn’t expect. These are the things that my subjective little experience has found fun.</p><h3><a name="language-extensions-are-haskell"></a>Language extensions <em>are</em> Haskell</h3><p>There was a time in my life when I spent a lot of time writing C. There are a lot of compilers for C, and they all implement the language in subtly different but often incompatible ways, especially on different platforms. The only way to maintain a modicum of predictability was to adhere to the standards <em>religiously</em>, even when certain GCC or MSVC extensions seem tantalizingly useful. I was actually bitten a few times by real instances where I figured I’d just use a harmless extension that was implemented everywhere, then found out it worked slightly differently across different compilers in a particular edge case. It was a learning experience.</p><p>It seems that this fear provides a very real distrust for using GHC’s numerous <em>language extensions</em>, and indeed, for a long time, I felt that it was probably an admirable goal to stick to Haskell 98 or Haskell 2010 as closely as possible. Sometimes I chose a slightly more verbose solution that was standard Haskell to avoid turning on a trivial extension that would make the code look a little bit cleaner.</p><p>About a year later, I’m finding that attitude was not only a mistake, but it forced me to often completely miss out on a lot of Haskell’s core value. GHC <em>won</em>, and now GHC and Haskell are basically synonymous. 
With that in mind, the portability concerns of language extensions are a bit of a non-issue, and turning them on is a very good idea! Some extensions are more than a little dangerous, so they cannot all be turned on without thinking, but the question is absolutely not “Is using language extensions a good idea?” and more “Is using <em>this</em> language extension a good idea?”</p><p>This is important, and I bring it up for a reason: so much of the awesomeness of Haskell is locked behind language extensions. Turning a lot of these on is one of the main things that made me really start to see how incredibly powerful Haskell actually is.</p><h3><a name="phantom-types"></a>Phantom types</h3><p>I’m going to start out by talking about <em>phantom types</em>, which are a pretty simple concept but a powerful one, and they serve as the foundation for a lot of other cool type-level tricks that can make Haskell extremely interesting. The basic idea of a phantom type is simple; it’s a type parameter that isn’t actually used to represent any particular runtime value:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">Id</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">Id</span> <span class="kt">Text</span></code></pre><p>This type represents an id for some kind of value, but although the kind of value is specified in the type as the <code>a</code> type parameter, it isn’t actually used anywhere on the data definition—no matter what <code>a</code> is, an <code>Id</code> is just a piece of text. 
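</p><p>As a minimal, self-contained sketch of that idea, consider the following program. The empty <code>User</code> and <code>Post</code> types and the <code>describeUser</code> function are hypothetical names chosen for illustration, and <code>String</code> stands in for <code>Text</code> so that nothing beyond <code>base</code> is assumed:</p><pre><code class="pygments">-- Hypothetical sketch: User, Post, and describeUser are illustrative names.
-- The parameter `a` is a phantom: it never appears on the right-hand side.
newtype Id a = Id String

-- Empty types used only as compile-time tags.
data User
data Post

-- Accepts only ids that are tagged as belonging to a User.
describeUser :: Id User -> String
describeUser (Id txt) = "user #" ++ txt

main :: IO ()
main = do
  let userId = Id "42" :: Id User
  putStrLn (describeUser userId)
  -- putStrLn (describeUser (Id "42" :: Id Post))  -- rejected by the type checker</code></pre><p>Uncommenting the last line should produce a type error rather than a runtime failure, even though both ids wrap the same string.</p><p>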
This makes it possible to write functions that operate on specific kinds of ids, and those invariants will be statically checked by the compiler, even though the runtime representation is entirely identical:</p><pre><code class="pygments"><span class="nf">fetchUser</span> <span class="ow">::</span> <span class="kt">MonadDB</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">User</span></code></pre><p>Using <code>FlexibleInstances</code>, it’s also possible to create different instances for different kinds of ids. For example, it would be possible to have different <code>Show</code> instances depending on the type of id in question.</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">Show</span> <span class="p">(</span><span class="kt">Id</span> <span class="kt">User</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">show</span> <span class="p">(</span><span class="kt">Id</span> <span class="n">txt</span><span class="p">)</span> <span class="ow">=</span> <span class="s">"user #"</span> <span class="o"><></span> <span class="n">unpack</span> <span class="n">txt</span>
<span class="kr">instance</span> <span class="kt">Show</span> <span class="p">(</span><span class="kt">Id</span> <span class="kt">Post</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">show</span> <span class="p">(</span><span class="kt">Id</span> <span class="n">txt</span><span class="p">)</span> <span class="ow">=</span> <span class="s">"post #"</span> <span class="o"><></span> <span class="n">unpack</span> <span class="n">txt</span></code></pre><p>This provides a simple framework for encoding entirely arbitrary information into the type system, then asking the compiler to actually check assertions about that information. This is made even more powerful with some other extensions, which I’ll talk about shortly.</p><h3><a name="letting-the-compiler-write-code"></a>Letting the compiler write code</h3><p>One of the things I really dislike, more than most things, is boilerplate. A little bit of boilerplate is fine—even necessary at times—but as soon as I start wondering if a code generator would improve things, I think the programming language has pretty much failed me.</p><p>I write a lot of Racket because, in a sense, Racket is the ultimate boilerplate killer: the macro system is a first-class code generator integrated with the rest of the language, and it means that boilerplate is almost never an issue. Of course, that’s not always true: sometimes a bit of boilerplate <em>is</em> still necessary because macros cannot deduce enough information about the program to generate the code entirely on their own, and in Haskell, some of that information is actually present in the type system.</p><p>This leads to two absolutely incredible extensions, both of which are simple and related, but which actually <em>completely change</em> how I approach problems when programming. These two extensions are <code>GeneralizedNewtypeDeriving</code> and <code>StandaloneDeriving</code>.</p><h4><a name="newtypes-and-type-safety"></a>Newtypes and type safety</h4><p>The basic idea is that “newtypes” are just simple wrapper types in Haskell. 
This turns out to be extremely important when trying to find the value of Haskell because they allow you to harden type safety by specializing types to <em>your</em> domain. For example, consider a type representing a user’s name:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">Name</span> <span class="ow">=</span> <span class="kt">Name</span> <span class="kt">Text</span></code></pre><p>This type is extremely simple, and in fact isn’t even at all different from a simple <code>Text</code> value with respect to its representation, since all combinations of unicode characters are allowed in a name. Therefore, what’s the point of a separate type? Well, this allows Haskell to introduce actual compilation failures when two different kinds of textual data are mixed. This is not a new idea, and even in languages that don’t support this sort of thing, Joel Spolsky’s old blog post <a href="http://www.joelonsoftware.com/articles/Wrong.html">Making Wrong Code Look Wrong</a> describes how it can be done by convention. Still, almost every modern language makes this possible: in C, it would be a single-member <code>struct</code>, in class-based OO languages, it would be a single-member class... this is not a complicated idea.</p><p>The difference lies in its usage. In other languages, this strategy is actually not very frequently employed for the simple reason that it is almost always extremely annoying. You are forced to do tons of wrapping/unwrapping, and at that point it isn’t really clear if you’re even getting all that much value out of the distinction when your first solution to a type mismatch is wrapping or unwrapping the value without a second thought. 
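</p><p>To make that friction concrete, here is a small sketch of the wrapper approach with everything written by hand. It is written in Haskell but deliberately avoids <code>deriving</code>, to mimic what the pattern costs in a language that cannot generate instances. The <code>Email</code> type and the <code>greet</code> function are hypothetical, and <code>String</code> stands in for <code>Text</code> so the example needs nothing beyond <code>base</code>:</p><pre><code class="pygments">-- Hypothetical sketch: Email and greet are illustrative, not from this post.
newtype Name = Name String
newtype Email = Email String

-- Without deriving, even equality has to unwrap both sides by hand.
instance Eq Name where
  Name a == Name b = a == b

-- Every consumer must unwrap the value before using it.
greet :: Name -> String
greet (Name n) = "Hello, " ++ n

main :: IO ()
main = do
  let name = Name "Alyssa P. Hacker"
  putStrLn (greet name)
  print (name == Name "Ben Bitdiddle")
  -- putStrLn (greet (Email "alyssa@example.com"))  -- rejected: Email is not Name</code></pre><p>The compile-time guarantee is real, but each capability the wrapper needs has to be spelled out one instance at a time.</p><p>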
In Haskell, however, this can be heavily mitigated by asking the compiler to <em>automatically derive typeclass implementations</em>, which allow the unwrapping/wrapping to effectively happen implicitly for a constrained set of operations.</p><h4><a name="using-generalizednewtypederiving"></a>Using <code>GeneralizedNewtypeDeriving</code></h4><p>Consider the <code>Name</code> type once again, but this time, let’s derive a class:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">Name</span> <span class="ow">=</span> <span class="kt">Name</span> <span class="kt">Text</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">IsString</span><span class="p">)</span></code></pre><p>The <code>IsString</code> typeclass in Haskell allows custom types to automatically be created from string literals. It is <em>not</em> handled specially by Haskell’s <code>deriving</code> mechanism. Since <code>Text</code> implements <code>IsString</code>, an instance will be generated that simply defers to the underlying type, automatically generating the code to wrap the result up in a <code>Name</code> box at the end. This means that code like this will now just magically work:</p><pre><code class="pygments"><span class="nf">name</span> <span class="ow">::</span> <span class="kt">Name</span>
<span class="nf">name</span> <span class="ow">=</span> <span class="s">"Alyssa P. Hacker"</span></code></pre><p>No boilerplate needs to be written! This is a neat trick, but it actually turns out to be far more useful than that simple example in practice. What really makes this functionality shine is when you want to derive <em>some</em> kinds of functionality but disallow some others. For example, using the <a href="https://hackage.haskell.org/package/text-conversions"><code>text-conversions</code></a> package, it’s possible to do something like this:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">Id</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">Id</span> <span class="kt">Text</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Eq</span><span class="p">,</span> <span class="kt">Show</span><span class="p">,</span> <span class="kt">ToText</span><span class="p">,</span> <span class="kt">ToJSON</span><span class="p">)</span></code></pre><p>This creates an opaque <code>Id</code> type, but it automatically generates conversions <em>to</em> textual formats. However, it does <em>not</em> automatically create <code>FromText</code> or <code>FromJSON</code> instances, which would be dangerous because decoding <code>Id</code>s can potentially fail. It’s then possible to write out those instances manually to preserve type safety:</p><pre><code class="pygments"><span class="kr">instance</span> <span class="kt">FromText</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="p">(</span><span class="kt">Id</span> <span class="n">a</span><span class="p">))</span> <span class="kr">where</span>
  <span class="n">fromText</span> <span class="n">str</span> <span class="ow">=</span> <span class="kr">if</span> <span class="n">isValidId</span> <span class="n">str</span> <span class="kr">then</span> <span class="kt">Just</span> <span class="p">(</span><span class="kt">Id</span> <span class="n">str</span><span class="p">)</span> <span class="kr">else</span> <span class="kt">Nothing</span>
<span class="kr">instance</span> <span class="kt">FromJSON</span> <span class="p">(</span><span class="kt">Id</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">parseJSON</span> <span class="p">(</span><span class="kt">String</span> <span class="n">val</span><span class="p">)</span> <span class="ow">=</span> <span class="n">maybe</span> <span class="p">(</span><span class="n">fail</span> <span class="s">"invalid id"</span><span class="p">)</span> <span class="n">return</span> <span class="p">(</span><span class="n">fromText</span> <span class="n">val</span><span class="p">)</span>
  <span class="n">parseJSON</span> <span class="kr">_</span> <span class="ow">=</span> <span class="n">fail</span> <span class="s">"invalid id"</span></code></pre><h4><a name="using-standalonederiving"></a>Using <code>StandaloneDeriving</code></h4><p>The ordinary <code>deriving</code> mechanism is extremely useful, especially when paired with the above, but sometimes it is desirable to have a little bit more flexibility. In these cases, <code>StandaloneDeriving</code> can help.</p><p>Take the <code>Id</code> example again: it has a phantom type, and simply adding something like <code>deriving (ToText)</code> will derive <code>ToText</code> instances for <em>all</em> kinds of ids. It is potentially useful, however, to derive instances for more specific id types. Using standalone <code>deriving</code> constructs permits this sort of flexibility.</p><pre><code class="pygments"><span class="kr">deriving</span> <span class="kr">instance</span> <span class="kt">ToText</span> <span class="p">(</span><span class="kt">Id</span> <span class="kt">User</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">ToText</span> <span class="p">(</span><span class="kt">Id</span> <span class="kt">Post</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">toText</span> <span class="ow">=</span> <span class="n">postIdToText</span></code></pre><p>This is an example where GHC language extensions end up becoming significantly more than the sum of their parts, which seems to be a fairly frequent realization. The <code>StandaloneDeriving</code> mechanism is a little bit useful without <code>GeneralizedNewtypeDeriving</code>, but when combined, they are incredibly powerful tools for getting a very fine-grained kind of type safety <em>without</em> writing any boilerplate.</p><h3><a name="datakinds-are-super-cool-with-caveats"></a>DataKinds are super cool, with caveats</h3><p>Phantom types are quite wonderful, but they can only encode <em>types</em>, not arbitrary data. That’s where <code>DataKinds</code> and <code>KindSignatures</code> come in: they allow lifting arbitrary datatypes to the type level so that things that would normally be purely runtime values can be used at compile time as well.</p><p>The way this works is pretty simple—when you define a datatype, you also define a “datakind”:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">RegistrationStatus</span> <span class="ow">=</span> <span class="kt">Registered</span> <span class="o">|</span> <span class="kt">Anonymous</span></code></pre><p>Normally, the above declaration declares a <em>type</em>, <code>RegistrationStatus</code>, and two <em>data constructors</em>, <code>Registered</code> and <code>Anonymous</code>. With <code>DataKinds</code>, it also defines a <em>kind</em>, <code>RegistrationStatus</code>, and two <em>type constructors</em>, <code>Registered</code> and <code>Anonymous</code>.</p><p>If that’s confusing, the way to understand that is to realize there is a sort of natural ordering here: types describe values, and kinds describe types. Therefore, turning on <code>DataKinds</code> “lifts” each definition by a single level, so types become kinds and values become types. 
This permits using the lifted constructors at the type level:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">UserId</span> <span class="p">(</span><span class="n">s</span> <span class="ow">::</span> <span class="kt">RegistrationStatus</span><span class="p">)</span> <span class="ow">=</span> <span class="kt">UserId</span> <span class="kt">Text</span></code></pre><p>In this example, <code>UserId</code> still has a single phantom type variable, <code>s</code>, but this time it is constrained to the <code>RegistrationStatus</code> kind. Therefore, it can <em>only</em> be <code>Registered</code> or <code>Anonymous</code>. This cooperates well with the aforementioned <code>StandaloneDeriving</code> mechanism, and it mostly provides a convenient way to constrain type variables to custom kinds.</p><p>In general, <code>DataKinds</code> is a much more powerful extension, allowing things like type-level natural numbers or strings, which can be used to perform actual type-level computation (especially in combination with <code>TypeFamilies</code>) or a sort of metaprogramming. In some cases, it can even be used to emulate some of what dependent types make possible.</p><p>I think <code>DataKinds</code> is a very cool Haskell extension, but there are a couple of caveats. One of the main ones is how new kinds are defined: <code>DataKinds</code> “hijacks” the existing datatype declaration syntax by making every single datatype declaration define a type <em>and</em> a kind. This is a little confusing, and it would be nice if a different syntax were used so that each could be defined independently.</p><p>Similarly, it seems that a lot of work is being done to allow using runtime values at the type level, but I wonder if people will ever need to use, say, runtime values at the <em>kind</em> level. 
This immediately evokes thoughts of Racket’s phase-based macro system, and I wonder if some of this duplication would be unnecessary with something similar.</p><p>Food for thought, but overall, <code>DataKinds</code> are a very nice addition to help with precisely and specifically typing particular problems.</p><h3><a name="typeclasses-can-emulate-effects"></a>Typeclasses can emulate effects</h3><p>This is something that I’ve found interesting in my time writing Haskell because I have <em>no idea</em> if it’s idiomatic or not, but it seems pretty powerful. The initial motivator for this idea was figuring out how to test our code without constantly dropping into <code>IO</code>.</p><p>More generally, we wanted to be able to unit test by “mocking” out collaborators, as it would be described in object oriented programming. I was always semi-distrustful of mocking, and indeed, it seems likely that it is heavily abused in certain circles, but I’ve come to appreciate the need that sometimes it is important to stub things out, <em>even in pure code</em>.</p><p>As an example, consider some code that needs access to the current time. This is something that would normally require <code>IO</code>, but we likely want to be able to use the value in a pure context without “infecting” the entire program with <code>IO</code> types. In Haskell, I have generally seen three ways of handling this sort of thing:</p><ol><li><p>Just inject the required values into the function and produce them “higher up” where I/O is okay. 
If threading the value around becomes too burdensome, use a Reader monad.</p></li><li><p>Use a free monad or similar to create a pure DSL of sorts, then write interpreters for various implementations, one of which uses <code>IO</code>.</p></li><li><p>Create custom monadic typeclasses that provide interfaces to the functionality you want to perform, then create instances, one of which is an instance over <code>IO</code>.</p></li></ol><p>This last approach seems to be less common in Haskell, but it’s the approach we took, and it seems to work out remarkably well. Returning to the need to get the current time, we could pretty easily write such a typeclass to encode that need:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">CurrentTime</span> <span class="n">m</span> <span class="kr">where</span>
<span class="n">getCurrentTime</span> <span class="ow">::</span> <span class="n">m</span> <span class="kt">UTCTime</span></code></pre><p>Now we can write functions that use the current time:</p><pre><code class="pygments"><span class="nf">validateToken</span> <span class="ow">::</span> <span class="kt">CurrentTime</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">Token</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">Bool</span>
<span class="nf">validateToken</span> <span class="n">tok</span> <span class="ow">=</span> <span class="kr">do</span>
<span class="n">currentTime</span> <span class="ow"><-</span> <span class="n">getCurrentTime</span>
<span class="n">return</span> <span class="p">(</span><span class="n">tokenExpirationDate</span> <span class="n">tok</span> <span class="o">></span> <span class="n">currentTime</span><span class="p">)</span></code></pre><p>Now, we can write instances for <code>CurrentTime</code> that will allow us to run the same code in different contexts:</p><pre><code class="pygments"><span class="kr">newtype</span> <span class="kt">AppM</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">AppM</span> <span class="p">{</span> <span class="n">runAppM</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="n">a</span> <span class="p">}</span>
<span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">,</span> <span class="kt">MonadIO</span><span class="p">)</span>
<span class="kr">newtype</span> <span class="kt">TestM</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">TestM</span> <span class="p">(</span><span class="kt">Identity</span> <span class="n">a</span><span class="p">)</span>
<span class="kr">deriving</span> <span class="p">(</span><span class="kt">Functor</span><span class="p">,</span> <span class="kt">Applicative</span><span class="p">,</span> <span class="kt">Monad</span><span class="p">)</span>
<span class="nf">runTestM</span> <span class="ow">::</span> <span class="kt">TestM</span> <span class="n">a</span> <span class="ow">-></span> <span class="n">a</span>
<span class="nf">runTestM</span> <span class="p">(</span><span class="kt">TestM</span> <span class="n">x</span><span class="p">)</span> <span class="ow">=</span> <span class="n">runIdentity</span> <span class="n">x</span>
<span class="kr">instance</span> <span class="kt">CurrentTime</span> <span class="kt">AppM</span> <span class="kr">where</span>
<span class="n">getCurrentTime</span> <span class="ow">=</span> <span class="n">liftIO</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Time</span><span class="o">.</span><span class="kt">Clock</span><span class="o">.</span><span class="n">getCurrentTime</span>
<span class="kr">instance</span> <span class="kt">CurrentTime</span> <span class="kt">TestM</span> <span class="kr">where</span>
<span class="n">getCurrentTime</span> <span class="ow">=</span> <span class="n">return</span> <span class="o">$</span> <span class="n">posixSecondsToUTCTime</span> <span class="mi">0</span></code></pre><p>Where this really starts to shine is when adding additional effects. For example, the above token validation function might also need information about some kind of secret used for signing. Under this model, it’s just another typeclass:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">TokenSecret</span> <span class="n">m</span> <span class="kr">where</span>
<span class="n">getTokenSecret</span> <span class="ow">::</span> <span class="n">m</span> <span class="kt">Secret</span>
<span class="nf">validateToken</span> <span class="ow">::</span> <span class="p">(</span><span class="kt">CurrentTime</span> <span class="n">m</span><span class="p">,</span> <span class="kt">TokenSecret</span> <span class="n">m</span><span class="p">)</span> <span class="ow">=></span> <span class="kt">Token</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">Bool</span>
<span class="nf">validateToken</span> <span class="n">tok</span> <span class="ow">=</span> <span class="kr">do</span>
<span class="n">currentTime</span> <span class="ow"><-</span> <span class="n">getCurrentTime</span>
<span class="n">secret</span> <span class="ow"><-</span> <span class="n">getTokenSecret</span>
<span class="n">return</span> <span class="p">(</span><span class="n">tokenExpirationDate</span> <span class="n">tok</span> <span class="o">></span> <span class="n">currentTime</span>
<span class="o">&&</span> <span class="n">verifySignature</span> <span class="n">tok</span> <span class="n">secret</span><span class="p">)</span></code></pre><p>Of course, so far all of these functions have been extremely simple, and we’ve basically been using them as a glorified reader monad. In practice, though, we use this pattern for lots more than just retrieving values. For example, we might have a typeclass for database interactions:</p><pre><code class="pygments"><span class="kr">class</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">Persistence</span> <span class="n">m</span> <span class="kr">where</span>
<span class="n">fetchUser</span> <span class="ow">::</span> <span class="kt">Id</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="kt">User</span><span class="p">)</span>
<span class="n">insertUser</span> <span class="ow">::</span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="p">(</span><span class="kt">Either</span> <span class="kt">PersistenceError</span> <span class="p">(</span><span class="kt">Id</span> <span class="kt">User</span><span class="p">))</span></code></pre><p>With all of this done, it becomes incredibly easy to see which functions are using which effects:</p><pre><code class="pygments"><span class="nf">postUsers</span>
<span class="ow">::</span> <span class="p">(</span><span class="kt">CurrentTime</span> <span class="n">m</span><span class="p">,</span> <span class="kt">Persistence</span> <span class="n">m</span><span class="p">,</span> <span class="kt">TokenSecret</span> <span class="n">m</span><span class="p">)</span>
<span class="ow">=></span> <span class="kt">User</span> <span class="ow">-></span> <span class="n">m</span> <span class="kt">Response</span>
<span class="nf">postUsers</span> <span class="ow">=</span> <span class="o">...</span>
<span class="nf">getHealthcheck</span>
<span class="ow">::</span> <span class="kt">CurrentTime</span> <span class="n">m</span>
<span class="ow">=></span> <span class="n">m</span> <span class="kt">Response</span>
<span class="nf">getHealthcheck</span> <span class="ow">=</span> <span class="o">...</span></code></pre><p>There’s no need to perform any lifting, and this all seems to scale quite nicely. We’ve written some additional utilities to help write tests against functions using these kinds of monadic interfaces, and even though there’s a little bit of annoying boilerplate in a few spots, overall it seems to work quite elegantly.</p><p>I’m not entirely sure how common this is in the Haskell community, but it’s certainly pretty neat how easy it is to get nearly all of the benefits of effect types in other languages simply by composing some of Haskell’s simplest features.</p><h3><a name="atom-s-ide-haskell-tooling-is-invaluable"></a>Atom’s ide-haskell tooling is invaluable</h3><p>Alright, so, confession time: I don’t use Emacs.</p><p>I know, I know, how is that possible? I write Lisp, after all. Well, honestly, I tried picking it up a number of times, but none of those times did I get far enough to ditch my other tools. For Racket work, I use DrRacket, but for almost everything else, I use Atom.</p><p>Atom has a lot of flaws, but it’s also pretty amazing in places, and I absolutely <em>love</em> the Haskell tooling written by the wonderful <a href="https://github.com/atom-haskell">atom-haskell</a> folks. I use it constantly, and even though it doesn’t always work perfectly, it works pretty well. When it has problems, I’ve at least figured out how to get it working correctly.</p><p>This is probably hard to really explain without seeing it for yourself, but I’ve found that I basically <em>depend</em> on this sort of tooling to be fully productive in Haskell, and I have no problem admitting that. 
The ability to get instant feedback about type errors tied to visual source locations, to be able to directly manipulate the source by selecting expressions and getting type information, and even the option to get inline linter suggestions means I spend a lot less time glancing at the terminal, and even less time in the REPL.</p><p>The tooling is far from perfect, and it leaves a lot to be desired in places (the idea of using that static information for automated, project-wide refactoring <em>a la</em> Java is tantalizing), but most of those things are ideas of what amazing things could be, not broken or missing essentials. I am pretty satisfied with ide-haskell right now, and I can only hope it continues to get better and better.</p><h2><a name="frustrations-drawbacks-and-pain-points"></a>Frustrations, drawbacks, and pain points</h2><p>Haskell is not perfect. In fact, far from it. There is a vast array of little annoyances that I have with the language, as is the case with any language. Still, there are a few overarching problems that I would really like to at least mention. These are the biggest sources of frustration for me so far.</p><h3><a name="purity-failure-and-exception-handling"></a>Purity, failure, and exception-handling</h3><p>One of Haskell’s defining features is its purity—I don’t think many would disagree with that. Some people consider it a drawback, others consider it one of its greatest boons. 
Personally, I like it a lot, and I think one of the best parts about it is how it requires the programmer to be incredibly deliberate about failure.</p><p>In many languages, when looking up a value from a container where the key doesn’t exist, there are really two ways to go about expressing this failure:</p><ol><li><p>Throw an exception.</p></li><li><p>Return <code>null</code>.</p></li></ol><p>The former is scary because it means <em>any</em> call to any function can make the entire program blow up, and it’s often impossible to know which functions even have the potential to throw. This creates a certain kind of non-local control flow that can sometimes cause a lot of unpredictability. The second option is much the same, especially when any value in a program might be <code>null</code>; it just defers the failure.</p><p>In languages with option types, this is somewhat mitigated. Java now has option types, too, but they are still frequently cumbersome to use because there is nothing like monads to simply chain operations together. Haskell, in comparison, has an incredible complement of tools to simply handle errors without a whole lot of burden on the programmer, and I have found that, in practice, this is <em>actually helpful</em> and I really do write better error-handling code.</p><h4><a name="first-the-good-parts"></a>First, the good parts</h4><p>I have seen a comparison drawn between throwing checked exceptions and returning <code>Maybe</code> or <code>Either</code> types, but in practice the difference is massive. Handling checked exceptions is a monotonous chore because they are not first-class values; they are actually entirely separate linguistic constructs. Consider a library that throws a <code>LibraryException</code>, and you want to wrap that library and convert those exceptions to <code>ApplicationException</code>s. 
Well, have fun writing this code dozens of times:</p><pre><code class="pygments"><span class="k">try</span> <span class="o">{</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">doSomething</span><span class="o">();</span>
<span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">LibraryException</span> <span class="n">ex</span><span class="o">)</span> <span class="o">{</span>
<span class="k">throw</span> <span class="n">ApplicationException</span><span class="o">.</span><span class="na">fromLibraryException</span><span class="o">(</span><span class="n">ex</span><span class="o">);</span>
<span class="o">}</span>
<span class="c1">// ...</span>
<span class="k">try</span> <span class="o">{</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">doSomethingElse</span><span class="o">();</span>
<span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">LibraryException</span> <span class="n">ex</span><span class="o">)</span> <span class="o">{</span>
<span class="k">throw</span> <span class="n">ApplicationException</span><span class="o">.</span><span class="na">fromLibraryException</span><span class="o">(</span><span class="n">ex</span><span class="o">);</span>
<span class="o">}</span></code></pre><p>In Haskell, failure is just represented by first-class values, and it’s totally possible to write helper functions to abstract over that kind of boilerplate:</p><pre><code class="pygments"><span class="nf">libraryToApplication</span> <span class="ow">::</span> <span class="kt">LibraryError</span> <span class="ow">-></span> <span class="kt">ApplicationError</span>
<span class="nf">libraryToApplication</span> <span class="ow">=</span> <span class="o">...</span>
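<span class="c1">-- Note: mapLeft is not in base; it is assumed here to come from</span>
<span class="c1">-- Data.Either.Combinators (either package). Data.Bifunctor.first is equivalent on Either.</span>
<span class="nf">liftLibrary</span> <span class="ow">::</span> <span class="kt">Either</span> <span class="kt">LibraryError</span> <span class="n">a</span> <span class="ow">-></span> <span class="kt">Either</span> <span class="kt">ApplicationError</span> <span class="n">a</span>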
<span class="nf">liftLibrary</span> <span class="ow">=</span> <span class="n">mapLeft</span> <span class="n">libraryToApplication</span></code></pre><p>Now, that same boilerplate-y code becomes nearly invisible:</p><pre><code class="pygments"><span class="nf">x</span> <span class="ow"><-</span> <span class="n">liftLibrary</span> <span class="n">doSomething</span>
<span class="c1">-- ...</span>
<span class="nf">y</span> <span class="ow"><-</span> <span class="n">liftLibrary</span> <span class="n">doSomethingElse</span></code></pre><p>This might not <em>seem</em> like much, but it really cuts down on the amount of visual noise, which ends up making all the difference. Boilerplate incurs a cost much bigger than simply taking the time to type it all out (though that’s important, too): the cognitive overhead of parsing which parts of a program are boilerplate has a significant impact on readability.</p><h4><a name="so-what-s-the-problem"></a>So what’s the problem?</h4><p>If error handling is so great in Haskell, then why am I putting it under the complaints section? Well, it turns out that not everyone seems to think it’s as great as I make it out to be because people seem to keep writing Haskell APIs that throw exceptions!</p><p>Despite what some purists would have you believe, Haskell has exceptions, and they are not uncommon. Lots of things can throw exceptions, some of which are probably reasonable. Failing to connect to a database is a pretty catastrophic error, so it seems fair that it would throw. On the other hand, inserting a duplicate record is a pretty normal operation, so it seems like that should <em>not</em> throw.</p><p>I mostly treat exceptions in Haskell as unrecoverable catastrophes. If I throw an error in <em>my</em> code, I do not intend to catch it. That means something horrible happened, and I just want that horribleness to show up in a log somewhere so I can fix the problem. If I care about failure, there are better ways to handle that failure gracefully.</p><p>It’s also probably worth noting that exceptions in Haskell can be thrown from anywhere, even pure code, but can only be <em>caught</em> within the <code>IO</code> monad. This is especially scary, but I’ve seen it happen in actual libraries out in the wild, even ones that the entire Haskell ecosystem is built on. 
One of the crowning examples of this is the <code>text</code> package, which provides a function called <code>decodeUtf8</code> to convert bytestrings into text. Its type is very simple:</p><pre><code class="pygments"><span class="nf">decodeUtf8</span> <span class="ow">::</span> <span class="kt">ByteString</span> <span class="ow">-></span> <span class="kt">Text</span></code></pre><p>But wait, what if the bytestring is not actually a valid UTF-8 string?</p><p>Boom. There goes the application.</p><p>Okay, okay, well, at least the <code>text</code> package provides another function, this one called <code>decodeUtf8'</code>, which returns an <code>Either</code>. This is good, and I’ve trained myself to only ever use <code>decodeUtf8'</code>, but it still has some pretty significant problems:</p><ul><li><p>The <em>safe</em> version of this function is the “prime” version, rather than the other way around, which encourages people to use the unsafe one. Ideally, the unsafe one should be explicitly labeled as such... maybe call it <code>unsafeDecodeUtf8</code>?</p></li><li><p>This is not a hypothetical problem. When using a Haskell JWT library, we found a function that converts a string into a JWT. Since not all strings are JWTs, the library intelligently returns a <code>Maybe</code>. Therefore, we figured we were safe.</p><p>A couple weeks later, we found that providing this function with invalid data was returning HTTP 500 errors. Why? Our error handling was meticulous! Well, the answer was a <code>decodeUtf8</code> call, hidden inside of the JWT library. This is especially egregious, given that the API it exposed returned a <code>Maybe</code> anyway! 
It would have been trivial to use the safe version there, instead, but the poor, misleading name led the library developer to overlook the bug lurking in the otherwise innocuous function.</p><p>Even worse, this function was totally pure, and we used it in pure code, so we could not simply wrap the function and catch the exception. We had two options: use <code>unsafePerformIO</code> (yuck!) or perform a check before handing the data to the buggy function. We chose the latter, but in some cases, I imagine such a check could be too difficult to be feasible.</p></li></ul><p>The point I’m trying to make is that this is a real problem, and it seems to me that throwing exceptions invalidates one of the primary advantages of Haskell. It disappointed me to realize that a significant amount of code written by FP Complete, one of the primary authors of some of the most important “modern Haskell” code in existence (including Stack), seems to very frequently expose APIs that will throw.</p><p>I’m not sure how much of this stems from a fundamental divide in the Haskell ecosystem and how much is simply due to Michael Snoyman’s coding style, given that he is the primary author of a number of these tools and libraries that seem very eager to throw exceptions. As just one example of a real situation in which we were surprised by this behavior, we used Snoyman’s http-client library and found that it mysteriously throws upon nearly <em>any</em> failure state:</p><blockquote><p>A note on exceptions: for the most part, all actions that perform I/O should be assumed to throw an <code>HttpException</code> in the event of some problem, and all pure functions will be total. For example, <code>withResponse</code>, <code>httpLbs</code>, and <code>BodyReader</code> can all throw exceptions.</p></blockquote><p>This doesn’t seem entirely unreasonable—after all, isn’t a failure to negotiate TLS fairly catastrophic?—until you consider our use case. 
We needed to make a subrequest during the extent of another HTTP request to our server, and if that subrequest fails, we absolutely need to handle that failure gracefully. Of course, this is not <em>terrible</em> given that we are in <code>IO</code> so we can actually catch these exceptions, but since this behavior was only noted in a single aside at the top of the documentation, we didn’t realize we were forgetting error handling until far too late and requests were silently failing.</p><p>Exceptions seem to devalue one of the most powerful concepts in Haskell: if I don’t consider all the possibilities, my code <em>does not compile</em>. In practice, when working with APIs that properly encode these possibilities into the type system, this value proposition seems to be real. I really do find myself writing code that works correctly as soon as it compiles. It’s almost magical.</p><p>Using exceptions throws that all out the window, and I wish the Haskell ecosystem was generally more cautious about when to use them.</p><h3><a name="the-string-problem"></a>The String problem</h3><p>I sort of alluded to this a tiny bit in the last section, and that is probably indicative of how bad this issue is. I’m just going to be blunt: <strong>In Haskell, strings suck.</strong></p><p>This is always a bit of an amusing point whenever it is discussed because of how silly it seems. Haskell is a research language with a cutting-edge type system and some of the fanciest features of any language in existence. When everyday programming might use things like “profunctors”, “injective type families”, and “generalized algebraic datatypes”, you would think that dealing with <em>strings</em> would be a well-solved problem.</p><p>But it isn’t. Haskell libraries frequently use not one, not two, but <em><strong>five</strong></em> kinds of strings. 
Let’s list them off, shall we?</p><ul><li><p>First off, there’s the built-in <code>String</code> type, which is actually an alias for the <code>[Char]</code> type. For those not intimately familiar with Haskell, that’s a <em>linked list of characters</em>. As <a href="http://www.stephendiehl.com/">Stephen Diehl</a> recently put it in <a href="http://www.stephendiehl.com/posts/strings.html">a blog post describing the disaster that is Haskell string types</a>:</p><blockquote><p>This is not only a bad representation, it’s quite possibly the least efficient (non-contrived) representation of text data possible and has horrible performance in both time and space. <em>And it’s used everywhere in Haskell.</em></p></blockquote><p>The point is, it’s really bad. This type is not a useful representation for textual data in practical applications.</p></li><li><p>Moving on, we have a fairly decent type, <code>Text</code>, which comes from <code>Data.Text</code> in the <code>text</code> package. This is a decent representation of text, and it’s probably the one that everything should use. Well, maybe. Because <code>Text</code> comes in two varieties: lazy and strict. Nobody seems to agree on which of those two should be used, though, and they are totally incompatible types: functions that work with one kind of text won’t work with the other. You have to manually convert between them.</p></li><li><p>Finally, we have <code>ByteString</code>, which is horribly misnamed because it really isn’t a string at all, at least not in the textual sense. A better name for this type would have simply been <code>Bytes</code>, which sounds a lot scarier. And that would be good, because data typed as a <code>ByteString</code> is as close as you can get in Haskell to not assigning a type at all: a bytestring holds arbitrary bytes without assigning them any meaning whatsoever!</p><p>Or at least, that’s the intention. 
The trouble is that people <em>don’t</em> treat bytestrings like that—they just use them to toss pieces of text around, even when those pieces of text have a well-defined encoding and represent textual data. This leads to the <code>decodeUtf8</code> problem mentioned above, but it’s bigger than that because it often ends up with some poor APIs that assign some interpretation to <code>ByteString</code> data without assigning it a different type.</p><p>Again, this is throwing away so much of Haskell’s safety. It would be like using <code>Int</code> to keep track of boolean data (“just use 0 and 1!”) or using empty and singleton lists instead of using <code>Maybe</code>. When you use the precise type, you encode invariants and contracts into statically-checked assertions, but when you use general types like <code>ByteString</code>, you give that up.</p><p>Oh, and did I mention that <code>ByteString</code>s also come in incompatible lazy and strict versions, too?</p></li></ul><p>So, obviously, the answer is to just stop using the bad types and to just use (one kind of) <code>Text</code> everywhere. Great! Except that the other types are totally inescapable. The entire standard library uses <code>String</code> exclusively—after all, <code>text</code> is a separate package—and small libraries often use <code>String</code> instead of <code>text</code> because they have no need to bring in the dependency. Of course, this just means every real application pays the performance hit of converting between all these different kinds of strings.</p><p>Similarly, those that <em>do</em> use <code>Text</code> often use different kinds of text, so code ends up littered with <code>fromStrict</code> or <code>toStrict</code> coercions, which (again) have a cost. I’ve already ranted enough about <code>ByteString</code>, but basically, if you’re using <code>ByteString</code> in your API to pass around data that is semantically text, you are causing me pain. 
Please stop.</p><p>It seems that the way <code>Data.Text</code> probably <em>should</em> have been designed was by making <code>Text</code> a typeclass, then making the lazy and strict implementations instances of that typeclass. Still, the fact that both of them exist would always cause problems. I’m actually unsure which one is the “correct” choice—I don’t know enough about how the two perform in practice—but it seems likely that picking <em>either</em> one would be a performance improvement over the current system, which is constantly spending time converting between the two.</p><p>This issue has been ranted about plenty, so I won’t ramble on, but if you’re designing new libraries, please, <em>please</em> use <code>Text</code>. Your users will thank you.</p><h3><a name="documentation-is-nearly-worthless"></a>Documentation is nearly worthless</h3><p>Finally, let’s talk about documentation.</p><p>One of my favorite programming languages is Racket. Racket has a documentation tool called Scribble. Scribble is special because it is a totally separate domain-specific language for writing documentation, and it makes it fun and easy to write good explanations. There are even forms for typesetting automatically-rendered examples that look like a REPL. If the examples ever break or become incorrect, the docs don’t even compile.</p><p>All of the Racket core library documentation makes sure to set a good example about what good documentation should look like. The vast majority of the documentation is paragraphs of prose and simple but practical examples. There are also type signatures (in the form of contracts), and those are super important, but they are so effective because of how the prose explains what each function does, when to use it, <em>why</em> you’d use it, and <em>why you wouldn’t use it</em>.</p><p>Everything is cross-referenced automatically. The documentation is completely searchable locally out of the box. 
As soon as you install a package, its docs are automatically indexed. User-written libraries tend to have pretty good docs, too, because the standard libraries set such a good example <em>and</em> because the tools are so fantastic. Racket docs are really nice, and they’re so good they actually make things like Stack Overflow or even Google mostly irrelevant. It’s all there in the manual.</p><p>Haskell documentation is the opposite of everything I just said.</p><ul><li><p>The core libraries are poorly documented. Most functions include a sentence of description, and almost none include examples. At their worst, the descriptions simply restate the type signature.</p></li><li><p>Third-party libraries’ documentation is even worse, frequently going completely undocumented and including only type signatures and nothing else.</p></li><li><p>Haddock is an incredibly user-hostile tool for writing anything other than tiny snippets of documentation and is not very good at supporting prose. Notably, Haddock’s documentation is not generated using Haddock (and it still manages to be almost unusable). Forcing all documentation into inline comments makes users unlikely to write much explanation, and there is no ability for abstraction.</p></li><li><p>Reading documentation locally is very difficult because there is no easy way to open documentation for a particular package in a web browser, and it’s <em>certainly</em> not searchable. This is especially ridiculous given that Hoogle exists, which is one of the best ways to search API docs in existence. 
There should be a <code>stack hoogle</code> command that just opens a Hoogle page for all locally-installed packages and Just Works, but there isn’t.</p></li><li><p>Most valuable information exists outside of documentation, so Google becomes a go-to immediately after a quick glance at the docs, and information is spread across blog posts, mailing lists, and obscure reddit posts.</p></li></ul><p>This is a problem that cannot be fixed by just making Haddock better, nor can it be fixed simply by improving the existing standard library documentation. There is a fundamental problem with Haskell documentation (which, to be completely fair, is not unique to Haskell), which is that its tools do not support anything more than API docs.</p><p>Good documentation is so much more than “here’s what this function does”; it’s about guides and tutorials and case studies and common pitfalls. <a href="http://docs.racket-lang.org/lens/lens-guide.html">This is documentation for someone new to lenses.</a> <a href="https://hackage.haskell.org/package/lens#readme">This is not.</a> Take note of the difference.</p><h2><a name="conclusion-and-other-thoughts"></a>Conclusion and other thoughts</h2><p>Haskell is an incredible programming platform, and indeed, it is sometimes mind-boggling how complete it is. It also has a lot of rough edges, sometimes in places that feel like they need a lot more care, or perhaps they’re even simply unfinished.</p><p>I could spend weeks writing about all the things I really like or dislike about the language, discussing in fine detail all the things that have made me excited or all the little bits that have made me want to tear my hair out. Heck, I could probably spend a month writing about strings alone. That’s not the point, though... I took a risk with Haskell, and it’s paid off. 
I’m not yet sure exactly how I feel about it, or when I would choose it relative to other tools, but it is currently very high on my list of favorite technologies.</p><p>I did not come to Haskell with a distaste for static typing, despite the fact that I write so much Racket, a dynamically typed language (by default, at least). I don’t really use Typed Racket, and despite my love for Haskell and its type system, I am not sure I will use much more of it than I did before. Haskell and Racket are very different languages, which is justified in some places and probably sort of circumstantial in others.</p><p>The future of Haskell seems bright, and a lot of the changes in the just-released GHC 8 are extremely exciting. I did not list records as a pain point because the changes in GHC 8 appear to make them a <em>lot</em> more palatable, although whether or not they solve that problem completely remains to be seen. I will absolutely continue to write Haskell and push it to its limits where I can, and hopefully try and take as much as I can from it along the way.</p><ol class="footnotes"></ol></article>Simple, safe multimethods in Racket2016-02-18T00:00:00Z2016-02-18T00:00:00ZAlexis King<article><p>Racket ships with <code>racket/generic</code>, a system for defining <em>generic methods</em>, functions that work differently depending on what sort of value they are supplied. I have made heavy use of this feature in my collections library, and it has worked well for my needs, but that system does have a bit of a limitation: it only supports <em>single dispatch</em>. Method implementations may only be chosen based on a single argument, so multiple dispatch is impossible.</p><h2><a name="motivating-multiple-dispatch"></a>Motivating multiple dispatch</h2><p>What is multiple dispatch and why is it necessary? Well, in most cases, it <em>isn’t</em> necessary at all. 
<a href="http://dl.acm.org/citation.cfm?doid=1449764.1449808">It has been shown that multiple dispatch is much rarer than single dispatch in practice.</a> However, when actually needed, having multiple dispatch in the toolbox is a valuable asset.</p><p>A classic example of multiple dispatch is multiplication over both scalars and vectors. Ideally, all of the following operations should work:</p><pre><code>2 × 3 = 6
2 × ⟨3, 4⟩ = ⟨6, 8⟩
⟨3, 4⟩ × 2 = ⟨6, 8⟩
</code></pre><p>In practice, most languages do not support such flexible dispatch rules without fairly complicated branching constructs to handle each permutation of input types. Furthermore, since most languages only support single dispatch (such as most object-oriented languages), it is nearly impossible to add support for a new combination of types to an existing method.</p><p>To illustrate the above, even if a language supported operator overloading <em>and</em> it included a <code>Vector</code> class that overloaded multiplication to properly work with numbers and vectors, it might not implement matrix multiplication. If a user defines a <code>Matrix</code> class, they may overload <em>its</em> multiplication to support numbers, vectors, and matrices, but it is impossible to extend the multiplication implementation for the <code>Vector</code> class. That method is now completely set in stone, unless it is edited directly (and the programmer may not have access to <code>Vector</code>’s implementation).</p><p>Multiple dispatch solves all of these problems. Rather than specify implementations of functions for singular types, it is possible to specify implementations for sets of types. In the above example, a programmer would be able to define a new function that operates on <code>Vector</code> and <code>Matrix</code> arguments. Since each definition does not “belong” to any given type, extending this set of operations is trivial.</p><h2><a name="multiple-dispatch-in-racket"></a>Multiple dispatch in Racket</h2><p>This blog post is somewhat long and technical, so before proceeding any further, I want to show some real code that actually works so you can get a feel for what I’m talking about. As a proof-of-concept, I have created <a href="https://github.com/lexi-lambda/racket-multimethod">a very simple implementation of multiple dispatch in Racket</a>. 
The above example would look like this in Racket using my module:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="n">multimethod</span><span class="p">)</span>
<span class="p">(</span><span class="k">provide</span> <span class="n">mul</span>
         <span class="p">(</span><span class="k">struct-out</span> <span class="n">num</span><span class="p">)</span>
         <span class="p">(</span><span class="k">struct-out</span> <span class="n">vec</span><span class="p">))</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">num</span> <span class="p">(</span><span class="n">val</span><span class="p">))</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">vec</span> <span class="p">(</span><span class="n">vals</span><span class="p">))</span>
<span class="p">(</span><span class="n">define-generic</span> <span class="p">(</span><span class="n">mul</span> <span class="n">a</span> <span class="n">b</span><span class="p">))</span>
<span class="p">(</span><span class="n">define-instance</span> <span class="p">((</span><span class="n">mul</span> <span class="n">num</span> <span class="n">num</span><span class="p">)</span> <span class="n">x</span> <span class="n">y</span><span class="p">)</span>
  <span class="p">(</span><span class="n">num</span> <span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="n">num-val</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="n">num-val</span> <span class="n">y</span><span class="p">))))</span>
<span class="p">(</span><span class="n">define-instance</span> <span class="p">((</span><span class="n">mul</span> <span class="n">num</span> <span class="n">vec</span><span class="p">)</span> <span class="n">n</span> <span class="n">v</span><span class="p">)</span>
  <span class="p">(</span><span class="n">vec</span> <span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="nb">curry</span> <span class="nb">*</span> <span class="p">(</span><span class="n">num-val</span> <span class="n">n</span><span class="p">))</span> <span class="p">(</span><span class="n">vec-vals</span> <span class="n">v</span><span class="p">))))</span>
<span class="p">(</span><span class="n">define-instance</span> <span class="p">((</span><span class="n">mul</span> <span class="n">vec</span> <span class="n">num</span><span class="p">)</span> <span class="n">v</span> <span class="n">n</span><span class="p">)</span>
  <span class="p">(</span><span class="n">mul</span> <span class="n">n</span> <span class="n">v</span><span class="p">))</span></code></pre><p>Pardon the somewhat clunky syntax, but the functionality is there. Using the above code works as expected:</p><pre><code>> (mul (num 2) (num 3))
(num 6)
> (mul (num 2) (vec '(3 4)))
(vec '(6 8))
> (mul (vec '(3 4)) (num 2))
(vec '(6 8))
</code></pre><p>Making the above snippet work is not particularly hard. In fact, it’s likely that most competent Racketeers could do it without much thought. However, there’s a tiny bit more going on behind the scenes than it may seem.</p><h2><a name="the-problem-with-multiple-dispatch"></a>The problem with multiple dispatch</h2><p>The single-dispatch design limitation of <code>racket/generic</code> comes directly from a desire to avoid what has been described as “spooky action at a distance”, a problem that is prevalent in many systems that support methods with multiple dispatch (aka <em>multimethods</em>). Specifically, the issue arises when new method implementations are defined for existing datatypes, which can have far-reaching effects throughout a program because the method table is global state. Both CLOS and Clojure suffer from this shortcoming.</p><p>Interestingly, Haskell with multi-parameter typeclasses (a nonstandard but highly useful extension) makes it quite trivial to create constructs similar to multiple dispatch (though the overload resolution is done at compile-time). The similarities are significant: Haskell <em>also</em> suffers from the possibility of a certain sort of “spooky action”. However, Haskell’s static typing and resolution allows the compiler to catch these potential issues, known as “orphan instances”, at compile time. Even though Racket does not support the same sort of static typing, the same idea can be used to keep multiple dispatch safe using the macro system.</p><h2><a name="safe-dynamically-typed-multiple-dispatch"></a>Safe, dynamically-typed multiple dispatch</h2><p>In order to make multiple dispatch safe, we first need to determine exactly what is unsafe. Haskell has rules for determining what constitutes an “orphan instance”, and these rules are equally applicable for determining dangerous multimethod implementations. 
Specifically, a definition can be considered unsafe if <em>both</em> of the following conditions are true:</p><ol><li><p>The multimethod that is being implemented was declared in a different module from the implementation.</p></li><li><p><em>All</em> of the types used for dispatch in the multimethod instance were declared in a different module from the implementation.</p></li></ol><p>Conversely, a multimethod implementation is safe if <em>either</em> of the following conditions is true:</p><ol><li><p>The multimethod that is being implemented is declared in the same module as the implementation.</p></li><li><p><em>Any</em> of the types used for dispatch in the multimethod instance are declared in the same module as the implementation.</p></li></ol><p>Why do these two rules provide a strong enough guarantee to eliminate the dangers created by global state? Well, to understand that, we need to understand what can go wrong if these rules are ignored.</p><h3><a name="multimethods-and-dangerous-instances"></a>Multimethods and dangerous instances</h3><p>What exactly is this dangerous-sounding “spooky action”, and what causes it? Well, the trouble stems from the side-effectful nature of multimethod instance definitions. Consider the Racket module from earlier, which defines multiplication instances for scalars and vectors:</p><pre><code class="pygments"><span class="p">(</span><span class="k">provide</span> <span class="n">mul</span>
         <span class="p">(</span><span class="k">struct-out</span> <span class="n">num</span><span class="p">)</span>
         <span class="p">(</span><span class="k">struct-out</span> <span class="n">vec</span><span class="p">))</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">num</span> <span class="p">(</span><span class="n">val</span><span class="p">))</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">vec</span> <span class="p">(</span><span class="n">vals</span><span class="p">))</span>
<span class="p">(</span><span class="n">define-generic</span> <span class="p">(</span><span class="n">mul</span> <span class="n">a</span> <span class="n">b</span><span class="p">))</span>
<span class="p">(</span><span class="n">define-instance</span> <span class="p">((</span><span class="n">mul</span> <span class="n">num</span> <span class="n">num</span><span class="p">)</span> <span class="n">x</span> <span class="n">y</span><span class="p">)</span>
  <span class="p">(</span><span class="n">num</span> <span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="n">num-val</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="n">num-val</span> <span class="n">y</span><span class="p">))))</span>
<span class="p">(</span><span class="n">define-instance</span> <span class="p">((</span><span class="n">mul</span> <span class="n">num</span> <span class="n">vec</span><span class="p">)</span> <span class="n">n</span> <span class="n">v</span><span class="p">)</span>
  <span class="p">(</span><span class="n">vec</span> <span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="nb">curry</span> <span class="nb">*</span> <span class="p">(</span><span class="n">num-val</span> <span class="n">n</span><span class="p">))</span> <span class="p">(</span><span class="n">vec-vals</span> <span class="n">v</span><span class="p">))))</span>
<span class="p">(</span><span class="n">define-instance</span> <span class="p">((</span><span class="n">mul</span> <span class="n">vec</span> <span class="n">num</span><span class="p">)</span> <span class="n">v</span> <span class="n">n</span><span class="p">)</span>
  <span class="p">(</span><span class="n">mul</span> <span class="n">n</span> <span class="n">v</span><span class="p">))</span></code></pre><p>Note that there is not actually a <code>(mul vec vec)</code> implementation. This is intentional: there are <em>two</em> ways to take the product of two vectors, so no default implementation is provided. However, it is possible that another module might desire an instance for <code>mul</code> that takes the dot product, and the programmer might write the following definition:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-instance</span> <span class="p">((</span><span class="n">mul</span> <span class="n">vec</span> <span class="n">vec</span><span class="p">)</span> <span class="n">x</span> <span class="n">y</span><span class="p">)</span>
  <span class="p">(</span><span class="n">num</span> <span class="p">(</span><span class="nb">foldl</span> <span class="nb">+</span> <span class="mi">0</span> <span class="p">(</span><span class="nb">map</span> <span class="nb">*</span> <span class="p">(</span><span class="n">vec-vals</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="n">vec-vals</span> <span class="n">y</span><span class="p">)))))</span></code></pre><p>However, there is something fishy about the above definition: it doesn’t need to be exported with <code>provide</code> to work! Since instances don’t create new bindings and only add dispatch options, they never need to <code>provide</code> anything. This is problematic, though: it means that a program could continue to happily compile <em>even if</em> the module containing the dot product instance was never loaded with <code>require</code>, but an attempt to multiply two vectors would fail at runtime, claiming that there was no <code>(mul vec vec)</code> implementation. This drastic change of behavior violates Racket programmers’ assumptions about the guarantees made by modules (<code>require</code> should not cause any side-effects if the module’s bindings are not used).</p><p>Of course, while this seems potentially unexpected, it is workable: just be careful to <code>require</code> modules containing instances. Unfortunately, it gets much worse—what if a different library defines <em>its own</em> <code>(mul vec vec)</code> instance? What if that instance takes the cross product instead? That library may function entirely properly on its own, but when loaded alongside the program that defines a dot product instance, it is impossible to determine which instance should be used where. Because <code>define-instance</code> operates by modifying the aforementioned global state, the implementations clash, and the two systems <em>cannot</em> continue to operate together as written.</p><p>This is pretty bad. 
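</p><p>To make the clash concrete, here is a minimal sketch that stands in for the global method table with a plain mutable hash. It does not use the <code>multimethod</code> package: <code>mul-table</code> and <code>register-mul!</code> are hypothetical names, vectors are plain lists, and the sketch assumes the simplest possible policy, in which a later registration overwrites an earlier one.</p><pre><code>#lang racket/base
;; Hypothetical stand-in for a global method table. This is NOT the
;; multimethod package's code; every name below is invented.
(define mul-table (make-hash))

(define (register-mul! types proc)
  (hash-set! mul-table types proc))

;; "Library A" registers the dot product for two vectors.
(register-mul! '(vec vec)
               (λ (x y) (apply + (map * x y))))

;; "Library B", loaded later, registers the cross product under the same key.
(register-mul! '(vec vec)
               (λ (x y)
                 (list (- (* (cadr x) (caddr y)) (* (caddr x) (cadr y)))
                       (- (* (caddr x) (car y)) (* (car x) (caddr y)))
                       (- (* (car x) (cadr y)) (* (cadr x) (car y))))))

;; Library A still asks for "its" instance, but the table now holds B's.
(define product ((hash-ref mul-table '(vec vec)) '(1 0 0) '(0 1 0)))
</code></pre><p>Here <code>product</code> is the list <code>(0 0 1)</code>, the cross product, rather than the number <code>0</code> that library A’s dot product would have returned. Nothing in library A changed; merely loading library B was enough.</p><p>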
Defining extra instances is a reasonable use-case for multiple dispatch, but if these instances can break <em>third-party code</em>, how can they be trusted? This sort of problem can make multiple dispatch difficult to reason about and even more difficult to trust.</p><h3><a name="what-determines-safety"></a>What determines safety?</h3><p>With those problems in mind, we can turn back to the two rules for <em>safe</em> multiple dispatch. How do they prevent the above issues? Well, let’s take them one at a time.</p><p>Remember that an instance can be unequivocally determined to be safe if either of the two conditions is true, so we can consider them entirely independently. The first one is simple—an instance is safe if the following condition holds:</p><blockquote><p>The multimethod that is being implemented is declared in the same module as the implementation.</p></blockquote><p>This one is pretty obvious. It is impossible to create a “bad” instance of a method declared in the same module because it is impossible to import the method without also bringing in the instance. Furthermore, a conflicting instance cannot be defined at the place where the types themselves are defined because that would require a circular module dependency, which Racket does not permit.</p><p>With the above explanation in mind, the second condition should make sense, too:</p><blockquote><p><em>Any</em> of the types used for dispatch in the multimethod instance are declared in the same module as the implementation.</p></blockquote><p>The same argument for the first point holds for the second, but with the parties swapped. Again, it is impossible to use the instance without somehow requiring the module that defines the datatype itself, so the instance would always be required, anyway. 
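</p><p>To see the second condition at work next to the first, both rules can be written as one small predicate. This is only an illustrative sketch: <code>instance-safe?</code> is a hypothetical function, not part of the <code>multimethod</code> package, and modules are represented by plain symbols rather than real module paths.</p><pre><code>#lang racket/base
;; Hypothetical predicate, not part of the multimethod package.
;;   method-home : module that declared the multimethod
;;   type-homes  : modules that declared each dispatch type
;;   here        : module containing the instance definition
(define (instance-safe? method-home type-homes here)
  (or (eq? method-home here)              ; first condition
      (and (memq here type-homes) #t)))   ; second condition

;; A locally declared type paired with a foreign one: safe by the second condition.
(define local-type-ok (instance-safe? 'lib '(app lib) 'app))
;; The dot product instance: the method and both types all come from lib.
(define dot-product-ok (instance-safe? 'lib '(lib lib) 'app))
</code></pre><p>With these definitions, <code>local-type-ok</code> is <code>#t</code> and <code>dot-product-ok</code> is <code>#f</code>, matching the rules as stated above.</p><p>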
The most interesting aspect of this condition is that it demonstrates that instances can be defined for existing datatypes (that are defined in other modules) just so long as <em>at least one</em> of the datatypes is defined in the same module. This continues to permit the important use-case of extending the interfaces of existing types.</p><h3><a name="encoding-the-safety-rules-into-racket-s-macro-system"></a>Encoding the safety rules into Racket’s macro system</h3><p>In order to keep track of which methods and instances are defined where, I leveraged a technique based on the one <a href="http://www.ccs.neu.edu/racket/pubs/scheme2007-ctf.pdf">used by Typed Racket to keep track of whether or not a typed identifier is used in a typed or untyped context</a>. However, instead of using a simple mutable boolean flag, I used a mutable <a href="http://docs.racket-lang.org/syntax/syntax-helpers.html#%28tech._identifier._set%29">free identifier set</a>, which keeps track of the identifiers within a given module that should be considered “privileged”.</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket/base</span>
<span class="p">(</span><span class="k">require</span> <span class="n">syntax/id-set</span><span class="p">)</span>
<span class="p">(</span><span class="k">provide</span> <span class="n">mark-id-as-privileged!</span>
         <span class="n">id-privileged?</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="n">privileged-ids</span> <span class="p">(</span><span class="n">mutable-free-id-set</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">mark-id-as-privileged!</span> <span class="n">id</span><span class="p">)</span>
  <span class="p">(</span><span class="n">free-id-set-add!</span> <span class="n">privileged-ids</span> <span class="n">id</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">id-privileged?</span> <span class="n">id</span><span class="p">)</span>
  <span class="p">(</span><span class="n">free-id-set-member?</span> <span class="n">privileged-ids</span> <span class="n">id</span><span class="p">))</span></code></pre><p>Making this work with <code>define-generic</code> is obvious: just invoke <code>mark-id-as-privileged!</code> on the method name to note that the method is “privileged” in the scope of the current module. Keeping track of privileged structs is similarly straightforward, though it is a little more devious: the <code>multimethod</code> module provides a custom <code>struct</code> macro that just expands to <code>struct</code> from <code>racket/base</code>, but adds privilege information.</p><p>The <code>define-instance</code> macro does all the heavy lifting to ensure that only privileged identifiers can be used in instance definitions. A simple check for the identifier annotations is performed before proceeding with macro expansion:</p><pre><code class="pygments"><span class="p">(</span><span class="k">unless</span> <span class="p">(</span><span class="k">or</span> <span class="n">privileged?</span> <span class="p">(</span><span class="nb">ormap</span> <span class="n">id-privileged?</span> <span class="n">types</span><span class="p">))</span>
  <span class="p">(</span><span class="n">assert-privileged-struct!</span> <span class="p">(</span><span class="nb">first</span> <span class="n">types</span><span class="p">)))</span></code></pre><p>When the privilege checks fail, an error is raised:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">assert-privileged-struct!</span> <span class="n">id</span><span class="p">)</span>
  <span class="p">(</span><span class="k">unless</span> <span class="p">(</span><span class="n">id-privileged?</span> <span class="n">id</span><span class="p">)</span>
    <span class="p">(</span><span class="nb">raise-syntax-error</span> <span class="o">'</span><span class="ss">define-instance</span>
                        <span class="s2">"expected name of struct defined in current module"</span>
                        <span class="n">id</span><span class="p">)))</span></code></pre><p>With the above safeguards in place, the dangerous dot product implementation from above <strong>would not be allowed</strong>. The checks manage to encode both of the safety rules into the macro system such that invalid instances will fail <em>at compile time</em>, preventing dangerous uses of multimethods from ever slipping by unnoticed.</p><h3><a name="actually-implementing-multiple-dispatch"></a>Actually implementing multiple dispatch</h3><p>The rest of the multimethod implementation is relatively straightforward and is not even particularly robust. If anything, it is the bare minimum of what would be needed to allow the safety mechanisms above to work. Lots of features that would likely be needed in a real implementation are not included, and graceful error handling is largely ignored.</p><p>Multimethods themselves are implemented as Racket <a href="http://docs.racket-lang.org/guide/proc-macros.html#%28tech._transformer._binding%29">transformer bindings</a> containing custom data, including a reference to the multimethod’s arity and dispatch table. The custom datatype includes a <code>prop:procedure</code> structure type property, which allows such bindings to also function as macros. The macro procedure expands to an operation that looks up the proper instance to use in the multimethod’s dispatch table and invokes it with the supplied arguments.</p><p>The relevant code for defining multimethods is reproduced below:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
  <span class="p">(</span><span class="k">struct</span> <span class="n">multimethod</span> <span class="p">(</span><span class="n">arity</span> <span class="n">dispatch-table</span><span class="p">)</span>
    <span class="kd">#:property</span> <span class="nb">prop:procedure</span>
    <span class="p">(</span><span class="k">λ</span> <span class="p">(</span><span class="n">method</span> <span class="n">stx</span><span class="p">)</span>
      <span class="p">(</span><span class="n">syntax-parse</span> <span class="n">stx</span>
        <span class="p">[(</span><span class="n">method</span> <span class="n">arg</span> <span class="k">...</span><span class="p">)</span>
         <span class="o">#'</span><span class="p">(</span><span class="n">apply-multimethod</span> <span class="n">method</span> <span class="p">(</span><span class="nb">list</span> <span class="n">arg</span> <span class="k">...</span><span class="p">))]</span>
        <span class="p">[</span><span class="n">method</span>
         <span class="o">#'</span><span class="p">(</span><span class="k">λ</span> <span class="n">args</span> <span class="p">(</span><span class="n">apply-multimethod</span> <span class="n">method</span> <span class="n">args</span><span class="p">))]))))</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">define-generic</span>
  <span class="p">(</span><span class="n">syntax-parser</span>
    <span class="p">[(</span><span class="k">_</span> <span class="p">(</span><span class="n">method:id</span> <span class="n">arg:id</span> <span class="n">...+</span><span class="p">))</span>
     <span class="p">(</span><span class="k">with-syntax</span> <span class="p">([</span><span class="n">arity</span> <span class="p">(</span><span class="nb">length</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">arg</span><span class="p">))]</span>
                   <span class="p">[</span><span class="n">dispatch-table</span> <span class="p">(</span><span class="n">generate-temporary</span> <span class="o">#'</span><span class="n">method</span><span class="p">)])</span>
       <span class="p">(</span><span class="n">mark-id-as-privileged!</span> <span class="o">#'</span><span class="n">method</span><span class="p">)</span>
       <span class="o">#'</span><span class="p">(</span><span class="k">begin</span>
           <span class="p">(</span><span class="k">define</span> <span class="n">dispatch-table</span> <span class="p">(</span><span class="nb">make-hash</span><span class="p">))</span>
           <span class="p">(</span><span class="k">define-syntax</span> <span class="n">method</span> <span class="p">(</span><span class="n">multimethod</span> <span class="n">arity</span> <span class="o">#'</span><span class="n">dispatch-table</span><span class="p">))))]))</span></code></pre><p>The dispatch tables are implemented entirely in terms of Racket’s structure types, so while they can be defined on arbitrary structure types (including ones defined in the Racket standard library), they <em>cannot</em> be defined on primitives such as pairs or vectors. Implementations are registered in the dispatch table using the compile-time information associated with structs’ transformer bindings, and the same information is retrieved from struct instances at runtime to look up the proper implementation to call. Notably, this only works if the struct is <code>#:transparent</code>, or more generally and accurately, if the calling code has access to the struct’s inspector. All structs defined by the <code>struct</code> form from the <code>multimethod</code> module are automatically marked as <code>#:transparent</code>.</p><p>The following code implements defining multimethod instances:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
  <span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">assert-privileged-struct!</span> <span class="n">id</span><span class="p">)</span>
    <span class="p">(</span><span class="k">unless</span> <span class="p">(</span><span class="n">id-privileged?</span> <span class="n">id</span><span class="p">)</span>
      <span class="p">(</span><span class="nb">raise-syntax-error</span> <span class="o">'</span><span class="ss">define-instance</span>
                          <span class="s2">"expected name of struct defined in current module"</span>
                          <span class="n">id</span><span class="p">))))</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">define-instance</span>
  <span class="p">(</span><span class="n">syntax-parser</span>
    <span class="c1">; standard (define (proc ...) ...) shorthand</span>
    <span class="p">[(</span><span class="k">_</span> <span class="p">((</span><span class="n">method</span> <span class="n">type:id</span> <span class="n">...+</span><span class="p">)</span> <span class="o">.</span> <span class="n">args</span><span class="p">)</span> <span class="n">body:expr</span> <span class="n">...+</span><span class="p">)</span>
     <span class="o">#'</span><span class="p">(</span><span class="n">define-instance</span> <span class="p">(</span><span class="n">method</span> <span class="n">type</span> <span class="k">...</span><span class="p">)</span> <span class="p">(</span><span class="k">λ</span> <span class="n">args</span> <span class="n">body</span> <span class="k">...</span><span class="p">))]</span>
    <span class="c1">; full (define proc lambda-expr) notation</span>
    <span class="p">[(</span><span class="k">_</span> <span class="p">(</span><span class="n">method</span> <span class="n">type:id</span> <span class="n">...+</span><span class="p">)</span> <span class="n">proc:expr</span><span class="p">)</span>
     <span class="p">(</span><span class="k">let*</span> <span class="p">([</span><span class="n">multimethod</span> <span class="p">(</span><span class="nb">syntax-local-value</span> <span class="o">#'</span><span class="n">method</span><span class="p">)]</span>
            <span class="p">[</span><span class="n">privileged?</span> <span class="p">(</span><span class="n">id-privileged?</span> <span class="o">#'</span><span class="n">method</span><span class="p">)])</span>
       <span class="p">(</span><span class="k">unless</span> <span class="p">(</span><span class="k">or</span> <span class="n">privileged?</span> <span class="p">(</span><span class="nb">ormap</span> <span class="n">id-privileged?</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">type</span><span class="p">)))</span>
         <span class="p">(</span><span class="n">assert-privileged-struct!</span> <span class="p">(</span><span class="nb">first</span> <span class="p">(</span><span class="n">attribute</span> <span class="n">type</span><span class="p">))))</span>
       <span class="p">(</span><span class="k">with-syntax</span> <span class="p">([</span><span class="n">dispatch-table</span> <span class="p">(</span><span class="n">multimethod-dispatch-table</span> <span class="n">multimethod</span><span class="p">)]</span>
                     <span class="p">[(</span><span class="n">struct-type-id</span> <span class="k">...</span><span class="p">)</span> <span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="nb">compose1</span> <span class="nb">first</span> <span class="n">extract-struct-info</span> <span class="nb">syntax-local-value</span><span class="p">)</span>
                                                <span class="p">(</span><span class="n">attribute</span> <span class="n">type</span><span class="p">))])</span>
         <span class="o">#'</span><span class="p">(</span><span class="k">let</span> <span class="p">([</span><span class="n">struct-types</span> <span class="p">(</span><span class="nb">list</span> <span class="n">struct-type-id</span> <span class="k">...</span><span class="p">)])</span>
             <span class="p">(</span><span class="nb">hash-set!</span> <span class="n">dispatch-table</span> <span class="n">struct-types</span> <span class="n">proc</span><span class="p">))))]))</span></code></pre><p>The result is a useful, if certainly incomplete, implementation of multimethods in Racket that does not sacrifice the safety provided by <code>racket/generic</code>’s single-dispatch approach.</p><h2><a name="related-work-advantages-and-disadvantages-and-areas-for-future-improvement"></a>Related work, advantages and disadvantages, and areas for future improvement</h2><p>As previously mentioned, this implementation of multiple dispatch was inspired by the types of APIs offered by CLOS and Clojure while also maintaining the safety of <code>racket/generic</code>. The inspiration for the safety rules came from GHC’s detection of orphan instances. Although most of the ideas presented above exist in other places, I am unsure if the concept of safety checking has been used before in any dynamically-typed programming languages.</p><p>The primary advantage offered over Racket’s existing generics system is obvious: multiple dispatch. Furthermore, this system can supersede many uses of <code>racket/generic</code> simply by dispatching on a single type. However, the current implementation does <em>not</em> support all of the features of <code>racket/generic</code>, such as supporting non-structure types and allowing fallback implementations. 
While those are well within the realm of possibility, other things like attaching structure type properties are probably not possible with this approach, so it is unlikely that the existing system could be subsumed by one like this one.</p><p>Additionally, this implementation would almost certainly need numerous improvements before being useful to most programmers:</p><ul><li><p><strong>Good error reporting for failure cases.</strong> Right now, even something obvious like calling a method on values that do not implement it simply fails with an error produced by <code>hash-ref</code>. In a more interesting sense, using the arity to generate compile-time error messages for <code>define-instance</code> would be a nice improvement.</p></li><li><p><strong>Support for Racket primitive data types.</strong> This might require some cooperation from Racket itself to permit an elegant implementation, but they could also just be special-cased. So long as lookup for primitives was done <em>after</em> consulting the main dispatch table, there wouldn’t be any performance hit for non-primitive types.</p></li><li><p><strong>Option to supply fallback implementations.</strong> This wouldn’t be too hard at all, though it’s questionable whether or not it would be useful without method groupings like <code>define/generic</code> provides. There would likely also need to be some sort of way to check if a set of values implements a particular method.</p></li><li><p><strong>Better cooperation with structure inspectors to alleviate the need for all structures to be transparent.</strong> It’s currently unclear to me how exactly this works and how it <em>should</em> work. 
There might be a better way to do this without mucking with inspectors.</p></li><li><p><strong>Much more flexible argument lists, including the ability to specify arguments that are not used for dispatch.</strong> This is really a pretty fundamental requirement, but the parsing required was significant enough for me to put it off for this initial prototype.</p></li><li><p><strong>Scribble forms to document generic methods and their instances.</strong> This is something <code>racket/generic</code> <em>doesn’t</em> have, and it has suffered for it. It would be very nice to have easy documentation forms for multimethods.</p></li><li><p><strong>Proper consideration of struct subtyping.</strong> Racket structs support subtyping, which I have not given much thought to in this prototype. It is possible that subtyping violates constraints I had assumed would hold, so reviewing the existing code with that in mind would be useful.</p></li></ul><p>I’m not sure how much effort is involved in most of the above ideas, and in fact I’m not even completely sure how useful this system is to begin with. I have not found myself reaching much for multiple dispatch in my time as a Racket programmer, but that could simply be because it was previously unavailable. It will be interesting to see if that changes now that I have built this system, even if it is a bit rough around the edges.</p><h2><a name="conclusion"></a>Conclusion</h2><p>Even though multiple dispatch is not needed to solve most problems, as indicated by its general lack of support in mainstream programming languages, it’s a nice tool to have in the toolbox, and it <em>is</em> asked for in the Racket community from time to time (perhaps due to its familiarity in other parts of the Lisp world). 
Time will tell if pointing people to something like this will create or stifle interest in multiple dispatch for Racket.</p><p>The source for the <a href="https://github.com/lexi-lambda/racket-multimethod"><code>multimethod</code> package can be found here</a> if you are at all interested in playing with it yourself.</p><ol class="footnotes"></ol></article>ADTs in Typed Racket with macros2015-12-21T00:00:00Z2015-12-21T00:00:00ZAlexis King<article><p>Macros are one of Racket's flagship features, and its macro system really is state of the art. It can sometimes be difficult to demonstrate <em>why</em> macros are so highly esteemed, in part because it can be hard to find self-contained examples of using macros in practice. Still, one thing that macros are perfect for is filling a "hole" in the language by introducing a feature the language lacks, and one such feature in Typed Racket is <strong>ADTs</strong>.</p><h2><a name="warning-this-is-not-a-macro-tutorial"></a>Warning: this is not a macro tutorial</h2><p>First, a disclaimer: this post assumes at least some knowledge of Scheme/Racket macros. Ideally, you would be familiar with Racket itself. If you aren't, fear not. If you get lost, hold on to the bigger picture, and you'll likely learn more than someone who knows enough to follow all the way through. If you <em>are</em> interested in learning about macros, I must recommend Greg Hendershott's <a href="http://www.greghendershott.com/fear-of-macros/">Fear of Macros</a>. It is good. This is not that.</p><p>Now, with that out of the way, let's get started.</p><h2><a name="what-we-re-building"></a>What we’re building</h2><p><a href="https://en.wikipedia.org/wiki/Algebraic_data_type">Algebraic data types</a>, or <em>ADTs</em>, are a staple of the ML family of functional programming languages. 
I won't go into detail here—I want to focus on the implementation—but they're a very descriptive way of modeling data that encourages designing functions in terms of pattern-matching, something that Racket is already good at.</p><p>Racket also already has a facility for creating custom data structures in the form of <em>structs</em>, which are extremely flexible, but also a little verbose. Racket structs are more powerful than we need, but that means we can implement our ADTs in terms of Racket's struct system.</p><p>With that in mind, what should our syntax look like? Well, let's consider a quintessential example of ADTs: modeling a simple tree. For now, let's just consider a tree of integers. For reference, the Haskell syntax for such a data structure would look like this:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Tree</span> <span class="ow">=</span> <span class="kt">Empty</span>
<span class="o">|</span> <span class="kt">Leaf</span> <span class="kt">Int</span>
<span class="o">|</span> <span class="kt">Node</span> <span class="kt">Tree</span> <span class="kt">Tree</span></code></pre><p>This already demonstrates a few of the core things we'll need to build:</p><ol><li><p>Each ADT has a <em>data type</em>, in this case <code>Tree</code>. This name only exists in the world of types; it isn't a value.</p></li><li><p>Each ADT has various <em>data constructors</em>, in this case <code>Empty</code>, <code>Leaf</code>, and <code>Node</code>.</p></li><li><p>Each data constructor may accept any number of arguments, each of which has a specific type.</p></li><li><p>The types that data constructors may accept include the ADT's datatype itself—that is, definitions can be recursive.</p></li></ol><p>Of course, there's one more important feature we're missing: polymorphism. Our definition of a tree is overly specific, and really, it should be able to hold any kind of data, not just integers. In Haskell, we can do that by adding a type parameter:</p><pre><code class="pygments"><span class="kr">data</span> <span class="kt">Tree</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">Empty</span>
<span class="o">|</span> <span class="kt">Leaf</span> <span class="n">a</span>
<span class="o">|</span> <span class="kt">Node</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span></code></pre><p>With this in mind, we can add a fifth and final point to our list:</p><ol start="5"><li><p>ADTs must be able to be parametrically polymorphic.</p></li></ol><p>That covers all of our requirements for basic ADTs. Now we're ready to port this idea to Racket.</p><h3><a name="describing-adts-in-racket"></a>Describing ADTs in Racket</h3><p>How should we take the Haskell syntax for an ADT definition and adapt it to Racket's parenthetical s-expressions? By taking some cues from the Haskell implementation, Typed Racket's type syntax, and Racket's naming conventions, a fairly logical syntax emerges:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-datatype</span> <span class="p">(</span><span class="n">Tree</span> <span class="n">a</span><span class="p">)</span>
<span class="n">Empty</span>
<span class="p">(</span><span class="n">Leaf</span> <span class="n">a</span><span class="p">)</span>
<span class="p">(</span><span class="n">Node</span> <span class="p">(</span><span class="n">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">Tree</span> <span class="n">a</span><span class="p">)))</span></code></pre><p>This looks pretty good. Just like with the Haskell implementation, <code>Tree</code> should only exist at the type level, and <code>Empty</code>, <code>Leaf</code>, and <code>Node</code> should be constructor functions. Our syntax mirrors Racket function application, too—the proper way to create a leaf would be <code>(Leaf 7)</code>.</p><p>Now that we can create ADT values, how should we extract the values from them? Well, just like in ML-like languages, we can use pattern-matching. We don't need to reinvent the wheel for this one; we should be able to just use Racket's <code>match</code> with our datatypes. For example, a function that sums all the values in a tree might look like this:</p><pre><code class="pygments"><span class="p">(</span><span class="n">:</span> <span class="n">tree-sum</span> <span class="p">((</span><span class="n">Tree</span> <span class="n">Integer</span><span class="p">)</span> <span class="k">-></span> <span class="n">Integer</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">tree-sum</span> <span class="n">tree</span><span class="p">)</span>
<span class="p">(</span><span class="k">match</span> <span class="n">tree</span>
<span class="p">[(</span><span class="n">Empty</span><span class="p">)</span> <span class="mi">0</span> <span class="p">]</span>
<span class="p">[(</span><span class="n">Leaf</span> <span class="n">n</span><span class="p">)</span> <span class="n">n</span> <span class="p">]</span>
<span class="p">[(</span><span class="n">Node</span> <span class="n">l</span> <span class="n">r</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="n">tree-sum</span> <span class="n">l</span><span class="p">)</span>
<span class="p">(</span><span class="n">tree-sum</span> <span class="n">r</span><span class="p">))]))</span></code></pre><p>Given that Racket's <code>struct</code> form automatically produces identifiers that cooperate with <code>match</code>, this shouldn't be hard at all. And with our syntax settled, we're ready to begin implementation.</p><h2><a name="implementing-adts-as-syntax"></a>Implementing ADTs as syntax</h2><p>Now for the fun part. To implement our ADT syntax, we'll employ Racket's industrial-strength macro DSL, <a href="http://docs.racket-lang.org/syntax/stxparse.html"><code>syntax/parse</code></a>. The <code>syntax/parse</code> library works like the traditional Scheme <code>syntax-case</code> on steroids, and one of the most useful features is the ability to define "syntax classes" that encapsulate reusable parsing rules into declarative components.</p><p>Since this is not a macro tutorial, the following implementation assumes you already know how to use <code>syntax/parse</code>. However, all of the concepts here are well within the reaches of any intermediate macrologist, so don't be intimidated by some of the more complex topics at play.</p><h3><a name="parsing-types-with-a-syntax-class"></a>Parsing types with a syntax class</h3><p>To implement ADTs, we're going to want to define exactly one syntax class, a class that describes the grammar for a type. As we've seen, types can be bare identifiers, like <code>Tree</code>, or they can be identifiers with parameters, like <code>(Tree a)</code>. We'll want to cover both cases.</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">type</span>
<span class="p">(</span><span class="n">pattern</span> <span class="n">name:id</span> <span class="kd">#:attr</span> <span class="p">[</span><span class="n">param</span> <span class="mi">1</span><span class="p">]</span> <span class="o">'</span><span class="p">())</span>
<span class="p">(</span><span class="n">pattern</span> <span class="p">(</span><span class="n">name:id</span> <span class="n">param</span> <span class="n">...+</span><span class="p">))))</span></code></pre><p>This syntax class has two rules, one that's a bare identifier, and one that's a list. The ellipsis followed by a plus (<code>...+</code>) in the second example means "one or more", so parsing those parameters will automatically be handled for us. In the bare identifier example, we use <code>#:attr</code> to give the <code>param</code> attribute the default value of an empty list, so this syntax class will actually <em>normalize</em> the input we get in addition to actually parsing it.</p><h3><a name="a-first-attempt-at-define-datatype"></a>A first attempt at <code>define-datatype</code></h3><p>Now we can move on to actually implementing <code>define-datatype</code>. The rules are simple: we need to generate a structure type for each one of the data constructors, and we need to generate a type definition for the parent type itself. This is pretty simple to implement using <code>syntax-parser</code>, which actually does the parsing for our macro.</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">define-datatype</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">type-name:type</span> <span class="n">data-constructor:type</span> <span class="k">...</span><span class="p">)</span>
<span class="p">]))</span></code></pre><p>This definition will do all the parsing we need. It parses the entire macro "invocation", ignoring the first datum with <code>_</code> (which will just be the identifier <code>define-datatype</code>), then expecting a <code>type-name</code>, which uses the <code>type</code> syntax class we defined above. Next, we expect zero or more <code>data-constructor</code>s, which also use the <code>type</code> syntax class. That's all we have to do for parsing. We now have all the information we need to actually output the expansion for the macro.</p><p>Of course, it won't be that easy: this is the difficult part. The first step is to generate a Racket struct for each data constructor. We can do this pretty easily with some simple use of Racket's syntax templating facility. A naïve attempt would look like this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">define-syntax</span> <span class="n">define-datatype</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">type-name:type</span> <span class="n">data-constructor:type</span> <span class="k">...</span><span class="p">)</span>
<span class="o">#'</span><span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">data-constructor.name</span> <span class="p">([</span><span class="n">f</span> <span class="n">:</span> <span class="n">data-constructor.param</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span>
<span class="k">...</span><span class="p">)]))</span></code></pre><p>This is actually really close to being correct. This will generate a struct definition for each <code>data-constructor</code>, where each struct has the name of the data constructor and the same number of fields as arguments provided. The trouble is that in Racket structs, all of the fields have <em>names</em>, but in our ADTs, all the fields are anonymous and by-position. Currently, we're just using the same name for <em>all</em> the fields, <code>f</code>, so if any data constructor has two or more fields, we'll get an error.</p><p>Since we don't care about the field names, what we want to do is just generate random names for every field. To do this, we can use a Racket function called <code>generate-temporary</code>, which generates random identifiers. Our next attempt might look like this:</p><pre><code class="pygments"><span class="o">#`</span><span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">data-constructor.name</span>
<span class="p">([</span><span class="o">#,</span><span class="p">(</span><span class="n">generate-temporary</span><span class="p">)</span> <span class="n">:</span> <span class="n">data-constructor.param</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span>
<span class="k">...</span><span class="p">)</span></code></pre><p>The <code>#,</code> lets us "escape" from the template to execute <code>(generate-temporary)</code> and interpolate its result into the syntax. Unfortunately, this doesn't work. We <em>do</em> generate a random field name, but the ellipsis will re-use the same generated value when it repeats the fields, rendering our whole effort pointless. We need a distinct name for each field, generated once per data constructor.</p><h3><a name="more-leveraging-syntax-classes"></a>More leveraging syntax classes</h3><p>As it turns out, this is <em>also</em> easy to do with syntax classes. We can add an extra attribute to our <code>type</code> syntax class to generate a random identifier with each one. Again, we can use <code>#:attr</code> to do that automatically. Our new definition for <code>type</code> will look like this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">type</span>
<span class="p">(</span><span class="n">pattern</span> <span class="n">name:id</span>
<span class="kd">#:attr</span> <span class="p">[</span><span class="n">param</span> <span class="mi">1</span><span class="p">]</span> <span class="o">'</span><span class="p">()</span>
<span class="kd">#:attr</span> <span class="p">[</span><span class="n">field-id</span> <span class="mi">1</span><span class="p">]</span> <span class="o">'</span><span class="p">())</span>
<span class="p">(</span><span class="n">pattern</span> <span class="p">(</span><span class="n">name:id</span> <span class="n">param</span> <span class="n">...+</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="p">[</span><span class="n">field-id</span> <span class="mi">1</span><span class="p">]</span> <span class="p">(</span><span class="nb">generate-temporaries</span> <span class="o">#'</span><span class="p">(</span><span class="n">param</span> <span class="k">...</span><span class="p">)))))</span></code></pre><p>Here we're using <code>generate-temporaries</code> instead of <code>generate-temporary</code>, which will conveniently generate a new identifier for each of the elements in the list we provide it. This way, we'll get a fresh identifier for each <code>param</code>.</p><p>We can now fix our macro to use this <code>field-id</code> attribute instead of the static field name:</p><pre><code class="pygments"><span class="o">#'</span><span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="k">struct</span> <span class="n">data-constructor.name</span>
<span class="p">([</span><span class="n">data-constructor.field-id</span> <span class="n">:</span> <span class="n">data-constructor.param</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span>
<span class="k">...</span><span class="p">)</span></code></pre><h3><a name="creating-the-supertype"></a>Creating the supertype</h3><p>We're almost done—now we just need to implement our overall type, the one defined by <code>type-name</code>. This is implemented as a trivial type alias, but we need to ensure that polymorphic types are properly handled. For example, a non-polymorphic type would need to be handled like this:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-type</span> <span class="n">Tree</span> <span class="p">(</span><span class="n">U</span> <span class="n">Empty</span> <span class="n">Leaf</span> <span class="n">Node</span><span class="p">))</span></code></pre><p>However, a polymorphic type alias would need to include the type parameters in each subtype, like this:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-type</span> <span class="p">(</span><span class="n">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">U</span> <span class="p">(</span><span class="n">Empty</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">Leaf</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">Node</span> <span class="n">a</span><span class="p">)))</span></code></pre><p>How can we do this? Well, so far, we've been very declarative by using syntax patterns, templates, and classes. However, this is a more pernicious problem to solve with our declarative tools. Fortunately, it's very easy to fall back to using <strong>procedural macros</strong>.</p><p>To build each properly-instantiated type, we'll use a combination of <code>define/with-syntax</code> and Racket's list comprehensions, <code>for/list</code>. 
The <code>define/with-syntax</code> form binds values to pattern identifiers, which can be used within syntax patterns just like the ones bound by <code>syntax-parser</code>. This will allow us to break up our result into multiple steps. Technically, <code>define/with-syntax</code> is not strictly necessary—we could just use <code>#`</code> and <code>#,</code>—but it's cleaner to work with.</p><p>We'll start by defining a set of instantiated data constructor types, one per <code>data-constructor</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define/with-syntax</span> <span class="p">[</span><span class="n">data-type</span> <span class="k">...</span><span class="p">]</span>
<span class="p">(</span><span class="k">for/list</span> <span class="p">([</span><span class="n">name</span> <span class="p">(</span><span class="nb">in-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">data-constructor.name</span> <span class="k">...</span><span class="p">))])</span>
<span class="p">))</span></code></pre><p>Now we can fill in the body with any code we'd like, so long as each body returns a syntax object. We can use some trivial branching logic to determine which form we need:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define/with-syntax</span> <span class="p">[</span><span class="n">data-type</span> <span class="k">...</span><span class="p">]</span>
<span class="p">(</span><span class="k">for/list</span> <span class="p">([</span><span class="n">name</span> <span class="p">(</span><span class="nb">in-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">data-constructor.name</span> <span class="k">...</span><span class="p">))])</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">stx-null?</span> <span class="o">#'</span><span class="p">(</span><span class="n">type-name.param</span> <span class="k">...</span><span class="p">))</span>
<span class="n">name</span>
<span class="o">#`</span><span class="p">(</span><span class="o">#,</span><span class="n">name</span> <span class="n">type-name.param</span> <span class="k">...</span><span class="p">))))</span></code></pre><p>Now with our definition for <code>data-type</code>, we can implement our type alias for the supertype extremely easily:</p><pre><code class="pygments"><span class="o">#'</span><span class="p">(</span><span class="n">define-type</span> <span class="n">type-name</span> <span class="p">(</span><span class="n">U</span> <span class="n">data-type</span> <span class="k">...</span><span class="p">))</span></code></pre><h3><a name="putting-it-all-together"></a>Putting it all together</h3><p>There's just one more thing to do before we can call this macro finished: we need to ensure that all the type parameters defined by <code>type-name</code> are in scope for each data constructor's structure definition. We can do this by making use of <code>type-name.param</code> within each produced struct definition, resulting in this:</p><pre><code class="pygments"><span class="o">#'</span><span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="k">struct</span> <span class="p">(</span><span class="n">type-name.param</span> <span class="k">...</span><span class="p">)</span> <span class="n">data-constructor.name</span>
<span class="p">([</span><span class="n">data-constructor.field-id</span> <span class="n">:</span> <span class="n">data-constructor.param</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span>
<span class="k">...</span><span class="p">)</span></code></pre><p>And we're done! The final macro, now completed, looks like this:</p><pre><code class="pygments"><span class="p">(</span><span class="k">begin-for-syntax</span>
<span class="p">(</span><span class="n">define-syntax-class</span> <span class="n">type</span>
<span class="p">(</span><span class="n">pattern</span> <span class="n">name:id</span>
<span class="kd">#:attr</span> <span class="p">[</span><span class="n">param</span> <span class="mi">1</span><span class="p">]</span> <span class="o">'</span><span class="p">()</span>
<span class="kd">#:attr</span> <span class="p">[</span><span class="n">field-id</span> <span class="mi">1</span><span class="p">]</span> <span class="o">'</span><span class="p">())</span>
<span class="p">(</span><span class="n">pattern</span> <span class="p">(</span><span class="n">name:id</span> <span class="n">param</span> <span class="n">...+</span><span class="p">)</span>
<span class="kd">#:attr</span> <span class="p">[</span><span class="n">field-id</span> <span class="mi">1</span><span class="p">]</span> <span class="p">(</span><span class="nb">generate-temporaries</span> <span class="o">#'</span><span class="p">(</span><span class="n">param</span> <span class="k">...</span><span class="p">)))))</span>
<span class="p">(</span><span class="k">define-syntax</span> <span class="n">define-datatype</span>
<span class="p">(</span><span class="n">syntax-parser</span>
<span class="p">[(</span><span class="k">_</span> <span class="n">type-name:type</span> <span class="n">data-constructor:type</span> <span class="k">...</span><span class="p">)</span>
<span class="p">(</span><span class="n">define/with-syntax</span> <span class="p">[</span><span class="n">data-type</span> <span class="k">...</span><span class="p">]</span>
<span class="p">(</span><span class="k">for/list</span> <span class="p">([</span><span class="n">name</span> <span class="p">(</span><span class="nb">in-syntax</span> <span class="o">#'</span><span class="p">(</span><span class="n">data-constructor.name</span> <span class="k">...</span><span class="p">))])</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="n">stx-null?</span> <span class="o">#'</span><span class="p">(</span><span class="n">type-name.param</span> <span class="k">...</span><span class="p">))</span>
<span class="n">name</span>
<span class="o">#`</span><span class="p">(</span><span class="o">#,</span><span class="n">name</span> <span class="n">type-name.param</span> <span class="k">...</span><span class="p">))))</span>
<span class="o">#'</span><span class="p">(</span><span class="k">begin</span>
<span class="p">(</span><span class="k">struct</span> <span class="p">(</span><span class="n">type-name.param</span> <span class="k">...</span><span class="p">)</span> <span class="n">data-constructor.name</span>
<span class="p">([</span><span class="n">data-constructor.field-id</span> <span class="n">:</span> <span class="n">data-constructor.param</span><span class="p">]</span> <span class="k">...</span><span class="p">))</span> <span class="k">...</span>
<span class="p">(</span><span class="n">define-type</span> <span class="n">type-name</span> <span class="p">(</span><span class="n">U</span> <span class="n">data-type</span> <span class="k">...</span><span class="p">)))]))</span></code></pre><p>It's a little bit dense, certainly, but it is not as complicated or scary as it might seem. It's a simple, mostly declarative, powerful way to transform a DSL into ordinary Typed Racket syntax, and now all we have to do is put it to use.</p><h2><a name="using-our-adts"></a>Using our ADTs</h2><p>With the macro built, we can now actually use our ADTs using the syntax we described! The following is now <em>valid code</em>:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-datatype</span> <span class="p">(</span><span class="n">Tree</span> <span class="n">a</span><span class="p">)</span>
<span class="n">Empty</span>
<span class="p">(</span><span class="n">Leaf</span> <span class="n">a</span><span class="p">)</span>
<span class="p">(</span><span class="n">Node</span> <span class="p">(</span><span class="n">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">Tree</span> <span class="n">a</span><span class="p">)))</span>
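<span class="c1">;; For reference, a hand-expanded sketch of what the definition above should</span>
<span class="c1">;; produce, read off the macro's template. The field names f1 and f2 are</span>
<span class="c1">;; hypothetical stand-ins for the fresh identifiers that generate-temporaries</span>
<span class="c1">;; creates, so the real expansion will not use these names:</span>
<span class="c1">;;   (begin</span>
<span class="c1">;;     (struct (a) Empty ())</span>
<span class="c1">;;     (struct (a) Leaf ([f1 : a]))</span>
<span class="c1">;;     (struct (a) Node ([f1 : (Tree a)] [f2 : (Tree a)]))</span>
<span class="c1">;;     (define-type (Tree a) (U (Empty a) (Leaf a) (Node a))))</span>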
<span class="nb">></span> <span class="p">(</span><span class="n">Node</span> <span class="p">(</span><span class="n">Leaf</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="n">Node</span> <span class="p">(</span><span class="n">Empty</span><span class="p">)</span> <span class="p">(</span><span class="n">Leaf</span> <span class="mi">7</span><span class="p">)))</span>
<span class="nb">-</span> <span class="n">:</span> <span class="p">(</span><span class="n">Node</span> <span class="n">Positive-Byte</span><span class="p">)</span>
<span class="p">(</span><span class="n">Node</span> <span class="p">(</span><span class="n">Leaf</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="n">Node</span> <span class="p">(</span><span class="n">Empty</span><span class="p">)</span> <span class="p">(</span><span class="n">Leaf</span> <span class="mi">7</span><span class="p">)))</span></code></pre><p>We can use this to define common data types, such as Haskell's <code>Maybe</code>:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-datatype</span> <span class="p">(</span><span class="n">Maybe</span> <span class="n">a</span><span class="p">)</span>
<span class="p">(</span><span class="n">Just</span> <span class="n">a</span><span class="p">)</span>
<span class="n">Nothing</span><span class="p">)</span>
<span class="p">(</span><span class="n">:</span> <span class="n">maybe-default</span> <span class="p">(</span><span class="n">All</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="p">(</span><span class="n">Maybe</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span> <span class="k">-></span> <span class="n">a</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">maybe-default</span> <span class="n">m</span> <span class="n">v</span><span class="p">)</span>
<span class="p">(</span><span class="k">match</span> <span class="n">m</span>
<span class="p">[(</span><span class="n">Just</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span><span class="p">]</span>
<span class="p">[(</span><span class="n">Nothing</span><span class="p">)</span> <span class="n">v</span><span class="p">]))</span>
<span class="p">(</span><span class="n">:</span> <span class="n">maybe-then</span> <span class="p">(</span><span class="n">All</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="p">(</span><span class="n">Maybe</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">a</span> <span class="k">-></span> <span class="p">(</span><span class="n">Maybe</span> <span class="n">a</span><span class="p">))</span> <span class="k">-></span> <span class="p">(</span><span class="n">Maybe</span> <span class="n">a</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">maybe-then</span> <span class="n">m</span> <span class="n">f</span><span class="p">)</span>
<span class="p">(</span><span class="k">match</span> <span class="n">m</span>
<span class="p">[(</span><span class="n">Just</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">)]</span>
<span class="p">[(</span><span class="n">Nothing</span><span class="p">)</span> <span class="p">(</span><span class="n">Nothing</span><span class="p">)]))</span></code></pre><p>And of course, we can also use it to define ADTs that use concrete types rather than type parameters, if we so desire. This implements a small mathematical language, along with a trivial interpreter:</p><pre><code class="pygments"><span class="p">(</span><span class="n">define-datatype</span> <span class="n">Expr</span>
<span class="p">(</span><span class="n">Value</span> <span class="n">Number</span><span class="p">)</span>
<span class="p">(</span><span class="n">Add</span> <span class="n">Expr</span> <span class="n">Expr</span><span class="p">)</span>
<span class="p">(</span><span class="n">Subtract</span> <span class="n">Expr</span> <span class="n">Expr</span><span class="p">)</span>
<span class="p">(</span><span class="n">Multiply</span> <span class="n">Expr</span> <span class="n">Expr</span><span class="p">)</span>
<span class="p">(</span><span class="n">Divide</span> <span class="n">Expr</span> <span class="n">Expr</span><span class="p">))</span>
<span class="p">(</span><span class="n">:</span> <span class="n">evaluate</span> <span class="p">(</span><span class="n">Expr</span> <span class="k">-></span> <span class="n">Number</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">e</span><span class="p">)</span>
<span class="p">(</span><span class="k">match</span> <span class="n">e</span>
<span class="p">[(</span><span class="n">Value</span> <span class="n">x</span><span class="p">)</span> <span class="n">x</span> <span class="p">]</span>
<span class="p">[(</span><span class="n">Add</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="n">Subtract</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="nb">-</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="n">Multiply</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="nb">*</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">b</span><span class="p">))]</span>
<span class="p">[(</span><span class="n">Divide</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="nb">/</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="n">evaluate</span> <span class="n">b</span><span class="p">))]))</span>
<span class="nb">></span> <span class="p">(</span><span class="n">evaluate</span> <span class="p">(</span><span class="n">Add</span> <span class="p">(</span><span class="n">Value</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="n">Multiply</span> <span class="p">(</span><span class="n">Divide</span> <span class="p">(</span><span class="n">Value</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="n">Value</span> <span class="mi">2</span><span class="p">))</span>
<span class="p">(</span><span class="n">Value</span> <span class="mi">7</span><span class="p">))))</span>
<span class="mi">4</span> <span class="m">1/2</span></code></pre><p>There's all the power of ADTs, right in Racket, all implemented in 22 lines of code. If you'd like to see all the code together in a runnable form, <a href="https://gist.github.com/lexi-lambda/18cf7a9156f743a1317e">I've put together a gist here</a>.</p><h2><a name="conclusions-and-credit"></a>Conclusions and credit</h2><p>This isn't the simplest macro to create, nor is it the most complex. The code examples might not even make much sense until you try them out yourself. Macros, like any difficult concept, are not always easy to pick up, but they certainly <em>are</em> powerful. The ability to extend the language in such a way, in a matter of minutes, is unparalleled in languages other than Lisp.</p><p>This is, of course, a blessing and a curse. Lisps reject some of the syntactic landmarks that often aid in readability for the power to abstract programs into their bare components. In the end, is this uniform conciseness more or less readable? That's an incredibly subjective question, one that has prompted powerfully impassioned discussions, and I will not attempt to argue one way or the other here.</p><p>That said, I think it's pretty cool.</p><p>Finally, I must give credit where credit is due. Thanks to <a href="http://andmkent.com">Andrew M. Kent</a> for the creation of the <a href="https://github.com/andmkent/datatype">datatype</a> package, which served as the inspiration for this blog post. Many thanks to <a href="http://www.ccs.neu.edu/home/samth/">Sam Tobin-Hochstadt</a> for his work creating Typed Racket, as well as for helping me dramatically simplify the implementation used in this blog post. 
Also thanks to <a href="http://www.ccs.neu.edu/home/ryanc/">Ryan Culpepper</a> and <a href="http://www.ccs.neu.edu/home/matthias/">Matthias Felleisen</a> for their work on creating <code>syntax/parse</code>, which is truly a marvelous tool for exploring the world of macros, and, of course, a big thanks to <a href="http://www.cs.utah.edu/~mflatt/">Matthew Flatt</a> for his implementation of hygiene in Racket, as well as much of the rest of Racket itself. Not to mention the entire legacy of those who formulated the foundations of the Scheme macro system and created the framework for all of this to be possible so many decades later.</p><p>Truly, working in Racket feels like standing on the shoulders of giants. If you're intrigued, give it a shot. It's a fun feeling.</p><ol class="footnotes"></ol></article>Functionally updating record types in Elm2015-11-06T00:00:00Z2015-11-06T00:00:00ZAlexis King<article><p><a href="http://elm-lang.org">Elm</a> is a wonderful language for building web apps, and I love so much of its approach to language design. Elm does so many things <em>right</em> straight out of the box, and that's a real breath of fresh air in the intersection of functional programming and web development. Still, it gets one thing wrong, and unfortunately, that one thing is incredibly important. Elm took the "functions" out of "functional record types".</p><p>Almost any software program, at its core, is all about data. Maybe it's about computing data, maybe it's about manipulating data, or maybe it's about displaying data, but at the end of the day, some sort of data model is going to be needed. The functional model is a breathtakingly elegant system for handling data and shuttling it around throughout a program, and <a href="https://en.wikipedia.org/wiki/Functional_reactive_programming">functional reactive programming</a>, which Elm uses to model event-like interactions, makes this model work even better. 
The really important thing, though, is what tools Elm actually gives you to model your data.</p><h2><a name="a-brief-primer-on-elm-records"></a>A brief primer on Elm records</h2><p>Elm supports all the core datatypes one would expect—numbers, strings, booleans, optionals, etc.—and it allows users to define their own types with ADTs. However, Elm also provides another datatype, which it calls "records". Records are similar to objects in JavaScript: they're effectively key-value mappings. They're cool data structures, and they work well. Here's an example of creating a <code>Point</code> datatype in Elm:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">alias</span> <span class="kt">Point</span> <span class="nf">=</span>
<span class="p">{</span> <span class="nv">x</span> <span class="nf">:</span> <span class="kt">Float</span><span class="p">,</span> <span class="nv">y</span> <span class="nf">:</span> <span class="kt">Float</span> <span class="p">}</span></code></pre><p>Notice that <code>Point</code> is declared as a type <em>alias</em>, not as a separate type like an ADT. This is because record types are truly encoded in the type system as values with named fields, not as disparate types. This allows for some fun tricks, but that's outside the scope of this blog post.</p><h2><a name="the-good"></a>The good</h2><p>What I'd like to discuss is what it looks like to <em>manipulate</em> these data structures. Constructing them is completely painless, and reading from them is super simple. This is where the record system gets everything very <em>right</em>.</p><pre><code class="pygments"><span class="nv">origin</span> <span class="nf">:</span> <span class="kt">Point</span>
<span class="nv">origin</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">x</span> <span class="nf">=</span> <span class="mi">0</span><span class="p">,</span> <span class="nv">y</span> <span class="nf">=</span> <span class="mi">0</span> <span class="p">}</span>
<span class="nv">distanceBetween</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Float</span>
<span class="nv">distanceBetween</span> <span class="nv">a</span> <span class="nv">b</span> <span class="nf">=</span>
<span class="kr">let</span> <span class="nv">dx</span> <span class="nf">=</span> <span class="nv">a</span><span class="nf">.</span><span class="nv">x</span> <span class="nf">-</span> <span class="nv">b</span><span class="nf">.</span><span class="nv">x</span>
<span class="nv">dy</span> <span class="nf">=</span> <span class="nv">a</span><span class="nf">.</span><span class="nv">y</span> <span class="nf">-</span> <span class="nv">b</span><span class="nf">.</span><span class="nv">y</span>
<span class="kr">in</span> <span class="nv">sqrt</span> <span class="p">(</span><span class="nv">dx</span><span class="nf">*</span><span class="nv">dx</span> <span class="nf">+</span> <span class="nv">dy</span><span class="nf">*</span><span class="nv">dy</span><span class="p">)</span></code></pre><p>The syntax is clean and simple. Most importantly, however, the record system is functional (in the "functional programming" sense). In a functional system, it's useful to express concepts in terms of function composition, and this is very easy to do in Elm. Creating a function to access a field would normally be clunky if you always needed to do <code>record.field</code> to access the value. Fortunately, Elm provides some sugar:</p><pre><code class="pygments"><span class="c1">-- These two expressions are equivalent:</span>
<span class="p">(</span><span class="nf">\</span><span class="nv">record</span> <span class="nf">-></span> <span class="nv">record</span><span class="nf">.</span><span class="nv">field</span><span class="p">)</span>
<span class="nf">.</span><span class="nv">field</span></code></pre><p>Using the <code>.field</code> shorthand allows writing some other functions in terms of composition, as most functional programmers would desire:</p><pre><code class="pygments"><span class="nv">doubledX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Float</span>
<span class="nv">doubledX</span> <span class="nf">=</span> <span class="p">(</span><span class="nf">(*)</span> <span class="mi">2</span><span class="p">)</span> <span class="nf"><<</span> <span class="nf">.</span><span class="nv">x</span></code></pre><p>This satisfies me.</p><h2><a name="the-bad"></a>The bad</h2><p>So if everything in Elm is so great, what am I complaining about? Well, while the syntax to access fields is convenient, the syntax to <em>functionally set</em> fields is questionably clunky. Consider a function that accepts a point and returns a new point with its <code>x</code> field set to <code>0</code>:</p><pre><code class="pygments"><span class="nv">zeroedX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">zeroedX</span> <span class="nv">point</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">point</span> <span class="nf">|</span> <span class="nv">x</span> <span class="nf"><-</span> <span class="mi">0</span> <span class="p">}</span></code></pre><p>This doesn't look too bad, does it? It's clear and concise. To me, though, there's something deeply wrong here... this function has a lot of redundancy! It seems to me like we should be able to write this function more clearly in a point-free style. The <code>.field</code> shorthand "functionalizes" the record getter syntax, so there must be a function version of the update syntax, right? Maybe it would look something like this:</p><pre><code class="pygments"><span class="nv">zeroedX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">zeroedX</span> <span class="nf">=</span> <span class="err">!</span><span class="nv">x</span> <span class="mi">0</span></code></pre><p>But alas, there is no such syntax.</p><p>Now you may ask... why does it matter? This seems trivial, and in fact, the explicit updater syntax may actually be more readable by virtue of how explicit it is. You'd be right, because so far, these examples have been horribly contrived. But let's consider a slightly more useful example: <em>functionally updating</em> a record.</p><p>What's the difference? Well, say I wanted to take a point and increment its <code>x</code> field by one. Well, I can easily write a function for that:</p><pre><code class="pygments"><span class="nv">incrementX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">incrementX</span> <span class="nv">point</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">point</span> <span class="nf">|</span> <span class="nv">x</span> <span class="nf"><-</span> <span class="nv">point</span><span class="nf">.</span><span class="nv">x</span> <span class="nf">+</span> <span class="mi">1</span> <span class="p">}</span></code></pre><p>Not terrible, though a <em>little</em> verbose. Still, what if we want to also add a function that <em>decrements</em> <code>x</code>?</p><pre><code class="pygments"><span class="nv">decrementX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">decrementX</span> <span class="nv">point</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">point</span> <span class="nf">|</span> <span class="nv">x</span> <span class="nf"><-</span> <span class="nv">point</span><span class="nf">.</span><span class="nv">x</span> <span class="nf">-</span> <span class="mi">1</span> <span class="p">}</span></code></pre><p>Oh, gosh. That's basically the exact same definition but with the operation flipped. Plus we probably want these operations for <code>y</code>, too. Fortunately, there's an easy solution: just pass a function in to <em>transform</em> the value! We can define an <code>updateX</code> function that allows us to do that easily, then we can define our derived operations in terms of that:</p><pre><code class="pygments"><span class="nv">updateX</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Float</span> <span class="nf">-></span> <span class="kt">Float</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">updateX</span> <span class="nv">f</span> <span class="nv">point</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">point</span> <span class="nf">|</span> <span class="nv">x</span> <span class="nf"><-</span> <span class="nv">f</span> <span class="nv">point</span><span class="nf">.</span><span class="nv">x</span> <span class="p">}</span>
<span class="nv">incrementX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">incrementX</span> <span class="nf">=</span> <span class="nv">updateX</span> <span class="p">(</span><span class="nf">(+)</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">decrementX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">decrementX</span> <span class="nf">=</span> <span class="nv">updateX</span> <span class="p">(</span><span class="nf">\</span><span class="nv">x</span> <span class="nf">-></span> <span class="nv">x</span> <span class="nf">-</span> <span class="mi">1</span><span class="p">)</span></code></pre><p>Not only is that much cleaner, but we can now use it to implement all sorts of other operations that allow us to add, subtract, multiply, or divide the <code>x</code> field. Now we just need to generalize our solution to work with the <code>x</code> <em>and</em> <code>y</code> fields!</p><p>Oh, wait. <strong>We can't.</strong></p><h2><a name="the-ugly"></a>The ugly</h2><p>This is where everything breaks down completely. Elm does not offer enough abstraction to reduce this level of crazy duplication:</p><pre><code class="pygments"><span class="nv">updateX</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Float</span> <span class="nf">-></span> <span class="kt">Float</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">updateX</span> <span class="nv">f</span> <span class="nv">point</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">point</span> <span class="nf">|</span> <span class="nv">x</span> <span class="nf"><-</span> <span class="nv">f</span> <span class="nv">point</span><span class="nf">.</span><span class="nv">x</span> <span class="p">}</span>
<span class="nv">incrementX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">incrementX</span> <span class="nf">=</span> <span class="nv">updateX</span> <span class="p">(</span><span class="nf">(+)</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">decrementX</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">decrementX</span> <span class="nf">=</span> <span class="nv">updateX</span> <span class="p">(</span><span class="nf">\</span><span class="nv">x</span> <span class="nf">-></span> <span class="nv">x</span> <span class="nf">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">updateY</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Float</span> <span class="nf">-></span> <span class="kt">Float</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">updateY</span> <span class="nv">f</span> <span class="nv">point</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">point</span> <span class="nf">|</span> <span class="nv">y</span> <span class="nf"><-</span> <span class="nv">f</span> <span class="nv">point</span><span class="nf">.</span><span class="nv">y</span> <span class="p">}</span>
<span class="nv">incrementY</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">incrementY</span> <span class="nf">=</span> <span class="nv">updateY</span> <span class="p">(</span><span class="nf">(+)</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">decrementY</span> <span class="nf">:</span> <span class="kt">Point</span> <span class="nf">-></span> <span class="kt">Point</span>
<span class="nv">decrementY</span> <span class="nf">=</span> <span class="nv">updateY</span> <span class="p">(</span><span class="nf">\</span><span class="nv">x</span> <span class="nf">-></span> <span class="nv">x</span> <span class="nf">-</span> <span class="mi">1</span><span class="p">)</span></code></pre><p>We sure can give it a shot, though. At the very least, we <em>can</em> implement the increment and decrement functions in a more general way by passing in an updater function:</p><pre><code class="pygments"><span class="nv">increment</span> <span class="nf">:</span> <span class="p">((</span><span class="kt">Float</span> <span class="nf">-></span> <span class="kt">Float</span><span class="p">)</span> <span class="nf">-></span> <span class="nv">a</span> <span class="nf">-></span> <span class="nv">a</span><span class="p">)</span> <span class="nf">-></span> <span class="nv">a</span> <span class="nf">-></span> <span class="nv">a</span>
<span class="nv">increment</span> <span class="nv">update</span> <span class="nf">=</span> <span class="nv">update</span> <span class="p">(</span><span class="nf">(+)</span> <span class="mi">1</span><span class="p">)</span></code></pre><p>Now, with <code>updateX</code> and <code>updateY</code>, we can increment either field very clearly and expressively. If we shorten the names to <code>uX</code> and <code>uY</code>, then the resulting code is actually very readable:</p><pre><code class="pygments"><span class="nv">pointAbove</span> <span class="nf">=</span> <span class="nv">uY</span> <span class="p">(</span><span class="nf">\</span><span class="nv">x</span> <span class="nf">-></span> <span class="nv">x</span> <span class="nf">+</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">pointBelow</span> <span class="nf">=</span> <span class="nv">uY</span> <span class="p">(</span><span class="nf">\</span><span class="nv">x</span> <span class="nf">-></span> <span class="nv">x</span> <span class="nf">-</span> <span class="mi">1</span><span class="p">)</span></code></pre><p>It's almost like English now: "update Y using this transformation". This is actually pretty satisfactory. The trouble arises when you have a record with many fields:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kr">alias</span> <span class="kt">PlayerStats</span> <span class="nf">=</span>
<span class="p">{</span> <span class="nv">health</span> <span class="nf">:</span> <span class="kt">Int</span>
<span class="p">,</span> <span class="nv">strength</span> <span class="nf">:</span> <span class="kt">Int</span>
<span class="p">,</span> <span class="nv">charisma</span> <span class="nf">:</span> <span class="kt">Int</span>
<span class="p">,</span> <span class="nv">intellect</span> <span class="nf">:</span> <span class="kt">Int</span>
<span class="c1">-- etc.</span>
<span class="p">}</span></code></pre><p>It might be very convenient to have generic functional updaters in this case. One could imagine a game that has <code>Potion</code> items:</p><pre><code class="pygments"><span class="kr">type</span> <span class="kt">Potion</span> <span class="nf">=</span> <span class="kt">Potion</span> <span class="kt">String</span> <span class="p">(</span><span class="kt">PlayerStats</span> <span class="nf">-></span> <span class="kt">PlayerStats</span><span class="p">)</span></code></pre><p>And then some different kinds of potions:</p><pre><code class="pygments"><span class="nv">potions</span> <span class="nf">=</span>
<span class="p">[</span> <span class="p">(</span><span class="kt">Potion</span> <span class="s">"Health Potion"</span> <span class="p">(</span><span class="nv">uHealth</span> <span class="p">(</span><span class="nf">(+)</span> <span class="mi">1</span><span class="p">)))</span>
<span class="p">,</span> <span class="p">(</span><span class="kt">Potion</span> <span class="s">"Greater Intellect Potion"</span> <span class="p">(</span><span class="nv">uIntellect</span> <span class="p">(</span><span class="nf">(+)</span> <span class="mi">3</span><span class="p">)))</span>
<span class="p">,</span> <span class="p">(</span><span class="kt">Potion</span> <span class="s">"Potion of Weakness"</span> <span class="p">(</span><span class="nv">uStrength</span> <span class="p">(</span><span class="nf">\</span><span class="nv">x</span> <span class="nf">-></span> <span class="nv">x</span> <span class="nf">//</span> <span class="mi">5</span><span class="p">)))</span>
<span class="p">]</span></code></pre><p>This is a really elegant way to think about items that can affect a player's stats! Unfortunately, it also means you have to define updater functions for <em>every single field in the record</em>. This can get tedious rather quickly:</p><pre><code class="pygments"><span class="nv">uHealth</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Int</span> <span class="nf">-></span> <span class="kt">Int</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">PlayerStats</span> <span class="nf">-></span> <span class="kt">PlayerStats</span>
<span class="nv">uHealth</span> <span class="nv">f</span> <span class="nv">stats</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">stats</span> <span class="nf">|</span> <span class="nv">health</span> <span class="nf"><-</span> <span class="nv">f</span> <span class="nv">stats</span><span class="nf">.</span><span class="nv">health</span> <span class="p">}</span>
<span class="nv">uStrength</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Int</span> <span class="nf">-></span> <span class="kt">Int</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">PlayerStats</span> <span class="nf">-></span> <span class="kt">PlayerStats</span>
<span class="nv">uStrength</span> <span class="nv">f</span> <span class="nv">stats</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">stats</span> <span class="nf">|</span> <span class="nv">strength</span> <span class="nf"><-</span> <span class="nv">f</span> <span class="nv">stats</span><span class="nf">.</span><span class="nv">strength</span> <span class="p">}</span>
<span class="nv">uCharisma</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Int</span> <span class="nf">-></span> <span class="kt">Int</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">PlayerStats</span> <span class="nf">-></span> <span class="kt">PlayerStats</span>
<span class="nv">uCharisma</span> <span class="nv">f</span> <span class="nv">stats</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">stats</span> <span class="nf">|</span> <span class="nv">charisma</span> <span class="nf"><-</span> <span class="nv">f</span> <span class="nv">stats</span><span class="nf">.</span><span class="nv">charisma</span> <span class="p">}</span>
<span class="c1">-- etc.</span></code></pre><p>This is pretty icky. Could there be a better way?</p><h2><a name="trying-to-create-a-more-general-abstraction"></a>Trying to create a more general abstraction</h2><p>Interestingly, this pattern doesn't <em>need</em> to be this bad. There are better ways to do this. Let's revisit our updater functions.</p><p>Really, <code>update</code> can be defined in terms of two other primitive operations: a read and a (functional) write. What would it look like if we implemented it that way instead of requiring special updater functions to be defined? Well, it would look like this:</p><pre><code class="pygments"><span class="nv">update</span> <span class="nf">:</span> <span class="p">(</span><span class="nv">a</span> <span class="nf">-></span> <span class="nv">b</span><span class="p">)</span> <span class="nf">-></span> <span class="p">(</span><span class="nv">b</span> <span class="nf">-></span> <span class="nv">a</span> <span class="nf">-></span> <span class="nv">a</span><span class="p">)</span> <span class="nf">-></span> <span class="p">(</span><span class="nv">b</span> <span class="nf">-></span> <span class="nv">b</span><span class="p">)</span> <span class="nf">-></span> <span class="nv">a</span> <span class="nf">-></span> <span class="nv">a</span>
<span class="nv">update</span> <span class="nv">get</span> <span class="nv">set</span> <span class="nv">f</span> <span class="nv">x</span> <span class="nf">=</span> <span class="nv">set</span> <span class="p">(</span><span class="nv">f</span> <span class="p">(</span><span class="nv">get</span> <span class="nv">x</span><span class="p">))</span> <span class="nv">x</span></code></pre><p>The type signature is a little long, but it's really pretty simple. We just supply a getter and a setter, then a function to do the transformation, and finally a record to actually transform. Of course, as you can see from the type, this function isn't actually specific to records: it can be used with any value for which a getter and setter can be provided.</p><p>The trouble here is that writing field setters isn't any easier in Elm than writing field updaters. They still look pretty verbose:</p><pre><code class="pygments"><span class="nv">sHealth</span> <span class="nf">:</span> <span class="kt">Int</span> <span class="nf">-></span> <span class="kt">PlayerStats</span> <span class="nf">-></span> <span class="kt">PlayerStats</span>
<span class="nv">sHealth</span> <span class="nv">x</span> <span class="nv">stats</span> <span class="nf">=</span> <span class="p">{</span> <span class="nv">stats</span> <span class="nf">|</span> <span class="nv">health</span> <span class="nf"><-</span> <span class="nv">x</span> <span class="p">}</span>
<span class="nv">uHealth</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Int</span> <span class="nf">-></span> <span class="kt">Int</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">PlayerStats</span> <span class="nf">-></span> <span class="kt">PlayerStats</span>
<span class="nv">uHealth</span> <span class="nf">=</span> <span class="nv">update</span> <span class="nf">.</span><span class="nv">health</span> <span class="nv">sHealth</span></code></pre><p>So, at the end of it all, this isn't really a better abstraction. Remember my fantasy <code>!field</code> setter shorthand from half a blog post ago? Now perhaps it makes a little more sense. <em>If</em> such a syntax existed, then defining the updater would be incredibly simple:</p><pre><code class="pygments"><span class="nv">uHealth</span> <span class="nf">:</span> <span class="p">(</span><span class="kt">Int</span> <span class="nf">-></span> <span class="kt">Int</span><span class="p">)</span> <span class="nf">-></span> <span class="kt">PlayerStats</span> <span class="nf">-></span> <span class="kt">PlayerStats</span>
<span class="nv">uHealth</span> <span class="nf">=</span> <span class="nv">update</span> <span class="nf">.</span><span class="nv">health</span> <span class="err">!</span><span class="nv">health</span></code></pre><p>Still, no syntax, no easy updaters, and by extension, no easy, declarative description of behavior without quite a bit of boilerplate.</p><h2><a name="conclusions-and-related-work"></a>Conclusions and related work</h2><p>Elm is a very promising language, and it seems to be in fairly rapid development. So far, its author, <a href="https://twitter.com/czaplic">Evan Czaplicki</a>, has taken a very cautious approach to implementing language features, especially potentially redundant ones. This caution is why things like operator slicing, "where" clauses, and special updater syntax have not yet made it into the language. Maybe at some point these will be deemed important enough to include, but for the time being, they've been excluded.</p><p>I obviously think that having this sort of thing is incredibly important to being able to write expressive code without a huge amount of overhead. However, I also do <em>not</em> want to give the impression that I think adding special setter syntax is the only way to do it.</p><p>Seasoned functional programmers will surely have noticed that many of these concepts sound a lot like lenses, and Elm actually already has a lens-like library authored by Evan himself, called <a href="https://github.com/evancz/focus">Focus</a>. This, however, does not actually solve the problem: it requires manual description of setters just like the purely function-based approach does. Really, lenses are just the logical next step in the line of abstraction I've already laid out above.</p><p>Interestingly, PureScript and Elm, the two Haskell-likes-on-the-frontend that I've paid attention to (though PureScript is much closer to Haskell than Elm), both have this very same problem. Haskell itself solves it with macros via Template Haskell. 
My favorite language, Racket, solves it with its own macro system. Is there another way to do these things that doesn't involve introducing a heavyweight macro system? Definitely. But I think this is a <em>necessary feature</em>, not a "nice to have", so if a macro system is out of the picture, then a simpler, less flexible solution is the obvious logical alternative.</p><p>I really like Elm, and most of my experiences with it have been more than enough to convince me that it is a fantastic language for the job. Unfortunately, the issue of functional record updaters has been quite the frustrating obstacle in my otherwise frictionless ride. I will continue to happily use Elm over other, far less accommodating tools, but I hope that issues like these will be smoothed out as the language and its ecosystem matures.</p><ol class="footnotes"></ol></article>Canonical factories for testing with factory_girl_api2015-09-23T00:00:00Z2015-09-23T00:00:00ZAlexis King<article><p>Modern web applications are often built as <em>single-page apps</em>, which are great for keeping concerns separated, but problematic when tested. Logic needs to be duplicated in front- and back-end test suites, and if the two apps diverge, the tests won't catch the failure. I haven't found a very good solution to this problem aside from brittle, end-to-end integration tests.</p><p>To attempt to address a fraction of this problem, I built <a href="https://github.com/lexi-lambda/factory_girl_api">factory_girl_api</a>, a way to share context setup between both sides of the application.</p><h2><a name="a-brief-overview-of-factory-girl"></a>A brief overview of factory_girl</h2><p>In the land of Ruby and Rails, <a href="https://github.com/thoughtbot/factory_girl">factory_girl</a> is a convenient gem for managing factories for models. Out of the box, it integrates with Rails' default ORM, ActiveRecord, and provides declarative syntax for describing what attributes factories should initialize. 
For example, a factory declaration used to create a widget might look like this:</p><pre><code class="pygments"><span class="no">FactoryGirl</span><span class="o">.</span><span class="n">define</span> <span class="k">do</span>
  <span class="n">factory</span> <span class="ss">:widget</span> <span class="k">do</span>
    <span class="n">sequence</span><span class="p">(</span><span class="ss">:name</span><span class="p">)</span> <span class="p">{</span> <span class="o">|</span><span class="nb">id</span><span class="o">|</span> <span class="s1">'Widget #'</span> <span class="o">+</span> <span class="nb">id</span><span class="o">.</span><span class="n">to_s</span> <span class="p">}</span>
    <span class="n">price</span> <span class="mi">10</span>
    <span class="n">trait</span> <span class="ss">:expensive</span> <span class="k">do</span>
      <span class="n">price</span> <span class="mi">1000</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span></code></pre><p>This makes it easy to create new instances of <code>Widget</code> and use them for unit tests. For example, this would create and persist a widget with a unique name and a price of 10 units:</p><pre><code class="pygments"><span class="n">widget</span> <span class="o">=</span> <span class="no">FactoryGirl</span><span class="o">.</span><span class="n">create</span> <span class="ss">:widget</span></code></pre><p>We can also create more expensive widgets by using the <code>:expensive</code> trait.</p><pre><code class="pygments"><span class="n">expensive_widget</span> <span class="o">=</span> <span class="no">FactoryGirl</span><span class="o">.</span><span class="n">create</span> <span class="ss">:widget</span><span class="p">,</span> <span class="ss">:expensive</span></code></pre><p>Any number of traits can be specified at once. Additionally, it is possible to override individual attributes manually.</p><pre><code class="pygments"><span class="n">fancy_widget</span> <span class="o">=</span> <span class="no">FactoryGirl</span><span class="o">.</span><span class="n">create</span> <span class="ss">:widget</span><span class="p">,</span> <span class="ss">:expensive</span><span class="p">,</span> <span class="nb">name</span><span class="p">:</span> <span class="s1">'Fancy Widget'</span></code></pre><p>It works well, and it keeps initialization boilerplate out of individual tests.</p><h2><a name="testing-on-the-front-end"></a>Testing on the front-end</h2><p>Trouble arises when we need to write tests for the JavaScript application that use the same models. Suddenly, we need to duplicate the same kind of logic in our front-end tests. We might start out by setting up object state manually:</p><pre><code class="pygments"><span class="kd">var</span> <span class="nx">fancyWidget</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Widget</span><span class="p">({</span>
  <span class="nx">name</span><span class="o">:</span> <span class="s1">'Fancy Widget'</span><span class="p">,</span>
  <span class="nx">price</span><span class="o">:</span> <span class="mi">1000</span>
<span class="p">});</span></code></pre><p>Things can quickly get out of hand when models grow complex. Even if we use a factory library in JavaScript, it's possible for our front-end factories to diverge from their back-end counterparts. This means our integration tests will fail, but our unit tests will still blindly pass. Having to duplicate all that logic in two places is dangerous. It would be nice to have a <em>single, canonical source</em> for all of our factories.</p><h3><a name="reusing-server-side-factories-with-factory-girl-api"></a>Reusing server-side factories with factory_girl_api</h3><p>To help alleviate this problem, I created the <a href="https://github.com/lexi-lambda/factory_girl_api">factory_girl_api</a> gem for Rails and the <a href="https://github.com/lexi-lambda/angular-factory-girl-api">angular-factory-girl-api</a> Bower package for Angular. These packages cooperate with each other to allow server-side factories to be used in JavaScript tests.</p><p>The Angular module provides a service with syntax comparable to factory_girl itself. Both traits and custom attributes are supported:</p><pre><code class="pygments"><span class="nx">FactoryGirl</span><span class="p">.</span><span class="nx">create</span><span class="p">(</span><span class="s1">'widget'</span><span class="p">,</span> <span class="s1">'expensive'</span><span class="p">,</span> <span class="p">{</span> <span class="nx">name</span><span class="o">:</span> <span class="s1">'Fancy Widget'</span> <span class="p">});</span></code></pre><p>In this case, however, a round-trip API call must be made to the server in order to call the factory and return the result. 
Because of this, the Angular version of FactoryGirl returns a promise that is resolved with the serialized version of the model, which can then be used as sample data in unit tests.</p><h3><a name="the-problems-with-relying-on-the-server-for-data"></a>The problems with relying on the server for data</h3><p>In my preliminary use of this tool, it works. In many ways, it's much nicer than duplicating logic in both places. However, I'm not <em>completely</em> convinced it's the right solution yet.</p><p>First of all, it couples the front-end to the back-end, even during unit testing, which is disappointing. It means that a server needs to be running (in test mode) in order for the tests to run at all. For the kinds of projects I work on, this isn't really a bad thing, and the benefits of the reduced duplication far outweigh the downsides.</p><p>My real concern is that this solves a very small facet of the general problem with fragile front-end test suites. Single-page applications usually depend wholly on their integration with back-end APIs. If those APIs change, the tests will continue to happily pass as long as the API is simply mocked, which seems to be the usual solution in the front-end universe. This is, frankly, unacceptable in real application development.</p><h3><a name="potential-improvements-and-other-paths-to-success"></a>Potential improvements and other paths to success</h3><p>I am ultimately unsatisfied with this approach, but writing brittle end-to-end integration tests is not the solution. This <em>kind</em> of thing may be a step in the right direction: writing tests that aren't really pure unit tests, but also aren't fragile full-stack integration tests. This is a middle-ground that seems infrequently traveled, perhaps due to a lack of tooling (or perhaps because it just doesn't work). I don't know.</p><p>Either way, I'm interested in where this is headed, and I'll be curious to see if I run into any roadblocks using the workflow I've created. 
If anyone else is interested in playing with these two libraries, the READMEs are much more comprehensive than what I've covered here. Take a look, and give them a spin!</p><ul><li><p><a href="https://github.com/lexi-lambda/factory_girl_api">factory_girl_api</a></p></li><li><p><a href="https://github.com/lexi-lambda/angular-factory-girl-api">angular-factory-girl-api</a></p></li></ul><ol class="footnotes"></ol></article>Managing application configuration with Envy2015-08-30T00:00:00Z2015-08-30T00:00:00ZAlexis King<article><p>Application configuration can be a pain. Modern web apps don't live on dedicated boxes, they run on VPSes somewhere in the amorphous "cloud", and keeping configuration out of your application's repository can seem like more trouble than it's worth. Fortunately, <a href="http://12factor.net">The Twelve-Factor App</a> provides a set of standards for keeping web apps sane, and <a href="http://12factor.net/config">one of those guidelines advises keeping configuration in the environment</a>.</p><p><a href="https://github.com/lexi-lambda/envy">Envy</a> is the declarative bridge between Racket code and the outside world of the environment.</p><h2><a name="introducing-envy"></a>Introducing Envy</h2><p>I built Envy to distill the common tasks needed when working with environment variables into a single, declarative interface that eliminates boilerplate and makes it easy to see which environment variables an application depends on (instead of having them littered throughout the codebase). Using it is simple. Just require <code>envy</code> and you're good to go.</p><p>The best way to use Envy is to create a "manifest" module that declares all the environment variables your application might use. For example, the following module is a manifest that describes an application that uses three environment variables:</p><pre><code class="pygments"><span class="c1">; environment.rkt</span>
<span class="kn">#lang </span><span class="nn">typed/racket/base</span>
<span class="p">(</span><span class="k">require</span> <span class="n">envy</span><span class="p">)</span>
<span class="p">(</span><span class="n">define/provide-environment</span>
<span class="n">api-token</span>
<span class="p">[</span><span class="n">log-level</span> <span class="n">:</span> <span class="n">Symbol</span> <span class="kd">#:default</span> <span class="o">'</span><span class="ss">info</span><span class="p">]</span>
<span class="p">[</span><span class="n">parallel?</span> <span class="n">:</span> <span class="n">Boolean</span><span class="p">])</span></code></pre><p>When this module is required, Envy will automatically do the following:</p><ol><li><p>Envy will check the values of three environment variables: <code>API_TOKEN</code>, <code>LOG_LEVEL</code>, and <code>PARALLEL</code>.</p></li><li><p>If either <code>API_TOKEN</code> or <code>PARALLEL</code> is not set, an error will be raised:</p><pre><code>envy: The required environment variable "API_TOKEN" is not defined.
</code></pre></li><li><p>The values for <code>LOG_LEVEL</code> and <code>PARALLEL</code> will be parsed to match their type annotations.</p></li><li><p>If <code>LOG_LEVEL</code> is not set, it will use the default value, <code>'info</code>.</p></li><li><p>The values will be stored in <code>api-token</code>, <code>log-level</code>, and <code>parallel?</code>, all of which will be provided by the enclosing module.</p></li></ol><p>Now just <code>(require (prefix-in env: "environment.rkt"))</code>, and the environment variables are guaranteed to be available in your application's code.</p><h2><a name="working-with-typed-racket"></a>Working with Typed Racket</h2><p>As you may have noticed by the example above, Envy is built with Typed Racket in mind. In fact, <code>define/provide-environment</code> will <em>only</em> work within a Typed Racket module, but that doesn't mean Envy can't be used with plain Racket—the manifest module can always be required by any kind of Racket module.</p><p>However, when using Typed Racket, Envy provides additional bonuses. Environment variables are inherently untyped—they're all just strings—but Envy assigns the proper type to each environment variable automatically, so no casting is necessary.</p><pre><code>> parallel?
- : Boolean
#t
</code></pre><p>Envy really shines when using optional environment variables with the <code>#:default</code> option. The type of the value given to <code>#:default</code> doesn't need to be the same as the type of the environment variable itself, and if it isn't, Envy will assign the value a union type.</p><pre><code>> (define-environment
    [num-threads : Positive-Integer #:default #f])
> num-threads
- : (U Positive-Integer #f)
#f
</code></pre><p>This added level of type-safety means it's easy to manage optional variables that don't have reasonable defaults: the type system will enforce that all code considers the possibility that such variables do not exist.</p><h2><a name="and-more"></a>And more...</h2><p>To see the full set of features that Envy already provides, <a href="https://lexi-lambda.github.io/envy/envy.html">take a look at the documentation</a>. That said, this is just the first release based on my initial use-cases, but I'm sure there are more features Envy could have to accommodate common application configuration patterns. If you have an idea that could make Envy better, <a href="https://github.com/lexi-lambda/envy/issues">open an issue and make a suggestion</a>! I already have plans for a <code>#lang envy</code> DSL, which will hopefully cut the boilerplate out in its entirety.</p><p>And finally, to give credit where credit is due, Envy is heavily inspired by <a href="https://github.com/eval/envied">Envied</a> (both in name and function), an environment variable manager for Ruby, which I've used to great effect.</p><p>Try it out!</p><ul><li><p><code>raco pkg install envy</code></p></li><li><p><a href="https://github.com/lexi-lambda/envy">Envy on GitHub</a></p></li><li><p><a href="https://lexi-lambda.github.io/envy/envy.html">Envy documentation</a></p></li></ul><ol class="footnotes"></ol></article>Deploying Racket applications on Heroku2015-08-22T00:00:00Z2015-08-22T00:00:00ZAlexis King<article><p><a href="https://www.heroku.com">Heroku</a> is a "platform as a service" that provides an incredibly simple way to deploy simple internet applications, and I take liberal advantage of its free tier for testing out simple applications. It has support for a variety of languages built-in, but Racket is not currently among them. Fortunately, Heroku provides an interface for adding custom build processes for arbitrary types of applications, called “buildpacks”. 
I've built one for Racket apps, and with just a little bit of configuration, it’s possible to get a Racket webserver running on Heroku.</p><h2><a name="building-the-server"></a>Building the server</h2><p>Racket's <a href="http://docs.racket-lang.org/web-server/index.html">web-server</a> package makes building and running a simple server incredibly easy. Here's all the code we'll need to get going:</p><pre><code class="pygments"><span class="kn">#lang </span><span class="nn">racket</span>
<span class="p">(</span><span class="k">require</span> <span class="n">web-server/servlet</span>
<span class="n">web-server/servlet-env</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">start</span> <span class="n">req</span><span class="p">)</span>
<span class="p">(</span><span class="n">response/xexpr</span>
<span class="o">'</span><span class="p">(</span><span class="ss">html</span> <span class="p">(</span><span class="ss">head</span> <span class="p">(</span><span class="ss">title</span> <span class="s2">"Racket Heroku App"</span><span class="p">))</span>
<span class="p">(</span><span class="ss">body</span> <span class="p">(</span><span class="ss">h1</span> <span class="s2">"It works!"</span><span class="p">)))))</span>
<span class="p">(</span><span class="n">serve/servlet</span> <span class="n">start</span> <span class="kd">#:servlet-path</span> <span class="s2">"/"</span><span class="p">)</span></code></pre><p>Running the above file will start up the server on the default port, 8000. When running on Heroku, however, we're required to bind to the port that Heroku provides via the <code>PORT</code> environment variable. We can access this using the Racket <code>getenv</code> function.</p><p>Additionally, <code>serve/servlet</code> listens only on localhost by default, but Heroku needs the server to accept outside connections, so we need to pass <code>#f</code> for the <code>#:listen-ip</code> argument.</p><pre><code class="pygments"><span class="p">(</span><span class="k">define</span> <span class="n">port</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">getenv</span> <span class="s2">"PORT"</span><span class="p">)</span>
<span class="p">(</span><span class="nb">string->number</span> <span class="p">(</span><span class="nb">getenv</span> <span class="s2">"PORT"</span><span class="p">))</span>
<span class="mi">8080</span><span class="p">))</span>
<span class="p">(</span><span class="n">serve/servlet</span> <span class="n">start</span>
<span class="kd">#:servlet-path</span> <span class="s2">"/"</span>
<span class="kd">#:listen-ip</span> <span class="no">#f</span>
<span class="kd">#:port</span> <span class="n">port</span><span class="p">)</span></code></pre><p>Also, by default, <code>serve/servlet</code> will open a web browser automatically when the program is run, which is very useful for rapid prototyping within something like DrRacket, but we'll want to turn that off by passing <code>#t</code> for <code>#:command-line?</code>.</p><pre><code class="pygments"><span class="p">(</span><span class="n">serve/servlet</span> <span class="n">start</span>
<span class="kd">#:servlet-path</span> <span class="s2">"/"</span>
<span class="kd">#:listen-ip</span> <span class="no">#f</span>
<span class="kd">#:port</span> <span class="n">port</span>
<span class="kd">#:command-line?</span> <span class="no">#t</span><span class="p">)</span></code></pre><p>That's it! Now we have a Racket web server that can run on Heroku. Obviously it's not a very interesting application right now, but that's fine for our purposes.</p><h2><a name="setting-up-our-app-for-heroku"></a>Setting up our app for Heroku</h2><p>The next step is to actually create an app on Heroku. Don't worry—it's free! That said, explaining precisely how Heroku works is outside the scope of this article. Just make an account, then create an app. I called mine "racket-heroku-sample". Once you've created an app and set up Heroku's command-line tool, you can specify the proper buildpack:</p><pre><code class="pygments">$ git init
$ heroku git:remote -a racket-heroku-sample
$ heroku buildpacks:set https://github.com/lexi-lambda/heroku-buildpack-racket</code></pre><p>We'll also need to pick a particular Racket version before we deploy our app. At the time of this writing, Racket 6.2.1 is the latest version, so I just set the <code>RACKET_VERSION</code> environment variable as follows:</p><pre><code class="pygments">$ heroku config:set <span class="nv">RACKET_VERSION</span><span class="o">=</span><span class="m">6</span>.2.1</code></pre><p>Now there's just one thing left to do before we can push to Heroku: we need to tell Heroku what command to use to run our application. To do this, we use something called a "Procfile" that contains information about the process types for our app. Heroku supports multiple processes of different types, but we're just going to have a single web process.</p><p>Specifically, we just want to run our <code>server.rkt</code> module. The Racket buildpack installs the repository as a package, so we can run <code>racket</code> with the <code>-l</code> flag to specify a module path, which will be more robust than specifying a filesystem path directly. Therefore, our Procfile will look like this:</p><pre><code>web: racket -l sample-heroku-app/server
</code></pre><p>Now all that's left to do is push our repository to Heroku's git remote. Once the build completes, we can <a href="https://racket-heroku-sample.herokuapp.com">navigate to our app's URL and actually see it running live</a>!</p><h2><a name="conclusion"></a>Conclusion</h2><p>That's all that's needed to get a Racket app up and running on Heroku, but it probably isn't the best way to manage a real application. Usually it's best to use a continuous integration service to automatically deploy certain GitHub branches to Heroku, after running the tests, of course. Also, a real application would obviously be a little more complicated.</p><p>That said, this provides the foundation and shell. If you'd like to see the sample app used in this post, you can <a href="https://github.com/lexi-lambda/racket-sample-heroku-app">find it on GitHub here</a>. For more details on the buildpack itself, <a href="https://github.com/lexi-lambda/heroku-buildpack-racket">it's also available on GitHub here</a>.</p><ol class="footnotes"></ol></article>Automatically deploying a Frog-powered blog to GitHub pages2015-07-18T00:00:00Z2015-07-18T00:00:00ZAlexis King<article><p>So, I have a blog now. It's a simple static blog, but what's unique about it is that it's powered by Racket; specifically, it uses <a href="http://www.greghendershott.com">Greg Hendershott</a>'s fantastic <a href="https://github.com/greghendershott/frog">Frog</a> tool. I've taken this and moulded it to my tastes to build my blog, including configuring automatic deployment via <a href="https://travis-ci.org">Travis CI</a>, so my blog is always up-to-date.</p><h2><a name="setting-up-frog"></a>Setting up Frog</h2><p>I should note that Frog itself was wonderfully easy to drop in and get running. Just following the readme, a simple <code>raco pkg install frog</code> followed by <code>raco frog --init</code> and <code>raco frog -bp</code> created a running blog and opened it in my web browser. 
There was nothing more to it. Once that's done, all it takes to write a blog post is <code>raco frog -n "Post Title"</code>, and you're good to go.</p><p>By default, Frog uses Bootstrap, which provides a lot of the necessary scaffolding for you, but I opted to roll my own layout using flexbox. I also decided to use <a href="http://sass-lang.com">Sass</a> for my stylesheets, potentially with support for <a href="http://coffeescript.org">CoffeeScript</a> later, so I wanted to have a good flow for compiling all the resources for deployment. To do that, I used <a href="http://gulpjs.com">Gulp</a> in conjunction with <a href="https://www.npmjs.com">NPM</a> for build and dependency management.</p><p>Going this route has a few advantages, primarily the fact that updating dependencies becomes much easier, and I can build and deploy my blog with just a couple of commands without needing to commit compiled or minified versions of my sources to version control.</p><h2><a name="configuring-automatic-deployment-with-travis"></a>Configuring automatic deployment with Travis</h2><p>Once Frog itself was configured and my styling was finished, I started looking into how to deploy my blog to a GitHub page without needing to check in any of the generated files to source control. I found a couple of resources, the most useful one being <a href="https://gist.github.com/domenic/ec8b0fc8ab45f39403dd">this Gist</a>, which describes how to set up deployment for any project. The basic idea is to create a deployment script which will automatically generate your project, initialize a git repository with the generated files, and push to GitHub's special <code>gh-pages</code> branch.</p><p>To make this easy, Frog can be configured to output to a separate directory via the <code>.frogrc</code> configuration file. I chose to output to the <code>out</code> directory:</p><pre><code>output-dir = out
</code></pre><p>I also configured my Gulp build to output my CSS into the same output directory. Now, all that's necessary in order to deploy the blog to GitHub is to initialize a Git repository in the output directory, and push the files to the remote branch.</p><pre><code>$ cd out
$ git init
$ git add .
$ git commit -m "Deploy to GitHub Pages"
$ git push --force "$REMOTE_URL" master:gh-pages
</code></pre><p>The next step is to configure Travis so that it can securely push to the GitHub repository with the required credentials. This can be done with Travis's <a href="http://docs.travis-ci.com/user/encryption-keys/">encryption keys</a> along with a GitHub <a href="https://github.com/settings/tokens">personal access token</a>. Just install the Travis CLI client, copy the access token, and run a command:</p><pre><code>$ gem install travis
$ travis encrypt GH_TOKEN=<access token...>
</code></pre><p>The output of that command is an encrypted value to be placed in an environment variable in the project's <code>.travis.yml</code> configuration file. The URL for the repository on GitHub will also need to be specified:</p><pre><code class="pygments"><span class="nt">env</span><span class="p">:</span>
  <span class="nt">global</span><span class="p">:</span>
    <span class="p p-Indicator">-</span> <span class="nt">GH_REF</span><span class="p">:</span> <span class="s">'github.com/<gh-username>/<gh-repo>.git'</span>
    <span class="p p-Indicator">-</span> <span class="nt">secure</span><span class="p">:</span> <span class="l l-Scalar l-Scalar-Plain"><encrypted data...></span></code></pre><p>Now all that's left is configuring the <code>.travis.yml</code> to run Frog. Since Travis doesn't natively support Racket at the time of this writing, the choice of "language" is somewhat arbitrary, but since I want Pygments installed for syntax highlighting, I set my project type to <code>python</code>, then installed Racket and Frog as pre-installation steps.</p><pre><code class="pygments"><span class="nt">env</span><span class="p">:</span>
  <span class="nt">global</span><span class="p">:</span>
    <span class="p p-Indicator">-</span> <span class="nt">GH_REF</span><span class="p">:</span> <span class="s">'github.com/<gh-username>/<gh-repo>.git'</span>
    <span class="p p-Indicator">-</span> <span class="nt">secure</span><span class="p">:</span> <span class="l l-Scalar l-Scalar-Plain"><encrypted data...></span>
    <span class="p p-Indicator">-</span> <span class="nt">RACKET_DIR</span><span class="p">:</span> <span class="s">'~/racket'</span>
    <span class="p p-Indicator">-</span> <span class="nt">RACKET_VERSION</span><span class="p">:</span> <span class="s">'6.2'</span>
<span class="nt">before_install</span><span class="p">:</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">git clone https://github.com/greghendershott/travis-racket.git</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">cat travis-racket/install-racket.sh | bash</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">export PATH="${RACKET_DIR}/bin:${PATH}"</span>
<span class="nt">install</span><span class="p">:</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">raco pkg install --deps search-auto frog</span></code></pre><p>(It might be worth noting that Greg Hendershott <em>also</em> maintains the repository that contains the above Travis build script!)</p><p>Finally, in my case, I wasn't deploying to a project-specific GitHub page. Instead, I wanted to deploy to my user page, which uses <code>master</code>, not <code>gh-pages</code>. Obviously, I didn't want Travis running on my <code>master</code> branch, since it would be deploying to that, so I added a branch whitelist:</p><pre><code class="pygments"><span class="nt">branches</span><span class="p">:</span>
  <span class="nt">only</span><span class="p">:</span>
    <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">source</span></code></pre><p>All that was left to do was to write up the actual deployment script to be used by Travis. Based on the one provided in the above Gist, mine looked like this:</p><pre><code class="pygments"><span class="ch">#!/bin/bash</span>
<span class="nb">set</span> -ev <span class="c1"># exit with nonzero exit code if anything fails</span>
<span class="c1"># clear the output directory</span>
rm -rf out <span class="o">||</span> <span class="nb">exit</span> <span class="m">0</span><span class="p">;</span>
<span class="c1"># build the blog files + install pygments for highlighting support</span>
npm install
npm run build
pip install pygments
raco frog --build
<span class="c1"># go to the out directory and create a *new* Git repo</span>
<span class="nb">cd</span> out
git init
<span class="c1"># inside this git repo we'll pretend to be a new user</span>
git config user.name <span class="s2">"Travis CI"</span>
git config user.email <span class="s2">"<your@email.here>"</span>
<span class="c1"># The first and only commit to this new Git repo contains all the</span>
<span class="c1"># files present with the commit message "Deploy to GitHub Pages".</span>
git add .
git commit -m <span class="s2">"Deploy to GitHub Pages"</span>
<span class="c1"># Force push from the current repo's master branch to the remote</span>
<span class="c1"># repo. (All previous history on the branch will be lost, since we are</span>
<span class="c1"># overwriting it.) We redirect any output to /dev/null to hide any sensitive</span>
<span class="c1"># credential data that might otherwise be exposed.</span>
git push --force --quiet <span class="s2">"https://</span><span class="si">${</span><span class="nv">GH_TOKEN</span><span class="si">}</span><span class="s2">@</span><span class="si">${</span><span class="nv">GH_REF</span><span class="si">}</span><span class="s2">"</span> master > /dev/null <span class="m">2</span>><span class="p">&</span><span class="m">1</span></code></pre><p>For reference, my final <code>.travis.yml</code> looked like this:</p><pre><code class="pygments"><span class="nt">language</span><span class="p">:</span> <span class="l l-Scalar l-Scalar-Plain">python</span>
<span class="nt">python</span><span class="p">:</span>
  <span class="p p-Indicator">-</span> <span class="s">'3.4'</span>
<span class="nt">branches</span><span class="p">:</span>
  <span class="nt">only</span><span class="p">:</span>
    <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">source</span>
<span class="nt">env</span><span class="p">:</span>
  <span class="nt">global</span><span class="p">:</span>
    <span class="p p-Indicator">-</span> <span class="nt">GH_REF</span><span class="p">:</span> <span class="s">'github.com/lexi-lambda/lexi-lambda.github.io.git'</span>
    <span class="p p-Indicator">-</span> <span class="nt">secure</span><span class="p">:</span> <span class="l l-Scalar l-Scalar-Plain"><long secure token...></span>
    <span class="p p-Indicator">-</span> <span class="nt">RACKET_DIR</span><span class="p">:</span> <span class="s">'~/racket'</span>
    <span class="p p-Indicator">-</span> <span class="nt">RACKET_VERSION</span><span class="p">:</span> <span class="s">'6.2'</span>
<span class="nt">before_install</span><span class="p">:</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">git clone https://github.com/greghendershott/travis-racket.git</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">cat travis-racket/install-racket.sh | bash</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">export PATH="${RACKET_DIR}/bin:${PATH}"</span>
<span class="nt">install</span><span class="p">:</span>
  <span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">raco pkg install --deps search-auto frog</span>
<span class="nt">script</span><span class="p">:</span> <span class="l l-Scalar l-Scalar-Plain">bash ./deploy.sh</span></code></pre><p>That's it! Now I have a working blog that I can publish just by pushing to the <code>source</code> branch on GitHub.</p><ol class="footnotes"></ol></article>