An introduction to typeclass metaprogramming

Typeclass metaprogramming is a powerful technique available to Haskell programmers to automatically generate term-level code from static type information. It has been used to great effect in several popular Haskell libraries (such as the servant ecosystem), and it is the core mechanism used to implement generic programming via GHC generics. Despite this, remarkably little material exists that explains the technique, relegating it to folk knowledge known only to advanced Haskell programmers.

This blog post attempts to remedy that by providing an overview of the foundational concepts behind typeclass metaprogramming. It does not attempt to be a complete guide to type-level programming in Haskell—such a task could easily fill a book—but it does provide explanations and illustrations of the most essential components. This is also not a blog post for Haskell beginners—familiarity with the essentials of the Haskell type system and several common GHC extensions is assumed—but it does not assume any prior knowledge of type-level programming.

Read more

Names are not type safety

Haskell programmers spend a lot of time talking about type safety. The Haskell school of program construction advocates “capturing invariants in the type system” and “making illegal states unrepresentable,” both of which sound like compelling goals, but are rather vague on the techniques used to achieve them. Almost exactly one year ago, I published Parse, Don’t Validate as an initial stab towards bridging that gap.

The ensuing discussions were largely productive and right-minded, but one particular source of confusion quickly became clear: Haskell’s newtype construct. The idea is simple enough—the newtype keyword declares a wrapper type, nominally distinct from but representationally equivalent to the type it wraps—and on the surface this sounds like a simple and straightforward path to type safety. For example, one might consider using a newtype declaration to define a type for an email address:

newtype EmailAddress = EmailAddress Text

This technique can provide some value, and when coupled with a smart constructor and an encapsulation boundary, it can even provide some safety. But it is a meaningfully distinct kind of type safety from the one I highlighted a year ago, one that is far weaker. On its own, a newtype is just a name.

And names are not type safety.

Read more

Types as axioms, or: playing god with static types

Just what exactly is a type?

A common perspective is that types are restrictions. Static types restrict the set of values a variable may contain, capturing some subset of the space of “all possible values.” Under this worldview, a typechecker is sort of like an oracle, predicting which values will end up where when the program runs and making sure they satisfy the constraints the programmer wrote down in the type annotations. Of course, the typechecker can’t really predict the future, so when the typechecker gets it wrong—it can’t “figure out” what a value will be—static types can feel like self-inflicted shackles.

But that is not the only perspective. There is another way—a way that puts you, the programmer, back in the driver’s seat. You make the rules, you call the shots, you set the objectives. You need not be limited any longer by what the designers of your programming language decided the typechecker can and cannot prove. You do not serve the typechecker; the typechecker serves you.

…no, I’m not trying to sell you a dubious self-help book for programmers who feel like they’ve lost control of their lives. If the above sounds too good to be true, well… I won’t pretend it’s all actually as easy as I make it sound. Nevertheless, it’s well within the reach of the working programmer, and most remarkably, all it takes is a change in perspective.

Read more

No, dynamic type systems are not inherently more open

Internet debates about typing disciplines continue to be plagued by a pervasive myth that dynamic type systems are inherently better at modeling “open world” domains. The argument usually goes like this: the goal of static typing is to pin everything down as much as possible, but in the real world, that just isn’t practical. Real systems should be loosely coupled and worry about data representation as little as possible, so dynamic types lead to a more robust system in the large.

This story sounds compelling, but it isn’t true. The flaw is in the premise: static types are not about “classifying the world” or pinning down the structure of every value in a system. The reality is that static type systems allow specifying exactly how much a component needs to know about the structure of its inputs, and conversely, how much it doesn’t. Indeed, in practice static type systems excel at processing data with only a partially-known structure, as they can be used to ensure application logic doesn’t accidentally assume too much.

Read more

Parse, don’t validate

Historically, I’ve struggled to find a concise, simple way to explain what it means to practice type-driven design. Too often, when someone asks me “How did you come up with this approach?” I find I can’t give them a satisfying answer. I know it didn’t just come to me in a vision—I have an iterative design process that doesn’t require plucking the “right” approach out of thin air—yet I haven’t been very successful in communicating that process to others.

However, about a month ago, I was reflecting on Twitter about the differences I experienced parsing JSON in statically- and dynamically-typed languages, and finally, I realized what I was looking for. Now I have a single, snappy slogan that encapsulates what type-driven design means to me, and better yet, it’s only three words long:

Parse, don’t validate.

Read more

Empathy and subjective experience in programming languages

A stereotype about programmers is that they like to think in black and white. Programmers like things to be good or bad, moral or immoral, responsible or irresponsible. Perhaps there is something romantic in the idea that programmers like to be as binary as the computers they program. Reductionist? Almost certainly, but hey, laugh at yourself a bit: we probably deserve to be made fun of from time to time.

Personally, I have no idea if the trope of the nuance-challenged programmer is accurate, but whether it’s a property of programmers or just humans behind a keyboard, the intensity with which we disagree with one another never ceases to amaze. Ask any group of working programmers what their least favorite programming language is, and there’s a pretty good chance things are going to get heated real fast. Why? What is it about programming that makes us feel so strongly that we are right and others are wrong, even when our experiences contradict those of tens or hundreds of thousands of others?

I think about that question a lot.

Read more

Demystifying MonadBaseControl

⦿ haskell

MonadBaseControl from the monad-control package is a confusing typeclass, and its methods have complicated types. For many people, it’s nothing more than scary, impossible-to-understand magic that is, for some reason, needed when lifting certain kinds of operations. Few resources exist that adequately explain how, why, and when it works, which sadly seems to have resulted in some FUD about its use.

There’s no doubt that the machinery of MonadBaseControl is complex, and the role it plays in practice is often subtle. However, its essence is actually much simpler than it appears, and I promise it can be understood by mere mortals. In this blog post, I hope to provide a complete survey of MonadBaseControl—how it works, how it’s designed, and how it can go wrong—in a way that is accessible to anyone with a firm grasp of monads and monad transformers. To start, we’ll motivate MonadBaseControl by reinventing it ourselves.

Read more

Defeating Racket’s separate compilation guarantee

⦿ racket, macros

Being a self-described programming-language programming language is an ambitious goal. To preserve predictability while permitting linguistic extension, Racket comes equipped with a module system carefully designed to accommodate composable and compilable macros. One of the module system’s foundational properties is its separate compilation guarantee, which imposes strong, unbreakable limits on the extent of compile-time side-effects. It is essential for preserving static guarantees in a world where compiling a module can execute arbitrary code, and despite numerous unsafe trapdoors that have crept into Racket since its birth as PLT Scheme, none have ever given the programmer the ability to cheat it.

Yet today, in this blog post, we’re going to do exactly that.

Read more

Macroexpand anywhere with local-apply-transformer!

⦿ racket, macros

Racket programmers are accustomed to the language’s incredible capacity for extension and customization. Writing useful macros that do complicated things is easy, and it’s simple to add new syntactic forms to meet domain-specific needs. However, it doesn’t take long before many budding macrologists bump into the realization that only certain positions in Racket code are subject to macroexpansion.

To illustrate, consider a macro that provides a Clojure-style let form:

(require syntax/parse/define)

(define-simple-macro (clj-let [{~seq x:id e:expr} ...] body:expr ...+)
  (let ([x e] ...) body ...))

This can be used anywhere an expression is expected, and it does as one would expect:

> (clj-let [x 1
            y 2]
    (+ x y))
3

However, a novice macro programmer might realize that clj-let really only modifies the syntax of binding pairs for a let form. Therefore, could one define a macro that only adjusts the binding pairs of some existing let form instead of expanding to an entire let? That is, could one write the above example like this:

(define-simple-macro (clj-binding-pairs [{~seq x:id e:expr} ...])
  ([x e] ...))

> (let (clj-binding-pairs
        [x 1
         y 2])
    (+ x y))
3

The answer is no: the binding pairs of a let form are not subject to macroexpansion, so the above attempt fails with a syntax error. In this blog post, we will examine the reasons behind this limitation, then explain how to overcome it using a solution that allows macroexpansion anywhere in a Racket program.

Read more

Custom core forms in Racket, part II: generalizing to arbitrary expressions and internal definitions

⦿ racket, macros

In my previous blog post, I covered the process involved in creating a small language with a custom set of core forms. Specifically, it discussed what was necessary to create Hackett’s type language, which involved expanding to custom expressions. While somewhat involved, Hackett’s type language was actually a relatively simple example to use, since it only made use of a subset of the linguistic features Racket supports. In this blog post, I’ll demonstrate how that same technique can be generalized to support runtime bindings and internal definitions, two key concepts useful if intending to develop a more featureful language than Hackett’s intentionally-restrictive type system.

Read more