The first language I used to do purely functional programming is Scala, which I learned when I was at Facebook, and I absolutely loved it. Not long afterwards I started to learn Haskell, and sure enough, Scala quickly became the second favorite. Haskell is simply too beautiful. This motivated me to join Formation in April last year. Being paid to write Haskell on a daily basis feels like a dream, in the sense that my work is now my hobby. As a result, I’ll start to write more Haskell posts here than Scala ones. I still like Scala but I haven’t touched it in a while and I’m fairly rusty at this point.

The style of my posts, however, will remain unchanged: I’m committed to making them accessible to beginner-level functional programmers. To this end, there will continue to be plenty of examples that are as simple as I can make them, as well as explanation of the intuition.

In this post I’d like to introduce a neat hack in Haskell: defunctionalization for type families. I first learned about this technique from Sandy Maguire’s book “Thinking with Types”, although I’m going to discuss a slightly simplified approach in this post. It is a hack because it is a workaround for not having unsaturated type families or type-level lambdas in Haskell. Nonetheless, it is completely safe (unlike many other hacks we’ve heard of), and can be super useful in type-level programming, so I think it is definitely worth knowing.

# TL;DR

Defunctionalization, as explained in Wikipedia, is a technique for eliminating higher-order functions. And what we want to do is basically
to use it to eliminate higher-order functions at type-level. Why do we want to do that? Because although Haskell is a functional language, and ordinary functions (also referred to as term-level functions or value-level functions) are first class (i.e., can be assigned to variables, passed as arguments to functions and returned from functions), type-level functions (i.e., type families) are *not* first class, and higher-order type-level functions are not supported. So, whenever we want to write
or use a higher-order type-level function (such as the type-level `fmap`

), we must use this defunctionalization thing to rewrite it to something equivalent and supported.

Let’s introduce some basic concepts first.

# Basic Concepts

## Type Families

Type-level functions in Haskell are called “type families”. Type families can be defined in a few different ways, such as closed type families and open type families. There is a third way called associated type families, which are type-level functions defined within type classes. It is more commonly used than the first two, but is less relevant to our topic of defunctionalization so we omit it here.

### Closed Type Families

In a closed type family, the entire type-level function is defined the moment you
introduce the type family, under the `where`

clause. For example,

```
import Data.Kind (Constraint, Type)
type family Foo (x :: Type) (y :: Type) :: Type where
Foo Int Bool = Double
Foo Int Double = Maybe Char
Foo String (Maybe a) = Either Int a
Foo String (Maybe Int) = Maybe String -- Warning: overlapped with the line above
```

This is a type-level function that takes two arguments, both of kind `Type`

, and returns
a result, also of kind `Type`

. For example, it maps `(Int, Bool)`

to `Double`

. So now you can
write `x = 4.2 :: Foo Int Bool`

, where `Foo Int Bool`

becomes a synonym of `Double`

.

Equations in a closed type family are applied top-down, so
the last equation above, involving `Foo String (Maybe Int)`

, causes a GHC warning, since
it overlaps with the line above and is unreachable. However, if we swap the last two
equations, then the warning disappears, since `Maybe a`

is more general than `Maybe Int`

.

Note that you don’t need to make a closed type family definition “total” as you would
for term-value functions, because the compiler will automatically add an equation
`Foo a b = Foo a b`

in the end, if such an equation isn’t explicitly defined.

Of course, the kinds of the arguments and the result don’t have to be of kind `Type`

- they can be any
kind. For example, the following type family maps type constructors to constraint constructors:

```
type family Bar (x :: Type -> Type) (y :: Type -> Type -> Type) :: (Type -> Type) -> Constraint where
Bar Maybe Either = Functor
Bar [] (->) = Applicative
```

Type families `Foo`

and `Bar`

are just for illustration, and are completely useless. Here’s a more useful
type family, taking two arguments of kind `Bool`

and returns their conjunction:

```
type family And (x :: Bool) (y :: Bool) :: Bool where
And 'True y = y
And 'False y = 'False
```

Also note that type constructors are a special case of type-level functions. Type constructors also
map types to types, but the result types are determined by the type constructors and can’t be changed.
For example, the `Maybe`

type constructor always maps `Int`

to `Maybe Int`

.

### Open Type families

In an open type family, the type-level function is defined by `type instance`

declarations
which can be extended. For example,

```
type family Baz (x :: Type) (y :: Type) :: Type
type instance Baz Int Bool = Double
type instance Baz Int (Maybe a) = Either Int a
```

You can import the module that defines `Baz`

, and extend it by defining new type instances.
Because of that, there’s no such thing as “top-down” in open type families. Therefore, open type families cannot have overlapping type
instances. It is illegal to define `type instance Baz Int (Maybe Int)`

or `type instance Baz a b`

given
the type instances above.

## Higher-Order Functions

Higher-order functions are functions that take functions as arguments, and/or return functions as results.

The following is a simple higher-order function. It takes a function of type `a -> a`

, and a value
of type `a`

, and applies the function to the value twice.

```
twice :: (a -> a) -> a -> a
twice f x = f (f x)
```

Haskell, being a purely functional language, relies heavily on higher-order functions. Without higher-order functions,
we can’t do things like `fmap`

or `>>=`

, rendering the entire language almost useless. The same can be said about
type-level programming. Without higher-order type-level functions, type-level programming is still useful but the
usability is severely limited. We won’t be able to do type-level `fmap`

, or the type-level counterpart of the `twice`

function above. So, can we have higher-order type-level functions or not?

## Higher-Order Type Families?

Unfortunately, such things don’t exist in Haskell. To prove it, let’s try to write a type-level
`twice`

function.

```
type family Twice (f :: Type -> Type) (x :: Type) :: Type where
Twice f x = f (f x)
```

So far so good. We’d expect that we can apply `Twice`

to `(Foo Int)`

and `Bool`

, and
it would return `Maybe Char`

back. Unfortunately, it won’t be the case:

```
λ> x = Just 'a' :: Twice (Foo Int) Bool
```

This will not compile:

```
The type family ‘Foo’ should have 2 arguments, but has been given 1
```

The type family `Foo`

takes two `Types`

and returns a `Type`

, and it refuses to take less than two arguments. For example, we can’t
treat `Foo Int`

as a type-level function from `Type`

to `Type`

, or treat `Foo`

itself as
a type-level function from `(Type, Type)`

to `Type`

. Occurrences of `Foo`

must simply be accompanied
by two arguments. In other words, type families in Haskell cannot be partially applied (or “unsaturated”). This is the case
with both open and close type families.

The reason unsaturated type families are not allowed has to do with the fact that type families in Haskell are, in general, not injective or generative, where:

- Injectivity:
`F a ~ F b ⇒ a ~ b`

- Generativity:
`F a ~ G b ⇒ F ~ G ∧ a ~ b`

The `~`

symbol denotes (extensional) type equality, so `a ~ b`

means a value of type `a`

can be used in any context that expects a value of type `b`

.

GHC’s type inference relies on the assumptions of injectivity and generativity to work, and unsaturated type
families violate these assumptions. For example, if `m a ~ Maybe Int`

, and `m`

is possibly
an unsaturated type family, GHC won’t be able to infer `m ~ Maybe`

or `a ~ Int`

.

This means there’s *no way* to apply `Twice`

to a type family. We can, however,
apply it to a type constructor, since type constructors are both injective
and generative, and thus *can* be partially applied:

```
λ> x = Left 42 :: Twice (Either Int) Bool
λ> :t x
x :: Either Int (Either Int Bool)
```

But this is not very useful, so it is a severe limitation. And of course, attempting to write a type-level lambda:

```
type family TwiceCurried (x :: Type) :: Type -> Type where
TwiceCurried f = \x -> f (f x)
```

will fail miserably, since there’s no such thing (yet) as type-level lambdas in Haskell.
A type family returning a `Type -> Type`

can only return type constructors, e.g., `Maybe`

.

# Defunctionalization

Ok, so defunctionalization to the rescue. The key idea here is to take advantage of
type constructors, since as just mentioned, type constructors, unlike type families,
*can* be partially applied, and you *can* pass an unsaturated type constructor to a higher-order type family (such as `Twice`

).

So the plan of attack is this: we are going to encode the type-level function, which we intend to
pass to a higher-order type family, as a type constructor. Then, we create another type family to
*evaluate* the type constructor (i.e., map it to the correct types). Finally, we modify
our higher-order type family to use the evaluator.

If this sounds confusing, our running example with `Twice`

and `Foo`

should help clarify things. We start by turning
`Foo`

into a type constructor:

```
data Foo' (x :: Type) (y :: Type)
```

The type family `Foo`

is a type-level function from `(Type, Type)`

to `Type`

, so we make `Foo'`

take
two type parameters, corresponding to the arguments to `Foo`

. So the kind of both `Foo`

and `Foo'`

is `Type -> Type -> Type`

.
There’s no need for `Foo'`

to have data constructors since we won’t need to create values of `Foo'`

.

We can also define `Foo'`

using GADT syntax, i.e.,

```
data Foo' :: Type -> Type -> Type
```

This way we don’t need to name the type parameters, which may or may not be desirable.

Now we need to define the equations in the `Foo`

type family. Since `Foo`

maps
`(Int, Bool)`

to `Double`

, we need a way to map `Foo' Int Bool`

to `Double`

. This requires a
type-level function, so we can use an open type family, like this:

```
type family Eval (x :: Type) :: a
type instance Eval (Foo' Int Bool) = Double
type instance Eval (Foo' Int Double) = Maybe Char
-- the rest is omitted
```

This is the evaluator we mentioned before.

And then we modify the definition of `Twice`

to use `Eval`

:

```
type family Twice' (f :: Type -> Type) (x :: Type) :: Type where
Twice' f x = Eval (f (Eval (f x))) -- Instead of f (f x)
```

Now we can pass `Foo'`

to `Twice'`

to get what we want:

```
λ> x = Just 'a' :: Twice' (Foo' Int) Bool
λ> :t x
x :: Maybe Char
```

As you can see, now we are passing an unsaturated type constructor (`Foo' Int`

) rather than
an unsaturated type family to `Twice'`

, so we are good.

Alternatively, we can also turn `Twice'`

into a type constructor, and use `Eval`

to get
the same result. This would enable us to partially apply `Twice'`

as well.

# Final Words

As mentioned in the beginning of this post, this is a simplified approach compared to
the approach introduced in “Thinking with Types”. In the latter approach, there is an
additional type parameter for the type constructors, representing the result of the type-level
function. So `Foo'`

would be defined like this:

```
data Foo' (x :: Type) (y :: Type) (res :: Type)
```

This is also the approach adopted in the first-class-families library.

One obvious benefit of this approach is more clarity and safety, since `Foo'`

now explicitly
declares that the result of the type-level function should have kind `Type`

. The difference
is more pronounced in the type-level `map`

function, where the corresponding type constructor
would look like the following:

```
data MapList :: (a -> Exp b) -> [a] -> Exp [b]
```

where

```
type Exp a = a -> Type
```

First, it correlates nicely with the signature of the term-level `map`

function. Second, evaluating
`MapList`

with `(a -> Exp b)`

and `[a]`

, must return `Exp [b]`

; it is illegal to return
a different type.

On the other hand, in the
approach I introduced earlier, `MapList`

would look like:

```
data MapList :: (a -> Type) -> [a] -> Type
```

which works, but is much less desirable.

Also, explicitly declaring the kind of the result *may* (I don’t know) improve error messages in some cases. I’ll leave
it to the reader to determine whether there are further benefits.

Type-level programming in Haskell is generally considered a double-edged sword. It can greatly improve type safety of your code and solve certain problems beautifully, but at this time, Haskell supports nowhere near the full capability of System Fω, there are plenty of gotchas, error messages can sometimes be unreadable, and on top of all that, the learning curve is pretty steep. So, “do it in moderation” seems to be pretty sound advice.