The first language I used to do purely functional programming is Scala, which I learned when I was at Facebook, and I absolutely loved it. Not long afterwards I started to learn Haskell, and sure enough, Scala quickly became the second favorite. Haskell is simply too beautiful. This motivated me to join Formation in April last year. Being paid to write Haskell on a daily basis feels like a dream, in the sense that my work is now my hobby. As a result, I’ll start to write more Haskell posts here than Scala ones. I still like Scala but I haven’t touched it in a while and I’m fairly rusty at this point.
The style of my posts, however, will remain unchanged: I’m committed to making them accessible to beginner-level functional programmers. To this end, there will continue to be plenty of examples that are as simple as I can make them, as well as explanation of the intuition.
In this post I’d like to introduce a neat hack in Haskell: defunctionalization for type families. I first learned about this technique from Sandy Maguire’s book “Thinking with Types”, although I’m going to discuss a slightly simplified approach in this post. It is a hack because it is a workaround for not having unsaturated type families or type-level lambdas in Haskell. Nonetheless, it is completely safe (unlike many other hacks we’ve heard of), and can be super useful in type-level programming, so I think it is definitely worth knowing.
Defunctionalization, as explained in Wikipedia, is a technique for eliminating higher-order functions. And what we want to do is basically
to use it to eliminate higher-order functions at type-level. Why do we want to do that? Because although Haskell is a functional language, and ordinary functions (also referred to as term-level functions or value-level functions) are first class (i.e., can be assigned to variables, passed as arguments to functions and returned from functions), type-level functions (i.e., type families) are not first class, and higher-order type-level functions are not supported. So, whenever we want to write
or use a higher-order type-level function (such as the type-level
fmap), we must use this defunctionalization thing to rewrite it to something equivalent and supported.
Let’s introduce some basic concepts first.
Type-level functions in Haskell are called “type families”. Type families can be defined in a few different ways, such as closed type families and open type families. There is a third way called associated type families, which are type-level functions defined within type classes. It is more commonly used than the first two, but is less relevant to our topic of defunctionalization so we omit it here.
Closed Type Families
In a closed type family, the entire type-level function is defined the moment you
introduce the type family, under the
where clause. For example,
import Data.Kind (Constraint, Type) type family Foo (x :: Type) (y :: Type) :: Type where Foo Int Bool = Double Foo Int Double = Maybe Char Foo String (Maybe a) = Either Int a Foo String (Maybe Int) = Maybe String -- Warning: overlapped with the line above
This is a type-level function that takes two arguments, both of kind
Type, and returns
a result, also of kind
Type. For example, it maps
(Int, Bool) to
Double. So now you can
x = 4.2 :: Foo Int Bool, where
Foo Int Bool becomes a synonym of
Equations in a closed type family are applied top-down, so
the last equation above, involving
Foo String (Maybe Int), causes a GHC warning, since
it overlaps with the line above and is unreachable. However, if we swap the last two
equations, then the warning disappears, since
Maybe a is more general than
Note that you don’t need to make a closed type family definition “total” as you would
for term-value functions, because the compiler will automatically add an equation
Foo a b = Foo a b in the end, if such an equation isn’t explicitly defined.
Of course, the kinds of the arguments and the result don’t have to be of kind
Type - they can be any
kind. For example, the following type family maps type constructors to constraint constructors:
type family Bar (x :: Type -> Type) (y :: Type -> Type -> Type) :: (Type -> Type) -> Constraint where Bar Maybe Either = Functor Bar  (->) = Applicative
Bar are just for illustration, and are completely useless. Here’s a more useful
type family, taking two arguments of kind
Bool and returns their conjunction:
type family And (x :: Bool) (y :: Bool) :: Bool where And 'True y = y And 'False y = 'False
Also note that type constructors are a special case of type-level functions. Type constructors also
map types to types, but the result types are determined by the type constructors and can’t be changed.
For example, the
Maybe type constructor always maps
Open Type families
In an open type family, the type-level function is defined by
type instance declarations
which can be extended. For example,
type family Baz (x :: Type) (y :: Type) :: Type type instance Baz Int Bool = Double type instance Baz Int (Maybe a) = Either Int a
You can import the module that defines
Baz, and extend it by defining new type instances.
Because of that, there’s no such thing as “top-down” in open type families. Therefore, open type families cannot have overlapping type
instances. It is illegal to define
type instance Baz Int (Maybe Int) or
type instance Baz a b given
the type instances above.
Higher-order functions are functions that take functions as arguments, and/or return functions as results.
The following is a simple higher-order function. It takes a function of type
a -> a, and a value
a, and applies the function to the value twice.
twice :: (a -> a) -> a -> a twice f x = f (f x)
Haskell, being a purely functional language, relies heavily on higher-order functions. Without higher-order functions,
we can’t do things like
>>=, rendering the entire language almost useless. The same can be said about
type-level programming. Without higher-order type-level functions, type-level programming is still useful but the
usability is severely limited. We won’t be able to do type-level
fmap, or the type-level counterpart of the
function above. So, can we have higher-order type-level functions or not?
Higher-Order Type Families?
Unfortunately, such things don’t exist in Haskell. To prove it, let’s try to write a type-level
type family Twice (f :: Type -> Type) (x :: Type) :: Type where Twice f x = f (f x)
So far so good. We’d expect that we can apply
(Foo Int) and
it would return
Maybe Char back. Unfortunately, it won’t be the case:
λ> x = Just 'a' :: Twice (Foo Int) Bool
This will not compile:
The type family ‘Foo’ should have 2 arguments, but has been given 1
The type family
Foo takes two
Types and returns a
Type, and it refuses to take less than two arguments. For example, we can’t
Foo Int as a type-level function from
Type, or treat
Foo itself as
a type-level function from
(Type, Type) to
Type. Occurrences of
Foo must simply be accompanied
by two arguments. In other words, type families in Haskell cannot be partially applied (or “unsaturated”). This is the case
with both open and close type families.
The reason unsaturated type families are not allowed has to do with the fact that type families in Haskell are, in general, not injective or generative, where:
F a ~ F b ⇒ a ~ b
F a ~ G b ⇒ F ~ G ∧ a ~ b
~ symbol denotes (extensional) type equality, so
a ~ b means a value of type
can be used in any context that expects a value of type
GHC’s type inference relies on the assumptions of injectivity and generativity to work, and unsaturated type
families violate these assumptions. For example, if
m a ~ Maybe Int, and
m is possibly
an unsaturated type family, GHC won’t be able to infer
m ~ Maybe or
a ~ Int.
This means there’s no way to apply
Twice to a type family. We can, however,
apply it to a type constructor, since type constructors are both injective
and generative, and thus can be partially applied:
λ> x = Left 42 :: Twice (Either Int) Bool λ> :t x x :: Either Int (Either Int Bool)
But this is not very useful, so it is a severe limitation. And of course, attempting to write a type-level lambda:
type family TwiceCurried (x :: Type) :: Type -> Type where TwiceCurried f = \x -> f (f x)
will fail miserably, since there’s no such thing (yet) as type-level lambdas in Haskell.
A type family returning a
Type -> Type can only return type constructors, e.g.,
Ok, so defunctionalization to the rescue. The key idea here is to take advantage of
type constructors, since as just mentioned, type constructors, unlike type families,
can be partially applied, and you can pass an unsaturated type constructor to a higher-order type family (such as
So the plan of attack is this: we are going to encode the type-level function, which we intend to pass to a higher-order type family, as a type constructor. Then, we create another type family to evaluate the type constructor (i.e., map it to the correct types). Finally, we modify our higher-order type family to use the evaluator.
If this sounds confusing, our running example with
Foo should help clarify things. We start by turning
Foo into a type constructor:
data Foo' (x :: Type) (y :: Type)
The type family
Foo is a type-level function from
(Type, Type) to
Type, so we make
two type parameters, corresponding to the arguments to
Foo. So the kind of both
Type -> Type -> Type.
There’s no need for
Foo' to have data constructors since we won’t need to create values of
We can also define
Foo' using GADT syntax, i.e.,
data Foo' :: Type -> Type -> Type
This way we don’t need to name the type parameters, which may or may not be desirable.
Now we need to define the equations in the
Foo type family. Since
(Int, Bool) to
Double, we need a way to map
Foo' Int Bool to
Double. This requires a
type-level function, so we can use an open type family, like this:
type family Eval (x :: Type) :: a type instance Eval (Foo' Int Bool) = Double type instance Eval (Foo' Int Double) = Maybe Char -- the rest is omitted
This is the evaluator we mentioned before.
And then we modify the definition of
Twice to use
type family Twice' (f :: Type -> Type) (x :: Type) :: Type where Twice' f x = Eval (f (Eval (f x))) -- Instead of f (f x)
Now we can pass
Twice' to get what we want:
λ> x = Just 'a' :: Twice' (Foo' Int) Bool λ> :t x x :: Maybe Char
As you can see, now we are passing an unsaturated type constructor (
Foo' Int) rather than
an unsaturated type family to
Twice', so we are good.
Alternatively, we can also turn
Twice' into a type constructor, and use
Eval to get
the same result. This would enable us to partially apply
Twice' as well.
As mentioned in the beginning of this post, this is a simplified approach compared to
the approach introduced in “Thinking with Types”. In the latter approach, there is an
additional type parameter for the type constructors, representing the result of the type-level
Foo' would be defined like this:
data Foo' (x :: Type) (y :: Type) (res :: Type)
This is also the approach adopted in the first-class-families library.
One obvious benefit of this approach is more clarity and safety, since
Foo' now explicitly
declares that the result of the type-level function should have kind
Type. The difference
is more pronounced in the type-level
map function, where the corresponding type constructor
would look like the following:
data MapList :: (a -> Exp b) -> [a] -> Exp [b]
type Exp a = a -> Type
First, it correlates nicely with the signature of the term-level
map function. Second, evaluating
(a -> Exp b) and
[a], must return
Exp [b]; it is illegal to return
a different type.
On the other hand, in the
approach I introduced earlier,
MapList would look like:
data MapList :: (a -> Type) -> [a] -> Type
which works, but is much less desirable.
Also, explicitly declaring the kind of the result may (I don’t know) improve error messages in some cases. I’ll leave it to the reader to determine whether there are further benefits.
Type-level programming in Haskell is generally considered a double-edged sword. It can greatly improve type safety of your code and solve certain problems beautifully, but at this time, Haskell supports nowhere near the full capability of System Fω, there are plenty of gotchas, error messages can sometimes be unreadable, and on top of all that, the learning curve is pretty steep. So, “do it in moderation” seems to be pretty sound advice.