I was recently updating a work project’s user guide, and added a page that involves Template Haskell (TH). A reviewer noted that TH can be challenging to grasp and suggested adding pointers to learning resources. I’ve always found the trickiest part of TH to be the stage errors, which often confuse even experienced Haskell developers. I couldn’t find an existing TH tutorial that covers stage errors with the level of detail I was hoping for, so I decided to fill that gap myself.
Understanding stage errors in TH not only helps make sense of GHC compilation errors, but also helps one write correct TH code that produces the intended output. Moreover, this knowledge translates well to other languages with similar metaprogramming constructs.
This post starts with an explanation of stages and stage errors, and then applies that knowledge to demonstrate the implementation of the peel
function, which unrolls a given number of layers from an arbitrary recursive function at compile time.
Stage Errors Explained
TH revolves around two key concepts: quotations ([| ... |]
and look-alikes) convert code into abstract syntax trees (ASTs) for manipulation, while splices ($(...)
or $$(...)
) do the reverse - turning ASTs back into code that can be combined with the surrounding code.
This post focuses on typed expression quotations, [|| ... ||]
, which turns code into CodeQ a
, and typed expression splices, $$(...)
, which convert CodeQ a
back into code.
All of the following also applies to untyped expression quotations ([| ... |]
) and untyped splices ($(...)
).
The rules are similar but slightly different for other kinds of quotations - type, pattern and declaration quotations.
We won’t cover these to keep things short.
To understand stage errors, keep the following four rules about quotations and splices in mind:
1. Without TH, all expressions in the code are evaluated at runtime.
For example, writing x = 3 + 5
means 3 + 5
is computed at runtime1, when x
is evaluated.
2. With TH, expressions can be evaluated at different stages. Quotations delay the evaluation of quoted code relative to the surrounding code, while splices do the opposite.
Consider a top-level binding x = g $$(f 2)
. Here f 2
is evaluated at compile time - earlier than evaluating the application of g
.
The compiler evaluates f 2
, producing an AST, which the splice operator converts into code.
It then proceeds to compile x = g <code generated by f 2>
.
This is the whole point of TH: generating and manipulating code at compile time.
As a contrasting example, consider x = g [|| f 2 ||]
.
Now f 2
is inside a quotation, meaning it is evaluated later than the surrounding code.
Indeed, when the application of g
is evaluated, f 2
is not - the argument g
receives is, in effect, the AST representing f 2
.
f 2
is only evaluated if and when x
is eventually used in a splice.
3. Quotations and splices cancel each other out.
For example, foo f = [|| $$(f 2) ||]
is equivalent to foo f = f 2
, as is foo f = $$([|| f 2 ||])
.
4. Generally, a variable can only be used at the same level where it is bound. The level is determined by the number of surrounding quotations or splices, after adjacent ones cancel out.
Without quotations or splices, the code is at level 0.
Each quotation increases the level, while each splice decreases it2.
For instance, this is valid: foo f = [|| 3 + $(f 2) ||]
, because f
is bound at level 0 and used at level 0 (assuming no additional surrounding quotations or splices).
But the following aren’t valid:
foo f = $$(f 2)
, becausef
is bound at level 0 but used at level -1, meaning it is used before it is bound. The compiler needs to evaluatef 2
first, splice the resulting code intofoo
’s definition, and then continue compilingfoo f = <code generated by f 2>
. Butf
is a parameter that isn’t determined untilfoo
is called, making this impossible.foo = [|| \f -> $$(f 2) ||]
, becausef
is bound at level 1 but used at level 0.foo x = unsafeCodeCoerce (varE 'x)
. This is usually equivalent tofoo x = [|| x ||]
, but I’m avoiding[|| x ||]
here because there are cases where it is valid to writefoo x = [|| x ||]
, as I’ll explain shortly. This has the opposite problem:x
is bound at level 0 but used at level 1. It is problematic because when the result offoo x
is spliced into the surrounding code,x
no longer exists as a variable, sovarE 'x
no longer makes sense. Instead,x
has been replaced by a concrete argument, effectively becoming a literal3.
GHC has several variations of the stage error message. It may say “the non-top-level quoted name ‘x must be used at the same stage at which it is bound”, or “‘x’ is used in a top-level splice, quasi-quote, or annotation, and must be imported, not defined locally”, or “‘x’ is bound at stage 2 but used at stage 1”. These all result from breaking rule #4.
Exceptions to Rule #4
Rule #4 has two exceptions. First, it applies only within a single module. If an identifier is defined in another module and imported into the current module, then the restriction does not apply, and it can be used at any level.
In theory, an identifier should be usable at any level if its value is statically known. This holds for top-level identifiers (both defined in the same module and imported) and certain local identifiers. However, in practice, while top-level identifiers defined in the same module can be used at a later level (e.g., level 1), only imported identifiers can be used at an earlier level (e.g., level -1). This limitation is known as the stage restriction.
Second, and this explains why foo x = [|| x ||]
is sometimes allowed: it doesn’t compile this code into foo x = unsafeCodeCoerce (varE 'x)
, but instead into foo x = liftTyped x
.
Essentially, when foo
is applied to a concrete argument, the argument itself, rather than the variable x
, is converted into AST nodes.
For example, if foo
is applied to 2
, the generated AST contains LitE 2
instead of VarE 'x
.
This only works if the type of x
has a Lift
instance, which is usually true for types that don’t contain functions.
However, functions generally cannot be lifted.
An Example: The peel Function
Equipped with the understanding of stages and stage errors, we are ready to implement the peel
function.
It takes a number k
, and the step function of a recursive function, and generates a new recursive function with the first k
layers unrolled.
This is similar to loop unrolling, except we only unroll once, covering the first k
steps.
We’ll use the length
function as our running example, which computes the length of a list.
The standard way to define length
using a step function is:
length :: forall a. [a] -> Int
length = fix step
where
step = \self xs -> case xs of [] -> 0; y:ys -> 1 + self ys
If we peel 3 layers off, the definition should look like this:
-- Desired definition of `lengthPeeled`
lengthPeeled :: forall a. [a] -> Int
lengthPeeled xs =
case xs of
[] -> 0
y:ys -> 1 +
case ys of
[] -> 0
z:zs -> 1 +
case zs of
[] -> 0
w:ws -> 1 + fix step ws
where
step = -- same as above
lengthPeeled
is what we aim to generate using TH, because writing it by hand becomes tedious quickly for larger values of k
.
Attempt 1: no TH
Before diving into TH, let’s first question why we really need it. Could we just write it like this instead?
-- Attempt 1 - doomed to fail
peel_1 :: Int -> ((a -> b) -> a -> b) -> a -> b
peel_1 0 f = fix f
peel_1 k f = f (peel_1 (k - 1) f)
lengthPeeled_1 :: forall a. [a] -> Int
lengthPeeled_1 = peel_1 3 step
where
step = -- same as above
No, this of course won’t work.
Remember rule #1 above: without TH, all evaluation happens at runtime.
This means peel_1
is compiled as is into the target code, and the “peeling” takes place at runtime.
This is certainly not what we want, and it has no advantage over the regular length
, only additional overhead.
We need lengthPeeled
to be generated at compile time.
Attempt 2: put peel’s result in a splice
How do we ensure the peeling happens at compile time?
Recall rule #2: code inside a splice evaluates earlier than the surrounding code, while code inside a quotation evaluates later.
Therefore, to make the peeling happen at compile time (which of course is earlier than runtime), the application of peel
must be placed inside a splice, as in
lengthPeeled_2 :: forall a. [a] -> Int
lengthPeeled_2 = $$(peel_2 3 step)
And this means peel_2
must return CodeQ (a -> b)
, rather than a -> b
.
But what should the argument types of peel_2
be?
Can it still take Int
and (a -> b) -> a -> b
?
Let’s give it a try.
The only way to write peel_2
with such a type would be
-- Attempt 2 - doesn't even compile
peel_2 :: Int -> ((a -> b) -> a -> b) -> CodeQ (a -> b)
peel_2 0 f = [|| fix f ||]
Well, we don’t even need to continue to write the peel_2 k f
case.
This already violates rule #4: f
is bound at level 0 but used at level 1, and since it has a function type, it can’t be lifted.
Note that there’s no problem with fix
being used at level 1, since it is imported from another module.
Attempt 3: also put peel’s second argument in a splice
It should be clear by now that since f
is neither a top-level variable nor can it be lifted, to avoid violating rule #4, f
must be used at level 0.
The more obvious way to achieve this is to make peel
take CodeQ ((a -> b) -> a -> b)
as the second argument:
-- Attempt 3 - still not quite right
peel_3 :: Int -> CodeQ ((a -> b) -> a -> b) -> CodeQ (a -> b)
peel_3 0 f = [|| fix $$f ||]
peel_3 n f = [|| $$f $$(peel_3 (n - 1) f) ||]
This compiles!
In this definition, f
, n
and peel_3
are all used at level 0 (as each usage is surrounded by one splice and one quotation) - no stage violation!
Now we can proceed to define lengthPeeled_3
as follows.
Note that due to rule #4, lengthPeeled_3
must be placed in a different module than the one that defines peel_3
.
lengthPeeled_3 :: forall a. [a] -> Int
lengthPeeled_3 =
$$( let step :: CodeQ (([a] -> Int) -> [a] -> Int)
step = [|| \self xs -> case xs of [] -> 0; y:ys -> 1 + self ys ||]
in peel_3 3 step
)
GHC is happy with this, but does it do what we want?
Sadly, no.
If we use -ddump-splices
to inspect the generated code, we’ll see that it produces something equivalent to the following:
-- Generated by TH
lengthPeeled_3 :: forall a. [a] -> Int
lengthPeeled_3 = step (step (step (fix step)))
where
step = \self xs -> case xs of [] -> 0; y:ys -> 1 + self ys
This isn’t quite what we want.
What we’ve done is expand fix step
into step (step (step (fix step)))
.
While this may not be totally useless, it definitely falls short of the desired lengthPeeled
.
The desired lengthPeeled
is equivalent to taking step (step (step (fix step)))
, and applying beta reduction 3 times.
Specifically, applying one beta reduction gives
step (step (\zs -> case zs of [] -> 0; w:ws -> 1 + fix step ws))
and applying two more beta reductions results in what we want.
This is why the desired lengthPeeled
is better: not only do fewer beta reductions need to be performed at runtime, but performing them at compile time could expose additional optimization opportunities for the compiler.
To generate it correctly, we need to rethink our approach.
Attempt 4: evaluate f (peel (n-1) f) earlier
The problem with attempt 3 is that the code isn’t sufficiently evaluated at compile time.
In particular, in the definition of peel_3
, expression peel_3 (n - 1) f
is at level 0 (because it is surrounded by one splice and one quotation), which means peel_3 (n - 1) f
is evaluated when peel_3 n f
is evaluated.
This is desirable, because when compiling lengthPeeled_3
, expression peel_3 3 step
is evaluated at level -1 (compile time), and as a result, all applications of peel_3
are evaluated at compile time.
This is what expands fix step
into step (step (step (fix step)))
.
But on the other hand, the application of f
- $$f $$(peel_3 (n - 1) f)
- is evaluated too late, because it is at a later level - level 1 (since it is only surrounded by a quotation).
This means it is not evaluated at compile time when compiling lengthPeeled_3
.
This application is exactly where the desired beta reductions take place, so we need to find a way to put it at a lower level - level 0.
Recall that level 0 means something is either not surrounded by quotations or splices, or surrounded by an equal number of quotations or splices.
This suggests that in the definition of peel_4
, the peel_4 k f
case could simply be
peel_4 k f = f (peel_4 (k-1) f) -- no quotation or splice - everything is at level 0
This appears identical to attempt 1, but only at the term level.
The types are very different, since peel_4
needs to return CodeQ (a -> b)
.
In order for the types to align correctly, peel_4
’s type needs to be
peel_4 :: Int -> (CodeQ (a -> b) -> CodeQ (a -> b)) -> CodeQ (a -> b)
What about the peel_4 0 f
case?
This turns out to be a bit more involved, as it requires nested quotations, but guided by the types and the knowledge of stage errors, it isn’t too difficult to get to the correct definition:
peel_4 0 f = [|| fix (\self -> $$(f [|| self ||])) ||]
This is well-staged: f
is bound and used at level 0, self
is bound and used at level 1, and fix
is imported.
All is well.
And once again, guided by the types and the knowledge of stage errors, it is relatively straightforward to arrive at the correct definition of lengthPeeled_4
:
lengthPeeled_4 :: forall a. [a] -> Int
lengthPeeled_4 =
$$( let step :: CodeQ ([a] -> Int) -> CodeQ ([a] -> Int)
step = \self -> [|| \xs -> case xs of [] -> 0; y:ys -> 1 + $$self ys ||]
in peel_4 3 step
)
This produces exactly the code we want.
Update (03/11/2025) : a reader (AndrasKovacs on Reddit) pointed out another alternative, using a custom fix
rather than the standard one from base
:
peel_5 :: Int -> ((CodeQ a -> CodeQ b) -> CodeQ a -> CodeQ b) -> CodeQ (a -> b)
peel_5 k f = [||\a -> $$(go k f [||a||])||]
where
go :: Int -> ((CodeQ a -> CodeQ b) -> CodeQ a -> CodeQ b) -> CodeQ a -> CodeQ b
go 0 f a = [||let body a = $$(f (\a -> [||body $$a||]) [||a||]) in body $$a||]
go k f a = f (go (k - 1) f) a
lengthPeeled_5 :: forall a. [a] -> Int
lengthPeeled_5 =
$$( let step :: (Code Q [a] -> Code Q Int) -> Code Q [a] -> Code Q Int
step = \self xs -> [|| case $$xs of [] -> 0; y:ys -> $$(self [|| ys ||]) ||]
in peel 3 step
)
This generates more optimized code that performs additional beta reductions when compiling with -O0
.
It would be a good exercise to try to figure out why.
-
Assuming the compiler does not perform constant folding. ↩
-
Single-quoting an identifier, e.g.,
'x
, is equivalent to placing it in a quotation ([|| x ||]
) for the purpose of calculating levels. For instance, inf x = 'x
,x
is bound at level 0 and used at level 1. ↩ -
On the other hand,
foo x = [p| x |]
is always valid, because thex
in the quotation does not refer to the function parameterx
. Instead,[p| ... |]
quotes patterns, meaning thex
within it is a fresh variable introduced in a variable pattern that always succeeds. ↩