There are two versions of StateT
in the transformers
package: lazy and strict. And each
version comes with two functions for updating
the state: the lazy modify
and the strict modify'
. As long as, one might contemplate, I
stick to the strict StateT
and the strict modify'
, it should be safe and I’d
need to worry no more about space leaks, right? Wrong.
Worse, sometimes the space leak can be “hidden” by -O
or -O2
, or other GHC flags. It can
resurface when a seemingly unrelated change is made. The situation can also vary depending
on the underlying monad. Space leak is tricky business for Haskell developers
and this is just one example.
In this blog post I’ll explain using a small program how space leaks can still occur
when using the strict StateT
and the strict modify'
. Also discussed include other
factors (e.g., GHC flags) that may impact memory consumption, things to look out for,
as well as stack space leaks vs. heap space leaks. I used GHC 9.0.1 to compile and run
the example program.
An Example
module Main where
import Control.Monad.IO.Class
import Control.Monad.Trans.State.Strict
import Data.Foldable
main :: IO ()
main = print $
flip execState (Just 0 :: Maybe Int) $
for_ [1 .. 1000000] $ \i ->
modify' $ fmap (+ i)
This program uses the strict StateT
and the strict modify'
, but it leaks space. Compiling it with -rtsopts
,
then running it with +RTS -K1M -RTS
(i.e., limiting the stack size to 1M), leads to stack overflow:
Main: Stack space overflow: current size 33624 bytes.
Main: Use `+RTS -Ksize -RTS' to increase it.
Why the Strict StateT Does Not Prevent the Space Leak
First of all, not only does the strict StateT
not always prevent space leaks, but it can
cause space leaks. The relationship between the two versions of StateT
is similar to that between foldl'
and foldr
. Recall that as a general rule,
foldl'
is the right choice when you are folding a structure into a single strict value,
while foldr
is more appropriate if the result is a lazy and potentially infinite data
structure. In the latter case, the recursion in foldr
is usually hidden behind a data
constructor. This is known as “guarded recursion”, which ensures that it only computes
the part of the result that is needed. For example, sum
should use foldl'
and map
should use foldr
. Using foldr
for sum
leaks memory, as does using foldl'
for map
.
Similarly, the strict StateT
should, generally speaking, be used if the result is a strict value, and the lazy
StateT
if the result is a lazy structure. The difference between the two
versions of StateT
is that the lazy version uses lazy pattern matching on intermediate
results (the (value, state) pairs), essentially delaying the recursive call, and potentially
turning a non-guarded recursion into a guarded one. Just like foldl'
and foldr
, using the
wrong version of StateT
could potentially lead to memory leaks.
That being said, unlike foldl'
and foldr
, the situation with StateT
also depends
on the underlying monad. In general, for underlying monads
whose (>>=)
operator is lazy, such as Identity
1, Reader
, and the lazy Writer
, choosing
the right version of StateT
is important. On the other hand, underlying monads with strict (>>=)
operators, like IO
, Maybe
, the strict Writer
and []
, render the choice of lazy vs. strict StateT
moot, since the (>>=)
of StateT
would be strict anyway even for the lazy StateT
.2
Here are some examples where the underlying monad is Identity
, and choosing the proper version of
StateT
does matter:
-- Must use lazy StateT: the result is a lazy structure, [Int].
main :: IO ()
main = print $ take 5 $ evalState (go 0) ()
where
go :: Int -> State () [Int]
go x
| x == 1000000 = pure []
| otherwise = do
rest <- go (x + 1)
pure (x : rest)
-- Must use strict StateT
main :: IO ()
main = print $ runState (go 0) 0
where
go :: Int -> State Int ()
go x
| x == 1000000 = pure ()
| otherwise = do
modify' (+ 1)
go (x + 1)
Now, back to the question of why the strict StateT
doesn’t prevent the space leak
in our original example. Our underlying monad is Identity
, and we
don’t need the result, so the strict StateT
is the right choice. However, it is
only strict to the extent that the intermediate (value, state) pairs are evaluated to
weak head normal form (WHNF). Since (,)
is lazy, this does not force either the value
or the state. At issue here in our example is that the state keeps getting bigger
and bigger without being evaluated, and the strict StateT
can’t prevent it from
happening.
Why the Strict modify’ Does Not Prevent the Space Leak
modify'
does force the new state, but only to WHNF. If the state type in our example
was Int
, then we’d have no problem. But it’s Maybe Int
, which means modify'
will
only evaluate it to Just <thunk>
, but does not evaluate the thunk to WHNF. Then
what winds up happening is that the thunk under the Just
keeps growing,
and is only collapsed at the end.
Why O2 Does Prevent (This Particular) Space Leak
When compiling our example with -O2
, the space leak goes away, and running it with -K1M
successfully prints “Just 500000500000”. How come? It turns out to be the effect of two
flags implied by -O2
:
-fno-ignore-interface-pragmas
(i.e.,-O2
turns off-fignore-interface-pragmas
). Ignoring interface pragmas means nothing will be inlined. Inlining often enables further optimization, and in this case, inlining the+
operator allows-fspec-constr
to generate more efficient code.-fspec-constr
. This flag turns on call-pattern specialization, which is described in paper Call-pattern specialisation for Haskell programs. Section 2.1 of the paper usesdrop
as an example to show how call-pattern specialization creates a specialized functiondrop' :: Int# -> [a] -> [a]
(vs.drop: Int -> [a] -> [a]
).drop'
’s first argument is an unlifted type, hence strict. It is exactly for this reason that compiling with-O2
produces, in this case, more strict code.3
Compiling with -O
itself does not prevent the space leak, because -O
does not imply -fspec-constr
.
-O -fspec-constr
does the job.
Compiling with -O0 -fno-ignore-interface-pragmas -fspec-constr
, however, cannot prevent the
space leak, because -O0
ignores a number of optimization flags, including -fspec-constr
.
Although -O2
prevents the space leak in this particular case, it is not something you can rely on to fix space
leaks in general. It isn’t difficult at all to produce a space leak with -O2
enabled. For example, if you
do either one of these two things, then -O2
is no longer capable of preventing the space leak:
- change
fmap (+ i)
tofmap (+ if even i then 1 else 2)
; - or, create a wrapper for
+
markedNOINLINE
, then use the wrapper in place of+
.
This shows when compiling with optimizations, space leaks can occur after making a seemingly innocent and unrelated change, which can further add to the confusion, and that’s not a good situation to be in.
Pay Attention to Whether You Are Forcing the Right Things
A few common ways of making things more strict include seq
, deepseq
and bang patterns. But
before rushing to reach for them,
you may want to make sure you are going to use them correctly.
You may be tempted, for example, to do something like this:
- modify' $ fmap (+ i)
+ modify' $ fmap $ \(!x) -> Control.DeepSeq.force (x + i)
or
- modify' $ fmap (+ i)
+ modify' $ \(!x) -> fmap (+ i) x
Does either of these fix the space leak? Nope. In the first attempt, we are forcing
things within fmap
, and that doesn’t help. Recall that
fmap f (Just a) = Just (f a)
. And since modify'
doesn’t evaluate things beyond Just
,
the f a
won’t be evaluated till the end.
In the second attempt, the (!x)
only evaluates x :: Just Int
to WHNF, which is
useless: modify'
already evaluates it to WHNF, and we need to evaluate it beyond WHNF.
The proper fix is to ensure the new state is fully evaluated whenever the argument to
modify'
is evaluated to WHNF. Two possible fixes are:
- modify' $ fmap (+ i)
+ modify' $ \case Just (!j) -> Just (i + j); Nothing -> Nothing
and
- modify' $ fmap (+ i)
+ modify' $ Control.DeepSeq.force . fmap (+i)
Stack vs. Heap
What if we only need to evaluate the final state to WHNF (and not NF)? Suppose we make the following change to the original example:
- main = print $
+ main = print $ isJust $
Now we only need to know whether the final state is Just
or Nothing
, but don’t need the exact
value. Indeed, if we compile it with -O0
and run it with -K1M
, it would successfully print
out Just
. So we are all good, right? Wrong. Now it doesn’t leak space on the stack,
but it still leaks space on the heap. If we run it
with -M2M
(which limits the maximum heap space to 2M), then we’d get
Main: Heap exhausted;
Main: Current maximum heap size is 2097152 bytes (2 MB).
Main: Use `+RTS -M<size>' to increase it.
Thunks are stored on the heap. The stack is consumed when a thunk is evaluated. In this
case we don’t need a lot of stack space since we don’t need to evaluate the thunk beyond
Just
, but the thunk still occupies the heap.
By the way, in case you come from another programming language and are relatively new to Haskell, and don’t know this: the stack in Haskell is not the call stack employed in the runtime of most other programming languages. Haskell as a lazy language has an entirely different and fairly unique execution model which doesn’t use call stacks to run subroutines. Its “stack” is part of the spineless tagless G-machine, an abstract machine for evaluating lazy functional code. In Haskell there’s no such thing as “one stack per thread”, and the stack size can be much larger than the typical size of a call stack. Haskell’s default stack size is 80% of the heap size.
StrictData
Finally, though not applicable to our particular example, it bears noting that
StrictData
is a good extension to use by default. It tends to make the code much more resistant
to space leaks arising from unevaluated expressions.
-
Identity
’s(>>=)
operator is lazy becauseIdentity
is a newtype wrapper; it doesn’t exist at runtime, and so pattern matching onIdentity
doesn’t force evaluation (in other words,case undefined of Identity _ -> 3
evaluates to 3). IfIdentity
was defined using data, then its(>>=)
would be strict. ↩ -
The list monad (
[]
) as an underlying monad is especially prone to space leaks, because its(>>=)
operator desugars toconcatMap
, which essentially delays pattern matching on[]
’s data constructors till the end, even for the strictStateT
. If your use case demands using list as either the underlying monad, or the monad transformer itself (“ListT
”), consider using a streaming library likeconduit
orpipes
. ↩ -
You may go and check the outputs from
-ddump-simpl
, both with and without-fspec-constr
, and see if you can spot the effect of-fspec-constr
. ↩