How do I write a constant-space length function in Haskell?

…衆ロ難τιáo~ submitted on 2019-12-22 04:44:08

Question


The canonical implementation of length :: [a] -> Int is:

length [] = 0
length (x:xs) = 1 + length xs

which is very beautiful but suffers from stack overflow as it uses linear space.

The tail-recursive version:

length xs = length' xs 0
  where length' [] n = n
        length' (x:xs) n = length' xs (n + 1)

doesn't suffer from this problem, but I don't understand how this can run in constant space in a lazy language.

Isn't the runtime accumulating numerous (n + 1) thunks as it moves through the list? Shouldn't this function cause Haskell to consume O(n) space and lead to a stack overflow?

(if it matters, I'm using GHC)


Answer 1:


Yes, you've run into a common pitfall with accumulating parameters. The usual cure is to force strict evaluation on the accumulating parameter; for this purpose I like the strict application operator $!. If you don't force strictness, GHC's optimizer might decide it's OK for this function to be strict, but it might not. Definitely it's not a thing to rely on—sometimes you want an accumulating parameter to be evaluated lazily and O(N) space is just fine, thank you.

How do I write a constant-space length function in Haskell?

As noted above, use the strict application operator to force evaluation of the accumulating parameter:

clength xs = length' xs 0
  where length' []     n = n
        length' (x:xs) n = length' xs $! (n + 1)

The type of $! is (a -> b) -> a -> b; it forces evaluation of its argument (the a, to weak head normal form) before applying the function to it.
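To make it clearer what $! is doing there, here is a minimal sketch that spells the operator out in terms of seq. The names strictApply and lengthSeq are made up for illustration and are not from the original answer; strictApply is only a stand-in that behaves like $!, not the library's own source.

```haskell
-- strictApply is a hypothetical stand-in for ($!): it evaluates the
-- argument to weak head normal form, then applies the function.
strictApply :: (a -> b) -> a -> b
strictApply f x = x `seq` f x

-- Same loop as clength, using the stand-in instead of ($!).
lengthSeq :: [a] -> Int
lengthSeq xs0 = go xs0 0
  where
    go []     n = n
    -- (n + 1) is evaluated before the recursive call, so no chain
    -- of unevaluated additions builds up in the accumulator.
    go (_:xs) n = go xs `strictApply` (n + 1)
```

Since the accumulator is an Int, weak head normal form is a fully evaluated number, which is why one seq per step is enough here.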




Answer 2:


Running your second version in GHCi:

> length [1..1000000]
*** Exception: stack overflow

So to answer your question: Yes, it does suffer from that problem, just as you expect.

However, GHC is smarter than the average compiler; if you compile with optimizations turned on, it'll fix the code for you and make it work in constant space.

More generally, there are ways to force strictness at specific points in Haskell code, preventing the building of deeply nested thunks. A usual example is foldl vs. foldl':

import Data.List (foldl')

len1 = foldl (\x _ -> x + 1) 0
len2 = foldl' (\x _ -> x + 1) 0

Both functions are left folds that do the "same" thing, except that foldl is lazy while foldl' is strict. The result is that len1 dies with a stack overflow in GHCi, while len2 works correctly.
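To show where the strictness in a strict left fold actually comes from, here is a simplified sketch. foldlStrict and lenStrict are hypothetical names, and this is not the definition used in base (which is written differently), just a version with the same evaluation behaviour for this use.

```haskell
-- A simplified strict left fold (hypothetical; not the definition in base).
-- The new accumulator z' is forced with seq before recursing, so the
-- fold never carries a chain of pending applications of f.
foldlStrict :: (b -> a -> b) -> b -> [a] -> b
foldlStrict _ z []     = z
foldlStrict f z (x:xs) =
  let z' = f z x
  in z' `seq` foldlStrict f z' xs

-- Length written with the strict fold, analogous to len2.
lenStrict :: [a] -> Int
lenStrict = foldlStrict (\n _ -> n + 1) 0
```

Dropping the seq gives back the lazy behaviour of foldl, where the additions pile up as thunks and are only evaluated at the very end.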




Answer 3:


A tail-recursive function doesn't need to maintain a stack, since the value returned by the function is simply going to be the value returned by the tail call. So instead of creating a new stack frame, the current one gets re-used, with the locals overwritten by the new values passed into the tail call. So every n+1 gets written into the same place where the old n was, and you have constant space usage.

Edit - Actually, as you've written it, you're right: it'll build up thunks for the (n+1)s and cause an overflow. Easy to test, just try length [1..1000000]. You can fix that by forcing the accumulator to be evaluated first: length' xs $! (n+1), which will then work as I said above.
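Another way to get the same effect, if you're on GHC, is a bang pattern on the accumulator. This is only a sketch; lengthBang is a made-up name, and it assumes the BangPatterns extension (a GHC extension, not Haskell 2010).

```haskell
{-# LANGUAGE BangPatterns #-}

-- Tail-recursive length with a strict accumulator (hypothetical name).
lengthBang :: [a] -> Int
lengthBang xs0 = go xs0 0
  where
    -- The bang forces n each time go is entered, so at most one
    -- pending (n + 1) exists at any moment.
    go []     !n = n
    go (_:xs) !n = go xs (n + 1)
```

With the accumulator forced on every call, the "overwrite n in place" picture described above is roughly what happens.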



Source: https://stackoverflow.com/questions/2777686/how-do-i-write-a-constant-space-length-function-in-haskell
