I wanted to test foldl vs foldr. From what I've seen, you should use foldl over foldr whenever you can due to tail recursion optimization.
This makes sense. Howeve
EDIT: Upon looking at this problem again, I think all current explanations are somewhat insufficient so I've written a longer explanation.
The difference is in how foldl and foldr apply their reduction function. Looking at the foldr case, we can expand it as
a x = ([x] ++)
foldr a [] [0..10000]
[0] ++ foldr a [] [1..10000]
[0] ++ ([1] ++ foldr a [] [2..10000])
...
This list is processed by sum, which consumes it as follows:
sum = foldl' (+) 0
foldl' (+) 0 ([0] ++ ([1] ++ ... ++ [10000]))
foldl' (+) 0 (0 : [1] ++ ... ++ [10000]) -- get head of list from '++' definition
foldl' (+) 0 ([1] ++ [2] ++ ... ++ [10000]) -- add accumulator and head of list
foldl' (+) 0 (1 : [2] ++ ... ++ [10000])
foldl' (+) 1 ([2] ++ ... ++ [10000])
...
I've left out the details of the list concatenation, but this is how the reduction proceeds. The important part is that everything gets processed in order to minimize list traversals. The foldr only traverses the list once, the concatenations don't require continuous list traversals, and sum finally consumes the list in one pass. Critically, the head of the list is available from foldr immediately to sum, so sum can begin working immediately and values can be gc'd as they are generated. With fusion frameworks such as vector, even the intermediate lists will likely be fused away.
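To make the streaming behaviour concrete, here is a minimal sketch of the foldr version. The names `a`, `sumViaFoldr`, and `firstFive` are only illustrative; `a` mirrors the helper used in the expansion above. The second definition relies on the head being produced before the rest of the input is inspected, which is why it can take a prefix of an infinite input.

```haskell
import Data.List (foldl')

-- Illustrative helper: prepend each element as a singleton list.
a :: Int -> [Int] -> [Int]
a x = ([x] ++)

-- Build the list with foldr, then reduce it with a strict left fold.
sumViaFoldr :: Int -> Int
sumViaFoldr n = foldl' (+) 0 (foldr a [] [0 .. n])

-- foldr yields its head before looking at the rest of the input,
-- so taking a prefix works even when the input list is infinite.
firstFive :: [Int]
firstFive = take 5 (foldr a [] [0 ..])

main :: IO ()
main = do
  print (sumViaFoldr 10000) -- 50005000
  print firstFive           -- [0,1,2,3,4]
```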
Contrast this to the foldl function:
b xs = (++ xs) . (\y -> [y])
foldl b [] [0..10000]
foldl b ( [0] ++ [] ) [1..10000]
foldl b ( [1] ++ ([0] ++ []) ) [2..10000]
foldl b ( [2] ++ ([1] ++ ([0] ++ [])) ) [3..10000]
...
Note that now the head of the list isn't available until foldl has finished. This means that the entire list must be constructed in memory before sum can begin to work. This is much less efficient overall. Running the two versions with +RTS -s shows miserable garbage collection performance from the foldl version.
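If you want to reproduce the comparison yourself, something like the following sketch should work. The command-line switch and the names `viaFoldr` and `viaFoldl` are my own choices for illustration, and the file name `Fold.hs` below is hypothetical. No timings are given here, since they depend on the GHC version and flags.

```haskell
import System.Environment (getArgs)

-- Helper for the foldr version: prepend each element as a singleton.
a :: Int -> [Int] -> [Int]
a x = ([x] ++)

-- Helper for the foldl version: same shape as `b` in the expansion above.
b :: [Int] -> Int -> [Int]
b xs = (++ xs) . (\y -> [y])

viaFoldr, viaFoldl :: Int -> Int
viaFoldr n = sum (foldr a [] [0 .. n])
viaFoldl n = sum (foldl b [] [0 .. n])

-- Pass "foldl" as the first argument to run the foldl version;
-- anything else (or nothing) runs the foldr version.
main :: IO ()
main = do
  args <- getArgs
  let n = 10000
  case args of
    ["foldl"] -> print (viaFoldl n)
    _         -> print (viaFoldr n)
```

Compile with `ghc -O2 -rtsopts Fold.hs`, then compare `./Fold foldr +RTS -s` against `./Fold foldl +RTS -s`. Both runs print the same sum; the interesting part is the GC summary.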
This is also a case where foldl' will not help. The added strictness of foldl' doesn't change the way the intermediate list is created. The head of the list remains unavailable until foldl' has finished, so the result will still be slower than with foldr.
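One way to see that the head is unavailable until the left fold has walked the whole input is to feed both folds a list whose tail is `undefined`. This is only a sketch: `prepend`, `snoc`, `partial`, and `headOrError` are hypothetical names, and the exception handling is there purely so the program can report the failure instead of crashing.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.List (foldl')

-- Step function for foldr: prepend the element as a singleton.
prepend :: Int -> [Int] -> [Int]
prepend x = ([x] ++)

-- Step function for the left folds, same shape as `b` above.
snoc :: [Int] -> Int -> [Int]
snoc xs y = [y] ++ xs

-- A list whose spine is only defined for the first two cells.
partial :: [Int]
partial = 0 : 1 : undefined

-- Force the head of a list and capture any exception raised on the way.
headOrError :: [Int] -> IO (Either SomeException Int)
headOrError xs = try (evaluate (head xs))

main :: IO ()
main = do
  r1 <- headOrError (foldr prepend [] partial)
  putStrLn ("foldr:  " ++ either (const "failed") show r1)
  r2 <- headOrError (foldl' snoc [] partial)
  putStrLn ("foldl': " ++ either (const "failed") show r2)
```

The foldr version returns the head without touching the undefined tail. The foldl' version has to reach the end of the input before any part of its result exists, so it hits the `undefined`.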
I use the following rule to determine the best choice of fold:
Use foldl' when the fold is a reduction (e.g. this will be the only/final traversal).
Otherwise use foldr.
Avoid foldl.
In most cases foldr is the best fold function because the traversal direction is optimal for lazy evaluation of lists. It's also the only one capable of processing infinite lists. The extra strictness of foldl' can make it faster in some cases, but this is dependent on how you'll use that structure and how lazy it is.
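As a rough illustration of that rule, here is one small example per case. The function names are made up for this sketch and the choice of operations is arbitrary.

```haskell
import Data.List (foldl')

-- Reduction to a single strict value: a good fit for foldl'.
total :: [Int] -> Int
total = foldl' (+) 0

-- Building a lazy structure: foldr lets the consumer pull elements on demand.
doubleAll :: [Int] -> [Int]
doubleAll = foldr (\x acc -> 2 * x : acc) []

-- foldr can stop early, even on an infinite list, when the step
-- function does not always need its second argument.
anyOver :: Int -> [Int] -> Bool
anyOver limit = foldr (\x rest -> x > limit || rest) False

main :: IO ()
main = do
  print (total [1 .. 1000])         -- 500500
  print (take 3 (doubleAll [1 ..])) -- [2,4,6]
  print (anyOver 10 [1 ..])         -- True
```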