I have this fairly simple function to compute the mean of elements of a big list, using two accumulators to hold the sum so far and the count so far:
mean =
I've written extensively on this:
Firstly, yes, if you want to require strict evaluation of the accumulators use seq and stay in Haskell 98:
mean = go 0 0
where
go s l [] = s / fromIntegral l
go s l (x:xs) = s `seq` l `seq`
go (s+x) (l+1) xs
main = print $ mean [0..10000000]
*Main> main
5000000.0
Secondly: strictness analysis will kick in if you give some type annotations, and compile with -O2:
mean :: [Double] -> Double
mean = go 0 0
where
go :: Double -> Int -> [Double] -> Double
go s l [] = s / fromIntegral l
go s l (x:xs) = go (s+x) (l+1) xs
main = print $ mean [0..10000000]
$ ghc -O2 --make A.hs
[1 of 1] Compiling Main ( A.hs, A.o )
Linking A ...
$ time ./A
5000000.0
./A 0.46s user 0.01s system 99% cpu 0.470 total
Because 'Double' is a wrapper over the strict atomic type Double#, with optimizations on, and a precise type, GHC runs strictness analysis and infers that the strict version will be ok.
import Data.Array.Vector
main = print (mean (enumFromToFracU 1 10000000))
data Pair = Pair !Int !Double
mean :: UArr Double -> Double
mean xs = s / fromIntegral n
where
Pair n s = foldlU k (Pair 0 0) xs
k (Pair n s) x = Pair (n+1) (s+x)
$ ghc -O2 --make A.hs -funbox-strict-fields
[1 of 1] Compiling Main ( A.hs, A.o )
Linking A ...
$ time ./A
5000000.5
./A 0.03s user 0.00s system 96% cpu 0.038 total
As described in the RWH chapter above.