Computing the mean of a list efficiently in Haskell

傲寒 2021-02-04 11:21

I've designed a function to compute the mean of a list. Although it works fine, I think it may not be the best solution because it takes two functions rather than one. Is it

6 answers
  •  广开言路
    2021-02-04 11:49

    To follow up on Don's 2010 reply, on GHC 8.0.2 we can do much better. First let's try his version.

    module Main (main) where
    
    import System.CPUTime.Rdtsc (rdtsc)
    import qualified Data.Vector.Unboxed as U
    
    -- Strict, unpacked accumulator: element count and running sum.
    data Pair = Pair {-# UNPACK #-}!Int {-# UNPACK #-}!Double
    
    mean' :: U.Vector Double -> Double
    mean' xs = s / fromIntegral n
      where
        Pair n s       = U.foldl' k (Pair 0 0) xs
        k (Pair n s) x = Pair (n+1) (s+x)
    
    main :: IO ()
    main = do
      s <- rdtsc
      let r = mean' (U.enumFromN 1 30000000)
      -- seq forces r before the second timestamp is read.
      e <- seq r rdtsc
      print (e - s, r)
    

    This gives us:

    [nix-shell:/tmp]$ ghc -fforce-recomp -O2 MeanD.hs -o MeanD && ./MeanD +RTS -s
    [1 of 1] Compiling Main             ( MeanD.hs, MeanD.o )
    Linking MeanD ...
    (372877482,1.50000005e7)
         240,104,176 bytes allocated in the heap
               6,832 bytes copied during GC
              44,384 bytes maximum residency (1 sample(s))
              25,248 bytes maximum slop
                 230 MB total memory in use (0 MB lost due to fragmentation)
    
                                         Tot time (elapsed)  Avg pause  Max pause
      Gen  0         1 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
      Gen  1         1 colls,     0 par    0.006s   0.006s     0.0062s    0.0062s
    
      INIT    time    0.000s  (  0.000s elapsed)
      MUT     time    0.087s  (  0.087s elapsed)
      GC      time    0.006s  (  0.006s elapsed)
      EXIT    time    0.006s  (  0.006s elapsed)
      Total   time    0.100s  (  0.099s elapsed)
    
      %GC     time       6.2%  (6.2% elapsed)
    
      Alloc rate    2,761,447,559 bytes per MUT second
    
      Productivity  93.8% of total user, 93.8% of total elapsed
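
    The figures in this run can be sanity-checked with a little arithmetic. The sketch below is not part of the original answer; it assumes the unboxed vector stores one 8-byte Double per element, and the names elemCount, expectedMean and expectedVectorBytes are made up for illustration.

```haskell
-- Number of elements used in the benchmark above.
elemCount :: Int
elemCount = 30000000

-- Closed-form mean of 1..n is (n + 1) / 2.
expectedMean :: Double
expectedMean = fromIntegral (elemCount + 1) / 2

-- Assumption: one 8-byte Double per element in the unboxed vector.
expectedVectorBytes :: Int
expectedVectorBytes = elemCount * 8
```

    The closed form gives 15000000.5, which matches the printed 1.50000005e7. At 8 bytes per element the vector alone would take 240,000,000 bytes, which is consistent with the roughly 240 MB allocated and suggests the vector was materialized rather than fused away.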
    

    However, the code is simple enough that, ideally, vector should not be needed at all: optimal code should be possible just from inlining the list generation. Luckily, GHC can do this for us[0].

    module Main (main) where
    
    import System.CPUTime.Rdtsc (rdtsc)
    import Data.List (foldl')
    
    -- Strict, unpacked accumulator: element count and running sum.
    data Pair = Pair {-# UNPACK #-}!Int {-# UNPACK #-}!Double
    
    mean' :: [Double] -> Double
    mean' xs = v / fromIntegral l
      where
        Pair l v = foldl' f (Pair 0 0) xs
        f (Pair l' v') x = Pair (l' + 1) (v' + x)
    
    main :: IO ()
    main = do
      s <- rdtsc
      let r = mean' $ fromIntegral <$> [1 :: Int .. 30000000]
          -- This is slow!
          -- r = mean' [1 .. 30000000]
      -- seq forces r before the second timestamp is read.
      e <- seq r rdtsc
      print (e - s, r)
    

    This gives us:

    [nix-shell:/tmp]$ ghc -fforce-recomp -O2 MeanD.hs -o MeanD && ./MeanD +RTS -s
    [1 of 1] Compiling Main             ( MeanD.hs, MeanD.o )
    Linking MeanD ...
    (128434754,1.50000005e7)
             104,064 bytes allocated in the heap
               3,480 bytes copied during GC
              44,384 bytes maximum residency (1 sample(s))
              17,056 bytes maximum slop
                   1 MB total memory in use (0 MB lost due to fragmentation)
    
                                         Tot time (elapsed)  Avg pause  Max pause
      Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
      Gen  1         1 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
    
      INIT    time    0.000s  (  0.000s elapsed)
      MUT     time    0.032s  (  0.032s elapsed)
      GC      time    0.000s  (  0.000s elapsed)
      EXIT    time    0.000s  (  0.000s elapsed)
      Total   time    0.033s  (  0.032s elapsed)
    
      %GC     time       0.1%  (0.1% elapsed)
    
      Alloc rate    3,244,739 bytes per MUT second
    
      Productivity  99.8% of total user, 99.8% of total elapsed
    

    [0]: Notice that I had to map fromIntegral over the list: without this, GHC fails to eliminate the [Double] and the solution is much slower. That is somewhat sad: I don't understand why GHC fails to inline it (or decides it does not need to) without this. If you have a genuine collection of fractionals, this hack won't work for you and vector may still be necessary.
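
    The two list-based variants mentioned in the footnote can be put side by side for experimentation. This is a minimal sketch using only base; the names meanViaInt and meanViaDouble are hypothetical, and the speed difference is the one reported above, not something this snippet measures.

```haskell
import Data.List (foldl')

-- Strict, unpacked accumulator: element count and running sum.
data Pair = Pair {-# UNPACK #-} !Int {-# UNPACK #-} !Double

-- Single-pass mean over a list, as in the answer.
mean' :: [Double] -> Double
mean' xs = v / fromIntegral l
  where
    Pair l v = foldl' step (Pair 0 0) xs
    step (Pair l' v') x = Pair (l' + 1) (v' + x)

-- Variant from the answer: enumerate Ints, then convert each to Double.
meanViaInt :: Int -> Double
meanViaInt n = mean' (fromIntegral <$> [1 .. n])

-- Variant the footnote reports as slow: enumerate Doubles directly.
meanViaDouble :: Int -> Double
meanViaDouble n = mean' [1 .. fromIntegral n]
```

    Both functions compute the same value; to compare them, compile each call with -O2 and run with +RTS -s as in the sessions above.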
