Haskell/GHC performance of `any`/`all`

╄→гoц情女王★ 提交于 2019-12-01 19:32:55

I find it instructive to re-implement any in the various ways:

import Prelude hiding (any)
import Criterion.Main
import Data.Foldable (foldMap)
import Data.Monoid

Your exists:

exists :: (a -> Bool) -> [a] -> Bool
exists _ [] = False
exists pred (x : xs)
    = if pred x
      then True
      else exists pred xs

A version using (||):

existsOr :: (a -> Bool) -> [a] -> Bool
existsOr _ [] = False
existsOr pred (x : xs) = pred x || existsOr pred xs

Using foldr:

any :: (a -> Bool) -> [a] -> Bool
any pred = foldr ((||) . pred) False

Using foldr and Any:

anyF :: (a -> Bool) -> [a] -> Bool
anyF pred = getAny . foldr (mappend . (Any . pred)) mempty

Using foldMap and Any:

anyFM :: (a -> Bool) -> [a] -> Bool
anyFM pred = getAny . foldMap (Any . pred)

Benchmarks with ghc -O0:

benchmarking exists
time                 1.552 μs   (1.504 μs .. 1.593 μs)
                     0.989 R²   (0.983 R² .. 0.993 R²)
mean                 1.482 μs   (1.427 μs .. 1.545 μs)
std dev              196.1 ns   (168.8 ns .. 229.2 ns)
variance introduced by outliers: 93% (severely inflated)

benchmarking existsOr
time                 2.699 μs   (2.616 μs .. 2.768 μs)
                     0.992 R²   (0.988 R² .. 0.995 R²)
mean                 2.629 μs   (2.554 μs .. 2.704 μs)
std dev              277.8 ns   (235.8 ns .. 351.1 ns)
variance introduced by outliers: 89% (severely inflated)

benchmarking any
time                 5.551 μs   (5.354 μs .. 5.777 μs)
                     0.990 R²   (0.986 R² .. 0.995 R²)
mean                 5.553 μs   (5.395 μs .. 5.750 μs)
std dev              584.2 ns   (447.5 ns .. 835.5 ns)
variance introduced by outliers: 88% (severely inflated)

benchmarking anyF
time                 7.330 μs   (7.081 μs .. 7.612 μs)
                     0.988 R²   (0.982 R² .. 0.994 R²)
mean                 7.502 μs   (7.272 μs .. 7.762 μs)
std dev              848.2 ns   (712.6 ns .. 1.022 μs)
variance introduced by outliers: 89% (severely inflated)

benchmarking anyFM
time                 5.668 μs   (5.451 μs .. 6.008 μs)
                     0.987 R²   (0.975 R² .. 0.996 R²)
mean                 5.807 μs   (5.659 μs .. 5.975 μs)
std dev              542.5 ns   (446.4 ns .. 721.8 ns)
variance introduced by outliers: 86% (severely inflated)

Your version (exists) is indeed the fastest, and the foldr versions are rather slow.

With ghc -O2, your version (exists) is the slowest, and all other functions are nearly equally fast to each other:

benchmarking exists
time                 753.5 ns   (725.4 ns .. 779.9 ns)
                     0.990 R²   (0.986 R² .. 0.995 R²)
mean                 762.4 ns   (737.0 ns .. 787.0 ns)
std dev              82.47 ns   (66.79 ns .. 105.1 ns)
variance introduced by outliers: 91% (severely inflated)

benchmarking existsOr
time                 491.5 ns   (478.2 ns .. 503.2 ns)
                     0.994 R²   (0.992 R² .. 0.996 R²)
mean                 494.5 ns   (481.1 ns .. 512.9 ns)
std dev              54.97 ns   (42.54 ns .. 80.34 ns)
variance introduced by outliers: 92% (severely inflated)

benchmarking any
time                 461.2 ns   (442.0 ns .. 479.7 ns)
                     0.989 R²   (0.985 R² .. 0.993 R²)
mean                 456.0 ns   (439.3 ns .. 476.3 ns)
std dev              60.04 ns   (47.27 ns .. 89.47 ns)
variance introduced by outliers: 94% (severely inflated)

benchmarking anyF
time                 436.9 ns   (415.8 ns .. 461.0 ns)
                     0.978 R²   (0.967 R² .. 0.988 R²)
mean                 450.8 ns   (430.1 ns .. 472.6 ns)
std dev              70.64 ns   (57.04 ns .. 85.92 ns)
variance introduced by outliers: 96% (severely inflated)

benchmarking anyFM
time                 438.9 ns   (426.9 ns .. 449.5 ns)
                     0.993 R²   (0.989 R² .. 0.996 R²)
mean                 435.8 ns   (421.4 ns .. 447.6 ns)
std dev              45.32 ns   (36.73 ns .. 58.74 ns)
variance introduced by outliers: 90% (severely inflated)

If one looks into the simplified Core code (ghc -O2 -ddump-simpl), one sees that there are no foldrs anymore (with -O0, everything is still in there, folds included).

I would thus venture to say that your code is faster (in the un-optimized version, -O0) than the library code, because it is simpler (for the potential price of being less general). The optimized library code is faster than your version, since it is written in a way that its optimization potential is recognized by the compiler. (admittedly, this is a bit of guess work)

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!