How to use Criterion to measure performance of Haskell programs?

遥遥无期 2020-12-29 08:01

I'm trying to measure the performance of a simple Haar DWT program using the Criterion framework. (It is erroneously slow, but I'll leave that for another question.) I can

1 answer
  • 2020-12-29 08:37

    The posted benchmark is erroneously slow... or is it?

    Are you sure it's erroneous? You're touching (well, the "nf" call is touching) 2 million boxed elements - that's 4 million pointers. You can call this erroneous if you want, but the issue is just what you think you're measuring compared to what you really are measuring.

    Sharing Data Between Benchmarks

    Data sharing can be accomplished through partial application. In my benchmarks I commonly have

    let var = somethingCommon in  -- evaluated at most once, shared by both benchmarks
    defaultMain [ bench "one" (nf (func1 var) input1)
                , bench "two" (nf (func2 var) input2)]
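
A complete version of this pattern might look like the sketch below. It assumes the criterion package is installed; `scaleBy`, `offsetBy`, `common`, and the two inputs are hypothetical stand-ins for `func1`, `func2`, and `somethingCommon`, not names from the original post.

```haskell
import Criterion.Main (bench, defaultMain, nf)

-- Hypothetical stand-ins for func1 and func2 (not from the original post).
scaleBy :: Int -> [Int] -> [Int]
scaleBy k = map (* k)

offsetBy :: Int -> [Int] -> [Int]
offsetBy k = map (+ k)

main :: IO ()
main = do
  let common = 3 :: Int              -- the value shared by both benchmarks
      input1 = [1 .. 1000] :: [Int]  -- hypothetical inputs
      input2 = [1 .. 2000] :: [Int]
  defaultMain
    [ bench "one" (nf (scaleBy common) input1)   -- partial application: scaleBy common
    , bench "two" (nf (offsetBy common) input2)  -- same shared value, different function
    ]
```

Each `nf` call receives a function (already partially applied to the shared value) and its input as separate arguments, so Criterion can re-apply the function on every iteration.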
    

    Avoiding Reuse in the Presence of Lazy Evaluation

    Criterion avoids sharing by separating out your function and your input. You have signatures such as:

    funcToBenchmark :: (NFData b) => a -> b
    inputForFunc :: a
    

    In Haskell, every time you apply funcToBenchmark inputForFunc it will create a thunk that needs to be evaluated. There is no sharing unless you bind the result to a name and reuse that name. There is no automatic memoization - this seems to be a common misunderstanding.
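
The difference between repeated application and a shared name can be observed with a small sketch that uses only `base`. `slowSquare` is a hypothetical function, and `trace` is used to log each time its body actually runs. This assumes compilation without optimizations (for example via `runghc`); with `-O`, GHC may merge the two identical applications itself, which is an optimization and not memoization.

```haskell
import Debug.Trace (trace)

-- Hypothetical function; the trace message appears on stderr
-- each time the body is actually evaluated.
slowSquare :: Int -> Int
slowSquare n = trace "slowSquare evaluated" (n * n)

main :: IO ()
main = do
  -- Two separate applications: two thunks, each evaluated on its own.
  print (slowSquare 7 + slowSquare 7)
  -- One named thunk used twice: evaluated once, then the result is reused.
  let shared = slowSquare 7
  print (shared + shared)
```

Both `print` calls output 98; only the number of trace messages differs between the two cases.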

    Notice the nuance in what isn't shared. We aren't sharing the final result, but the input is shared. If the generation of the input is what you want to benchmark (i.e. getRandList, in this case) then benchmark that and not just the identity + nf function:

    main = do
        gen <- getStdGen
        let inData = getRandList gen size
            inVec = V.fromList inData
            size = 2097152
        defaultMain
          [ bench "get input for real" $ nf (getRandList gen) size
          , bench "get input for real and run haarDWT and listify a vector" $ nf (V.toList . haarDWT . V.fromList . getRandList gen) size
          , bench "screw generation, how fast is haarDWT" $ whnf haarDWT inVec] -- for unboxed vectors whnf is sufficient
    

    Interpreting Data

    The third benchmark is rather instructive. Let's look at what Criterion prints out:

    benchmarking screw generation, how fast is haarDWT
    collecting 100 samples, 1 iterations each, in estimated 137.3525 s
    bootstrapping with 100000 resamples
    mean: 134.7204 ms, lb 134.5117 ms, ub 135.0135 ms, ci 0.950
    

    Based on a single run, Criterion thinks it will take 137 seconds to perform its 100 samples. About ten seconds later it was done - what happened? Well, the first run forced all the inputs (inVec), which was expensive. The subsequent runs found a value instead of a thunk, and thus we truly benchmarked haarDWT and not the StdGen RNG (which is known to be painfully slow).
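
One way to keep the cost of forcing the input out of the first run is to have it evaluated before any measurement starts. The sketch below assumes a Criterion release that exports `env` from `Criterion.Main` (the version used for the output above may predate it); `buildInput` is a hypothetical stand-in for an expensive generator such as getRandList.

```haskell
import Criterion.Main (bench, defaultMain, env, nf)

-- Hypothetical stand-in for an expensive input generator.
buildInput :: Int -> [Double]
buildInput n = map (sqrt . fromIntegral) [1 .. n]

main :: IO ()
main = defaultMain
  [ -- env evaluates the input to normal form before the benchmark runs,
    -- so no sample pays for building it.
    env (return (buildInput 100000)) $ \input ->
      bench "sum over pre-forced input" (nf sum input)
  ]
```

With the input forced up front, the time estimate Criterion derives from its initial run should reflect the benchmarked function alone.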
