Controlling memory allocation/GC in a simulation?

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-04 08:16:20

I've updated the hpaste with a working example. It looks like the culprits are:

  • Missing strictness annotations in three SimConfig fields: simArray, logP and logL
    data SimConfig = SimConfig {
            numDimensions :: !Int            -- strict
        ,   numWalkers    :: !Int            -- strict
        ,   simArray      :: !(IntMap [Double]) -- strict spine
        ,   logP          :: !(Seq Double)      -- strict spine
        ,   logL          :: !(Seq Double)      -- strict spine
        ,   pairStream    :: [(Int, Int)]    -- lazy
        ,   doubleStream  :: [Double]        -- lazy 
        } deriving Show
  • newConfig was never evaluated in the simKernel loop because State is lazy. An alternative would be to use the strict State monad instead, but forcing the value before storing it is enough:

    put $! newConfig
    
  • execState ... replicateM also builds thunks, since replicateM collects every result into a list that is then thrown away. I originally replaced this with a foldl' and moved the execState into the fold, but I would think swapping in replicateM_ is equivalent and easier to read:

    let sim = logL $ execState (replicateM_ epochs simKernel) initConfig
    --  sim = logL $ foldl' (const . execState simKernel) initConfig [1..epochs]
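
The three fixes above can be sketched together in a small standalone program. This is only an illustration, not the code from the hpaste: MiniConfig, miniKernel and runMini are hypothetical stand-ins for SimConfig, simKernel and the real driver, and the sketch imports Control.Monad.State.Strict (the alternative mentioned above) so that each step runs before the next one starts.

```haskell
import Control.Monad (replicateM_)
import Control.Monad.State.Strict (State, execState, get, put)
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- Hypothetical, cut-down stand-in for SimConfig: a counter and a log.
data MiniConfig = MiniConfig
  { step   :: !Int           -- strict field
  , logVal :: !(Seq Double)  -- strict field; the Seq elements stay lazy
  } deriving Show

-- One hypothetical simulation step: bump the counter, append one value.
miniKernel :: State MiniConfig ()
miniKernel = do
  cfg <- get
  let n         = step cfg
      newConfig = cfg { step = n + 1, logVal = logVal cfg |> fromIntegral n }
  -- Force newConfig to WHNF before storing it; the strict fields then
  -- force the counter and the sequence as well.
  put $! newConfig

-- Run the kernel a fixed number of times, discarding the unit results.
runMini :: Int -> MiniConfig
runMini epochs =
  execState (replicateM_ epochs miniKernel) (MiniConfig 0 Seq.empty)

main :: IO ()
main = do
  let final = runMini 100000
  print (step final)
  print (Seq.length (logVal final))
```

Without the bangs on the fields, put $! would only force the outer constructor and the fields could still pile up as thunks, so the two changes work together.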
    

A few calls to mapM .. replicate have also been replaced with replicateM. This is particularly noteworthy in consPairList, where it reduces memory usage quite a bit. There is still room for improvement, but the lowest-hanging fruit involves unsafeInterleaveST... so I stopped.
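
As a rough illustration of the mapM .. replicate rewrite, the sketch below shows the two forms side by side. nextVal, viaMapM and viaReplicateM are hypothetical names, not taken from consPairList, and the example only shows that the two forms produce the same result; it does not reproduce the memory difference reported above.

```haskell
import Control.Monad (replicateM)
import Control.Monad.State.Strict (State, evalState, state)

-- Hypothetical action: return the current counter and increment it.
nextVal :: State Int Int
nextVal = state (\n -> (n, n + 1))

-- Original style: build a throwaway list with replicate, then mapM over it.
viaMapM :: Int -> State Int [Int]
viaMapM k = mapM (const nextVal) (replicate k ())

-- Rewritten style: let replicateM repeat the action directly.
viaReplicateM :: Int -> State Int [Int]
viaReplicateM k = replicateM k nextVal

main :: IO ()
main = do
  print (evalState (viaMapM 5) 0)
  print (evalState (viaReplicateM 5) 0)
```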

I have no idea if the output results are what you want:

fromList [-4.287033457733427,-1.8000404912760795,-5.581988678626085,-0.9362372340483293,-5.267791907985331]

But here are the stats:

     268,004,448 bytes allocated in the heap
      70,753,952 bytes copied during GC
      16,014,224 bytes maximum residency (7 sample(s))
       1,372,456 bytes maximum slop
              40 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       490 colls,     0 par    0.05s    0.05s     0.0001s    0.0012s
  Gen  1         7 colls,     0 par    0.04s    0.05s     0.0076s    0.0209s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.12s  (  0.12s elapsed)
  GC      time    0.09s  (  0.10s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.21s  (  0.22s elapsed)

  %GC     time      42.2%  (45.1% elapsed)

  Alloc rate    2,241,514,569 bytes per MUT second

  Productivity  57.8% of total user, 53.7% of total elapsed