Why are Haskell/GHC executables so large in filesize? [duplicate]

Posted by 与世无争的帅哥 on 2019-11-30 22:07:11

Question


Possible Duplicate:
Small Haskell program compiled with GHC into huge binary

Recently I noticed how large Haskell executables are. Everything below was compiled on GHC 7.4.1 with -O2 on Linux.

  1. Hello World (main = putStrLn "Hello World!") is over 800 KiB. Running strip over it reduces the filesize to 500 KiB; even adding -dynamic to the compilation doesn't help much, leaving me with a stripped executable around 400 KiB.

  2. Compiling a very primitive example involving Parsec yields a 1.7 MiB file.

    -- File: test.hs
    import qualified Text.ParserCombinators.Parsec as P
    import Data.Either (either)
    
    -- Parses a string of type "x y" to the tuple (x,y).
    testParser :: P.Parser (Char, Char)
    testParser = do
        a <- P.anyChar
        P.char ' '
        b <- P.anyChar
        return (a, b)
    
    -- Parse, print result.
    str = "1 2"
    main = print $ either (error . show) id . P.parse testParser "" $ str
    -- Output: ('1','2')
    

    Parsec may be a larger library, but I'm only using a tiny subset of it, and indeed the optimized core code generated by the above is dramatically smaller than the executable:

    $ ghc -O2 -ddump-simpl -fforce-recomp test.hs | wc -c
    49190 (bytes)
    

    Therefore, it's not the case that a huge amount of Parsec is actually found in the program, which was my initial assumption.

Why are the executables of such an enormous size? Is there something I can do about it (except dynamic linking)?


Answer 1:


To effectively reduce the size of an executable produced by the Glasgow Haskell Compiler, focus on:

  • Dynamic linking: pass the -dynamic option to ghc so that library code is not bundled into the final executable but is loaded from shared (dynamic) libraries instead. This requires the shared versions of GHC's libraries to be installed on the system!
  • Removing debugging information from the final executable (e.g. with the strip tool from GNU binutils).
  • Removing imports of unused modules (don't expect any gain from this when linking dynamically).

With these steps, the simple Hello World example ends up at 9 KiB and the Parsec test at about 28 KiB (both 64-bit Linux executables), which I find quite small and acceptable for such a high-level language implementation.
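The first two points can be tried out with a short script. This is only a sketch under some assumptions: the file names (hello.hs, hello-static, hello-dynamic) and the report helper are made up for illustration, and the sizes you get depend on your GHC version and platform, so they will not necessarily match the numbers quoted here.

```shell
#!/bin/sh
# Sketch: compare Hello World binary sizes for static vs. dynamic linking,
# before and after strip. Run it in a scratch directory; it writes files there.
# All file names below are illustrative, not taken from the original answer.

# The Hello World program from the question.
printf '%s\n' 'main :: IO ()' 'main = putStrLn "Hello World!"' > hello.hs

# Print "<file>: <size> bytes" for a file.
report() {
    printf '%s: %s bytes\n' "$1" "$(wc -c < "$1" | tr -d ' ')"
}

if command -v ghc >/dev/null 2>&1; then
    # Default build: the Haskell libraries are linked statically.
    ghc -O2 -fforce-recomp -o hello-static hello.hs >/dev/null && report hello-static

    # Dynamic build: needs the shared GHC libraries to be installed.
    if ghc -O2 -fforce-recomp -dynamic -o hello-dynamic hello.hs >/dev/null 2>&1; then
        report hello-dynamic
    else
        echo "dynamic build failed (shared GHC libraries missing?)"
    fi

    # Strip symbols and debugging information from copies of whatever was built.
    if command -v strip >/dev/null 2>&1; then
        for exe in hello-static hello-dynamic; do
            [ -f "$exe" ] || continue
            cp "$exe" "$exe-stripped" && strip "$exe-stripped" && report "$exe-stripped"
        done
    fi
else
    echo "ghc not found; nothing compiled"
fi
```

Keep in mind that a dynamically linked executable only runs on machines that have the matching shared GHC libraries, so the small file size comes at the cost of portability.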




Answer 2:


My understanding is that if you use a single function from package X, the entire package gets statically linked in. I don't think GHC actually links function-by-function. (Unless you use the "split objects" hack, which "tends to freak the linker out".)

But if you're linking dynamically, that ought to fix this. So I'm not sure what to suggest here...

(I'm pretty sure I saw a blog post when dynamic linking first came out, demonstrating Hello World compiled to a 2KB binary. Obviously I cannot find this blog post now... grr.)

Consider also cross-module optimisation. If you're writing a Parsec parser, it's likely that GHC will inline all the parser definitions and simplify them down to the most efficient code. And, sure enough, your few lines of Haskell have produced 50KB of Core. Should that get 37x bigger when compiling to machine-code? I don't know. You could perhaps try looking at the STG and Cmm code produced in the next steps. (Sorry, I don't recall the compiler flags off the top of my head...)
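For reference, the later stages can be dumped with flags in the same family as -ddump-simpl. The loop below is a rough sketch that repeats the question's wc -c measurement for Core, STG, Cmm and assembly. It assumes test.hs from the question is in the current directory and that Parsec is installed. Newer GHC versions split the STG dump into finer-grained flags such as -ddump-stg-final and may warn about or reject plain -ddump-stg, so adjust the flag list to your compiler.

```shell
#!/bin/sh
# Sketch: measure the size of each intermediate dump for one source file.
# The default file name test.hs is the one used in the question.
src=${1:-test.hs}

if ! command -v ghc >/dev/null 2>&1; then
    echo "ghc not found; nothing to measure"
elif [ ! -f "$src" ]; then
    echo "$src not found; nothing to measure"
else
    # One full recompilation per stage, as in the question's command.
    # The count includes GHC's few progress lines on stdout; a flag your
    # GHC rejects shows up as a near-zero count because stderr is discarded.
    for flag in -ddump-simpl -ddump-stg -ddump-cmm -ddump-asm; do
        bytes=$(ghc -O2 -fforce-recomp "$flag" "$src" 2>/dev/null | wc -c | tr -d ' ')
        printf '%-14s %s bytes\n' "$flag" "$bytes"
    done
fi
```

These dumps are textual and verbose, so their byte counts are not directly comparable to the size of the machine code. They are mainly useful for seeing at which stage the output grows.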



Source: https://stackoverflow.com/questions/12719207/why-are-haskell-ghc-executables-so-large-in-filesize
