Python faster than compiled Haskell?

爱一瞬间的悲伤 2020-11-29 23:16

I have a simple script written in both Python and Haskell. It reads a file with 1,000,000 newline separated integers, parses that file into a list of integers, quick sorts i

7 answers
  •  栀梦 (original poster)
    2020-11-29 23:38

    This is after the fact, but I think most of the trouble is in the way the Haskell is written. The following module is pretty primitive (one should probably use builders, and certainly avoid the ridiculous round trip via String for showing), but it is simple and did distinctly better than pypy running kindall's improved Python, and better than the 2 and 4 sec Haskell modules elsewhere on this page. It surprised me how much they were using lists, so I made a couple more turns of the crank.

    $ time aa.hs        real    0m0.709s
    $ time pypy aa.py   real    0m1.818s
    $ time python aa.py real    0m3.103s
    

    I'm using the sort recommended for unboxed vectors from vector-algorithms. Using Data.Vector.Unboxed in some form is clearly now the standard, naive way of doing this sort of thing: it's the new Data.List (for Int, Double, etc.). Everything but the sort is irritating IO management, which I think could still be massively improved, on the write end in particular. The reading and sorting together take about 0.2 sec, as you can see by asking it to print what's at a bunch of indexes instead of writing to file, so more than twice as much time is spent writing as on everything else. If pypy is spending most of its time in timsort or whatever, then it looks like the sorting itself is surely massively better in Haskell, and just as simple, if you can just get your hands on the darned vector...

    I'm not sure why there aren't convenient functions around for reading and writing vectors of unboxed things from natural formats. If there were, this would be three lines long, would avoid String, and would be much faster, but maybe I just haven't seen them.

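    import qualified Data.ByteString.Lazy.Char8 as BL
    import qualified Data.ByteString.Char8 as B
    import qualified Data.Vector.Unboxed as V
    import Data.Vector.Algorithms.Radix (sort)
    import System.IO
    
    -- Read "data", radix-sort the integers, and write them to "sorted".
    main :: IO ()
    main  = do  unsorted <- fmap toInts (BL.readFile "data")
                vec <- V.thaw unsorted             -- mutable copy for in-place sorting
                sorted <- sort vec >> V.freeze vec
                withFile "sorted" WriteMode $ \handle ->
                   V.mapM_ (writeLine handle) sorted
    
    -- Write one integer per line (still round-trips through String via show).
    writeLine :: Handle -> Int -> IO ()
    writeLine h int = B.hPut h $ B.pack (show int ++ "\n")
    
    -- Parse the whole input; the prepended space lets oneInt treat the first
    -- number like every other one, i.e. as "separator, then digits".
    toInts :: BL.ByteString -> V.Vector Int
    toInts bs = V.unfoldr oneInt (BL.cons ' ' bs)
    
    -- Drop the single separator character, then read the next Int.
    -- Stops at end of input or at the first thing that is not an integer.
    oneInt :: BL.ByteString -> Maybe (Int, BL.ByteString)
    oneInt bs = if BL.null bs then Nothing else
                   let bstail = BL.tail bs
                   in if BL.null bstail then Nothing else BL.readInt bstail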
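
    The "one should use builders probably" remark could look something like the sketch below. It is not what was timed above: `renderLines` and `writeSorted` are hypothetical names introduced here, and the only assumption is the standard `Data.ByteString.Builder` module from bytestring. The idea is to render each Int straight to bytes and hand the handle one builder, instead of doing `show`, `B.pack` and `hPut` once per element.

```haskell
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Vector.Unboxed as V
import System.IO (IOMode (WriteMode), withFile)

-- Hypothetical helper: each Int in decimal followed by a newline,
-- built directly as bytes with no intermediate String.
renderLines :: V.Vector Int -> BB.Builder
renderLines = V.foldr (\n rest -> BB.intDec n <> BB.char7 '\n' <> rest) mempty

-- Hypothetical helper: write the rendered lines to a file in one go.
writeSorted :: FilePath -> V.Vector Int -> IO ()
writeSorted path vec =
  withFile path WriteMode $ \h -> BB.hPutBuilder h (renderLines vec)

-- Small demonstration on an in-memory vector instead of the real data file.
main :: IO ()
main = BL.putStr (BB.toLazyByteString (renderLines (V.fromList [3, -1, 20])))
```

    In the module above, `writeSorted "sorted" sorted` would stand in for the `withFile ... V.mapM_ (writeLine handle) sorted` step. Whether it actually closes the gap on the write end would have to be measured; no timing is claimed here.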
    
