I have a simple script written in both Python and Haskell. It reads a file with 1,000,000 newline separated integers, parses that file into a list of integers, quick sorts i
This is after the fact, but I think most of the trouble is in the Haskell writing. The following module is pretty primitive -- one should use builders probably and certainly avoid the ridiculous roundtrip via String for showing -- but it is simple and did distinctly better than pypy with kindall's improved python and better than the 2 and 4 sec Haskell modules elsewhere on this page (it surprised me how much they were using lists, so I made a couple more turns of the crank.)
$ time aa.hs real 0m0.709s
$ time pypy aa.py real 0m1.818s
$ time python aa.py real 0m3.103s
I'm using the sort recommended for unboxed vectors from vector-algorithms. The use of Data.Vector.Unboxed in some form is clearly now the standard, naive way of doing this sort of thing -- it's the new Data.List (for Int, Double, etc.) Everything but the sort is irritating IO management, which could I think still be massively improved, on the write end in particular. The reading and sorting together take about 0.2 sec as you can see from asking it to print what's at a bunch of indexes instead of writing to file, so twice as much time is spent writing as in anything else. If the pypy is spending most of its time using timsort or whatever, then it looks like the sorting itself is surely massively better in Haskell, and just as simple -- if you can just get your hands on the darned vector...
I'm not sure why there aren't convenient functions around for reading and writing vectors of unboxed things from natural formats -- if there were, this would be three lines long and would avoid String and be much faster, but maybe I just haven't seen them.
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.ByteString.Char8 as B
import qualified Data.Vector.Unboxed.Mutable as M
import qualified Data.Vector.Unboxed as V
import Data.Vector.Algorithms.Radix
import System.IO
main = do unsorted <- fmap toInts (BL.readFile "data")
vec <- V.thaw unsorted
sorted <- sort vec >> V.freeze vec
withFile "sorted" WriteMode $ \handle ->
V.mapM_ (writeLine handle) sorted
writeLine :: Handle -> Int -> IO ()
writeLine h int = B.hPut h $ B.pack (show int ++ "\n")
toInts :: BL.ByteString -> V.Vector Int
toInts bs = V.unfoldr oneInt (BL.cons ' ' bs)
oneInt :: BL.ByteString -> Maybe (Int, BL.ByteString)
oneInt bs = if BL.null bs then Nothing else
let bstail = BL.tail bs
in if BL.null bstail then Nothing else BL.readInt bstail