Merge sort in Haskell

只谈情不闲聊 submitted on 2019-12-02 21:35:43

Try this version:

mergesort :: [String] -> [String]
mergesort = mergesort' . map wrap

mergesort' :: [[String]] -> [String]
mergesort' [] = []
mergesort' [xs] = xs
mergesort' xss = mergesort' (merge_pairs xss)

merge_pairs :: [[String]] -> [[String]]
merge_pairs [] = []
merge_pairs [xs] = [xs]
merge_pairs (xs:ys:xss) = merge xs ys : merge_pairs xss

merge :: [String] -> [String] -> [String]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
 = if x > y
        then y : merge (x:xs)  ys
        else x : merge  xs    (y:ys)

wrap :: String -> [String]
wrap x = [x]
  1. Splitting the list first is a bad idea. Instead, just turn it into a list of one-element lists. Haskell is lazy, so that work is done only when it is needed.
  2. Then merge pairs of lists until only one list is left.

Edit: To whoever down-voted this answer: the merge sort above is the same algorithm GHC uses for Data.List.sort, just with the cmp function removed. Well, maybe the GHC authors are wrong :-/
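
For comparison, here is a minimal sketch of the same bottom-up scheme with the comparison function passed back in. The name `mergesortBy` and its local helpers are made up for illustration; this is not GHC's source, only the answer's code generalised under that assumption.

```haskell
-- Hypothetical generalisation of the answer's code: the element type is
-- polymorphic and the comparison is passed in as `cmp`.
mergesortBy :: (a -> a -> Ordering) -> [a] -> [a]
mergesortBy cmp = mergeAll . map (: [])
  where
    -- Keep merging neighbouring runs until a single run is left.
    mergeAll []   = []
    mergeAll [xs] = xs
    mergeAll xss  = mergeAll (mergePairs xss)

    -- One pass: merge runs two at a time, halving the number of runs.
    mergePairs (xs : ys : rest) = merge xs ys : mergePairs rest
    mergePairs xss              = xss

    -- Two-way merge; on ties the element from the left run goes first.
    merge [] ys = ys
    merge xs [] = xs
    merge (x : xs) (y : ys)
      | cmp x y == GT = y : merge (x : xs) ys
      | otherwise     = x : merge xs (y : ys)
```

With this sketch, `mergesortBy compare` behaves like the `mergesort` above on strings, and `mergesortBy (flip compare)` sorts in descending order.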

In Haskell, a String is a lazy list of characters and has the same overhead as any other list. If I remember right from a talk I heard Simon Peyton Jones give in 2004, the space cost in GHC is 40 bytes per character. For an apples-to-apples comparison you should probably be sorting Data.ByteString values, which are designed to give performance comparable to other languages.
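
As a rough illustration of that suggestion, the sketch below packs each String into a strict ByteString and sorts those with Data.List.sort. The helper name `sortPacked` is an assumption made for this example, and `B.pack` from Data.ByteString.Char8 keeps only the low 8 bits of each character, so it is only suitable for ASCII or Latin-1 input.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.List (sort)

-- Hypothetical helper: pack each String into a strict ByteString
-- (a contiguous byte buffer) and sort those instead of [Char] lists.
-- Note: B.pack truncates every Char to 8 bits.
sortPacked :: [String] -> [B.ByteString]
sortPacked = sort . map B.pack
```

In a real benchmark you would read the input as a ByteString directly (for example with `B.readFile` and `B.lines`) rather than going through String first.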

A better way to split the list, which avoids the issue CesarB points out:

split []             = ([], [])
split [x]            = ([x], [])
split (x : y : rest) = (x : xs, y : ys)
                       where (xs, ys) = split rest

mergeSort []  = []
mergeSort [x] = [x]
mergeSort xs  = merge (mergeSort ys) (mergeSort zs)
                where (ys, zs) = split xs

EDIT: Fixed.
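
The snippet above relies on a `merge` defined elsewhere. A self-contained sketch, assuming a `merge` generalised to any `Ord` type (the type signatures are additions, not part of the original answer), could look like this:

```haskell
-- Deal elements alternately into two lists in a single pass,
-- without calling length or splitAt.
split :: [a] -> ([a], [a])
split []             = ([], [])
split [x]            = ([x], [])
split (x : y : rest) = (x : xs, y : ys)
  where (xs, ys) = split rest

-- Two-way merge of two already sorted lists.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x > y     = y : merge (x : xs) ys
  | otherwise = x : merge xs (y : ys)

-- Top-down merge sort built on the alternating split.
mergeSort :: Ord a => [a] -> [a]
mergeSort []  = []
mergeSort [x] = [x]
mergeSort xs  = merge (mergeSort ys) (mergeSort zs)
  where (ys, zs) = split xs
```

For example, `split "abcde"` gives `("ace", "bd")`: the halves differ in length by at most one, which is all the sort needs.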

I am not sure if this is the cause of your problem, but remember that lists are a sequential data structure. In particular, length xs walks the whole list and splitAt n xs walks its first n elements, so splitting a list in half this way takes time proportional to its length (O(n)).
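
The question's code is not shown here, so the following is only an assumed shape of a length-based split, with the hypothetical name `halve`, to make the two traversals visible:

```haskell
-- Assumed shape of a length-based split (not taken from the question).
-- `length xs` traverses the entire list once; `splitAt` then traverses
-- the first half a second time to build the left part.
halve :: [a] -> ([a], [a])
halve xs = splitAt (length xs `div` 2) xs
```

A merge sort that calls something like this at every level of recursion pays for those traversals each time it splits.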

In C and Java, you are most probably using arrays, which take constant time for both operations (O(1)).

Edit: To answer your question about how to make it more efficient: you can use arrays in Haskell too.
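
As a small sketch of what that buys you, the immutable arrays in Data.Array (from the array package that ships with GHC) give constant-time indexing and a constant-time size. The helper names `fromList`, `size`, and `middle` are invented for this example.

```haskell
import Data.Array (Array, bounds, listArray, (!))

-- Build an immutable array indexed from 0 (O(n), done once).
fromList :: [a] -> Array Int a
fromList xs = listArray (0, length xs - 1) xs

-- Number of elements, read from the stored bounds in O(1).
size :: Array Int a -> Int
size arr = let (lo, hi) = bounds arr in hi - lo + 1

-- Middle element by direct indexing, O(1); Nothing for an empty array.
middle :: Array Int a -> Maybe a
middle arr
  | size arr == 0 = Nothing
  | otherwise     = Just (arr ! (lo + size arr `div` 2))
  where (lo, _) = bounds arr
```

This only shows the access costs. An array-based merge sort would more likely use mutable arrays, such as those in Data.Array.ST, so that merging does not rebuild a whole array at every step.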
