Optimizing Haskell code

前端 未结 6 1241
轮回少年
轮回少年 2020-12-31 04:20

I\'m trying to learn Haskell and after an article in reddit about Markov text chains, I decided to implement Markov text generation first in Python and now in Haskell. Howev

6条回答
  •  不思量自难忘°
    2020-12-31 04:54

    I tried to avoid doing anything fancy or subtle. These are just two approaches to doing the grouping; the first emphasizes pattern matching, the second doesn't.

    import Data.List (foldl')
    import qualified Data.Map as M
    import qualified Data.ByteString.Char8 as B
    
    type Database2 = M.Map (B.ByteString, B.ByteString) (M.Map B.ByteString Int)
    
    train2 :: [B.ByteString] -> Database2
    train2 words = go words M.empty
        where go (x:y:[]) m = m
              go (x:y:z:xs) m = let addWord Nothing   = Just $ M.singleton z 1
                                    addWord (Just m') = Just $ M.alter inc z m'
                                    inc Nothing    = Just 1
                                    inc (Just cnt) = Just $ cnt + 1
                                in go (y:z:xs) $ M.alter addWord (x,y) m
    
    train3 :: [B.ByteString] -> Database2
    train3 words = foldl' update M.empty (zip3 words (drop 1 words) (drop 2 words))
        where update m (x,y,z) = M.alter (addWord z) (x,y) m
              addWord word = Just . maybe (M.singleton word 1) (M.alter inc word)
              inc = Just . maybe 1 (+1)
    
    main = do contents <- B.readFile "76.txt"
              let db = train3 $ B.words contents
              print $ "Built a DB of " ++ show (M.size db) ++ " words"
    

    I think they are both faster than the original version, but admittedly I only tried them against the first reasonable corpus I found.

    EDIT As per Travis Brown's very valid point,

    train4 :: [B.ByteString] -> Database2
    train4 words = foldl' update M.empty (zip3 words (drop 1 words) (drop 2 words))
        where update m (x,y,z) = M.insertWith (inc z) (x,y) (M.singleton z 1) m
              inc k _ = M.insertWith (+) k 1
    

提交回复
热议问题