Binary Serialization for Lists of Undefined Length in Haskell

迷失自我 2020-12-31 02:08

I've been using Data.Binary to serialize data to files. In my application I incrementally add items to these files. The two most popular serialization packages, binary an

2 Answers
  •  無奈伤痛
    2020-12-31 02:21

    It has been four years since this question was answered, but I ran into the same problem as gatoatigrado in the comment on Don Stewart's answer. The put method works as advertised, but get reads the whole input. I believe the problem lies in the pattern match Stream xs <- get in the case statement, which must determine whether the rest of the input decodes to a Stream a before returning.
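For context, here is a minimal sketch of the kind of Stream instance under discussion. It is reconstructed from the description in this thread (one tag byte per list cell, and a Stream xs <- get in the decoder), so the names Stream, unstream and roundTrip and the tag values 0 and 1 are assumptions, not a verbatim copy of the earlier answer.

```haskell
import Data.Binary (Binary (..), decode, encode, getWord8, putWord8)

-- Hypothetical wrapper: a list written cell by cell, with no length prefix.
newtype Stream a = Stream { unstream :: [a] }

instance Binary a => Binary (Stream a) where
  -- Assumed encoding: tag 0 marks [], tag 1 marks a (:) cell.
  put (Stream [])       = putWord8 0
  put (Stream (x : xs)) = putWord8 1 >> put x >> put (Stream xs)

  get = do
    tag <- getWord8
    case tag of
      0 -> return (Stream [])
      1 -> do
        x <- get
        -- This recursive get has to complete before the action returns,
        -- which is the step the answer suspects of forcing the whole input.
        Stream xs <- get
        return (Stream (x : xs))
      _ -> fail "Stream: unexpected tag byte"

-- Encode a list as a Stream and decode it again.
roundTrip :: Binary a => [a] -> [a]
roundTrip = unstream . decode . encode . Stream
```

With an encoding like this, put can emit each cell as soon as it is available, while get only hands back the list once the terminating tag has been read.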

    My solution used the example in Data.Binary.Get as a starting point:

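    import Data.ByteString.Lazy (ByteString, toChunks)
    import qualified Data.ByteString as S
    import Data.Binary (Binary(..), Get, getWord8)
    import Data.Binary.Get (Decoder(..), pushChunk, runGetIncremental)
    import Data.List (unfoldr)

    -- Lazily decode a Stream-encoded list, skipping the tag byte before each element.
    decodes :: Binary a => ByteString -> [a]
    decodes = runGets (getWord8 >> get)

    -- Run the same decoder repeatedly over the strict chunks of the input.
    runGets :: Get a -> ByteString -> [a]
    runGets g = unfoldr (decode1 d) . toChunks
      where d = runGetIncremental g

    -- Feed chunks until one value is complete; leftover bytes go back on the front.
    decode1 :: Decoder a -> [S.ByteString] -> Maybe (a, [S.ByteString])
    decode1 _ [] = Nothing
    decode1 d (x:xs) = case d `pushChunk` x of
                         Fail _ _ str  -> error str
                         Done x' _ a   -> Just (a, x':xs)
                         k@(Partial _) -> decode1 k xs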
    

    Note the use of getWord8. It reads the encoded [] and : tags that the put of the Stream instance writes. Because getWord8 discards these tags without inspecting them, this implementation will not detect the end of the list. My encoded file was just a single list, so it works for that, but otherwise you will need to modify it.
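One possible modification is to inspect the tag byte and stop at the end-of-list marker. The following is only a sketch under assumptions: it takes tag 0 to mean [] and tag 1 to mean :, and the names getCell, decodeStream and encodeStream are hypothetical helpers introduced here, not part of the binary package.

```haskell
import Data.Binary (Binary (..), Get, getWord8, putWord8)
import Data.Binary.Get (Decoder (..), pushChunk, pushEndOfInput, runGetIncremental)
import Data.Binary.Put (runPut)
import Data.ByteString.Lazy (ByteString, toChunks)

-- Read one list cell: Nothing for the assumed [] tag, Just x for the assumed (:) tag.
getCell :: Binary a => Get (Maybe a)
getCell = do
  tag <- getWord8
  case tag of
    0 -> return Nothing
    1 -> Just <$> get
    _ -> fail "decodeStream: unexpected tag byte"

-- Lazily decode a single tagged list, stopping at its terminator.
decodeStream :: Binary a => ByteString -> [a]
decodeStream = go fresh . toChunks
  where
    fresh = runGetIncremental getCell

    go d chunks = case d of
      Fail _ _ msg         -> error msg
      Done _ _ Nothing     -> []  -- terminator seen; remaining bytes are ignored
      Done rest _ (Just a) -> a : go (fresh `pushChunk` rest) chunks
      Partial _ -> case chunks of
        []       -> go (pushEndOfInput d) []  -- input ended without a terminator
        (c : cs) -> go (d `pushChunk` c) cs

-- Write the same assumed format, so that the sketch can be tried out.
encodeStream :: Binary a => [a] -> ByteString
encodeStream xs = runPut (mapM_ (\x -> putWord8 1 >> put x) xs >> putWord8 0)
```

For example, decodeStream (encodeStream "abc") gives back "abc". Input that ends before the terminator raises an error through the Fail branch, which mirrors the error str in the code above and would need to be relaxed for a file that is still being appended to.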

    In any case, this decodes ran in constant memory both when accessing the head and when accessing the last element.
