Haskell lazy I/O and closing files

攒了一身酷 2020-12-07 23:35

I've written a small Haskell program to print the MD5 checksums of all files in the current directory (searched recursively). Basically a Haskell version of md5deep.

7 Answers
  •  被撕碎了的回忆
    2020-12-08 00:04

    Lazy IO is very bug-prone.

    As dons suggested, you should use strict IO.

    You can use a tool such as Iteratee to help you structure strict IO code. My favorite tool for this job is monadic lists.

    import Control.Monad (when) -- base; brings `when` (used below) into scope
    import Control.Monad.ListT (ListT) -- List
    import Control.Monad.IO.Class (liftIO) -- transformers
    import Data.Binary (encode) -- binary
    import Data.Digest.Pure.MD5 -- pureMD5
    import Data.List.Class (repeat, takeWhile, foldlL) -- List
    import System.IO (IOMode(ReadMode), openFile, hClose)
    import qualified Data.ByteString.Lazy as BS
    import Prelude hiding (repeat, takeWhile)
    
    hashFile :: FilePath -> IO BS.ByteString
    hashFile =
        fmap (encode . md5Finalize) . foldlL md5Update md5InitialContext . strictReadFileChunks 1024
    
    strictReadFileChunks :: Int -> FilePath -> ListT IO BS.ByteString
    strictReadFileChunks chunkSize filename =
        takeWhile (not . BS.null) $ do
            handle <- liftIO $ openFile filename ReadMode
            repeat () -- this makes the lines below loop
            chunk <- liftIO $ BS.hGet handle chunkSize
            when (BS.null chunk) . liftIO $ hClose handle
            return chunk
    

    I used the "pureMD5" package here because "Crypto" doesn't seem to offer a "streaming" md5 implementation.

    Monadic lists (ListT) come from the "List" package on Hackage. The ListT in transformers and mtl is broken and also doesn't come with useful functions like takeWhile.
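
For comparison, the same "read strictly in chunks, close the handle deterministically" idea can be sketched with only base and bytestring. This is a minimal sketch, not the answer's method: `foldFileChunks` and `fileSize` are hypothetical names introduced here, and a byte count stands in for the hash so the example needs no extra packages.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString as B
import System.IO (IOMode(ReadMode), withBinaryFile)

-- | Strictly fold over a file's contents in fixed-size chunks.
-- 'withBinaryFile' opens the handle and closes it as soon as the fold
-- returns or an exception is thrown, so no handle outlives this call.
-- (Hypothetical helper, not part of any library.)
foldFileChunks :: Int -> (a -> B.ByteString -> a) -> a -> FilePath -> IO a
foldFileChunks chunkSize step z path =
    withBinaryFile path ReadMode (go z)
  where
    go acc handle = do
        -- Strict read: the bytes are in memory before we continue.
        chunk <- B.hGet handle chunkSize
        if B.null chunk
            then return acc -- an empty chunk means end of file
            else do
                -- Force the accumulator to weak head normal form so
                -- thunks do not pile up across chunks.
                acc' <- evaluate (step acc chunk)
                go acc' handle

-- | Stand-in for a streaming hash: counts the bytes in a file.
fileSize :: FilePath -> IO Int
fileSize = foldFileChunks 1024 (\n chunk -> n + B.length chunk) 0
```

In principle a streaming hash's update function and initial context could take the place of the step function and the starting value. Whether pureMD5's md5Update fits directly depends on the package version, since its argument type has differed between strict and lazy ByteString.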
