I've written a small Haskell program to print the MD5 checksums of all files in the current directory (searched recursively). Basically a Haskell version of md5deep.
Lazy IO is very bug-prone.
As dons suggested, you should use strict IO.
You can use a tool such as Iteratee to help you structure strict IO code. My favorite tool for this job is monadic lists.
import Control.Monad (when) -- base
import Control.Monad.ListT (ListT) -- List
import Control.Monad.IO.Class (liftIO) -- transformers
import Data.Binary (encode) -- binary
import Data.Digest.Pure.MD5 -- pureMD5
import Data.List.Class (repeat, takeWhile, foldlL) -- List
import System.IO (IOMode(ReadMode), openFile, hClose)
import qualified Data.ByteString.Lazy as BS
import Prelude hiding (repeat, takeWhile)
hashFile :: FilePath -> IO BS.ByteString
hashFile =
  fmap (encode . md5Finalize) . foldlL md5Update md5InitialContext . strictReadFileChunks 1024

strictReadFileChunks :: Int -> FilePath -> ListT IO BS.ByteString
strictReadFileChunks chunkSize filename =
  takeWhile (not . BS.null) $ do
    handle <- liftIO $ openFile filename ReadMode
    repeat () -- this makes the lines below loop
    chunk <- liftIO $ BS.hGet handle chunkSize
    when (BS.null chunk) . liftIO $ hClose handle
    return chunk
I used the "pureMD5" package here because "Crypto" doesn't seem to offer a "streaming" MD5 implementation.
Monadic lists/ListT come from the "List" package on Hackage (the ListT in transformers and mtl is broken, and neither comes with useful functions like takeWhile).
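If you'd rather not depend on a list library at all, the same strict, chunk-at-a-time idea can be sketched with just base and bytestring. This is only a rough sketch and not part of the code above: foldFileChunks is a name I made up for illustration, and it works on strict ByteStrings rather than the lazy ones used in hashFile. It reads one chunk, feeds it to a step function, and repeats until hGet returns an empty chunk at end of file.

```haskell
import qualified Data.ByteString as B
import System.IO (IOMode(ReadMode), withFile)

-- Hypothetical helper: strictly fold a step function over the chunks of a
-- file. Each chunk is read with a strict hGet, so at most one chunk is held
-- in memory at a time and no lazy IO is involved.
foldFileChunks :: Int -> (a -> B.ByteString -> a) -> a -> FilePath -> IO a
foldFileChunks chunkSize step initial path =
  withFile path ReadMode (go initial)
  where
    go acc handle = do
      chunk <- B.hGet handle chunkSize
      if B.null chunk
        then return acc -- an empty chunk means end of file
        else do
          let acc' = step acc chunk
          -- force the accumulator so thunks don't pile up across chunks
          acc' `seq` go acc' handle
```

For example, foldFileChunks 1024 (\n c -> n + B.length c) 0 path counts the bytes in a file. Because it uses withFile, the handle is closed even if the step function throws, whereas the ListT version above closes it only once the end of the file is reached. Plugging in an incremental hash would look the same, provided the hash's update function accepts chunks of this type and size.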