Is there a way to split an InputStream?

Question

I wonder if there is a way to "split" or "duplicate" a System.IO.Streams.InputStream from the io-streams package so that it can be forwarded to two processing stages?

duplicate :: InputStream a -> IO (InputStream a, InputStream a)

I can see that this probably doesn't fit the demand-driven nature of streams, but what would be the canonical solution if several things need to process the same input? Would you build a pipeline that "writes to the side"? Like:

input >>= countEvents countIORef >>= logEvents loggerRef >>= output

I could probably go the Arrow route and store everything in tuples, but that is going to get messy quickly, and to my knowledge there is no Arrow interface to io-streams:

input >>= (countEvents *** logEvents) >>= output

Any recommendations?

Answer 1:

You can do this a number of ways, but since countEvents and logEvents are both folds over the stream, repeated application of inputFoldM is probably the simplest. You aren't looking for a way to split the stream so much as a way to chain the folds. inputFoldM turns a stream into a new stream paired with an IO action that yields the result of folding over the elements read from it, writing the result 'to the side' as you say.

>>> let logger = IOS.inputFoldM (\() x -> print x >> putStrLn "----") ()
>>> let counter = IOS.inputFoldM (\a _ -> return $! a + 1) (0::Int)
>>> ls0 <- IOS.fromList [1..5::Int]
>>> (ls1,io_count) <- counter ls0
>>> (ls2,_) <- logger ls1
>>> IOS.fold (+) 0 ls2
1          -- here we see the "logging" happening from `logger`
----
2
----
3
----
4
----
5
----
15        -- this is the sum from the `fold (+) 0` that actually exhausted the stream
>>> io_count 
5         -- this is the result of `counter`
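
The IORef-based pipeline from the question can also be written quite literally. The following is only a minimal sketch: it assumes IOS.mapM_ (which runs an action on each element as it passes through the stream) and uses an IORef for the counter. The names pipeline and countRef are made up for this illustration.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import qualified System.IO.Streams as IOS

-- Hypothetical helper: count and log elements "to the side",
-- then sum them. Returns (sum, count).
pipeline :: [Int] -> IO (Int, Int)
pipeline xs = do
  countRef <- newIORef (0 :: Int)
  total <- IOS.fromList xs
             >>= IOS.mapM_ (\_ -> modifyIORef' countRef (+ 1))  -- count to the side
             >>= IOS.mapM_ print                                -- log to the side
             >>= IOS.fold (+) 0                                 -- the final consumer drives the stream
  count <- readIORef countRef
  return (total, count)

main :: IO ()
main = pipeline [1 .. 5] >>= print
```

Nothing is counted or logged until the final fold pulls elements through, which is the demand-driven behaviour the question refers to.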

For what it's worth, I wrote a patch that makes it possible to apply the Folds and FoldMs from the foldl library to InputStreams (https://github.com/snapframework/io-streams/issues/53). This lets you run any number of folds simultaneously in a single pass, selecting elements as you please with lens, which fits the Arrow analogy you mention. Here are the same folds/sinks applied that way. I used ApplicativeDo to write one big fold that does the "logging", collects statistics, and formats them. The same thing can be written with the applicative operators.

{-# LANGUAGE ApplicativeDo #-}

import qualified System.IO.Streams as IOS
import qualified Control.Foldl as L
import Control.Lens (filtered)

main :: IO ()
main = do
  ls <- IOS.fromList [1..5::Int]
  -- foldM_ is the fold runner from the patch mentioned above
  res <- L.impurely IOS.foldM_ myfolds ls
  putStrLn res

-- One combined fold: statistics plus "logging" of every element
myfolds :: L.FoldM IO Int String
myfolds = do
  sum_        <- L.generalize L.sum     -- generalize makes an 'impure' fold
  length_     <- L.generalize L.length  -- out of a pure one like sum or length
  odd_length_ <- L.generalize (L.handles (filtered odd) L.length)
  _           <- L.sink (\n -> print n >> putStrLn "-------")
  pure (format sum_ length_ odd_length_)

 where
  format sum_ length_ odd_length_ = unlines
     [ ""
     , "Results:"
     , "sum:        " ++ show sum_
     , "length:     " ++ show length_
     , "number odd: " ++ show odd_length_]

Running it produces:

>>> main
1
-------
2
-------
3
-------
4
-------
5
-------

Results:
sum:        15
length:     5
number odd: 3
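
For comparison, here is a sketch of the same combined fold written with the applicative operators instead of ApplicativeDo. It also includes a small helper, foldStream (a name invented for this example), that runs a FoldM over an InputStream using the plain IOS.foldM, in case the foldM_ from the patch is not available in your version of io-streams. Treat it as an illustration under those assumptions rather than the library's own interface.

```haskell
import qualified Control.Foldl as L
import Control.Lens (filtered)
import qualified System.IO.Streams as IOS

-- Hypothetical helper: unpack a FoldM and drive it with IOS.foldM.
foldStream :: L.FoldM IO a b -> IOS.InputStream a -> IO b
foldStream (L.FoldM step begin done) is = do
  s0 <- begin                   -- initial accumulator
  s  <- IOS.foldM step s0 is    -- one pass over the stream
  done s                        -- extract the final result

-- The same folds as above, combined with <$>, <*> and <*.
myfolds :: L.FoldM IO Int String
myfolds =
  format
    <$> L.generalize L.sum
    <*> L.generalize L.length
    <*> L.generalize (L.handles (filtered odd) L.length)
    <*  L.sink (\n -> print n >> putStrLn "-------")  -- result () is discarded
  where
    format :: Int -> Int -> Int -> String
    format sum_ length_ oddLength_ = unlines
      [ ""
      , "Results:"
      , "sum:        " ++ show sum_
      , "length:     " ++ show length_
      , "number odd: " ++ show oddLength_
      ]

main :: IO ()
main = do
  ls  <- IOS.fromList [1 .. 5 :: Int]
  res <- foldStream myfolds ls
  putStrLn res
```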

The "beautiful folding" folds, like the ones in foldl, are nice because they are not tied to any particular framework. You can apply myfolds without alteration to a list, a Sequence, an unboxed vector, a pipes Producer, a conduit Source, and so on. It's a separate discipline of hyper-composable folds and sinks.
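
To make the framework independence concrete, here is a small sketch in which one pure fold is applied to both a list and a Data.Sequence through L.fold, which accepts any Foldable container. The name stats is made up for this example, and the Data.Sequence import assumes the containers package.

```haskell
import qualified Control.Foldl as L
import Control.Lens (filtered)
import qualified Data.Sequence as Seq

-- (sum, length, number of odd elements), computed in a single pass.
stats :: L.Fold Int (Int, Int, Int)
stats = (,,) <$> L.sum <*> L.length <*> L.handles (filtered odd) L.length

main :: IO ()
main = do
  print (L.fold stats [1 .. 5])                 -- over a list
  print (L.fold stats (Seq.fromList [1 .. 5]))  -- over a Data.Sequence
```

Both calls give (15,5,3); the fold itself never mentions the container it runs over.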

Source: https://stackoverflow.com/questions/38307729/is-there-a-way-to-split-an-inputstream

Tags

haskell

stream