How to implement lazy iterate when IO is involved in Haskell

问题

I'm using IO to encapsulate randomness. I am trying to write a method which iterates a next function n times, but the next function produces a result wrapped in IO because of the randomness.

Basically, my next function has this signature:

next :: IO Frame -> IO Frame

and I want to start with an initial Frame, then use the same pattern as iterate to get a list [Frame] with length n. Essentially, I'd like to be able to write the following:

runSimulation :: {- parameters -} -> IO [Frame]
runSimulation {- parameters -} = do
  {- some setup -}
  sequence . take n . iterate next $ firstFrame

Where firstFrame :: IO Frame formed by doing something like let firstFrame = return Frame x y z.

The problem I am encountering is that when I run this function, it never exits, so it seems to be running on an infinite loop (since iterate produces an infinite list).

I'm quite new to haskell so not sure where I'm going wrong here, or if my supposition above is correct that it seems that the entire infinite list is being executed.

(Update) In case it's helpful, here are the full definitions of Frame, next, and runSimulation:

-- A simulation Frame encapsulates the state of the simulation at some
-- point in "time". That means it contains a list of Agents in that
-- Frame, and a list of the Interactions that occurred in it as well. It
-- also contains the state of the World, as well as an AgentID counter
-- (so we can easily increment for generating new Agents).
data Frame = Frame AgentID [Agent] [Interaction]
  deriving Show

-- Generate the next Frame from the current one, including scoring the
-- Agents based on the outcomes *in this Frame*.
-- TODO: add in reproduction.
nextFrame :: Reactor -> World -> IO Frame -> IO Frame
nextFrame react w inp = do
  (Frame i agents history) <- inp
  interactions <- interactAll react history agents
  let scoredAgents = scoreAgents (rewards w) interactions agents
  return (Frame i scoredAgents interactions)

-- Run a simulation for a number of iterations
runSimulation :: World -> Reactor -> (Dist, Dist) -> IO [Frame]
runSimulation world react (gen_dist, sel_dist) = do
  startingAgents <- spawnAgents (initial_size world) (agentCreatorFactory gen_dist sel_dist)
  let firstFrame = return (Frame (length startingAgents) startingAgents [])
      next       = nextFrame react world
  sequence . take (iterations world) . iterate next $ firstFrame

回答1:

I don't know how much time computing each Frame takes, but I suspect you are doing more work than necessary. The cause is a bit subtle. iterate produces a list of repeated applications of a function. For each element in the list, the previous value is reused. Your list is composed of IO actions. The IO action at position n is computed from the already obtained IO action at position n-1 by applying next.

Alas, when executing those actions, we are not so lucky. Executing the action at position n in the list will repeat all the work of the previous actions! We shared work when building the actions themselves (which are values, like almost everything in Haskell) but not when executing them, which is a different thing.

The simplest solution could be to define this auxiliary function with a baked-in limit:

iterateM :: Monad m => (a -> m a) -> a -> Int -> m [a]
iterateM step = go
    where
    go _ 0 = return []
    go current limit =
        do next <- step current
           (current:) <$> go next (pred limit)

While simple, it's a bit inelegant, for two reasons:

It conflates the iteration process with the limiting of such process. In the pure list world we didn't have to do that, we could create infinite lists and take from then. But now in the effectful world that nice compositionality seems to be lost.
What if we want to do something with each value as it is being produced, without having to wait for all of them? Out function returns everything at the end, in one go.

As mentioned in the comments, streaming libraries like "conduit", "streamly" or "streaming" try to solve this problem in a better way, regaining some of the compositionality of pure lists. These libraries have types that represent effectful processes whose results are yielded piecewise.

For example, consider the function Streaming.Prelude.iterateM from "streaming", specialized to IO:

iterateM :: (a -> IO a) -> IO a -> Stream (Of a) IO r

It returns a Stream that we can "limit" using Streaming.Prelude.take:

take :: Int -> Stream (Of a) IO r -> Stream (Of a) IO ()

after limiting it we can get back to IO [a] with Streaming.Prelude.toList_ which accumulates all results:

toList_ :: Stream (Of a) IO r -> IO [a]

But instead of that we could process each element as it is being produced, with functions like Streaming.Prelude.mapM_:

mapM_ :: (a -> IO x) -> Stream (Of a) IO r -> IO r

回答2:

An elementary solution:

As an alternative to @danidiaz's answer, it is possible to solve the problem without resorting to extra libraries such as Streaming, assuming the role of IO can be minimized.

Most of the required code can be written in terms of the MonadRandom class, of which IO is just one instance. It is not necessary to use IO in order to generate pseudo-random numbers.

The required iteration function can be written like this, in do notation:

import  System.Random
import  Control.Monad.Random.Lazy

iterateM1 :: MonadRandom mr => (a -> mr a) -> a -> mr [a]
iterateM1 fn x0 = 
    do
        y  <- fn x0
        ys <- iterateM1 fn y
        return (x0:ys)

Unfortunately, the text of the question does not define exactly what a Frame object is, or what the next stepping function does; so I have to somehow fill in the blanks. Also the next name gets defined in the libraries involved, so I will have to use nextFrame instead of just next.

Let's assume that a Frame object is just a point in 3-dimensional space, and that at each random step, one and only one of the 3 dimensions is chosen at random, and the corresponding coordinate is bumped by an amount of either +1 or -1, with equal probabilities. This gives this code:

data Frame = Frame Int Int Int  deriving  Show

nextFrame :: MonadRandom mr => Frame -> mr Frame
nextFrame (Frame x y z) =
    do
        -- 3 dimensions times 2 possible steps: 1 & -1, hence 6 possibilities
        n <- getRandomR (0::Int, 5::Int)
        let fr = case n of
                  0 -> Frame (x-1) y z
                  1 -> Frame (x+1) y z
                  2 -> Frame x (y-1) z
                  3 -> Frame x (y+1) z
                  4 -> Frame x y (z-1)
                  5 -> Frame x y (z+1)
                  _ -> Frame x y z
        return fr

At that point, it is not difficult to write code that builds an unlimited list of Frame objects representing the simulation history. Creating that list does not cause the code to loop forever, and the usual take function can be used to select the first few elements of such a list.

Putting all the code together:

import  System.Random
import  Control.Monad.Random.Lazy

iterateM1 :: MonadRandom mr => (a -> mr a) -> a -> mr [a]
iterateM1 fn x0 = 
    do
        y  <- fn x0
        ys <- iterateM1 fn y
        return (x0:ys)

data Frame = Frame Int Int Int  deriving  Show

nextFrame :: MonadRandom mr => Frame -> mr Frame
nextFrame (Frame x y z) =
    do
        -- 3 dimensions times 2 possible steps: 1 & -1, hence 6 possibilities
        n <- getRandomR (0::Int, 5::Int)
        let fr = case n of
                  0 -> Frame (x-1) y z
                  1 -> Frame (x+1) y z
                  2 -> Frame x (y-1) z
                  3 -> Frame x (y+1) z
                  4 -> Frame x y (z-1)
                  5 -> Frame x y (z+1)
                  _ -> Frame x y z
        return fr

runSimulation :: MonadRandom mr => Int -> Int -> Int -> mr [Frame]
runSimulation x y z  =  let  fr0 = Frame x y z  in  iterateM1 nextFrame fr0

main = do
    rng0 <- getStdGen  -- PRNG hosted in IO monad
                       -- Could use mkStdGen or MkTFGen instead
    let
         sim  = runSimulation 0 0 0
         allFrames = evalRand sim rng0   -- unlimited list of frames !
         frameCount = 10
         frames = take frameCount allFrames
    mapM_  (putStrLn  .  show)  frames

Program execution:

$ ./frame
Frame 0 0 0
Frame 0 1 0
Frame 0 0 0
Frame 0 (-1) 0
Frame 1 (-1) 0
Frame 1 (-2) 0
Frame 1 (-1) 0
Frame 1 (-1) 1
Frame 1 0 1
Frame 2 0 1
$

For large values of frameCount, execution time is a quasi-linear function of frameCount, as expected.

More on monadic actions for random number generation here.

来源：https://stackoverflow.com/questions/60137468/how-to-implement-lazy-iterate-when-io-is-involved-in-haskell

标签

haskell

lazy-evaluation