How to make fromList lazy in this dynamic programming example?

问题

module Main where
  import System.Random
  import Data.Foldable
  import Control.Monad
  import qualified Data.Map as M
  import qualified Data.Vector as V
  import Debug.Trace
  import Data.Maybe
  import Data.Ord

  -- Represents the maximal integer. maxBound is no good because it overflows.
  -- Ideally should be something like a billion.
  maxi = 1000

  candies :: V.Vector Int -> Int --M.Map (Int, Int) Int
  candies ar = ff [l (V.length ar - 1) x | x <- [0..maxi]]
    where
      go :: Int -> Int -> Int
      go _ 0 = maxi
      go 0 j = j
      go i j =
        case compare (ar V.! (i-1)) (ar V.! i) of
          LT -> ff [l (i-1) x + j | x <- [0..j-1]]
          GT -> ff [l (i-1) x + j | x <- [j+1..maxi]]
          EQ -> ff [l (i-1) x + j | x <- [0..maxi]]
      l :: Int -> Int -> Int
      l i j = fromMaybe maxi (M.lookup (i,j) cs)
      ff l = --minimum l
        case l of
          l:ls -> if l < maxi then l else ff ls
          [] -> maxi

      -- I need to make this lazy somehow.
      cs :: M.Map (Int, Int) Int
      cs = M.fromList [((i,j), go i j) | i <- [0..V.length ar - 1], j <- [0..maxi]]


  main :: IO ()
  main = do
    --ar <- fmap (V.fromList . map read . tail . words) getContents
    g <- fmap (V.fromList . take 5 . randomRs (1,50)) getStdGen
    print $ candies g

The above code is for the HackerRank Candies challenge. I think the code is correct in essence even though it gives me runtime errors on submission. HackerRank does not say what those errors are, but most likely it is because I ran out allotted memory.

To make the above work, I need to rewrite the above so the fromList gets lazily evaluated or something to that effect. I like the above form and rewriting the functions so they pass along the map as a parameter is something I would very much like to avoid.

I know Haskell has various memoization libraries on Hackage, but the online judge does not allow their use.

I might have coded myself into a hole due to Haskell's purity.

Edit:

I did some experimenting in order to figure out how those folds and lambda's work. I think this is definitely linked to continuation passing after all, as the continuations are being built up along the fold. To show what I mean, I'll demonstrate it with a simple program.

module Main where
  trans :: [Int] -> [Int]
  trans m =
    foldr go (\_ -> []) m 0 where
      go x f y = (x + y) : f x

  main = do
    s <- return $ trans [1,2,3]
    print s

One thing that surprised me was that when I inserted a print, it got executed in a reverse manner, from left to right, which made me think at first that I misunderstood how foldr works. That turned out to not be the case.

What the above does is print out [1,3,5].

Here is the explanation how it executes. Trying to print out f x in the above will not be informative and will cause it to just all around the place.

It starts with something like this. The fold obviously executes 3 go functions.

go x f y = (x + y) : f x
go x f y = (x + y) : f x
go x f y = (x + y) : f x

The above is not quite true. One has to keep in mind that all fs are separate.

go x f'' y = (x + y) : f'' x
go x f' y = (x + y) : f' x
go x f y = (x + y) : f x

Also for clarity one it should also be instructive to separate out the lambdas.

go x f'' = \y -> (x + y) : f'' x
go x f' = \y -> (x + y) : f' x
go x f = \y -> (x + y) : f x

Now the fold starts from the top. The topmost statement gets evaluated as...

go 3 (\_ -> []) = \y -> (3 + y) : (\_ -> []) 3

This reduces to:

go 3 (\_ -> []) = (\y -> (3 + y) : [])

The result is the unfinished lambda above. Now the fold evaluates the second statement.

go 2 (\y -> (3 + y) : []) = \y -> (2 + y) : (\y -> (3 + y) : []) 2

This reduces to:

go 2 (\y -> (3 + y) : []) = (\y -> (2 + y) : 5 : [])

The the fold goes to the last statement.

go 1 (\y -> (2 + y) : 5 : []) = \y -> (1 + y) : (\y -> (2 + y) : 5 : []) 1

This reduces to:

go 1 (\y -> (2 + y) : 5 : []) = \y -> (1 + y) : 3 : 5 : []

The the 0 outside the fold gets applied and the final lambda gets reduced to

1 : 3 : 5 : []

This is just the start of it. The case gets more interesting when f x is replaced with f y.

Here is a similar program to the previous.

module Main where
  trans :: [Int] -> [Int]
  trans m =
    foldr go (\_ -> []) m 1 where
      go x f y = (x + y) : f (2*y+1)

  main = do
    s <- return $ trans [1,2,3]
    print s

Let me once again go from top to bottom.

go x f'' = \y -> (x + y) : f'' (2*y+1)
go x f' = \y -> (x + y) : f' (2*y+1)
go x f = \y -> (x + y) : f (2*y+1)

The top statement.

go 3 (\_ -> []) = \y -> (3 + y) : (\_ -> []) (2*y+1)

The middle statement:

go 2 (\y -> (3 + y) : (\_ -> []) (2*y+1)) = \y -> (2 + y) : (\y -> (3 + y) : (\_ -> []) (2*y+1)) (2*y+1)

The last statement:

go 1 (\y -> (2 + y) : (\y -> (3 + y) : (\_ -> []) (2*y+1)) (2*y+1)) = \y -> (1 + y) : (\y -> (2 + y) : (\y -> (3 + y) : (\_ -> []) (2*y+1)) (2*y+1)) 2*y+1

Notice how the expressions build up because ys cannot be applied. Only after the 0 gets inserted can the whole expression be evaluated.

(\y -> (1 + y) : (\y -> (2 + y) : (\y -> (3 + y) : (\_ -> []) (2*y+1)) (2*y+1)) 2*y+1) 1

2 : (\y -> (2 + y) : (\y -> (3 + y) : (\_ -> []) (2*y+1)) (2*y+1)) 3

2 : 5 : (\y -> (3 + y) : (\_ -> []) (2*y+1)) 7

2 : 5 : 10 : (\_ -> []) 15

2 : 5 : 10 : []

There is a buildup due to the order of evaluation.

Edit: So...

go (candy, score) f c s = (candy', score): f candy' score
    where candy' = max candy $ if s < score then c + 1 else 1

The above in fact does 3 passes across the list in each iteration.

First foldr has to travel to back of the list before it can begin. Then as candi' depends on s and c variables which cannot be applied immediately this necessitates building up the continuations as in that last example.

Then when the two 0 0 are fed into at the end of the fold, the whole thing only then gets evaluated.

It is a bit hard to reason about.

回答1:

The problem you have linked to has a clean Haskell solution using right folds. In other words, you can skip worrying about lazy fromList, memoization and all that by just using a more functional style.

The idea is that you maintain a list of (candy, score) pairs where candy is zero initially for all (repeat 0 in bellow code). Then you go once from left to right and bump up candy values if this item score exceeds the one before:

-- s is the score and c is the candy of the guy before
-- if s < score then this guy should get at least c + 1 candies
candy' = max candy $ if s < score then c + 1 else 1

and do the same thing again going in the other direction:

import Control.Monad (replicateM)
import Control.Applicative ((<$>))

solve :: [Int] -> Int
solve = sum . map fst . loop . reverse . loop . zip  (repeat 0)
    where
    loop cs = foldr go (\_ _ -> []) cs 0 0
    go (candy, score) f c s = (candy', score): f candy' score
        where candy' = max candy $ if s < score then c + 1 else 1

main = do
    n <- read <$> getLine
    solve . fmap read <$> replicateM n getLine >>= print

This performs linearly, and passes all tests on HackerRank.

回答2:

Well, regarding my own question at the top, probably the way to make the thing lazy would be to just use a list (a list of lists or a vector of lists.) The reason why the above is impossible to make lazy is because the Map type is lazy in the values and strict in the keys.

More importantly, my analysis that the fold is doing essentially two passes was completely right. The way those built up continuations are being executed in reverse completely tripped me up at first, but I've adapted @behzad.nouri code to work with only a single loop.

module Main where
  import Control.Monad (replicateM)
  import Control.Applicative ((<$>))
  import Debug.Trace


  solve :: [Int] -> Int
  solve = sum . loop
      where
      loop :: [Int] -> [Int]
      loop = (\(_,_,x) -> x 0 0) . foldr go (0, 0, \_ _ -> [])
      go :: Int -> (Int, Int, Int -> Int -> [Int]) -> (Int, Int, Int -> Int -> [Int])
      go score (candyP,scoreP,f) =
        let
          candyP' = if scoreP < score then candyP + 1 else 1
          in
            (candyP', score,
              \candyN scoreN ->
                let
                  candy' = max candyP' $ if scoreN < score then candyN + 1 else 1
                  in candy' : f candy' score) -- This part could be replaced with a sum

  main = do
      n <- read <$> getLine
      solve . fmap read <$> replicateM n getLine >>= print

The above passes all tests, no problem, and that is convincing proof that the above analysis is correct.

来源：https://stackoverflow.com/questions/37501967/how-to-make-fromlist-lazy-in-this-dynamic-programming-example

标签

haskell

dynamic-programming

lazy-evaluation

memoization

continuations