Explicit Purely-Functional Data-Structure For Difference Lists

遥遥无期 2020-12-10 18:10

In Haskell, difference lists, in the sense of

[a] representation of a list with an efficient concatenation operation

seem to be

3 Answers
  •  失恋的感觉
    2020-12-10 19:00

    (++)'s notoriously bad asymptotics come about when you use it in a left-associative manner - that is, when (++)'s left argument is the result of another call to (++). Right-associative expressions run efficiently, though.

    To put it more concretely: evaluating a left-nested append of m lists like

    ((ws ++ xs) ++ ys) ++ zs  -- m = 3 in this example
    

    to WHNF requires you to force m thunks, because (++) is strict in its left argument.

    case (
        case (
            case ws of { [] -> xs ; (w:ws) -> w:(ws ++ xs) }
        ) of { [] -> ys ; (x:xs) -> x:(xs ++ ys) }
    ) of { [] -> zs ; (y:ys) -> y:(ys ++ zs) }
    

    In general, fully evaluating n elements of such a list requires forcing that stack of m thunks n times, for O(m*n) complexity. When the entire list is built from appends of singleton lists (i.e. (([w] ++ [x]) ++ [y]) ++ [z]), m = n, so the cost is O(n^2).

    Evaluating a right-nested append like

    ws ++ (xs ++ (ys ++ zs))
    

    to WHNF is much easier (O(1)):

    case ws of
        [] -> xs ++ (ys ++ zs)
        (w:ws) -> w:(ws ++ (xs ++ (ys ++ zs)))
    

    Evaluating n elements just requires evaluating n thunks, which is about as good as you can expect to do.
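The difference between the two nestings can be made concrete with a rough cost model. The sketch below assumes that fully evaluating `xs ++ ys` costs `length xs` steps, since (++) rebuilds only its left argument. The names `appendCost`, `leftNested` and `rightNested` are hypothetical helpers, not part of any library.

```haskell
-- Assumed cost model: (++) rebuilds only its left argument,
-- so fully evaluating xs ++ ys costs (length xs) steps.
appendCost :: [a] -> [a] -> Int
appendCost xs _ = length xs

-- Left-nested: (([] ++ l1) ++ l2) ++ l3 ...
-- Returns the total cost under the model together with the result.
leftNested :: [[a]] -> (Int, [a])
leftNested = foldl step (0, [])
  where
    step (cost, acc) xs = (cost + appendCost acc xs, acc ++ xs)

-- Right-nested: l1 ++ (l2 ++ (l3 ++ []))
rightNested :: [[a]] -> (Int, [a])
rightNested = foldr step (0, [])
  where
    step xs (cost, acc) = (cost + appendCost xs acc, xs ++ acc)
```

For 100 singleton lists, `leftNested` reports a cost of 0 + 1 + ... + 99 = 4950 under this model, while `rightNested` reports 100. Both produce the same list.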


    Difference lists work by paying a small (O(m)) up-front cost to automatically re-associate calls to (++).

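    newtype DList a = DList ([a] -> [a])
    fromList xs = DList (xs ++)
    toList (DList f) = f []
    
    -- Semigroup has been a superclass of Monoid since GHC 8.4, so (<>) is defined there
    instance Semigroup (DList a) where
        DList f <> DList g = DList (f . g)
    instance Monoid (DList a) where
        mempty = fromList []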
    

    Now a left-nested expression,

    toList (((fromList ws <> fromList xs) <> fromList ys) <> fromList zs)
    

    gets evaluated in a right-associative manner:

    ((((ws ++) . (xs ++)) . (ys ++)) . (zs ++)) []
    
    -- expand innermost (.)
    (((\l0 -> ws ++ (xs ++ l0)) . (ys ++)) . (zs ++)) []
    
    -- expand innermost (.)
    ((\l1 -> (\l0 -> ws ++ (xs ++ l0)) (ys ++ l1)) . (zs ++)) []
    -- beta reduce
    ((\l1 -> ws ++ (xs ++ (ys ++ l1))) . (zs ++)) []
    
    -- expand innermost (.)
    (\l2 -> (\l1 -> ws ++ (xs ++ (ys ++ l1))) (zs ++ l2)) []
    -- beta reduce
    (\l2 -> ws ++ (xs ++ (ys ++ (zs ++ l2)))) []
    -- beta reduce
    ws ++ (xs ++ (ys ++ (zs ++ [])))
    

    You perform O(m) steps to evaluate the composed function, then O(n) steps to evaluate the resulting expression, for a total complexity of O(m+n), which is asymptotically better than O(m*n). When the list is made up entirely of appends of singleton lists, m = n and you get O(2n) = O(n), which is asymptotically better than O(n^2).
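As a minimal, self-contained sketch of the singleton case, the block below repeats the DList definitions (with the Semigroup instance that GHC 8.4 and later require) and builds a list by appending one element at a time on the right. `snocAll` is a hypothetical helper introduced only for illustration.

```haskell
newtype DList a = DList ([a] -> [a])

fromList :: [a] -> DList a
fromList xs = DList (xs ++)

toList :: DList a -> [a]
toList (DList f) = f []

instance Semigroup (DList a) where
    DList f <> DList g = DList (f . g)

instance Monoid (DList a) where
    mempty = fromList []

-- Hypothetical helper: a left-nested chain of singleton appends,
-- (((mempty <> [x1]) <> [x2]) <> [x3]) ...
-- With plain (++) this shape is the quadratic case; here each (<>)
-- is one function composition and toList runs the appends right-nested.
snocAll :: [a] -> [a]
snocAll = toList . foldl (\acc x -> acc <> fromList [x]) mempty
```

In GHCi, `snocAll [1 .. 5]` should give back `[1,2,3,4,5]`, and the left-nested expression from the answer, `toList (((fromList "ab" <> fromList "cd") <> fromList "ef") <> fromList "gh")`, evaluates to `"abcdefgh"`.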

    This trick works for any Monoid.

    newtype RMonoid m = RMonoid (m -> m)  -- "right-associative monoid"
    toRM m = RMonoid (m <>)
    fromRM (RMonoid f) = f mempty
    instance Semigroup (RMonoid m) where
        RMonoid f <> RMonoid g = RMonoid (f . g)
    instance Monoid m => Monoid (RMonoid m) where
        mempty = toRM mempty
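
To illustrate that the wrapper is not specific to lists, here is a small sketch that runs the same left-nested fold over two different monoids. It assumes a GHC recent enough that Semigroup is a superclass of Monoid. `concatViaRM` is a hypothetical helper, not a library function.

```haskell
import Data.Monoid (Sum (..))

newtype RMonoid m = RMonoid (m -> m)  -- "right-associative monoid"

toRM :: Semigroup m => m -> RMonoid m
toRM m = RMonoid (m <>)

fromRM :: Monoid m => RMonoid m -> m
fromRM (RMonoid f) = f mempty

instance Semigroup (RMonoid m) where
    RMonoid f <> RMonoid g = RMonoid (f . g)

instance Monoid m => Monoid (RMonoid m) where
    mempty = toRM mempty

-- Hypothetical helper: the fold is left-nested, but the underlying (<>)
-- calls on m end up right-nested once fromRM applies the composed function.
concatViaRM :: Monoid m => [m] -> m
concatViaRM = fromRM . foldl (\acc x -> acc <> toRM x) mempty
```

With these definitions, `concatViaRM ["ab", "cd", "ef"]` yields `"abcdef"` and `getSum (concatViaRM (map Sum [1 .. 10]))` yields `55`. Whether reassociation actually saves work depends on the monoid: it pays off for lists, where (<>) costs time proportional to its left argument.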
    

    See also, for example, the Codensity monad, which applies this idea to monadic expressions built using (>>=) (rather than monoidal expressions built using (<>)).


    I hope I have convinced you that (++) only causes a problem when used in a left-associative fashion. Now, you can easily write a list-like data structure for which append is strict in its right argument, and left-associative appends are therefore not a problem.

    data Snoc a = Nil | Snoc (Snoc a) a
    
    xs +++ Nil = xs
    xs +++ (Snoc ys y) = Snoc (xs +++ ys) y
    

    We recover O(1) WHNF of left-nested appends,

    ((ws +++ xs) +++ ys) +++ zs
    
    case zs of
        Nil -> (ws +++ xs) +++ ys
        Snoc zs z -> Snoc (((ws +++ xs) +++ ys) +++ zs) z
    

    but at the expense of slow right-nested appends.

    ws +++ (xs +++ (ys +++ zs))
    
    case (
        case (
            case zs of { Nil -> ys ; (Snoc zs z) -> Snoc (ys +++ zs) z }
        ) of { Nil -> xs ; (Snoc ys y) -> Snoc (xs +++ ys) y }
    ) of { Nil -> ws ; (Snoc xs x) -> Snoc (ws +++ xs) x }
    

    Then, of course, you end up writing a new type of difference list which reassociates appends to the left!

    newtype LMonoid m = LMonoid (m -> m)  -- "left-associative monoid"
    toLM m = LMonoid (<> m)
    fromLM (LMonoid f) = f mempty
    instance Semigroup (LMonoid m) where
        LMonoid f <> LMonoid g = LMonoid (g . f)
    instance Monoid m => Monoid (LMonoid m) where
        mempty = toLM mempty
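
Putting the last two pieces together, the following sketch gives Snoc a Monoid instance based on (+++) and uses LMonoid to turn a right-nested chain of appends into a left-nested one. The helpers `fromListS`, `toListS` and `appendRightNested` are hypothetical names added for the example.

```haskell
data Snoc a = Nil | Snoc (Snoc a) a
    deriving (Eq, Show)

(+++) :: Snoc a -> Snoc a -> Snoc a
xs +++ Nil = xs
xs +++ Snoc ys y = Snoc (xs +++ ys) y

instance Semigroup (Snoc a) where
    (<>) = (+++)

instance Monoid (Snoc a) where
    mempty = Nil

newtype LMonoid m = LMonoid (m -> m)  -- "left-associative monoid"

toLM :: Semigroup m => m -> LMonoid m
toLM m = LMonoid (<> m)

fromLM :: Monoid m => LMonoid m -> m
fromLM (LMonoid f) = f mempty

instance Semigroup (LMonoid m) where
    LMonoid f <> LMonoid g = LMonoid (g . f)

instance Monoid m => Monoid (LMonoid m) where
    mempty = toLM mempty

-- Hypothetical conversion helpers between ordinary lists and Snoc lists.
fromListS :: [a] -> Snoc a
fromListS = foldl Snoc Nil

toListS :: Snoc a -> [a]
toListS = go []
  where
    go acc Nil = acc
    go acc (Snoc xs x) = go (x : acc) xs

-- Right-nested at the LMonoid level: a <> (b <> (c <> mempty)).
-- fromLM turns it into (((Nil +++ a) +++ b) +++ c) +++ Nil on Snoc.
appendRightNested :: [Snoc a] -> Snoc a
appendRightNested = fromLM . foldr (\x acc -> toLM x <> acc) mempty
```

For example, `toListS (appendRightNested [fromListS [1, 2], fromListS [3], fromListS [4, 5]])` gives `[1,2,3,4,5]`: the element order is preserved, and only the association of the underlying (+++) calls changes.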
    
