Explicit Purely-Functional Data-Structure For Difference Lists

后端未结

关注

 3  1611

遥遥无期 2020-12-10 18:10

In Haskell, difference lists, in the sense of

[a] representation of a list with an efficient concatenation operation

seem to be

3条回答

失恋的感觉 (楼主)

2020-12-10 19:00
(++)'s notoriously bad asymptotics come about when you use it in a left-associative manner - that is, when (++)'s left argument is the result of another call to (++). Right-associative expressions run efficiently, though.

To put it more concretely: evaluating a left-nested append of m lists like
```
((ws ++ xs) ++ ys) ++ zs  -- m = 3 in this example
```
to WHNF requires you to force m thunks, because (++) is strict in its left argument.
```
case (
    case (
        case ws of { [] -> xs ; (w:ws) -> w:(ws ++ xs) }
    ) of { [] -> ys ; (x:xs) -> x:(xs ++ ys) }
) of { [] -> zs ; (y:ys) -> y:(ys ++ zs) }
```
In general, to fully evaluate n elements of such a list, this'll require forcing that stack of m thunks n times, for O(m*n) complexity. When the entire list is built from appends of singleton lists (ie (([w] ++ [x]) ++ [y]) ++ [z]), m = n so the cost is O(n²).

Evaluating a right-nested append like
```
ws ++ (xs ++ (ys ++ zs))
```
to WHNF is much easier (O(1)):
```
case ws of
    [] -> xs ++ (ys ++ zs)
    (w:ws) -> w:(ws ++ (xs ++ (ys ++ zs)))
```
Evaluating n elements just requires evaluating n thunks, which is about as good as you can expect to do.

Difference lists work by paying a small (O(m)) up-front cost to automatically re-associate calls to (++).
```
newtype DList a = DList ([a] -> [a])
fromList xs = DList (xs ++)
toList (DList f) = f []

instance Monoid (DList a) where
    mempty = fromList []
    DList f `mappend` DList g = DList (f . g)
```
Now a left-nested expression,
```
toList (((fromList ws <> fromList xs) <> fromList ys) <> fromList zs)
```
gets evaluated in a right-associative manner:
```
((((ws ++) . (xs ++)) . (ys ++)) . (zs ++)) []

-- expand innermost (.)
(((\l0 -> ws ++ (xs ++ l0)) . (ys ++)) . (zs ++)) []

-- expand innermost (.)
((\l1 -> (\l0 -> ws ++ (xs ++ l0)) (ys ++ l1)) . (zs ++)) []
-- beta reduce
((\l1 -> ws ++ (xs ++ (ys ++ l1))) . (zs ++)) []

-- expand innermost (.)
(\l2 -> (\l1 -> ws ++ (xs ++ (ys ++ l1))) (zs ++ l2)) []
-- beta reduce
(\l2 -> ws ++ (xs ++ (ys ++ (zs ++ l2)))) []
-- beta reduce
ws ++ (xs ++ (ys ++ (zs ++ [])))
```
You perform O(m) steps to evaluate the composed function, then O(n) steps to evaluate the resulting expression, for a total complexity of O(m+n), which is asymptotically better than O(m*n). When the list is made up entirely of appends of singleton lists, m = n and you get O(2n) ~ O(n), which is asymptotically better than O(n²).

This trick works for any Monoid.
```
newtype RMonoid m = RMonoid (m -> m)  -- "right-associative monoid"
toRM m = RMonoid (m <>)
fromRM (RMonoid f) = f mempty
instance Monoid m => Monoid (RMonoid m):
    mempty = toRM mempty
    RMonoid f `mappend` RMonoid g = RMonoid (f . g)
```
See also, for example, the Codensity monad, which applies this idea to monadic expressions built using (>>=) (rather than monoidal expressions built using (<>)).

I hope I have convinced you that (++) only causes a problem when used in a left-associative fashion. Now, you can easily write a list-like data structure for which append is strict in its right argument, and left-associative appends are therefore not a problem.
```
data Snoc a = Nil | Snoc (Snoc a) a

xs +++ Nil = xs
xs +++ (Snoc ys y) = Snoc (xs +++ ys) y
```
We recover O(1) WHNF of left-nested appends,
```
((ws +++ xs) +++ ys) +++ zs

case zs of
    Nil -> (ws +++ xs) +++ ys
    Snoc zs z -> Snoc ((ws +++ xs) +++ ys) +++ zs) z
```
but at the expense of slow right-nested appends.
```
ws +++ (xs +++ (ys +++ zs))

case (
    case (
        case zs of { Nil -> ys ; (Snoc zs z) -> Snoc (ys +++ zs) z }
    ) of { Nil -> xs ; (Snoc ys y) -> Snoc (xs +++ ys) y }
) of { Nil -> ws ; (Snoc xs x) -> Snoc (ws +++ xs) y }
```
Then, of course, you end up writing a new type of difference list which reassociates appends to the left!
```
newtype LMonoid m = LMonoid (m -> m)  -- "left-associative monoid"
toLM m = LMonoid (<> m)
fromLM (LMonoid f) = f mempty
instance Monoid m => Monoid (LMonoid m):
    mempty = toLM mempty
    LMonoid f `mappend` LMonoid g = LMonoid (g . f)
```
0 讨论(0)

查看其它3个回答
发布评论:

提交评论
- 加载中...