Explicit Purely-Functional Data-Structure For Difference Lists

遥遥无期 2020-12-10 18:10

In Haskell, difference lists, in the sense of

[a] representation of a list with an efficient concatenation operation

seem to be

3 Answers

渐次进展 (original poster)

2020-12-10 18:37
Carl hit it in his comment. We can write
```
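import Data.Foldable (toList)  -- toList is not exported by the Prelude

data TList a = Nil | Single a | Node !(TList a) (TList a)

singleton :: a -> TList a
singleton = Single

-- Since GHC 8.4, Monoid requires a Semigroup instance; mappend defaults to (<>).
instance Semigroup (TList a) where
  (<>) = Node

instance Monoid (TList a) where
  mempty = Nil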
```
We could get toList by just deriving Foldable, but let's write it out instead to see just what's going on.
```
instance Foldable TList where
  foldMap _ Nil = mempty
  foldMap f (Single a) = f a
  foldMap f (Node t u) = foldMap f t <> foldMap f u

  toList as0 = go as0 [] where
    go Nil k = k
    go (Single a) k = a : k
    go (Node l r) k = go l (go r k)
```
toList is O(n), where n is the total number of internal nodes (i.e., the total number of mappend operations used to form the TList). This should be pretty clear: each Node is inspected exactly once, and a binary tree with n internal nodes has at most n + 1 leaves (Nil or Single) to visit. mempty, mappend, and singleton are each obviously O(1).
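To make the cost claim concrete, here is a minimal, self-contained sketch. It repeats the definitions above so that it compiles on its own; the names snocAll, nodeCount, and example are hypothetical helpers introduced only for illustration. It builds a TList by appending one element at a time on the right (the pattern that is quadratic with plain ++ on lists) and converts it once.

```haskell
import Data.Foldable (toList)

-- Same definitions as above, repeated so the sketch stands alone.
data TList a = Nil | Single a | Node !(TList a) (TList a)

instance Semigroup (TList a) where
  (<>) = Node

instance Monoid (TList a) where
  mempty = Nil

instance Foldable TList where
  foldMap _ Nil = mempty
  foldMap f (Single a) = f a
  foldMap f (Node t u) = foldMap f t <> foldMap f u

  toList as0 = go as0 []
    where
      go Nil k = k
      go (Single a) k = a : k
      go (Node l r) k = go l (go r k)

-- Hypothetical helper: append elements one by one on the right.
-- Each step is a single O(1) Node allocation.
snocAll :: [a] -> TList a
snocAll = foldl (\acc x -> acc <> Single x) mempty

-- Hypothetical helper: count constructors, which is the number of
-- pattern matches toList performs on the tree.
nodeCount :: TList a -> Int
nodeCount Nil = 1
nodeCount (Single _) = 1
nodeCount (Node l r) = 1 + nodeCount l + nodeCount r

-- example == [1,2,3,4,5]
example :: [Int]
example = toList (snocAll [1 .. 5])
```

For a list of length k, snocAll produces k Nodes, k Singles, and one Nil, so nodeCount is 2k + 1 and the single toList pass stays linear.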

This is exactly the same as for a DList:
```
newtype DList a = DList ([a] -> [a])

singletonD :: a -> DList a
singletonD a = DList (a:)

-- As with TList, Monoid needs a Semigroup instance on GHC 8.4 and later.
instance Semigroup (DList a) where
  DList f <> DList g = DList (f . g)

instance Monoid (DList a) where
  mempty = DList id

instance Foldable DList where
  foldr c n xs = foldr c n (toList xs)
  toList (DList f) = f []
```
Why, operationally, is this the same? Because, as you indicate in your question, the functions are represented in memory as trees. And they're represented as trees that look a lot like TLists! singletonD x produces a closure containing a (boring) (:) and an exciting x. When applied, it does O(1) work. mempty just produces the id function, which when applied does O(1) work. mappend as bs produces a closure that, when applied, does O(1) work of its own, plus O(length as + length bs) work in its children.
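One way to see the correspondence is to read a TList as a description of the closure tree a DList would build. The sketch below is an illustration only: apply, toDList, runDList, and sample are hypothetical names that do not appear in the answer. Note that apply is the same function as the go helper inside toList for TList.

```haskell
-- Same types as above, repeated so the sketch stands alone.
data TList a = Nil | Single a | Node !(TList a) (TList a)

newtype DList a = DList ([a] -> [a])

-- Hypothetical name: run a TList as if it were a DList's closure tree.
-- Each constructor mirrors one kind of closure:
--   Nil      ~ id
--   Single a ~ (a :)
--   Node l r ~ apply l . apply r
apply :: TList a -> [a] -> [a]
apply Nil k = k
apply (Single a) k = a : k
apply (Node l r) k = apply l (apply r k)

-- Hypothetical name: forget the explicit tree and keep only the function.
toDList :: TList a -> DList a
toDList t = DList (apply t)

-- Hypothetical name: the DList's toList, written as a plain function.
runDList :: DList a -> [a]
runDList (DList f) = f []

-- runDList (toDList sample) == "abc"
sample :: TList Char
sample = Node (Node (Single 'a') (Single 'b')) (Node Nil (Single 'c'))
```

Going from TList to DList is always possible this way. The reverse direction is not, because a function cannot be inspected, which is why the explicit tree is the more flexible representation.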

The shapes of the trees produced for TList and DList are actually the same. You should be able to convince yourself that they also have identical asymptotic performance when used incrementally: in each case, the program has to walk down the left spine of the tree to get to the first list element.

Both DList and TList are equally okay when built up and then used only once. They're equally lousy when built once and converted to lists multiple times.

As Will Ness showed with a similar type, the explicit tree representation is better if you want to add support for deconstructing the representation, as you can actually get your hands on the structure. TList can support a reasonably efficient uncons operation (that improves the structure as it works). To get efficient unsnoc as well, you'll need to use a fancier representation (a catenable deque). This implementation also has potentially wretched cache performance. You can switch to a cache-oblivious data structure, but it is practically guaranteed to be complicated.
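A minimal sketch of what such an uncons might look like follows. This is an assumption about one reasonable design, not the code from Will Ness's answer or from any library; unconsT, drain, and sample are hypothetical names. Whenever the left child is itself a Node, the tree is rotated to the right, so the left spine gets one node shorter and the remainder handed back is cheaper to uncons next time.

```haskell
-- Same type as above, repeated so the sketch stands alone.
data TList a = Nil | Single a | Node !(TList a) (TList a)

-- Hypothetical uncons for TList. Rotating right-associates the tree
-- as a side effect of searching for the first element.
unconsT :: TList a -> Maybe (a, TList a)
unconsT Nil = Nothing
unconsT (Single a) = Just (a, Nil)
unconsT (Node l r) = case l of
  Nil -> unconsT r                              -- empty left side: skip it
  Single a -> Just (a, r)                       -- head found at the left edge
  Node ll lr -> unconsT (Node ll (Node lr r))   -- rotate right and retry

-- Hypothetical helper: repeated unconsT recovers the elements in order.
drain :: TList a -> [a]
drain t = case unconsT t of
  Nothing -> []
  Just (a, rest) -> a : drain rest

-- drain sample == [1,2,3,4]
sample :: TList Int
sample =
  Node (Node (Node (Single 1) (Single 2)) Nil)
       (Node (Single 3) (Single 4))
```

The recursion terminates because each rotation shrinks the left subtree and each skip of a Nil shrinks the whole tree. There is no matching cheap operation at the right end, which is the reason a fancier structure is needed for unsnoc.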