Explain the Haskell breadth-first numbering code to traverse trees

Question

I am reading this paper by Chris Okasaki, titled "Breadth-First Numbering: Lessons from a Small Exercise in Algorithm Design".

My question is: how does the magic in the algorithm happen? There are some figures (e.g. Figure 7, titled "threading the output of one level into the input of next level"). Unfortunately, and maybe it's only me, that figure has completely baffled me. I don't understand how the threading happens at all.

Answer 1:

Breadth-first traversal means traversing the levels of a tree one by one. So let's assume we already know the number at the beginning of each level, that is, how many elements have been traversed before that level starts. For the simple example in the paper

import Data.Monoid

data Tree a = Tree (Tree a) a (Tree a)
            | Empty
  deriving (Show)

example :: Tree Char
example = Tree (Tree Empty 'b' (Tree Empty 'c' Empty)) 'a' (Tree Empty 'd' Empty)

the sizes would be 0, 1, 3, 4. Knowing this, we can thread such a list of sizes through a given tree (or subtree) left to right: we advance the first element of the list by one for the node, and thread the tail of the list first through the left and then through the right subtree (see thread below).

After doing so, we get the same list of sizes again, only shifted by one: it now holds the total number of elements after each level. So the trick is: assume we have such a list, use it for the computation, and then feed the output back in as the input. In other words, tie the knot.
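To see the shift concretely, the threading step can be run with the level counters supplied by hand instead of tied in a knot. The following is a minimal sketch, specialized to Int counters; the names threadCounts, countsIn and countsOut and the derived Eq instance are assumptions made for this illustration and do not come from the paper or the answer.

```haskell
data Tree a = Tree (Tree a) a (Tree a)
            | Empty
  deriving (Show, Eq)

example :: Tree Char
example = Tree (Tree Empty 'b' (Tree Empty 'c' Empty)) 'a' (Tree Empty 'd' Empty)

-- Hypothetical helper: thread an explicit list of per-level counters
-- through a tree. Each node takes the counter of its own level as its
-- number and hands the incremented counter to the next node on that level.
threadCounts :: [Int] -> Tree a -> ([Int], Tree Int)
threadCounts ns Empty = (ns, Empty)
threadCounts (n : ns) (Tree l _ r) =
  let (ns1, l') = threadCounts ns l    -- deeper levels, left subtree first
      (ns2, r') = threadCounts ns1 r   -- then the right subtree
   in (n + 1 : ns2, Tree l' n r')
threadCounts [] (Tree _ _ _) = error "threadCounts: ran out of level counters"

-- The counts before each level, written out by hand ...
countsIn :: [Int]
countsIn = [0, 1, 3, 4]

-- ... produce the counts after each level: [1, 3, 4, 4].
countsOut :: [Int]
countsOut = fst (threadCounts countsIn example)
```

Here countsOut begins with 1, 3, 4, which is exactly the tail of countsIn. That is what makes the circular definition in the sample implementation below consistent: the starting count of each level is the final count of the level above it.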

A sample implementation:

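tagBfs :: (Monoid m) => (a -> m) -> Tree a -> Tree m
-- Tie the knot: the output list ms is fed back in as the tail of the input.
tagBfs f t = let (ms, r) = thread (mempty : ms) t
              in r
  where
    -- An empty tree consumes nothing and passes the list through unchanged.
    thread ms Empty = (ms, Empty)
    -- m is the value accumulated so far on this node's level;
    -- the tail ms belongs to the levels below.
    thread (m : ms) (Tree l x r) =
        let (ms1, l') = thread ms l
            (ms2, r') = thread ms1 r
         in ((m <> f x) : ms2, Tree l' m r')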

This is generalized to any Monoid; for numbering you'd pass const $ Sum 1 as the function. Each node is tagged with the value accumulated before it, so the numbering starts at 0.
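As a usage sketch, the function can be applied to the example tree with two different monoids. The definitions of Tree, example and tagBfs are repeated from above so that the snippet stands alone; the names numbered and prefixes and the derived Eq instance are assumptions added for illustration.

```haskell
import Data.Monoid

data Tree a = Tree (Tree a) a (Tree a)
            | Empty
  deriving (Show, Eq)

example :: Tree Char
example = Tree (Tree Empty 'b' (Tree Empty 'c' Empty)) 'a' (Tree Empty 'd' Empty)

-- Copied from the answer above.
tagBfs :: (Monoid m) => (a -> m) -> Tree a -> Tree m
tagBfs f t = let (ms, r) = thread (mempty : ms) t
              in r
  where
    thread ms Empty = (ms, Empty)
    thread (m : ms) (Tree l x r) =
        let (ms1, l') = thread ms l
            (ms2, r') = thread ms1 r
         in ((m <> f x) : ms2, Tree l' m r')

-- Counting with Sum: each node is tagged with the number of nodes
-- that precede it in breadth-first order (a = 0, b = 1, d = 2, c = 3).
numbered :: Tree (Sum Int)
numbered = tagBfs (const (Sum 1)) example

-- With the list monoid, each node is tagged with the elements that
-- precede it in breadth-first order (a = "", b = "a", d = "ab", c = "abd").
prefixes :: Tree String
prefixes = tagBfs (\x -> [x]) example
```

The second call shows what the Monoid generalization buys: the same traversal that counts nodes can also collect them, because only mempty and <> are used.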

Answer 2:

One way to view tree numbering is in terms of a traversal. Specifically, we want to traverse the tree in breadth-first order using State to count up. The necessary Traversable instance looks something like this. Note that you'd probably actually want to define this instance for a newtype like BFTree, but I'm just using the raw Tree type for simplicity. This code is strongly inspired by ideas in Cirdec's monadic rose tree unfolding code, but the situation here seems to be substantially simpler. Hopefully I haven't missed something horrible.

{-# LANGUAGE DeriveFunctor,
             GeneralizedNewtypeDeriving,
             LambdaCase #-}
{-# OPTIONS_GHC -Wall #-}

module BFT where

import Control.Applicative
import Data.Foldable
import Data.Traversable
import Prelude hiding (foldr)

data Tree a = Tree (Tree a) a (Tree a)
            | Empty
  deriving (Show, Functor)

newtype Forest a = Forest {getForest :: [Tree a]}
   deriving (Functor)

instance Foldable Forest where
  foldMap = foldMapDefault

-- Given a forest, produce the forest consisting
-- of the children of the root nodes of non-empty
-- trees.
children :: Forest a -> Forest a
children (Forest xs) = Forest $ foldr go [] xs
  where
    go Empty c = c
    go (Tree l _a r) c = l : r : c

-- Given a forest, produce a list of the root nodes
-- of the elements, with `Nothing` values in place of
-- empty trees.
parents :: Forest a -> [Maybe a]
parents (Forest xs) = foldr go [] xs
  where
    go Empty c = Nothing : c
    go (Tree _l a _r) c = Just a : c

-- Given a list of values (mixed with blanks) and
-- a list of trees, attach the values to pairs of
-- trees to build trees; turn the blanks into `Empty`
-- trees.
zipForest :: [Maybe a] -> Forest a -> [Tree a]
zipForest [] _ts = []
zipForest (Nothing : ps) ts = Empty : zipForest ps ts
zipForest (Just p : ps) (Forest ~(t1 : ~(t2 : ts'))) =
   Tree t1 p t2 : zipForest ps (Forest ts')

instance Traversable Forest where
  -- Traversing an empty container always gets you
  -- an empty one.
  traverse _f (Forest []) = pure (Forest [])

  -- First, traverse the parents. The `traverse.traverse`
  -- gets us into the `Maybe`s. Then traverse the
  -- children. Finally, zip them together, and turn the
  -- result into a `Forest`. If the `Applicative` in play
  -- is lazy enough, like lazy `State`, I believe 
  -- we avoid the double traversal Okasaki mentions as
  -- a problem for strict implementations.
  traverse f xs = (Forest .) . zipForest <$>
          (traverse.traverse) f (parents xs) <*>
          traverse f (children xs)

instance Foldable Tree where
  foldMap = foldMapDefault

instance Traversable Tree where
  traverse f t =
       (\case {(Forest [r]) -> r;
               _ -> error "Whoops!"}) <$>
       traverse f (Forest [t])

Now we can write code to pair up each element of the tree with its breadth-first number like this:

import Control.Monad.Trans.State.Lazy

numberTree :: Tree a -> Tree (Int, a)
numberTree tr = flip evalState 1 $ for tr $ \x ->
      do
        v <- get
        put $! (v+1)
        return (v,x)
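To check the traversal order on a concrete tree, here is a condensed, self-contained sketch of the same idea. It is an illustration under stated assumptions rather than the code above verbatim: parents and children are written with map and concatMap, the derived Eq instance is added for comparison, and the hypothetical numberTree' uses mapAccumL from base in place of the State action so that nothing beyond base is needed.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.Traversable (foldMapDefault, mapAccumL)

data Tree a = Tree (Tree a) a (Tree a)
            | Empty
  deriving (Show, Eq, Functor)

newtype Forest a = Forest [Tree a]
  deriving (Functor)

-- Root values of a forest, with Nothing standing in for empty trees.
parents :: Forest a -> [Maybe a]
parents (Forest ts) = map root ts
  where
    root Empty = Nothing
    root (Tree _ a _) = Just a

-- The subtrees of all non-empty roots, in left-to-right order.
children :: Forest a -> Forest a
children (Forest ts) = Forest (concatMap kids ts)
  where
    kids Empty = []
    kids (Tree l _ r) = [l, r]

-- Rebuild one level from its (possibly absent) root values and the
-- already rebuilt next level; the lazy patterns mirror the code above.
zipForest :: [Maybe a] -> Forest a -> [Tree a]
zipForest [] _ = []
zipForest (Nothing : ps) ts = Empty : zipForest ps ts
zipForest (Just p : ps) (Forest ~(t1 : ~(t2 : ts'))) =
  Tree t1 p t2 : zipForest ps (Forest ts')

instance Foldable Forest where
  foldMap = foldMapDefault

instance Traversable Forest where
  traverse _ (Forest []) = pure (Forest [])
  -- Effects run for the current level first, then for the levels below.
  traverse f xs =
    (\ps cs -> Forest (zipForest ps cs))
      <$> (traverse . traverse) f (parents xs)
      <*> traverse f (children xs)

instance Foldable Tree where
  foldMap = foldMapDefault

instance Traversable Tree where
  traverse f t = unwrap <$> traverse f (Forest [t])
    where
      unwrap (Forest [r]) = r
      unwrap _ = error "traverse: expected a single tree"

-- Hypothetical variant of numberTree built on mapAccumL; numbering
-- starts at 1, as in the State version.
numberTree' :: Tree a -> Tree (Int, a)
numberTree' = snd . mapAccumL (\n x -> (n + 1, (n, x))) 1

example :: Tree Char
example = Tree (Tree Empty 'b' (Tree Empty 'c' Empty)) 'a' (Tree Empty 'd' Empty)
```

On this tree, numberTree' example pairs 'a' with 1, 'b' with 2, 'd' with 3 and 'c' with 4. Because Foldable is defined through the same traversal, folding the tree also visits the elements level by level.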

Source: https://stackoverflow.com/questions/29726454/explain-the-haskell-breadth-first-numbering-code-to-traverse-trees

Tags: algorithm, haskell, tree, breadth-first-search, tree-traversal