How to enumerate a recursive datatype in Haskell?

后端 未结 4 1406
北荒
北荒 2020-12-01 12:57

This blog post has an interesting explanation of how to use the Omega monad to enumerate an arbitrary grammar diagonally. He offers an example of how to do so, resulting in

相关标签:
4条回答
  • 2020-12-01 13:38

    My first ugly approach was:

    allTerms :: Omega T
    allTerms = do
      which <- each [ 1,2,3 ]
      if which == 1 then
        return A
      else if which == 2 then do
        x <- allTerms
        return $ B x
      else do
        x <- allTerms
        y <- allTerms
        return $ C x y
    

    But then, after some cleaning up I reached this one liner

    import Control.Applicative
    import Control.Monad.Omega
    import Control.Monad
    
    allTerms :: Omega T
    allTerms = join $ each [return A, B <$> allTerms, C <$> allTerms <*> allTerms]
    

    Note that order matters: return A has to be the first choice in the list above, or allTerms will not terminate. Basically, the Omega monad ensures a "fair scheduling" among choices, saving you from e.g. infiniteList ++ something, but does not prevent infinite recursion.


    An even more elegant solution was suggested by Crazy FIZRUK, exploiting the Alternative instance of Omega.

    import Control.Applicative
    import Data.Foldable (asum)
    import Control.Monad.Omega
    
    allTerms :: Omega T
    allTerms = asum [ pure A
                    , B <$> allTerms
                    , C <$> allTerms <*> allTerms
                    ]
    
    0 讨论(0)
  • 2020-12-01 13:41

    You really should show us what you have tried so far. But granted, this is not an easy problem for a bgeinner.

    Let's try to write a naive version down:

    enum = A : (map B enum ++ [ C x y | x <- enum, y <- enum ])
    

    Ok, this actually gives us:

    [A, B A, B (B A), B (B (B A)), .... ]
    

    and never reaches the C values.

    We obviously need to construct the list in steps. Say we already have a complete list of items up to a certain nesting level, we can compute the items with one nesting level more in one step:

    step xs = map B xs ++ [ C x y | x <- xs, y <- xs ]
    

    For example, we get:

    > step [A]
    [B A,C A A]
    > step (step [A])
    [B (B A),B (C A A),C (B A) (B A),C (B A) (C A A),C (C A A) (B A),C (C A A) (C A ...
    

    What we want is thus:

    [A] ++ step [A] ++ step (step [A]) ++ .....
    

    which is the concatenation of the result of

    iterate step [A]
    

    which is, of course

    someT = concat (iterate step [A])
    

    Warning: You will notice that this still does not give all values. For example:

    C A (B (B A))
    

    will be missing.

    Can you find out why? Can you improve it?

    0 讨论(0)
  • 2020-12-01 13:44

    I finally found the time to write a generic version. It uses the Universe typeclass, which represents recursively enumerabley types. Here it is:

    {-# LANGUAGE DeriveGeneric, TypeOperators, ScopedTypeVariables #-}
    {-# LANGUAGE FlexibleInstances, FlexibleContexts #-}
    {-# LANGUAGE UndecidableInstances, OverlappingInstances #-}
    
    import Data.Universe
    import Control.Monad.Omega
    import GHC.Generics
    import Control.Monad (mplus, liftM2)
    
    class GUniverse f where
        guniverse :: [f a]
    
    instance GUniverse U1 where
        guniverse = [U1]
    
    instance (Universe c) => GUniverse (K1 i c) where
        guniverse = fmap K1 (universe :: [c])
    
    instance (GUniverse f) => GUniverse (M1 i c f) where
        guniverse = fmap M1 (guniverse :: [f p])
    
    instance (GUniverse f, GUniverse g) => GUniverse (f :*: g) where
        guniverse = runOmega $ liftM2 (:*:) ls rs
            where ls = each (guniverse :: [f p])
                  rs = each (guniverse :: [g p])
    
    instance (GUniverse f, GUniverse g) => GUniverse (f :+: g) where
        guniverse = runOmega $ (fmap L1 $ ls) `mplus` (fmap R1 $ rs)
            where ls = each (guniverse :: [f p])
                  rs = each (guniverse :: [g p])
    
    instance (Generic a, GUniverse (Rep a)) => Universe a where
        universe = fmap to $ (guniverse :: [Rep a x])
    
    
    data T = A | B T | C T T deriving (Show, Generic)
    data Tree a = Leaf a | Branch (Tree a) (Tree a) deriving (Show, Generic)
    

    I couldn't find a way to remove UndecidableInstances, but that should be of no greater concern. OverlappingInstances is only required to override predefined Universe instances, like Either's. Now some nice outputs:

    *Main> take 10 $ (universe :: [T])
    [A,B A,B (B A),C A A,B (B (B A)),C A (B A),B (C A A),C (B A) A,B (B (B (B A))),C A (B (B A))]
    *Main> take 20 $ (universe :: [Either Int Char])
    [Left (-9223372036854775808),Right '\NUL',Left (-9223372036854775807),Right '\SOH',Left (-9223372036854775806),Right '\STX',Left (-9223372036854775805),Right '\ETX',Left (-9223372036854775804),Right '\EOT',Left (-9223372036854775803),Right '\ENQ',Left (-9223372036854775802),Right '\ACK',Left (-9223372036854775801),Right '\a',Left (-9223372036854775800),Right '\b',Left (-9223372036854775799),Right '\t']
    *Main> take 10 $ (universe :: [Tree Bool])
    [Leaf False,Leaf True,Branch (Leaf False) (Leaf False),Branch (Leaf False) (Leaf True),Branch (Leaf True) (Leaf False),Branch (Leaf False) (Branch (Leaf False) (Leaf False)),Branch (Leaf True) (Leaf True),Branch (Branch (Leaf False) (Leaf False)) (Leaf False),Branch (Leaf False) (Branch (Leaf False) (Leaf True)),Branch (Leaf True) (Branch (Leaf False) (Leaf False))]
    

    I'm not exactly sure what happens in the branching order of mplus, but I think it should all work out if Omega is correctly implemented, which I strongly believe.


    But wait! The above implementation is not yet bug-free; it diverges on "left recursive" types, like this:

    data T3 = T3 T3 | T3' deriving (Show, Generic)
    

    while this works:

    data T6 = T6' | T6 T6 deriving (Show, Generic)
    

    I'll see if I can fix that. EDIT: At some time, the solution of this problem might be found in this question.

    0 讨论(0)
  • 2020-12-01 13:57

    Below is a terrible solution, but perhaps an interesting one.


    We might consider the idea of adding "one more layer"

    grow :: T -> Omega T
    grow t = each [A, B t, C t t]
    

    which is close to correct but has a flaw—in particular, in the C branch, we end up having both of the arguments take the exact same values instead of being able to vary independently. We can fix this by computing the "base functor" of T which looks like this

    data T    = A  | B  T | C  T T
    data Tf x = Af | Bf x | Cf x x deriving Functor
    

    In particular, Tf is just a copy of T where the recursive calls are functor "holes" instead of direct recursive calls. Now we can write:

    grow :: Omega T -> Omega (Tf (Omega T))
    grow ot = each [ Af, Bf ot, Cf ot ot ]
    

    which has a whole computation of a new set of T in each hole. If we could somehow "flatten" the Omega (Tf (Omega T)) into Omega T then we'd have a computation which adds "one new layer" to our Omega computation correctly.

    flatten :: Omega (Tf (Omega T)) -> Omega T
    flatten = ...
    

    and we could take the fixed point of this layering with fix

    fix :: (a -> a) -> a
    
    every :: Omega T
    every = fix (flatten . grow)
    

    So the only trick is to figure out flatten. To do this we need to notice two features of Tf. First, it's Traversable so we can use sequenceA to "flip" the order of Tf and Omega

    flatten = ?f . fmap (?g . sequenceA)
    

    where ?f :: Omega (Omega T) -> Omega T is just join. The final tricky bit is figuring out ?g :: Omega (Tf T) -> Omega T. Obviously, we don't care about the Omega layer so we should just fmap a function of type Tf T -> T.

    And this function is very close to the defining notion for the relationship between Tf and T: we can always compress a layer of Tf on the top of T.

    compress :: Tf T -> T
    compress Af         = A
    compress (Bf t)     = B t
    compress (Cf t1 t2) = C t1 t2
    

    All together we have

    flatten :: Omega (Tf (Omega T)) -> Omega T
    flatten = join . fmap (fmap compress . sequenceA)
    

    Ugly, but all together functional.

    0 讨论(0)
提交回复
热议问题