Create random data from custom type

Posted by 六月ゝ 毕业季﹏ on 2019-12-12 13:53:35

Question


I have the following custom type defined

data Tree = Empty | Node Tree Tree

I want to create random Trees with a given number of nodes n, which I can then pass to another function that calculates the depth of the tree:

depth :: Tree -> Int
depth Empty = 0
depth (Node t1 t2) = maximum [depth t1, depth t2] + 1

What is the easiest way to achieve this?

EDIT: I have tried an approach similar to the one in Alec's answer below, which returns a random IO Tree. However, there are several other functions I need to pass these random Trees to, over which I have no control. These require an argument of type Tree, not IO Tree, so this solution doesn't quite work for my purposes.


Answer 1:


Think of it as a simple recursive problem. The only complication is that getting a random number requires either explicitly threading a generator through the computation or working within IO. For simplicity, I'll stick with the latter.

import System.Random

data Tree = Empty | Node Tree Tree

-- | Generate a tree of the given size
arbitraryTree :: Int -> IO Tree
arbitraryTree treeSize
  | treeSize <= 0 = pure Empty  -- base case: no nodes left to place
  | otherwise = do
      leftSize <- randomRIO (0,treeSize - 1)
      let rightSize = treeSize - 1 - leftSize

      leftSubtree <- arbitraryTree leftSize
      rightSubtree <- arbitraryTree rightSize

      pure (Node leftSubtree rightSubtree)
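
For comparison, here is a minimal sketch of the other option mentioned above: threading a generator explicitly, so the result is a plain Tree rather than an IO Tree. It assumes randomR, mkStdGen and StdGen from System.Random. The names randomTree, treeFromSeed and nodes are hypothetical helpers introduced only for this illustration.

```haskell
import System.Random (StdGen, mkStdGen, randomR)

data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

-- | Count the Node constructors in a tree.
nodes :: Tree -> Int
nodes Empty      = 0
nodes (Node l r) = 1 + nodes l + nodes r

-- | Build a tree with exactly n nodes, passing the generator along by hand.
-- Returns the tree together with the updated generator.
randomTree :: Int -> StdGen -> (Tree, StdGen)
randomTree n g0
  | n <= 0    = (Empty, g0)
  | otherwise =
      let (leftSize, g1) = randomR (0, n - 1) g0
          (l, g2)        = randomTree leftSize g1
          (r, g3)        = randomTree (n - 1 - leftSize) g2
      in  (Node l r, g3)

-- | Hypothetical convenience wrapper: the seed fully determines the tree.
treeFromSeed :: Int -> Int -> Tree
treeFromSeed seed n = fst (randomTree n (mkStdGen seed))
```

Because treeFromSeed is pure, its result can be passed straight to functions that expect a Tree. The trade-off is that the same seed always yields the same tree, so the seed itself has to come from somewhere (a constant, a command-line argument, or IO).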



Answer 2:


Interesting question! This problem is best approached in two parts: generating a list of all Trees with the required number of nodes, and then selecting one value at random from that list.

We'll start with the first problem: given a number n, generate a list of all Trees with n Nodes. But how would we do this? Let's try a few simple cases for small values of n. For n = 0, we have only one choice: Empty. (From now on, I'll abbreviate Node and Empty as N and E respectively to reduce the amount of typing.) For n = 1, we also have only one choice: N E E. For n = 2, we have two cases, which we can generate from the n = 1 case by replacing one E with N E E:

N (N E E) E
N E (N E E)

For n = 3, we can repeat the same procedure, substituting each E in turn in each case with N E E to find all the possible places we can add an extra node. After removing duplicates, this gives:

N (N (N E E) E) E
N (N E (N E E)) E
N (N E E) (N E E)
N E (N (N E E) E)
N E (N E (N E E))

We can do this via a recursive function:

allWithNum :: Int -> [Tree]
allWithNum 0 = [Empty]
allWithNum n = nub $ concatMap subWithNode (allWithNum $ n-1)
  where
    subWithNode = fmap snd . filter fst . go False
      where
        go False Empty = [(True,Node Empty Empty),(False,Empty)]
        go True  Empty = [(True,Empty)]
        go hasSubdNode (Node x y) = do
            (subdX, newX) <- go hasSubdNode x
            (subdY, newY) <- go subdX y
            return (subdY, Node newX newY)

(Note that this uses nub from Data.List, and also requires Tree to have an Eq instance.)

Most of the work here is done in go, which moves along a Tree substituting Node Empty Empty into each Empty in turn, keeping track in its first argument of whether a substitution has been made yet.
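
As an alternative sketch (a different technique from the substitution approach above, not the answerer's method), the same list can be produced by choosing how many nodes go into the left subtree and combining every left shape with every right shape. Each tree is produced exactly once, so neither nub nor an Eq instance is needed for the enumeration itself. The name allTrees is hypothetical.

```haskell
data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

-- | All trees with exactly n nodes. One node is the root; the other
-- n - 1 are split between the left (k) and right (n - 1 - k) subtrees.
allTrees :: Int -> [Tree]
allTrees n
  | n <= 0    = [Empty]
  | otherwise =
      [ Node l r
      | k <- [0 .. n - 1]
      , l <- allTrees k
      , r <- allTrees (n - 1 - k)
      ]
```

In GHCi, map (length . allTrees) [0 .. 4] evaluates to [1,1,2,5,14], which matches the counts worked out by hand above for n up to 3. The list grows quickly with n, so this is only practical for small trees.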

Now for the second problem: how do we choose a random element from this list? This can be done with the System.Random module from the random package: use randomRIO (0, length xs - 1) to choose an index into the list xs, then get the value at that index with (!!).
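
A minimal sketch of that selection step, assuming randomRIO from System.Random. The helper name pickOne is hypothetical, and it returns Maybe so that an empty list does not crash on (!!).

```haskell
import System.Random (randomRIO)

-- | Pick one element uniformly at random; Nothing for an empty list.
pickOne :: [a] -> IO (Maybe a)
pickOne [] = pure Nothing
pickOne xs = do
  i <- randomRIO (0, length xs - 1)  -- valid indices run from 0 to length - 1
  pure (Just (xs !! i))
```

Applied to the complete list of n-node trees, this makes every tree shape equally likely. Splitting the size uniformly at each node, as in the first answer, does not guarantee that: for n = 3 the balanced tree comes up with probability 1/3, while each of the other four shapes has probability 1/6.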




Answer 3:


QuickCheck is your friend. Now is a great time to begin learning it.

You asked about generating random trees. In QuickCheck terms, we generate arbitrary trees using the typeclass Arbitrary.

class Arbitrary a where
  arbitrary   :: Gen a
  shrink    :: a -> [a]

Gen is the type of test data generators.

Test Data Generators: The Type Gen

Test data is produced by test data generators. QuickCheck defines default generators for most types, but you can use your own with forAll, and will need to define your own generators for any new types you introduce.

Generators have types of the form Gen a; this is a generator for values of type a. The type Gen is a monad, so Haskell's do syntax and standard monadic functions can be used to define generators.

Generators are built up on top of the function

choose :: Random a => (a, a) -> Gen a

which makes a random choice of a value from an interval, with a uniform distribution. For example, to make a random choice between the elements of a list, use

do i <- choose (0, length xs-1)
   return (xs !! i)

In your question, you say you want to generate trees with a certain number of nodes. An Arbitrary instance for that is

instance Arbitrary Tree where
  arbitrary = sized tree'
    where tree' 0 = return Empty
          tree' n | n > 0 = do
            lsize <- choose (0, n - 1)
            l <- tree' lsize
            r <- tree' (n - lsize - 1)
            return $ Node l r

This instance respects an external size parameter thanks to sized.

The Size of Test Data

Test data generators have an implicit size parameter; quickCheck begins by generating small test cases, and gradually increases the size as testing progresses. Different test data generators interpret the size parameter in different ways: some ignore it, while the list generator, for example, interprets it as an upper bound on the length of generated lists. You are free to use it as you wish to control your own test data generators. You can obtain the value of the size parameter using

sized :: (Int -> Gen a) -> Gen a

sized g calls g, passing it the current size as a parameter. For example, to generate natural numbers in the range 0 to n, use

sized $ \n -> choose (0, n)

The internal tree' generator takes the number of nodes remaining. When that number is zero it returns Empty; otherwise it creates a Node whose left subtree has between 0 and n - 1 nodes (one fewer than n, to account for the node itself) and whose right subtree gets the remaining n - lsize - 1 nodes.

Say you have a function

nodes Empty = 0
nodes (Node l r) = 1 + nodes l + nodes r

and we want to see that it correctly counts nodes in 10-node trees. After defining the property

prop_nodes_count = forAll (resize 10 arbitrary) (\t -> nodes t == 10)

we successfully test it in GHCi with

λ> quickCheck prop_nodes_count
+++ OK, passed 100 tests.

The code above uses resize to force the size of the generated trees.
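
The same pattern extends to the depth function from the question. The following is a sketch under the assumption that depth and nodes are defined as shown in this thread; the property name prop_depth_bounds is hypothetical. It checks that a 10-node tree is never deeper than 10 and that a tree of depth d holds at most 2^d - 1 nodes.

```haskell
import Test.QuickCheck

data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

-- Same generator as the instance above, repeated so the block stands alone.
instance Arbitrary Tree where
  arbitrary = sized go
    where
      go n
        | n <= 0    = return Empty
        | otherwise = do
            lsize <- choose (0, n - 1)
            l <- go lsize
            r <- go (n - lsize - 1)
            return (Node l r)

nodes :: Tree -> Int
nodes Empty      = 0
nodes (Node l r) = 1 + nodes l + nodes r

depth :: Tree -> Int
depth Empty      = 0
depth (Node l r) = 1 + max (depth l) (depth r)

-- | For 10-node trees: depth is at most 10, and 2^depth >= nodes + 1.
prop_depth_bounds :: Property
prop_depth_bounds =
  forAll (resize 10 arbitrary) $ \t ->
    depth t <= 10 && 2 ^ depth t >= nodes t + 1
```

Running quickCheck prop_depth_bounds in GHCi exercises depth on 100 random 10-node trees.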

To demonstrate that you can generate a pure list of trees, i.e., [Tree] rather than IO [Tree], we'll use a simple stand-in for a function that expects a plain [Tree].

print_trees :: [Tree] -> IO ()
print_trees = print

Ultimately, QuickCheck generators are random and therefore stateful, so running them from your main action is an easy approach.

main :: IO ()
main = do
  trees <- sample' (resize 0 arbitrary)
  print_trees $ take 1 trees
  trees' <- sample' (resize 1 arbitrary)
  print_trees $ take 1 trees'
  trees'' <- sample' (resize 2 arbitrary)
  print_trees $ take 2 trees''

The work here is done by sample', which has type Gen a -> IO [a]. That is, when run inside an IO action, sample' produces a list of whatever you want to generate, arbitrary trees in this case. The related function sample has type Show a => Gen a -> IO (), which means it goes ahead and prints the generated test data.

After adding deriving Show to your definition of Tree, one possible output of the above program is

[Empty]
[Node Empty Empty]
[Node (Node Empty Empty) Empty,Node Empty (Node Empty Empty)]
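
The EDIT in the question worries that a random IO Tree cannot be handed to functions expecting a Tree. Here is a small sketch of how that usually works out, assuming generate from Test.QuickCheck: inside a do block, the value bound with <- is an ordinary Tree. The names treeOfSize and randomDepth are hypothetical.

```haskell
import Test.QuickCheck (Gen, choose, generate)

data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

-- | Generator for trees with exactly n nodes (same scheme as tree' above).
treeOfSize :: Int -> Gen Tree
treeOfSize n
  | n <= 0    = return Empty
  | otherwise = do
      lsize <- choose (0, n - 1)
      l <- treeOfSize lsize
      r <- treeOfSize (n - lsize - 1)
      return (Node l r)

-- | A pure function over Tree, as in the question.
depth :: Tree -> Int
depth Empty      = 0
depth (Node l r) = 1 + max (depth l) (depth r)

-- | Draw one tree in IO, then pass the plain Tree to the pure function.
randomDepth :: Int -> IO Int
randomDepth n = do
  t <- generate (treeOfSize n)  -- here t :: Tree, not IO Tree
  pure (depth t)
```

Only the code that draws the random tree lives in IO. Functions such as depth stay pure and need no changes.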


Source: https://stackoverflow.com/questions/54383876/create-random-data-from-custom-type
