Question
I have the following custom type defined:
data Tree = Empty | Node Tree Tree
I want to create random Trees with a given number of nodes n, which I can then pass to another function that calculates the depth of the tree:
depth :: Tree -> Int
depth Empty = 0
depth (Node t1 t2) = max (depth t1) (depth t2) + 1
What is the easiest way to achieve this?
EDIT: I have tried an approach similar to Alec's in an answer below, which returns a random IO Tree. However, there are several other functions I need to pass these random Trees to, over which I have no control. They require an argument of type Tree, not IO Tree, so this solution doesn't quite work for my purposes.
Answer 1:
Think of it as a simple recursive problem. The only complication is that getting a random number requires either explicitly threading a generator through the computation or working within IO. For simplicity, I'll stick with the latter.
import System.Random

data Tree = Empty | Node Tree Tree

-- | Generate a random tree with the given number of Nodes
arbitraryTree :: Int -> IO Tree
arbitraryTree treeSize
  | treeSize <= 0 = pure Empty -- base case: a tree with no Nodes
  | otherwise = do
      leftSize <- randomRIO (0, treeSize - 1)
      let rightSize = treeSize - 1 - leftSize
      leftSubtree <- arbitraryTree leftSize
      rightSubtree <- arbitraryTree rightSize
      pure (Node leftSubtree rightSubtree)
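If the functions you need to call take a plain Tree, the other option mentioned above is to thread the generator explicitly. Here is a minimal sketch of that, assuming System.Random from the random package; randomTree, nodes and exampleTree are hypothetical names of mine, not part of the code above.

```haskell
import System.Random (RandomGen, mkStdGen, randomR)

data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

-- | Pure counterpart of arbitraryTree (hypothetical name): builds a tree
-- with exactly n Nodes and returns the advanced generator alongside it.
randomTree :: RandomGen g => Int -> g -> (Tree, g)
randomTree n gen
  | n <= 0 = (Empty, gen)
  | otherwise =
      let (leftSize, gen1) = randomR (0, n - 1) gen
          (left, gen2)     = randomTree leftSize gen1
          (right, gen3)    = randomTree (n - 1 - leftSize) gen2
      in  (Node left right, gen3)

-- | Number of Nodes in a tree, used only to check the result.
nodes :: Tree -> Int
nodes Empty      = 0
nodes (Node l r) = 1 + nodes l + nodes r

-- | A plain Tree built from a fixed seed; no IO is involved.
exampleTree :: Tree
exampleTree = fst (randomTree 10 (mkStdGen 42))
```

The seed 42 is arbitrary; the same seed always yields the same tree. If you stay in IO instead, binding the result with t <- arbitraryTree 10 inside a do block also gives you t :: Tree, which you can pass to depth or any other function that expects a Tree.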
Answer 2:
Interesting question! This problem is best approached in two parts: generating a list of all Trees with the required number of nodes, and then selecting one value at random from that list.
We'll start with the first problem: given a number n, generate a list of all Trees with n Nodes. But how would we do this? Let's try out a few simple cases for small values of n. For n = 0, we have only one choice: Empty. (From now on, I'll abbreviate Node and Empty as N and E respectively to reduce the amount of typing.) For n = 1, we also have only one choice: N E E. For n = 2, we have two cases, which we can generate from the n = 1 case by replacing one E with N E E:
N (N E E) E
N E (N E E)
For n = 3, we can repeat the same procedure, substituting each E in turn in each case with N E E to find all the possible places we can add an extra node:
N (N (N E E) E) E
N (N E (N E E)) E
N (N E E) (N E E)
N E (N (N E E) E)
N E (N E (N E E))
We can do this via a recursive function:
import Data.List (nub)

allWithNum :: Int -> [Tree]
allWithNum 0 = [Empty]
allWithNum n = nub $ concatMap subWithNode (allWithNum $ n-1)
  where
    subWithNode = fmap snd . filter fst . go False
      where
        -- The Bool records whether a substitution has already been made
        go False Empty = [(True,Node Empty Empty),(False,Empty)]
        go True Empty = [(True,Empty)]
        go hasSubdNode (Node x y) = do
          (subdX, newX) <- go hasSubdNode x
          (subdY, newY) <- go subdX y
          return (subdY, Node newX newY)
(Note that this uses nub from Data.List, and also requires Tree to have an Eq instance.)
Most of the work here is done in go, which moves along a Tree substituting Node Empty Empty into each Empty in turn, keeping track in its first argument of whether a substitution has been made yet.
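For comparison, here is a different way to enumerate the same trees: instead of substituting into smaller trees, split the n - 1 non-root Nodes between the left and right subtrees in every possible way. This is only a sketch of an alternative technique, not the method above, and allTrees is a hypothetical name. It needs neither nub nor an Eq instance, because each tree is produced exactly once.

```haskell
data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

-- | All trees with exactly n Nodes (hypothetical alternative to allWithNum).
allTrees :: Int -> [Tree]
allTrees n
  | n <= 0    = [Empty]
  | otherwise =
      [ Node l r
      | k <- [0 .. n - 1]          -- number of Nodes in the left subtree
      , l <- allTrees k
      , r <- allTrees (n - 1 - k)  -- the remaining Nodes go to the right
      ]
```

The list lengths for n = 0 to 4 are 1, 1, 2, 5 and 14, which matches the hand enumeration above for n up to 3.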
Now for the second problem: how do we choose a random element from this list? This may be done using the functions in the System.Random module from the random package, by using randomRIO (0, length xs - 1) to choose an index into the list xs and then getting the value at that index using (!!).
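The selection step could look roughly like this. It is a sketch that assumes randomRIO from System.Random; pickRandom is a hypothetical helper name, and returning Maybe is my own choice so that an empty list does not crash in (!!).

```haskell
import System.Random (randomRIO)

-- | Pick one element uniformly at random (hypothetical helper).
-- Returns Nothing for an empty list instead of failing in (!!).
pickRandom :: [a] -> IO (Maybe a)
pickRandom [] = pure Nothing
pickRandom xs = do
  i <- randomRIO (0, length xs - 1)  -- both bounds are inclusive
  pure (Just (xs !! i))
```

Picking from the full list gives every tree shape the same probability. The recursive size-splitting approach does not: for n = 3 it picks the balanced tree with probability 1/3 and each of the other four shapes with probability 1/6.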
Answer 3:
QuickCheck is your friend. Now is a great time to begin learning it.
You asked about generating random trees. In QuickCheck terms, we generate arbitrary trees using the typeclass Arbitrary.
class Arbitrary a where
  arbitrary :: Gen a
  shrink :: a -> [a]
Gen is the type of test data generators. The QuickCheck manual describes it as follows:
Test Data Generators: The Type Gen
Test data is produced by test data generators. QuickCheck defines default generators for most types, but you can use your own with forAll, and will need to define your own generators for any new types you introduce.
Generators have types of the form Gen a; this is a generator for values of type a. The type Gen is a monad, so Haskell's do syntax and standard monadic functions can be used to define generators.
Generators are built up on top of the function
choose :: Random a => (a, a) -> Gen a
which makes a random choice of a value from an interval, with a uniform distribution. For example, to make a random choice between the elements of a list, use
do i <- choose (0, length xs-1)
   return (xs !! i)
In your question, you say you want to generate trees with a certain number of nodes. An Arbitrary instance for that is
instance Arbitrary Tree where
  arbitrary = sized tree'
    where
      tree' 0 = return Empty
      tree' n | n > 0 = do
        lsize <- choose (0, n - 1)
        l <- tree' lsize
        r <- tree' (n - lsize - 1)
        return $ Node l r
This instance respects an external size parameter thanks to sized.
The Size of Test Data
Test data generators have an implicit size parameter; quickCheck begins by generating small test cases, and gradually increases the size as testing progresses. Different test data generators interpret the size parameter in different ways: some ignore it, while the list generator, for example, interprets it as an upper bound on the length of generated lists. You are free to use it as you wish to control your own test data generators. You can obtain the value of the size parameter using
sized :: (Int -> Gen a) -> Gen a
sized g calls g, passing it the current size as a parameter. For example, to generate natural numbers in the range 0 to n, use
sized $ \n -> choose (0, n)
The internal tree' generator takes the number of nodes remaining. When that number is zero it returns Empty; otherwise it creates a Node whose left subtree has between 0 and n - 1 nodes (one less than n, to account for the node itself) and whose right subtree gets the remaining nodes.
Say you have a function
nodes Empty = 0
nodes (Node l r) = 1 + nodes l + nodes r
and we want to see that it correctly counts nodes in 10-node trees. After defining the property
prop_nodes_count = forAll (resize 10 arbitrary) (\t -> nodes t == 10)
we successfully test it in GHCi with
λ> quickCheck prop_nodes_count
+++ OK, passed 100 tests.
The code above uses resize to force the size of the generated trees.
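The same pattern works for the depth function from your question. Below is a sketch of a property relating depth to the node count; prop_depth_bounds is a hypothetical name, and the bounds are the usual ones for binary trees (depth at most the number of nodes, and at most 2^depth - 1 nodes at a given depth). I also wrote tree' with guards so that it is total.

```haskell
import Test.QuickCheck

data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

instance Arbitrary Tree where
  arbitrary = sized tree'
    where
      tree' n
        | n <= 0 = return Empty
        | otherwise = do
            lsize <- choose (0, n - 1)
            l <- tree' lsize
            r <- tree' (n - lsize - 1)
            return (Node l r)

depth :: Tree -> Int
depth Empty = 0
depth (Node t1 t2) = max (depth t1) (depth t2) + 1

nodes :: Tree -> Int
nodes Empty = 0
nodes (Node l r) = 1 + nodes l + nodes r

-- | For 10-node trees, check both bounds on the depth (hypothetical property).
prop_depth_bounds :: Property
prop_depth_bounds =
  forAll (resize 10 arbitrary) $ \t ->
    depth t <= nodes t && nodes t <= 2 ^ depth t - 1
```

Run it in GHCi with quickCheck prop_depth_bounds, just like prop_nodes_count.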
To demonstrate that you can generate a pure list of trees, i.e., [Tree] rather than IO [Tree], we’ll use a simple stand-in.
print_trees :: [Tree] -> IO ()
print_trees = print
Ultimately, QuickCheck generators are random and therefore stateful, so generating them from your main action is an easy approach.
main :: IO ()
main = do
  trees <- sample' (resize 0 arbitrary)
  print_trees $ take 1 trees
  trees' <- sample' (resize 1 arbitrary)
  print_trees $ take 1 trees'
  trees'' <- sample' (resize 2 arbitrary)
  print_trees $ take 2 trees''
Doing the work here is sample', which has type Gen a -> IO [a]. That is, when run inside an IO action, sample' produces a list of whatever it is you want to generate, arbitrary trees in this case. The related function sample has type Show a => Gen a -> IO (), which means it goes ahead and prints the generated test data.
After adding deriving Show to your definition of Tree, one possible output of the above program (the trees are random) is
[Empty]
[Node Empty Empty]
[Node (Node Empty Empty) Empty,Node Empty (Node Empty Empty)]
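If you only need a single tree rather than a list, QuickCheck's generate :: Gen a -> IO a runs a generator once. Here is a minimal sketch that feeds the result to depth; treeOfSize and randomDepth are hypothetical names, and treeOfSize simply repeats the splitting logic of tree' with an explicit size argument.

```haskell
import Test.QuickCheck (Gen, choose, generate)

data Tree = Empty | Node Tree Tree
  deriving (Show, Eq)

-- | Generator for trees with exactly n Nodes (hypothetical name).
treeOfSize :: Int -> Gen Tree
treeOfSize n
  | n <= 0 = return Empty
  | otherwise = do
      lsize <- choose (0, n - 1)
      l <- treeOfSize lsize
      r <- treeOfSize (n - lsize - 1)
      return (Node l r)

depth :: Tree -> Int
depth Empty = 0
depth (Node t1 t2) = max (depth t1) (depth t2) + 1

-- | Generate one tree in IO, then hand the plain Tree to a pure function.
randomDepth :: Int -> IO Int
randomDepth n = do
  t <- generate (treeOfSize n)  -- here t :: Tree, not IO Tree
  pure (depth t)
```

Inside the do block, t is an ordinary Tree, so functions you have no control over can take it as an argument unchanged.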
Source: https://stackoverflow.com/questions/54383876/create-random-data-from-custom-type