My Haskell project includes an expression evaluator, which for the purposes of this question can be simplified to:
data Expression a where
I :: Int ->
This answer is inspired by rampion's follow-up question, which suggests the following function:
step :: Expression a -> Expression a
step x = case x of
Add (I x) (I y) -> I $ x + y
Mul (I x) (I y) -> I $ x * y
Eq (I x) (I y) -> B $ x == y
And (B x) (B y) -> B $ x && y
Or (B x) (B y) -> B $ x || y
If (B b) x y -> if b then x else y
z -> z
step
looks at a single term, and reduces it if everything needed to reduce it is present. Equiped with step
, we only need a way to replace a term everywhere in the expression tree. We can start by defining a way to apply a function inside every term.
{-# LANGUAGE RankNTypes #-}
emap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
emap f x = case x of
I a -> I a
B a -> B a
Add x y -> Add (f x) (f y)
Mul x y -> Mul (f x) (f y)
Eq x y -> Eq (f x) (f y)
And x y -> And (f x) (f y)
Or x y -> Or (f x) (f y)
If x y z -> If (f x) (f y) (f z)
Now, we need to apply a function everywhere, both to the term and everywhere inside the term. There are two basic possibilities, we could apply the function to the term before applying it inside or we could apply the function afterwards.
premap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
premap f = emap (premap f) . f
postmap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
postmap f = f . emap (postmap f)
This gives us two possibilities for how to use step
, which I will call shorten
and reduce
.
shorten = premap step
reduce = postmap step
These behave a little differently. shorten
removes the innermost level of terms, replacing them with literals, shortening the height of the expression tree by one. reduce
completely evaluates the expression tree to a literal. Here's the result of iterating each of these on the same input
"shorten"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
If (And (B True) (B True)) (Add (I 1) (I 6)) (I 0)
If (B True) (I 7) (I 0)
I 7
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
I 7
Your question implies that you sometimes expect that expressions can't be reduced completely. I'll extend your example to include something to demonstrate this case, by adding a variable, Var
.
data Expression a where
Var :: Expression Int
...
We will need to add support for Var
to emap
:
emap f x = case x of
Var -> Var
...
bind
will replace the variable, and evaluateFor
performs a complete evaluation, traversing the expression only once.
bind :: Int -> Expression a -> Expression a
bind a x = case x of
Var -> I a
z -> z
evaluateFor :: Int -> Expression a -> Expression a
evaluateFor a = postmap (step . bind a)
Now reduce
iterated on an example containing a variable produces the following output
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul Var (I 3))) (I 0)
Add (I 1) (Mul Var (I 3))
If the output expression from the reduction is evaluated for a specific value of Var
, we can reduce the expression all the way to a literal.
"evaluateFor 5"
Add (I 1) (Mul Var (I 3))
I 16
emap
can instead be written in terms of an Applicative
Functor
, and postmap
can be made into a generic piece of code suitable for other data types than expressions. How to do so is described in this answer to rampion's follow-up question.
One partial solution, which I've used in a similar situation, is to extract the logic into a "lifting" function that takes a normal Haskell operation and applies it to your language's values. This abstracts over the wrappping/unwrapping and resulting error handling.
The idea is to create two typeclasses for going to and from your custom type, with appropriate error handling. Then you can use these to create a liftOp
function that could look like this:
liftOp :: (Extract a, Extract b, Pack c) => (a -> b -> c) ->
(Expression a -> Expression b -> Expression c)
liftOp err op a b = case res of
Nothing -> err a' b'
Just res -> pack res
where res = do a' <- extract $ reduce' a
b' <- extract $ reduce' b
return $ a' `op` b'
Then each specific case looks like this:
Mul x y -> liftOp Mul (*) x y
Which isn't too bad: it isn't overly redundant. It encompasses the information that matters: Mul
gets mapped to *
, and in the error case we just apply Mul
again.
You would also need instances for packing and unpacking, but these are useful anyhow. One neat trick is that these can also let you embed functions in your DSL automatically, with an instance of the form (Extract a, Pack b) => Pack (a -> b)
.
I'm not sure this will work exactly for your example, but I hope it gives you a good starting point. You might want to wire additional error handling through the whole thing, but the good news is that most of that gets folded into the definition of pack
, unpack
and liftOp
, so it's still pretty centralized.
I wrote up a similar solution for a related (but somewhat different) problem. It's also a way to handle going back and forth between native Haskell values and an interpreter, but the interpreter is structured differently. Some of the same ideas should still apply though!