How can I handle operations over many different types in my DSL?

Suppose Haskell is to be used to implement an interpreter for a domain specific language. The DSL has a large number of type, represented as data constructors, and a large number of binary expressions. A naive first attempt would be a type class BinaryOps encapsulating all binary operations over MyType in the DSL:

data MyType = A String
            | B Integer
            | C Bool
            | D Double
         {- | E .. Z -}

class BinaryOps a where
    f :: a -> a -> a
    g :: a -> a -> a
    h :: a -> a -> a
    j :: a -> a -> a
    {- many more binary ops -}

instance BinaryOps MyType where
    f (A s1) (A s2) = {- Haskell expression on s1 and s2 -}
    f (A s1) (B s2) = {- ... -}
    f (B s1) (D s2) = {- ... -}
    f _ _ = error "f does not support argument types"

    g (D s1) (A s2) = {- Haskell expression on s1 and s2 -}
    g (D s1) (C s2) = {- ... -}
    g _ _ = error "g does not support argument types"

    h (B s1) (B s2) = {- Haskell expression on s1 and s2 -}
    h (B s1) (C s2) = {- ... -}
    h (B s1) (D s2) = {- ... -}
    h (C s1) (B s2) = {- ... -}
    h (D s1) (C s2) = {- ... -}
    h (D s1) (D s2) = {- ... -}
    h _ _ = error "h does not support argument types"

The DSL will have many binary expressions, and many built-in types. The solution above won't scale particularly well: The class definition will be large, and the number of "unsupported" ill-typed combinations of DSL types will grow (the error calls).

Is there a more elegant way to use type classes for interpreting the binary expressions in the DSL? Or indeed, is there something like GADTs that provides a more scalable solution?

I don't see why you're using a typeclass in the first place. What does a typeclass gain you over just having normal functions?

Just define binary operators as, well, Haskell binary operators which are just normal functions:

f :: MyType -> MyType -> MyType
f = ...

Since all your DSL types are in MyType, there's no reason to use a typeclass.

Packing and Unpacking

Of course, this still doesn't solve your error problem. One approach I've taken in the past is to use typeclasses to define ways to "pack" and "extract" primitive types into your DSL:

class Pack a where
  pack :: a -> MyType

class Extract a where
  extract :: MyType -> a

Here's what the instance for String would look like:

instance Pack String where pack = A
instance Extract String where
  extract (A str) = str
  extract _       = error "Type error: expected string!"

The Extract class can deal with error handling for incompatible types.

This lets you uniformly "lift" functions into your DSL:

-- Lifts binary Haskell functions into your DSL
lift :: (Extract a, Extract b, Pack c) => (a -> b -> c)
          -> MyType -> MyType -> MyType
lift f a b = pack $ f (extract a) (extract b)

If you make MyType an instance of Pack and Extract, this will work for both purely Haskell functions and functions aware of your DSL. That said, the aware functions will just get some sort of MyType and will have to deal with it manually, calling error if their MyType argument isn't what they expected.

So this solves your error problem for functions you can write in straight Haskell but not really for ones that depend on MyType.

Error Handling

Using pack is also nice because it's pretty straightforward to switch to a better error-handling mechanism than error. You would just switch the type of extract (or even pack, if appropriate). Maybe you could use:

class Extract a where
  extract :: MyType -> Either MyError a

and then fail with Left (TypeError expected got) which would let you write nice error messages.

This would also let you easily combine multiple primitive functions into "cases" at the MyType level. The basic idea is that we combine multiple liftable functions into a single MyType -> MyType -> MyType and internally we just use the first one that doesn't give us an error. This can also give us some pretty looking syntax :).

Here's the relevant code:

type MyFun = MyType -> MyType -> Either MyError MyType

(|:) :: (Extract a, Extract b, Pack c) => MyFun -> (a -> b -> c) -> MyFun
(f |: option) a b = case f a b of
  Right res -> return res
  Left err  -> (lift option) a b 

match :: MyFun
match _ _ = Left EmptyFunction

test = match |: (\ a b -> a ++ b :: String)
             |: (\ a b -> a || b)

Unfortunately, I had to add a :: String type signature because it was ambiguous otherwise. The same would happen if I use +, since it doesn't know what kind of number to rely on.

Now test is a function which works correctly on two As or two Bs and gives an error otherwise:

*Main> test (A "foo") (A "foo")
Right (A "foofoo")
*Main> test (C True) (C False)
Right (C True)
*Main> test (A "foo") (C False)
Left TypeError

Also note that this would work perfectly happily on different types of arguments, like a case which could combine A and B values.

This means that you can now conveniently recast your f, g, h and so on functions as top-level names in Haskell. Here is how you could define f:

f :: MyFun
f = match |: \ s1 s2 -> {- something with strings -}
          |: \ s i   -> {- something with a string and an int -}
          |: \ i d   -> {- something with an int and a double -}
          |: {- ...and so on... -}

You will sometimes have to annotate some of the values with type signatures because there isn't always enough information to make type inference work properly. This should only come up if you use operations from typeclasses (ie +) or use operations with more general types like ++ for strings (++ can work on any lists).

You'd also have to update lift to handle the errors properly. This involves changing it to return an Either and adding the necessary plumbing. My version looks like this:

lift :: (Extract a, Extract b, Pack c) => (a -> b -> c) -> MyFun
lift f a b = fmap pack $ f <$> extract a <*> extract b

Newtypes

This mostly solves your error problem by having the |: construct check errors for you. The main weakness with this approach is that it won't work very well if you want your DSL to have multiple types that have the same underlying Haskell type like:

data MyType = A Double
            | B Double
            {- ... -}

You could fix this by using newtype to create a wrapper for Double. Something like this:

newtype BDouble = B Double

instance Pack Double where pack = A

instance Pack BDouble where pack = B

-- same for Extract

You can use a GADT to better encode the semantics of your dsl.

{-# LANGUAGE GADTs, TypeSynonymInstances, FlexibleInstances #-} 

data MyType a where 
  A :: String -> MyType String
  B :: Integer -> MyType Integer 
  C :: Bool -> MyType Bool
  D :: Double -> MyType Double

The problem of assigning a type to your functions arises. Take f for example. I can't imagine a function which is polymorphic enough to take two strings, a string and an integer, or an integer and a double, but not a string and a double. You didn't include the semantics so I don't know what it does. So while you would like to do something like this:

class BinaryOps r where 
  add :: r Integer -> r Integer -> r Integer

or even

class BinaryOps r where 
  add :: Num a => r a -> r a-> r a

you can't, because f is too polymorphic. The best I could think of:

class BinaryOps r where 
  f :: FArg a b c => r a -> r b -> r c  

class FArg a b c 
instance FArg String String a  -- a should be the actual output type
instance FArg String Integer a
instance FArg Integer Double a

instance BinaryOps MyType where 
    f (A s1) (A s2) = undefined 
    f (A s1) (B s2) = undefined 
    f (B s1) (D s2) = undefined

This isn't a very good solution because FArg says nothing about the arguments, the user would then have to look up the definition of the class, if an instance is added to FArg, for example FArg Double Double Double you will be able to call f (D 0) (D 0) and get a runtime pattern match error. My suggestion would be to change the functions to more sensible types; write the functions as monomorphic and implement implicit or explicit casting in your dsl; include the definition of some actual functions so that it is easier to address this issue.

来源：https://stackoverflow.com/questions/22491228/how-can-i-handle-operations-over-many-different-types-in-my-dsl

标签

haskell

dsl

typeclass