Is there a compiler-extension for untagged union types in Haskell?

Submitted by 孤人 on 2020-01-29 09:30:08

Question


In some languages (Typed Racket, for example), the programmer can specify a union type without tagging its members. For instance, the type (U Integer String) captures integers and strings without wrapping them as (I Integer) or (S String) in a data IntOrStringUnion = ... declaration or anything like that. Is there a way to do the same in Haskell?


Answer 1:


Either is what you're looking for... ish.

In Haskell terms, I'd describe what you're looking for as an anonymous sum type. By anonymous, I mean that it doesn't have a defined name (unlike a type introduced with a data declaration). By sum type, I mean a data type whose values can be one of several (distinguishable) alternatives; a tagged union or such. (If you're not familiar with this terminology, try Wikipedia for starters.)

We have a well-known idiomatic anonymous product type, which is just a tuple. If you want to have both an Int and a String, you just smush them together with a comma: (Int, String). And tuples (seemingly) can go on forever--(Int, String, Double, Word), and you can pattern-match the same way. (There's a limit, but never mind.)
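As a small illustration of the tuple case, here is a sketch of pattern matching on an anonymous product. The function name summarize and its field meanings are made up for this example.

```haskell
-- Hypothetical helper: a tuple is destructured by position, with no
-- data declaration needed for the (Int, String, Double) product.
summarize :: (Int, String, Double) -> String
summarize (n, label, x) = label ++ ": " ++ show n ++ ", " ++ show x
```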

The well-known idiomatic anonymous sum type is Either, from Data.Either (and the Prelude):

data  Either a b  =  Left a | Right b
  deriving (Eq, Ord, Read, Show, Typeable)

It has some shortcomings, most prominently a Functor instance that favors Right in a way that's odd in this context. The problem is that chaining it introduces a lot of awkwardness: the type ends up like Either Int (Either String (Either Double Word)). Pattern matching is even more awkward, as others have noted.
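To make the awkwardness concrete, here is a sketch of case analysis over that nested type. The names Nested and describe are invented for illustration.

```haskell
-- Hypothetical example: a four-way union spelled with nested Eithers.
type Nested = Either Int (Either String (Either Double Word))

-- Each extra alternative adds another layer of Left/Right to peel off,
-- and the function is tied to this exact nesting order.
describe :: Nested -> String
describe (Left i)                  = "Int: " ++ show i
describe (Right (Left s))          = "String: " ++ s
describe (Right (Right (Left d)))  = "Double: " ++ show d
describe (Right (Right (Right w))) = "Word: " ++ show w
```

Reordering the alternatives in the type, or adding a fifth one, means rewriting every pattern.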

I just want to note that we can get closer to (what I understand to be) the Racket use case. From my extremely brief Googling, it looks like in Racket you can use predicates like isNumber? (the actual Racket name is number?) to determine what type is actually in a given value of a union type. In Haskell, we usually do that with case analysis (pattern matching), but that's awkward with Either, and functions using simple pattern matching will likely end up hard-wired to a particular union type. We can do better.

IsNumber?

I'm going to write a function I think is an idiomatic Haskell stand-in for isNumber?. First, we don't like doing Boolean tests and then running functions that assume their result; instead, we tend to just convert to Maybe and go from there. So the function's type will end with -> Maybe Int. (Using Int as a stand-in for now.)

But what's on the left-hand side of the arrow? "Something that might be an Int -- or a String, or whatever other types we put in the union." Uh, okay. So it's going to be one of a number of types. That sounds like a typeclass, so we'll put a constraint and a type variable on the left-hand side of the arrow: MightBeInt a => a -> Maybe Int. Okay, let's write out the class:

class MightBeInt a where
  isInt   :: a -> Maybe Int
  fromInt :: Int -> a

Okay, now how do we write the instances? Well, we know if the first parameter to Either is Int, we're golden, so let's write that out. (Incidentally, if you want a nice exercise, only look at the instance ... where parts of these next three code blocks, and try to implement the class members yourself.)

instance MightBeInt (Either Int b) where
  isInt (Left i) = Just i
  isInt _ = Nothing
  fromInt = Left

Fine. And ditto if Int is the second parameter:

instance MightBeInt (Either a Int) where
  isInt (Right i) = Just i
  isInt _ = Nothing
  fromInt = Right

But what about Either String (Either Bool Int)? The trick is to recurse on the right hand type: if it's not Int, is it an instance of MightBeInt itself?

instance MightBeInt b => MightBeInt (Either a b) where
  isInt (Right xs) = isInt xs
  isInt _ = Nothing
  fromInt = Right . fromInt

(Note that these all require FlexibleInstances and OverlappingInstances. Since GHC 7.10, OverlappingInstances is deprecated in favor of per-instance OVERLAPPING and OVERLAPPABLE pragmas.) It took me a long time to get a feel for writing and reading these class instances; don't worry if this instance is surprising. The punch line is that we can now do this:

anInt1 :: Either Int String
anInt1 = fromInt 1

anInt2 :: Either String (Either Int Double)
anInt2 = fromInt 2

anInt3 :: Either String Int
anInt3 = fromInt 3

notAnInt :: Either String Int
notAnInt = Left "notint"

ghci> isInt anInt3
Just 3
ghci> isInt notAnInt
Nothing

Great!
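For readers on a recent GHC, the pieces above might be assembled into one compilable file roughly as follows. This is a sketch: it swaps the OverlappingInstances extension for per-instance OVERLAPPING pragmas, which is an adaptation and not part of the code above.

```haskell
{-# LANGUAGE FlexibleInstances #-}

class MightBeInt a where
  isInt   :: a -> Maybe Int
  fromInt :: Int -> a

-- Int on the left: found immediately.
instance {-# OVERLAPPING #-} MightBeInt (Either Int b) where
  isInt (Left i) = Just i
  isInt _        = Nothing
  fromInt        = Left

-- Int as the final right-hand alternative.
instance {-# OVERLAPPING #-} MightBeInt (Either a Int) where
  isInt (Right i) = Just i
  isInt _         = Nothing
  fromInt         = Right

-- Otherwise keep searching in the right-hand type.
instance MightBeInt b => MightBeInt (Either a b) where
  isInt (Right x) = isInt x
  isInt _         = Nothing
  fromInt         = Right . fromInt

anInt2 :: Either String (Either Int Double)
anInt2 = fromInt 2

notAnInt :: Either String Int
notAnInt = Left "notint"
```

One caveat with this design: at the type Either Int Int the first two instances both match and neither is more specific, so GHC should reject a use at that type as overlapping.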

Generalizing

Okay, but now do we need to write another type class for each type we want to look up? Nope! We can parameterize the class by the type we want to look up! It's a pretty mechanical translation (the two-parameter class needs MultiParamTypeClasses); the only question is how to tell the compiler what type we're looking for, and that's where Proxy comes to the rescue. (If you don't want to install tagged or depend on base 4.7, which provides Data.Proxy, just define data Proxy a = Proxy. It's nothing special, but you'll need PolyKinds.)

class MightBeA t a where
  isA   :: proxy t -> a -> Maybe t
  fromA :: t -> a

instance MightBeA t t where
  isA _ = Just
  fromA = id

instance MightBeA t (Either t b) where
  isA _ (Left i) = Just i
  isA _ _ = Nothing
  fromA = Left

instance MightBeA t b => MightBeA t (Either a b) where
  isA p (Right xs) = isA p xs
  isA _ _ = Nothing
  fromA = Right . fromA

ghci> isA (Proxy :: Proxy Int) anInt3
Just 3
ghci> isA (Proxy :: Proxy String) notAnInt
Just "notint"

Now the usability situation is... better. The main thing we've lost, by the way, is the exhaustiveness checker.
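Here is a sketch of the generalized class as a single file, assuming a GHC recent enough to ship Data.Proxy in base and to accept OVERLAPPING pragmas (both are adaptations). The value u is a made-up example.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Data.Proxy (Proxy (..))

class MightBeA t a where
  isA   :: proxy t -> a -> Maybe t
  fromA :: t -> a

-- Base case: the value already has the type we are looking for.
instance {-# OVERLAPPING #-} MightBeA t t where
  isA _ = Just
  fromA = id

-- The wanted type is the left alternative.
instance {-# OVERLAPPING #-} MightBeA t (Either t b) where
  isA _ (Left x) = Just x
  isA _ _        = Nothing
  fromA          = Left

-- Otherwise recurse into the right alternative.
instance MightBeA t b => MightBeA t (Either a b) where
  isA p (Right x) = isA p x
  isA _ _         = Nothing
  fromA           = Right . fromA

-- Hypothetical three-way union holding an Int.
u :: Either String (Either Bool Int)
u = fromA (42 :: Int)
```

With these definitions, isA (Proxy :: Proxy Int) u finds the Int, while asking for a Bool or a String yields Nothing.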

Notational Parity With (U String Int Double)

For fun, in GHC 7.8 or later we can use DataKinds and TypeFamilies to eliminate the infix type constructors in favor of type-level lists. (In Haskell, you can't have one type constructor--like IO or Either--take a variable number of parameters, but a type-level list is just one parameter.) It's just a few lines, which I'm not really going to explain:

type family OneOf (as :: [*]) :: * where
  OneOf '[] = Void
  OneOf '[a] = a
  OneOf (a ': as) = Either a (OneOf as)

Note that you'll need to import Data.Void. Now we can do this:

anInt4 :: OneOf '[Int, Double, Float, String]
anInt4 = fromInt 4

ghci> :kind! OneOf '[Int, Double, Float, String]
OneOf '[Int, Double, Float, String] :: *
= Either Int (Either Double (Either Float [Char]))

In other words, OneOf '[Int, Double, Float, String] is the same as Either Int (Either Double (Either Float [Char])).
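Because OneOf reduces to plain nested Either, ordinary pattern matching still works on it. Here is a self-contained sketch, assuming a GHC new enough to provide Data.Kind (it uses Type in place of *). The names IntOrString and classify are invented for this example.

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies   #-}
{-# LANGUAGE TypeOperators  #-}

import Data.Kind (Type)
import Data.Void (Void)

-- Closed type family: equations are tried top to bottom.
type family OneOf (as :: [Type]) :: Type where
  OneOf '[]       = Void
  OneOf '[a]      = a
  OneOf (a ': as) = Either a (OneOf as)

-- Hypothetical alias; it reduces to Either Int String.
type IntOrString = OneOf '[Int, String]

classify :: IntOrString -> String
classify (Left n)  = "Int: " ++ show n
classify (Right s) = "String: " ++ s
```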




Answer 2:


You need some kind of tagging because you need to be able to check if a value is actually an Integer or a String to use it for anything. One way to alleviate having to create a custom ADT for every combination is to use a type such as

{-# LANGUAGE TypeOperators #-}

data a :+: b = L a | R b

infixr 6 :+:

returnsIntOrString :: Integer -> Integer :+: String
returnsIntOrString i
    | i `rem` 2 == 0 = R "Even"
    | otherwise      = L (i * 2)

returnsOneOfThree :: Integer -> Integer :+: String :+: Bool
returnsOneOfThree i
    | i `rem` 2 == 0 = (R . L) "Even"
    | i `rem` 3 == 0 = (R . R) False
    | otherwise      = L (i * 2)
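Consuming such a value is still ordinary pattern matching on the L and R tags. Here is a minimal sketch that repeats the type definition so it stands alone; describeThree is a made-up name.

```haskell
{-# LANGUAGE TypeOperators #-}

data a :+: b = L a | R b

infixr 6 :+:

-- Hypothetical consumer: infixr makes the type below parse as
-- Integer :+: (String :+: Bool), so the tags nest to the right.
describeThree :: (Integer :+: String :+: Bool) -> String
describeThree (L n)     = "Integer: " ++ show n
describeThree (R (L s)) = "String: " ++ s
describeThree (R (R b)) = "Bool: " ++ show b
```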


Source: https://stackoverflow.com/questions/26205585/is-there-a-compiler-extension-for-untagged-union-types-in-haskell
