data family instances: newtype, data

问题

With Haskell 98 decls, the whole datatype must be either newtype or data. But data families can have a mix of newtype instance, data instance. Does that mean that being newtype is a property of the data constructor rather than the datatype? Could there be a Haskell with something like:

data Foo a = MkFoo1 Int a
           | MkFoo2 Bool a
           | newtype MkFoo3 a

I know I can't write the following, but why/what goes wrong?:

data family Bar a

newtype instance Bar (Maybe Int)  = MkBar1 (Maybe Int)
            -- MkBar1 :: (Maybe Int) -> Bar (Maybe Int), see below

newtype instance Bar [Char]  = MkBar2 [Char]

data instance Bar [Bool]  where
  MkBar3 :: Int -> Bool -> Bar [Bool]
  -- can't be a newtype because of the existential Int

-- we're OK up to here mixing newtypes and data/GADT

data instance Bar [a]  where
  MkBar4 :: Num a => a -> Bar [a]
  -- can't be a newtype because of the Num a =>

I can't write that because the instance head Bar [a] overlaps the two heads for MkBar2, MkBar3. Then I could fix that by moving those two constructor decls inside the where ... for Bar [a]. But then MkBar2 becomes a GADT (because it's result type is not Bar [a]), so can't be a newtype.

Then is being a newtype a property of the result type, rather than of the constructor? But consider the type inferred for newtype instance MkBar1 above. I can't write a top-level newtype with the same type

newtype Baz a  where
  MkBaz :: (Maybe Int) -> Baz (Maybe Int)

-- Error: A newtype constructor must have a return type of form T a1 ... an

Huh? MkBar1 is a newtype constructor whose type is not of that form.

If possible, please explain without diving into talk of roles: I've tried to understand them; it just makes my head hurt. Talk in terms of constructing and pattern-matching on those constructors.

回答1:

You can think of data families as stronger versions of (injective and open) type families. As you can see from this question I asked a while back, data families can almost be faked with injective type families.

Does that mean that being newtype is a property of the data constructor rather than the datatype? Could there be a Haskell with something like:
data Foo a = MkFoo1 Int a
           | MkFoo2 Bool a
           | newtype MkFoo3 a

No. Being a newtype is definitively a property of the type constructor, not of the data constructor. A data family, quite like a type family, has two levels of type constructors:

the type constructor data/type family FamTyCon a
the type constructors for the instances data/type instance FamInstTyCon a = ...

Different instances of the same data/type family constructor are still fundamentally different types - they happen to be unified under one type constructor (and that type constructor is going to be injective and generative - see the linked question for more on that).

By the type family analogy, you wouldn't expect to be able to pattern match on something of type TyFam a, right? Because you could have type instance TyFam Int = Bool and type instance TyFam () = Int, and you wouldn't be able to statically know if you were looking at a Bool or an Int!

回答2:

These are equivalent:

newtype          NT a b  = MkNT (Int, b)

data family DF a b
newtype instance DF a b  = MkDF (Int, b)

-- inferred MkNT :: (Int, b) -> NT a b
--          MkDF :: (Int, b) -> DF a b

Furthermore you can't declare any other instance/constructor within family DF, because their instance heads would overlap DF a b.

The error message in the q re standalone newtype Baz is misleading

-- Error: A newtype constructor must have a return type of form T a1 ... an

(and presumably dates from before there were data families). A newtype constructor must have a return type exactly as the instance head (modulo alpha renaming if it's using GADT syntax). For a standalone newtype, "instance head" means the newtype head.

For non-newtype data instances, the various constructors might have return types strictly more specific than the instance head (that makes them GADTs).

The type declared in a data/newtype instance head (in the literature variously called a 'type scheme', 'monotype' -- because no constraints, 'Function-free type' -- in the 2008 paper 'Type Checking with Open Type Functions') is the principal type that all the constructors' return types must unify with. (There might not be any constructors with exactly that return type.)

If your instance be a newtype, the rules are much tighter: there can be only one constructor, and its return type must be exactly the principal type. So to answer the original q

Does that mean that being newtype is a property of the data constructor rather than the datatype? ... but ... Then is being a newtype a property of the result type, rather than of the constructor?

No, not of the constructor; yes more like of the result: being a newtype is the property of a data family instance/its principal type. It's just that for standalone newtypes, there can in effect be only one instance, and its principal type is the most general type for that type constructor. Derivatively, we can say the newtype's principal type/return type uniquely identifies a data constructor. That's crucial for type safety, because values of that type share their representation with the type inside the newtype's data constructor, i.e. with no wrapper -- as @AlexisKing's comment points out. Then pattern matching doesn't need to go looking for the constructor: the match is irrefutable/the constructor is virtual.

It strikes me that in the interests of more precise typing/better documentation, you might want to declare a newtype as a data family with only one instance, whose head is more specific than what you must put for a standalone newtype.

来源：https://stackoverflow.com/questions/52417415/data-family-instances-newtype-data

标签

haskell

newtype