Haskell: how to infer the type of an expression manually

问题

Given ist the Haskell function:

head . filter fst

The question is now how to find the type "manually" by hand. If I let Haskell tell me the type I get:

head . filter fst :: [(Bool, b)] -> (Bool, b)

But I want to understand how this works using only the signatures of the used functions which are defined as follows:

head :: [a] -> a
(.) :: (b -> c) -> (a -> b) -> a -> c
filter :: (a -> Bool) -> [a] -> [a]
fst :: (a, b) -> a

Edit: so many very good explanations ... it's not easy to select the best one!

回答1:

Types are infered using a process generally called unification. Haskell belongs to the Hindley-Milner family, which is the unification algorithm it uses to determine the type of an expression.

If unification fails, then the expression is a type error.

The expression

head . filter fst

passes. Let's do the unification manually to see what why we get what we get.

Let's start with filter fst:

filter :: (a -> Bool) -> [a] -> [a]
fst :: (a' , b') -> a'                -- using a', b' to prevent confusion

filter takes a (a -> Bool), then a [a] to give another [a]. In the expression filter fst, we pass to filter the argument fst, whose type is (a', b') -> a'. For this to work, the type fst must unify with the type of filter's first argument:

(a -> Bool)  UNIFY?  ((a', b') -> a')

The algorithm unifies the two type expressions and tries to bind as many type variables (such as a or a') to actual types (such as Bool).

Only then does filter fst lead to a valid typed expression:

filter fst :: [a] -> [a]

a' is clearly Bool. So the type variable a' resolves to a Bool. And (a', b') can unify to a. So if a is (a', b') and a' is Bool, Then a is just (Bool, b').

If we had passed an incompatible argument to filter, such as 42 (a Num), unification of Num a => a with a -> Bool would have failed as the two expressions can never unify to a correct type expression.

Coming back to

filter fst :: [a] -> [a]

This is the same a we are talking about, so we substitute in it's place the result of the previous unification:

filter fst :: [(Bool, b')] -> [(Bool, b')]

The next bit,

head . (filter fst)

Can be written as

(.) head (filter fst)

So take (.)

(.) :: (b -> c) -> (a -> b) -> a -> c

So for unification to succeed,

head :: [a] -> a must unify (b -> c)
filter fst :: [(Bool, b')] -> [(Bool, b')] must unify (a -> b)

From (2) we get that a IS b in the expression (.) :: (b -> c) -> (a -> b) -> a -> c)`

So the values of the type variables a and c in the expression (.) head (filter fst) :: a -> c are easy to tell since (1) gives us the relation between b and c, that: b is a list of c. And as we know a to be [(Bool, b')], c can only unify to (Bool, b')

So head . filter fst successfully type-checks as that:

head . filter fst ::  [(Bool, b')] -> (Bool, b')

UPDATE

It's interesting to see how you can unify starting the process from various points. I chose filter fst first, then went on to (.) and head but as the other examples show, unification can be carried out in several ways, not unlike the way a mathematic proof or a theorem derivation can be done in more than one way!

回答2:

filter :: (a -> Bool) -> [a] -> [a] takes a function (a -> Bool), a list of the same type a, and also returns a list of that type a.

In your defintion you use filter fst with fst :: (a,b) -> a so the type

filter (fst :: (Bool,b) -> Bool) :: [(Bool,b)] -> [(Bool,b)]

is inferred. Next, you compose your result [(Bool,b)] with head :: [a] -> a.

(.) :: (b -> c) -> (a -> b) -> a -> c is the composition of two functions, func2 :: (b -> c) and func1 :: (a -> b). In your case, you have

func2 = head       ::               [ a      ]  -> a

and

func1 = filter fst :: [(Bool,b)] -> [(Bool,b)]

so head here takes [(Bool,b)] as argument and returns (Bool,b) per definition. In the end you have:

head . filter fst :: [(Bool,b)] -> (Bool,b)

回答3:

Let's start with (.). It's type signature is

(.) :: (b -> c) -> (a -> b) -> a -> c

which says "given a function from b to c, and a function from a to b, and an a, I can give you a b". We want to use that with head and filter fst, so`:

(.) :: (b -> c) -> (a -> b) -> a -> c
       ^^^^^^^^    ^^^^^^^^
         head     filter fst

Now head, which is a function from an array of something to a single something. So now we know that b is going to be an array, and c is going to be an element of that array. So for the purpose of our expression, we can think of (.) as having the signature:

(.) :: ([d] -> d) -> (a -> [d]) -> a -> d -- Equation (1)
                     ^^^^^^^^^^
                     filter fst

The signature for filter is:

filter :: (e -> Bool) -> [e] -> [e] -- Equation (2)
          ^^^^^^^^^^^
              fst

(Note that I've changed the name of the type variable to avoid confusion with the as that we already have!) This says "Given a function from e to a Bool, and a list of es, I can give you a list of es". The function fst has the signature:

fst :: (f, g) -> f

says, "given a pair containing an f and a g, I can give you an f". Comparing this with Equation 2, we know that e is going to be a pair of values, the first element of which must be a Bool. So in our expression, we can think of filter as having the signature:

filter :: ((Bool, g) -> Bool) -> [(Bool, g)] -> [(Bool, g)]

(All I've done here is to replace e with (Bool, g) in Equation 2.) And the expression filter fst has the type:

filter fst :: [(Bool, g)] -> [(Bool, g)]

Going back to Equation 1, we can see that (a -> [d]) must now be [(Bool, g)] -> [(Bool, g)], so a must be [(Bool, g)] and d must be (Bool, g). So in our expression, we can think of (.) as having the signature:

(.) :: ([(Bool, g)] -> (Bool, g)) -> ([(Bool, g)] -> [(Bool, g)]) -> [(Bool, g)] -> (Bool, g)

To summarise:

(.) :: ([(Bool, g)] -> (Bool, g)) -> ([(Bool, g)] -> [(Bool, g)]) -> [(Bool, g)] -> (Bool, g)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                head                         filter fst
head :: [(Bool, g)] -> (Bool, g)
filter fst :: [(Bool, g)] -> [(Bool, g)]

Putting it all together:

head . filter fst :: [(Bool, g)] -> (Bool, g)

Which is equivalent to what you had, except that I've used g as the type variable rather than b.

This probably all sounds very complicated, because I described it in gory detail. However, this sort of reasoning quickly becomes second nature and you can do it in your head.

回答4:

_{^{(page down for manual derivation)}}Find the type of head . filter fst == ((.) head) (filter fst), given

head   :: [a] -> a
(.)    :: (b -> c) -> ((a -> b) -> (a -> c))
filter :: (a -> Bool) -> ([a] -> [a])
fst    :: (a, b) -> a

This is achieved in a purely mechanical manner by a small Prolog program:

type(head,    arrow(list(A)       , A)).                 %% -- known facts
type(compose, arrow(arrow(B, C)   , arrow(arrow(A, B), arrow(A, C)))).
type(filter,  arrow(arrow(A, bool), arrow(list(A)    , list(A)))).
type(fst,     arrow(pair(A, B)    , A)).

type([F, X], T):- type(F, arrow(A, T)), type(X, A).      %% -- application rule

which automagically produces, when run in a Prolog interpreter,

3 ?- type([[compose, head], [filter, fst]], T).
T = arrow(list(pair(bool, A)), pair(bool, A))       %% -- [(Bool,a)] -> (Bool,a)

where types are represented as compound data terms, in a purely syntactical manner. E.g. the type [a] -> a is represented by arrow(list(A), A), with possible Haskell equivalent Arrow (List (Logvar "a")) (Logvar "a"), given the appropriate data definitions.

Only one inference rule, that of an application, was used, as well as Prolog's structural unification where compound terms match if they have the same shape and their constituents match: f(a₁, a₂, ... a_n) and g(b₁, b₂, ... b_m) match iff f is the same as g, n == m and a_i matches b_i, with logical variables being able to take on any value as needed, but only once (can't be changed).

4 ?- type([compose, head], T1).     %% -- (.) head   :: (a -> [b]) -> (a -> b)
T1 = arrow(arrow(A, list(B)), arrow(A, B))

5 ?- type([filter, fst], T2).       %% -- filter fst :: [(Bool,a)] -> [(Bool,a)]
T2 = arrow(list(pair(bool, A)), list(pair(bool, A)))

To perform type inference manually in a mechanical fashion, involves writing things one under another, noting equivalences on the side and performing the substitutions thus mimicking the operations of Prolog. We can treat any ->, (_,_), [] etc. purely as syntactical markers, without understanding their meaning at all, and perform the process mechanically using structural unification and, here, only one rule of type inference, viz. rule of application: (a -> b) c ⊢ b {a ~ c} (replace a juxtaposition of (a -> b) and c, with b, under the equivalence of a and c). It is important to rename logical variables, consistently, to avoid name clashes:

(.)  :: (b    -> c ) -> ((a -> b  ) -> (a -> c ))           b ~ [a1], 
head ::  [a1] -> a1                                         c ~ a1
(.) head ::              (a ->[a1]) -> (a -> c ) 
                         (a ->[c] ) -> (a -> c ) 
---------------------------------------------------------
filter :: (   a    -> Bool) -> ([a]        -> [a])          a ~ (a1,b), 
fst    ::  (a1, b) -> a1                                    Bool ~ a1
filter fst ::                   [(a1,b)]   -> [(a1,b)]  
                                [(Bool,b)] -> [(Bool,b)] 
---------------------------------------------------------
(.) head   :: (      a    -> [     c  ]) -> (a -> c)        a ~ [(Bool,b)]
filter fst ::  [(Bool,b)] -> [(Bool,b)]                     c ~ (Bool,b)
((.) head) (filter fst) ::                   a -> c      
                                    [(Bool,b)] -> (Bool,b)

回答5:

You can do this the "technical" way, with lots of complicated unification steps. Or you can do it the "intuitive" way, just looking at the thing and thinking "OK, what have I got here? What is this expecting?" and so on.

Well, filter expects a function and a list, and returns a list. filter fst specifies a function, but there's no list given - so we're still waiting for the list input. So filter fst is taking a list and returning another list. (This is quite a common Haskell phrase, by the way.)

Next, the . operator "pipes" the output to head, which expects a list and returns one of the elements from that list. (The first one, as it happens.) So whatever filter comes up with, head gives you the first element of it. At this point, we can conclude

head . filter foobar :: [x] -> x

But what is x? Well, filter fst applies fst to every element of the list (to decide whether to keep it or throw it). So fst must be applicable to the list elements. And fst expects a 2-element tuple, and returns the first element of that tuple. Now filter is expecting fst to return a Bool, so that means the first element of the tuple must be a Bool.

Putting all that together, we conclude

head . filter fst :: [(Bool, y)] -> (Bool, y)

What is y? We don't know. We don't actually care! The above functions will work whatever it is. So that's our type signature.

In more complicated examples it can be harder to figure out what's going on. (Especially when weird class instances get involved!) But for smallish ones like this, involving common functions, you can usually just think "OK, what goes in here? What comes out there? What does this function expect?" and walk right up to the answer without too much manual algorithm-chasing.

来源：https://stackoverflow.com/questions/14335704/haskell-how-to-infer-the-type-of-an-expression-manually

标签

haskell

types

type-inference

unification