What cases does the GHC occurs check identify?

Submitted by 时光总嘲笑我的痴心妄想 on 2021-01-02 07:08:16

Question


The GHC occurs check prevents you from constructing infinite types. Is its purpose to prevent common errors in code or to prevent the typechecker from looping indefinitely, or both?

What cases does it identify and is it possible for a malicious user to trick it (as in a Safe Haskell context) into looping? If the type system is Turing-complete (is it?) I don't understand how GHC can guarantee that the computation will halt.


Answer 1:


Think of type inference as solving a system of equations. Let's look at an example:

f x = (x,2)

How can we deduce the type of f? We see that it is a function:

f :: a -> b

Additionally, from the structure of f we can see that the following equations hold simultaneously:

b = (c,d)
d = Int
c = a

By solving this system of equations we can see that the type of f is a -> (a, Int). Now let's look at the following (erroneous) function:

f x = x : x

The type of (:) is a -> [a] -> [a], so this generates the following system of equations (simplified):

a = a
a = [a]

So we get the equation a = [a], which has no finite solution, and therefore the code is not well-typed. If we didn't reject the equation a = [a], we would indeed go into an infinite loop, adding the equations a = [a], a = [[a]], a = [[[a]]], etc. to our system (alternatively, as Daniel notes in his answer, we could allow infinite types in our type system, but that would let erroneous programs such as f x = x : x typecheck).
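The equation-solving view above can be sketched as a toy unifier. This is only an illustration, not GHC's actual algorithm: the `Ty` type, the `Subst` map and the functions `apply`, `occurs`, `bind`, `unify` and `solve` are all names invented for this example. The `occurs` test in `bind` is the occurs check: it refuses to record a binding such as a = [a].

```haskell
import qualified Data.Map as Map

-- Toy type language (hypothetical): variables, Int, lists, pairs, functions.
data Ty = TVar String | TInt | TList Ty | TPair Ty Ty | TFun Ty Ty
  deriving (Eq, Show)

type Subst = Map.Map String Ty

-- Apply a substitution, following chains of variable bindings.
apply :: Subst -> Ty -> Ty
apply s t = case t of
  TVar v    -> maybe t (apply s) (Map.lookup v s)
  TInt      -> TInt
  TList a   -> TList (apply s a)
  TPair a b -> TPair (apply s a) (apply s b)
  TFun a b  -> TFun (apply s a) (apply s b)

-- Does variable v occur anywhere inside t?
occurs :: String -> Ty -> Bool
occurs v t = case t of
  TVar w    -> v == w
  TInt      -> False
  TList a   -> occurs v a
  TPair a b -> occurs v a || occurs v b
  TFun a b  -> occurs v a || occurs v b

-- Record v = t, unless that would create an infinite type.
bind :: String -> Ty -> Subst -> Either String Subst
bind v t s
  | t == TVar v = Right s                       -- a = a is trivially true
  | occurs v t  = Left ("occurs check: " ++ v ++ " = " ++ show t)
  | otherwise   = Right (Map.insert v t s)

-- Solve one equation under the bindings collected so far.
unify :: Ty -> Ty -> Subst -> Either String Subst
unify t1 t2 s = go (apply s t1) (apply s t2)
  where
    go (TVar v) t              = bind v t s
    go t (TVar v)              = bind v t s
    go TInt TInt               = Right s
    go (TList a) (TList b)     = unify a b s
    go (TPair a b) (TPair c d) = unify a c s >>= unify b d
    go (TFun a b) (TFun c d)   = unify a c s >>= unify b d
    go a b                     = Left ("mismatch: " ++ show a ++ " vs " ++ show b)

-- Solve a whole system of equations, left to right.
solve :: [(Ty, Ty)] -> Either String Subst
solve = foldl (\acc (l, r) -> acc >>= unify l r) (Right Map.empty)

-- The equations for f x = (x,2): b = (c,d), d = Int, c = a.
goodEqs :: [(Ty, Ty)]
goodEqs =
  [ (TVar "b", TPair (TVar "c") (TVar "d"))
  , (TVar "d", TInt)
  , (TVar "c", TVar "a")
  ]

-- The equations for f x = x : x: a = a, a = [a].
badEqs :: [(Ty, Ty)]
badEqs = [(TVar "a", TVar "a"), (TVar "a", TList (TVar "a"))]
```

Solving `goodEqs` and applying the result to `TFun (TVar "a") (TVar "b")` gives the toy encoding of a -> (a, Int), while `solve badEqs` returns a `Left` because of the occurs check.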

You can also test this in ghci:

> let f x = x : x

<interactive>:2:15:
    Occurs check: cannot construct the infinite type: a0 = [a0]
    In the second argument of `(:)', namely `x'
    In the expression: x : x
    In an equation for `f': f x = x : x

As to your other questions: GHC Haskell's type system is not Turing-complete and the typechecker is guaranteed to halt, unless you enable UndecidableInstances, in which case it can theoretically go into an infinite loop. GHC, however, ensures termination by having a fixed-depth recursion stack, so it's not possible to construct a program on which it never halts (edit: according to C.A.McCann's comment, it is possible after all: there's an analogue of tail recursion on the type level that allows looping without increasing the stack depth).

It is, however, possible to make compilation take an arbitrarily long time, since the complexity of Hindley-Milner type inference is exponential in the worst (but not the average!) case:

dup x = (x,x)

bad = dup . dup . dup . dup . dup . dup . dup . dup
      . dup . dup . dup . dup . dup . dup . dup . dup
      . dup . dup . dup . dup . dup . dup . dup . dup
      . dup . dup . dup . dup . dup . dup . dup . dup
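To see why this blows up, here is a small sketch that measures how the result type grows with each extra dup. It uses `typeOf` and `typeRepArgs` from Data.Typeable to inspect the type at run time; the helper `leaves` and the list `leafCounts` are names made up for this illustration, and only four compositions are used so that it compiles quickly.

```haskell
import Data.Typeable (TypeRep, typeOf, typeRepArgs)

dup :: a -> (a, a)
dup x = (x, x)

-- Count the argument-free type constructors (the leaves) in a type representation.
leaves :: TypeRep -> Int
leaves t = case typeRepArgs t of
  []   -> 1
  args -> sum (map leaves args)

-- Leaf counts of the result type after 1, 2, 3 and 4 applications of dup.
leafCounts :: [Int]
leafCounts =
  [ leaves (typeOf (dup True))
  , leaves (typeOf ((dup . dup) True))
  , leaves (typeOf ((dup . dup . dup) True))
  , leaves (typeOf ((dup . dup . dup . dup) True))
  ]
```

Each additional dup doubles the number of leaves (2, 4, 8, 16), so the 32-fold composition above has a result type with 2^32 leaves.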

Safe Haskell won't protect you from this. Take a look at mueval if you want to allow potentially malicious users to compile Haskell programs on your system.




Answer 2:


The GHC occurs check prevents you from constructing infinite types.

This is true only in the very literal sense of preventing types that are syntactically infinite. What's really going on here is just a recursive type where the unification algorithm would need to, in a sense, inline the recursion.

It's always possible to define exactly the same type by making the recursion explicit. This can even be done generically, using a type-level version of fix:

newtype Fix f = Fix (f (Fix f))

As an example, the type Fix ((->) a) is equivalent to unifying b with (a -> b).
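A minimal sketch of how Fix makes such recursion explicit. The synonyms `Sink` and `Stream` and the functions `sink`, `feedAll`, `countFrom` and `takeS` are illustrative names, not part of any library. `Sink a` plays the role of the solution to b = (a -> b), and `Stream a` does the same for b = (a, b).

```haskell
newtype Fix f = Fix (f (Fix f))

-- b = (a -> b): a function that keeps accepting arguments forever.
type Sink a = Fix ((->) a)

sink :: Sink a
sink = Fix (\_ -> sink)

-- Feed every element of a list to a Sink; the result is again a Sink.
feedAll :: Sink a -> [a] -> Sink a
feedAll s []             = s
feedAll (Fix f) (x : xs) = feedAll (f x) xs

-- b = (a, b): an infinite stream of values.
type Stream a = Fix ((,) a)

countFrom :: Int -> Stream Int
countFrom n = Fix (n, countFrom (n + 1))

-- Observe a finite prefix of a Stream.
takeS :: Int -> Stream a -> [a]
takeS n (Fix (x, rest))
  | n <= 0    = []
  | otherwise = x : takeS (n - 1) rest
```

Without the Fix wrapper, definitions like `sink = \_ -> sink` are rejected by the occurs check; with it, every unfolding of the recursion is marked by a constructor, so unification never has to expand the type itself.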

In practice, however, "infinite type" errors almost always indicate an error in the code (so if it's broke, you probably shouldn't Fix it). The usual scenario is mixing up argument order or otherwise having an expression in the wrong place in code that doesn't use explicit type signatures.

A type inference system that was extremely naive in the right way could potentially expand the recursion until it ran out of memory, but the halting problem doesn't enter into it--if a type needs to unify with part of itself, that's never going to work (at least in Haskell, there may be languages that instead treat it as equivalent to the explicitly recursive newtype above).

Neither type checking nor type inference in GHC is Turing-complete unless you enable UndecidableInstances, in which case you can do arbitrary computation via functional dependencies or type families.
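As a small, terminating illustration of computation with type families, here is a sketch of type-level multiplication on Peano numbers. All names (`Nat`, `Add`, `Mul`, `Reify`, `Two`, `Three`, `six`) are invented for this example. GHC is expected to reject the second equation of `Mul` without UndecidableInstances, because its right-hand side nests one family application inside another.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables, TypeFamilies, UndecidableInstances #-}

import Data.Proxy (Proxy (..))

-- Peano naturals, promoted to the type level by DataKinds.
data Nat = Z | S Nat

type family Add (a :: Nat) (b :: Nat) :: Nat where
  Add 'Z     b = b
  Add ('S a) b = 'S (Add a b)

-- The nested family application on the right-hand side is what
-- needs UndecidableInstances.
type family Mul (a :: Nat) (b :: Nat) :: Nat where
  Mul 'Z     b = 'Z
  Mul ('S a) b = Add b (Mul a b)

-- Bring a type-level number back down to the value level.
class Reify (n :: Nat) where
  reify :: Proxy n -> Int

instance Reify 'Z where
  reify _ = 0

instance Reify n => Reify ('S n) where
  reify _ = 1 + reify (Proxy :: Proxy n)

type Two   = 'S ('S 'Z)
type Three = 'S Two

-- The product is computed entirely by the type checker.
six :: Int
six = reify (Proxy :: Proxy (Mul Two Three))
```

This particular family always terminates. The extension only switches off the syntactic checks that would otherwise guarantee it, which is what opens the door to non-terminating definitions.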

Safe Haskell doesn't really enter the picture at all. It's easy to generate very large inferred types that will exhaust system memory despite being finite, and if memory serves me Safe Haskell doesn't restrict use of UndecidableInstances anyway.




Answer 3:


I have the following wonderful mail in my bookmarks: "There's nothing wrong with infinite types!" Even with infinite types, there's no real danger of making the typechecker loop. The type system is not Turing-complete (unless you explicitly ask for it to be with something like the UndecidableInstances extension).




Answer 4:


The purpose (in Haskell compilers) is to prevent common errors in code. It is possible to construct a type checker and inference engine which would support infinite recursion of types. There is some further information in this question.

OCaml implements recursive types with -rectypes, so it's definitely possible. The OCaml community will be much more versed in some of the problems that arise (the behavior is off by default).

The occurs check identifies infinite type expansions. For example, this code:

Prelude> let a = [a]
<interactive>:2:10:
    Occurs check: cannot construct the infinite type: t0 = [t0]
    In the expression: a
    In the expression: [a]
    In an equation for `a': a = [a]

If you try to work out the type by hand, the type is [ [ [ [ ... ] ] ] ]. It's impossible to write such types by hand, because they're infinite.

The occurs check happens during type inference, which is a separate stage from type checking. Infinite types would have to be inferred, because they can't be annotated manually. Haskell-98 type inference is decidable, so it's impossible to trick the compiler into looping (barring bugs of course, which I suspect this example exploits). GHC by default uses a restricted subset of System F for which type inference is also decidable. With some extensions, such as RankNTypes, GHC does allow code for which type inference is undecidable, but then requires a type annotation, so again there is no danger of the type inference stage looping.

Since checking programs in a Turing-complete type system is undecidable, the default type system cannot be Turing-complete. I don't know if GHC's type system becomes Turing-complete with some combination of extensions enabled, but certain extensions (e.g. UndecidableInstances) allow writing code that will crash the compiler with a stack overflow.

Incidentally, the main problem with disabling the occurs check is that many common coding mistakes result in infinite type errors, so disabling it usually leads to more problems than it solves. If you do intend to use an infinite type, a newtype wrapper will allow it without much excess notation.
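For instance, the rejected `let a = [a]` from above can be written with a newtype wrapper. This is a sketch with made-up names (`Nested`, `depthUpTo`):

```haskell
-- The wrapper names the recursion: Nested is isomorphic to [Nested].
newtype Nested = Nested [Nested]

-- The same shape as `let a = [a]`, now accepted by the type checker.
a :: Nested
a = Nested [a]

-- Peel off at most n layers and count how many were peeled.
depthUpTo :: Int -> Nested -> Int
depthUpTo n (Nested xs)
  | n <= 0    = 0
  | otherwise = case xs of
      []      -> 1
      (x : _) -> 1 + depthUpTo (n - 1) x
```

The value `a` is still infinitely nested at run time, but its type is the finite name `Nested`, so the occurs check never fires.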



Source: https://stackoverflow.com/questions/12493773/what-cases-do-the-ghc-occurs-check-identify
