How does the presence of the “error” function bear on the purity of Haskell?

Question

I've always wondered how the Haskell exception system fits in with the whole "Pure functional language" thing. For example see the below GHCi session.

GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Prelude> head []
*** Exception: Prelude.head: empty list
Prelude> :t head
head :: [a] -> a
Prelude> :t error
error :: [Char] -> a
Prelude> error "ranch"
*** Exception: ranch
CallStack (from HasCallStack):
  error, called at <interactive>:4:1 in interactive:Ghci1
Prelude>

The type of head is [a] -> a. But when you call it on the special case of an empty list, you get an exception instead. But this exception is not accounted for in the type signature.

If I remember correctly it's a similar story when there is a failure during pattern matching. It doesn't matter what the type signature says, if you haven't accounted for every possible pattern, you run the risk of throwing an exception.

I don't have a single, concise question to ask, but my head is swimming. What was the motivation for adding this strange exception system to an otherwise pure and elegant language? Is it still pure but I'm just missing something? If I want to take advantage of this exception feature, how would I go about doing it (i.e. how do I catch and handle exceptions? Is there anything else I can do with them?) For example, if ever I write code that uses the "head" function, surely I should take precautions for the case where an empty list somehow smuggles itself in.

Answer 1:

You are confusing two concepts: purity and totality.

Purity says that functions have no side effects.
Totality says that every function terminates and produces a value.

Haskell is pure, but is not total.

Outside of IO, nontermination (e.g., let loop = loop in loop) and exceptions (e.g., error "urk!") are the same – nonterminating and exceptional terms, when forced, do not evaluate to a value. The designers of Haskell wanted a Turing-complete language, which – as per the halting problem – means that they forwent totality. And once you have nontermination, I suppose you might as well have exceptions, too – defining error msg = error msg and having calls to error do nothing forever is much less satisfying in practice than actually seeing the error message you want in finite time!

In general, though, you're right – partial functions (those which are not defined for every input value, like head) are ugly. Modern Haskell generally prefers writing total functions instead by returning Maybe or Either values, e.g.

safeHead :: [a] -> Maybe a
safeHead []    = Nothing
safeHead (x:_) = Just x

errHead :: [a] -> Either String a
errHead []    = Left "Prelude.head: empty list"
errHead (x:_) = Right x

In this case, the Functor, Applicative, Monad, MonadError, Foldable, Traversable, etc., machinery makes combining these total functions and working with their results easy.
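As a rough sketch of what that machinery buys you, here is safeHead combined through Functor, Applicative, Monad and Traversable. The helper names (doubledHead, sumOfHeads, headOfHead, allHeads) are made up for illustration and are not from any library.

```haskell
-- safeHead is repeated from above so this block compiles on its own.
safeHead :: [a] -> Maybe a
safeHead []    = Nothing
safeHead (x:_) = Just x

-- Functor: transform the result if there is one.
doubledHead :: [Int] -> Maybe Int
doubledHead xs = fmap (* 2) (safeHead xs)

-- Applicative: combine two independent results;
-- the whole thing is Nothing if either list is empty.
sumOfHeads :: [Int] -> [Int] -> Maybe Int
sumOfHeads xs ys = (+) <$> safeHead xs <*> safeHead ys

-- Monad: the second step depends on the result of the first.
headOfHead :: [[a]] -> Maybe a
headOfHead xss = safeHead xss >>= safeHead

-- Traversable: take the head of every list, failing if any list is empty.
allHeads :: [[a]] -> Maybe [a]
allHeads = traverse safeHead
```

None of these functions needs an explicit check for the empty list; the Nothing case is propagated by the instances.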

Should you actually come across an exception in your code – for instance, you might use error to check a complicated invariant that you think you've enforced, but you have a bug – you can catch it in IO. That brings us back to the question of why it's OK to interact with exceptions in IO: doesn't that make the language impure? The answer is the same as for why we can do I/O in IO, or work with mutable variables: evaluating a value of type IO a doesn't perform the side effects it describes; the value is just a description of an action that a program could perform. (There are better descriptions of this elsewhere on the internet; exceptions aren't any different from other effects.)
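A minimal sketch of catching a call to error in IO, assuming the try and evaluate functions from Control.Exception in base. The names checkedDiv and safeDivIO are hypothetical and only serve the example.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Hypothetical pure function that uses error when its invariant is broken.
checkedDiv :: Int -> Int -> Int
checkedDiv _ 0 = error "checkedDiv: divisor must be non-zero"
checkedDiv n d = n `div` d

-- evaluate forces the result to weak head normal form inside IO, so the
-- exception is raised while try is still watching, not at some later point
-- when the lazy value happens to be demanded.
safeDivIO :: Int -> Int -> IO (Either String Int)
safeDivIO n d = do
  r <- try (evaluate (checkedDiv n d)) :: IO (Either ErrorCall Int)
  return (either (Left . show) Right r)
```

The text in the Left case comes from show on the exception, so depending on the GHC version it may contain call-stack details in addition to the message.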

(Also, note that there is a separate-but-related exception system in IO, which is used when e.g. trying to read a file that isn't there. People are often OK with this exception system, in moderation, because in IO you're already working with impure code.)
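For the missing-file case, a small sketch assuming try and IOException from Control.Exception; readFileSafe is a hypothetical name.

```haskell
import Control.Exception (IOException, try)

-- Turn the IO exception raised by readFile (for example when the file
-- does not exist) into an ordinary Either value.
readFileSafe :: FilePath -> IO (Either IOException String)
readFileSafe path = try (readFile path)
```

The caller can then pattern match on Left and Right like on any other value, instead of letting the exception propagate.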

Answer 2:

> For example, if ever I write code that uses the "head" function, surely I should take precautions for the case where an empty list somehow smuggles itself in.

A simpler solution: don't use head. There are plenty of replacements: listToMaybe from Data.Maybe, the various alternative implementations in the safe package, etc. The partial functions [1] in the base libraries -- especially ones as easy to replace as head -- are little more than historical cruft, and should be either ignored or replaced by safe variants, such as those in the aforementioned safe package. For further arguments, here is an entirely reasonable rant about partial functions.
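A short sketch of the listToMaybe route, using only Data.Maybe from base; firstOrDefault and describeFirst are made-up names.

```haskell
import Data.Maybe (fromMaybe, listToMaybe)

-- Total replacement for head: the caller supplies a default for the empty list.
firstOrDefault :: a -> [a] -> a
firstOrDefault def = fromMaybe def . listToMaybe

-- Or handle both cases explicitly at the call site.
describeFirst :: [String] -> String
describeFirst xs = case listToMaybe xs of
  Nothing -> "the list is empty"
  Just x  -> "the first element is " ++ x
```

Either way, the empty-list case is visible in the types and has to be handled somewhere.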

> If I want to take advantage of this exception feature, how would I go about doing it (i.e. how do I catch and handle exceptions? Is there anything else I can do with them?)

Exceptions of the sort thrown by error can only be caught in the IO monad. If you are writing pure functions you won't want to force your users to run them in the IO monad merely for catching exceptions. Therefore, if you ever use error in a pure function, assume the error will not be caught [2]. Ideally you shouldn't use error in pure code at all, but if you are somehow compelled to do so, at least make sure to write an informative error message (that is, not "Prelude.head: empty list") so that your users know what is going on when the program crashes.

> If I remember correctly it's a similar story when there is a failure during pattern matching. It doesn't matter what the type signature says, if you haven't accounted for every possible pattern, you run the risk of throwing an exception.

Indeed. The only difference between using head and writing the incomplete pattern match (\(x:_) -> x) yourself is that in the latter case the compiler will at least warn you if you use -Wall, while with head even that is swept under the rug.
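To see the warning in action, here is a small sketch that turns on -Wall through a file-level pragma. firstPartial and firstTotal are hypothetical names; the first definition compiles with an incomplete-patterns warning, the second compiles cleanly.

```haskell
{-# OPTIONS_GHC -Wall #-}

-- Partial: the case expression has no branch for [], so GHC warns
-- about non-exhaustive patterns when -Wall is enabled.
firstPartial :: [a] -> a
firstPartial xs = case xs of
  (x:_) -> x

-- Total: every constructor of the list type is covered, so no warning.
firstTotal :: [a] -> Maybe a
firstTotal xs = case xs of
  []    -> Nothing
  (x:_) -> Just x
```

Adding -Werror on top of -Wall would turn the warning into a compile error, which is one way to keep partial matches out of a code base.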

> I've always wondered how the Haskell exception system fits in with the whole "Pure functional language" thing.

Technically speaking, partial functions don't affect purity (which doesn't make them any less nasty, of course). From a theoretical point of view, head [] is just as undefined as things like foo = let x = x in x. (The keyword for further reading into such subtleties is "bottom".)

[1]: Partial functions are functions that, just like head, are not defined for some values of the argument types they are supposed to take.

[2]: It is worth mentioning that exceptions in IO are a whole different issue, as you can't trivially avoid e.g. a file read failure just by using better functions. There are quite a few approaches towards handling such scenarios in a sensible way. If you are curious about the issue, here is one "highly opinionated" article about it that is illustrative of the relevant tools and trade-offs.

Answer 3:

Haskell does not require that your functions be total, and doesn't track when they're not. (Total functions are those that have a well-defined output for every possible value of their input type.)

Even without exceptions or pattern match failures, you can have a function that doesn't define output for some inputs by just going on forever. An example is length (repeat 1). This continues to compute forever, but never actually throws an error.

The way Haskell semantics "copes" with this is to declare that there is an "extra" value in every single type, the so-called "bottom value", and to declare that any computation that doesn't properly complete and produce a normal value of its type actually produces the bottom value. It's represented by the mathematical symbol ⊥, which is used only when talking about Haskell: there isn't really any way to refer to this value directly in Haskell code. The name undefined is often used as well, since it is bound to an error-raising computation and so semantically produces the bottom value.
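A small sketch of ⊥ inhabiting every type, using undefined as the stand-in. Because Haskell is lazy, a bottom value only causes trouble when something actually forces it. All names below are made up for the example.

```haskell
-- undefined type-checks at any type, which is what "bottom inhabits
-- every type" looks like in practice.
bottomInt :: Int
bottomInt = undefined

bottomBool :: Bool
bottomBool = undefined

-- The second component is never forced, so this is an ordinary 1.
lazyFst :: Int
lazyFst = fst (1, bottomInt)

-- length only walks the spine of the list and never looks at the elements.
lazyLength :: Int
lazyLength = length [bottomBool, bottomBool]
```

Forcing bottomInt or bottomBool directly, for example by printing them, raises the exception from undefined.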

This is a theoretical wart in the system, since it gives you the ability to create a 'value' of any type (albeit not a very useful one), and a lot of the reasoning about bits of code being correct based on types actually relies on the assumption that you can't do exactly that (if you're into the Curry-Howard isomorphism between pure functional programs and formal logic, the existence of ⊥ gives you the ability to "prove" logical contradictions, and thus to prove absolutely anything at all).

But in practice it seems to work out that all the reasoning done by pretending that ⊥ doesn't exist in Haskell still generally works well enough to be useful when you're writing "well-behaved" code that doesn't use ⊥ very much.

The main reason for tolerating this situation in Haskell is ease-of-use as a programming language rather than a system of formal logic or mathematics. It's impossible to make a compiler that could tell, for arbitrary Haskell-like code, whether each function is total or partial (see the Halting Problem). So a language that wanted to enforce totality would have to either remove a lot of the things you can do, or require you to jump through lots of hoops to demonstrate that your code always terminates, or both. The Haskell designers didn't want to do that.

So given that Haskell as a language is resigned to partiality and ⊥, it may as well give you things like error as a convenience. After all, you could always write an error :: String -> a function by just not terminating; getting an immediate printout of the error message rather than having the program just spin forever is a lot more useful to practicing programmers, even if those are both equivalent in the theory of Haskell semantics!

Similarly, the original designers of Haskell decided that implicitly adding a catch-all case to every pattern match that just errors out would be more convenient than forcing programmers to add the error case explicitly every time they expect a part of their code to only ever see certain cases. (Although a lot of Haskell programmers, including me, enable the incomplete-pattern-match warning, almost always treat it as an error, and fix their code, and so would probably prefer that the original Haskell designers had gone the other way on this one.)

TL;DR: exceptions from error and pattern match failure are there for convenience, because they don't make the system any more broken than it already has to be, short of becoming quite a different system from Haskell.

You can program by throwing and catching exceptions if you really want, including catching the exceptions from error or pattern match failure, by using the facilities from Control.Exception.

In order to not break the purity of the system you can raise exceptions from anywhere (because the system always has to deal with the possibility of a function not properly terminating and producing a value; "raising an exception" is just another way in which that can happen), but exceptions can only be caught by constructs in IO. Because the formal semantics of IO permit basically anything to happen (because it has to interface with the real world and there aren't really any hard restrictions we can impose on that from the definition of Haskell), we can also relax most of the rules we usually need for pure functions in Haskell and still have something that technically fits in the model of Haskell code being pure.
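As an illustration of "raise anywhere, catch only in IO", here is a sketch that recovers from a pattern match failure, assuming catch, evaluate and PatternMatchFail from Control.Exception. firstElem and firstOr are hypothetical names.

```haskell
import Control.Exception (PatternMatchFail, catch, evaluate)

-- Pure and partial: there is no equation for [], so GHC inserts a
-- catch-all case that raises PatternMatchFail.
firstElem :: [a] -> a
firstElem (x:_) = x

-- The exception is raised by pure code but can only be observed in IO.
-- evaluate forces the result so the failure happens inside catch.
firstOr :: Int -> [Int] -> IO Int
firstOr def xs = evaluate (firstElem xs) `catch` handler
  where
    handler :: PatternMatchFail -> IO Int
    handler _ = return def
```

The type annotation on handler is what selects which exceptions are caught; anything that is not a PatternMatchFail propagates as usual.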

I haven't used this very much at all (usually I prefer to keep my error handling using things that are more well-defined in terms of Haskell's semantic model than the operational model of what IO does, which can be as simple as Maybe or Either), but you can read about it if you want to.

Source: https://stackoverflow.com/questions/40076851/how-does-the-presence-of-the-error-function-bear-on-the-purity-of-haskell

Tags: haskell, exception, error-handling, functional-programming, purely-functional