问题
My question is about Graham Hutton's book Programming in Haskell 1st Ed.
There is a parser created in section 8.4, and I am assuming anyone answering has the book or can see the link to slide 8 in the link above.
A basic parser called item
is described as:
type Parser a = String -> [(a, String)]
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)]
which is used with do
to define another parser p
(the do
parser)
p :: Parser (Char, Char)
p = do x <- item
item
y <- item
return (x,y)
the relevant bind definition is:
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> case parse p inp of
[] -> []
[(v,out)] -> parse (f v) out
return
is defined as:
return :: a -> Parser a
return v = \inp -> [(v,inp)]
parse
is defined as:
parse :: Parser a -> String -> [(a,String)]
parse p inp = p inp
The program (the do
parser) takes a string and selects the 1st and 3rd characters and returns them in a tuple with the remainder of the string in a list, e.g., "abcdef"
produces [('a','c'), "def"]
.
I want to know how the
(f v) out
in
[(v,out)] -> parse (f v) out
returns a parser which is then applied to out
.
f
in thedo
parser isitem
anditem
taking a character'c'
returns[('c',[])]
?How can that be a parser and how can it take
out
as an argument?
Perhaps I am just not understanding what (f v)
does.
Also how does the
do
parser 'drop' the returned values each time to operate on the rest of the input string whenitem
is called again?What is the object that works its way through the
do
parser, and how is it altered at each step, and by what means is it altered?
回答1:
f v
produces a Parser b
because f
is a function of type a -> Parser b
and v
is a value of type a
. So then you're calling parse
with this Parser b
and the string out
as arguments.
F in the 'do' parser is item
No, it's not. Let's consider a simplified (albeit now somewhat pointless) version of your parser:
p = do x <- item
return x
This will desugar to:
p = item >>= \x -> return x
So the right operand of >>=
, i.e. f
, is \x -> return x
, not item
.
Also how does the 'do' parser 'drop' the returned values each time to operate on the rest of the input string when item is called again? What is the object that works its way through the 'do' parser and how is it altered and each step and by what means is it altered?
When you apply a parser it returns a tuple containing the parsed value and a string representing the rest of the input. If you look at item
for example, the second element of the tuple will be xs
which is the tail of the input string (i.e. a string containing all characters of the input string except the first). This second part of the tuple will be what's fed as the new input to subsequent parsers (as per [(v,out)] -> parse (f v) out
), so that way each successive parser will take as input the string that the previous parser produced as the second part of its output tuple (which will be a suffix of its input).
In response to your comments:
When you write "p = item >>= \x -> return x", is that the equivalent of just the first line "p = do x <- item"?
No, it's equivalent to the entire do
-block (i.e. do {x <- item; return x}
). You can't translate do
-blocks line-by-line like that. do { x <- foo; rest }
is equivalent to foo >>= \x -> do {rest}
, so you'll always have the rest of the do
-block as part of the right operand of >>=
.
but not how that reduces to simply making 'out' available as the input for the next line. What is parse doing if the next line of the 'do' parser is a the item parser?
Let's walk through an example where we invoke item
twice (this is like your p
, but without the middle item). In the below I'll use ===
to denote that the expressions above and below the ===
are equivalent.
do x <- item
y <- item
return (x, y)
=== -- Desugaring do
item >>= \x -> item >>= \y -> return (x, y)
=== -- Inserting the definition of >>= for outer >>=
\inp -> case parse item inp of
[] -> []
[(v,out)] -> parse (item >>= \y -> return (v, y)) out
Now let's apply this to the input "ab":
case parse item "ab" of
[] -> []
[(v,out)] -> parse (item >>= \y -> return (v, y)) out
=== Insert defintiion of `parse`
case item "ab" of
[] -> []
[(v,out)] -> parse (item >>= \y -> return (v, y)) out
=== Insert definition of item
case ('a', "b") of
[] -> []
[(v,out)] -> parse (item >>= \y -> return (v, y)) out
===
parse (item >>= \y -> return ('a', y)) out
Now we can expand the second >>=
the same we did the fist and eventually end up with ('a', 'b')
.
回答2:
The relevant advice is, Don't panic (meaning, don't rush it; or, take it slow), and, Follow the types.
First of all, Parser
s
type Parser a = String -> [(a,String)]
are functions from String
to lists of pairings of result values of type a
and the leftover Strings. That leftovers string will be used as input for next parsing step. That's the main thing about it here.
You are asking, in
p >>= f = \inp -> case (parse p inp) of
[] -> []
[(v,out)] -> parse (f v) out
how the
(f v)
in[(v,out)] -> parse (f v) out
returns a parser which is then applied toout
?
The answer is, f
's type says that it does so:
(>>=) :: Parser a -> (a -> Parser b) -> Parser b -- or, the equivalent
(>>=) :: Parser a -> (a -> Parser b) -> (String -> [(b,String)])
-- p f inp
We have f :: a -> Parser b
, so that's just what it does: applied to a value of type a
it returns a value of type Parser b
. Or equivalently,
f :: a -> (String -> [(b,String)]) -- so that
f (v :: a) :: String -> [(b,String)] -- and,
f (v :: a) (out :: String) :: [(b,String)]
So whatever is the value that parse p inp
produces, it must be what f
is waiting for to proceed. The types must "fit":
p :: Parser a -- m a
f :: a -> Parser b -- a -> m b
f <$> p :: Parser ( Parser b ) -- m ( m b )
p >>= f :: Parser b -- m b
or, equivalently,
p :: String -> [(a, String)]
-- inp v out
f :: a -> String -> [(b, String)]
-- v out
p >>= f :: String -> [(b, String)] -- a combined Parser
-- inp v2 out2
So this also answers your second question,
How can that be a parser and how can it take
out
as an argument?
The real question is, what kind of f
is it, that does such a thing? Where does it come from? And that's your fourth question.
And the answer is, your example in do
-notation,
p :: Parser (Char, Char)
p = do x <- item
_ <- item
y <- item
return (x,y)
by Monad laws is equivalent to the nested chain
p = do { x <- item
; do { _ <- item
; do { y <- item
; return (x,y) }}}
which is a syntactic sugar for the nested chain of Parser
bind applications,
p :: Parser (Char, Char) -- ~ String -> [((Char,Char), String)]
p = item >>= (\ x -> -- item :: Parser Char ~ String -> [(Char,String)]
item >>= (\ _ -> -- x :: Char
item >>= (\ y -> -- y :: Char
return (x,y) )))
and it is because the functions are nested that the final return
has access to both y
and x
there; and it is precisely the Parser
bind that arranges for the output leftovers string to be used as input to the next parsing step:
p = item >>= f -- :: String -> [((Char,Char), String)]
where
{ f x = item >>= f2
where { f2 _ = item >>= f3
where { f3 y = return (x,y) }}}
i.e. (under the assumption that inp
is a string of length two or longer),
parse p inp -- assume that `inp`'s
= (item >>= f) inp -- length is at least 2 NB.
=
let [(v, left)] = item inp -- by the def of >>=
in
(f v) left
=
let [(v, left)] = item inp
in
let x = v -- inline the definition of `f`
in (item >>= f2) left
=
let [(v, left)] = item inp
in let x = v
in let [(v2, left2)] = item left -- by the def of >>=, again
in (f2 v2) left2
=
..........
=
let [(x,left1)] = item inp -- x <- item
[(_,left2)] = item left1 -- _ <- item
[(y,left3)] = item left2 -- y <- item
in
[((x,y), left3)]
=
let (x:left1) = inp -- inline the definition
(_:left2) = left1 -- of `item`
(y:left3) = left2
in
[((x,y), left3)]
=
let (x:_:y:left3) = inp
in
[((x,y), left3)]
after few simplifications.
And this answers your third question.
回答3:
I am having similar problems reading the syntax, because it's not what we are used to.
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> case parse p inp of
[] -> []
[(v,out)] -> parse (f v) out
so for the question: I want to know how the (f v) out in [(v,out)] -> parse (f v) out returns a parser which is then applied to out.
It does because thats the siganture of the 2nd arg (the f): (>>=) :: Parser a -> (a -> Parser b) -> Parser b .... f takes an a and produces a Parser b. a Parser b takes a String which is the out ... (f v) out
But the output of this should not be mixed up with the output of the function we are writing: >>=
- We are outputing a parser ... (>>=) :: Parser a -> (a -> Parser b) -> Parser b .
- The Parser we are outputing has the job of wrapping and chaining the first 2 args
A parser is a function that takes 1 arg. This is constructed right after the first = ... ie by returning an (anonymous) function: p >>= f = \inp -> ... so inp refers to the input string of the Parser we are building
so what is left is to define what that constructed function should do ... NOTE: we are not implementing any of the input parsers just chaining them together ... so the ouput Parser function should:
- apply the input parser (p) to the its input (inp): p >>= f = \inp -> case parse p inp of
- take the output of that parse [(v, out)] v is the result out is what remains of the input
- apply the input function (f is (a -> Parser b)) to the parsed result (v)
- f produces a Parser b (a function that takes 1 arg)
- so apply that output parser to the remainder of the first parser (out)
For me the understanding lies in the use of destructuring and the realisation that we are constructing a function that glues together the execution of other functions together simply considering their interface
Hope that helps ... it helped me to write it :-)
来源:https://stackoverflow.com/questions/52497671/haskell-graham-hutton-book-parsing-ch-8-what-does-parse-f-v-out-do-and