Question
I've got the following, which type-checks:
p_int = liftA read (many (char ' ') *> many1 digit <* many (char ' '))
Now, as the function name implies, I want it to give me an Int. But if I do this:
p_int = liftA read (many (char ' ') *> many1 digit <* many (char ' ')) :: Int
I get this type error:
Couldn't match expected type `Int' with actual type `f0 b0'
In the return type of a call of `liftA'
In the expression:
liftA read (many (char ' ') *> many1 digit <* many (char ' ')) ::
Int
In an equation for `p_int':
p_int
= liftA read (many (char ' ') *> many1 digit <* many (char ' ')) ::
Int
Is there a simpler, cleaner way to parse integers that may have whitespace? Or a way to fix this?
Ultimately, I want this to be part of the following:
betaLine = string "BETA " *> p_int <*> p_int <*> p_int <*>
p_int <*> p_parallel <*> p_exposure <* eol
which is to parse lines that look like this:
BETA 6 11 5 24 -1 oiiio
So I can eventually call a BetaPair constructor, which will need those values (some as Int, some as other types like [Exposure] and Parallel).
(If you're curious, this is a parser for a file format that represents, among other things, hydrogen-bonded beta-strand pairs in proteins. I have no control over the file format!)
Answer 1:
p_int is a parser that produces an Int, so the type would be Parser Int or similar¹.
p_int = liftA read (many (char ' ') *> many1 digit <* many (char ' ')) :: Parser Int
Alternatively, you can annotate the read function, (read :: String -> Int), to tell the compiler which type the expression has.
p_int = liftA (read :: String -> Int) (many (char ' ') *> many1 digit <* many (char ' '))
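As for cleaner ways, consider replacing many (char ' ') with spaces. Note that spaces skips any whitespace, including tabs and newlines, so it is only a drop-in replacement if the line ending does not need to be parsed separately afterwards.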
¹ ParsecT x y z Int, for example.
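A minimal sketch of the first suggestion, assuming the String-based Parser type from Text.Parsec.String. The helper names blanks and betaInts are hypothetical and only for illustration; `<$>` is used in place of liftA, which is equivalent here. Only literal spaces are skipped, so that a later end-of-line parser can still see the newline.

```haskell
import Text.Parsec (char, digit, many1, parse, skipMany, string)
import Text.Parsec.String (Parser)

-- Hypothetical helper: skip zero or more literal spaces (not tabs or newlines).
blanks :: Parser ()
blanks = skipMany (char ' ')

-- The top-level signature fixes the result type, so `read` knows to produce an Int.
p_int :: Parser Int
p_int = read <$> (blanks *> many1 digit <* blanks)

-- Hypothetical example: collect the four leading Int fields of a BETA line.
-- A real parser would apply the BetaPair constructor instead of a tuple.
betaInts :: Parser (Int, Int, Int, Int)
betaInts = string "BETA " *> ((,,,) <$> p_int <*> p_int <*> p_int <*> p_int)
```

In GHCi, parse p_int "" "  42 " should then succeed with the Int 42, and the applicative chain in betaInts shows why a constructor (or function) has to be placed in front of the first `<*>`.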
Answer 2:
How do I get Parsec to let me call read :: Int?
A second answer is "Don't use read".
Using read is equivalent to re-parsing data you have already parsed, so using it within a Parsec parser is a code smell. Parsing natural numbers is harmless enough, but read has different failure semantics from Parsec and is tailored to Haskell's lexical syntax, so using it for more complicated number formats is problematic.
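As a small illustration of that point, the sketch below uses readMaybe from Text.Read, the total counterpart of read. The helper name readInt is hypothetical. Where readMaybe returns Nothing, plain read would throw a runtime exception instead of producing a Parsec-style parse error.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: a total version of `read` specialised to Int.
readInt :: String -> Maybe Int
readInt = readMaybe

-- readInt "12"   gives Just 12
-- readInt "12a"  gives Nothing (plain `read` would throw here)
-- readInt "0x1F" gives Just 31 (Haskell's hexadecimal literal syntax is accepted)
-- readInt " 12 " gives Just 12 (surrounding whitespace is accepted)
```

The last two cases show read following Haskell's lexical rules, which may accept input that the file format being parsed does not allow.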
If you don't want to go to the trouble of defining a LanguageDef and using Parsec's Token module, here is a natural number parser that doesn't use read:
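{-# LANGUAGE FlexibleContexts #-}
import Data.Char (digitToInt)
import Data.List (foldl')
import Text.Parsec (ParsecT, Stream, digit, many1)

-- | Parses one or more decimal digits and folds them into an Int,
-- most significant digit first. Does not check for overflow.
positiveNatural :: Stream s m Char => ParsecT s u m Int
positiveNatural =
  foldl' (\a i -> a * 10 + digitToInt i) 0 <$> many1 digit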
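The same idea can be extended to signed numbers without reaching for read. The following is a sketch only: the names natural and signedInt are hypothetical, and the assumption is that a sign is written as a single leading '-' directly before the digits.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.Char (digitToInt)
import Data.List (foldl')
import Text.Parsec (ParsecT, Stream, char, digit, many1, option, parse)

-- Fold the digits into an Int, most significant digit first.
natural :: Stream s m Char => ParsecT s u m Int
natural = foldl' (\a i -> a * 10 + digitToInt i) 0 <$> many1 digit

-- Hypothetical extension: an optional leading '-' selects `negate`,
-- otherwise the value is passed through unchanged with `id`.
signedInt :: Stream s m Char => ParsecT s u m Int
signedInt = option id (negate <$ char '-') <*> natural
```

With this, parse signedInt "" "-1234" should yield the Int -1234, and any failure is reported as an ordinary Parsec ParseError rather than a runtime exception.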
Answer 3:
You may find that Text.Megaparsec.Lexer.integer :: MonadParsec s m Char => m Integer does what you want.
The vanilla parsec library seems to be missing a number of obvious parsers, which has led to the rise of "batteries included" parsec derivative packages. I suppose the parsec maintainers will get around to batteries eventually.
https://hackage.haskell.org/package/megaparsec-4.2.0/docs/Text-Megaparsec-Lexer.html
UPDATE
Or, with vanilla parsec:
Prelude Text.Parsec Text.Parsec.Language Text.Parsec.Token> parse ( integer . makeTokenParser $ haskellStyle ) "integer" "-1234"
Right (-1234)
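The GHCi one-liner above can also be written as a small module. This is a sketch, assuming String input and no user state; the names lexer and integerTok are hypothetical.

```haskell
import Text.Parsec (parse)
import Text.Parsec.Language (haskellStyle)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as Tok

-- Build the token parsers once from the predefined Haskell-style language definition.
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser haskellStyle

-- `Tok.integer` handles an optional sign and skips trailing whitespace.
-- It returns an Integer, so convert with fromInteger if an Int is required.
integerTok :: Parser Integer
integerTok = Tok.integer lexer
```

Because the token parsers are lexeme parsers, they skip whitespace after the number (newlines included) but not before it, which matters if the line ending has to be matched explicitly.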
Source: https://stackoverflow.com/questions/10726085/how-do-i-get-parsec-to-let-me-call-read-int