Is this idiomatic use of Text.Parsec?

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-12 12:25:16

Question


My use of Text.Parsec is a little rusty. If I just want to return the matched string, is this idiomatic?

category :: Stream s m Char => ParsecT s u m [Char]                        
category = concat <$> (many1 $ (:) <$> char '/' <*> (many1 $ noneOf "/\n"))

I feel like there might be an existing operator for liftM concat . many1 or (:) <$> p1 <*> p2 that I'm ignoring, but I'm not sure.


Answer 1:


That's fine, I think. A little judicious naming would make it prettier:

category = concat <$> many1 segment
  where
    segment = (:) <$> char '/' <*> many1 (noneOf "/\n")
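
For anyone who wants to try this, here is a minimal self-contained sketch of the same parser. The helper runCategory and the "<input>" source name are not from the question; they are assumptions added only to make the example runnable.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Text.Parsec (ParseError, ParsecT, Stream, char, many1, noneOf, parse)

-- One or more "/segment" pieces, returned as the single string that was matched.
category :: Stream s m Char => ParsecT s u m String
category = concat <$> many1 segment
  where
    -- A '/' followed by at least one character that is neither '/' nor a newline.
    segment = (:) <$> char '/' <*> many1 (noneOf "/\n")

-- Hypothetical helper: run the parser on a plain String.
runCategory :: String -> Either ParseError String
runCategory = parse category "<input>"

main :: IO ()
main = do
  print (runCategory "/foo/bar")  -- Right "/foo/bar"
  print (runCategory "/foo/")     -- a Left: the trailing '/' starts a segment with no body
```

Note that a trailing slash is an error rather than being left unconsumed, because segment fails after it has already consumed the '/'.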



Answer 2:


I think it would be slightly more idiomatic use of Parsec to return something more structured, for example, the list of strings:

catList :: Parser [String]    
catList = char '/' *> many1 alphaNum `sepBy1` char '/'
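
As a rough illustration of what the structured result buys you, the sketch below runs catList and then rebuilds the flat string from the list. The function rebuild is a hypothetical helper, not part of the original answer.

```haskell
import Text.Parsec (alphaNum, char, many1, parse, sepBy1)
import Text.Parsec.String (Parser)

-- Same parser as above: a leading '/', then alphanumeric segments separated by '/'.
catList :: Parser [String]
catList = char '/' *> many1 alphaNum `sepBy1` char '/'

-- Hypothetical helper: recover the flat "/a/b/c" form when it is needed.
rebuild :: [String] -> String
rebuild = concatMap ('/' :)

main :: IO ()
main = do
  print (parse catList "<input>" "/usr/local/bin")          -- Right ["usr","local","bin"]
  print (rebuild <$> parse catList "<input>" "/usr/local")  -- Right "/usr/local"
```

Going from the list back to the string is a one-liner, whereas going the other way means parsing again, which is one argument for returning the list.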

I don't think there's an existing combinator like the one you were wondering about, but this is Haskell, and rolling your own control structure or combinator is always an option:

concatMany1 :: Parser [a] -> Parser [a]
concatMany1 p = concat <$> many1 p

catConcat = concatMany1 $ (:) <$> char '/' <*> many1 alphaNum

But this next combinator is even nicer, and definitely idiomatic Haskell at least:

infixr 5 <:>
(<:>) :: Applicative f => f a -> f [a] -> f [a]
hd <:> tl = (:) <$> hd <*> tl

So now we can write

catCons :: Parser String
catCons = concatMany1 (char '/' <:> many1 alphaNum)

but incidentally also

contrivedExample :: IO String
contrivedExample = getChar <:> getLine

moreContrived :: String -> Maybe String
moreContrived name = find isLetter name <:> lookup name symbolTable
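
Putting the pieces together, a self-contained sketch might look like the following. The contents of symbolTable are invented here purely so that the Maybe example type-checks and has something to look up; they are not from the original answer.

```haskell
import Data.Char (isLetter)
import Data.List (find)
import Text.Parsec (alphaNum, char, many1, parse)
import Text.Parsec.String (Parser)

-- Applicative "cons": run hd, then tl, and prepend the first result to the second.
infixr 5 <:>
(<:>) :: Applicative f => f a -> f [a] -> f [a]
hd <:> tl = (:) <$> hd <*> tl

concatMany1 :: Parser [a] -> Parser [a]
concatMany1 p = concat <$> many1 p

catCons :: Parser String
catCons = concatMany1 (char '/' <:> many1 alphaNum)

-- Hypothetical table, made up for this example only.
symbolTable :: [(String, String)]
symbolTable = [("x1", "int"), ("42", "const")]

moreContrived :: String -> Maybe String
moreContrived name = find isLetter name <:> lookup name symbolTable

main :: IO ()
main = do
  print (parse catCons "<input>" "/usr/local/bin")  -- Right "/usr/local/bin"
  print (moreContrived "x1")  -- Just "xint"
  print (moreContrived "42")  -- Nothing: the name contains no letter
  print (moreContrived "y")   -- Nothing: the name is not in the table
```

Since <:> only needs Applicative, hd <:> tl is the same thing as liftA2 (:) hd tl from Control.Applicative, which is why it works unchanged for parsers, IO and Maybe.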

noneOf

You'll notice I've used alphaNum where you used noneOf "/\n". I think noneOf is not good practice; parsers should be really careful to accept only the right thing. Are you absolutely sure you want your parser to accept /qwerty/12345/!"£$%^&*()@:?><.,#{}[] \/ "/" /-=_+~? Should it really be happy with /usr\local\bin?

As it stands, your parser accepts any string as long as it starts with / and ends before \n with something that's not /. I think you should rewrite it with alphaNum <|> oneOf "_-.',~+" or similar instead of using noneOf. Using noneOf lets you avoid thinking about what you should allow: it focuses you on getting positive examples to parse, rather than on getting only positive examples to parse.
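
A minimal sketch of that whitelist approach, assuming the character set suggested above is the one you actually want. The names segmentChar and strictCategory are hypothetical, and the trailing eof is an addition of this sketch so that leftover input counts as a failure instead of being silently ignored.

```haskell
import Text.Parsec (alphaNum, char, eof, many1, oneOf, parse, sepBy1, (<|>))
import Text.Parsec.String (Parser)

-- Explicit whitelist of the characters a segment may contain.
segmentChar :: Parser Char
segmentChar = alphaNum <|> oneOf "_-.',~+"

-- A leading '/', whitelisted segments separated by '/', then end of input.
strictCategory :: Parser [String]
strictCategory = char '/' *> many1 segmentChar `sepBy1` char '/' <* eof

main :: IO ()
main = do
  print (parse strictCategory "<input>" "/usr/local/bin")    -- Right ["usr","local","bin"]
  print (parse strictCategory "<input>" "/my-app/v1.2")      -- Right ["my-app","v1.2"]
  print (parse strictCategory "<input>" "/usr\\local\\bin")  -- a Left: '\' is not whitelisted
```

Without the eof, the last input would succeed with ["usr"] and simply stop at the first backslash, so whether you want eof depends on what follows a category in your input.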

Parser

I've also always gone for Parser a instead of Stream s m t => ParsecT s u m a. That's just lazy typing, but let's pretend I did it to make it clearer what my code was doing, shall we? :) Use whatever type signature suits you, of course.
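
To make the trade-off concrete, here is a small sketch with the same parser under both signatures. The names catListGeneral and catListString are hypothetical, and the example assumes the text package is available so that the general version can be run on Data.Text input.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import qualified Data.Text as T
import Text.Parsec (ParsecT, Stream, alphaNum, char, many1, parse, sepBy1)
import Text.Parsec.String (Parser)

-- General signature: works for any stream of Char, any user state, any monad.
catListGeneral :: Stream s m Char => ParsecT s u m [String]
catListGeneral = char '/' *> many1 alphaNum `sepBy1` char '/'

-- Parser a is a synonym for Parsec String () a, so this is just a specialisation.
catListString :: Parser [String]
catListString = catListGeneral

main :: IO ()
main = do
  print (parse catListString "<string>" "/usr/bin")           -- Right ["usr","bin"]
  print (parse catListGeneral "<text>" (T.pack "/usr/bin"))   -- Right ["usr","bin"]
```

The shorter signature is easier to read and gives simpler type errors; the general one only pays off if you expect to parse something other than String or need user state.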



Source: https://stackoverflow.com/questions/13670340/is-this-idiomatic-use-of-text-parsec
