Parsing regex with alternatives and optionals

╄→尐↘猪︶ㄣ submitted on 2019-12-24 03:37:08

Question


I'm building a chatbot that implements a subset of RiveScript, and I'm trying to build the pattern-matching parser with regular expressions. Which three regexes match the following three examples?

ex1: I am * years old
valid match:
- "I am 24 years old"
invalid match:
- "I am years old"

ex2: what color is [my|your|his|her] (bright red|blue|green|lemon chiffon) *
valid matches:
- "what color is lemon chiffon car"
- "what color is my some random text till the end of string"

ex3: [*] told me to say *
valid matches:
- "Bob and Alice told me to say hallelujah"
- "told me to say by nobody"

A wildcard (*) accepts any text that is not empty.

In example 2, anything between [ ] is optional, anything between ( ) is a set of alternatives, and each option or alternative is separated by a |.

In example 3, the [*] is an optional wildcard, meaning blank text is also accepted.


Answer 1:


  1. https://regex101.com/r/CuZuMi/4

    I am (?:\d+) years old
    
  2. https://regex101.com/r/CuZuMi/2

    what color is.*(?:my|your|his|her).*(?:bright red|blue|green|lemon chiffon)?.*
    
  3. https://regex101.com/r/CuZuMi/3

    .*told me to say.*
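
For comparison, here is a minimal Python sketch of a stricter reading of the question's rules. It is an assumption, not the regexes linked above: every wildcard becomes `.+` so it cannot be empty, the `[ ]` group is made optional together with its separating space, and `re.fullmatch` anchors the pattern to the whole input. The names `PATTERNS` and `matches` are hypothetical. Note that under this reading the question's second example for ex2 is rejected, because it contains none of the listed colors.

```python
import re

# Stricter variants (an assumption, not the answer's exact regexes):
# ".+" makes each wildcard non-empty, and fullmatch anchors both ends.
PATTERNS = {
    "ex1": r"I am .+ years old",
    # The optional group swallows its own trailing space, so the
    # sentence still reads correctly when the group is absent.
    "ex2": r"what color is (?:(?:my|your|his|her) )?"
           r"(?:bright red|blue|green|lemon chiffon) .+",
    # Optional leading wildcard, required trailing wildcard.
    "ex3": r"(?:.+ )?told me to say .+",
}


def matches(name, text):
    """Return True if the whole text matches the named pattern."""
    return re.fullmatch(PATTERNS[name], text) is not None


if __name__ == "__main__":
    samples = [
        ("ex1", "I am 24 years old"),
        ("ex1", "I am years old"),
        ("ex2", "what color is lemon chiffon car"),
        ("ex3", "Bob and Alice told me to say hallelujah"),
        ("ex3", "told me to say by nobody"),
    ]
    for name, text in samples:
        print(name, repr(text), matches(name, text))
```

Using `fullmatch` rather than `search` is a design choice here: it keeps text before or after the trigger from being accepted silently.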
    

I am using mostly 2 things:

  1. (?:) non-capturing groups, which group things together the way parentheses do in math.
  2. .* matches any character (except a newline, by default) 0 or more times. The * could be replaced by {1,3} to match between 1 and 3 times.

You can replace * with + to require at least 1 character instead of 0. The ? after a non-capturing group makes that group optional.
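
These constructs are enough to translate a trigger mechanically instead of writing each regex by hand. The sketch below is one possible approach in Python, under stated assumptions: `trigger_to_regex` and `_choices` are hypothetical helpers, not part of RiveScript; groups are not nested; a trigger contains at least one required token; and the result is meant to be used with `fullmatch`.

```python
import re

# Tokens: [optional], (alternatives), a bare * wildcard, or a plain word.
TOKEN = re.compile(r"\[([^\]]*)\]|\(([^)]*)\)|(\*)|([^\s\[\]()*]+)")


def _choices(body):
    """Turn 'a|b c|*' into a non-capturing alternation; * becomes '.+'."""
    options = [".+" if o.strip() == "*" else re.escape(o.strip())
               for o in body.split("|")]
    return "(?:" + "|".join(options) + ")"


def trigger_to_regex(trigger):
    """Hypothetical helper: compile a trigger into a regex for fullmatch."""
    out = []
    seen_required = False  # has a required piece been emitted yet?
    for m in TOKEN.finditer(trigger):
        kind, text = m.lastindex, m.group(m.lastindex)
        if kind == 1:
            # [..] is optional, so its separating space must be optional too.
            piece = _choices(text)
            out.append(f"(?: {piece})?" if seen_required else f"(?:{piece} )?")
            continue
        if kind == 2:
            piece = _choices(text)   # (..): exactly one alternative
        elif kind == 3:
            piece = ".+"             # *: non-empty wildcard
        else:
            piece = re.escape(text)  # literal word
        out.append((" " if seen_required else "") + piece)
        seen_required = True
    return re.compile("".join(out))


if __name__ == "__main__":
    rx = trigger_to_regex("[*] told me to say *")
    print(bool(rx.fullmatch("Bob and Alice told me to say hallelujah")))
    print(bool(rx.fullmatch("told me to say by nobody")))
```

Escaping literals with `re.escape` keeps punctuation in a trigger from being read as regex syntax. Case folding and input normalization are left out of this sketch.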


These are good places to start:

  1. http://www.rexegg.com/regex-quickstart.html
  2. https://regexone.com/
  3. http://www.regular-expressions.info/quickstart.html
  4. Reference - What does this regex mean?


Source: https://stackoverflow.com/questions/42464120/parsing-regex-with-alternatives-and-optionals
