Why are there so many different regular expression dialects?

陌清茗 2020-12-06 05:00

I'm wondering why there have to be so many regular expression dialects. Why does it seem like so many languages, rather than reusing a tried-and-true dialect, seem bent on

4 Answers
  •  一整个雨季
    2020-12-06 05:51

    Because regular expressions only have three operations:

    • Concatenation
    • Union |
    • Kleene closure *

    Everything else is an extension or syntactic sugar, and so has no canonical definition to standardize on. Things like capturing groups, backreferences, character classes, cardinality operators, etc., are all additions to the original definition of regular expressions.
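
    As a rough illustration of the "syntactic sugar" point, the sketch below rewrites a few common shorthands using only union and concatenation, then checks on a handful of sample strings that both spellings accept the same inputs. The pattern pairs, the sample list, and the helper name `same_language_on_samples` are made up for this example, and the `(?:...)` grouping is only there because Python's `re` needs some way to parenthesize.

```python
import re

# Each pair: (sugared pattern, equivalent pattern built from union and concatenation).
# These pairs are illustrative choices, not an exhaustive list.
PAIRS = [
    (r"[abc]", r"(?:a|b|c)"),    # character class -> union of single characters
    (r"ab?", r"(?:a|ab)"),       # optional item -> union of both alternatives
    (r"a{2,3}", r"(?:aa|aaa)"),  # bounded repetition -> union of concatenations
]

SAMPLES = ["", "a", "b", "c", "ab", "aa", "aaa", "aaaa", "abc"]


def same_language_on_samples(sugared, core, samples=SAMPLES):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(sugared, s)) == bool(re.fullmatch(core, s))
        for s in samples
    )


for sugared, core in PAIRS:
    print(sugared, core, same_language_on_samples(sugared, core))
```

    Agreement on a finite sample is only a spot check, not a proof of equivalence, but it shows how these shorthands reduce to the core operations.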

    Some of these extensions make "regular expressions" no longer regular at all. They are able to decide non-regular languages because of these extras, but we still call them regular expressions regardless.
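
    To make the non-regular claim concrete, here is a small sketch using a backreference in Python's `re` module. The pattern and the helper name `is_mirrored` are chosen for this example: the pattern accepts some number of `a`s, a `b`, and then exactly the same number of `a`s again, a language that no finite automaton can recognize.

```python
import re

# (a+) captures a run of 'a's; \1 demands the identical run again after the 'b'.
# The accepted language { a^n b a^n : n >= 1 } is not regular.
PATTERN = re.compile(r"(a+)b\1")


def is_mirrored(s):
    """Return True if s is n 'a's, one 'b', then n 'a's again (n >= 1)."""
    return PATTERN.fullmatch(s) is not None


print(is_mirrored("aabaa"))   # equal runs on both sides
print(is_mirrored("aabaaa"))  # unequal runs
```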

    As people add extensions, they often try to reuse notation from other common variants of regular expressions. That's why nearly every dialect uses X+ to mean "one or more Xs", which is itself just shorthand for XX*.
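
    The X+ shorthand can be checked the same way. This is a minimal sketch assuming Python's `re` module; the helper name `plus_equals_concat_star` and the sample strings are invented for the example.

```python
import re


def plus_equals_concat_star(x, samples):
    """Check on the given samples that X+ and XX* accept the same strings."""
    plus = re.compile(f"(?:{x})+")
    expanded = re.compile(f"(?:{x})(?:{x})*")
    return all(
        bool(plus.fullmatch(s)) == bool(expanded.fullmatch(s))
        for s in samples
    )


samples = ["", "ab", "abab", "aba", "ababab", "ba"]
print(plus_equals_concat_star("ab", samples))
```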

    But when new features get added, there's no basis for standardization, so someone has to make something up. If more than one group of designers come up with similar ideas at around the same time, they'll have different dialects.
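
    Named capturing groups are one place where this kind of divergence is visible. Python's `re` spells them `(?P<name>...)`, while several other engines (for example .NET and JavaScript) spell them `(?<name>...)`. The sketch below assumes a CPython `re` module that, as in the versions I am aware of, rejects the second spelling; the date pattern and the helper name `compiles` are just for illustration.

```python
import re

PYTHON_STYLE = r"(?P<year>\d{4})-(?P<month>\d{2})"
# Spelling used by some other engines; assumed to be rejected by Python's re.
OTHER_STYLE = r"(?<year>\d{4})-(?<month>\d{2})"


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


m = re.fullmatch(PYTHON_STYLE, "2020-12")
print(m.group("year"), m.group("month"))
print(compiles(PYTHON_STYLE), compiles(OTHER_STYLE))
```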
