What literal characters should be escaped in a regex?

予麋鹿 2020-11-30 01:47

I just wrote a regex for use with the PHP function preg_match that contains the following part:

[\\w-.]

To match any word char

5 answers
  •  执笔经年
    2020-11-30 02:43

    In many regex implementations, the following rules apply:

    Meta characters inside a character class are:

    • ^ (negation)
    • - (range)
    • ] (end of the class)
    • \ (escape char)

    So each of these should be escaped with a backslash when you want to match it literally. There are some corner cases, though:

    • - needs no escaping if placed at the very start or very end of the class ([abc-] or [-abc]). In quite a few regex implementations, it also needs no escaping when placed directly after a range ([a-c-abc]) or a shorthand character class ([\w-abc]). This is what you observed.
    • ^ needs no escaping when it's not at the start of the class: [^a] means any char except a, while [a^] matches either a or ^, which is equivalent to [\^a].
    • ] needs no escaping if it's the first character in the class: []] matches the char ]. This holds in PCRE and several other implementations, but not in all of them.
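
A minimal sketch of the rules above, written with Python's re module rather than PHP's preg_match. That substitution is an assumption made only so the example is easy to run, and the helper names `matches` and `compiles` are hypothetical. Most corner cases behave as described, but Python rejects a hyphen placed directly after a shorthand class, which shows why "quite a few implementations" is not "all".

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if `pattern` matches the whole of `text`."""
    return re.fullmatch(pattern, text) is not None


def compiles(pattern: str) -> bool:
    """Hypothetical helper: True if this regex engine accepts `pattern`."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# '-' at the very start or end of a class is a literal hyphen.
assert matches(r"[abc-]", "-") and matches(r"[-abc]", "-")

# '-' directly after a range is also literal in Python.
assert matches(r"[a-c-abc]", "-")

# '^' only negates at the start of the class.
assert not matches(r"[^a]", "a")
assert matches(r"[a^]", "^") and matches(r"[\^a]", "^")

# ']' placed first in the class is a literal bracket.
assert matches(r"[]]", "]")

# Unlike PCRE, Python's re refuses '-' right after a shorthand class,
# so the pattern from the question is not portable as written.
assert not compiles(r"[\w-.]")

# Portable spellings: put '-' last, or escape it.
for safe in (r"[\w.-]", r"[\w\-.]"):
    assert all(matches(safe, ch) for ch in ("a", "-", "."))

# re.escape adds the backslashes for you when the characters come from data.
literal_class = "[" + re.escape("^-]\\") + "]"
assert all(matches(literal_class, ch) for ch in "^-]\\")
```

If the pattern has to work across engines, escaping the hyphen or moving it to the end of the class avoids relying on any of these corner cases.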
