Can I mix character classes in Python RegEx?

长情又很酷 2021-01-12 19:04

Special sequences (character classes) in Python RegEx are escapes like \w or \d that match a set of characters.

In my case, I need to b

4 answers
  •  佛祖请我去吃肉
    2021-01-12 19:25

    I don't think you can directly intersect (boolean AND) character sets inside a single set, whether one of them is negated or not; Python's re module has no syntax for that. Otherwise you could simply have combined [^\d] and \w.
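For what it's worth, an intersection can often be emulated without a dedicated operator. The sketch below is not part of the answer above: it assumes the goal is "word characters that are not digits", and the sample string and variable names are made up for illustration.

```python
import re

text = "abc123_def"  # hypothetical sample input

# Double negation: "not (non-word or digit)" is the same as "word AND not digit".
via_negated_set = re.findall(r"[^\W\d]+", text)

# Lookahead form: check that the next character is not a digit,
# then consume one word character; repeat.
via_lookahead = re.findall(r"(?:(?!\d)\w)+", text)

print(via_negated_set)  # ['abc', '_def']
print(via_lookahead)    # ['abc', '_def']
```

Both patterns still treat the underscore as a word character, since \w includes it.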

    Note: the ^ has to be at the start of the set, and it applies to the whole set. From the docs: "If the first character of the set is '^', all the characters that are not in the set will be matched." Anywhere else in the set, ^ is a literal caret, so your set [\w^\d] matches a single character that is a word character, a caret, or a digit. Since \d is already covered by \w, that does not exclude digits at all.
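A small sketch of how the position of ^ changes the meaning of a set. The sample string is an assumption chosen so that each kind of character appears once.

```python
import re

sample = "a1^_ -"  # hypothetical input: letter, digit, caret, underscore, space, hyphen

# Caret first: the whole set is negated, so every non-digit matches.
non_digits = re.findall(r"[^\d]", sample)

# Caret elsewhere: it is a literal '^', so the set means
# "word character OR '^' OR digit", and digits still match.
mixed = re.findall(r"[\w^\d]", sample)

print(non_digits)  # ['a', '^', '_', ' ', '-']
print(mixed)       # ['a', '1', '^', '_']
```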

    I would do it in two steps, effectively combining the regular expressions. First match the non-digits (inner regex), then match the alphanumeric characters within that result (outer regex):

    re.search(r'\w+', re.search(r'[^\d]+', s).group(0)).group(0)
    

    or variations on this theme.

    Note that you would need to wrap this in a try/except block, as it raises AttributeError: 'NoneType' object has no attribute 'group' if either of the two regexes fails to match. But you can, of course, split this single line into several lines.
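One possible way to split the one-liner up, as a minimal sketch. It uses explicit None checks rather than try/except (either would work), and the function name and sample inputs are hypothetical.

```python
import re
from typing import Optional


def first_word_in_non_digit_run(s: str) -> Optional[str]:
    """Hypothetical helper: the two-step search, returning None when a step fails."""
    # Step 1 (inner regex): first run of non-digit characters.
    inner = re.search(r"[^\d]+", s)
    if inner is None:
        return None
    # Step 2 (outer regex): first run of word characters inside that run.
    outer = re.search(r"\w+", inner.group(0))
    if outer is None:
        return None
    return outer.group(0)


print(first_word_in_non_digit_run("123 abc_def 456"))  # abc_def
print(first_word_in_non_digit_run("12345"))            # None
```

The second call returns None because the inner regex finds no non-digit run, which is the case where the one-liner would raise AttributeError.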
