repeating multiple characters regex

春和景丽 2020-12-05 07:19

Is there a way using a regex to match a repeating set of characters? For example:

ABCABCABCABCABC

ABC{5}

I know that's wro

5 answers
  • 2020-12-05 07:53

    Parentheses "()" are used to group characters and expressions within larger, more complex regular expressions. Quantifiers that immediately follow the group apply to the whole group.

    (ABC){5}
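A quick way to check this, sketched here with Python's `re` module (the variable names are only illustrative): `re.fullmatch` succeeds only when the entire string matches, so five repetitions pass and four do not.

```python
import re

# The {5} quantifier applies to the whole parenthesized group "ABC".
five_abc = re.compile(r"(ABC){5}")

# fullmatch returns a match object only if the entire string matches.
matches_five = five_abc.fullmatch("ABCABCABCABCABC") is not None
matches_four = five_abc.fullmatch("ABCABCABCABC") is not None

print(matches_five)  # True
print(matches_four)  # False
```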
    
  • 2020-12-05 08:01

    Enclose the regex you want to repeat in parentheses. For instance, if you want 5 repetitions of ABC:

    (ABC){5}
    

    Or if you want any number of repetitions (0 or more):

    (ABC)*
    

    Or one or more repetitions:

    (ABC)+
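The difference between the two shows up on the empty string. A minimal sketch in Python, where `fully_matches` is a hypothetical helper written just for this illustration:

```python
import re

def fully_matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# Zero repetitions are allowed by * but not by +.
star_on_empty = fully_matches(r"(ABC)*", "")
plus_on_empty = fully_matches(r"(ABC)+", "")

# Both accept one or more repetitions.
star_on_three = fully_matches(r"(ABC)*", "ABCABCABC")
plus_on_three = fully_matches(r"(ABC)+", "ABCABCABC")

print(star_on_empty, plus_on_empty)  # True False
print(star_on_three, plus_on_three)  # True True
```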
    

    Edit, in response to the update to the question:

    Parentheses in regular expressions do two things. They group a sequence of items, so that you can apply an operator to the entire sequence instead of just the last item. They also capture the contents of that group, so you can extract the substring that was matched by that subexpression.

    You can nest parentheses; groups are numbered by the position of their opening paren, counting from the left. For instance:

    >>> import re
    >>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(0)
    '123 ABCDEF'
    >>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(1)
    'ABCDEF'
    >>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(2)
    'DEF'
    

    If you would like to group without capturing, use a non-capturing group, (?:...). This is helpful when the parentheses are there only so that an operator applies to a whole sequence, and you don't want them to change the numbering of your capture groups. It can also be slightly faster, since the engine does not need to record what the group matched.

    >>> re.search('[0-9]* (?:ABC(...))', '123 ABCDEF 456').group(1)
    'DEF'
    

    So to answer your update: yes, you can use nested capture groups, or you can avoid capturing with the inner group altogether:

    >>> re.search('((?:ABC){5})(DEF)', 'ABCABCABCABCABCDEF').group(1)
    'ABCABCABCABCABC'
    >>> re.search('((?:ABC){5})(DEF)', 'ABCABCABCABCABCDEF').group(2)
    'DEF'
    
  • 2020-12-05 08:03

    As to the update to the question:

    You can nest capture groups. The capture group index is incremented per open paren.

    (((ABC)*)(DEF)*)
    

    Feeding that regex ABCABCABCDEFDEFDEF, capture group 0 matches the whole thing, 1 is also the whole thing, 2 is ABCABCABC, 3 is ABC, and 4 is DEF (because the star is outside of the capture group).
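To see that numbering in practice, here is a small check using Python's `re` module. It is a sketch for illustration; other regex engines number groups the same way, but their APIs differ.

```python
import re

match = re.search(r"(((ABC)*)(DEF)*)", "ABCABCABCDEFDEFDEF")
assert match is not None

# Group 0 is the overall match; groups 1-4 follow the order of the open parens.
numbered_groups = [match.group(i) for i in range(5)]

for index, text in enumerate(numbered_groups):
    print(index, text)
# 0 ABCABCABCDEFDEFDEF
# 1 ABCABCABCDEFDEFDEF
# 2 ABCABCABC
# 3 ABC
# 4 DEF
```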

    If you have variation inside a capture group and a repeat just outside, then things can get a little wonky if you're not expecting it...

    (a[bc]*c)*
    

    When fed abbbcccabbc, this returns only the last repetition as capture group 1, in this example just the abbc, because each pass through the repeat overwrites what the capture group held before.
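A sketch of that behavior in Python. The `re.findall` call at the end is one possible workaround, not something from the answer above: it searches for the repeated unit on its own, so it collects every occurrence in the string rather than only contiguous repetitions.

```python
import re

text = "abbbcccabbc"

# The group repeats twice ("abbbccc", then "abbc"); only the last is kept.
match = re.fullmatch(r"(a[bc]*c)*", text)
assert match is not None
last_iteration = match.group(1)

# Assumption for illustration: matching the unit without the outer
# quantifier recovers each repetition separately.
all_iterations = re.findall(r"a[bc]*c", text)

print(last_iteration)  # abbc
print(all_iterations)  # ['abbbccc', 'abbc']
```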

  • 2020-12-05 08:08

    ABC{5} matches ABCCCCC. To match 5 ABC's, you should use (ABC){5}. Parentheses are used to group a set of characters. You can also set an interval for occurrences like (ABC){3,5} which matches ABCABCABC, ABCABCABCABC, and ABCABCABCABCABC.

    (ABC){1,} means 1 or more repetitions, which is exactly the same as (ABC)+.

    (ABC){0,} means 0 or more repetitions, which is exactly the same as (ABC)*.
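These claims can be spot-checked in Python. The sketch below only tests a handful of sample strings chosen for illustration, so it demonstrates the equivalences rather than proving them.

```python
import re

def fully_matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# Without parentheses, {5} binds only to the preceding "C".
ungrouped_on_abccccc = fully_matches(r"ABC{5}", "ABCCCCC")
ungrouped_on_repeats = fully_matches(r"ABC{5}", "ABC" * 5)

# (ABC){3,5} accepts 3, 4, or 5 repetitions and nothing else.
interval_results = {n: fully_matches(r"(ABC){3,5}", "ABC" * n) for n in range(2, 7)}

# {1,} agrees with + and {0,} agrees with * on these samples.
samples = ["", "ABC", "ABCABC", "ABCAB"]
plus_agrees = all(
    fully_matches(r"(ABC){1,}", s) == fully_matches(r"(ABC)+", s) for s in samples
)
star_agrees = all(
    fully_matches(r"(ABC){0,}", s) == fully_matches(r"(ABC)*", s) for s in samples
)

print(ungrouped_on_abccccc, ungrouped_on_repeats)  # True False
print(interval_results)  # {2: False, 3: True, 4: True, 5: True, 6: False}
print(plus_agrees, star_agrees)  # True True
```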

  • 2020-12-05 08:16

    (ABC){5} should work for you.
