RegEx - reusing subexpressions

傲寒 2020-11-28 12:56

Say I have a regex matching a 32-bit hexadecimal number:

([0-9a-fA-F]{1,8})

When I construct a regex where I need to match this multiple ti

6 answers
  •  挽巷 (OP)
    2020-11-28 13:29

    .NET regex does not support pattern recursion. In Ruby and PHP/PCRE you can use (?<from>(?<hex>[0-9a-fA-F]{1,8}))\s*:\s*(?<to>(\g<hex>)), where hex is a "technical" named capturing group whose name should not occur elsewhere in the main pattern. In .NET, you can instead define the block(s) as separate variables and then use them to build a dynamic pattern.

    Starting with C# 6, you can use an interpolated string literal. It looks very much like a PCRE/Onigmo subpattern recursion, but is actually cleaner and avoids the potential pitfall of a group in the main pattern being named identically to the "technical" capturing group:

    C# demo:

    using System;
    using System.Text.RegularExpressions;
    
    public class Test
    {
        public static void Main()
        {
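            // Reusable building block: 1 to 8 hex digits.
            var block = "[0-9a-fA-F]{1,8}";
            // The same block is interpolated into two differently named groups.
            var pattern = $@"(?<from>{block})\s*:\s*(?<to>{block})";
            Console.WriteLine(Regex.IsMatch("12345678  :87654321", pattern));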
        }
    }
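
    To read the two numbers back out, the same interpolation approach can be combined with Match.Groups. The following is only a minimal sketch: the class name HexRange, the method TryParse and the group names from and to are illustrative assumptions, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexRange
{
    // Shared building block: 1 to 8 hex digits (fits in 32 bits).
    private const string Block = "[0-9a-fA-F]{1,8}";

    // The block is written once and interpolated into both named groups.
    // The group names "from" and "to" are assumptions made for this sketch.
    private static readonly Regex Pattern =
        new Regex($@"^(?<from>{Block})\s*:\s*(?<to>{Block})$");

    // Hypothetical helper: parses "lo : hi" into two 32-bit values.
    public static bool TryParse(string input, out uint lo, out uint hi)
    {
        lo = 0;
        hi = 0;
        var m = Pattern.Match(input);
        if (!m.Success) return false;
        lo = Convert.ToUInt32(m.Groups["from"].Value, 16);
        hi = Convert.ToUInt32(m.Groups["to"].Value, 16);
        return true;
    }

    public static void Main()
    {
        uint lo, hi;
        if (HexRange.TryParse("12345678  :87654321", out lo, out hi))
            Console.WriteLine($"{lo:X8} .. {hi:X8}"); // prints "12345678 .. 87654321"
    }
}
```

    Because each occurrence of the block sits in its own named group, the two captures stay independent even though the subexpression is defined only once.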
    

    The $@"..." is a verbatim interpolated string literal: backslashes are not processed as C# escape sequences, so \s reaches the regex engine as a literal backslash followed by s. Because { and } delimit interpolated expressions, write a literal { as {{ and a literal } as }} (e.g. $@"(?:{block}){{5}}" to repeat a block 5 times).
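
    The brace-doubling rule can be checked by printing the resulting pattern. This is a small sketch under assumptions: the names BraceEscaping and FiveBlocks are made up for illustration, and the whitespace separator between blocks is an arbitrary choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceEscaping
{
    // Hypothetical helper: exactly five whitespace-separated blocks.
    // {block} is interpolated; {{4}} becomes the literal quantifier {4}.
    public static string FiveBlocks(string block)
    {
        return $@"^{block}(?:\s+{block}){{4}}$";
    }

    public static void Main()
    {
        var pattern = FiveBlocks("[0-9a-fA-F]{1,8}");
        // prints: ^[0-9a-fA-F]{1,8}(?:\s+[0-9a-fA-F]{1,8}){4}$
        Console.WriteLine(pattern);
        Console.WriteLine(Regex.IsMatch("1 22 333 4444 55555", pattern)); // True
        Console.WriteLine(Regex.IsMatch("1 22 333", pattern));            // False
    }
}
```

    Note that the {1,8} quantifier inside the block needs no doubling, since it lives in an ordinary string variable rather than in the interpolated literal itself.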

    For older C# versions, use string.Format:

    var pattern = string.Format(@"(?<from>{0})\s*:\s*(?<to>{0})", block);
    

    as is suggested in Mattias's answer.
