Regex including what is supposed to be non-capturing group in result

耶瑟儿~ 2020-12-12 04:35

I have the following simple test where I'm trying to get a regex pattern that extracts the executable name without the ".exe" suffix.
 
It appears my n

3 answers
  •  鱼传尺愫
    2020-12-12 05:11

    You're using a non-capturing group. The emphasis is on the word group here; the group does not capture the .exe, but the regex in general still does.

    You probably want a positive lookahead, which asserts that the text after the current position must satisfy a criterion for the match to be valid, without that text becoming part of the match.

    In other words, you want (?=, not (?:, at the start of your group.

    The latter only makes a difference if you are enumerating the Groups property of the Match object; in your case, you're just using the Value property, so there's no distinction between a normal group (\.exe) and a non-capturing group (?:\.exe).

    To see the distinction, consider this test program:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            var positiveInput = "\"D:\\src\\repos\\myprj\\bin\\Debug\\MyApp.exe\" /?";
            Test(positiveInput, @"[^\\]+(\.exe)");    // normal (capturing) group
            Test(positiveInput, @"[^\\]+(?:\.exe)");  // non-capturing group
            Test(positiveInput, @"[^\\]+(?=\.exe)");  // positive lookahead

            var negativeInput = "\"D:\\src\\repos\\myprj\\bin\\Debug\\MyApp.dll\" /?";
            Test(negativeInput, @"[^\\]+(?=\.exe)");
        }

        // Prints the overall match and every entry of match.Groups for one pattern.
        static void Test(string input, string pattern)
        {
            Console.WriteLine($"Input: {input}");
            Console.WriteLine($"Regex pattern: {pattern}");

            var match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);

            if (match.Success)
            {
                Console.WriteLine("Matched: " + match.Value);
                for (int i = 0; i < match.Groups.Count; i++)
                {
                    Console.WriteLine($"Groups[{i}]: {match.Groups[i]}");
                }
            }
            else
            {
                Console.WriteLine("No match.");
            }
            Console.WriteLine("---");
        }
    }
    

    The output of this is:

    Input: "D:\src\repos\myprj\bin\Debug\MyApp.exe" /?
    Regex pattern: [^\\]+(\.exe)
    Matched: MyApp.exe
    Groups[0]: MyApp.exe
    Groups[1]: .exe
    ---
    Input: "D:\src\repos\myprj\bin\Debug\MyApp.exe" /?
    Regex pattern: [^\\]+(?:\.exe)
    Matched: MyApp.exe
    Groups[0]: MyApp.exe
    ---
    Input: "D:\src\repos\myprj\bin\Debug\MyApp.exe" /?
    Regex pattern: [^\\]+(?=\.exe)
    Matched: MyApp
    Groups[0]: MyApp
    ---
    Input: "D:\src\repos\myprj\bin\Debug\MyApp.dll" /?
    Regex pattern: [^\\]+(?=\.exe)
    No match.
    ---
    

    The first regex (@"[^\\]+(\.exe)") has \.exe as just a normal group. When we enumerate the Groups property, we see that .exe is indeed captured as a group from our input. (Note that Groups[0] always holds the entire match, which is why it is equal to Value.)

    The second regex (@"[^\\]+(?:\.exe)") is the one provided in your question. The only difference compared to the previous scenario is that the Groups property doesn't contain .exe as one of its entries.

    The third regex (@"[^\\]+(?=\.exe)") is the one I'm suggesting you use. Now the .exe part of the input isn't included in the match at all, but the regex still won't match unless the name is immediately followed by .exe, as the fourth scenario illustrates.
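
    If you want this in a reusable form, here is a minimal sketch of the lookahead approach wrapped in a helper. For comparison it also shows an alternative that is not part of the answer above: keep a pattern that consumes .exe, but put the name in a capturing group and read Groups[1] instead of Value. The names ExeName, ViaLookahead and ViaGroup are hypothetical, and returning an empty string on failure is just an assumption made for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class ExeName
{
    // Lookahead version: ".exe" must follow the name, but it is not
    // consumed, so match.Value contains only the name.
    public static string ViaLookahead(string commandLine)
    {
        var match = Regex.Match(commandLine, @"[^\\]+(?=\.exe)", RegexOptions.IgnoreCase);
        return match.Success ? match.Value : string.Empty;
    }

    // Capturing-group version: ".exe" is consumed by the overall match,
    // so the name is read from Groups[1] rather than from Value.
    public static string ViaGroup(string commandLine)
    {
        var match = Regex.Match(commandLine, @"([^\\]+)\.exe", RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

    Both helpers should yield the same name for the inputs used in the test program; which one reads better is mostly a matter of taste.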
