Regex with non-capturing group in C#

我的未来我决定 提交于 2019-12-05 00:35:26

mc[0].Captures is equivalent to mc[0].Groups[0].Captures. Groups[0] always refers to the whole match, so there will only ever be the one Capture associated with it. The part you're looking for is captured in group #1, so you should be using mc[0].Groups[1].Captures.

But your regex is designed to match the whole input in one attempt, so the Matches() method will always return a MatchCollection with only one Match in it (assuming the match is successful). You might as well use Match() instead:

  Match m = Regex.Match(source, jointPattern);
  if (m.Success)
  {
    foreach (Capture c in m.Groups[1].Captures)
    {
      Console.WriteLine(c.Value);
    }
  }

output:

1            0.000000E+00           0.975415E+01           0.616921E+01
2            0.000000E+00           0.000000E+00           0.000000E+00

I would just not use Regex for heavy lifting and parse the text.

var data = @"     JOINTS               DISPL.-X               DISPL.-Y               ROTATION


         1            0.000000E+00           0.975415E+01           0.616921E+01
         2            0.000000E+00           0.000000E+00           0.000000E+00";

var lines = data.Split('\r', '\n').Where(s => !string.IsNullOrWhiteSpace(s));
var regex = new Regex(@"(\S+)");

var dataItems = lines.Select(s => regex.Matches(s)).Select(m => m.Cast<Match>().Select(c => c.Value));

Why not just capture the values and ignore the rest. Here is a regex which gets the values.

string data = @"JOINTS DISPL.-X DISPL.-Y ROTATION
 1 0.000000E+00 0.975415E+01 0.616921E+01
 2 0.000000E+00 0.000000E+00 0.000000E+00";

string pattern = @"^
\s+
 (?<Joint>\d+)
\s+
 (?<ValX>[^\s]+)
\s+
 (?<ValY>[^\s]+)
\s+
 (?<Rotation>[^\s]+)";

var result = Regex.Matches(data, pattern, RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture)
                  .OfType<Match>()
                  .Select (mt => new
                  {
                    Joint = mt.Groups["Joint"].Value,
                    ValX  = mt.Groups["ValX"].Value,
                    ValY  = mt.Groups["ValY"].Value,
                    Rotation = mt.Groups["Rotation"].Value,
                  });
/* result is
IEnumerable<> (2 items)
Joint ValX ValY Rotation
1 0.000000E+00 0.975415E+01 0.616921E+01
2 0.000000E+00 0.000000E+00 0.000000E+00
*/

There's two problems: The repeating part (?:...) is not matching properly; and the .* is greedy and consumes all the input, so the repeating part never matches even if it could.

Use this instead:

JOINTS.*?[\r\n]+(?:\s*(\d+\s*\S*\s*\S*\s*\S*)[\r\n\s]*)*

This has a non-greedy leading part, ensures that the line-matching part starts on a new line (not in the middle of a title), and uses [\r\n\s]* in case the newlines are not exactly as you expect.

Personally, I would use regexes for this, but I like regexes :-) If you happen to know that the structure of the string will always be [title][newline][newline][lines] then perhaps it's more straightforward (if less flexible) to just split on newlines and process accordingly.

Finally, you can use regex101.com or one of the many other regex testing sites to help debug your regular expressions.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!