Regex match word followed by decimal from text

纵饮孤独 submitted on 2021-01-27 15:12:39

Question


I want to match the following examples and return an array of matches.

Given text:

some word
another 50.00 
some-more 10.10 text
another word

Matches should be a word, followed by a space, then a decimal number (optionally followed by another word):

another 50.00 
some-more 10.10 text

I have the following so far:

string pat = @"\r\n[A-Za-z ]+\d+\.\d{1,2}([A-Za-z])?";
Regex r = new Regex(pat, RegexOptions.IgnoreCase);
Match m = r.Match(input);

but it only matches the first item: another 50.00


Answer 1:


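Your character class [A-Za-z ] does not account for the hyphen (-), and because of the leading \r\n the pattern only matches text that comes right after a line break. Note also that Regex.Match returns only the first match; Regex.Matches returns all of them.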

You can use the following regex:

[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?

The [\p{L}-]+ matches 1 or more letters and hyphens, \p{Zs}* matches 0 or more space separator characters (the Unicode Zs category, which does not include tabs or line breaks), \d*\.?\d{1,2} matches a number with an optional decimal point followed by 1 to 2 digits, and (?:\p{Zs}*[\p{L}-]+)? matches an optional word after the number.
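
Because the dot in \d*\.?\d{1,2} is optional, a plain integer after the word also matches. The sketch below compares that number part with a stricter variant, \d+\.\d{1,2}, which is an assumption added here for illustration and not part of the answer; the class name NumberPartDemo and the method IsMatch are hypothetical as well.

```csharp
using System.Text.RegularExpressions;

public static class NumberPartDemo
{
    // Word plus number part as written in the answer: the dot is optional.
    private const string Lenient = @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}";

    // Stricter variant (an assumption, not from the answer):
    // one or more digits, a mandatory dot, then 1 to 2 digits.
    private const string Strict = @"[\p{L}-]+\p{Zs}*\d+\.\d{1,2}";

    // Hypothetical helper: true if the chosen pattern matches anywhere in input.
    public static bool IsMatch(string input, bool strict)
    {
        return Regex.IsMatch(input, strict ? Strict : Lenient);
    }
}
```

With these definitions, NumberPartDemo.IsMatch("item 7", false) is true while NumberPartDemo.IsMatch("item 7", true) is false; "item 7.5" matches both. Which behaviour is wanted depends on whether integers should count as matches.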

Here is a C# snippet that collects all occurrences using the Regex.Matches method:

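using System.Linq;
using System.Text.RegularExpressions;

// Collect the text of every match into a List<string>
var res = Regex.Matches(str, @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?")
              .Cast<Match>()
              .Select(p => p.Value)
              .ToList();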
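
To try the pattern against the sample text from the question, the same call can be wrapped in a small helper. This is only a minimal sketch: the class name DecimalWordMatcher and the method FindMatches are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DecimalWordMatcher
{
    // Pattern from the answer: word, optional spaces, number, optional trailing word.
    private const string Pattern =
        @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?";

    // Hypothetical helper: returns the text of every match found in input.
    public static List<string> FindMatches(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

Calling DecimalWordMatcher.FindMatches("some word\r\nanother 50.00 \r\nsome-more 10.10 text\r\nanother word") returns two items, "another 50.00" and "some-more 10.10 text". The trailing space after 50.00 is not part of the first match, because the optional group only succeeds when a word follows the spaces.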

Just FYI: if you need to match whole words, you can also use word boundaries \b:

\b[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?\b
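
The effect of the word boundaries shows up when the word or the trailing word is glued to digits. The following sketch compares the two patterns on such inputs; BoundaryDemo and FirstMatch are hypothetical names used only for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Pattern from the answer without word boundaries.
    private const string Plain =
        @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?";

    // Same pattern wrapped in \b on both sides.
    private const string Bounded = @"\b" + Plain + @"\b";

    // Hypothetical helper: returns the first match, or an empty string if there is none.
    public static string FirstMatch(string input, bool useBoundaries)
    {
        Match m = Regex.Match(input, useBoundaries ? Bounded : Plain);
        return m.Success ? m.Value : string.Empty;
    }
}
```

For the input "price 10.10 text99", the plain pattern returns "price 10.10 text", while the bounded one returns "price 10.10", since no word boundary exists between "text" and "99". For "9another 50.00", the plain pattern returns "another 50.00" and the bounded one finds no match at all.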

And just another note: if you need to match diacritics, too, you may add \p{M} to the character class containing \p{L}:

\b[\p{L}\p{M}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}\p{M}-]+)?\b
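
The \p{M} addition only matters for decomposed text, where an accent is stored as a separate combining mark; a precomposed letter such as é (U+00E9) is already covered by \p{L}. A minimal sketch of the difference, with DiacriticsDemo and FirstMatch as hypothetical names:

```csharp
using System.Text.RegularExpressions;

public static class DiacriticsDemo
{
    // Letters and hyphens only.
    private const string LettersOnly =
        @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?";

    // Letters, combining marks and hyphens.
    private const string WithMarks =
        @"[\p{L}\p{M}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}\p{M}-]+)?";

    // Hypothetical helper: returns the first match, or an empty string if there is none.
    public static string FirstMatch(string input, bool allowMarks)
    {
        Match m = Regex.Match(input, allowMarks ? WithMarks : LettersOnly);
        return m.Success ? m.Value : string.Empty;
    }
}
```

With the input "cafe\u0301 3.50" (the letter e followed by U+0301, a combining acute accent), the letters-only pattern finds nothing because the combining mark sits between the word and the space, whereas the \p{M} variant matches the whole string.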


Source: https://stackoverflow.com/questions/33786590/regex-match-word-followed-by-decimal-from-text
