Question
I want to match the following examples and return an array of matches.
Given the text:
some word
another 50.00
some-more 10.10 text
another word
Matches should be a word, followed by a space and then a decimal number (optionally followed by another word):
another 50.00
some-more 10.10 text
I have the following so far:
string pat = @"\r\n[A-Za-z ]+\d+\.\d{1,2}([A-Za-z])?";
Regex r = new Regex(pat, RegexOptions.IgnoreCase);
Match m = r.Match(input);
but it only matches the first item: another 50.00
Answer 1:
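Your character class [A-Za-z ] does not include the hyphen, so some-more can never match, and the leading \r\n means a match can only start right after a line break. Note also that Regex.Match returns only the first match; Regex.Matches returns all of them.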
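To see the problems described above concretely, here is a minimal sketch that runs the question's pattern over the sample text with Regex.Matches. The class name OriginalPatternDemo, the helper FindAll, and the assumption that the sample lines are separated by \r\n are illustrative and not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class OriginalPatternDemo
{
    // The pattern from the question, unchanged.
    private const string Pattern = @"\r\n[A-Za-z ]+\d+\.\d{1,2}([A-Za-z])?";

    // Hypothetical helper: returns every match, not just the first one.
    public static List<string> FindAll(string input)
    {
        return Regex.Matches(input, Pattern, RegexOptions.IgnoreCase)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        // Assumption: the sample lines are separated by \r\n.
        string input = "some word\r\nanother 50.00\r\nsome-more 10.10 text\r\nanother word";

        foreach (string value in FindAll(input))
        {
            // Trim() drops the leading \r\n that the pattern itself consumes.
            Console.WriteLine(value.Trim());
        }
        // Prints only "another 50.00": the hyphen in "some-more" stops [A-Za-z ]+.
    }
}
```

So even with Regex.Matches, the original pattern finds a single item, and its value includes the leading line break.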
You can use the following regex:
[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?
Here, [\p{L}-]+ matches one or more letters or hyphens; \p{Zs}* matches zero or more space separators (Unicode category Zs, which excludes tabs and line breaks); \d*\.?\d{1,2} matches optional leading digits, an optional decimal point, and then 1 or 2 digits, so it accepts 50.00 and 10.1 but also a bare integer such as 50; and (?:\p{Zs}*[\p{L}-]+)? matches an optional word after the number.
Here is a C# snippet that collects all occurrences using the Regex.Matches method:
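The breakdown above can also be written into the pattern itself. The following is a minimal sketch, not part of the original answer, that spreads the same pattern over several lines with RegexOptions.IgnorePatternWhitespace; the class name AnnotatedPatternDemo, the helper FindAll, and the \n line separators in the sample are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class AnnotatedPatternDemo
{
    // Same pattern as in the answer, one component per line.
    // IgnorePatternWhitespace ignores layout whitespace and treats # as a comment start.
    private const string Pattern = @"
        [\p{L}-]+              # one or more letters or hyphens
        \p{Zs}*                # zero or more space separators
        \d*\.?\d{1,2}          # optional digits, optional dot, then 1 or 2 digits
        (?:\p{Zs}*[\p{L}-]+)?  # optional trailing word
    ";

    // Hypothetical helper: returns the text of every match.
    public static List<string> FindAll(string input)
    {
        return Regex.Matches(input, Pattern, RegexOptions.IgnorePatternWhitespace)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        // Assumption: the sample lines are separated by \n.
        string input = "some word\nanother 50.00\nsome-more 10.10 text\nanother word";

        foreach (string value in FindAll(input))
        {
            Console.WriteLine(value);
        }
        // Prints:
        // another 50.00
        // some-more 10.10 text
    }
}
```

Because \p{Zs} does not match line breaks, the optional trailing word cannot run on into the next line.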
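// Requires: using System.Linq; using System.Text.RegularExpressions;
// Collect the text of every match found in str into a List<string>.
var res = Regex.Matches(str, @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?")
    .Cast<Match>()
    .Select(p => p.Value)
    .ToList();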
Just FYI: if you need to match whole words, you can also use word boundaries \b:
\b[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?\b
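As a rough illustration of what the leading \b changes, the sketch below runs both variants over a made-up input in which the word is glued to a preceding digit. The class name WordBoundaryDemo, the helper FindAll, and the input string are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // The two patterns from the answer: without and with word boundaries.
    public const string Plain = @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?";
    public const string Bounded = @"\b[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?\b";

    // Hypothetical helper: returns the text of every match of pattern in input.
    public static List<string> FindAll(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        // Made-up input: "another" is preceded directly by a digit.
        string input = "1another 50.00";

        // The plain pattern starts matching in the middle of "1another".
        Console.WriteLine(string.Join("|", FindAll(input, Plain)));   // another 50.00

        // With \b there is no boundary between "1" and "a", so nothing matches.
        Console.WriteLine(FindAll(input, Bounded).Count);             // 0
    }
}
```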
And just another note: if you need to match diacritics, too, you may add \p{M} to the character class containing \p{L}:
[\p{L}\p{M}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}\p{M}-]+)?\b
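The \p{M} addition only matters when the text contains combining marks as separate code points. A minimal sketch, assuming a made-up input where the accent is written as U+0301 after a plain e; the class name CombiningMarkDemo and the helper FindAll are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CombiningMarkDemo
{
    // Patterns from the answer: letters only, and letters plus combining marks.
    public const string LettersOnly = @"[\p{L}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}-]+)?";
    public const string WithMarks = @"[\p{L}\p{M}-]+\p{Zs}*\d*\.?\d{1,2}(?:\p{Zs}*[\p{L}\p{M}-]+)?\b";

    // Hypothetical helper: returns the text of every match of pattern in input.
    public static List<string> FindAll(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        // Made-up input: "e" followed by U+0301 (combining acute accent).
        string input = "cafe\u0301 12.50";

        // U+0301 is not a letter, so it sits between the word and the number
        // and blocks the letters-only pattern entirely.
        Console.WriteLine(FindAll(input, LettersOnly).Count);   // 0

        // With \p{M} in the class the whole item is matched.
        Console.WriteLine(FindAll(input, WithMarks).Count);     // 1
    }
}
```

A precomposed é (U+00E9) is itself a letter, so it is already covered by \p{L} alone.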
Source: https://stackoverflow.com/questions/33786590/regex-match-word-followed-by-decimal-from-text