Question
Is there a regex pattern for .NET that matches any character that would split text into multiple lines, i.e. any vertical whitespace character, as Perl regex does with \v? In other words, is there a way to match \r (carriage return), \n (line feed), \v (vertical tab), and \f (form feed), as well as the Unicode characters U+0085 (next line), U+2028 (line separator), and U+2029 (paragraph separator), and any other characters I'm not aware of that might result in more than one line?
I'm writing some validation code in .NET that will fail if a user has provided input text that contains more than one line. In most cases, that means I just have to check for \r and \n. However, I know there are many other vertical whitespace characters.
I know .NET regex differs from Perl regex, most importantly in that \v in .NET matches only the vertical tab, whereas in Perl regex it matches any vertical whitespace.
Answer 1:
As you say, the Perl character class \v matches [\x0A-\x0D] (line feed, vertical tab, form feed, and carriage return, although I would dispute that CR is vertical whitespace), in addition to the non-ASCII code points [\x{85}\x{2028}\x{2029}] (next line, line separator, and paragraph separator).
You can hand-build this character class in .NET like this (note that a \u escape in .NET requires exactly four hex digits):
[\u000A-\u000D\u0085\u2028\u2029]
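For the validation scenario in the question, the hand-built class can be wrapped in a small helper. This is only a minimal sketch: the class name SingleLineValidator and the method IsSingleLine are hypothetical names chosen for illustration, and the character set is the Perl \v set (U+000A to U+000D, U+0085, U+2028, U+2029) rather than anything .NET defines itself.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class SingleLineValidator
{
    // Hand-built equivalent of Perl's \v: LF, VT, FF, CR, NEL, LS, PS.
    // The verbatim string passes the \uXXXX escapes through to the regex engine.
    private static readonly Regex VerticalWhitespace =
        new Regex(@"[\u000A-\u000D\u0085\u2028\u2029]", RegexOptions.CultureInvariant);

    // Returns true when the input contains no vertical whitespace character.
    public static bool IsSingleLine(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        return !VerticalWhitespace.IsMatch(input);
    }
}
```

With this sketch, SingleLineValidator.IsSingleLine("one line") is true, while any input containing, say, \r\n or U+2028 is rejected.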
Answer 2:
If you want to match any unknown characters, simply use a negated set, [^ ] (at least in .NET; my Perl is a little hazy), to match up to a specific character. For example, to match the whitespace from the current position across a line break to the next line, which starts with the letter D, I would use this:
([^D]+)
The capture would then include every type of whitespace there is, up to the letter D.
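As a rough illustration of the negated-set approach, the sketch below runs the pattern against input that contains several kinds of line breaks before the letter D. The class name NegatedSetDemo and the method CaptureUpToD are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
public static class NegatedSetDemo
{
    // Returns the first run of characters that are not 'D',
    // or an empty string when the input has no such run.
    public static string CaptureUpToD(string input)
    {
        Match m = Regex.Match(input, "([^D]+)");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Note that [^D] accepts every character other than D, not only whitespace, so the capture also swallows any ordinary text that precedes the D. For validating that input is a single line, an explicit character class is the more precise tool.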
Source: https://stackoverflow.com/questions/28743851/regular-expression-to-match-any-vertical-whitespace