Regex options matching multi-line as well as ignoring the case

Posted by 时光总嘲笑我的痴心妄想 on 2019-12-04 04:05:19

Question


I have a piece of ill-formed HTML. The quotes around attribute values are sometimes missing, and the tag names are sometimes upper case and sometimes lower case:

<DIV class="main">
    <DIV class="subsection1">
   <H2>
   <DIV class=subwithoutquote>StackOverflow</DIV></H2></DIV></DIV>

I would like to match across multiple lines and ignore case, but the following pattern does not seem to work. (To combine the options, I also tried | instead of &.)

const string pattern = @"<div class=""?main""?><div class=""?subsection1""?><h2><div class=""?subwithoutquote""?>(.+?)</div>";
Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase & RegexOptions.Singleline);

Or should I add \n* to the pattern to solve the multi-line issue?


Answer 1:


The first problem is that your regex does not allow for whitespace between the tags. The correct regex (tested in Rubular) is:

<div class="?main"?>\s*<div class="?subsection1"?>\s*<h2>\s*<div class="?subwithoutquote"?>(.+?)<\/div>\s*

Notice the addition of several \s* entries.
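
To see the effect of the \s* entries, here is a minimal sketch that applies the pattern to the sample markup from the question. It assumes .NET's System.Text.RegularExpressions; the class and method names (WhitespaceTolerantMatch, ExtractTitle) are made up for illustration, and the pattern is written as a C# verbatim string, so each literal quote is doubled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceTolerantMatch
{
    // Verbatim string: "" stands for one literal quote, so ""? means "optional quote".
    // \s* between the tags absorbs the newlines and indentation.
    private const string Pattern =
        @"<div class=""?main""?>\s*<div class=""?subsection1""?>\s*<h2>\s*" +
        @"<div class=""?subwithoutquote""?>(.+?)</div>";

    // Returns the captured inner text, or an empty string when the markup does not match.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, Pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // The sample from the question: mixed case, line breaks, one unquoted attribute.
        string html =
            "<DIV class=\"main\">\n" +
            "    <DIV class=\"subsection1\">\n" +
            "   <H2>\n" +
            "   <DIV class=subwithoutquote>StackOverflow</DIV></H2></DIV></DIV>";

        Console.WriteLine(ExtractTitle(html)); // prints "StackOverflow"
    }
}
```

Without the \s* entries the same call returns an empty string, because the pattern would require each tag to follow the previous one with no characters in between.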

The second problem is that you're not concatenating the options properly.

Your code:

Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase & RegexOptions.Singleline);

Since these are bit flags, Bitwise-And (& operator) is the wrong operator. What you want is Bitwise-Or (| operator).

Bitwise-And means "if the bit is set in both of these, leave it set; otherwise, unset it." You need Bitwise-Or, which means "if the bit is set in either of these, set it; otherwise, unset it." IgnoreCase and Singleline are different bits, so AND-ing them leaves no bit set at all, which is the same as passing RegexOptions.None.
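
The difference is easy to check directly. The following is a small sketch, assuming only the standard RegexOptions enum; the class name FlagCombination is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FlagCombination
{
    public static void Main()
    {
        // The two flags occupy different bits, so AND-ing them clears everything.
        RegexOptions anded = RegexOptions.IgnoreCase & RegexOptions.Singleline;
        // OR-ing keeps both bits set.
        RegexOptions ored = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        Console.WriteLine(anded == RegexOptions.None);             // True
        Console.WriteLine(ored.HasFlag(RegexOptions.IgnoreCase));  // True
        Console.WriteLine(ored.HasFlag(RegexOptions.Singleline));  // True

        // With the AND-ed value, matching is case-sensitive and '.' stops at newlines.
        Console.WriteLine(Regex.IsMatch("A\nB", "a.b", anded));    // False
        Console.WriteLine(Regex.IsMatch("A\nB", "a.b", ored));     // True
    }
}
```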




Answer 2:


You need to OR them together in this case.

const string pattern = @"<div class=""?main""?><div class=""?subsection1""?><h2><div class=""?subwithoutquote""?>(.+?)</div>";
Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);

Edit: Change your regex to the following, which also allows whitespace between the tags:

const string pattern = @"<div class=""?main""?>\s*<div class=""?subsection1""?>\s*<h2>\s*<div class=""?subwithoutquote""?>(.+?)</div>";


Source: https://stackoverflow.com/questions/14611495/regex-options-matching-multi-line-as-well-as-ignoring-the-case
