C# RegEx on a StreamReader will not return matches

假装没事ソ submitted on 2019-12-13 06:40:41

Question


I'm writing a simple screen-scraping application to play around with the HTMLAgilityPack library. After getting it to work on several different types of HtmlNodes, I figured I'd get fancy and throw in a Regex for email addresses as well. The only problem is that the application never finds any matches, or perhaps it finds them but doesn't return them properly. This happens even on sites known to contain email addresses. Can anyone spot what I'm doing wrong here?

    string url = String.Format("http://{0}", mainForm.Target);
    string reg = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}\b";
    try
    {
        WebClient wClient = new WebClient();
        Stream data = wClient.OpenRead(url);
        StreamReader read = new StreamReader(data);
        MatchCollection matches = Regex.Matches(read.ReadToEnd(), reg, RegexOptions.IgnoreCase|RegexOptions.Multiline);
        foreach (Match match in matches)
        {
            textBox1.AppendText(match.ToString() + Environment.NewLine);
        }

Answer 1:


Use a verbatim string literal (the @ prefix):

string reg = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

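Without the @ prefix, the C# compiler turns \b inside the string into the backspace character (U+0008), so the regex engine never sees a word-boundary anchor. Also, your last period should be \. so that it matches only a literal period; an unescaped . matches any character.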
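
To make the difference concrete, here is a minimal sketch that runs both patterns against the same text. The class name, the helper CountMatches, and the sample string are made up for illustration; they are not from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailRegexDemo
{
    // Regular literal, as in the question: the compiler converts \b to
    // U+0008 (backspace), and the unescaped '.' matches any character.
    public const string BrokenPattern = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}\b";

    // Verbatim literal: backslashes reach the regex engine unchanged,
    // so \b is a word boundary and \. is a literal period.
    public const string FixedPattern = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

    // Hypothetical helper: counts case-insensitive matches of a pattern.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        // Assumed sample input standing in for a downloaded page.
        string sample = "<p>Contact: alice@example.com or bob@example.org</p>";

        // The first character of the broken pattern is a backspace (code 8).
        Console.WriteLine((int)BrokenPattern[0]);

        // The sample contains no backspace characters, so nothing matches.
        Console.WriteLine(CountMatches(sample, BrokenPattern));

        // The verbatim pattern finds both addresses.
        Console.WriteLine(CountMatches(sample, FixedPattern));
    }
}
```

Printing the first character code of the pattern is a quick way to confirm that the escape was consumed by the compiler rather than by the regex engine.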




Answer 2:


Check the string returned by read.ReadToEnd() and see whether your regex can find email addresses in it. I suspect your problem has nothing to do with StreamReader.
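
One way to follow this advice is to split reading from matching, so each step can be inspected separately. The sketch below is only an illustration: the class and method names are hypothetical, and a MemoryStream with made-up HTML stands in for the stream returned by wClient.OpenRead(url) so the example runs without network access.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class ScrapeDiagnostics
{
    private const string EmailPattern = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

    // Reads the whole stream into a string and disposes the reader.
    public static string ReadAll(Stream data)
    {
        using (var reader = new StreamReader(data))
        {
            return reader.ReadToEnd();
        }
    }

    // Runs the email regex over text that is already in memory.
    public static MatchCollection FindEmails(string text)
    {
        return Regex.Matches(text, EmailPattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // Assumed stand-in for the downloaded page.
        byte[] bytes = Encoding.UTF8.GetBytes("<a href=\"mailto:info@example.com\">mail us</a>");
        string html = ReadAll(new MemoryStream(bytes));

        // Step 1: confirm the download actually produced the expected text.
        Console.WriteLine("Read {0} characters", html.Length);
        Console.WriteLine("Contains '@': {0}", html.Contains("@"));

        // Step 2: run the regex on the string that was just inspected.
        foreach (Match match in FindEmails(html))
        {
            Console.WriteLine(match.Value);
        }
    }
}
```

If the text contains an @ but FindEmails returns nothing, the pattern is at fault rather than the StreamReader.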



Source: https://stackoverflow.com/questions/3433774/c-sharp-regex-on-a-streamreader-will-not-return-matches
