Regular Expression to get the SRC of images in C#

情深已故 2020-11-29 09:34

I'm looking for a regular expression to isolate the src value of an img. (I know this is not the best way to do it, but it is what I have to do in this case.)

8 Answers
  •  清歌不尽
    2020-11-29 09:50

    I know you say you have to use regex, but if possible I would really give this open source project a chance: HtmlAgilityPack

    It is really easy to use. I just discovered it and it helped me out a lot, since I was doing some heavier HTML parsing. It basically lets you use XPath expressions to select your elements.

    Their example page is a little outdated, but the API is really easy to understand, and if you are a little bit familiar with XPath you will get your head around it in no time.

    The code for your query would look something like this: (uncompiled code)

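     // requires: using System.Collections.Generic; using HtmlAgilityPack;
     List<string> imgSrcs = new List<string>();
     HtmlDocument doc = new HtmlDocument();
     doc.LoadHtml(htmlText); // or doc.Load(htmlFileStream)
     // SelectNodes returns null when nothing matches, so guard before iterating
     var nodes = doc.DocumentNode.SelectNodes("//img[@src]");
     if (nodes != null)
     {
         foreach (var img in nodes)
         {
             HtmlAttribute att = img.Attributes["src"];
             imgSrcs.Add(att.Value);
         }
     }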
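
    If you really are stuck with regex, here is a minimal sketch using only System.Text.RegularExpressions. The class name ImgSrcExtractor and the method Extract are made up for illustration, and the pattern assumes the src value is wrapped in single or double quotes and that no attribute value contains a ">" character. Real-world HTML breaks both assumptions often enough, so treat this as a fallback, not a parser.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class ImgSrcExtractor
{
    // <img, whitespace, optionally other attributes, then src = "value" or 'value'.
    // Group 1 captures the opening quote so \1 can require the same closing quote.
    // Requiring whitespace before "src" avoids matching attributes such as data-src.
    // Assumption: unquoted src values are not handled.
    private static readonly Regex ImgSrcPattern = new Regex(
        @"<img\s+(?:[^>]*?\s)?src\s*=\s*([""'])(?<src>.*?)\1",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the src value of every img tag the pattern recognises, in document order.
    public static List<string> Extract(string html)
    {
        var srcs = new List<string>();
        foreach (Match m in ImgSrcPattern.Matches(html))
        {
            srcs.Add(m.Groups["src"].Value);
        }
        return srcs;
    }
}
```

    Calling ImgSrcExtractor.Extract(htmlText) gives you the same kind of list as the HtmlAgilityPack loop above, but only for markup that fits the assumptions listed.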
    
