Iterate through an html string to find all img tags and replace the src attribute values

你的背包 2021-01-01 04:59

I have HTML code as a string. I need to find all img tags in that string, read the value of each src attribute, and pass it to a function. That function returns an entire

2 answers
  •  抹茶落季
    2021-01-01 05:37

    If I understand your need correctly, you can use HtmlAgilityPack for this. Parsing HTML with a regex can misbehave on real-world markup (attribute order, quoting, line breaks), so an HTML parser is the safer choice. Can you try the code below?

    using System;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Net.Mail;
    using System.Net.Mime;
    using HtmlAgilityPack;

    public static string DoIt()
    {
        string htmlString;
        using (WebClient client = new WebClient())
        {
            // Example page containing base64 img src values; replace this with your own source.
            htmlString = client.DownloadString("http://dean.edwards.name/my/base64-ie.html");
        }

        HtmlDocument document = new HtmlDocument();
        document.LoadHtml(htmlString);

        // Select only the img tags whose src is an inline data URI.
        var inlineImages = document.DocumentNode.Descendants("img")
            .Where(e => e.GetAttributeValue("src", "").StartsWith("data:image", StringComparison.Ordinal))
            .ToList();

        foreach (HtmlNode img in inlineImages)
        {
            string currentSrcValue = img.GetAttributeValue("src", "");
            // Everything after the first comma is the base64 payload.
            string base64 = currentSrcValue.Substring(currentSrcValue.IndexOf(',') + 1);
            byte[] imageData = Convert.FromBase64String(base64);

            // The media type is hard-coded; adjust it if your images are not JPEG.
            LinkedResource inline = new LinkedResource(new MemoryStream(imageData), "image/jpeg");
            inline.ContentId = Guid.NewGuid().ToString();
            inline.TransferEncoding = TransferEncoding.Base64;
            // Note: 'inline' still has to be added to an AlternateView's LinkedResources
            // for the cid: reference to resolve in a mail message.

            img.SetAttributeValue("src", "cid:" + inline.ContentId);
        }

        // Return the rewritten HTML.
        return document.DocumentNode.OuterHtml;
    }
    

    You can install HtmlAgilityPack from NuGet: https://www.nuget.org/packages/HtmlAgilityPack
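
The code above takes everything after the comma as base64 and always labels the image as JPEG. If your data URIs may carry other media types, something like the following could pull the type and the bytes out of the src value. This is only a sketch: `DataUri` and `TryParse` are names made up for illustration, and it assumes the common `data:<mediaType>;base64,<payload>` shape (a URI without `;base64,` is simply rejected).

```csharp
using System;

public static class DataUri
{
    // Hypothetical helper: splits "data:<mediaType>;base64,<payload>" into its parts.
    // Returns false for anything that is not a base64-encoded data URI.
    public static bool TryParse(string src, out string mediaType, out byte[] data)
    {
        mediaType = null;
        data = null;
        const string prefix = "data:";
        const string marker = ";base64,";

        if (string.IsNullOrEmpty(src) || !src.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            return false;

        int markerIndex = src.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (markerIndex < 0)
            return false; // not base64-encoded, or malformed

        // Text between "data:" and ";base64," (may include extra parameters).
        string type = src.Substring(prefix.Length, markerIndex - prefix.Length);
        try
        {
            data = Convert.FromBase64String(src.Substring(markerIndex + marker.Length));
        }
        catch (FormatException)
        {
            return false; // payload is not valid base64
        }

        mediaType = type;
        return true;
    }
}
```

Inside the loop you could then call `DataUri.TryParse(currentSrcValue, out var mediaType, out var imageData)` and pass `mediaType` to the `LinkedResource` constructor instead of the fixed "image/jpeg".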

    Hope this helps
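
For the more general case in the question (pass every src value to a function and write the result back), the same HtmlAgilityPack calls can be wrapped as below. Treat it as a rough sketch rather than a finished implementation: `ImgSrcRewriter`, `ReplaceImgSources` and the `transform` parameter are hypothetical names, and `transform` stands in for whatever function you already have.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class ImgSrcRewriter
{
    // Hypothetical helper: replaces the src of every img tag with transform(src).
    public static string ReplaceImgSources(string html, Func<string, string> transform)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // ToList() takes a snapshot so the nodes can be modified while iterating.
        foreach (HtmlNode img in document.DocumentNode.Descendants("img").ToList())
        {
            HtmlAttribute srcAttribute = img.Attributes["src"];
            if (srcAttribute == null)
                continue; // img tag without a src attribute: leave it untouched

            img.SetAttributeValue("src", transform(srcAttribute.Value));
        }

        return document.DocumentNode.OuterHtml;
    }
}
```

Usage would look like `string result = ImgSrcRewriter.ReplaceImgSources(htmlString, src => MyFunction(src));`, where `MyFunction` is your own method. Tags without a src attribute are skipped, so `transform` never receives null.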
