C# Convert Relative to Absolute Links in HTML String

余生分开走 2020-12-16 04:03

I'm mirroring some internal websites for backup purposes. Right now I basically use this C# code:

System.Net.WebClient client = new System.Net.WebCli         


        
10 Answers
  •  小蘑菇 (original poster)
    2020-12-16 04:56

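    // The Uri(Uri baseUri, string relativeUri) constructor resolves a link against the page it came from.
    Uri WebsiteImAt = new Uri(
           "http://www.w3schools.com/media/media_mimeref.asp?q=1&s=2,2#a");
    // Root-relative: http://www.w3schools.com/something/somethingelse/filename.asp
    string href = new Uri(WebsiteImAt, "/something/somethingelse/filename.asp")
           .AbsoluteUri;
    // Document-relative: http://www.w3schools.com/media/something.asp
    string href2 = new Uri(WebsiteImAt, "something.asp").AbsoluteUri;
    // Document-relative: http://www.w3schools.com/media/something
    string href3 = new Uri(WebsiteImAt, "something").AbsoluteUri;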
    

    Applied to your Regex-based approach, this would probably (untested) look like:

            // Groups: 1 = text before the attribute, 2 = attribute name (src or href),
            // 3 = URL that does not start with "http", 4 = rest of the tag.
            string value = Regex.Replace(text, "<(.*?)(src|href)=\"(?!http)(.*?)\"(.*?)>", match =>
                "<" + match.Groups[1].Value + match.Groups[2].Value + "=\""
                    + new Uri(WebsiteImAt, match.Groups[3].Value).AbsoluteUri + "\""
                    + match.Groups[4].Value + ">", RegexOptions.IgnoreCase | RegexOptions.Multiline);
    

    I would also advise against using Regex here. Instead, apply the Uri trick through a DOM, perhaps XmlDocument (if the page is XHTML) or the HTML Agility Pack (otherwise), and process every //@src or //@href attribute.
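
    As a rough sketch of the DOM route, here is how the XmlDocument variant might look. It assumes the input is well-formed XHTML without a DOCTYPE (otherwise LoadXml will throw or try to resolve the DTD). The class name LinkRewriter and the method MakeLinksAbsolute are made up for illustration; for real-world, non-XHTML pages the HTML Agility Pack would be the better fit.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class LinkRewriter
{
    // Hypothetical helper (not part of any library): rewrites every src/href
    // attribute in a well-formed XHTML string to an absolute URI.
    public static string MakeLinksAbsolute(string xhtml, Uri baseUri)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = true; // keep the original layout of the markup
        doc.LoadXml(xhtml);            // throws XmlException if the input is not well-formed

        // Snapshot the attribute nodes first, so the document is not
        // modified while the XPath result is still being enumerated.
        var attributes = doc.SelectNodes("//@src | //@href").Cast<XmlNode>().ToList();

        foreach (XmlNode attr in attributes)
        {
            // Same resolution rules as new Uri(baseUri, relative), but without
            // an exception on malformed values; already absolute URLs are kept.
            if (Uri.TryCreate(baseUri, attr.Value, out var absolute))
                attr.Value = absolute.AbsoluteUri;
        }

        return doc.OuterXml;
    }
}
```

    Called with the base Uri from above and the fragment <a href="something.asp">x</a>, the href should come back as http://www.w3schools.com/media/something.asp. Values that cannot be parsed are left untouched rather than aborting the whole page.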
