C# Convert Relative to Absolute Links in HTML String

余生分开走 2020-12-16 04:03

I'm mirroring some internal websites for backup purposes. As of right now I basically use this C# code:

System.Net.WebClient client = new System.Net.WebCli         


        
10 answers
  •  暗喜
    暗喜 (original poster)
    2020-12-16 04:46

    I know this is an older question, but I figured out how to do it with a fairly simple regex, and it works well for me. It leaves absolute http/https URLs alone and rewrites both root-relative and current-directory-relative ones.

    var host = "http://www.google.com/";
    var baseUrl = host + "images/";
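    var html = ""; // the sample markup is missing from the source; assign the HTML to rewrite here
    // Match the value of a double-quoted href/src attribute unless it already starts with http:// or https://
    var regex = "(?<=(?:href|src)=\")(?!https?://)(?<url>[^\"]+)";
    html = Regex.Replace(
        html,
        regex,
        match => match.Groups["url"].Value.StartsWith("/")
            ? host + match.Groups["url"].Value.Substring(1)   // root-relative: resolve against the host
            : baseUrl + match.Groups["url"].Value);           // directory-relative: resolve against the base URL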
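
As a variation on the regex above, the string concatenation can probably be handed off to System.Uri, which also resolves things like "../" segments. This is only a minimal sketch under a few assumptions: the class name LinkRewriter and the method MakeLinksAbsolute are hypothetical names made up for illustration, only double-quoted href/src attributes are matched, and the page URL is expected to end with a slash when it refers to a directory.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
static class LinkRewriter
{
    // Matches the value of a double-quoted href or src attribute.
    private static readonly Regex UrlAttribute =
        new Regex("(?<=(?:href|src)=\")(?<url>[^\"]+)", RegexOptions.IgnoreCase);

    // Resolves every matched URL against pageUrl.
    // Absolute URLs come back unchanged; values that cannot be parsed are left as they were.
    public static string MakeLinksAbsolute(string html, Uri pageUrl)
    {
        return UrlAttribute.Replace(html, match =>
        {
            string raw = match.Groups["url"].Value;
            return Uri.TryCreate(pageUrl, raw, out Uri resolved)
                ? resolved.AbsoluteUri
                : raw;
        });
    }
}
```

It could be called as LinkRewriter.MakeLinksAbsolute(html, new Uri("http://www.google.com/images/")), which plays the role of both host and baseUrl from the snippet above. A regex still misses single-quoted or unquoted attributes, so an HTML parser may be the safer choice for messy pages.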
