Whitelisting, preventing XSS with WMD control in C#

北慕城南 提交于 2020-01-15 11:56:07

问题


Are there any problems with what I am doing here? This is my first time to deal with something like this, and I just want to make sure I understand all the risks, etc. to different methods.

I am using WMD to get user input, and I am displaying it with a literal control. Since it is uneditable once entered I will be storing the HTML and not the Markdown,

input = Server.HTMLEncode(stringThatComesFromWMDTextArea)

And then run something like the following for tags I want users to be able to use.

// Unescape whitelisted tags.
string output = input.Replace("&lt;b&gt;", "<b>").Replace("&lt;/b&gt;", "</b>")
                     .Replace("&lt;i&gt;", "<i>").Replace("&lt;/i&gt;", "</i>");

Edit Here is what I am doing currently:

 public static string EncodeAndWhitelist(string html)
 {
     string[] whiteList = { "b", "i", "strong", "img", "ul", "li" };
     string encodedHTML = HttpUtility.HtmlEncode(html);
     foreach (string wl in whiteList)
         encodedHTML = encodedHTML.Replace("&lt;" + wl + "&gt;", "<" + wl + ">").Replace("&lt;/" + wl + "&gt;", "</" + wl + ">");
     return encodedHTML;
 }
  1. Will what I am doing here keep me protected from XSS?
  2. Are there any other considerations that should be made?
  3. Is there a good list of normal tags to whitelist?

回答1:


If your requirements really are that basic that you can do such simple string replacements then yes, this is ‘safe’ against XSS. (However, it's still possible to submit non-well-formed content where <i> and <b> are mis-nested or unclosed, which could potentially mess up the page the content ends up inserted into.)

But this is rarely enough. For example currently <a href="..."> or <img src="..." /> are not allowed. If you wanted to allow these or other markup with attribute values in, you'd have a whole lot more work to do. You might then approach it with regex, but that gives you endless problems with accidental nesting and replacement of already-replaced content, seeing as how regex can't parse HTML, and that.

To solve both problems, the usual approach is to use an [X][HT]ML parser on the input, then walk the DOM removing all but known-good elements and attributes, then finally re-serialise to [X]HTML. The result is then guaranteed well-formed and contains only safe content.



来源:https://stackoverflow.com/questions/2104520/whitelisting-preventing-xss-with-wmd-control-in-c-sharp

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!