Question
My site allows users to write blog posts:
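using System.Web.Mvc;

class BlogPost
{
    // [AllowHtml] targets properties, so Content is a property rather than a field.
    [AllowHtml]
    public string Content { get; set; }
}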
The site is built from the MVC 5 Internet application template and uses Bootstrap 3 for its CSS, so I decided to use http://jhollingworth.github.io/bootstrap-wysihtml5 to take care of the JavaScript part of a rich text editor.
It works like a charm. But in order to make the POST happen, I had to add the [AllowHtml] attribute as in the code above. So now I'm worried about dangerous content that could get into the database and then be displayed to all users.
I tried entering values like <script>alert("What's up?")</script> in the form and it seemed to be fine: the text was displayed exactly as typed (<script> became &lt;script&gt;). But this conversion seemed to be done by the JavaScript plugin I used.
So I used Fiddler to compose a POST request with the same script tag, and this time the page actually executed the JavaScript code.
Is there any way I can detect dangerous input like <script> and even <a href="javascript:some_code">Link</a>?
Answer 1:
Unfortunately, you have to sanitize the HTML yourself. See these for how others have done it:
- How to sanitize input from MCE in ASP.NET? - whitelist using Html Agility Pack
- .NET HTML Sanitation for rich HTML Input - blacklist using Html Agility Pack
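As a rough illustration of the whitelist approach, here is a minimal sketch built on Html Agility Pack (NuGet package HtmlAgilityPack). The class name HtmlWhitelist, the list of allowed tags and attributes, and the rule that only absolute http, https, and mailto links are kept are all assumptions made for this example; they are not taken from the linked answers. It removes script-like elements together with their content, unwraps any other unknown tag while keeping its text, and strips every attribute that is not explicitly allowed. Treat it as a starting point rather than a vetted sanitizer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class HtmlWhitelist
{
    // Hypothetical whitelist: tag name -> attributes allowed on that tag.
    private static readonly Dictionary<string, string[]> Allowed =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "p", new string[0] }, { "br", new string[0] },
            { "b", new string[0] }, { "i", new string[0] },
            { "strong", new string[0] }, { "em", new string[0] },
            { "ul", new string[0] }, { "ol", new string[0] },
            { "li", new string[0] }, { "blockquote", new string[0] },
            { "a", new[] { "href" } }
        };

    // Tags that are removed together with everything inside them.
    private static readonly HashSet<string> DropWithContent =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "script", "style", "iframe", "object", "embed" };

    public static string Sanitize(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html ?? string.Empty);

        // Take a snapshot first, because the loop mutates the tree.
        foreach (var node in doc.DocumentNode.Descendants().ToList())
        {
            if (node.NodeType == HtmlNodeType.Comment)
            {
                node.Remove();
                continue;
            }
            if (node.NodeType != HtmlNodeType.Element || node.ParentNode == null)
            {
                continue;
            }

            string[] allowedAttributes;
            if (DropWithContent.Contains(node.Name))
            {
                node.Remove(); // drop the tag and its content
            }
            else if (!Allowed.TryGetValue(node.Name, out allowedAttributes))
            {
                // Unknown tag: remove the tag itself but keep its children.
                node.ParentNode.RemoveChild(node, true);
            }
            else
            {
                foreach (var attr in node.Attributes.ToList())
                {
                    bool keep = allowedAttributes.Contains(attr.Name, StringComparer.OrdinalIgnoreCase)
                        && (!attr.Name.Equals("href", StringComparison.OrdinalIgnoreCase)
                            || IsSafeUrl(attr.Value));
                    if (!keep)
                    {
                        attr.Remove();
                    }
                }
            }
        }

        // Serialize from the modified tree.
        return doc.DocumentNode.WriteTo();
    }

    // Assumption for this sketch: only absolute http(s) and mailto links are kept.
    private static bool IsSafeUrl(string value)
    {
        // Decode entities and drop whitespace/control characters before checking the scheme.
        string decoded = HtmlEntity.DeEntitize(value ?? string.Empty);
        string url = new string(decoded
            .Where(c => !char.IsWhiteSpace(c) && !char.IsControl(c))
            .ToArray());

        return url.StartsWith("http://", StringComparison.OrdinalIgnoreCase)
            || url.StartsWith("https://", StringComparison.OrdinalIgnoreCase)
            || url.StartsWith("mailto:", StringComparison.OrdinalIgnoreCase);
    }
}
```

In the question's setup, something like this would run in the POST action on BlogPost.Content before the post is saved, so that a request forged with Fiddler goes through the same filter as one coming from the editor. Relative links are dropped by this sketch; loosen IsSafeUrl if you need them.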
An alternative to accepting HTML is to accept Markdown or BBCode instead. Both are widely used (Markdown is used by Stack Overflow!) and largely remove the need to sanitize raw HTML input, provided the renderer is set up to escape or strip any HTML embedded in the text. There are rich editors available for them too.
Edit:
I found that the Microsoft Web Protection Library can sanitize HTML input through AntiXss.GetSafeHtml and AntiXss.GetSafeHtmlFragment. The documentation is really poor, though, and it seems you can't configure which tags are valid.
Answer 2:
I faced the same problem sanitizing wysihtml5 content on the server side. I was rather charmed by how wysihtml5 performs client-side sanitization, so I implemented the same idea using Html Agility Pack: HtmlRuleSanitizer on GitHub, also available as a NuGet package.
The reason for not using Microsoft's AntiXss is that it can't enforce more detailed rules, such as what to do with each tag. As a result, tags are deleted completely even when it would make sense to preserve their text content. In addition, I wanted a whitelisting approach for everything (CSS, tags, and attributes).
Source: https://stackoverflow.com/questions/19735214/allowing-only-certain-html-tags-as-user-input