Best way to Sanitize / Filter Comments from users?

两盒软妹~` submitted on 2019-12-03 21:48:20

Don't write your own HTML sanitizer. You'll create XSS holes.

If you're going to write your own anyway, at least run the ha.ckers.org XSS smoke tests against it.

Between those tests and the HTML Purifier comparison of filters, you should get a good idea of just how complicated HTML sanitization is -- and why you should leave it to the pros.

The most important thing when storing data in a database is to escape it, using mysql_real_escape_string, mysqli_real_escape_string, or PDO::quote, depending on the extension you're using (or the equivalent functions for Oracle, PostgreSQL, ...). Note that the old mysql_* extension was removed in PHP 7.0, so on current PHP only the mysqli and PDO variants apply.

Another solution is to use prepared statements (see mysqli::prepare and/or PDO::prepare; these are not supported by the old mysql_* extension), which keep the data separate from the SQL text and so take care of the escaping for you ;-)
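As a minimal sketch of the prepared-statement approach, the snippet below stores and reloads a comment through PDO. The `comments` table, the helper names `save_comment` and `load_comment`, and the in-memory SQLite database are assumptions made for illustration (they require the pdo_sqlite extension); a real application would point PDO at its own database.

```php
<?php
// Assumption: pdo_sqlite is available; the in-memory database is a stand-in.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, used only for this example.
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, author TEXT, body TEXT)');

// Insert a comment; the values are bound, never concatenated into the SQL.
function save_comment(PDO $pdo, string $author, string $body): int
{
    $stmt = $pdo->prepare('INSERT INTO comments (author, body) VALUES (:author, :body)');
    $stmt->execute([':author' => $author, ':body' => $body]);
    return (int) $pdo->lastInsertId();
}

// Fetch one comment by id, or null if it does not exist.
function load_comment(PDO $pdo, int $id): ?array
{
    $stmt = $pdo->prepare('SELECT author, body FROM comments WHERE id = :id');
    $stmt->execute([':id' => $id]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// Input containing quotes and SQL syntax is stored verbatim as data.
$id  = save_comment($pdo, "Robert'); DROP TABLE comments;--", '<b>hi</b>');
$row = load_comment($pdo, $id);
```

The comment body comes back exactly as submitted, HTML included: bound parameters protect the query, not the page, so the value still has to be escaped or filtered when it is output.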


When thinking about HTML output, you have three options:

  • accept HTML and use a library like HTMLPurifier to filter/clean it; it lets you specify exactly which tags and attributes are allowed, and gives you clean, valid HTML as output.
  • try to remove HTML, like you are doing -- this does not always work well (what if you forget some special case?)
  • escape HTML with htmlentities or htmlspecialchars: the result may not look nice, but the output will show exactly what the user typed.

I would go with either the first or the last option; yours feels more "dangerous" -- but that's only a feeling ^^ (the general idea being "do not reinvent the wheel").
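The last option, escaping on output, might look like the sketch below. The helper name `e` and the sample comment are made up for illustration; `htmlspecialchars` and its flags are standard PHP.

```php
<?php
// Hypothetical helper: escape a string for use in HTML text or attribute values.
function e(string $text): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces
    // invalid UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$comment = '<script>alert("xss")</script>';

// Prints: <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
echo '<p>' . e($comment) . '</p>', PHP_EOL;
```

The browser then shows the markup as literal text, which is the "output looks like the input" behaviour described above.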

Your magic quotes handling is fine, although if GET parameters are created with quotes in their names, you need to stripslashes the keys too. :)
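For what stripping slashes from keys as well as values could look like, here is a rough sketch. The function name `stripslashes_deep` is hypothetical and this is not the asker's code; it also only matters on legacy installations, since magic quotes were removed in PHP 5.4.

```php
<?php
// Hypothetical helper: undo magic-quotes style escaping on both the keys
// and the values of a (possibly nested) request array.
function stripslashes_deep($value)
{
    if (!is_array($value)) {
        return is_string($value) ? stripslashes($value) : $value;
    }

    $clean = [];
    foreach ($value as $key => $item) {
        // Integer keys are left alone; string keys are unescaped too.
        $cleanKey = is_string($key) ? stripslashes($key) : $key;
        $clean[$cleanKey] = stripslashes_deep($item);
    }
    return $clean;
}

// Simulated input as magic quotes would have delivered it.
$raw   = ["a\\'b" => "O\\'Reilly", 'tags' => ["it\\'s"]];
$clean = stripslashes_deep($raw);
```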

As for stripping tags, you are better off with a real HTML filter library. There are so many twists and turns in HTML that you should not trust anything you write once and then forget about. People spend a lot of time building those HTML filters, so use their work to your advantage.

As for "straight into the DB": with bound parameters, sure, that's great. You can safely put anything into a bound parameter. If you are building a quoted string instead, I hope you are escaping the result.

Escape all data when putting it into the database. When retrieving and displaying it, make sure to escape HTML markup such as <sometag> so that it is displayed as text instead of being treated as code.

PHP has little-known but powerful built-in sanitization functions. I would recommend using them:

Input filtering in PHP

filter_input and filter_var
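A small sketch of how these functions might be applied to a comment form follows. The field names (`email`, `age`) and the helper `validate_comment_fields` are assumptions for illustration. `filter_var` is used so the example runs outside a web request; `filter_input` takes the same filters but reads directly from the request.

```php
<?php
// Hypothetical helper: validate a few fields of a submitted comment form.
// Each entry is the validated value, or false if validation failed.
function validate_comment_fields(array $input): array
{
    return [
        'email' => filter_var($input['email'] ?? '', FILTER_VALIDATE_EMAIL),
        'age'   => filter_var(
            $input['age'] ?? '',
            FILTER_VALIDATE_INT,
            ['options' => ['min_range' => 1, 'max_range' => 150]]
        ),
    ];
}

$good = validate_comment_fields(['email' => 'user@example.com', 'age' => '42']);
$bad  = validate_comment_fields(['email' => 'not-an-email', 'age' => 'abc']);

// In a real request handler the equivalent call would be, for example:
// $email = filter_input(INPUT_POST, 'email', FILTER_VALIDATE_EMAIL);
```

Validation of this kind checks that the input has the expected shape; it does not replace escaping on output or bound parameters in queries.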
