Sanitising user input using Python

小鲜肉 2020-12-04 07:36

What is the best way to sanitize user input for a Python-based web application? Is there a single function to remove HTML characters and any other necessary characters combi

7 answers
  •  一向 (original poster)
    2020-12-04 08:25

    Edit: bleach is a wrapper around html5lib which makes it even easier to use as a whitelist-based sanitiser.

    html5lib comes with a whitelist-based HTML sanitiser - it's easy to subclass it to restrict the tags and attributes users are allowed to use on your site, and it even attempts to sanitise CSS if you're allowing use of the style attribute.
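    To make the whitelist idea concrete, here is a minimal sketch built only on the standard library's html.parser. It is not how html5lib or bleach are implemented, and the tag, attribute, and URL-scheme whitelists below are hypothetical choices for illustration. Anything not explicitly allowed is dropped, and all text is re-escaped on the way out.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelists for illustration; not the sets used by html5lib or bleach.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "code", "pre",
                "ul", "ol", "li", "blockquote"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
# Elements whose text content is discarded along with the tag.
DROP_CONTENT = {"script", "style"}


class WhitelistSanitiser(HTMLParser):
    """Rebuilds the input, keeping only whitelisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._skip += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # drop the tag, keep any text inside it
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops onclick, style, and everything else unlisted
            value = value or ""
            # Only absolute URLs with a whitelisted scheme survive;
            # this sketch also drops relative links.
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self._skip:
                self._skip -= 1
            return
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip:
            self.out.append(escape(data, quote=False))


def sanitise_html(text):
    """Return text with every non-whitelisted tag and attribute removed."""
    parser = WhitelistSanitiser()
    parser.feed(text)
    parser.close()  # flushes any buffered trailing text
    return "".join(parser.out)
```

    Calling sanitise_html on a link with a javascript: href and an onclick handler leaves only the bare a element and its text. A sketch like this does not balance unclosed tags or sanitise CSS, which is part of why a maintained library such as html5lib or bleach is the better choice in practice.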

    Here's how I'm using it in my Stack Overflow clone's sanitize_html utility function:

    http://code.google.com/p/soclone/source/browse/trunk/soclone/utils/html.py

    I've thrown all the attacks listed in ha.ckers.org's XSS Cheatsheet (which are handily available in XML format) at it after performing Markdown to HTML conversion using python-markdown2, and it seems to have held up OK.
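    A test run of that kind could be sketched roughly as follows. Everything here is an assumption for illustration: SAMPLE_ATTACKS is a tiny hand-picked list rather than the cheatsheet's XML file, the Markdown conversion step is left out so the example needs no third-party packages, and html.escape stands in for the sanitiser under test. The checker parses the sanitised output and reports any markup a browser could still execute.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical sample vectors; the real cheatsheet contains many more.
SAMPLE_ATTACKS = [
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
    '<a href="javascript:alert(1)">x</a>',
    "<body onload=alert(1)>",
]

# Tags treated as dangerous by this sketch (an illustrative, incomplete set).
DANGEROUS_TAGS = {"script", "iframe", "object", "embed", "style"}


class DangerFinder(HTMLParser):
    """Records tags and attributes in the output that could run script."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag in DANGEROUS_TAGS:
            self.findings.append(f"<{tag}>")
        for name, value in attrs:
            if name.startswith("on"):  # event handlers such as onerror, onload
                self.findings.append(f"{tag}[{name}]")
            elif name in {"href", "src"} and \
                    (value or "").strip().lower().startswith("javascript:"):
                self.findings.append(f"{tag}[{name}=javascript:]")


def find_dangerous_markup(html_text):
    """Parse html_text and list anything that still looks executable."""
    finder = DangerFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


def run_attacks(sanitise, attacks=SAMPLE_ATTACKS):
    """Return {attack: findings} for each attack that survived sanitise()."""
    failures = {}
    for attack in attacks:
        findings = find_dangerous_markup(sanitise(attack))
        if findings:
            failures[attack] = findings
    return failures


if __name__ == "__main__":
    # html.escape is only a stand-in; pass your own sanitiser here.
    print(run_attacks(escape))
```

    An empty result means none of the sample vectors got through. To mirror the setup described above, the callable passed to run_attacks would first convert Markdown to HTML and then sanitise the result.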

    The WMD editor component which Stack Overflow currently uses is a problem, though - I actually had to disable JavaScript in order to test the XSS Cheatsheet attacks, as pasting them all into WMD ended up giving me alert boxes and blanking out the page.
