How do I HTML Encode all the output in a web application?

暖寄归人 2020-12-10 17:20

I want to prevent XSS attacks in my web application. I found that HTML Encoding the output can really prevent XSS attacks. Now the problem is that how do I HTML encode every

11 Answers
  • 2020-12-10 17:42

    You don't want to encode all HTML; you only want to HTML-encode user input that you're outputting.

    For PHP, see htmlentities and htmlspecialchars.
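
    As a minimal sketch of what that looks like in practice, the snippet below escapes one value with htmlspecialchars right before printing it. The variable name $comment and its sample value are made up for illustration.

    ```php
    <?php
    // Hypothetical user-supplied value; in a real page it might come from a form or a database.
    $comment = '<script>alert("xss")</script>';

    // ENT_QUOTES escapes both double and single quotes; the charset is stated explicitly.
    $safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

    // Prints: <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
    echo "<p>{$safe}</p>\n";
    ```

    htmlspecialchars converts only the characters that are special in HTML (&, <, > and quotes), while htmlentities converts every character that has a named entity. For UTF-8 pages the former is usually enough.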

  • 2020-12-10 17:42

    You could wrap echo, print, etc. in your own functions, which you can then use to escape output. That is, instead of

    echo "blah";
    

    use

    myecho('blah');
    

    You could even add a second parameter that turns off escaping when you need it.

    In one project we had a debug mode in our output functions which made all the text going through our method invisible. Then we knew that anything left on the screen HADN'T been escaped! It was very useful for tracking down those naughty unescaped bits :)
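
    A rough sketch of such a wrapper might look like this. The name myecho comes from the example above; the signature and the choice of htmlspecialchars are assumptions, not the code from that project.

    ```php
    <?php
    // Hypothetical wrapper: escapes by default, with an opt-out flag for trusted markup.
    function myecho(string $text, bool $escape = true): void
    {
        echo $escape ? htmlspecialchars($text, ENT_QUOTES, 'UTF-8') : $text;
    }

    myecho('<b>blah</b>');        // prints &lt;b&gt;blah&lt;/b&gt;
    echo "\n";
    myecho('<b>blah</b>', false); // prints <b>blah</b> unescaped
    echo "\n";
    ```

    Making escaping the default means a forgotten argument fails safe: the worst case is doubly escaped text rather than an injection.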

  • 2020-12-10 17:43

    One thing that you shouldn't do is filter the input data as it comes in. People often suggest this, since it's the easiest solution, but it leads to problems.

    Input data can be sent to multiple places, besides being output as HTML. It might be stored in a database, for example. The rules for filtering data sent to a database are very different from the rules for filtering HTML output. If you HTML-encode everything on input, you'll end up with HTML in your database. (This is also why PHP's "magic quotes" feature is a bad idea.)

    You can't anticipate all the places your input data will travel. The safe approach is to prepare the data just before it's sent somewhere. If you're sending it to a database, escape the single quotes. If you're outputting HTML, escape the HTML entities. And once it's sent somewhere, if you still need to work with the data, use the original un-escaped version.

    This is more work, but you can reduce it by using template engines or libraries.
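
    To illustrate the "encode just before sending" idea, here is a small sketch that keeps one raw value and encodes it separately for three destinations. The sample string and variable names are made up. For databases, parameterized queries are the usual way to get the same effect and are left out here to keep the example self-contained.

    ```php
    <?php
    // Keep the raw value exactly as received.
    $rawName = "O'Reilly & <Sons>";

    // Encode separately for each destination, right where the value is used.
    $forHtml = htmlspecialchars($rawName, ENT_QUOTES, 'UTF-8'); // HTML text or attribute value
    $forUrl  = rawurlencode($rawName);                           // URL path or query component
    $forJson = json_encode($rawName);                            // JSON payload

    echo $forHtml, "\n"; // O&#039;Reilly &amp; &lt;Sons&gt;
    echo $forUrl, "\n";  // O%27Reilly%20%26%20%3CSons%3E
    echo $forJson, "\n"; // "O'Reilly & <Sons>"
    ```

    All three results differ, and none of them is suitable for the other two destinations, which is why encoding once on input cannot work.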

  • 2020-12-10 17:46

    There was a good essay from Joel on Software ("Making Wrong Code Look Wrong", I think; I'm on my phone, otherwise I'd have a URL for you) that covered the correct use of Hungarian notation. The short version would be something like:

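    var
      dsFirstName, uhsFirstName: String;
    begin
      // Raw value from the request, not yet encoded for any destination
      uhsFirstName := request.queryfields.value['firstname'];
      // Encode for the database only at the point where it is needed
      dsFirstName := dsHtmlToDB(uhsFirstName);
    end;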

    Basically, prefix your variables with something like "us" for unsafe string, "ds" for database safe, "hs" for HTML safe. You only want to encode and decode where you actually need it, not everywhere. By using these prefixes, which convey a useful meaning, you'll see very quickly when looking at your code whether something isn't right. And you're going to need different encode/decode functions anyway.
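
    The same convention carries over to PHP. In this sketch the helper name hsFromUs and the sample fallback value are hypothetical; only the "us" and "hs" prefixes come from the description above.

    ```php
    <?php
    // Prefix convention: us = unsafe (raw) string, hs = HTML-safe string.
    // Hypothetical helper: the only way a us value becomes an hs value.
    function hsFromUs(string $us): string
    {
        return htmlspecialchars($us, ENT_QUOTES, 'UTF-8');
    }

    // Raw request data, with a made-up fallback so the script also runs from the command line.
    $usFirstName = $_GET['firstname'] ?? '<i>Ann</i>';
    $hsFirstName = hsFromUs($usFirstName);

    echo "<p>Hello, {$hsFirstName}</p>\n"; // looks right: only an hs value reaches the markup
    // echo "<p>Hello, {$usFirstName}</p>"; // looks wrong: a us value in HTML output
    ```

    The prefixes do nothing at runtime; their value is that a line mixing "us" data into HTML stands out during code review.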

  • 2020-12-10 17:51

    OWASP has a nice API to encode HTML output, either to use as HTML text (e.g. paragraph or <textarea> content) or as an attribute's value (e.g. for <input> tags after rejecting a form):

    encodeForHTML($input) // Encode data for use in HTML using HTML entity encoding
    encodeForHTMLAttribute($input) // Encode data for use in HTML attributes.
    

    The project (the PHP version) is hosted under http://code.google.com/p/owasp-esapi-php/ and is also available for some other languages, e.g. .NET.

    Remember that you should encode everything (not only user input), and as late as possible (not when storing in DB but when outputting the HTTP response).
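
    To show why the text and attribute contexts are treated separately, here is a stand-in built on PHP's htmlspecialchars. This is not the ESAPI implementation, and both function names below are hypothetical.

    ```php
    <?php
    // Element content: escaping &, < and > is enough, so quotes are left alone.
    function encodeText(string $s): string
    {
        return htmlspecialchars($s, ENT_NOQUOTES, 'UTF-8');
    }

    // Quoted attribute value: quotes must be escaped too, or the value can end the attribute.
    function encodeAttr(string $s): string
    {
        return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
    }

    // Made-up input that tries to break out of a value="..." attribute.
    $value = '" onfocus="alert(1)';

    echo '<p>', encodeText($value), "</p>\n";
    echo '<input name="q" value="', encodeAttr($value), "\">\n";
    ```

    The text encoder leaves the quotes in place, which is harmless inside a paragraph but would be unsafe inside an attribute, so each output context gets its own encoder.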
