Markdown and XSS

萌比男神i 2020-12-31 10:20

OK, so I have been reading about Markdown here on SO and elsewhere, and the steps between user input and the DB are usually given as:

convert markdown to html

5 answers

刺人心 (original poster)

2020-12-31 10:50
The usual order is:
1. convert markdown to html
2. sanitize html (w/whitelist)
3. insert into database
Here, the assumptions are
1. Given dangerous HTML, the sanitizer can produce safe HTML.
2. The definition of safe HTML will not change, so if it is safe when I insert it into the DB, it is safe when I extract it.
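
The convert, sanitize, store order can be sketched roughly as below. This is only an illustration under stated assumptions: `toy_markdown_to_html`, `WhitelistSanitizer`, `ALLOWED_TAGS` and the `comments` table are hypothetical names made up for the example, the converter is a stand-in that understands `**bold**` only, and a real site would use a maintained Markdown library and HTML sanitizer instead.

```python
import re
import sqlite3
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag names only; every attribute is dropped.
ALLOWED_TAGS = {"p", "em", "strong", "code"}


def toy_markdown_to_html(md):
    """Stand-in converter: handles **bold** only and, like most Markdown
    converters, passes raw HTML in the input through untouched."""
    return "<p>" + re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", md) + "</p>"


class WhitelistSanitizer(HTMLParser):
    """Re-emits whitelisted tags without attributes and escapes all text.
    Anything else (other tags, comments, declarations) is dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append("<" + tag + ">")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        # Text inside dropped tags is kept, but only as escaped, inert text.
        self.out.append(escape(data))


def sanitize_html(html):
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


def store_comment(conn, md):
    # Step 1: convert. Step 2: sanitize the HTML. Step 3: insert.
    safe_html = sanitize_html(toy_markdown_to_html(md))
    conn.execute("INSERT INTO comments (body) VALUES (?)", (safe_html,))
    return safe_html


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")
stored = store_comment(
    conn, "**hi** <script>alert(1)</script><img src=x onerror=alert(2)>"
)
print(stored)
```

The point of the design is that the sanitizer only ever sees HTML, the final output format, so it does not need to know anything about how the converter works.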

The alternative order, sanitizing the markdown first, is:
1. sanitize markdown (remove all tags - no exceptions)
2. convert to html
3. insert into database
Here the assumptions are
1. Given dangerous markdown, the sanitizer can produce markdown that when converted to HTML by a different program will be safe.
2. The definition of safe HTML will not change, so if it is safe when I insert it into the DB, it is safe when I extract it.
The markdown sanitizer has to know not just about dangerous HTML and dangerous markdown, but how the markdown->HTML converter does its job. That makes it more complex, and more likely to be wrong than the simpler unsafeHTML->safeHTML function above.
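
To illustrate why the markdown sanitizer must understand the converter, here is a small sketch, assuming a naive regex tag stripper and a toy link converter (`strip_tags` and `toy_markdown_to_html` are hypothetical, not any real library's API). The input contains no tags at all, so "remove all tags" has nothing to remove, yet the converter still produces a dangerous link.

```python
import re
from html import escape


def strip_tags(md):
    """Naive 'remove all tags' pass over the Markdown source."""
    return re.sub(r"<[^>]*>", "", md)


def toy_markdown_to_html(md):
    """Stand-in converter: escapes the text, then turns [text](url)
    into a link without checking the URL scheme."""
    escaped = escape(md, quote=True)
    return re.sub(r"\[([^\]]+)\]\((\S+)\)", r'<a href="\2">\1</a>', escaped)


# Dangerous Markdown that contains no HTML tags.
payload = "[click me](javascript:alert(1))"

cleaned = strip_tags(payload)          # nothing to strip
html_out = toy_markdown_to_html(cleaned)
print(html_out)
```

Whether a real converter emits such a link depends on the converter and its configuration, which is exactly the knowledge the markdown sanitizer would need to have.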

As a concrete example, "remove all tags" assumes you can identify tags, and would not work against UTF-7 attacks. There might be other encoding attacks out there that render this assumption moot, or there might be a bug that causes the markdown->HTML program to convert (full-width '<', exotic white-space characters stripped by markdown, SCRIPT) into a
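
A minimal sketch of the two encoding cases mentioned above, using only the Python standard library. The `strip_tags` helper is a hypothetical naive filter, and NFKC normalization is used here only as a stand-in for a converter bug that maps full-width characters to ASCII; no real converter is claimed to do this.

```python
import re
import unicodedata


def strip_tags(text):
    """Naive 'remove all tags' filter: only recognises literal < ... >."""
    return re.sub(r"<[^>]*>", "", text)


# UTF-7: '<' and '>' are written as +ADw- and +AD4-, so the filter
# sees no angle brackets and leaves the payload unchanged.
utf7_payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"
utf7_filtered = strip_tags(utf7_payload)
# A consumer that interprets the same bytes as UTF-7 gets real tags back.
utf7_decoded = utf7_filtered.encode("ascii").decode("utf-7")

# Full-width brackets (U+FF1C, U+FF1E) also pass the filter untouched.
wide_payload = "\uff1cscript\uff1e"
wide_filtered = strip_tags(wide_payload)
# Assumption: a later step normalizes them, here simulated with NFKC.
wide_normalized = unicodedata.normalize("NFKC", wide_filtered)

print(utf7_decoded)
print(wide_normalized)
```

In both cases the filter ran before the text took its final form, so its idea of what counts as a tag did not match what eventually reached the browser.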