Question
Edit: tchrist has informed me that my original accusations about Perl's insecurity are unfounded. However, the question still stands.
I know that in Perl, you can embed arbitrary code in a regular expression, so obviously accepting a user-supplied regex and matching it allows arbitrary code execution and is a clear security hole. But is this true for all languages that use regular expressions? Is it true for all languages that use "Perl-compatible" regular expressions? In which languages are user-supplied regexes safe to use, and in which languages do they allow arbitrary code execution or other security holes?
Answer 1:
In most languages, allowing users to supply regular expressions means that you open yourself up to a denial-of-service attack.
Some types of regular expressions are extremely CPU-intensive to execute, so in general it's a bad idea to let users enter regular expressions that will be executed on a remote system.
For more info, read this page: http://www.regular-expressions.info/catastrophic.html
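One way to limit the damage is to run the user's pattern under a time budget. The following is a minimal sketch in Python, not something from the answer above: the helper name `search_with_timeout` and the exit-code convention are assumptions made for illustration. It runs the match in a child interpreter because Python's `re` module has no built-in timeout.

```python
import subprocess
import sys

# Child program: runs the untrusted pattern in a separate interpreter so
# that a runaway match can be killed without affecting the parent.
_CHILD_SOURCE = """
import re, sys
pattern, text = sys.argv[1], sys.argv[2]
try:
    compiled = re.compile(pattern)
except re.error:
    sys.exit(2)          # exit code 2 (our convention): pattern did not parse
print("1" if compiled.search(text) else "0")
"""


def search_with_timeout(pattern, text, timeout_s=1.0):
    """Return True/False for match/no match, or None if the budget ran out.

    Raises ValueError if the pattern is not a valid regular expression.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", _CHILD_SOURCE, pattern, text],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # run() kills the child when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode == 2:
        raise ValueError("invalid regular expression")
    result.check_returncode()  # surface any other child failure
    return result.stdout.strip() == "1"
```

Passing the text on the command line keeps the sketch short but limits input size; a real service would more likely feed it through a pipe and also cap memory.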
Answer 2:
This is not true: you cannot execute code callbacks in Perl by sneaking them into an interpolated regex. This is forbidden. You have to specifically override that with a lexically scoped
use re "eval";
if you expect to have both interpolation and code escapes happening in the same pattern.
Watch:
% perl -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
Eval-group not allowed at runtime, use re 'eval' in regex m/(?{ die naughty })/ at -e line 1.
Exit 255
% perl -Mre=eval -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
naughty at (re_eval 1) line 1.
Exit 255
Answer 3:
It's generally dynamic languages with an eval facility that tend to have the ability to execute code from regular expressions. In static languages (i.e. those requiring a separate compilation step) there is generally no way to execute code that wasn't compiled, so evaluating code from within a regex is impossible.
Without a way to embed code in a regex, the worst a user can do is write a regex that takes a long time to evaluate.
Answer 4:
1) Vulnerabilities are found in regex libraries, such as this buffer overflow that affects WebKit and allows an attacker to gain remote code execution by reaching the regex library from JavaScript.
2) It is a DoS condition in C#.
3) User-supplied regexes can be dangerous in PHP because of modifiers. Adding the /e modifier makes preg_replace eval the replacement string for each match. In this case system will be eval()'ed.
preg_replace("/.*/e", "system('echo /etc/passwd')", "");
Or in the form of a vulnerability:
preg_replace($_GET['regex'], $_GET['check'], $subject); // $subject: placeholder for the text being processed
Answer 5:
Regular expressions are a programming language. I don't think they're quite Turing-complete, but they're close enough that allowing your users to enter them into your web site IS allowing other people to run code on your server. QED, yes, it's a security hole.
You might be able to get away with allowing a subset of whatever regexp language you want to use, whitelisting a particular set of constructs to make it a not-big-enough-to-sweat-over hole... other people have already mentioned the possible dooms of nesting and *. How much you're willing to let people load down your server is up to you. Personally, I'd be comfortable with letting 'em have one SQL "CONTAINS" statement and maybe a "BETWEEN()". :)
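As a rough illustration of the "allow only a subset" idea, here is a Python sketch that accepts just two wildcards and treats every other character as a literal. The function name `compile_wildcard` and the `MAX_WILDCARDS` cap are assumptions made for this example, not part of any library.

```python
import re

MAX_WILDCARDS = 4  # assumed cap; repeated ".*" can still backtrack polynomially


def compile_wildcard(user_pattern):
    """Compile a pattern where only '*' and '?' are special.

    '*' matches any run of characters, '?' matches exactly one character,
    and everything else (including regex metacharacters) is literal.
    """
    if user_pattern.count("*") > MAX_WILDCARDS:
        raise ValueError("too many wildcards")
    parts = []
    for ch in user_pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # neutralize regex metacharacters
    return re.compile("".join(parts), re.DOTALL)
```

With this, `compile_wildcard("report-*.txt").fullmatch(name)` behaves like a simple glob, and a string such as `(a+)+` is matched literally instead of being interpreted as nested quantifiers.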
Answer 6:
I suspect ruby would allow /#{system("rm -rf really_important_directory")}/ - is that the kind of thing you're worried about?
Answer 7:
User-supplied regexes, like user input in general, should never be treated as safe, regardless of the programming language. If your program does treat them as safe, it is vulnerable to attacks by deliberately crafted inputs.
In the case of regexes, the attack can be ReDoS: Regular expression Denial of Service. Basically, a regex which consumes an excessive amount of CPU and memory to process.
For example, if you try to evaluate this regex
^(([a-z])+.)+[A-Z]([a-z])+$
on this input:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!
you'll notice it may hang. This is called catastrophic backtracking. See it for yourself here: https://regex101.com/r/Qhn3Vb/1
Read more about Regex DoS: https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS
Bottom line: never assume user input is safe!
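To see the effect locally, the regex and input shape above can be timed with a small Python sketch. The helper name `time_failed_match` and the chosen lengths are assumptions for illustration; the lengths are kept well below the 32 letters used above so the script finishes quickly, and actual timings depend on the machine and regex engine.

```python
import re
import time

# The pattern and the shape of the input come from the answer above.
PATTERN = re.compile(r"^(([a-z])+.)+[A-Z]([a-z])+$")


def time_failed_match(n):
    """Time one failed match against n letters followed by '!'."""
    subject = "a" * n + "!"
    start = time.perf_counter()
    matched = PATTERN.match(subject)  # never matches: no uppercase letter
    elapsed = time.perf_counter() - start
    return matched, elapsed


if __name__ == "__main__":
    # Each step adds only four characters to the input.
    for n in (8, 12, 16, 20):
        matched, elapsed = time_failed_match(n)
        print(f"n={n:2d}  matched={matched}  seconds={elapsed:.4f}")
```

In a backtracking engine the time should climb much faster than the input length, which is what makes such a pattern usable for a denial of service.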
Answer 8:
AFAIK, you can do it safely in C#: you can supply the regex string to the Regex constructor, and if it fails to parse it'll throw. I'm not sure about others.
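The same parse-and-catch idea can be sketched in Python (used here only for illustration, since the answer is about C#); the helper name `try_compile` is hypothetical. Note that a pattern that parses cleanly can still take a very long time to match, so this covers syntax errors only.

```python
import re


def try_compile(user_pattern):
    """Return a compiled pattern, or None if the text is not a valid regex."""
    try:
        return re.compile(user_pattern)
    except re.error:
        # Raised by the re module when the pattern fails to parse.
        return None
```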
Source: https://stackoverflow.com/questions/4289923/in-which-languages-is-it-a-security-hole-to-use-user-supplied-regular-expression