In which languages is it a security hole to use user-supplied regular expressions?

爱一瞬间的悲伤 2020-12-17 17:00

Edit: tchrist has informed me that my original accusations about Perl's insecurity are unfounded. However, the question still stands.

I know that i

8 answers
  • 2020-12-17 17:06

    A user-supplied regex, like user input in general, should never be treated as safe, regardless of the programming language. If your program does treat it as safe, it is vulnerable to attacks using deliberately crafted input.

    In the case of regex, the attack can be ReDoS (Regular expression Denial of Service): basically, a regex that consumes an excessive amount of CPU and memory to process.

    For example, if you try to evaluate this regex

    ^(([a-z])+.)+[A-Z]([a-z])+$
    

    on this input:

    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!
    

    you'll notice that it may hang. This is called catastrophic backtracking. See it for yourself here: https://regex101.com/r/Qhn3Vb/1
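
    The growth can be observed with a small timing sketch. It assumes Python's backtracking `re` engine, and `time_match` is a hypothetical helper name. The input lengths are kept well below the 32 letters above so that the script finishes quickly; the absolute timings depend on the machine.

```python
import re
import time

# The pattern from the answer above. No input made only of lowercase
# letters plus "!" can match it, because [A-Z] never finds an uppercase
# letter, so the engine has to try every way of splitting the run of a's.
EVIL = re.compile(r"^(([a-z])+.)+[A-Z]([a-z])+$")


def time_match(n):
    """Match EVIL against n letters followed by '!' and time the attempt."""
    text = "a" * n + "!"
    start = time.perf_counter()
    matched = EVIL.match(text) is not None
    return matched, time.perf_counter() - start


# Each step adds only four characters, yet the time grows much faster.
for n in (10, 14, 18, 22):
    matched, seconds = time_match(n)
    print(f"n={n:2d} matched={matched} time={seconds:.4f}s")
```

    Raising `n` toward 32 makes the wait grow quickly, which is the hang described above.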

    Read more about Regex DoS: https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS


    Bottom line: never assume user input is safe!

  • 2020-12-17 17:10

    It's generally dynamic languages with an eval facility that tend to have the ability to execute code from regular expressions. In static languages (i.e. those requiring a separate compilation step) there is generally no way to execute code that wasn't compiled, so evaluating code from within a regex is impossible.

    Without a way to embed code in a regex, the worst a user can do is write a regex that takes a long time to evaluate.

  • 2020-12-17 17:13

    This is not true: you cannot execute code callbacks in Perl by sneaking them in an evaluated regex. This is forbidden. You have to specifically override that with a lexically scoped

    use re "eval";
    

    if you expect to have both interpolation and code escapes happening in the same pattern.

    Watch:

    % perl -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
    Eval-group not allowed at runtime, use re 'eval' in regex m/(?{ die naughty })/ at -e line 1.
    Exit 255
    
    % perl -Mre=eval -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
    naughty at (re_eval 1) line 1.
    Exit 255
    
  • 2020-12-17 17:20

    In most languages, allowing users to supply regular expressions means that you open yourself up to a denial-of-service attack.

    Some regular expressions are extremely CPU-intensive to execute, so in general it's a bad idea to allow users to enter regular expressions that will be executed on a remote system.

    For more info, read this page: http://www.regular-expressions.info/catastrophic.html
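
    If user-supplied patterns must be accepted anyway, one common mitigation is to cap how long a match may run. The sketch below is only one possible approach, not something prescribed in this thread: Python's standard `re` module has no timeout parameter, so the match runs in a child interpreter that is killed when the time limit expires. The names `safe_search` and `_CHILD` and the exit-code convention are assumptions made for this example.

```python
import subprocess
import sys

# Script executed in a child interpreter. Exit codes (an assumption of
# this sketch): 0 = match, 1 = no match, 2 = pattern failed to compile.
_CHILD = r"""
import re, sys
try:
    pattern = re.compile(sys.argv[1])
except re.error:
    sys.exit(2)
sys.exit(0 if pattern.search(sys.argv[2]) else 1)
"""


def safe_search(pattern, text, timeout=1.0):
    """Return True or False for match or no match, None if rejected.

    A pattern is rejected when it is invalid or when matching does not
    finish within `timeout` seconds.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            timeout=timeout,  # the child is killed when this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode == 0:
        return True
    if proc.returncode == 1:
        return False
    return None


if __name__ == "__main__":
    print(safe_search(r"a+b", "xaab", timeout=5.0))
    print(safe_search(r"^(([a-z])+.)+[A-Z]([a-z])+$", "a" * 64 + "!"))
```

    Starting a process per match is expensive, so this only suits low-volume use. It bounds the damage from a slow pattern but does not make the pattern itself safe.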

  • 2020-12-17 17:23

    Regular expressions are a programming language. I don't think they're quite Turing-complete, but they're close enough that allowing your users to enter them into your web site IS allowing other people to run code on your server. QED, yes, it's a security hole.

    You might be able to get away with allowing a subset of whatever regexp language you want to use, whitelist a particular set of constructs to make it a not-big-enough-to-sweat-over hole... other people have already mentioned the possible dooms of nesting and * . How much you're willing to let people load down your server is up to you. Personally, I'd be comfortable with letting 'em have one SQL "CONTAINS" statement and maybe a "BETWEEN()". :)
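
    One way to read "a subset" is to accept only a tiny wildcard syntax and translate it to a regex yourself, so that no user-written regex construct ever reaches the engine. The sketch below is an illustration under that assumption: the function name `wildcard_to_regex`, the choice of `*` and `?` as the only special characters, and the `max_wildcards` limit are all hypothetical.

```python
import re


def wildcard_to_regex(user_pattern, max_wildcards=3):
    """Translate a pattern where only '*' and '?' are special into a regex.

    Every other character is escaped, so groups, nested quantifiers and
    the like are treated as literal text.
    """
    parts = []
    stars = 0
    for ch in user_pattern:
        if ch == "*":
            if parts and parts[-1] == ".*":
                continue  # collapse runs of '*' into a single one
            stars += 1
            if stars > max_wildcards:
                raise ValueError("too many '*' wildcards")
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


if __name__ == "__main__":
    print(bool(wildcard_to_regex("report-*.txt").fullmatch("report-2020.txt")))
    # The "evil" syntax is matched literally, not interpreted as a regex.
    print(bool(wildcard_to_regex("(a+)+").fullmatch("aaaa")))
```

    Because the wildcards cannot be nested, matching cost is at worst polynomial in the input length, with the degree bounded by `max_wildcards`, rather than exponential. Limiting the length of the input text as well keeps that bound small.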

  • 2020-12-17 17:27

    I suspect Ruby would allow /#{system("rm -rf really_important_directory")}/ - is that the kind of thing you're worried about?
