Python eval: is it still dangerous if I disable builtins and attribute access?


We all know that eval is dangerous, even if you hide dangerous functions, because you can use Python's introspection features to dig down into things and re-extract them.

6 Answers

误落风尘

2020-12-02 13:03

Here is a safe_eval example which ensures that the evaluated expression does not contain unsafe AST nodes. It does not take the literal_eval approach of interpreting the AST itself; instead it whitelists the node types and calls the real eval if the expression passes the check.

# license: MIT (C) tardyp
import ast
import unittest


def safe_eval(expr, variables):
    """
    Safely evaluate a string containing a Python
    expression.  The string or node provided may only consist of the following
    Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
    and None. safe operators are allowed (and, or, ==, !=, not, +, -, ^, %, in, is)
    """
    _safe_names = {'None': None, 'True': True, 'False': False}
    _safe_nodes = [
        'Add', 'And', 'BinOp', 'BitAnd', 'BitOr', 'BitXor', 'BoolOp',
        'Compare', 'Dict', 'Eq', 'Expr', 'Expression', 'For',
        'Gt', 'GtE', 'Is', 'In', 'IsNot', 'LShift', 'List',
        'Load', 'Lt', 'LtE', 'Mod', 'Name', 'Not', 'NotEq', 'NotIn',
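        # 'Constant' (and 'NameConstant' on 3.4-3.7) is how newer Python 3
        # versions name literal nodes; 'Num' and 'Str' no longer appear there.
        'Constant', 'NameConstant',
        'Num', 'Or', 'RShift', 'Set', 'Slice', 'Str', 'Sub',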
        'Tuple', 'UAdd', 'USub', 'UnaryOp', 'boolop', 'cmpop',
        'expr', 'expr_context', 'operator', 'slice', 'unaryop']
    node = ast.parse(expr, mode='eval')
    for subnode in ast.walk(node):
        subnode_name = type(subnode).__name__
        if isinstance(subnode, ast.Name):
            if subnode.id not in _safe_names and subnode.id not in variables:
                raise ValueError("Unsafe expression {}. contains {}".format(expr, subnode.id))
        if subnode_name not in _safe_nodes:
            raise ValueError("Unsafe expression {}. contains {}".format(expr, subnode_name))

    return eval(expr, variables)



class SafeEvalTests(unittest.TestCase):

    def test_basic(self):
        self.assertEqual(safe_eval("1", {}), 1)

    def test_local(self):
        self.assertEqual(safe_eval("a", {'a': 2}), 2)

    def test_local_bool(self):
        self.assertEqual(safe_eval("a==2", {'a': 2}), True)

    def test_lambda(self):
        self.assertRaises(ValueError, safe_eval, "lambda : None", {'a': 2})

    def test_bad_name(self):
        self.assertRaises(ValueError, safe_eval, "a == None2", {'a': 2})

    def test_attr(self):
        self.assertRaises(ValueError, safe_eval, "a.__dict__", {'a': 2})

    def test_eval(self):
        self.assertRaises(ValueError, safe_eval, "eval('os.exit()')", {})

    def test_exec(self):
        self.assertRaises(SyntaxError, safe_eval, "exec 'import os'", {})

    def test_multiply(self):
        self.assertRaises(ValueError, safe_eval, "'s' * 3", {})

    def test_power(self):
        self.assertRaises(ValueError, safe_eval, "3 ** 3", {})

    def test_comprehensions(self):
        self.assertRaises(ValueError, safe_eval, "[i for i in [1,2]]", {'i': 1})


渐次进展

2020-12-02 13:04

I'm going to mention one of the new features of Python 3.6 - f-strings.

They can evaluate expressions,

>>> eval('f"{().__class__.__base__}"', {'__builtins__': None}, {})
"<class 'object'>"

but the attribute access won't be detected by Python's tokenizer:

0,0-0,0:            ENCODING       'utf-8'        
1,0-1,1:            ERRORTOKEN     "'"            
1,1-1,27:           STRING         'f"{().__class__.__base__}"'
2,0-2,0:            ENDMARKER      ''
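By contrast, a check built on the ast module (as in the safe_eval answer above) should still see inside the f-string, because the parser expands it into ordinary expression nodes. A minimal sketch of that observation; the helper name attribute_names is made up for illustration:

```python
import ast


def attribute_names(expr):
    """Return the set of attribute names accessed anywhere in expr."""
    tree = ast.parse(expr, mode='eval')
    # ast.walk also descends into the expressions embedded in an f-string
    return {node.attr for node in ast.walk(tree)
            if isinstance(node, ast.Attribute)}


found = attribute_names('f"{().__class__.__base__}"')
print(sorted(found))
```

The same text inside a plain (non-f) string yields no Attribute nodes at all, which is why the recursive-eval tricks in other answers still need separate handling.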


别跟我提以往

2020-12-02 13:06
Users can still DoS you by inputting an expression that evaluates to a huge number, which would fill your memory and crash the Python process, for example
```
'10**10**100'
```
I am definitely still curious if more traditional attacks, like recovering builtins or creating a segfault, are possible here.

EDIT:

It turns out, even Python's parser has this issue.
```
lambda: 10**10**100
```
will hang on affected Python versions, because the compiler tries to precompute the constant before the lambda is ever called.
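One way to avoid ever compiling such an expression is to inspect the parsed tree first and reject the operators that can blow up. This is only a sketch under the assumption that ast.parse by itself builds the tree without folding constants; the function name uses_risky_arithmetic is hypothetical, and the check does not cover every way to exhaust memory (large string or list repetition, for instance):

```python
import ast


def uses_risky_arithmetic(expr):
    """Return True if expr contains ** or <<, which can build huge integers."""
    tree = ast.parse(expr, mode='eval')
    # Operator nodes such as ast.Pow appear as children of BinOp in ast.walk
    return any(isinstance(node, (ast.Pow, ast.LShift))
               for node in ast.walk(tree))


print(uses_risky_arithmetic('10**10**100'))
print(uses_risky_arithmetic('1 + 2'))
```

A caller would run this check before handing the string to compile or eval, and refuse the input when it returns True.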
既然无缘

2020-12-02 13:07
It is possible to construct a return value from eval that would throw an exception outside eval if you tried to print, log, repr, anything:
```
eval('''((lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args))))
        (lambda f: lambda n: (1,(1,(1,(1,f(n-1))))) if n else 1)(300))''')
```
This creates a nested tuple of the form (1,(1,(1,(1...; that value cannot be printed (on Python 3) or passed to str or repr; all attempts to debug it lead to
```
RuntimeError: maximum recursion depth exceeded while getting the repr of a tuple
```
pprint and saferepr fail too:
```
...
  File "/usr/lib/python3.4/pprint.py", line 390, in _safe_repr
    orepr, oreadable, orecur = _safe_repr(o, context, maxlevels, level)
  File "/usr/lib/python3.4/pprint.py", line 340, in _safe_repr
    if issubclass(typ, dict) and r is dict.__repr__:
RuntimeError: maximum recursion depth exceeded while calling a Python object
```
Thus there is no safe built-in function to stringify this. The following helper could be of use:
```
def excsafe_repr(obj):
    try:
        return repr(obj)
    except Exception:
        # Fall back to the default object repr, which does not recurse into obj
        return object.__repr__(obj).replace('>', ' [exception raised]>')
```
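To see the helper in action without the lambda construction, a similar value can be built with a plain loop. This is a sketch, not part of the original answer: the depth of 200000 is an arbitrary figure chosen to be well beyond typical recursion limits, and the helper is repeated (catching Exception) so the snippet runs on its own:

```python
def excsafe_repr(obj):
    try:
        return repr(obj)
    except Exception:
        # Default object repr: type name and address only, no recursion
        return object.__repr__(obj).replace('>', ' [exception raised]>')


# Build (1, (1, (1, ...))) iteratively, so construction itself never recurses
nested = 1
for _ in range(200000):
    nested = (1, nested)

text = excsafe_repr(nested)
print(text)
```

The printed text is the bare object description with the marker appended, rather than a traceback.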
Then there is the problem that the print statement in Python 2 does not actually go through str/repr, so it has no recursion checks to protect you. That is, you cannot str or repr the return value of the lambda monster above, but the ordinary print statement (not print_function!) prints it nicely. However, you can exploit this to trigger a SIGSEGV on Python 2 if you know the value will be printed with the print statement:
```
print eval('(lambda i: [i for i in ((i, 1) for j in range(1000000))][-1])(1)')
```
This crashes Python 2 with SIGSEGV. It is WONTFIX in the bug tracker. Thus never use print-the-statement if you want to be safe; use from __future__ import print_function!

This is not a crash, but
```
eval('(1,' * 100 + ')' * 100)
```
when run, outputs
```
s_push: parser stack overflow
Traceback (most recent call last):
  File "yyy.py", line 1, in <module>
    eval('(1,' * 100 + ')' * 100)
MemoryError
```
The MemoryError can be caught; it is a subclass of Exception. The parser has some really conservative limits to avoid crashes from stack overflows (pun intended). However, s_push: parser stack overflow is written to stderr by C code and cannot be suppressed.

And just yesterday I asked why Python 3.4 would not be fixed for a crash from:
```
% python3  
Python 3.4.3 (default, Mar 26 2015, 22:03:40) 
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     def f(self):
...         nonlocal __x
... 
[4]    19173 segmentation fault (core dumped)  python3
```
and Serhiy Storchaka's answer confirmed that Python core devs do not consider SIGSEGV on seemingly well-formed code a security issue:

Only security fixes are accepted for 3.4.

Thus it can never be considered safe to execute third-party code in Python, sanitized or not.

And Nick Coghlan then added:

And as some additional background as to why segmentation faults provoked by Python code aren't currently considered a security bug: since CPython doesn't include a security sandbox, we're already relying entirely on the OS to provide process isolation. That OS level security boundary isn't affected by whether the code is running "normally", or in a modified state following a deliberately triggered segmentation fault.
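Following that reasoning, one option is to push the evaluation into a separate OS process, so that a crash or a hang takes down only the child. The sketch below is an illustration under stated assumptions, not a sandbox: the names eval_in_subprocess and _CHILD_CODE are hypothetical, the 10 second timeout is arbitrary, and the child still runs with the same user privileges as the parent:

```python
import subprocess
import sys

# Hypothetical child program: evaluates argv[1] with empty builtins
# and prints the repr of the result.
_CHILD_CODE = (
    "import sys; "
    "print(repr(eval(sys.argv[1], {'__builtins__': {}}, {})))"
)


def eval_in_subprocess(expr, timeout=10.0):
    """Evaluate expr in a separate interpreter and return the printed repr."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", _CHILD_CODE, expr],
        capture_output=True, text=True, timeout=timeout,
    )
    if proc.returncode != 0:
        # On POSIX a negative code means the child died from a signal
        raise RuntimeError("child exited with code {}".format(proc.returncode))
    return proc.stdout.strip()


print(eval_in_subprocess("1 + 2"))
```

If the child exceeds the timeout, subprocess.run kills it and raises subprocess.TimeoutExpired in the parent, so a runaway expression does not hang the caller.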
一生所求

2020-12-02 13:07
Controlling the locals and globals dictionaries is extremely important. Otherwise, someone could just pass in eval or exec and call it recursively:
```
safe_eval('''e("""[c for c in ().__class__.__base__.__subclasses__() 
    if c.__name__ == \'catch_warnings\'][0]()._module.__builtins__""")''', 
    globals={'e': eval})
```
The expression in the recursive eval is just a string.

You also need to set the eval and exec names in the global namespace to something that isn't the real eval or exec. The global namespace is important. If you only override them in a local namespace, anything that creates a separate namespace, such as comprehensions and lambdas, will work around it:
```
safe_eval('''[eval("""[c for c in ().__class__.__base__.__subclasses__()
    if c.__name__ == \'catch_warnings\'][0]()._module.__builtins__""") for i in [1]][0]''', locals={'eval': None})

safe_eval('''(lambda: eval("""[c for c in ().__class__.__base__.__subclasses__()
    if c.__name__ == \'catch_warnings\'][0]()._module.__builtins__"""))()''',
    locals={'eval': None})
```
Again, here, safe_eval only sees a string and a function call, not attribute accesses.
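The difference between the two namespaces can be checked directly with the built-in eval. In this sketch the harmless name len stands in for eval, and only the lambda case is shown, since how comprehensions resolve names may vary between Python versions:

```python
# Shadowing the name only in the locals mapping: the lambda body has its own
# scope, skips those locals, and falls through to the real builtin.
via_locals = eval("(lambda: len)()", {}, {"len": None})

# Shadowing the same name in the globals mapping: the lambda body sees it.
via_globals = eval("(lambda: len)()", {"len": None}, {})

# At the top level of the expression, the locals override does apply.
top_level = eval("len", {}, {"len": None})

print(via_locals, via_globals, top_level)
```

Only the globals override reaches code that runs inside the lambda, which matches the advice above to replace the names in the global namespace.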

You also need to clear out the safe_eval function itself, if it has a flag to disable safe parsing. Otherwise you could simply do:
```
safe_eval('safe_eval("<dangerous code>", safe=False)')
```
独厮守ぢ

2020-12-02 13:12
I don't believe Python is designed to have any security against untrusted code. Here's an easy way to induce a segfault via stack overflow (on the C stack) in the official Python 2 interpreter:
```
eval('()' * 98765)
```
From my answer to the "Shortest code that returns SIGSEGV" Code Golf question.