Sandboxing/running Python code line by line

Posted by 谁说我不能喝 on 2019-11-27 20:34:51

Question


I'd love to be able to do something like these two are doing:

Inventing on Principle @18:20, Live ClojureScript Game Editor

If you don't wanna check the videos, my problem is this:

Say I had this code:

....
xs = []
for x in xrange(10):
    xs.append(x)
...

I'd like to make an environment where I can execute the code statement by statement and watch/trace the locals/globals as they change, perhaps giving it a list of variables to keep track of in the locals/globals dictionaries. In other words, stepping through the code and saving the state information at each step.

Ideally I'd like to save every state and its associated context data (locals/globals) so that I can, for instance, verify predicates against them.

I'd like to do something like Bret Victor's binarySearch example Inventing on principle @18:20

Am I making sense? I find it complicated to explain in text, but the videos showcase what I want to try :)

Thanks for your time


What I've tried/read/googled:

  • code.InteractiveConsole / code.InteractiveInterpreter
  • the livecoding module: seems to work for pure functional/stateless code
  • exec / eval magic: it seems I can't get control as fine-grained as I'd like.
  • the trace module doesn't seem to be the way either.
  • Python eval(compile(...), sandbox), globals go in sandbox unless in def, why? <-- This is close to what I want, but it compiles the whole string/code block and runs it in one step. If only I could run a file like this but check the locals between every line/statement.
  • run python source code line by line <-- This is not what I want
  • How do Ruby and Python implement their interactive consoles? <-- This topic suggests that I look into the code module some more

My next step would be to look into ast, compiling the code and running it bit by bit, but I really need some guidance. Should I look more into reflection and the inspect module?

I've used the Spin model checker before, but it uses its own DSL and I'd just love to do the modelling in the implementation language, in this case Python.

Oh and BTW I know about the security implications of sandboxing code, but I'm not trying to make a secure execution environment, I'm trying to make a very interactive environment, aiming for crude model checking or predicate assertion for instance.


Answer 1:


After my initial success with sys.settrace(), I ended up switching to the ast module (abstract syntax trees). I parse the code I want to analyse and then insert new calls after each assignment to report on the variable name and its new value. I also insert calls to report on loop iterations and function calls. Then I execute the modified tree.

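        # Fragment from a method in the author's project (Python 2).
        # parse and fix_missing_locations come from the ast module;
        # TraceAssignments, PSEUDO_FILENAME, CONTEXT_NAME and builder
        # are defined elsewhere in that project.
        tree = parse(source)

        visitor = TraceAssignments()
        new_tree = visitor.visit(tree)
        fix_missing_locations(new_tree)

        code = compile(new_tree, PSEUDO_FILENAME, 'exec')

        self.environment[CONTEXT_NAME] = builder
        exec code in self.environment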

I'm working on a live coding tool like Bret Victor's, and you can see my working code on GitHub, and some examples of how it behaves in the test. You can also find links to a demo video, tutorial, and downloads from the project page.
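The fragment above does not show the visitor itself. As a rough illustration only, here is a Python 3 (3.8+) sketch of what such a tree rewrite could look like. It is not the code from the author's project: the names ReportAssignments, __report__ and trace are made up for this example, and only plain `name = value` assignments are handled (augmented assignments, loop targets and function calls are ignored).

```python
import ast


class ReportAssignments(ast.NodeTransformer):
    """Hypothetical visitor: after `name = value`, insert `__report__('name', name)`."""

    def visit_Assign(self, node):
        self.generic_visit(node)
        reports = []
        for target in node.targets:
            if isinstance(target, ast.Name):
                call = ast.Call(
                    func=ast.Name(id="__report__", ctx=ast.Load()),
                    args=[ast.Constant(value=target.id),
                          ast.Name(id=target.id, ctx=ast.Load())],
                    keywords=[],
                )
                # Give the new statement the same location as the assignment.
                reports.append(ast.copy_location(ast.Expr(value=call), node))
        # Returning a list replaces the one statement with several.
        return [node] + reports


def trace(source):
    """Run source with reporting calls injected; return the list of (name, value) events."""
    events = []
    tree = ReportAssignments().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    env = {"__report__": lambda name, value: events.append((name, value))}
    exec(compile(tree, "<traced>", "exec"), env)
    return events


events = trace("c = 0\nfor i in range(3):\n    c = c + i\n")
print(events)  # [('c', 0), ('c', 0), ('c', 1), ('c', 3)]
```

Because the reporting function is just a name looked up in the execution environment, it can be swapped for anything that stores or checks the values, which is roughly the role CONTEXT_NAME and builder seem to play in the fragment above.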




Answer 2:


Update: After my initial success with this technique, I switched to using the ast module as described in my other answer.

sys.settrace() seems to work really well. I took the hacks question you mentioned and Andrew Dalke's article and got this simple example working.

import sys

def dump_frame(frame, event, arg):
    print '%d: %s' % (frame.f_lineno, event)
    for k, v in frame.f_locals.iteritems():
        print '    %s = %r' % (k, v)
    return dump_frame

def main():
    c = 0
    for i in range(3):
        c += i

    print 'final c = %r' % c

sys.settrace(dump_frame)

main()

I had to solve two problems to get this working.

  1. The trace function has to return itself or another trace function if you want to continue tracing.
  2. Tracing only begins with the next function call, because sys.settrace() applies to frames entered after it is called rather than to the one already running. I originally didn't have the main function, and just went directly into a loop.

Here's the output:

9: call
10: line
11: line
    c = 0
12: line
    i = 0
    c = 0
11: line
    i = 0
    c = 0
12: line
    i = 1
    c = 0
11: line
    i = 1
    c = 1
12: line
    i = 2
    c = 1
11: line
    i = 2
    c = 3
14: line
    i = 2
    c = 3
final c = 3
14: return
    i = 2
    c = 3
38: call
    item = <weakref at 0x7febb692e1b0; dead>
    selfref = <weakref at 0x17cc730; to 'WeakSet' at 0x17ce650>
38: call
    item = <weakref at 0x7febb692e100; dead>
    selfref = <weakref at 0x7febb692e0a8; to 'WeakSet' at 0x7febb6932910>
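The trailing `38: call` lines in this output appear to come from library code that runs after main() returns, while the trace function is still installed. A possible variation, sketched here in Python 3, records snapshots instead of printing them, restricts tracing to one function's code object, and removes the trace function afterwards. The helper name record_lines is an assumption made for this example, not part of any library.

```python
import sys


def record_lines(func, *args):
    """Call func(*args); return (result, snapshots), one (lineno, locals) pair per executed line."""
    snapshots = []
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None  # do not trace frames of other functions
        if event == "line":
            # Shallow copy: mutable values may still change later.
            snapshots.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer  # keep tracing this frame

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # stop tracing so later library code is not reported
    return result, snapshots


def main():
    c = 0
    for i in range(3):
        c += i
    return c


result, snapshots = record_lines(main)
for lineno, local_vars in snapshots:
    print(lineno, local_vars)
```

Each snapshot is taken just before the line at that number runs, so a predicate can be checked against the saved states afterwards, for example `all(s.get("c", 0) >= 0 for _, s in snapshots)`.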



Answer 3:


It sounds like you need bdb, the Python debugger framework. It's part of the standard library, and the docs are here: http://docs.python.org/library/bdb.html

It doesn't have all of the functionality you seem to want, but it's a sensible place to start implementing it.
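As a starting point, a bdb.Bdb subclass can override user_line(), which the framework calls whenever it stops at a line. The Python 3 sketch below records a few watched variables at each line of a source string. The class name Recorder and its watch parameter are assumptions for this example, and deep-copying the values assumes they are plain data.

```python
import bdb
import copy


class Recorder(bdb.Bdb):
    """Hypothetical recorder: saves the watched variables before each line runs."""

    def __init__(self, watch):
        super().__init__()
        self.watch = watch
        self.snapshots = []

    def user_line(self, frame):
        # Bdb.run() compiles a source string under the filename "<string>";
        # ignore lines from any other file that gets stepped into.
        if frame.f_code.co_filename == "<string>":
            state = {
                name: copy.deepcopy(frame.f_locals[name])
                for name in self.watch
                if name in frame.f_locals
            }
            self.snapshots.append((frame.f_lineno, state))
        self.set_step()  # keep stopping at every line


source = "xs = []\nfor x in range(3):\n    xs.append(x)\n"
rec = Recorder(watch=["xs", "x"])
rec.run(source, {})

for lineno, state in rec.snapshots:
    print(lineno, state)
```

The same subclass could also override user_call() and user_return() to record function entry and exit.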




Answer 4:


Okay guys, I've made a bit of progress.

Say we have a source file like this that we want to run statement by statement:

print("single line")
for i in xrange(3):
    print(i)
    print("BUG, executed outside for-scope, so only run once")
if i < 0:
    print("Should not get in here")
if i > 0:
    print("Should get in here though")

I want to execute it one statement at a time, while having access to the locals/globals. This is a quick dirty proof of concept (disregard the bugs and crudeness):

import re

# `source` is assumed to hold the program text shown above, as a single string

# returns matched text if found
def re_match(regex, text):
    m = regex.match(text)
    if m: return m.groups()[0]

# regex patterns
newline = "\n"
indent = "[ ]{4}"
line = r"[\w \"'().,=<>-]*[^:]"
block = "%s:%s%s%s" % (line, newline, indent, line)

indent_re = re.compile(r"^%s(%s)$" % (indent, line))
block_re = re.compile(r"^(%s)$" % block)
line_re = re.compile(r"^(%s)$" % (line))

buf = ""
indent = False

# parse the source using the regex-patterns
for l in source.split(newline):
    buf += l + newline              # add the newline we removed by splitting

    m = re_match(indent_re, buf)    # is the line indented?
    if m: 
        indent = True               # yes it is
    else:
        if indent:                  # else, were we indented previously?
            indent = False          # okay, now we aren't

    m = re_match(block_re, buf)     # are we starting a block ?
    if m:
        indent = True
        exec(m)
        buf = ""
    else:
        if indent: buf = buf[4:]   # hack to remove indentation before exec'ing
        m = re_match(line_re, buf) # single line statement then?
        if m:
            exec(m) # execute the buffer, reset it and start parsing
            buf = ""
        # else no match! add a line more to the buffer and try again

Output:

morten@laptop /tmp $ python p.py
single line
0
1
2
BUG, executed outside for-scope, so only run once
Should get in here though

So this is somewhat what I want. This code breaks the source into executable statements, and I'm able to "pause" in between statements and manipulate the environment. As the output shows, though, I haven't figured out how to split up nested blocks properly: the second statement in the for body runs only once, outside the loop. This made me think that I should be able to use some tool to parse the code and run it the way I want. Right now I'm thinking ast or pdb, like you guys suggest.

A quick look suggests ast can do this, but it seems a bit complex so I'll have to dig into the docs. If pdb can control the flow programmatically, that may very well be the answer too.
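To illustrate the ast idea, here is a minimal Python 3 (3.8+) sketch that lets the parser split the source into top-level statements, instead of using regexes, and executes them one at a time in a shared namespace. The function name run_statements is made up for this example. It treats each compound statement (such as a whole for loop) as a single step, and deep-copying the namespace assumes the values are plain data.

```python
import ast
import copy


def run_statements(source, env=None):
    """Execute source one top-level statement at a time.

    Yields (lineno, snapshot) after each statement, where snapshot is a
    deep copy of the user-visible names in the shared namespace.
    """
    env = {} if env is None else env
    tree = ast.parse(source)
    for node in tree.body:
        # Wrap the single statement in its own module so it can be compiled.
        module = ast.Module(body=[node], type_ignores=[])
        exec(compile(module, "<step>", "exec"), env)
        snapshot = {
            name: copy.deepcopy(value)
            for name, value in env.items()
            if not name.startswith("__")  # skip __builtins__ and similar
        }
        yield node.lineno, snapshot


source = "xs = []\nfor x in range(3):\n    xs.append(x)\ny = len(xs)\n"
steps = list(run_statements(source))
for lineno, snapshot in steps:
    print(lineno, snapshot)
```

Because run_statements is a generator, the caller can pause between statements, inspect or modify env, and then resume. Stepping inside a loop body would need something finer, such as rewriting the tree or using a trace function.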

Update:

Sooo, I did some more reading and I found this topic: What cool hacks can be done using sys.settrace?

I looked into using sys.settrace(), but it doesn't seem to be the way to go. I am getting more and more convinced that I need to use the ast module to get control as fine-grained as I would like. FWIW, here's the code that uses settrace() to peek at function-scope variables:

import sys

def trace_func(frame,event,arg):
    print "trace locals:"
    for l in frame.f_locals:
        print "\t%s = %s" % (l, frame.f_locals[l])

def dummy(ls):
    for l in ls: pass

sys.settrace(trace_func)
x = 5
dummy([1, 2, 3])
print "whatisthisidonteven-"

Output:

morten@laptop /tmp $ python t.py 
trace locals:
    ls = [1, 2, 3]
whatisthisidonteven-
trace locals:
    item = <weakref at 0xb78289b4; dead>
    selfref = <weakref at 0xb783d02c; to 'WeakSet' at 0xb783a80c>
trace locals:
    item = <weakref at 0xb782889c; dead>
    selfref = <weakref at 0xb7828504; to 'WeakSet' at 0xb78268ac>

UPDATE:

Okay, I seem to have solved it. :) I've written a simple parser that injects a statement between each line of code and then executes the code. This statement is a function call that captures and saves the local environment in its current state.

I'm working on a Tkinter text editor with two windows that'll do what Bret Victor does in his binarySearch demo. I'm almost done :)




Answer 5:


For simple tracing I suggest you use pdb. I've found it's quite reasonable for most debugging/single stepping purposes. For your example:

import pdb
...
xs = []
pdb.set_trace()
for x in xrange(10):
    xs.append(x)

Now your program will stop at the set_trace() call, and you can use n (next, which steps over function calls) or s (step, which steps into them) to walk through your code while it's executing. AFAIK pdb uses bdb as its backend.




Answer 6:


I see you've come up with something that works for you, but thought it would be worth mentioning 'pyscripter'. http://code.google.com/p/pyscripter/

I'm pretty new to Python, but I'm finding it very useful to simply click past the line that has a variable I want to check, then press F4 to run to that point in debugger mode. After that I can just hover the mouse over the variable and a tooltip pops up showing its value.

You can also single-step through the script with F7, as described here:
http://openbookproject.net/thinkcs/python/english3e/functions.html#flow-of-execution
(see 'Watch the flow of execution in action')

Although when I followed the example it still stepped into the turtle module for some reason.




Answer 7:


Download Eclipse + PyDev and run it in debug mode...



Source: https://stackoverflow.com/questions/9670931/sandboxing-running-python-code-line-by-line
