Why does a background task block the response in SimpleHTTPServer?

Question

I'm writing a simple browser-based front end that should be able to launch a background task and then get progress from it. I want the browser to receive a response saying whether the task launched successfully, and then poll to determine when it is done. However, the presence of a background task seems to be stopping the XMLHttpRequest response from being sent immediately, so I can't report the success of launching the process. Consider the following (simplified) code:

import SocketServer
import SimpleHTTPServer
import multiprocessing
import time

class MyProc(multiprocessing.Process):
    def run(self):
        print 'Starting long process..'
        for i in range(100): time.sleep(1)
        print 'Done long process'

class Page(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/':
            print >>self.wfile, "<html><body><a href='/run'>Run</a></body></html>"
        if self.path == '/run':
            self.proc = MyProc()
            print 'Starting..'
            self.proc.start()
            print 'After start.'
            print >>self.wfile, "Process started."

httpd = SocketServer.TCPServer(('', 8000), Page)
httpd.serve_forever()

When I run this and browse to http://localhost:8000, I get a link named "Run". When I click on it, the terminal displays:

Starting..
After start.

However, the browser view does not change; in fact, the cursor keeps spinning. Only when I press Ctrl-C in the terminal to interrupt the program is the browser updated with the message Process started.

The message After start is clearly being printed. Therefore I can assume that do_GET is returning after starting the process. Yet, the browser doesn't get a response until after I interrupt the long-running process. I have to conclude there is something blocking between do_GET and the response being sent, which is inside SimpleHTTPServer.

I've also tried this with threads and subprocess.Popen but ran into similar problems. Any ideas?

Answer 1:

In addition to Steve's and my comments above, here is a solution that works for me.

Computing the Content-Length this way is a bit ugly, but if you don't send one, the browser may keep showing a spinning cursor even though the content is displayed. Closing self.wfile instead might also work.

from cStringIO import StringIO

class Page(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def do_GET(self):
        out = StringIO()
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        if self.path == '/':
            out.write("<html><body><a href='/run'>Run</a></body></html>\n")
        elif self.path == '/run':
            self.proc = MyProc()
            print 'Starting..'
            self.proc.start()
            print 'After start.'
            out.write("<html><body><h1>Process started</h1></body></html>\n")
        text = out.getvalue()
        self.send_header("Content-Length", str(len(text)))
        self.end_headers()
        self.wfile.write(text)
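
For readers on Python 3, here is a rough sketch of the same idea. It assumes the http.server module (the Python 3 home of BaseHTTPServer and SimpleHTTPServer) and leaves the task launch as a placeholder comment. The fetch_once helper is hypothetical and exists only to exercise the handler on an ephemeral port. The point it illustrates is the one above: the body is built first so that its exact byte length can be sent as Content-Length.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Page(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/':
            body = b"<html><body><a href='/run'>Run</a></body></html>\n"
        elif self.path == '/run':
            # Placeholder: the background task would be launched here.
            body = b"<html><body><h1>Process started</h1></body></html>\n"
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        # Tell the client exactly how many bytes to expect.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def fetch_once(path):
    """Serve a single request on an ephemeral port; return (status, length, body)."""
    server = HTTPServer(('127.0.0.1', 0), Page)
    thread = threading.Thread(target=server.handle_request)
    thread.start()
    conn = http.client.HTTPConnection(
        '127.0.0.1', server.server_address[1], timeout=10)
    conn.request('GET', path)
    resp = conn.getresponse()
    body = resp.read()
    result = (resp.status, resp.getheader('Content-Length'), body)
    conn.close()
    thread.join()
    server.server_close()
    return result
```

Calling fetch_once('/run') should return status 200 and a Content-Length header equal to the number of body bytes received.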

Answer 2:

I use this snippet to run a threaded version of SimpleHTTPServer.

I save it as ThreadedHTTPServer.py, for example, and then run it like this:

$ python /path/to/ThreadedHTTPServer.py PORT

Each request is then handled in its own thread, so you can download files in parallel and still navigate normally.

from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from SocketServer import ThreadingMixIn
import threading
import SimpleHTTPServer
import sys

PORT = int(sys.argv[1])

Handler = SimpleHTTPServer.SimpleHTTPRequestHandler

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle requests in a separate thread."""

if __name__ == '__main__':
    server = ThreadedHTTPServer(('0.0.0.0', PORT), Handler)
    print 'Starting server, use <Ctrl-C> to stop'
    server.serve_forever()
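
On Python 3.7 and later, http.server ships a ready-made ThreadingHTTPServer, so the mixin class does not need to be written by hand. The sketch below is only an illustration of the one-thread-per-request behavior: the /wait and /release paths, the Handler class, and the demo function are all made up for this example. The /wait request blocks until /release is requested, which can only complete if the two requests are handled in separate threads.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

release = threading.Event()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/wait':
            # Blocks this request's thread until /release is hit (or 5 s pass).
            body = b"released" if release.wait(timeout=5) else b"timed out"
        elif self.path == '/release':
            release.set()
            body = b"ok"
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def get(port, path):
    """Fetch one path from the local server and return the response body."""
    conn = http.client.HTTPConnection('127.0.0.1', port, timeout=10)
    conn.request('GET', path)
    body = conn.getresponse().read()
    conn.close()
    return body


def demo():
    server = ThreadingHTTPServer(('127.0.0.1', 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    result = {}
    # Issue the blocking request from a helper thread...
    waiter = threading.Thread(
        target=lambda: result.update(wait=get(port, '/wait')))
    waiter.start()
    # ...and unblock it with a second request while the first is pending.
    result['release'] = get(port, '/release')
    waiter.join()
    server.shutdown()
    server.server_close()
    return result
```

With a single-threaded server, the /wait request would hold up /release until the five-second timeout expired; with the threaded server, demo() should report that /wait was released.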

Answer 3:

The answer is that the multiprocessing module forks a completely different process with its own stdout... So your application is running just as you wrote it:

1. You start up the application in your terminal window.
2. You click on the Run button in your browser, which does a GET on /run.
3. You see the output of the current process in your terminal window, "Starting..".
4. A new process is started, MyProc, with its own stdout and stderr.
5. MyProc prints to its stdout (which goes nowhere), 'Starting long process..'.
6. The very moment MyProc starts up, your app prints to stdout, "After start.", since it was not told to wait for any kind of response from MyProc before doing so.

What you need is to implement a Queue that communicates back and forth between your main application's process and the forked process. There are some multiprocessing-specific examples of how to do that here:

http://www.ibm.com/developerworks/aix/library/au-multiprocessing/

However, that article (like most articles from IBM) is kind of deep and overly complicated... You might want to take a look at a simpler example of how to use the "regular" Queue module (it is pretty much identical to the one included in multiprocessing):

http://www.artfulcode.net/articles/multi-threading-python/

The most important concepts to understand are how to shuffle data between processes using the Queue and how to use join() to wait for a response before proceeding.
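
As a minimal sketch of those two concepts, the following uses the "regular" queue module with a thread rather than multiprocessing, and it is written for Python 3, where the module is named queue instead of Queue. The names long_task and run_and_collect are invented for this illustration. The worker reports progress by putting messages on the queue, and the caller uses join() to wait for it before reading them.

```python
import queue
import threading
import time


def long_task(progress, steps):
    """Stand-in for the long-running job; reports each finished step."""
    for i in range(1, steps + 1):
        time.sleep(0.01)  # pretend to do some work
        progress.put(i)
    progress.put('done')


def run_and_collect(steps=5):
    progress = queue.Queue()
    worker = threading.Thread(target=long_task, args=(progress, steps))
    worker.start()
    # join() blocks until the worker has finished, so every message
    # is already in the queue when we start draining it.
    worker.join()
    messages = []
    while not progress.empty():
        messages.append(progress.get())
    return messages
```

For example, run_and_collect(3) returns [1, 2, 3, 'done']. In a polling setup like the one in the question, the caller would presumably skip the join() and drain the queue on each poll instead.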

Source: https://stackoverflow.com/questions/3973789/why-does-a-background-task-block-the-response-in-simplehttpserver

Tags

python

multiprocessing

simplehttpserver