CherryPy: which solutions for pages with long processing time?

甜味超标 2020-12-10 20:52

I have a website powered by cherrypy. For some pages, I need quite a long processing time (a multi-join SQL request on a several-million-row DB). The processing needs someti

1 answer
  •  时光取名叫无心
    2020-12-10 21:29

    Everything here depends on your website's traffic volume. CherryPy is a threaded server, so once every worker thread is waiting for the database, new requests will not be processed. There is also the request queue to consider, but in general that is how it works.

    Poor man's solution

    If you know that your traffic is low, you can try a workaround. Increase response.timeout if needed (the default is 300 seconds). Increase server.thread_pool (the default is 10). If you use a reverse proxy, like nginx, in front of the CherryPy application, increase the proxy timeout there as well.
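
    A minimal sketch of those settings as a CherryPy config dict. The numbers are placeholders rather than recommendations, and build_config is a hypothetical helper. As far as I know, newer CherryPy releases dropped the response timeout, so the sketch sets response.timeout only when asked to.

```python
def build_config(thread_pool=30, response_timeout=None):
    """Return a CherryPy-style config dict for a low-traffic site with slow pages.

    build_config is a hypothetical helper; the values are placeholders.
    """
    settings = {
        'server.socket_host': '127.0.0.1',
        'server.socket_port': 8080,
        # More worker threads, so a few slow SQL requests do not occupy them all.
        'server.thread_pool': thread_pool,
    }
    if response_timeout is not None:
        # Seconds. Assumption: your CherryPy version still has response.timeout;
        # check before relying on it.
        settings['response.timeout'] = response_timeout
    return {'global': settings}


config = build_config(thread_pool=30, response_timeout=600)
# Then pass it on as usual: cherrypy.quickstart(App(), '/', config)
```

    Every extra thread can hold one more blocked request, so this only moves the limit. It does not remove it.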

    The following solutions require you to redesign your website, specifically to make it asynchronous: client code submits a task and then uses pull or push to get its result. This requires changes on both sides of the wire.
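
    The submit-then-pull idea can be sketched without any web framework. This is an illustration only: TaskStore and wait_for are hypothetical names, results live in memory, and there is no error handling (a task that raises stays in 'wait' forever).

```python
import threading
import time
import uuid


class TaskStore:
    """In-memory task registry: submit() returns an id, poll() reports status."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}  # task id -> ('wait', None) or ('ready', value)

    def submit(self, func, *args):
        task_id = str(uuid.uuid4())
        with self._lock:
            self._results[task_id] = ('wait', None)
        worker = threading.Thread(
            target=self._run, args=(task_id, func, args), daemon=True)
        worker.start()
        return task_id  # the caller gets the id back immediately

    def _run(self, task_id, func, args):
        value = func(*args)  # the slow part runs outside the lock
        with self._lock:
            self._results[task_id] = ('ready', value)

    def poll(self, task_id):
        with self._lock:
            status, value = self._results[task_id]
            if status == 'ready':
                del self._results[task_id]  # hand the result out once
        return {'id': task_id, 'status': status, 'result': value}


def wait_for(store, task_id, interval=0.05, attempts=200):
    """What a polling client does: ask until the status is 'ready'."""
    for _ in range(attempts):
        reply = store.poll(task_id)
        if reply['status'] == 'ready':
            return reply
        time.sleep(interval)
    raise TimeoutError('task %s did not finish' % task_id)
```

    In a real site, submit and poll would sit behind two HTTP handlers, and the loop in wait_for would run in the browser.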

    CherryPy BackgroundTask

    On the server side you can use cherrypy.process.plugins.BackgroundTask together with some intermediary storage (e.g. a new table in your database). On the client side, use XMLHttpRequest for pull or WebSockets for push. CherryPy can handle both.
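
    A sketch of the intermediary storage as a table, using SQLite from the standard library purely for illustration. The task table, its columns and the helper names are hypothetical. In practice the table would live in your own database.

```python
import json
import sqlite3
import uuid

SCHEMA = '''
CREATE TABLE IF NOT EXISTS task (
    id     TEXT PRIMARY KEY,
    status TEXT NOT NULL,   -- 'wait' or 'ready'
    result TEXT             -- JSON-encoded, NULL until ready
)
'''


def create_task(conn):
    """Called by the request handler that schedules the work."""
    task_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on error
        conn.execute(
            'INSERT INTO task (id, status) VALUES (?, ?)', (task_id, 'wait'))
    return task_id


def store_result(conn, task_id, result):
    """Called by the background task once the slow query has finished."""
    with conn:
        conn.execute(
            'UPDATE task SET status = ?, result = ? WHERE id = ?',
            ('ready', json.dumps(result), task_id))


def read_task(conn, task_id):
    """Called by the polling handler; returns None for an unknown id."""
    row = conn.execute(
        'SELECT status, result FROM task WHERE id = ?', (task_id,)).fetchone()
    if row is None:
        return None
    status, result = row
    return {
        'id': task_id,
        'status': status,
        'result': json.loads(result) if result is not None else None,
    }


conn = sqlite3.connect(':memory:')
conn.execute(SCHEMA)
```

    Unlike an in-process dict, a table survives a server restart. With sqlite3 specifically, each thread should open its own connection.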

    Note that because CherryPy runs in a single Python process, the background task's thread runs within it too. If you do CPU-bound post-processing of the SQL result set, you will be affected by the GIL. So you may want to rewrite it to use processes instead, which is a little more complicated.

    Industrial solution

    If your website operates at scale, or is expected to, you are better off with a distributed task queue like RQ or Celery. This changes only the server side; the client side uses the same pull or push.

    Example

    Here follows a toy implementation of BackgroundTask with XHR polling.

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    
    import time
    import uuid
    
    import cherrypy
    from cherrypy.process.plugins import BackgroundTask
    
    
    config = {
      'global' : {
        'server.socket_host' : '127.0.0.1',
        'server.socket_port' : 8080,
        'server.thread_pool' : 8,
      }
    }
    
    
    class App:
    
      _taskResultMap = None
    
    
      def __init__(self):
        self._taskResultMap = {}
    
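      def _target(self, task, id, arg):
        # BackgroundTask calls its function in a loop, so cancel the task
        # after the first run to make it a one-shot job.
        try:
          time.sleep(10) # long one, right?
          self._taskResultMap[id] = 42 + arg
        finally:
          task.cancel()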
    
      @cherrypy.expose
      @cherrypy.tools.json_out()
      def schedule(self, arg):
        id = str(uuid.uuid1())
        self._taskResultMap[id] = None # None means "not ready yet"
        task = BackgroundTask(
          interval = 0, function = self._target, args = [id, int(arg)],
          bus = cherrypy.engine)
        # Pass the task to its own target so that the target can cancel it.
        task.args.insert(0, task)
        task.start()
        return id
    
      @cherrypy.expose
      @cherrypy.tools.json_out()
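      def poll(self, id):
        # An unknown or already collected id is a client error, not a server one.
        if id not in self._taskResultMap:
          raise cherrypy.HTTPError(404, 'Unknown task id')
        if self._taskResultMap[id] is None:
          return {'id': id, 'status': 'wait', 'result': None}
        else:
          return {
            'id'     : id,
            'status' : 'ready',
            'result' : self._taskResultMap.pop(id)
          }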
    
      @cherrypy.expose
      def index(self):
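        return '''<!DOCTYPE html>
          <html>
          <head>
            <title>CherryPy BackgroundTask demo</title>
            <script type='text/javascript'>
              // Minimal stand-in script (an assumption, not the original page code):
              // schedule a task, then poll for its result once a second.
              function run() {
                fetch('/schedule?arg=1')
                  .then(function(r) { return r.json(); })
                  .then(function(id) {
                    console.log('scheduled', id);
                    var timer = setInterval(function() {
                      fetch('/poll?id=' + encodeURIComponent(id))
                        .then(function(r) { return r.json(); })
                        .then(function(task) {
                          console.log(task);
                          if (task.status == 'ready') clearInterval(timer);
                        });
                    }, 1000);
                  });
                return false;
              }
            </script>
          </head>
          <body>
            <p><a href='#' onclick='return run()'>Run a long task</a>, look in browser console.</p>
          </body>
          </html>
        '''


    if __name__ == '__main__':
      cherrypy.quickstart(App(), '/', config)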
