Python multiprocessing: some functions do not return when they are complete (queue material too big)

悲哀的现实 2020-12-03 02:43

I am using multiprocessing's Process and Queue. I start several functions in parallel and most behave nicely: they finish, their output goes to their Queue, and they show u

1 answer
  • 2020-12-03 03:34

    Alright, it seems that the pipe used to fill the Queue gets plugged when the output of a function is too big. That is only my crude understanding; it may be related to this unresolved/closed bug: http://bugs.python.org/issue8237. I have modified the code in my question so that there is some buffering: the queues are emptied regularly while the processes are running, which solves all my problems. The code now takes a collection of tasks (functions and their arguments), launches them, and collects the outputs. I wish it were simpler and cleaner looking.
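To make the "empty the queue while the processes run" idea concrete, here is a minimal sketch. It is not the code from the question or from the linked parallel.py. The names `big_result` and `run_and_collect` are made up for illustration, and it assumes a POSIX system where the "fork" start method is available (the docstring below also says the original was only tested on POSIX).

```python
import multiprocessing as mp
import queue  # only needed for the queue.Empty exception

# Assumption: POSIX, where the "fork" start method exists.
ctx = mp.get_context("fork")


def big_result(n_items, out_queue):
    """Hypothetical worker: put one large picklable object on the queue."""
    out_queue.put(list(range(n_items)))


def run_and_collect(n_items=500_000, n_jobs=3):
    """Start the workers, drain the shared queue while they run, then join."""
    out_queue = ctx.Queue()
    procs = [ctx.Process(target=big_result, args=(n_items, out_queue))
             for _ in range(n_jobs)]
    for p in procs:
        p.start()

    results = []
    # Drain BEFORE join(): a child cannot exit until the data it queued
    # has been flushed through the pipe, so join() first can hang when
    # the result is larger than the pipe buffer.
    while len(results) < n_jobs:
        any_alive = any(p.is_alive() for p in procs)  # sample before get()
        try:
            results.append(out_queue.get(timeout=0.1))
        except queue.Empty:
            if not any_alive:
                break  # all workers had already exited; nothing more will arrive

    for p in procs:
        p.join()  # safe now: the queue has been emptied
    return results, [p.exitcode for p in procs]
```

Calling `values, codes = run_and_collect()` should give one list per job plus the exit codes. If the two loops are swapped, so that `join()` runs before the queue is read, a hang like the one described in the question is likely once the results are large enough.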

    Edit (2014 Sep; update 2017 Nov: rewritten for readability): I'm updating the code with the enhancements I've made since. The new code (same function, but better features) is here: https://gitlab.com/cpbl/cpblUtilities/blob/master/parallel.py

    The calling description (the docstring) is also below.

    def runFunctionsInParallel(*args, **kwargs):
        """ This is the main/only interface to class cRunFunctionsInParallel. See its documentation for arguments.
        """
        return cRunFunctionsInParallel(*args, **kwargs).launch_jobs()
    
    ###########################################################################################
    ###
    class cRunFunctionsInParallel():
        ###
        #######################################################################################
        """Run any list of functions, each with any arguments and keyword-arguments, in parallel.
    The functions/jobs should return (if anything) picklable results. To avoid processes getting stuck when the output queues overflow, the queues are regularly collected and emptied.
    You can now pass os.system or similar to this as the function, in order to parallelize at the OS level, with no need for a wrapper: I made use of hasattr(builtinfunction, 'func_name') to check for a name.
    Parameters
    ----------
    listOf_FuncAndArgLists : a list of lists 
        List of up-to-three-element-lists, like [function, args, kwargs],
        specifying the set of functions to be launched in parallel.  If an
        element is just a function, rather than a list, then it is assumed
        to have no arguments or keyword arguments. Thus, possible formats
        for elements of the outer list are:
          function
          [function, list]
          [function, list, dict]
    kwargs: dict
        One can also supply the kwargs once, for all jobs (or for those
        without their own non-empty kwargs specified in the list)
    names: an optional list of names to identify the processes.
        If omitted, the function name is used, so if all the functions are
        the same (i.e. merely with different arguments), then they would be
        named indistinguishably.
    offsetsSeconds: int or list of ints
        delay some functions' start times
    expectNonzeroExit: True/False
        Normal behaviour is to not proceed if any function exits with a
        failed exit code. This can be used to override this behaviour.
    parallel: True/False
        Whenever the list of functions is longer than one, functions will
        be run in parallel unless this parameter is passed as False
    maxAtOnce: int
        If nonzero, this limits how many jobs will be allowed to run at
        once.  By default, this is set according to how many processors
        the hardware has available.
    showFinished : int
        Specifies the maximum number of successfully finished jobs to show
        in the text interface (before the last report, which should always
        show them all).
    Returns
    -------
    Returns a tuple of (return codes, return values), each a list in order of the jobs provided.
    Issues
    -------
    Only tested on POSIX OSes.
    Examples
    --------
    See the testParallel() method in this module
        """
    
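The three accepted job formats in the docstring (a bare function, [function, list], [function, list, dict]) can be illustrated with a small sketch. This is only my reading of the docstring, not the implementation in parallel.py. `normalize_jobs` and `scale` are hypothetical names, and the rule that the shared kwargs apply only when a job has no non-empty kwargs of its own is taken from the `kwargs` parameter description.

```python
def normalize_jobs(list_of_func_and_arg_lists, kwargs=None):
    """Expand every entry to a (function, args, kwargs) triple."""
    shared_kwargs = dict(kwargs or {})
    jobs = []
    for entry in list_of_func_and_arg_lists:
        if callable(entry):
            entry = [entry]  # bare function: no args, no kwargs
        func = entry[0]
        args = list(entry[1]) if len(entry) > 1 else []
        own_kwargs = dict(entry[2]) if len(entry) > 2 else {}
        # Shared kwargs are used only for jobs without their own
        # non-empty kwargs (copied so jobs never share one dict).
        jobs.append((func, args, own_kwargs or dict(shared_kwargs)))
    return jobs


def scale(x=1, factor=1):
    """Hypothetical job function, used only for the example."""
    return x * factor


jobs = normalize_jobs(
    [scale, [scale, [3]], [scale, [3], {"factor": 2}]],
    kwargs={"factor": 10},
)
print([func(*args, **kw) for func, args, kw in jobs])  # [10, 30, 6]
```

Here the jobs are called one after another just to show the normalized triples. In the parallel version each triple would become the target, args and kwargs of its own process.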