Multiprocessing Running Slower than a Single Process

Question

I'm attempting to use multiprocessing to run many simulations across multiple processes; however, the code I have written only uses 1 of the processes as far as I can tell.

Updated

I've gotten all the processes to work (I think) thanks to @PaulBecotte; however, the multiprocessing version seems to run significantly slower than its non-multiprocessing counterpart.

For instance, not including the function and class declarations/implementations and imports, I have:

def monty_hall_sim(num_trial, player_type='AlwaysSwitchPlayer'):
    if player_type == 'NeverSwitchPlayer':
        player = NeverSwitchPlayer('Never Switch Player')
    else:
        player = AlwaysSwitchPlayer('Always Switch Player')

    return (MontyHallGame().play_game(player) for trial in xrange(num_trial))

def do_work(in_queue, out_queue):
    while True:
        try:
            f, args = in_queue.get()
            ret = f(*args)
            for result in ret:
                out_queue.put(result)
        except:
            break

def main():
    logging.getLogger().setLevel(logging.ERROR)

    always_switch_input_queue = multiprocessing.Queue()
    always_switch_output_queue = multiprocessing.Queue()

    total_sims = 20
    num_processes = 5
    process_sims = total_sims/num_processes

    with Timer(timer_name='Always Switch Timer'):
        for i in xrange(num_processes):
            always_switch_input_queue.put((monty_hall_sim, (process_sims, 'AlwaysSwitchPlayer')))

        procs = [multiprocessing.Process(target=do_work, args=(always_switch_input_queue, always_switch_output_queue)) for i in range(num_processes)]

        for proc in procs:
            proc.start()

        always_switch_res = []
        while len(always_switch_res) != total_sims:
            always_switch_res.append(always_switch_output_queue.get())

        always_switch_success = float(always_switch_res.count(True))/float(len(always_switch_res))

    print '\tLength of Always Switch Result List: {alw_sw_len}'.format(alw_sw_len=len(always_switch_res))
    print '\tThe success average of switching doors was: {alw_sw_prob}'.format(alw_sw_prob=always_switch_success)

which yields:

    Time Elapsed: 1.32399988174 seconds
    Length: 20
    The success average: 0.6

However, I am attempting to use this for total_sims = 10,000,000 over num_processes = 5, and doing so has taken significantly longer than using 1 process (1 process returned in ~3 minutes). The non-multiprocessing counterpart I'm comparing it to is:

def main():
    logging.getLogger().setLevel(logging.ERROR)

    with Timer(timer_name='Always Switch Monty Hall Timer'):
        always_switch_res = [MontyHallGame().play_game(AlwaysSwitchPlayer('Monty Hall')) for x in xrange(10000000)]

        always_switch_success = float(always_switch_res.count(True))/float(len(always_switch_res))

    print '\n\tThe success average of not switching doors was: {not_switching}' \
          '\n\tThe success average of switching doors was: {switching}'.format(not_switching=never_switch_success,
                                                                               switching=always_switch_success)

Answer 1:

You could try importing "process" under an if statement.

Answer 2:

EDIT: you changed some things, so let me try to explain a bit better.

Each message you put into the input queue will cause the monty_hall_sim function to get called and send num_trial messages to the output queue.

So your original implementation was right: to get 20 output messages, send in 5 input messages.

However, your function is slightly wrong.

for trial in xrange(num_trial):
    res = MontyHallGame().play_game(player)
    yield res

This turns the function into a generator that provides a new value on each next() call, which is great. The problem is here:

while True:
    try:
        f, args = in_queue.get(timeout=1)
        ret = f(*args)
        out_queue.put(ret.next())
    except:
        break

Here, on each pass through the loop you create a NEW generator from a NEW message, and the old one is thrown away. So each input message adds only a single output message to the queue before its generator is discarded and the next message is fetched. The correct way to write this is:

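while True:
    try:
        # Stop once the input queue has been empty for 1 second.
        f, args = in_queue.get(timeout=1)
        ret = f(*args)
        # Drain the whole generator, putting each yielded result on the output queue.
        for result in ret:
            out_queue.put(result)
    except:
        break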

Doing it this way will keep putting output messages from the generator onto the queue until the generator finishes (after yielding 4 messages in this case).
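The difference between calling next() once and iterating over the generator can be seen without any processes or queues. The following is only a minimal sketch: fake_sim, take_first_only and drain_each are hypothetical names made up for illustration, fake_sim stands in for monty_hall_sim, and the code uses Python 3 syntax (next(ret) rather than the Python 2.7 ret.next() used in the question).

```python
def fake_sim(num_trial):
    """Hypothetical stand-in for monty_hall_sim: yields num_trial results."""
    for trial in range(num_trial):
        yield trial


def take_first_only(tasks):
    """One next() call per task: the rest of each generator is discarded."""
    out = []
    for f, args in tasks:
        ret = f(*args)
        out.append(next(ret))
    return out


def drain_each(tasks):
    """Iterate over each generator until it is exhausted."""
    out = []
    for f, args in tasks:
        for result in f(*args):
            out.append(result)
    return out


# 5 input messages, each asking for 4 trials, as in the question.
tasks = [(fake_sim, (4,))] * 5
print(len(take_first_only(tasks)))  # 5
print(len(drain_each(tasks)))       # 20
```

With 5 input messages of 4 trials each, only the second pattern produces the 20 results that main() is waiting for.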

Answer 3:

I was able to get my code to run significantly faster by changing monty_hall_sim's return to a list comprehension, having do_work add the lists to the output queue, and then extending the results list in main with the lists returned by the output queue. This made it run in ~13 seconds.
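A rough sketch of that batching approach might look like the following. It is not the poster's actual code: simulate_batch is a hypothetical stand-in for monty_hall_sim that plays a simplified always-switch game with the random module, the None sentinel used to stop the workers is an assumption (the original relied on an exception to break out of the loop), run_parallel is an invented helper, and the code is written for Python 3 rather than Python 2.7.

```python
import multiprocessing
import random


def simulate_batch(num_trials, seed=None):
    """Play num_trials always-switch games and return the outcomes as one list."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_trials):
        car = rng.randrange(3)
        first_pick = rng.randrange(3)
        # Switching wins exactly when the first pick was not the car.
        results.append(first_pick != car)
    return results


def do_work(in_queue, out_queue):
    """Run queued tasks until a None sentinel arrives; put one list per task."""
    while True:
        task = in_queue.get()
        if task is None:
            break
        func, args = task
        out_queue.put(func(*args))  # a single put for the whole batch


def run_parallel(total_sims, num_processes):
    """Split total_sims across workers and merge the lists they return."""
    in_queue = multiprocessing.Queue()
    out_queue = multiprocessing.Queue()
    per_process = total_sims // num_processes
    for _ in range(num_processes):
        in_queue.put((simulate_batch, (per_process,)))
    for _ in range(num_processes):
        in_queue.put(None)  # one sentinel per worker
    procs = [
        multiprocessing.Process(target=do_work, args=(in_queue, out_queue))
        for _ in range(num_processes)
    ]
    for proc in procs:
        proc.start()
    results = []
    # Collect one list per task before joining the workers.
    for _ in range(num_processes):
        results.extend(out_queue.get())
    for proc in procs:
        proc.join()
    return results
```

Calling run_parallel(20, 5) from under an if __name__ == '__main__': guard should return a list of 20 booleans. The point of the design is that each worker performs one put per batch instead of one put per trial, so far fewer objects are pickled and passed between processes.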

Source: https://stackoverflow.com/questions/51922119/multiprocessing-running-slower-than-a-single-process

Tags

python

python-2.7

queue

multiprocessing

python-multiprocessing