Python multiprocessing Pool.apply_async with shared variables (Value)

情歌与酒 2021-01-01 07:29

For my college project I am trying to develop a Python-based traffic generator. I have created 2 CentOS machines on VMware and I am using 1 as my client and 1 as my server machine.

2 Answers
  • 2021-01-01 07:30

    As the error message states, you can't pass a multiprocessing.Value via pickle. However, you can use a multiprocessing.Manager().Value:

    import multiprocessing
    import urllib2
    import random
    import myurllist    #list of all destination urls for all 10 servers
    import time
    import socbindtry   #script that binds various virtual/aliased client ips to the script
    
    def send_request3(response_time, error_count, error_lock):    #function to send requests from alias client ip 1
        opener=urllib2.build_opener(socbindtry.BindableHTTPHandler3)    #bind to alias client ip1
        try:
            tstart=time.time()
            for i in range(len(myurllist.url)):
                x=random.choice(myurllist.url[i])
                opener.open(x).read()
                print "file downloaded:",x
                response_time.append(time.time()-tstart)
        except urllib2.URLError as e:
            with error_lock:    #Manager Value proxies have no get_lock(), so use a shared Manager Lock
                error_count.value += 1
    
    def send_request4(response_time, error_count, error_lock):    #function to send requests from alias client ip 2
        opener=urllib2.build_opener(socbindtry.BindableHTTPHandler4)    #bind to alias client ip2
        try:
            tstart=time.time()
            for i in range(len(myurllist.url)):
                x=random.choice(myurllist.url[i])
                opener.open(x).read()
                print "file downloaded:",x
                response_time.append(time.time()-tstart)
        except urllib2.URLError as e:
            with error_lock:    #same shared Manager Lock protects the counter here
                error_count.value += 1
    
    #50 such functions are defined here for 50 clients
    
    def func(response_time, error_count, error_lock):
        pool=multiprocessing.Pool(processes=2*multiprocessing.cpu_count())
        args = (response_time, error_count, error_lock)
        for i in range(5):
            pool.apply_async(send_request3, args=args)
            pool.apply_async(send_request4, args=args)
    #append 50 functions here
        pool.close()
        pool.join()
        print "All work Done..!!"
        return
    
    if __name__ == "__main__":
        m=multiprocessing.Manager()
        response_time=m.list()    #some shared variables
        error_count=m.Value('i',0)
        error_lock=m.Lock()       #shared lock so error_count increments are atomic across workers
    
        start=float(time.time())
        func(response_time, error_count, error_lock)
        end=float(time.time())-start
        print end
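
    For a self-contained illustration of the pickling point (my own sketch, not part of the original post): the difference can be seen with pickle alone, since Pool.apply_async has to pickle every argument it ships to a worker.

    import multiprocessing
    import pickle

    if __name__ == "__main__":
        m = multiprocessing.Manager()

        proxy_counter = m.Value('i', 0)    #Manager proxy: picklable, safe to pass to pool workers
        pickle.dumps(proxy_counter)        #succeeds

        raw_counter = multiprocessing.Value('i', 0)    #plain shared-memory Value
        try:
            pickle.dumps(raw_counter)      #fails: synchronized objects may only be shared through inheritance
        except RuntimeError as e:
            print(e)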
    

    A few other notes here:

    1. Using a Pool with 750 processes is not a good idea. Unless you're using a server with hundreds of CPU cores, that's going to overwhelm your machine. It'd be faster and put less strain on your machine to use significantly fewer processes. Something more like 2 * multiprocessing.cpu_count().
    2. As a best practice, you should explicitly pass all the shared arguments you need to the child processes, rather than using global variables. This increases the chances that the code will also work on Windows.
    3. It looks like all your send_request* functions do almost exactly the same thing. Why not make one function and use a variable to decide which socbindtry.BindableHTTPHandler to use? You would avoid a ton of code duplication that way; a sketch of this follows these notes.
    4. The way you're incrementing error_count is not process/thread-safe, and is susceptible to race conditions. You need to protect the increment with a lock (as I did in the example code above).
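
    As a sketch of point 3 (my own illustration, not code from the question): assuming the handlers really are named socbindtry.BindableHTTPHandler3, BindableHTTPHandler4, and so on (the numbering 3 through 52 is a guess based on the two shown above), one parameterized function plus getattr replaces all 50 copies:

    def send_request(handler_number, response_time, error_count, error_lock):
        #pick the aliased client ip by number, e.g. 3 -> socbindtry.BindableHTTPHandler3
        handler = getattr(socbindtry, "BindableHTTPHandler%d" % handler_number)
        opener = urllib2.build_opener(handler)
        try:
            tstart = time.time()
            for i in range(len(myurllist.url)):
                x = random.choice(myurllist.url[i])
                opener.open(x).read()
                print "file downloaded:", x
                response_time.append(time.time() - tstart)
        except urllib2.URLError as e:
            with error_lock:
                error_count.value += 1

    def func(response_time, error_count, error_lock):
        pool = multiprocessing.Pool(processes=2*multiprocessing.cpu_count())
        for handler_number in range(3, 53):    #one task per aliased client ip, instead of 50 near-identical functions
            pool.apply_async(send_request,
                             args=(handler_number, response_time, error_count, error_lock))
        pool.close()
        pool.join()
        print "All work Done..!!"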
  • 2021-01-01 07:37

    Possibly this is because of differences in how Python multiprocessing behaves on Windows versus Linux (honestly, I don't know how multiprocessing works in VMs, as is the case here); there is a short start-method note after the code below.

    This might work:

    import multiprocessing
    import urllib2
    import random
    import myurllist    #list of all destination urls for all 10 servers
    import time
    import socbindtry   #script that binds various virtual/aliased client ips to the script
    
    def send_request3(response_time, error_count):    #function to send requests from alias client ip 1
        opener=urllib2.build_opener(socbindtry.BindableHTTPHandler3)    #bind to alias client ip1
        try:
            tstart=time.time()
            for i in range(len(myurllist.url)):
                x=random.choice(myurllist.url[i])
                opener.open(x).read()
                print "file downloaded:",x
                response_time.append(time.time()-tstart)
        except urllib2.URLError, e:
            error_count.value=error_count.value+1
    def send_request4(response_time, error_count):    #function to send requests from alias client ip 2
        opener=urllib2.build_opener(socbindtry.BindableHTTPHandler4)    #bind to alias client ip2
        try:
            tstart=time.time()
            for i in range(len(myurllist.url)):
                x=random.choice(myurllist.url[i])
                opener.open(x).read()
                print "file downloaded:",x
                response_time.append(time.time()-tstart)
        except urllib2.URLError, e:
            error_count.value=error_count.value+1
    #50 such functions are defined here for 50 clients
    def func():
        m=multiprocessing.Manager()
        response_time=m.list()    #some shared variables
        error_count=m.Value('i',0)    #use the Manager proxy here too, a plain multiprocessing.Value cannot be pickled for the pool
    
        pool=multiprocessing.Pool(processes=750)
        for i in range(5):
            pool.apply_async(send_request3, [response_time, error_count])
            pool.apply_async(send_request4, [response_time, error_count])
            # pool.apply_async(send_request5)
    #append 50 functions here
        pool.close()
        pool.join()
        print "All work Done..!!"
        return
    
    
    start=float(time.time())
    func()
    end=float(time.time())-start
    print end
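
    For context on the Windows/Linux difference mentioned above (my own note, not part of the original answer): Linux starts pool workers with fork, so children inherit module globals, while Windows uses spawn, which re-imports the module and pickles everything passed to workers. On Python 3.4+ you can inspect or force the start method; a minimal sketch (this API does not exist in the Python 2 code above):

    import multiprocessing

    def hello():
        #runs in the child; under "spawn" the module is re-imported and no globals are inherited
        print(multiprocessing.current_process().name)

    if __name__ == "__main__":
        print(multiprocessing.get_start_method())             #"fork" on Linux, "spawn" on Windows
        multiprocessing.set_start_method("spawn", force=True) #reproduce the Windows behaviour on Linux

        p = multiprocessing.Process(target=hello)
        p.start()
        p.join()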
    