Question
I am defining a function in Python. The program file itself is named abc_d.py. I don't understand whether I can import the same file inside itself again.
import numpy as np
import matplotlib.pyplot as plt
import sys
import multiprocessing

num_processor=4
pool = multiprocessing.Pool(num_processor)

def abc(data):
    w=np.dot(data.reshape(25,1),data.reshape(1,25))
    return w

data_final=np.array(range(100))
n=100
error=[]
k_list=[50,100,500,1000,2000]
for k in k_list:
    dict_data={}
    for d_set in range(num_processor):
        dict_data[d_set]=data_final[int(d_set*n/4):int((d_set+1)*n/4)]
        if(d_set==num_processor-1):
            dict_data[d_set]=data_final[int(d_set*n/4):]
    tasks = dict_data
    results_w=[pool.apply_async(abc,dict_data[t]) for t in range(num_processor)]
    w_f=[]
    for result in results_w:
        w_s=result.get()
        w_f.append(w_s.tolist())
    w_f=np.array(w_f)

print (w_f)
Here, tasks is a dictionary of arrays.
Can anybody explain the error below? I am still not very familiar with Python.
Error:
Process ForkPoolWorker-1:
Process ForkPoolWorker-2:
Process ForkPoolWorker-3:
Process ForkPoolWorker-4:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/home/anaconda3/lib/python3.5/multiprocessing/process.py", line 254, in _bootstrap
    self.run()
  File "/home/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/anaconda3/lib/python3.5/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/home/anaconda3/lib/python3.5/multiprocessing/queues.py", line 345, in get
    return ForkingPickler.loads(res)
  File "/home/anaconda3/lib/python3.5/multiprocessing/process.py", line 254, in _bootstrap
    self.run()
  File "/home/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
AttributeError: Can't get attribute 'abc' on <module '__main__' from 'abc_d.py'>
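Answer 1:
This error occurs when the pool is created before the function you want to run in parallel is defined. With the fork start method, the worker processes are forked the moment multiprocessing.Pool(...) is called, so they only see what __main__ contained at that point; abc is defined later, and the workers cannot look it up when a task arrives. Create the pool after the function definition and the error goes away. There is also a bug in your code: the second argument of apply_async is the sequence of positional arguments, so passing dict_data[t] directly hands each of its 25 elements to abc as a separate argument. Wrap the array in a list so that it is passed as a single argument. I changed that line too, and the code now returns results.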
import numpy as np
import matplotlib.pyplot as plt
import sys
import multiprocessing

num_processor=4

# Define the worker function before the pool is created.
def abc(data):
    w=np.dot(data.reshape(25,1),data.reshape(1,25))
    return w

pool = multiprocessing.Pool(num_processor)

data_final=np.array(range(100))
n=100
error=[]
k_list=[50,100,500,1000,2000]
for k in k_list:
    dict_data={}
    for d_set in range(num_processor):
        dict_data[d_set]=data_final[int(d_set*n/4):int((d_set+1)*n/4)]
        if(d_set==num_processor-1):
            dict_data[d_set]=data_final[int(d_set*n/4):]
    tasks = dict_data
    # The array is wrapped in a list so abc receives it as one argument.
    results_w=[pool.apply_async(abc, [dict_data[t]]) for t in range(num_processor)]
    w_f=[]
    for result in results_w:
        w_s=result.get()
        w_f.append(w_s.tolist())
    w_f=np.array(w_f)

print (w_f)
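The two points in this answer (define the function first, pass the array as a single argument) can be seen in a smaller form. This is only a minimal sketch: outer and run_chunks are hypothetical names, and outer is a pure-Python stand-in for abc so the example does not depend on NumPy.

```python
import multiprocessing


def outer(chunk):
    # Hypothetical pure-Python stand-in for abc(): outer product of a chunk.
    return [[a * b for b in chunk] for a in chunk]


def run_chunks(chunks, processes=4):
    # The pool is created only after outer() exists, so workers can find it.
    with multiprocessing.Pool(processes) as pool:
        # args is unpacked as outer(*args); (chunk,) passes one argument.
        pending = [pool.apply_async(outer, (chunk,)) for chunk in chunks]
        return [p.get() for p in pending]


if __name__ == "__main__":
    data = list(range(100))
    chunks = [data[i:i + 25] for i in range(0, 100, 25)]
    results = run_chunks(chunks)
    print(len(results), len(results[0]), len(results[0][0]))  # 4 25 25
```

Passing chunk instead of (chunk,) would make the worker call outer with 25 separate arguments, which raises a TypeError once result.get() is called.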
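Answer 2:
Hi, I had the same problem and was able to fix it.
The function definitions have to sit at the top level of the module, outside the main part of the script, because otherwise Windows cannot find the function. (Windows starts each worker by importing the script again, so unguarded code would also run in every worker.)
Put the rest of your code inside an if __name__ == '__main__': block and keep the function outside of it.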
import numpy as np
import matplotlib.pyplot as plt
import sys
import multiprocessing

# Worker function at module top level, outside the __main__ guard.
def abc(data):
    w=np.dot(data.reshape(25,1),data.reshape(1,25))
    return w

if __name__ == '__main__':
    num_processor=4
    pool = multiprocessing.Pool(num_processor)
    data_final=np.array(range(100))
    n=100
    error=[]
    k_list=[50,100,500,1000,2000]
    for k in k_list:
        dict_data={}
        for d_set in range(num_processor):
            dict_data[d_set]=data_final[int(d_set*n/4):int((d_set+1)*n/4)]
            if(d_set==num_processor-1):
                dict_data[d_set]=data_final[int(d_set*n/4):]
        tasks = dict_data
        # args must be a sequence of arguments, so the array goes in a tuple.
        results_w=[pool.apply_async(abc,(dict_data[t],)) for t in range(num_processor)]
        w_f=[]
        for result in results_w:
            w_s=result.get()
            w_f.append(w_s.tolist())
        w_f=np.array(w_f)
    print (w_f)
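The same layout can be written more compactly with pool.map. The following is a sketch under assumptions, not the answerer's code: outer_product, split_evenly, and main are hypothetical names, and the outer product is done in pure Python instead of NumPy.

```python
import multiprocessing


def outer_product(chunk):
    # Pure-Python stand-in for abc(): outer product of a chunk with itself.
    return [[a * b for b in chunk] for a in chunk]


def split_evenly(values, parts):
    # Hypothetical helper: consecutive slices; the last takes any remainder.
    size = len(values) // parts
    chunks = [values[i * size:(i + 1) * size] for i in range(parts - 1)]
    chunks.append(values[(parts - 1) * size:])
    return chunks


def main():
    chunks = split_evenly(list(range(100)), 4)
    # The pool is created inside main(), which only the parent process runs.
    with multiprocessing.Pool(4) as pool:
        # map() passes each chunk as one argument, so no args tuple is needed.
        results = pool.map(outer_product, chunks)
    print(len(results), len(results[0]), len(results[0][0]))  # 4 25 25


if __name__ == "__main__":
    main()
```

Keeping the pool inside main() means a worker that imports the module again only picks up the function definitions and does not try to start a pool of its own.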
Answer 3:
I faced the same issue. Creating the pool after the function definition solved it, that is, moving this line below the def: pool = multiprocessing.Pool(num_processor)
Answer 4:
You can try passing the Pool as a parameter. Alex
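The answer gives no code, so the following is only one possible reading of it: the pool is created in the main block and handed to a helper as an argument instead of being a module-level global. The names square and process_all are hypothetical.

```python
import multiprocessing


def square(x):
    # Hypothetical worker function, defined at module top level.
    return x * x


def process_all(pool, values):
    # The pool arrives as a parameter instead of being read from a global.
    return pool.map(square, values)


if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        print(process_all(pool, [1, 2, 3]))  # [1, 4, 9]
```

Because the pool is only created inside the main block, it necessarily comes into existence after every function in the module has been defined.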
Answer 5:
A likely explanation, which I am looking into myself, is that the function cannot be pickled, as described in this comment:
https://github.com/joblib/joblib/issues/166#issuecomment-55529781
written by the author of a parallel-processing library (joblib).
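The pickling behaviour can be checked directly with the standard pickle module. This is a minimal sketch, with can_pickle as a hypothetical helper: a top-level function is pickled by reference (module name plus function name), while a lambda has no importable name and cannot be pickled at all.

```python
import pickle


def abc(data):
    # Top-level function: pickle records only "module.name", not the code.
    return data


def can_pickle(obj):
    # Returns True if obj survives pickle.dumps, False otherwise.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(can_pickle(abc))                # top-level function
    print(can_pickle(lambda data: data))  # lambda has no importable name
    payload = pickle.dumps(abc)
    # Unpickling looks the name up again in the module. That lookup is the
    # step that fails with "Can't get attribute 'abc'" in a worker whose
    # copy of __main__ does not contain abc.
    print(pickle.loads(payload) is abc)
```

Since only the name travels to the worker, the worker must already have a function of that name in the same module, which is why the order of definition and pool creation matters.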
For those who use global variables in their multiprocessing functions, refer to this question:
"Globals variables and Python multiprocessing"
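The linked question is not reproduced here. As a hedged illustration of the underlying behaviour, each worker process has its own copy of module globals, and one common way to set them is the initializer argument of multiprocessing.Pool. The names _scale, init_worker, and scaled are hypothetical.

```python
import multiprocessing

_scale = None  # per-process global, set in each worker by init_worker()


def init_worker(scale):
    # Runs once in every worker process when the pool starts.
    global _scale
    _scale = scale


def scaled(x):
    # Reads the worker's own copy of the global.
    return x * _scale


if __name__ == "__main__":
    with multiprocessing.Pool(2, initializer=init_worker, initargs=(10,)) as pool:
        print(pool.map(scaled, [1, 2, 3]))  # [10, 20, 30]
    print(_scale)  # still None in the parent process
```

Assignments made inside a worker never flow back to the parent, so results should be returned from the worker function and not written to globals.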
Source: https://stackoverflow.com/questions/36533134/cant-get-attribute-abc-on-module-main-from-abc-h-py