Question
I am using ThreadPool to achieve multiprocessing. With multiprocessing, the pool size should be equal to the number of CPU cores. My question: when using ThreadPool, should the pool size also be limited to the number of CPU cores?
This is my code:
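from multiprocessing.pool import ThreadPool as Pool


class Subject():
    def __init__(self, url):
        # rest of the code
        ...

    def func1(self):
        # returns something
        ...


if __name__ == "__main__":
    pool_size = 11
    pool = Pool(pool_size)
    # Subject requires a url, so pass it through
    objects = [Subject(url) for url in all_my_urls]
    for obj in objects:
        # schedule func1 on a worker thread without blocking
        pool.apply_async(obj.func1, ())
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all scheduled tasks to finish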
What should the maximum pool size be? Thanks in advance.
Answer 1:
You cannot use threads for multiprocessing; threads only give you multithreading. In CPython, the GIL prevents multiple threads from executing Python bytecode in parallel within a single process, so multithreading is only useful for IO-heavy work (e.g. talking to the Internet), where threads spend most of their time waiting, rather than CPU-heavy work (e.g. maths), which constantly occupies a core.
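To illustrate the point about waiting, here is a minimal sketch in which `time.sleep` stands in for a blocking network call (sleeping releases the GIL, as real blocking IO does). The names `fake_io_task` and `run_with_threads` are made up for this example and are not part of the asker's code.

```python
import time
from multiprocessing.pool import ThreadPool


def fake_io_task(delay):
    """Hypothetical stand-in for a network call: block for `delay` seconds."""
    time.sleep(delay)
    return delay


def run_with_threads(n_tasks, delay, pool_size):
    """Run n_tasks simulated IO tasks on a thread pool.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with ThreadPool(pool_size) as pool:
        # map blocks until every task has finished
        results = pool.map(fake_io_task, [delay] * n_tasks)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_with_threads(n_tasks=8, delay=0.2, pool_size=8)
    print(f"8 tasks of 0.2 s each finished in {elapsed:.2f} s")
```

Run one after another, the eight tasks would need about 1.6 s. With eight threads the waits overlap, so the total should be much closer to a single 0.2 s wait, regardless of how many cores the machine has.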
So if you have many IO-heavy tasks running at once, having that many threads will be useful, even if it is more than the number of CPU cores. A very large number of threads will eventually hurt performance, but don't worry about it until you actually measure a problem. Something like 100 threads should be fine.
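If you do want to measure, one rough approach is to time the same batch of work at a few pool sizes and compare. The sketch below assumes a hypothetical `simulated_request` that sleeps instead of doing real network IO, and uses placeholder `example.com` URLs. With real requests the numbers will depend on the remote servers and your network.

```python
import os
import time
from multiprocessing.pool import ThreadPool


def simulated_request(url):
    """Hypothetical stand-in for fetching a URL; sleeps instead of doing IO."""
    time.sleep(0.05)
    return url


def time_pool_size(pool_size, urls):
    """Return the wall-clock seconds needed to process urls with pool_size threads."""
    start = time.perf_counter()
    with ThreadPool(pool_size) as pool:
        pool.map(simulated_request, urls)
    return time.perf_counter() - start


def compare_pool_sizes(sizes, urls):
    """Map each candidate pool size to its measured wall-clock time."""
    return {size: time_pool_size(size, urls) for size in sizes}


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(40)]  # placeholder URLs
    cores = os.cpu_count() or 1
    # a set removes duplicate sizes, e.g. when cores == 1
    candidate_sizes = sorted({1, cores, 4 * cores, 40})
    for size, elapsed in compare_pool_sizes(candidate_sizes, urls).items():
        print(f"pool_size={size:3d}: {elapsed:.2f} s")
```

For this kind of simulated IO the time should keep dropping well past the core count. Where it stops improving for your real workload is a reasonable upper bound for the pool size.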
Source: https://stackoverflow.com/questions/42541893/maximum-pool-size-when-using-threadpool-python