Question
Is there an easy way in Dask to push a pure-python module to the workers?
I have many workers in a cluster and I want to distribute a local module that I have on my client. I understand that for large packages like NumPy I should distribute things in a more robust fashion, but I have a small module that changes frequently and shouldn't be too much work to move around.
Answer 1:
Alternatively, if you wish to deploy a package to the workers after they have started, you can do something like this using Client.run and Client.restart:
import pathlib
import subprocess
import sys

def deploy_env(packages):
    # Locate the worker's conda environment from its Python executable.
    conda_prefix = pathlib.Path(sys.executable).parent.parent
    # Install the packages non-interactively into that environment.
    return subprocess.check_output(
        ['conda', 'install', '--yes', '-p', str(conda_prefix)] + packages)

# Run the deploy command on all the workers
result = client.run(deploy_env, packages)
# Restart all the worker processes so they pick up the new packages
client.restart()
After this, the specified packages will be installed on all currently running workers. Note that this approach does not cover workers added to the scheduler later; they will not receive the packages.
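If you also need the packages to reach workers that join the cluster later, one option is a worker plugin, which runs on each worker as it registers with the scheduler. The sketch below uses the built-in PipInstall plugin from dask.distributed; the scheduler address and package name are placeholders, and it assumes the package is pip-installable.

from dask.distributed import Client, PipInstall

client = Client("tcp://scheduler:8786")  # hypothetical scheduler address

# The plugin pip-installs the listed packages on every worker that
# connects, including workers added after the plugin is registered.
plugin = PipInstall(packages=["my-package"], pip_options=["--upgrade"])
client.register_worker_plugin(plugin)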
Answer 2:
Yes, use the Client.upload_file method.
client.upload_file('myfile.py')
This method will distribute the file and, if the file ends in .py or .egg, will also import and reload the module on each of the workers.
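As a quick usage sketch (the scheduler address, module name, and some_function are placeholders): after uploading, tasks submitted to the cluster can import and call code from the module.

from dask.distributed import Client

client = Client("tcp://scheduler:8786")  # hypothetical scheduler address
client.upload_file("myfile.py")          # distribute the module to the workers

import myfile  # assumed to define some_function on the client side

# Workers resolve myfile.some_function via the uploaded copy of the module.
future = client.submit(myfile.some_function, 42)
print(future.result())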
Source: https://stackoverflow.com/questions/49327802/push-a-pure-python-module-to-dask-workers