Dask: Jobs on multiple nodes with one worker, run on one node only


Question


I am trying to process some files using a Python function and would like to parallelize the task on a PBS cluster using Dask. On the cluster I can only launch one job, but I have access to 10 nodes with 24 cores each.

So my dask PBSCluster looks like:

import dask
from dask_jobqueue import PBSCluster
cluster = PBSCluster(cores=240,
                     memory="1GB",
                     project='X',
                     queue='normal',
                     local_directory='$TMPDIR',
                     walltime='12:00:00',
                     resource_spec='select=10:ncpus=24:mem=1GB',
                     )
cluster.scale(1) # one worker 
from dask.distributed import Client
client = Client(cluster)     
client

After this, the Dask cluster shows 1 worker with 240 cores (not sure if that makes sense). When I run

result = compute(*foo, scheduler='distributed') 

and log into the allocated nodes, only one of them is actually running the computation. I am not sure if I am using the right PBS configuration.


Answer 1:


cluster = PBSCluster(cores=240,
                     memory="1GB",

The values you give to the Dask Jobqueue constructors describe a single job, which runs on a single node. So here you are asking for a node with 240 cores, which probably doesn't make sense today.
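For reference, dask-jobqueue's intended usage is one job per node. A minimal sketch (reusing the queue and project names from the question; the memory value is whatever one node's worker should get) would look like this, but note that it submits ten separate PBS jobs, which is exactly what the question says is not allowed:

from dask_jobqueue import PBSCluster
from dask.distributed import Client

# Each job describes ONE node: 24 cores and that node's memory.
cluster = PBSCluster(cores=24,
                     memory="1GB",
                     project='X',
                     queue='normal',
                     local_directory='$TMPDIR',
                     walltime='12:00:00',
                     )

cluster.scale(10)  # ten workers -> ten PBS jobs, one per node

client = Client(cluster)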

If you can only launch one job, then dask-jobqueue's model probably won't work for you. I recommend looking at dask-mpi as an alternative.
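A minimal sketch of the dask-mpi approach (assuming dask-mpi is installed and the single PBS job can launch MPI processes across the ten nodes; the script name and process count are illustrative):

# script.py -- launched once inside the single PBS job, e.g.:
#   mpirun -np 240 python script.py
from dask_mpi import initialize
from dask.distributed import Client

# Rank 0 becomes the scheduler, rank 1 runs this client code,
# and the remaining ranks become workers spread over the nodes.
initialize()

client = Client()  # connects to the scheduler started by initialize()

# ... build `foo` and call dask.compute(*foo) as before ...

This way the whole Dask cluster lives inside the one PBS allocation instead of being made of separate queue submissions.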



Source: https://stackoverflow.com/questions/56313707/dask-jobs-on-multiple-nodes-with-one-worker-run-on-one-node-only
