How to ensure each worker uses exactly one CPU?

走远了吗. Submitted on 2021-01-29 12:17:47

Question


I'm implementing SEED using Ray, so I define a Worker class as follows:

import numpy as np
import gym

class Worker:
    def __init__(self, worker_id, env_name, n):
        import os
        os.environ['OPENBLAS_NUM_THREADS'] = '1'
        self._id = worker_id
        self._n_envs = n
        self._envs = [gym.make(env_name) 
            for _ in range(self._n_envs)]

    def reset_env(self, env_id):
        return self._envs[env_id].reset()

    def env_step(self, env_id, action):
        return self._envs[env_id].step(action)

Besides that, there is a loop in the Learner that invokes the Worker's methods whenever it needs to interact with the environment.

As this document suggests, I want to make sure each worker uses exactly one CPU resource. Here are some of my attempts:

  1. When creating a worker, I set num_cpus=1: worker = ray.remote(num_cpus=1)(Worker).remote(...)
  2. I checked my NumPy configuration using np.__config__.show(), which gave me the following information:

blas_mkl_info:
    NOT AVAILABLE
blis_info:
    NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
    NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

I noticed that NumPy is using OpenBLAS, so, following this instruction, I set os.environ['OPENBLAS_NUM_THREADS'] = '1' in the Worker class, as the code above does.

After doing both, I opened top but still saw each Worker using 130%-180% CPU, exactly the same as before. I've also tried setting os.environ['OPENBLAS_NUM_THREADS'] = '1' at the beginning of the main Python script and using export OPENBLAS_NUM_THREADS=1, but nothing helps. What can I do now?


Answer 1:


You can pin each worker to a core. For example, you can call something like psutil.Process().cpu_affinity([i]) inside a worker to pin it to the core with index i.

Also, before you pin the CPU, find out which CPU has been assigned to the worker by using this API: https://github.com/ray-project/ray/blob/203c077895ac422b80e31f062d33eadb89e66768/python/ray/worker.py#L457

Example:

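import psutil
import ray

ray.init(num_cpus=4)

@ray.remote(num_cpus=1)
def f():
    import numpy
    # CPU resources Ray assigned to this task; the first element of each entry is the core ID
    resources = ray.get_resource_ids()
    cpus = [v[0] for v in resources['CPU']]
    # Restrict this worker process to the assigned core(s)
    psutil.Process().cpu_affinity(cpus)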
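
For the Worker actor in the question, the same pinning could be done at the start of __init__. Below is a minimal sketch of a helper for that, under a few assumptions: it is Linux-only, it uses the standard library's os.sched_setaffinity in place of psutil so that it has no dependencies, and it assumes the resource dictionary has the shape the example above relies on when it indexes v[0], i.e. {'CPU': [(core_id, fraction), ...]}. The name pin_to_assigned_cpus is hypothetical and not part of Ray or psutil.

```python
import os


def pin_to_assigned_cpus(resource_ids):
    """Pin the caller to the CPU ids listed under 'CPU' (Linux only).

    resource_ids is ASSUMED to look like {'CPU': [(core_id, fraction), ...]},
    the shape the example above indexes with v[0].
    """
    requested = {int(core_id) for core_id, _fraction in resource_ids.get('CPU', [])}
    # CPUs the caller is currently allowed to run on (may be a subset of all
    # cores, for example inside a container with a restricted cpuset).
    allowed = os.sched_getaffinity(0)
    usable = requested & allowed
    if not usable:
        raise ValueError(
            f"none of the requested CPUs {sorted(requested)} "
            f"is in the allowed set {sorted(allowed)}"
        )
    # pid 0 refers to the caller.
    os.sched_setaffinity(0, usable)
    # Return the resulting mask so the caller can log or check it.
    return sorted(os.sched_getaffinity(0))
```

Inside a worker this might be called as pin_to_assigned_cpus(ray.get_resource_ids()). One caveat worth checking on your system: on Linux the affinity mask is kept per thread, and threads inherit it from the thread that creates them, so it is probably safest to pin as early as possible, before libraries such as OpenBLAS have started their own thread pools.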



Source: https://stackoverflow.com/questions/61051911/how-to-ensure-each-worker-use-exactly-one-cpu
