Dask worker seems to die but I cannot find the worker log to figure out why


Question


I have a piece of Dask code that runs on my local machine and works 90% of the time, but sometimes it gets stuck. By "stuck" I mean: no crash, no error printed, no CPU usage, and it never finishes.

I googled around and think it may be because some worker died. It would be very useful if I could see the worker log and figure out why, but I cannot find my worker log. I edited config.yaml to add logging, but I still see nothing on stderr. Then I went to the dashboard --> Info --> Logs and saw a blank page.

The code gets stuck at either of these lines:

X_test = df_test.to_dask_array(lengths=True)
proba = y_pred_proba_train[:, 1].compute()

My ~/.config/dask/config.yaml (or ~/.dask/config.yaml) looks like:

logging:
  distributed: info
  distributed.client: warning
  distributed.worker: debug
  bokeh: error

I am using Python 3.6 and dask 1.1.4. All I need is a way to see the logs so that I can try to figure out what goes wrong.

Thanks

Joseph


Answer 1:


Worker logs are usually managed by whatever system you use to set up Dask.

Perhaps you used something like Kubernetes or Yarn or SLURM?

These systems all have ways to get logs back.

Unfortunately, once a Dask worker is no longer running, Dask itself has no ability to collect logs for you. You need to use the system that you use to launch Dask.
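For the local-machine case described in the question, the workers are started by Client()/LocalCluster itself rather than by an external system, so one option is to pull the logs over the scheduler while the workers are still alive. Below is a minimal sketch (not part of the original answer) using dask.distributed's Client.get_worker_logs and Client.get_scheduler_logs; the worker counts and the workload placeholder are made up for illustration, and these calls only return logs from processes that are still running.

# Sketch: inspect worker/scheduler logs on a local cluster while it is alive.
import logging
from dask.distributed import Client, LocalCluster

# silence_logs sets the level at which worker/scheduler output is forwarded
# to this process's stderr; DEBUG keeps everything visible.
cluster = LocalCluster(n_workers=4, threads_per_worker=1,
                       silence_logs=logging.DEBUG)
client = Client(cluster)

# ... run the Dask workload here ...

# Each worker keeps its recent log lines in memory; this fetches them as a
# dict of {worker address: [(level, message), ...]}. It only works while the
# worker processes are still running, which is why it cannot help after a
# worker has already died.
for address, lines in client.get_worker_logs().items():
    print(address)
    for level, message in lines:
        print(level, message)

# The scheduler's recent log lines, as a list of (level, message) tuples.
for level, message in client.get_scheduler_logs():
    print(level, message)

If the hang happens before you can call these, forwarding worker output to stderr via silence_logs (or a lower logging level in config.yaml) at least lets you watch the logs live in the terminal that launched the client.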



Source: https://stackoverflow.com/questions/57618323/dask-worker-seem-die-but-cannot-find-the-worker-log-to-figure-out-why
