Question
I have a piece of Dask code that runs on my local machine and works about 90% of the time, but sometimes it gets stuck. By "stuck" I mean: no crash, no error printed, no CPU usage, and it never finishes.
I googled around and think it may be due to a worker dying. It would be very useful if I could see the worker log and figure out why, but I cannot find any worker log. I edited config.yaml to add logging but still see nothing on stderr. The dashboard --> Info --> Logs page is just blank.
The code gets stuck on either X_test = df_test.to_dask_array(lengths=True) or proba = y_pred_proba_train[:, 1].compute()
My ~/.config/dask/config.yaml (or ~/.dask/config.yaml) looks like:

    logging:
      distributed: info
      distributed.client: warning
      distributed.worker: debug
      bokeh: error
I am using Python 3.6 and dask 1.1.4. All I need is a way to see the logs so that I can try to figure out what goes wrong.
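For reference, here is a rough sketch of what I have in mind for a single-machine setup (assuming the default Client/LocalCluster; the log file name is an arbitrary choice, and processes=False is only a debugging convenience so the threaded workers share the same process, and therefore the same logging handlers, as my code):

    import logging

    import dask
    from dask.distributed import Client

    # First check that the logging section of config.yaml is picked up at all.
    print(dask.config.get("logging", {}))

    # Send the distributed loggers to a file so records survive even when
    # nothing appears on stderr or in the dashboard.
    handler = logging.FileHandler("dask-debug.log")
    handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    for name in ("distributed", "distributed.worker", "distributed.scheduler"):
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)

    # processes=False keeps the workers as threads in this process while
    # debugging, so their log records go through the handler attached above.
    client = Client(processes=False)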
Thanks
Joseph
Answer 1:
Worker logs are usually managed by whatever system you use to set up Dask.
Perhaps you used something like Kubernetes or Yarn or SLURM?
These systems all have ways to get logs back.
Unfortunately, once a Dask worker is no longer running, Dask itself has no way to collect its logs for you. You need to rely on whatever system you used to launch Dask.
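For a purely local run like the one in the question, the workers are launched by Dask itself, so while they are still alive their recent in-memory log records can be pulled back through the client. A minimal sketch, assuming the installed dask.distributed version provides Client.get_worker_logs() and Client.get_scheduler_logs() (these return the same records the dashboard's Info --> Logs page displays):

    from dask.distributed import Client

    client = Client()  # or Client("<address of an already-running scheduler>")

    # Recent log records kept in memory by the scheduler and each worker.
    scheduler_logs = client.get_scheduler_logs()
    worker_logs = client.get_worker_logs()  # dict keyed by worker address

    for worker, records in worker_logs.items():
        print("===", worker, "===")
        for record in records:
            print(record)

Once a worker process has already died, those in-memory records die with it, which is why capturing logs to a file, or through the system that launched the workers, is the more reliable route.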
Source: https://stackoverflow.com/questions/57618323/dask-worker-seem-die-but-cannot-find-the-worker-log-to-figure-out-why