How to see progress of Dask Compute task?

蹲街弑〆低调 提交于 2019-11-29 11:22:37

问题


I would like to see a progressbar on Jupyternotebook while i'm running a compute task using Dask, I'm counting all values of "id" column from a large csv file +4GB, so any ideas?

import dask.dataframe as dd

df = dd.read_csv('data/train.csv')
df.id.count().compute()

回答1:


If you're using the single machine scheduler then do this:

from dask.diagnostics import ProgressBar
ProgressBar().register()

http://dask.pydata.org/en/latest/diagnostics-local.html

If you're using the distributed scheduler then do this:

from dask.distributed import progress

result = df.id.count.persist()
progress(result)

Or just use the dashboard

http://dask.pydata.org/en/latest/diagnostics-distributed.html



来源:https://stackoverflow.com/questions/49039750/how-to-see-progress-of-dask-compute-task

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!