How do you transpose a dask dataframe (convert columns to rows) to approach tidy data principles?

六月ゝ 毕业季﹏ submitted on 2019-12-01 00:32:34

I think you can get the result you want by bypassing bag altogether and reading each file lazily into a pandas DataFrame, with code like this:

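import glob

import pandas as pd
import dask.dataframe as dd
from dask.delayed import delayed

filenames = glob.glob('sampleTwitter*.json')
# Lazily read each file as a pandas DataFrame; recent pandas requires orient as a keyword
dfs = [delayed(pd.read_json)(fn, orient='records') for fn in filenames]
# Combine the delayed frames into one dask dataframe, one partition per file
ddf = dd.from_delayed(dfs)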