Why is slicing, splitting, and storing data on a PySpark DataFrame time-consuming?

情深已故 2021-01-31 13:16

I am pretty new to PySpark and have been processing a large number of files. I have around 2.5 TB of data, and I am extracting some metadata from each of these files and storin
