How to read a gz-compressed file with PySpark

名媛妹妹 2020-12-19 02:04

I have line-oriented data in gzip-compressed (.gz) format, and I need to read it in PySpark. Here is the code snippet:

rdd = sc.textFile("data/label.gz").map(func)

3 Answers

太阳男子 (OP)

2020-12-19 02:45

You didn't include the error message you got, but the likely cause is that gzipped files are not splittable, so Spark has to read each .gz file as a single partition. If that is the problem, use a splittable compression codec such as bzip2.
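
As a rough illustration of the bzip2 suggestion, here is a minimal sketch that recompresses a .gz file to .bz2 using only the Python standard library, outside Spark. The helper name `recompress_gz_to_bz2` and the output path `data/label.bz2` are hypothetical, chosen for this example; only `data/label.gz` comes from the question.

```python
import bz2
import gzip
import shutil


def recompress_gz_to_bz2(gz_path, bz2_path, chunk_size=1024 * 1024):
    """Rewrite a gzip file as a bzip2 file (hypothetical helper).

    The data is streamed in chunks, so the whole file never has to
    fit in memory at once.
    """
    with gzip.open(gz_path, "rb") as src, bz2.open(bz2_path, "wb") as dst:
        # Copy decompressed bytes from the gzip stream into the bzip2 stream.
        shutil.copyfileobj(src, dst, chunk_size)
    return bz2_path


if __name__ == "__main__":
    # Assumed paths: the input is the file from the question,
    # the output name is made up for this sketch.
    recompress_gz_to_bz2("data/label.gz", "data/label.bz2")
```

After that, `sc.textFile("data/label.bz2")` should be readable the same way as before, with the difference that Spark can split a bzip2 file across several partitions. For a small file this conversion is probably not worth the effort, since `sc.textFile` can also read a .gz file directly as one partition.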
