Read CSV file in pyspark with ANSI encoding

Submitted by 荒凉一梦 on 2021-01-29 13:25:54

Question


I am trying to read in a CSV/text file that requires ANSI encoding, but it is not working. Any ideas?

mainDF= spark.read.format("csv")\
                  .option("encoding","ANSI")\
                  .option("header","true")\
                  .option("maxRowsInMemory",1000)\
                  .option("inferSchema","false")\
                  .option("delimiter", "¬")\
                  .load(path)

java.nio.charset.UnsupportedCharsetException: ANSI

The file is over 5 GB, hence the Spark requirement.

I have also tried "ansi" in lower case.


Answer 1:


"ANSI" is not the name of a real charset, which is why Java's charset registry rejects it; it is an informal Windows term for the system code page. ISO-8859-1 (Latin-1) covers the same single-byte range (Windows-1252, i.e. "cp1252", is the exact "ANSI" code page and differs only in the 0x80–0x9F block), so replace "ANSI" with ISO-8859-1 in the encoding option above.
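A minimal sketch of the fix, with a standard-library demonstration of why the original call fails. The Spark call is shown as a comment and assumes the same `spark` and `path` variables as in the question; the row bytes below are made up for illustration:

```python
# Hypothetical fixed Spark call (same options as the question, valid charset name):
# mainDF = spark.read.format("csv") \
#                   .option("encoding", "ISO-8859-1") \
#                   .option("header", "true") \
#                   .option("delimiter", "¬") \
#                   .load(path)

import codecs

# "ANSI" is not a registered codec name in Python either, mirroring
# Spark's java.nio.charset.UnsupportedCharsetException.
try:
    codecs.lookup("ANSI")
    ansi_exists = True
except LookupError:
    ansi_exists = False

# The "¬" delimiter is byte 0xAC in both ISO-8859-1 and Windows-1252,
# so a row splits into the same fields under either charset.
row = b"col1\xaccol2\xaccol3"
fields = row.decode("iso-8859-1").split("\u00ac")
```

If the file contains Windows-specific characters such as curly quotes or the euro sign, prefer `"windows-1252"` over `"ISO-8859-1"`, since those code points live in the range where the two charsets differ.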



Source: https://stackoverflow.com/questions/59645851/read-csv-file-in-pyspark-with-ansi-encoding
