PySpark: org.apache.spark.sql.AnalysisException: Attribute name … contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it [duplicate]

Posted by 雨燕双飞 on 2019-12-01 21:28:19

Have you tried this?

df = df.withColumnRenamed("Foo Bar", "foobar")

When you select the column with an alias, you are still passing the invalid column name through the select clause.

The suggestion from @MaFF seems to work for me:

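# Assumes `spark` is a SparkSession with Hive support enabled.
df = spark.read.parquet("my_parquet_dump")
df2 = df.withColumnRenamed("Foo Bar", "foobar")
# createOrReplaceTempView replaces registerTempTable, deprecated since Spark 2.0.
df2.createOrReplaceTempView("temp")
spark.sql("CREATE TABLE persistent STORED AS PARQUET AS SELECT * FROM temp")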

What error messages are you getting?

You could replace the invalid characters using a regular expression. Check my answer.
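
For reference, the regex approach could look something like the sketch below. The character set is copied from the error message; the helper names `sanitize_column_name` and `sanitize_columns` and the underscore replacement are my own assumptions for illustration, not taken from the linked answer.

```python
import re

# Characters rejected in column names, copied from the error message:
# space , ; { } ( ) newline tab =
INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]")


def sanitize_column_name(name, replacement="_"):
    """Return `name` with every rejected character replaced (hypothetical helper)."""
    return INVALID_CHARS.sub(replacement, name)


def sanitize_columns(df, replacement="_"):
    """Return a DataFrame with all column names sanitized (hypothetical helper).

    `df` is expected to be a pyspark.sql.DataFrame; toDF(*names) renames
    every column positionally in a single call.
    """
    new_names = [sanitize_column_name(c, replacement) for c in df.columns]
    return df.toDF(*new_names)
```

With this, `df2 = sanitize_columns(df)` would rename "Foo Bar" to "Foo_Bar" before writing. Note that two different names can collapse to the same sanitized name (for example "a b" and "a,b"), so check for duplicates if your schema is wide.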
