CSV file creation error in spark_expect_jobj_class

Problem

I want to create a CSV file. Running the following sparklyr code gives an error.

sc <- spark_connect(master = "local", config = conf, version = '2.2.0')
sample_tbl <- spark_read_json(sc,name="example",path="example.json", header = TRUE, memory = FALSE,
                              overwrite = TRUE) 
sdf_schema_viewer(sample_tbl) # to create db schema
df <- spark_dataframe(sample_tbl)
spark_write_table(df, path = "data.csv", header = TRUE, delimiter = ",",
                charset = "UTF-8", null_value = NULL,
                options = list(), mode = NULL, partition_by = NULL)

The last line gives the following error:

Error in spark_expect_jobj_class(x, "org.apache.spark.sql.DataFrame") : 
  This operation is only supported on org.apache.spark.sql.DataFrame jobjs but found org.apache.spark.sql.Dataset instead.

Question

How can I resolve this error in R?

Answer 1:

spark_dataframe is "used to access a Spark DataFrame object (as a Java object reference) from an R object."

In other words, it exposes the internal JVM representation so that you can interact with the Scala / Java API directly. It is not needed here.
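To make the distinction concrete, here is a minimal sketch of what spark_dataframe returns and how such a reference is typically used. It assumes a local Spark installation, and it uses the built-in mtcars data as a stand-in table (the name mtcars_tbl is illustrative and does not come from the question).

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally (for example via spark_install()).
sc <- spark_connect(master = "local")

# Stand-in data: copy a small built-in data frame to Spark.
# copy_to() returns a tbl_spark, the object sparklyr's spark_* and sdf_* functions expect.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars_tbl", overwrite = TRUE)

# spark_dataframe() returns a reference to the underlying JVM object (a spark_jobj),
# not a tbl_spark.
sdf <- spark_dataframe(mtcars_tbl)
class(sdf)

# A spark_jobj is meant for calling JVM methods through invoke(),
# here the count() method of the Spark Dataset.
n_rows <- invoke(sdf, "count")

spark_disconnect(sc)
```

Passing sdf to a writer such as spark_write_table is what leads to the class check failure shown in the question; passing mtcars_tbl itself avoids it.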

When working with sdf_* or spark_* functions you should pass tbl_spark objects. As long as sample_tbl contains only atomic types, all you need is:

sample_tbl %>% spark_write_csv(path = "data.csv")

Otherwise you have to restructure it (by expanding or exploding complex fields) or convert nested structs to serialized objects (for example with to_json).
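The nested case could look roughly like the sketch below. It is an illustration under stated assumptions, not the asker's actual data: the input file, the id and address columns, and the output path are all hypothetical, and a local Spark installation (version 2.1 or later, for to_json) is assumed.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally (for example via spark_install()).
sc <- spark_connect(master = "local")

# Hypothetical input: one JSON object per line, with a nested "address" struct.
json_path <- file.path(tempdir(), "nested.json")
writeLines(
  c('{"id": 1, "address": {"city": "Oslo", "zip": "0150"}}',
    '{"id": 2, "address": {"city": "Bergen", "zip": "5003"}}'),
  json_path
)

nested_tbl <- spark_read_json(sc, name = "nested_example", path = json_path,
                              memory = FALSE, overwrite = TRUE)

# Inspect column types; struct columns are the ones the CSV writer cannot handle.
str(sdf_schema(nested_tbl))

# to_json is not an R function. dplyr passes unknown functions through to
# Spark SQL, where to_json serializes the struct into a JSON string.
flat_tbl <- nested_tbl %>%
  mutate(address = to_json(address))

# Pass the tbl_spark directly to the writer; no spark_dataframe() call is involved.
csv_path <- file.path(tempdir(), "data_csv")
spark_write_csv(flat_tbl, path = csv_path, header = TRUE, mode = "overwrite")

spark_disconnect(sc)
```

Note that Spark writes the output as a directory of part files at the given path rather than as a single CSV file.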

Source: https://stackoverflow.com/questions/52250484/csv-file-creation-error-in-spark-expect-jobj-class

Tags

apache-spark

sparklyr
