I am loading a file of JSON objects as a PySpark SchemaRDD. I want to change the "shape" of the objects (basically, I'm flattening them) and then insert into
It looks like `select` is not available in Python, so you will have to call `registerTempTable` and write the transformation as a SQL statement, like
`SELECT flatten(*) FROM TABLE`
after registering the function for use in SQL:
`sqlCtx.registerFunction("flatten", lambda x: flatten_function(x))`
As @zero323 brought up, calling a function on `*` is probably not supported, so you can instead create a function that accepts your data types as arguments and pass the columns in explicitly.
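The body of `flatten_function` is not shown above, so here is a minimal sketch of what it might look like, assuming each record reaches it as a nested Python dict. The `prefix` and `sep` parameters, the key-joining scheme, and the sample record are all assumptions made for illustration, not part of the original answer.

```python
def flatten_function(record, prefix="", sep="_"):
    """Flatten a nested dict into a single-level dict.

    Hypothetical helper: nested keys are joined with `sep`,
    e.g. {"user": {"name": "ann"}} becomes {"user_name": "ann"}.
    """
    flat = {}
    for key, value in record.items():
        # Build the flattened key name from the parent prefix, if any.
        name = prefix + sep + key if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten_function(value, name, sep))
        else:
            flat[name] = value
    return flat


# Made-up sample record, for illustration only.
sample = {"id": 1, "user": {"name": "ann", "address": {"city": "Oslo"}}}
flat = flatten_function(sample)
print(flat)
```

With Spark available, a function along these lines would be registered through `sqlCtx.registerFunction` as in the answer and called with explicit column names in place of `*`. The exact column list depends on your schema, and the values Spark passes in may not be plain dicts, so the `isinstance` check may need adjusting.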