How to define a schema for PySpark createDataFrame(rdd, schema)?
Question

I looked at spark-rdd to dataframe. I read my gzipped JSON file into an RDD:

rdd1 = sc.textFile('s3://cw-milenko-tests/Json_gzips/ticr_calculated_2_2020-05-27T11-59-06.json.gz')

I want to convert it to a Spark DataFrame. The first method from the linked SO question does not work. This is the first row from the file:

{"code_event": "1092406", "code_event_system": "LOTTO", "company_id": "2", "date_event": "2020-05-27 12:00:00.000", "date_event_real": "0001-01-01 00:00:00.000", "ecode_class": "", "ecode_event
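
One possible way to approach this, sketched under assumptions: parse each JSON line into a tuple and pass an explicit StructType to createDataFrame. The names FIELDS, parse_line, build_schema and rdd_to_dataframe are hypothetical helpers, not part of PySpark. FIELDS lists only the keys visible in the sample row above, which is cut off, so the real file has more fields than shown here. Every field is declared as a nullable string because the visible values are all quoted; that is an assumption about the data, not something the question confirms.

```python
import json

# Keys visible in the (truncated) sample row; the real file contains more.
FIELDS = [
    "code_event",
    "code_event_system",
    "company_id",
    "date_event",
    "date_event_real",
    "ecode_class",
]


def parse_line(line):
    """Parse one JSON line into a tuple ordered like FIELDS (missing keys become None)."""
    record = json.loads(line)
    return tuple(record.get(name) for name in FIELDS)


def build_schema():
    """Build a StructType that declares every field in FIELDS as a nullable string."""
    # Imported here so the pure-Python helper above can be used without Spark.
    from pyspark.sql.types import StructType, StructField, StringType

    return StructType([StructField(name, StringType(), True) for name in FIELDS])


def rdd_to_dataframe(spark, rdd):
    """Convert an RDD of JSON strings into a DataFrame with an explicit schema."""
    return spark.createDataFrame(rdd.map(parse_line), build_schema())
```

Assuming spark is an existing SparkSession, the call would be df = rdd_to_dataframe(spark, rdd1). If the file holds one JSON object per line, spark.read.json may be a simpler route, since it accepts either the path to the gzipped file or an RDD of JSON strings and can infer the schema or take one through its schema argument.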