PySpark - converting JSON string to DataFrame

别那么骄傲 2020-12-17 16:51

I have a test2.json file that contains a simple JSON object:

    {
      "Name": "something",
      "Url": "https://stackoverflow.com",
      "Author": "jangcy",
      "BlogEntries": 100,
      "Caller": "jangcy"
    }

2 Answers
  • 2020-12-17 17:15

    You can do the following

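    # assumes an active SparkSession `spark` and SparkContext `sc` (e.g. in a pyspark shell)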
    newJson = '{"Name":"something","Url":"https://stackoverflow.com","Author":"jangcy","BlogEntries":100,"Caller":"jangcy"}'
    df = spark.read.json(sc.parallelize([newJson]))
    df.show(truncate=False)
    

    which should give

    +------+-----------+------+---------+-------------------------+
    |Author|BlogEntries|Caller|Name     |Url                      |
    +------+-----------+------+---------+-------------------------+
    |jangcy|100        |jangcy|something|https://stackoverflow.com|
    +------+-----------+------+---------+-------------------------+
    
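    If the goal is to read the original test2.json file rather than an in-memory string, a minimal sketch (assuming a local path and a single JSON object that may span several lines) could look like this:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("json-to-df").getOrCreate()

    # multiLine=True lets Spark parse one JSON object spread over several lines;
    # for newline-delimited JSON it can be omitted.
    df = spark.read.json("test2.json", multiLine=True)
    df.show(truncate=False)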
  • 2020-12-17 17:15

    It is apparently part of the "ingestion pipeline" help section.

    Therefore, the field is renamed at indexing time, not at query time.
