Converting JSON into newline delimited JSON in Python

囚心锁ツ 2020-12-17 16:15

My goal is to convert a JSON file into a format that can be uploaded from Cloud Storage into BigQuery (as described here) with Python.

I have tried using newlineJSON pack

3 answers
  • 2020-12-17 16:45

    This takes a JSON file and converts it into an ND-JSON file.

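    import json
    
    # Load the whole JSON array (one record per element) into memory.
    with open("results-20190312-113458.json", "r") as read_file:
        data = json.load(read_file)
    
    # Write each record as a single-line JSON document followed by a newline.
    with open("nd-processed.json", "w") as out_file:
        for record in data:
            out_file.write(json.dumps(record) + "\n")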
    

    Hope this helps someone.
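The same conversion can be wrapped in a small function, which makes it easier to reuse and to fail early on unexpected input. This is only a sketch: the function name `json_array_to_ndjson` and both path parameters are hypothetical, and it assumes the input file holds a single top-level JSON array that fits in memory.

```python
import json


def json_array_to_ndjson(src_path, dst_path):
    """Convert a file holding one JSON array into newline-delimited JSON.

    Returns the number of records written. Both paths are placeholders
    supplied by the caller.
    """
    with open(src_path, "r", encoding="utf-8") as src:
        records = json.load(src)  # the whole array is loaded into memory

    # Anything other than a list cannot be split into one record per line.
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            dst.write(json.dumps(record))
            dst.write("\n")
    return len(records)
```

Calling `json_array_to_ndjson("results-20190312-113458.json", "nd-processed.json")` would mirror the snippet above.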

  • 2020-12-17 16:57

    The answer with jq is really useful, but if you still want to do it with Python (as the question suggests), you can use the built-in json module.

    import json
    from io import StringIO
    in_json = StringIO("""[{
        "key01": "value01",
        "key02": "value02",
    
        "keyN": "valueN"
    },
    {
        "key01": "value01",
        "key02": "value02",
    
        "keyN": "valueN"
    },
    {
        "key01": "value01",
        "key02": "value02",
    
        "keyN": "valueN"
    }
    ]""")
    
    result = [json.dumps(record) for record in json.load(in_json)]  # the only significant line to convert the JSON to the desired format
    
    print('\n'.join(result))
    
    {"key01": "value01", "key02": "value02", "keyN": "valueN"}
    {"key01": "value01", "key02": "value02", "keyN": "valueN"}
    {"key01": "value01", "key02": "value02", "keyN": "valueN"}
    

    * I'm using StringIO and print here just to make the sample easier to test locally.

    As an alternative, you can use the Python jq binding to combine this with the other answer.
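The json.dumps approach relies on each record ending up on exactly one line. With the default arguments (no `indent`), json.dumps escapes newline characters inside string values, so a record cannot spill onto a second line. A quick check, using a made-up record purely for illustration:

```python
import json

# Hypothetical record whose value contains a real newline character.
record = {"note": "first line\nsecond line"}

line = json.dumps(record)

# The newline is written as the two-character escape \n, not as a line break.
assert "\n" not in line

# Parsing the line gives back the original record unchanged.
assert json.loads(line) == record

print(line)
```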

  • 2020-12-17 17:03

    If you are willing to get out of Python, use jq:

    $ cat a.json 
    [{
        "key01": "value01",
        "key02": "value02",
        "keyN": "valueN"
    },
    {
        "key01": "value01",
        "key02": "value02",
        "keyN": "valueN"
    },
    {
        "key01": "value01",
        "key02": "value02",
        "keyN": "valueN"
    }
    ]
    
    
    $ cat a.json | jq -c '.[]'
    {"key01":"value01","key02":"value02","keyN":"valueN"}
    {"key01":"value01","key02":"value02","keyN":"valueN"}
    {"key01":"value01","key02":"value02","keyN":"valueN"}
    

    The '.[]' filter iterates over the elements of the array, and -c (compact output) puts each JSON object on a single line.
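For comparison, the compact spacing that `jq -c` produces can be approximated in Python by passing `separators` to json.dumps. This is a sketch rather than a drop-in replacement for jq: the helper name `to_compact_ndjson` is hypothetical, and only the whitespace is matched here.

```python
import json

# Same sample records as in a.json above.
records = [
    {"key01": "value01", "key02": "value02", "keyN": "valueN"},
    {"key01": "value01", "key02": "value02", "keyN": "valueN"},
    {"key01": "value01", "key02": "value02", "keyN": "valueN"},
]


def to_compact_ndjson(items):
    """Serialize each item on its own line with no spaces after ',' or ':'."""
    return "\n".join(json.dumps(item, separators=(",", ":")) for item in items)


print(to_compact_ndjson(records))
```

The default separators are `", "` and `": "`, which is why the plain json.dumps output in the other answers has a space after each comma and colon.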

    Resources:

    • https://stedolan.github.io/jq/manual/
    • https://github.com/stedolan/jq