I'd like to know if there is a memory-efficient way of reading a multi-record JSON file (each line is a JSON dict) into a pandas dataframe. Below is a 2 line example with wo
++++++++Update++++++++++++++
As of v0.19, Pandas supports this natively (see https://github.com/pandas-dev/pandas/pull/13351). Just run:
df=pd.read_json('test.json', lines=True)
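Since the question asks about memory use, here is a minimal sketch of reading the same file in pieces rather than all at once. It assumes a pandas version in which read_json accepts chunksize together with lines=True (0.21 or later, as far as I know); the helper name read_json_lines_in_chunks and the default chunk size are made up for illustration.

```python
import pandas as pd


def read_json_lines_in_chunks(path, chunksize=10_000):
    """Yield DataFrames of at most `chunksize` rows from a JSON Lines file.

    Hypothetical helper: the name and the default chunk size are
    illustrative and not part of pandas.
    """
    # With lines=True and a chunksize, read_json returns an iterator of
    # DataFrames instead of loading the whole file into memory at once.
    reader = pd.read_json(path, lines=True, chunksize=chunksize)
    for chunk in reader:
        yield chunk
```

You would then loop with `for chunk in read_json_lines_in_chunks('test.json'):` and filter or aggregate each chunk before keeping anything, so only one chunk is held in memory at a time.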
++++++++Old Answer++++++++++
The existing answers are good, but for a little variety, here is another way to accomplish your goal. It requires a simple pre-processing step outside of Python that wraps the records in a single JSON array so that pd.read_json() can consume the data.
cat test.json | jq -c --slurp . > valid_test.json
df=pd.read_json('valid_test.json')
In ipython notebook, you can run the shell command directly from the cell interface with
!cat test.json | jq -c --slurp . > valid_test.json
df=pd.read_json('valid_test.json')
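If jq is not available, the same pre-processing step can be approximated in plain Python. This is only a sketch: the function name slurp_json_lines is hypothetical, and it assumes each non-blank line of the input is one complete JSON record. It writes the array one record at a time, so the conversion itself does not hold the whole file in memory.

```python
import json


def slurp_json_lines(src_path, dst_path):
    """Rewrite a JSON Lines file as a single JSON array.

    Hypothetical helper standing in for `jq -c --slurp .`; it assumes
    every non-blank line of the input is one complete JSON record.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.write("[")
        first = True
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)  # raises ValueError on a malformed record
            if not first:
                dst.write(",")
            # Compact separators, similar in spirit to jq's -c output.
            dst.write(json.dumps(record, separators=(",", ":")))
            first = False
        dst.write("]\n")
```

After `slurp_json_lines('test.json', 'valid_test.json')`, the `pd.read_json('valid_test.json')` call above works unchanged. Note that read_json still loads the whole array at that point, so this helps with the file format rather than with peak memory.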