Optimization of json.load() to reduce memory usage and time in Python
Question

I have 10K folders, each containing 200 records spread across 200 JSON files. I am trying to compile all records into one DataFrame and then write it out as a CSV (suggestions for other output formats are welcome).

Here is my working solution, which takes around 8.3 hours just for the DataFrame-building step (not counting the CSV conversion). The original snippet is cut off after `pd.json`; the last two lines of the loop are a plausible reconstruction:

    %%time
    import json
    from pathlib import Path

    import pandas as pd

    finalDf = pd.DataFrame()
    rootdir = '/path/foldername'
    all_files = Path(rootdir).rglob('*.json')
    for filename in all_files:
        with open(filename, 'r') as f:
            data = json.load(f)
        df = pd.json_normalize(data)           # reconstructed: snippet truncated at "pd.json"
        finalDf = pd.concat([finalDf, df])     # reconstructed: grow the combined DataFrame
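For reference, a common way to avoid the quadratic cost of growing a DataFrame inside the loop is to accumulate plain Python records in a list and build the DataFrame once at the end. Below is a minimal sketch under stated assumptions: the path is the placeholder from the question, and each JSON file is assumed to hold either a list of record dicts or a single record dict; adjust the extraction step to match the actual file layout.

    import json
    from pathlib import Path

    import pandas as pd

    rootdir = '/path/foldername'  # placeholder path, as in the question
    records = []
    for filename in Path(rootdir).rglob('*.json'):
        with open(filename, 'r') as f:
            data = json.load(f)
        # Assumption: each file contains a list of records or one record dict.
        records.extend(data if isinstance(data, list) else [data])

    # Build the DataFrame once; json_normalize flattens any nested fields.
    finalDf = pd.json_normalize(records)
    finalDf.to_csv('combined.csv', index=False)

Accumulating into a list is O(n) overall, whereas concatenating (or appending) inside the loop re-copies the accumulated data on every iteration, which is likely a large part of the 8.3-hour runtime.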