Compressing A Series of JSON Objects While Maintaining Serial Reading?

梦谈多话 2020-12-30 15:53

I have a bunch of JSON objects that I need to compress because they're eating too much disk space: approximately 20 GB for a few million of them.

2 Answers
  •  被撕碎了的回忆
    2020-12-30 16:30

    You might want to try an incremental JSON parser, such as jsaone.

    That is, create a single JSON document containing all your objects, and parse it like this:

    import gzip
    import jsaone

    with gzip.GzipFile(file_path, 'r') as f_in:
        for key, val in jsaone.load(f_in):  # one (key, value) pair at a time
            ...
    

    This is quite similar to Martin's answer: it wastes slightly more space, but may be slightly more convenient.
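For the "create a single JSON" step, here is a minimal sketch that uses only the standard library. It is not taken from jsaone; the function name `write_objects_as_single_json` and the idea of giving each object a string key are assumptions made for illustration. It streams the pairs into one gzip-compressed top-level JSON object, so the whole collection never has to be held in memory while writing.

```python
import gzip
import json


def write_objects_as_single_json(file_path, items):
    """Stream (key, obj) pairs into one gzip-compressed JSON object.

    Hypothetical helper: `items` is any iterable of (key, obj) pairs.
    Each pair is written as soon as it is produced.
    """
    with gzip.open(file_path, "wt", encoding="utf-8") as f_out:
        f_out.write("{")
        for i, (key, obj) in enumerate(items):
            if i:
                # Separate members with a comma, but not before the first one.
                f_out.write(",")
            # json.dumps handles escaping of the key and serializes the value.
            f_out.write(json.dumps(str(key)))
            f_out.write(":")
            f_out.write(json.dumps(obj))
        f_out.write("}")
```

The result is ordinary JSON inside a gzip stream, so it can also be checked with `json.load(gzip.open(file_path, "rt"))` on a small sample before pointing an incremental parser at the full file.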

    EDIT: oh, by the way, it's probably fair to clarify that I wrote jsaone.
