I would like to know which of json.dump() or json.dumps() is more efficient when it comes to encoding a large array to JSON format.
You can simply replace
f.write(json.dumps(mytab, default=dthandler, indent=4))
with
json.dump(mytab, f, default=dthandler, indent=4)
This should "stream" the data into the file.
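For illustration, here is a minimal, self-contained sketch; the dthandler implementation and the sample mytab data are my own assumptions, since only their names appear above:

import json
from datetime import datetime

# Assumed handler for the dthandler referenced above: serialize datetime
# objects as ISO 8601 strings and reject anything else.
def dthandler(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

mytab = [{"id": 1, "created": datetime.now()},
         {"id": 2, "created": datetime.now()}]

with open("out.json", "w") as f:
    # json.dump() writes piece by piece to the file object, whereas
    # json.dumps() would first build the whole string in memory.
    json.dump(mytab, f, default=dthandler, indent=4)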
The json module will allocate the entire JSON string in memory before writing, which is why a MemoryError occurs.
To get around this problem, use json.JSONEncoder().iterencode():
with open(filepath, 'w') as f:
    for chunk in json.JSONEncoder().iterencode(object_to_encode):
        f.write(chunk)
However, note that this will generally take quite a while, since it writes many small chunks rather than everything at once.
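If the per-chunk writes become the bottleneck, one workaround (a sketch of my own, not part of the original answer; the dump_in_batches name and batch_chars parameter are made up) is to buffer the chunks and write them in larger batches:

import json

def dump_in_batches(obj, fp, batch_chars=2 ** 16):
    # Stream obj through iterencode, but flush roughly batch_chars
    # characters per write instead of one tiny chunk at a time.
    buffer = []
    buffered = 0
    for chunk in json.JSONEncoder().iterencode(obj):
        buffer.append(chunk)
        buffered += len(chunk)
        if buffered >= batch_chars:
            fp.write(''.join(buffer))
            buffer = []
            buffered = 0
    if buffer:
        fp.write(''.join(buffer))

Memory use stays bounded by the batch size plus the largest single chunk, so you keep the streaming behaviour while cutting down the number of write calls.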
Special case:
I had a Python object that was a list of dicts, like so:
[
    { "prop": 1, "attr": 2 },
    { "prop": 3, "attr": 4 }
    # ...
]
I could json.dumps() individual objects, but dumping the whole list generated a MemoryError.
To speed up writing, I opened the file and wrote the JSON delimiters manually:
with open(filepath, 'w') as f:
    f.write('[')
    for obj in list_of_dicts[:-1]:
        json.dump(obj, f)
        f.write(',')
    json.dump(list_of_dicts[-1], f)
    f.write(']')
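A small refinement of that loop (my own sketch, not from the original answer): writing the comma before every element except the first avoids copying the list with [:-1] and also works for an empty list:

import json

def dump_list_of_dicts(list_of_dicts, filepath):
    # Hypothetical helper: stream the list element by element, emitting
    # the brackets and commas by hand so the full list is never
    # serialized as a single in-memory string.
    with open(filepath, 'w') as f:
        f.write('[')
        for i, obj in enumerate(list_of_dicts):
            if i:
                f.write(',')
            json.dump(obj, f)
        f.write(']')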
You can probably get away with something like that if you know your JSON object structure beforehand. For general use, just use json.JSONEncoder().iterencode().