Python NaN JSON encoder

Question

By default, the JSON encoder serializes NaN as NaN, e.g. json.dumps(np.nan) returns 'NaN'. How can I make it emit null instead?

I have tried to subclass the JSONEncoder and implement the default() method as follows:

from json import JSONEncoder, dumps
import numpy as np

class NanConverter(JSONEncoder):
    def default(self, obj):
        try:
            _ = iter(obj)
        except TypeError:
            if isinstance(obj, float) and np.isnan(obj):
                return "null"
        return JSONEncoder.default(self, obj)

>>> d = {'a': 1, 'b': 2, 'c': 3, 'e': np.nan, 'f': [1, np.nan, 3]}
>>> dumps(d, cls=NanConverter)
'{"a": 1, "c": 3, "b": 2, "e": NaN, "f": [1, NaN, 3]}'

EXPECTED RESULT: '{"a": 1, "c": 3, "b": 2, "e": null, "f": [1, null, 3]}'

Answer 1:

This seems to achieve my objective:

>>> import simplejson
>>> simplejson.dumps(d, ignore_nan=True)
'{"a": 1, "c": 3, "b": 2, "e": null, "f": [1, null, 3]}'

Answer 2:

Unfortunately, you probably need to use @Bramar's suggestion. You're not going to be able to use this directly. The documentation for Python's JSON encoder states:

If specified, default is a function that gets called for objects that can’t otherwise be serialized

Your NanConverter.default method is never called, because Python's JSON encoder already knows how to serialize np.nan, which is an ordinary float. Add a print statement to default() and you'll see that it is never reached.
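A quick way to confirm this, as a minimal sketch using only the standard library: the hypothetical SpyEncoder below records every object passed to default(). Here float("nan") stands in for np.nan, on the assumption that the values in question are plain Python floats.

```python
import json

class SpyEncoder(json.JSONEncoder):
    """Hypothetical encoder that records every object handed to default()."""
    seen = []

    def default(self, obj):
        SpyEncoder.seen.append(obj)
        return super().default(obj)

nan = float("nan")  # assumption: stands in for np.nan, also a plain float

# Floats are serialized natively, so default() is never consulted for NaN.
encoded = json.dumps({"e": nan, "f": [1, nan, 3]}, cls=SpyEncoder)
print(encoded)          # {"e": NaN, "f": [1, NaN, 3]}
print(SpyEncoder.seen)  # []

# allow_nan=False does not emit null either; it raises ValueError instead.
try:
    json.dumps(nan, allow_nan=False)
except ValueError as exc:
    print("ValueError:", exc)
```

The empty list shows that the hook is bypassed, so any fix has to happen either before encoding or inside an encoder that treats NaN specially.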

Answer 3:

As @Gerrat points out, your hook dumps(d, cls=NanConverter) unfortunately won't work.
@Alexander's simplejson.dumps(d, ignore_nan=True) works but introduces an additional dependency (simplejson).
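If adding a dependency is undesirable, one option is to clean the data before encoding. The sketch below is not from the answers above: nan_to_none is a hypothetical helper that walks nested dicts, lists and tuples and replaces float NaN values with None, which json.dumps then writes as null. It assumes the data consists of plain Python containers; NumPy arrays would need converting first (for example with .tolist()).

```python
import json
import math

def nan_to_none(obj):
    """Return a copy of obj with every float NaN replaced by None (hypothetical helper)."""
    if isinstance(obj, float) and math.isnan(obj):
        return None
    if isinstance(obj, dict):
        return {key: nan_to_none(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [nan_to_none(item) for item in obj]
    return obj

nan = float("nan")  # assumption: stands in for np.nan
d = {"a": 1, "b": 2, "c": 3, "e": nan, "f": [1, nan, 3]}

result = json.dumps(nan_to_none(d))
print(result)  # {"a": 1, "b": 2, "c": 3, "e": null, "f": [1, null, 3]}
```

The trade-off is an extra pass over the data and a full copy of it, which is usually acceptable for small payloads.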

If pandas is already a dependency, an obvious solution would seem to be dumps(pd.DataFrame(d).fillna(None)), but pandas issue 1972 notes that df.fillna(None) has unpredictable behaviour:

Note that fillna(None) is equivalent to fillna(), which means the value parameter is unused. Instead, it uses the method parameter which is by default forward fill.

So instead, use DataFrame.where:

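import pandas as pd
df = pd.DataFrame(d)
# Cast to object so None is kept rather than coerced back to NaN; dumps needs a dict, not a DataFrame.
dumps(df.astype(object).where(pd.notnull(df), None).to_dict('list'))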

Answer 4:

simplejson does the right thing here, but there is one extra flag worth including.

Install simplejson:

pip install simplejson

Then in the code:

import datetime
import simplejson

# df is assumed to be an existing pandas DataFrame
response = df.to_dict('records')
simplejson.dumps(response, ignore_nan=True, default=datetime.datetime.isoformat)

The ignore_nan flag converts every NaN to null.

The default argument lets simplejson serialize your datetimes, here as ISO 8601 strings.

Source: https://stackoverflow.com/questions/28639953/python-nan-json-encoder

Tags

python

json

numpy

nan