I have a dictionary that is filled with data from two files I imported, but some of the data comes out as nan. How do I remove the pieces of data with nan?
My code is:
import matplotlib.pyplot as plt
from pandas.lib import Timestamp
import numpy as np
from datetime import datetime
import pandas as pd
import collections
orangebook = pd.read_csv('C:\Users\WEGWEIS_JAKE\Desktop\Work Programs\Code Files\products2.txt',sep='~', parse_dates=['Approval_Date'])
specificdrugs=pd.read_csv('C:\Users\WEGWEIS_JAKE\Desktop\Work Programs\Code Files\Drugs.txt',sep=',')
"""This is a dictionary that collects data from the .txt file
This dictionary has a key,value pair for every generic name with its corresponding approval date """
drugdict={}
for d in specificdrugs['Generic Name']:
drugdict.dropna()
drugdict[d]=orangebook[orangebook.Ingredient==d.upper()]['Approval_Date'].min()
What should I add or take away from this code to make sure that there are no key,value pairs in the dictionary with a value of nan?
from math import isnan
if nans are being stored as keys:
# functional
clean_dict = filter(lambda k: not isnan(k), my_dict)
# dict comprehension
clean_dict = {k: my_dict[k] for k in my_dict if not isnan(k)}
if nans are being stored as values:
# functional
clean_dict = filter(lambda k: not isnan(my_dict[k]), my_dict)
# dict comprehension
clean_dict = {k: my_dict[k] for k in my_dict if not isnan(my_dict[k])}
With simplejson
import simplejson
clean_dict = simplejson.loads(simplejson.dumps(my_dict, ignore_nan=True))
## or depending on your needs
clean_dict = simplejson.loads(simplejson.dumps(my_dict, allow_nan=False))
Instead of trying to remove the NaNs from your dictionary, you should further investigate why NaNs are getting there in the first place.
It gets difficult to use NaNs in a dictionary, as a NaN does not equal itself.
Check this out for more information: NaNs as key in dictionaries
来源:https://stackoverflow.com/questions/24068306/is-there-a-way-to-remove-nan-from-a-dictionary-filled-with-data