I have a huge dictionary something like this:
d[id1][id2] = value
example:
books["auth1"]["humor"] = 20
books["auth1"
In 2018, I think that Pandas 0.22 supports this out of the box. Specifically, please check the from_dict class method of DataFrame.
import pandas as pd

books = {}
books["auth1"] = {}
books["auth2"] = {}
books["auth1"]["humor"] = 20
books["auth1"]["action"] = 30
books["auth2"]["comedy"] = 20
pd.DataFrame.from_dict(books, orient='columns', dtype=None)
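With orient='columns' the authors end up as columns. If you would rather have one row per author, a minimal sketch (assuming pandas is imported as pd, and that a missing genre should count as 0, which the question does not state explicitly) could look like this:

```python
import pandas as pd

books = {
    "auth1": {"humor": 20, "action": 30},
    "auth2": {"comedy": 20},
}

# orient="index" turns each outer key (author) into a row;
# genres an author does not have become NaN
df = pd.DataFrame.from_dict(books, orient="index")

# Assumption: a missing genre means a count of 0, so fill and go back to integers
df = df.fillna(0).astype(int)
print(df)
```

The astype(int) step is optional; without it the columns stay floating point, because NaN forces a float dtype.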
Use a list comprehension to turn a dict into a list of lists and/or a numpy array:
np.array([[books[author][genre] for genre in sorted(books[author])] for author in sorted(books)])
EDIT
Apparently you have an irregular number of keys in each sub-dictionary. Make a list of all the genres:
genres = ['humor', 'action', 'comedy']
And then iterate over the dictionaries in the normal manner:
import numpy as np

list_of_lists = []
for author_name, author in sorted(books.items()):
    titles = []
    for genre in genres:
        try:
            titles.append(author[genre])
        except KeyError:
            titles.append(0)
    list_of_lists.append(titles)
books_array = np.array(list_of_lists)
Basically, I append the value for each key in genres to a list. If a key is not there, the lookup raises a KeyError; I catch the error and append a 0 to the list instead.
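The same idea can probably be written more compactly. The sketch below is only one possible variant: it builds the genre list from the data instead of typing it by hand, and uses dict.get with a default of 0 in place of the try/except. The sample data is the small dictionary from this thread.

```python
import numpy as np

books = {
    "auth1": {"humor": 20, "action": 30},
    "auth2": {"comedy": 20},
}

# Collect every genre that appears under any author, in sorted order
genres = sorted({genre for author in books.values() for genre in author})

# author.get(genre, 0) returns 0 when the genre is missing,
# so no KeyError handling is needed
list_of_lists = [
    [author.get(genre, 0) for genre in genres]
    for _, author in sorted(books.items())
]
books_array = np.array(list_of_lists)
print(genres)
print(books_array)
```

Rows follow the sorted author names and columns follow the sorted genres, so keep both lists around if you need to map array positions back to keys.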
pandas does this very well:
books = {}
books["auth1"] = {}
books["auth2"] = {}
books["auth1"]["humor"] = 20
books["auth1"]["action"] = 30
books["auth2"]["comedy"] = 20
import pandas as pd
df = pd.DataFrame(books).T.fillna(0)
The output is:
       action  comedy  humor
auth1      30       0     20
auth2       0      20      0
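If a plain NumPy array is what you are after, one way to get it from the DataFrame is sketched below. It assumes a pandas version that has DataFrame.to_numpy (0.24 or later; on older versions df.values plays the same role). The explicit column sort is my addition, so the column order does not depend on the pandas version.

```python
import pandas as pd

books = {
    "auth1": {"humor": 20, "action": 30},
    "auth2": {"comedy": 20},
}

# Authors as rows, genres as columns, missing entries filled with 0
df = pd.DataFrame(books).T.fillna(0)

# Sort the columns so the genre order is predictable
df = df.sort_index(axis=1)

arr = df.to_numpy()          # 2-D array of values (floats, because of the NaN step)
authors = df.index.tolist()  # row labels
genres = df.columns.tolist() # column labels
print(authors, genres)
print(arr)
```

Keeping authors and genres next to arr lets you look up which row and column a given cell belongs to.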