How to return a specific data structure with inner dictionary of lists

…衆ロ難τιáo~ posted on 2020-06-29 04:11:07

Question


I have a CSV file (image attached) and want to read it into a dictionary of lists, keyed by method, built from the columns "{method},{number},{orbital_period},{mass},{distance},{year}".

So far I have this code:

import csv

with open('exoplanets.csv') as inputfile : 
    reader = csv.reader(inputfile)
    inputm = list(reader)
    print(inputm) 

but my output comes out as rows like ['Radial Velocity', '1', '269.3', '7.1', '77.4', '2006']

when I want it to look like:

 "Radial Velocity" : {"number":[1,1,1], "orbital_period":[269.3, 874.774, 763.0], "mass":[7.1, 2.21, 2.6], "distance":[77.4, 56.95, 19.84], "year":[2006.0, 2008.0, 2011.0] } , "Transit" : {"number":[1,1,1], "orbital_period":[1.5089557, 1.7429935, 4.2568], "mass":[], "distance":[200.0, 680.0], "year":[2008.0, 2008.0, 2008.0] }


Any ideas on how I can alter my code?


Answer 1:


If you don't want to use Pandas, maybe something like this is what you're looking for:

import csv

with open('exoplanets.csv', newline='') as inputfile:  # newline='' is what the csv docs recommend
    reader = csv.reader(inputfile)
    inputm = list(reader)

    header = inputm.pop(0)
    del header[0] # probably you don't want "#method"

    # create and populate the final dictionary
    data = {}
    for row in inputm:
        if row[0] not in data:
            data[row[0]] = {h:[] for h in header}

        for i, h in enumerate(header):
            data[row[0]][h].append(row[i+1])

print(data)
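
Note that the loop above stores every value as a string and keeps blank cells as ''. Below is a minimal sketch of one way to get numeric lists with blanks skipped, closer to the output the question asks for. The sample rows and the build_table helper name are made up for illustration, and the real file is assumed to start with the header #method,number,orbital_period,mass,distance,year. Every value is converted with float(), so "number" comes out as 1.0 rather than 1.

```python
import csv
import io

# Hypothetical sample standing in for exoplanets.csv
SAMPLE = """#method,number,orbital_period,mass,distance,year
Radial Velocity,1,269.3,7.1,77.4,2006
Radial Velocity,1,874.774,2.21,56.95,2008
Transit,1,1.5089557,,200.0,2008
Transit,1,1.7429935,,,2008
"""

def build_table(lines):
    """Group rows by the first column into {method: {column: [floats]}}."""
    reader = csv.reader(lines)
    header = next(reader)[1:]  # drop the "#method" column name
    data = {}
    for row in reader:
        if not row:  # skip completely empty lines
            continue
        # Create the inner dict the first time a method is seen
        columns = data.setdefault(row[0], {h: [] for h in header})
        for name, cell in zip(header, row[1:]):
            if cell.strip():  # blank cells are skipped, not stored
                columns[name].append(float(cell))
    return data

table = build_table(io.StringIO(SAMPLE))
print(table["Transit"])
```

To run it against the real file, pass the open file object to build_table instead of the io.StringIO wrapper.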



Answer 2:


Hey SKR01, welcome to Stack Overflow!

I would suggest working with the pandas library, which is meant for table-like data such as yours. What you are looking for is a groupby on your #method column.

import pandas as pd

def remove_index(row):
    d = row._asdict()
    del d["Index"]
    return d


df = pd.read_csv("https://docs.google.com/uc?export=download&id=1PnQzoefx-IiB3D5BKVOrcawoVFLIPVXQ")

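# After groupby, the "#method" values become the index of the aggregated frame
result = {row.Index : remove_index(row) for row in df.groupby('#method').aggregate(list).itertuples()}
print(result)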

The only thing that remains is removing the NaN values (the blank cells in the CSV) from the resulting dict.
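
One possible way to do that last step, sketched with the standard library only. The drop_nan helper and the sample dict are hypothetical; the sample only mimics the shape the comprehension above is expected to produce, with float('nan') standing in for blank cells.

```python
import math

def drop_nan(grouped):
    """Return a copy of {method: {column: [values]}} with NaN entries removed."""
    return {
        method: {
            column: [
                v for v in values
                # Only floats can be NaN; ints and strings pass through
                if not (isinstance(v, float) and math.isnan(v))
            ]
            for column, values in columns.items()
        }
        for method, columns in grouped.items()
    }

# Hypothetical sample in the shape of the grouped result
grouped = {
    "Transit": {
        "number": [1, 1, 1],
        "mass": [float("nan"), float("nan"), float("nan")],
        "distance": [200.0, 680.0, float("nan")],
    }
}

cleaned = drop_nan(grouped)
print(cleaned)
```

A column that was entirely blank, such as "mass" here, ends up as an empty list, which matches the format in the question.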




Answer 3:


This is a bit complex, and I'm questioning why you want the data this way, but this should get you the output format you want without requiring any external libraries like Pandas.

import csv

with open('exoplanets.csv') as input_file: 
    rows = list(csv.DictReader(input_file))

    # Create the data structure
    methods = {d["#method"]: {} for d in rows}

    # Get a list of fields, trimming off the method column
    # (rows[0] rather than rows[1], so a single-row file also works)
    fields = list(rows[0])[1:]

    # Fill in the data structure
    for method in methods:
        methods[method] = {
            # Null-trimmed version of listcomp
            # f: [r[f] for r in rows if r["#method"] == method and r[f]]
            f: [r[f] for r in rows if r["#method"] == method]
            for f
            in fields
        }

Note: This could be one multi-tiered list/dict comprehension, but I've broken it apart for clarity.
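
For comparison, here is a sketch of what that single comprehension might look like. The sample data is made up and read through io.StringIO instead of the real exoplanets.csv, and the values stay as strings, as in the code above.

```python
import csv
import io

# Hypothetical sample standing in for exoplanets.csv
SAMPLE = """#method,number,orbital_period,mass,distance,year
Radial Velocity,1,269.3,7.1,77.4,2006
Transit,1,1.5089557,,200.0,2008
Transit,1,4.2568,,680.0,2008
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
fields = [name for name in rows[0] if name != "#method"]

# Outer dict: one entry per distinct method.
# Inner dict: one list per field, filled from the rows of that method.
methods = {
    method: {f: [r[f] for r in rows if r["#method"] == method] for f in fields}
    for method in {r["#method"] for r in rows}
}

print(methods["Transit"]["distance"])
```

Like the expanded version, this rescans all rows for every method and field, which is fine for a small file but worth keeping in mind for a large one.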



Source: https://stackoverflow.com/questions/62248605/how-to-return-a-specific-data-structure-with-inner-dictionary-of-lists
