How to create a list in Python with the unique values of a CSV file?

长发绾君心 2020-12-06 17:38

I have a CSV file that looks like the following:

1994, Category1, Something Happened 1
1994, Category2, Something Happened 2
1995, Category1, Something Happen         


        
3 Answers
  •  借酒劲吻你
    2020-12-06 18:10

    A very concise way to do this is to use pandas. It has two benefits: a fast CSV parser, and column-wise operation, so a single df.apply(set) gets you there:

    In [244]:
    import pandas as pd
    # Suppose the CSV is named temp.csv; header=None because the file has no header row
    df = pd.read_csv('temp.csv', header=None)
    df.apply(set)
    Out[244]:
    0                        set([1994, 1995, 1996, 1998])
    1            set([ Category2,  Category3,  Category1])
    2    set([ Something Happened 4,  Something Happene...
    dtype: object
    

    The downside is that it returns a pandas.Series, so to access each column's values as a list you need to do something like list(df.apply(set)[0]).
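
If a plain dictionary of lists is more convenient than a Series, one option is to build it column by column. This is only a minimal sketch: the in-memory sample rows stand in for temp.csv and are made up for illustration, and skipinitialspace=True is an assumption added here to drop the space that follows each comma (it is not part of the original answer).

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for temp.csv
SAMPLE = """1994, Category1, Something Happened 1
1994, Category2, Something Happened 2
1995, Category1, Something Happened 3
"""

# skipinitialspace=True strips the blank after each delimiter,
# so values read as "Category1" rather than " Category1"
df = pd.read_csv(io.StringIO(SAMPLE), header=None, skipinitialspace=True)

# One list of unique values per column, keyed by column position.
# Sets are unordered, so the values are sorted for a stable result.
unique_by_column = {col: sorted(set(df[col])) for col in df.columns}

print(unique_by_column[0])
print(unique_by_column[1])
```

Sorting is a design choice for reproducible output; it does not keep the order in which values appear in the file.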

    Edit

    If the order has to be preserved, that can also be done easily, for example:

    # DataFrame.items() yields (column label, Series) pairs; iteritems() was removed in pandas 2.0
    for i, item in df.items():
        print(item.unique())
    

    item.unique() returns NumPy arrays rather than lists.
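
To end up with ordinary Python lists while keeping first-appearance order, the result of unique() can be converted with tolist(). The following is a sketch under assumptions: the sample rows are invented for illustration, and skipinitialspace=True is added here to trim the space after each comma.

```python
import io

import pandas as pd

# Hypothetical sample rows; note that 1995 appears before 1994
SAMPLE = """1995, Category2, Something Happened 1
1994, Category1, Something Happened 2
1995, Category1, Something Happened 3
"""

df = pd.read_csv(io.StringIO(SAMPLE), header=None, skipinitialspace=True)

# Series.unique() keeps values in order of first appearance;
# tolist() converts the resulting array into a plain Python list
ordered_unique = [df[col].unique().tolist() for col in df.columns]

for values in ordered_unique:
    print(values)
```

Here the first column comes back as 1995 followed by 1994, matching the file order rather than numeric order.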
