Question
I have a pandas Series which presently looks like this:
14 [Yellow, Pizza, Restaurants]
...
160920 [Automotive, Auto Parts & Supplies]
160921 [Lighting Fixtures & Equipment, Home Services]
160922 [Food, Pizza, Candy Stores]
160923 [Hair Removal, Nail Salons, Beauty & Spas]
160924 [Hair Removal, Nail Salons, Beauty & Spas]
I want to reshape it into a DataFrame that looks something like this:
        Yellow  Automotive  Pizza
14           1           0      1
…
160920       0           1      0
160921       0           0      0
160922       0           0      1
160923       0           0      0
160924       0           0      0
i.e. a 0/1 indicator table recording which categories each observation (row) falls into.
I can write for-loop-based code to tackle the problem, but given the large number of rows I need to handle, that is going to be very slow.
Does anyone know a vectorised solution to this kind of problem? I'd be very grateful.
EDIT: there are 509 categories, which I do have a list of.
Answer 1:
In [8]: from pandas import Series
In [9]: s = Series([list('ABC'), list('DEF'), list('ABEF')])
In [10]: s
Out[10]:
0 [A, B, C]
1 [D, E, F]
2 [A, B, E, F]
dtype: object
In [11]: s.apply(lambda x: Series(1,index=x)).fillna(0)
Out[11]:
   A  B  C  D  E  F
0  1  1  1  0  0  0
1  0  0  0  1  1  1
2  1  1  0  0  1  1
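The `apply` call above builds one small Series per row, which can be slow on a large input. As a hedged alternative (not part of the original answer), the sketch below gets the same indicator table with `Series.explode` and `pd.get_dummies`, assuming pandas 0.25 or later. The `categories` list is a hypothetical stand-in for the asker's list of 509 known categories; it is used only to guarantee one column per category, including categories that never occur in the data.

```python
import pandas as pd

# Toy data mirroring the answer above: one list of labels per row.
s = pd.Series([list("ABC"), list("DEF"), list("ABEF")])

# Hypothetical stand-in for the full list of known categories.
# "G" never occurs in the data but should still get a column.
categories = ["A", "B", "C", "D", "E", "F", "G"]

# explode() yields one row per (original index, label) pair,
# repeating the original index for every label in a list.
exploded = s.explode()

# One 0/1 column per observed label, still one row per pair.
dummies = pd.get_dummies(exploded, dtype=int)

# Collapse back to one row per original index; max() keeps the
# result at 0/1 even if a label is repeated within one list.
indicators = dummies.groupby(level=0).max()

# Add all-zero columns for known categories that never occurred
# and fix the column order to match the known list.
indicators = indicators.reindex(columns=categories, fill_value=0)

print(indicators)
```

Rows whose list is empty come out of `explode` as a missing value, which `get_dummies` ignores by default, so they end up as all-zero rows rather than being dropped.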
Source: https://stackoverflow.com/questions/16637171/pandas-reshaping-data