Duplicate row based on value in different column

感动是毒 2020-12-06 05:07

I have a dataframe of transactions. Each row represents a transaction of two items (think of it like a transaction of 2 event tickets or something). I want to duplicate each

3 Answers

一个人的身影 (original poster)

2020-12-06 05:57

First, I recreated your data using integers instead of text. I also varied the quantity so that one can more easily understand the problem.

import pandas as pd

# Sample data: row label -> [price, city, quantity]
d = {1: [20, 'NYC', 1], 2: [30, 'NYC', 2], 3: [5, 'SF', 3],
     4: [300, 'LA', 1], 5: [30, 'LA', 2], 6: [100, 'SF', 3]}

columns = ['Price', 'City', 'Quantity']

# Create the dataframe and name the columns
df = pd.DataFrame.from_dict(data=d, orient='index').sort_index()
df.columns = columns

>>> df
   Price City  Quantity
1     20  NYC         1
2     30  NYC         2
3      5   SF         3
4    300   LA         1
5     30   LA         2
6    100   SF         3

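I then created a new DataFrame with a nested list comprehension: the outer loop walks the index, and the inner loop emits each row once per unit of its Quantity.

# Label-based lookup with .loc; each row is repeated Quantity times
df_new = pd.DataFrame([df.loc[idx]
                       for idx in df.index
                       for _ in range(df.loc[idx, 'Quantity'])]).reset_index(drop=True)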
>>> df_new
    Price City  Quantity
0      20  NYC         1
1      30  NYC         2
2      30  NYC         2
3       5   SF         3
4       5   SF         3
5       5   SF         3
6     300   LA         1
7      30   LA         2
8      30   LA         2
9     100   SF         3
10    100   SF         3
11    100   SF         3
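
For larger frames, the Python-level loop above can become slow. As an alternative (not part of the original answer), here is a minimal sketch that should give the same rows by repeating the index labels with Index.repeat and selecting them with .loc. It assumes the index labels are unique and that Quantity holds non-negative integers; the sample frame is rebuilt so the snippet runs on its own.

```python
import pandas as pd

# Same sample data as above, rebuilt so the snippet is self-contained
df = pd.DataFrame(
    {"Price": [20, 30, 5, 300, 30, 100],
     "City": ["NYC", "NYC", "SF", "LA", "LA", "SF"],
     "Quantity": [1, 2, 3, 1, 2, 3]},
    index=range(1, 7),
)

# Repeat each index label Quantity times, then select those labels.
# Assumes unique index labels and non-negative integer quantities.
repeated_labels = df.index.repeat(df["Quantity"])
df_new = df.loc[repeated_labels].reset_index(drop=True)
```

This keeps the column dtypes intact, whereas building a frame from a list of row Series may fall back to object columns.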
