Duplicate row based on value in different column

感动是毒 2020-12-06 05:07

I have a dataframe of transactions. Each row represents a transaction of two items (think of it as a transaction of 2 event tickets or something similar). I want to duplicate each

3 answers
  •  一个人的身影
    2020-12-06 05:57

    First, I recreated your data using integers instead of text. I also varied the quantity so that one can more easily understand the problem.

    import pandas as pd

    # keys become the row index; each list is [Price, City, Quantity]
    d = {1: [20, 'NYC', 1], 2: [30, 'NYC', 2], 3: [5, 'SF', 3],
         4: [300, 'LA', 1], 5: [30, 'LA', 2],  6: [100, 'SF', 3]}

    columns = ['Price', 'City', 'Quantity']

    # create dataframe and rename columns
    df = pd.DataFrame.from_dict(data=d, orient='index').sort_index()
    df.columns = columns
    
    >>> df
       Price City  Quantity
    1     20  NYC         1
    2     30  NYC         2
    3      5   SF         3
    4    300   LA         1
    5     30   LA         2
    6    100   SF         3
    

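    I created a new DataFrame with a nested list comprehension: the outer loop walks over the index labels, and the inner loop repeats each row as many times as its Quantity value. (`.ix` has been removed from recent pandas versions, so label-based `.loc` is used here.)

    # repeat each row Quantity times, then renumber the rows from 0
    df_new = pd.DataFrame([df.loc[idx]
                           for idx in df.index
                           for _ in range(df.loc[idx, 'Quantity'])]).reset_index(drop=True)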
    >>> df_new
        Price City  Quantity
    0      20  NYC         1
    1      30  NYC         2
    2      30  NYC         2
    3       5   SF         3
    4       5   SF         3
    5       5   SF         3
    6     300   LA         1
    7      30   LA         2
    8      30   LA         2
    9     100   SF         3
    10    100   SF         3
    11    100   SF         3
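
As an alternative that is not part of the original answer, the same duplication can be sketched without a Python-level loop by repeating the index labels with `Index.repeat` and selecting them with `.loc`. This assumes the index labels are unique and that `Quantity` holds non-negative integers; the name `df_repeated` is only illustrative.

```python
import pandas as pd

# Same sample data as above: each list is [Price, City, Quantity]
d = {1: [20, 'NYC', 1], 2: [30, 'NYC', 2], 3: [5, 'SF', 3],
     4: [300, 'LA', 1], 5: [30, 'LA', 2], 6: [100, 'SF', 3]}
df = pd.DataFrame.from_dict(d, orient='index',
                            columns=['Price', 'City', 'Quantity'])

# Repeat every index label Quantity times, select those rows by label,
# then renumber the result from 0. Assumes unique index labels.
df_repeated = df.loc[df.index.repeat(df['Quantity'])].reset_index(drop=True)

print(df_repeated)
```

Because the selection goes through `.loc` rather than rebuilding rows from individual Series, the column dtypes of the original frame are kept.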
    
