Python: create a new column from existing columns

一生所求 2021-02-20 02:48

I am trying to create a new column based on two existing columns. Say I want to create a new column z that takes the value of y when y is not missing and the value of x when y is missing.

6 answers
  • 2021-02-20 03:08

    Let's say the DataFrame is called df. First copy the y column.

    df["z"] = df["y"].copy()
    

    Then fill the NaN positions in z with the values of x from the same rows.

    missing = df["z"].isna()  # rows where y had no value
    df.loc[missing, "z"] = df.loc[missing, "x"]
    
    
    >>> df 
       x   y   z
    0  1 NaN   1
    1  2   8   8
    2  4  10  10
    3  8 NaN   8
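
The copy-then-fill steps above can be sketched end to end as follows. The sample values are an assumption reconstructed from the printed output, and the fill goes through .loc with a boolean mask so that the assignment writes to df itself rather than to a temporary object.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

df["z"] = df["y"].copy()
missing = df["z"].isna()                      # True where y had no value
df.loc[missing, "z"] = df.loc[missing, "x"]   # fill only those rows from x

print(df)
```

Because y contains NaN, it is stored as a float column, so z is a float column as well.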
    
  • 2021-02-20 03:08

    I'm not sure if I understand the question, but would this be what you're looking for?

    Checking "y[i] is not None" falls back to x only when the value is None; a bare "if y[i]" would also skip 0.

    z = []
    for i in range(len(x)):
        if y[i] is not None:
            z.append(y[i])
        else:
            z.append(x[i])
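
A plain truthiness test is fragile for this kind of loop: 0 is falsy, so a real value of 0 would be dropped, while float NaN is truthy, so a NaN would be kept. A minimal sketch with an explicit missing-value check, assuming x and y are ordinary Python lists (is_missing is a hypothetical helper, not part of any library):

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

x = [1, 2, 4, 8]
y = [None, 8, 0, float("nan")]   # 0 is a real value and must be kept

# Take y where it is present, otherwise fall back to x.
z = [xi if is_missing(yi) else yi for xi, yi in zip(x, y)]
print(z)  # [1, 8, 0, 8]
```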
    
  • 2021-02-20 03:16

    The update method does almost exactly this. The only caveat is that update works in place, so you must first create a copy:

    z = df['x'].copy()
    z.update(df['y'])
    df['z'] = z
    

    In the above example you start with x and replace each value with the corresponding value from y, as long as the new value is not NaN.
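
As a rough illustration of that behaviour, the sketch below runs update on a standalone copy of x and only then attaches the result to the frame. The sample values are assumed, and x is stored as floats here so that both columns share a dtype.

```python
import numpy as np
import pandas as pd

# Assumed sample data; x is float so both columns share a dtype.
df = pd.DataFrame({"x": [1.0, 2.0, 4.0, 8.0],
                   "y": [np.nan, 8.0, 10.0, np.nan]})

z = df["x"].copy()    # start from an independent copy of x
z.update(df["y"])     # overwrite in place; NaN entries of y are skipped
df["z"] = z

print(df)
```

Working on the standalone Series keeps the in-place modification away from the original x column.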

  • 2021-02-20 03:17

    Use np.where:

    In [3]:
    
    import numpy as np
    df['z'] = np.where(df['y'].isnull(), df['x'], df['y'])
    df
    Out[3]:
       x   y   z
    0  1 NaN   1
    1  2   8   8
    2  4  10  10
    3  8 NaN   8
    

    np.where evaluates the boolean condition element by element: where it is True (y is missing) it takes the value from df['x'], otherwise the value from df['y'].
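
A self-contained version of the np.where approach might look like this. The sample data is an assumption taken from the printed output. Note that np.where returns a plain NumPy array, which pandas then stores as the new column by position.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

y_missing = df["y"].isnull()                    # boolean mask, one entry per row
picked = np.where(y_missing, df["x"], df["y"])  # x where y is missing, else y
df["z"] = picked

print(type(picked).__name__)  # ndarray
print(df)
```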

  • 2021-02-20 03:24

    The new column 'z' gets its values from column 'y' via df['z'] = df['y']. This brings over the missing values, so fill them in with fillna, passing column 'x' as the source. Chain these two actions:

    >>> df['z'] = df['y'].fillna(df['x'])
    >>> df
       x   y   z
    0  1 NaN   1
    1  2   8   8
    2  4  10  10
    3  8 NaN   8
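
One detail worth knowing about this approach: when fillna receives a Series, it matches values by index label, not by position. A small sketch under assumed sample data, where the filler is deliberately passed in reversed row order:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

df["z"] = df["y"].fillna(df["x"])

# Same filler in reversed row order: labels still line up,
# so the result is unchanged.
reversed_x = df["x"].iloc[::-1]
z_aligned = df["y"].fillna(reversed_x)

print(df)
print(z_aligned.equals(df["z"]))  # True
```

fillna returns a new Series, so column y itself keeps its NaN entries.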
    
  • 2021-02-20 03:28

    You can use apply with the option axis=1, which calls the function once per row. Then your solution is pretty concise.

    import pandas as pd
    df['z'] = df.apply(lambda row: row.y if pd.notnull(row.y) else row.x, axis=1)
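
For completeness, here is a runnable sketch of the row-wise version on assumed sample data. With axis=1 the lambda runs once per row in Python, so on large frames it is typically much slower than a vectorized call, but it reads close to the original wording of the question.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({"x": [1, 2, 4, 8], "y": [np.nan, 8, 10, np.nan]})

def pick(row):
    """Return y when it is present, otherwise x."""
    return row["y"] if pd.notnull(row["y"]) else row["x"]

df["z"] = df.apply(pick, axis=1)  # axis=1 passes each row as a Series

print(df)
```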
    