Create multiple new columns for pandas dataframe with apply + function

谁说胖子不能爱 submitted on 2019-12-11 04:38:34

Question


I have a pandas dataframe df of the following shape: (763, 65)

I use the following code to create 4 new columns:

df[['col1', 'col2', 'col3','col4']] = df.apply(myFunc, axis=1)

def myFunc(row):
    #code to get some result from another dataframe
    return result1, result2, result3, result4

The shape of the dataframe returned by myFunc is (1, 4). The code runs into the following error:

ValueError: Shape of passed values is (763, 4), indices imply (763, 65)

I know that df has 65 columns and that the returned data from myFunc only has 4 columns. However, I only want to create the 4 new columns (that is, col1, col2, etc.), so in my opinion the code is correct when it only returns 4 columns in myFunc. What am I doing wrong?


Answer 1:


Demo:

In [40]: df = pd.DataFrame({'a':[1,2,3]})

In [41]: df
Out[41]:
   a
0  1
1  2
2  3

In [42]: def myFunc(row):
    ...:     #code to get some result from another dataframe
    ...:     # NOTE: trick is to return pd.Series()
    ...:     return pd.Series([1,2,3,4]) * row['a']
    ...:

In [44]: df[['col1', 'col2', 'col3','col4']] = df.apply(myFunc, axis=1)

In [45]: df
Out[45]:
   a  col1  col2  col3  col4
0  1     1     2     3     4
1  2     2     4     6     8
2  3     3     6     9    12
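
As a possible alternative to wrapping the result in pd.Series, the function can keep returning a plain tuple if apply is asked to expand it into columns. The sketch below assumes pandas 0.23 or later, where DataFrame.apply accepts result_type='expand'; the name my_func and the sample values are illustrative only and are not taken from the question.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

def my_func(row):
    # Hypothetical stand-in for the lookup in another dataframe:
    # return four plain values as a tuple.
    a = row['a']
    return a, 2 * a, 3 * a, 4 * a

# result_type='expand' turns each returned tuple into one row of a
# 4-column DataFrame, which is then assigned to the four new columns.
df[['col1', 'col2', 'col3', 'col4']] = df.apply(
    my_func, axis=1, result_type='expand'
)

print(df)
```

This still calls the function once per row, so it behaves like the pd.Series version in terms of speed; it only avoids building a Series inside the function.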

Disclaimer: try to avoid using .apply(..., axis=1). It is a for loop under the hood, i.e. it is not vectorized and will run much slower than vectorized Pandas/NumPy ufuncs.

PS: if you provide details of what you are trying to calculate in the myFunc function, we could try to find a vectorized solution...
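
To illustrate what a vectorized version can look like, here is a minimal sketch for the toy demo only, where each new column is a constant multiple of column a. It assumes the real calculation can be expressed as array operations, which the question does not confirm; the names factors, new_cols and out are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

# Multipliers used in the demo (1x, 2x, 3x, 4x of column 'a').
factors = np.array([1, 2, 3, 4])

# The outer product computes all rows and all four columns in one call,
# with no Python-level loop over rows.
values = np.outer(df['a'].to_numpy(), factors)

new_cols = pd.DataFrame(
    values,
    index=df.index,
    columns=['col1', 'col2', 'col3', 'col4'],
)

# join aligns on the index and returns a new frame with the extra columns.
out = df.join(new_cols)
print(out)
```

If the values actually come from another dataframe, a merge or join on a shared key is often the vectorized counterpart, but that depends on details not given here.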



Source: https://stackoverflow.com/questions/46696807/create-multiple-new-columns-for-pandas-dataframe-with-apply-function
