Return multiple columns from pandas apply()

后端未结

关注

 9  2139

灰色年华 2020-11-30 18:43

I have a pandas DataFrame, df_test. It contains a column \'size\' which represents size in bytes. I\'ve calculated KB, MB, and GB using the following code:

9条回答

心在旅途 (楼主)

2020-11-30 19:36

I believe the 1.1 version breaks the behavior suggested in the top answer here.

import pandas as pd
def test_func(row):
    row['c'] = str(row['a']) + str(row['b'])
    row['d'] = row['a'] + 1
    return row

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['i', 'j', 'k']})
df.apply(test_func, axis=1)

The above code ran on pandas 1.1.0 returns:

   a  b   c  d
0  1  i  1i  2
1  1  i  1i  2
2  1  i  1i  2

While in pandas 1.0.5 it returned:

   a   b    c  d
0  1   i   1i  2
1  2   j   2j  3
2  3   k   3k  4

Which I think is what you'd expect.

Not sure how the release notes explain this behavior, however as explained here avoiding mutation of the original rows by copying them resurrects the old behavior. i.e.:

def test_func(row):
    row = row.copy()   #  <---- Avoid mutating the original reference
    row['c'] = str(row['a']) + str(row['b'])
    row['d'] = row['a'] + 1
    return row

0 讨论(0)

查看其它9个回答