Question:
import pandas as pd
Let's say I have a dataframe like so:
df = pd.DataFrame({"a":range(4),"b":range(1,5)})
It looks like this:
   a  b
0  0  1
1  1  2
2  2  3
3  3  4
And a function that multiplies x by y:
def XtimesY(x, y):
    return x * y
If I want to add a new pandas series to df I can do:
df["c"] = df.apply(lambda x: XtimesY(x["a"], 2), axis=1)
It works!
Now I want to add multiple series at once. I have this function:
def divideAndMultiply(x, y):
    return x / y, x * y
and I tried something like this:
df["e"], df["f"] = df.apply(lambda x: divideAndMultiply(x["a"], 2), axis=1)
It doesn't work! I want the 'e' column to receive the divisions and the 'f' column the multiplications.
Note: This is not the code I'm using but I'm expecting the same behavior.
Answer 1:
Redefine your function like this:
def divideAndMultiply(x,y): return [x/y, x*y]
Then do this:
df[['e','f']] = df.apply(lambda x: divideAndMultiply(x["a"], 2), axis=1)
You should get the desired result:
In [118]: df
Out[118]:
   a  b  e  f
0  0  1  0  0
1  1  2  0  2
2  2  3  1  4
3  3  4  1  6
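Depending on the pandas version, apply may hand back a single Series of lists instead of two columns, in which case the assignment above can fail. The sketch below is one way to make the intent explicit, assuming a pandas release that supports the result_type argument of DataFrame.apply. Note also that the output above shows integer division (0, 0, 1, 1), as in Python 2; under Python 3 the 'e' column would hold 0.0, 0.5, 1.0 and 1.5.

```python
import pandas as pd

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})


def divideAndMultiply(x, y):
    # Return both results as a list so each element can become a column.
    return [x / y, x * y]


# result_type="expand" asks apply to spread each returned list across
# columns, so the right-hand side is a two-column DataFrame.
df[["e", "f"]] = df.apply(
    lambda row: divideAndMultiply(row["a"], 2), axis=1, result_type="expand"
)

print(df)
```

The two target names are matched to the expanded columns by position, so the order of the returned list decides which values land in 'e' and which in 'f'.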
Answer 2:
Almost there. Use zip(*...) to unpack the values returned by the function. Try this:
def divideAndMultiply(x, y):
    return x / y, x * y
df["e"], df["f"] = zip(*df.a.apply(lambda val: divideAndMultiply(val, 2)))
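To see why the zip(*...) step is needed, it can help to look at the intermediate values. The following is a minimal sketch under Python 3; the names pairs, quotients and products are illustrative and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})


def divideAndMultiply(x, y):
    return x / y, x * y


# Series.apply produces one (quotient, product) tuple per row, so there
# are four items here, which cannot be unpacked into two targets directly.
pairs = df["a"].apply(lambda val: divideAndMultiply(val, 2))

# zip(*pairs) regroups the four row tuples into two tuples of four values:
# all quotients first, then all products.
quotients, products = zip(*pairs)

df["e"] = quotients
df["f"] = products

print(df)
```

In other words, the original attempt failed because there was one item per row rather than one item per new column; zip(*...) transposes the former into the latter.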
Answer 3:
This doesn't work when you do it twice:
df = pd.DataFrame({"a":range(4),"b":range(1,5)})
print(df)
def foo(x,y):
    return [x/y, x*y]
df[['e','f']] = df.apply( lambda x: foo(x["a"],2) , axis =1)
print(df)
df[['g','h']] = df.apply( lambda x: foo(x["a"],2) , axis =1)
print(df)
yields:
   a  b
0  0  1
1  1  2
2  2  3
3  3  4
   a  b    e    f
0  0  1  0.0  0.0
1  1  2  0.5  2.0
2  2  3  1.0  4.0
3  3  4  1.5  6.0
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-65-edf8718c90ec> in <module>()
      8 df[['e','f']] = df.apply( lambda x: foo(x["a"],2) , axis =1)
      9 print(df)
---> 10 df[['g','h']] = df.apply( lambda x: foo(x["a"],2) , axis =1)
     11 print(df)
     12

E:\dev\Anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   2324
   2325         if isinstance(key, (Series, np.ndarray, list, Index)):
-> 2326             self._setitem_array(key, value)
   2327         elif isinstance(key, DataFrame):
   2328             self._setitem_frame(key, value)

E:\dev\Anaconda3\lib\site-packages\pandas\core\frame.py in _setitem_array(self, key, value)
   2352                 self[k1] = value[k2]
   2353         else:
-> 2354             indexer = self.loc._convert_to_indexer(key, axis=1)
   2355             self._check_setitem_copy()
   2356             self.loc._setitem_with_indexer((slice(None), indexer), value)

E:\dev\Anaconda3\lib\site-packages\pandas\core\indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1229                 mask = check == -1
   1230                 if mask.any():
-> 1231                     raise KeyError('%s not in index' % objarr[mask])
   1232
   1233             return _values_from_object(indexer)

KeyError: "['g' 'h'] not in index"
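A plausible reading of the traceback, offered as an assumption rather than a confirmed diagnosis, is that in this pandas version apply returned a DataFrame on the first call and something else once the frame had four columns, so the second assignment fell through to the label lookup that raised the KeyError. One way to sidestep that dependence is to build the new columns explicitly. In the sketch below, add_columns is a hypothetical helper introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})


def foo(x, y):
    return [x / y, x * y]


def add_columns(frame, names):
    # Hypothetical helper: compute one [quotient, product] list per row,
    # then build a DataFrame with an explicit index and column names.
    rows = [foo(a, 2) for a in frame["a"]]
    new_cols = pd.DataFrame(rows, index=frame.index, columns=names)
    # join aligns on the index and returns a new frame with the extra columns.
    return frame.join(new_cols)


df = add_columns(df, ["e", "f"])
df = add_columns(df, ["g", "h"])  # the second call behaves like the first

print(df)
```

Because the intermediate DataFrame is constructed by hand, the result does not depend on how many columns the frame already has or on how apply chooses to shape its output.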
Answer 4:
df["e"], df["f"] = zip(*df.apply( lambda x: divideAndMultiply(x["a"],2) , axis =1))
Should do the trick.
(I show this example so you can see how to use multiple columns as the input to create multiple new columns)
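The line in this answer still reads only column 'a'. As a sketch of the multi-column case it alludes to, the variant below passes both 'a' and 'b' from each row to the function. It assumes Python 3 and a pandas version in which a row-wise apply that returns tuples yields a Series of tuples.

```python
import pandas as pd

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})


def divideAndMultiply(x, y):
    return x / y, x * y


# With axis=1 each call receives a whole row, so any number of input
# columns can be read from it. zip(*...) then splits the per-row tuples
# into one sequence per new column.
df["e"], df["f"] = zip(
    *df.apply(lambda row: divideAndMultiply(row["a"], row["b"]), axis=1)
)

print(df)
```

Here 'e' holds a divided by b and 'f' holds a times b for each row. Column 'b' contains no zeros in this example, so the division is safe.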