interacting over a dateframe with functions

问题

if I have a date frame like this: N

EG_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24  \
datum_von                                                                      
2017-10-12      21.69      15.36       0.87       1.42       0.76       0.65   
2017-10-13      11.85       8.08       1.39       2.86       1.02       0.55   
2017-10-14       7.83       5.88       1.87       2.04       2.29       2.18   
2017-10-15      14.64      11.28       2.62       3.35       2.13       1.25   
2017-10-16       5.11       5.82      -0.30      -0.38      -0.24      -0.10   
2017-10-17      12.09       9.61       0.20       1.09       0.39       0.57

And I wanna check the values that are above 0 and change them to zero when they are lower.

Not sure how should I use the function iterrows() and the loc() function to do so.

回答1:

clip_lower and mask solutions are good.

Here is another one with applymap:

df.applymap(lambda x: max(0.0, x))

回答2:

you can try:

df1 = df[df > 0].fillna(0)

as result:

In [24]: df
Out[24]: 
     EG_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24  \
0  2017-10-12      21.69      15.36       0.87       1.42       0.76   
1  2017-10-13      11.85       8.08       1.39       2.86       1.02   
2  2017-10-14       7.83       5.88       1.87       2.04       2.29   
3  2017-10-15      14.64      11.28       2.62       3.35       2.13   
4  2017-10-16       5.11       5.82      -0.30      -0.38      -0.24   
5  2017-10-17      12.09       9.61       0.20       1.09       0.39   

   datum_von  
0       0.65  
1       0.55  
2       2.18  
3       1.25  
4      -0.10  
5       0.57  

In [25]: df1 = df[df > 0].fillna(0)

In [26]: df1
Out[26]: 
     EG_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24  \
0  2017-10-12      21.69      15.36       0.87       1.42       0.76   
1  2017-10-13      11.85       8.08       1.39       2.86       1.02   
2  2017-10-14       7.83       5.88       1.87       2.04       2.29   
3  2017-10-15      14.64      11.28       2.62       3.35       2.13   
4  2017-10-16       5.11       5.82       0.00       0.00       0.00   
5  2017-10-17      12.09       9.61       0.20       1.09       0.39   

   datum_von  
0       0.65  
1       0.55  
2       2.18  
3       1.25  
4       0.00  
5       0.57

回答3:

Use clip_lower:

df = df.clip_lower(0)
print (df)
            G_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24
datum_von                                                                 
2017-10-12    21.69      15.36       0.87       1.42       0.76       0.65
2017-10-13    11.85       8.08       1.39       2.86       1.02       0.55
2017-10-14     7.83       5.88       1.87       2.04       2.29       2.18
2017-10-15    14.64      11.28       2.62       3.35       2.13       1.25
2017-10-16     5.11       5.82       0.00       0.00       0.00       0.00
2017-10-17    12.09       9.61       0.20       1.09       0.39       0.57

If first column is not index:

df = df.set_index('datum_von').clip_lower(0)
print (df)
            G_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24
datum_von                                                                 
2017-10-12    21.69      15.36       0.87       1.42       0.76       0.65
2017-10-13    11.85       8.08       1.39       2.86       1.02       0.55
2017-10-14     7.83       5.88       1.87       2.04       2.29       2.18
2017-10-15    14.64      11.28       2.62       3.35       2.13       1.25
2017-10-16     5.11       5.82       0.00       0.00       0.00       0.00
2017-10-17    12.09       9.61       0.20       1.09       0.39       0.57

Alternative solution:

df = df.mask(df < 0, 0)
print (df)
            G_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24
datum_von                                                                 
2017-10-12    21.69      15.36       0.87       1.42       0.76       0.65
2017-10-13    11.85       8.08       1.39       2.86       1.02       0.55
2017-10-14     7.83       5.88       1.87       2.04       2.29       2.18
2017-10-15    14.64      11.28       2.62       3.35       2.13       1.25
2017-10-16     5.11       5.82       0.00       0.00       0.00       0.00
2017-10-17    12.09       9.61       0.20       1.09       0.39       0.57

Timings:

df = pd.concat([df]*10000).reset_index(drop=True)

In [240]: %timeit (df.applymap(lambda x: max(0.0, x)))
10 loops, best of 3: 164 ms per loop

In [241]: %timeit (df[df > 0].fillna(0))
100 loops, best of 3: 7.05 ms per loop

In [242]: %timeit (df.clip_lower(0))
1000 loops, best of 3: 1.96 ms per loop

In [243]: %timeit df.mask(df < 0, 0)
100 loops, best of 3: 5.18 ms per loop

来源：https://stackoverflow.com/questions/46970671/interacting-over-a-dateframe-with-functions

标签

python

pandas

loops

loc