Question
This is probably elementary, but I still cannot figure it out.
I am reading the documentation on pd.Series and doing simple exercises.
My code is the following:
import pandas as pd
import numpy as np
pd.Series([2, 4, 6]).prod()
Out[7]: 48
a = pd.Series(np.arange(1, 100, 3))
a
Out[9]:
0 1
1 4
2 7
3 10
4 13
5 16
6 19
7 22
8 25
9 28
10 31
11 34
12 37
13 40
14 43
15 46
16 49
17 52
18 55
19 58
20 61
21 64
22 67
23 70
24 73
25 76
26 79
27 82
28 85
29 88
30 91
31 94
32 97
dtype: int32
a.prod()
Out[10]: 0
a = pd.Series(np.arange(1, 100, 2))
a.prod()
Out[15]: -373459037
type(a)
Out[18]: pandas.core.series.Series
My question is: why this (to my eyes) erratic behavior? Why does a.prod() evaluate to 0 the first time and to a negative integer the second time?
Your advice will be appreciated.
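Answer 1:
We can call NumPy's np.prod directly with an explicit dtype to get past the int32 overflow (note that a product this large does not fit in int64 either, so the value shown has still wrapped around):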
np.prod(a.values,dtype=np.int64)
Out[938]: 5196472710489536419
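To see how far the true product is from either fixed-width result, here is a minimal sketch using only Python's arbitrary-precision integers. The helper to_signed is a hypothetical name introduced for illustration; it assumes fixed-width integer multiplication wraps around in two's complement, which is how NumPy integer arithmetic behaves on overflow.

```python
import math


def to_signed(value: int, bits: int) -> int:
    """Wrap an exact integer into a signed two's-complement range of `bits` bits.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    wrapped = value % (1 << bits)
    return wrapped - (1 << bits) if wrapped >= (1 << (bits - 1)) else wrapped


# Exact product of 1, 3, 5, ..., 99 (the second series in the question).
exact = math.prod(range(1, 100, 2))

print(exact > 2**63 - 1)     # True: too large even for int64
print(to_signed(exact, 32))  # what a wrapping int32 product keeps
print(to_signed(exact, 64))  # what a wrapping int64 product keeps
```

Under that assumption, the last two lines should reproduce the int32 and int64 values reported above, which suggests that neither is the real product. If the exact value is needed, Python ints (or a float approximation) are the safer choice.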
Answer 2:
It's an int32 overflow:
In [340]: a = pd.Series(np.arange(1, 100, 3)).astype(np.int64)
# NOTE: ---------------> ^^^^^^^^^^^^^^^^^
In [341]: a.prod()
Out[341]: 8624389262030143488
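As for why the int32 product is exactly 0 rather than some arbitrary number: the terms 1, 4, 7, ..., 97 together contain 32 factors of 2, so the true product is a multiple of 2**32 and its low 32 bits are all zero. The sketch below checks this with plain Python integers; twos_in is a hypothetical helper name, and the modulo operations stand in for the wrap-around of fixed-width integers.

```python
import math


def twos_in(n: int) -> int:
    """Count how many times 2 divides n (hypothetical helper for illustration)."""
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


terms = range(1, 100, 3)                     # 1, 4, 7, ..., 97
total_twos = sum(twos_in(n) for n in terms)  # factors of 2 across all terms
exact_product = math.prod(terms)             # exact value, no overflow

print(total_twos)             # 32
print(exact_product % 2**32)  # 0: the low 32 bits are all zero
print(exact_product % 2**64)  # non-zero, but still not the true product
```

With int64 there are only 32 factors of 2 against 64 bits, so the result is non-zero, but it is still a wrapped value and not the true product.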
Source: https://stackoverflow.com/questions/47335608/the-pd-series-prod-function