I figured out these two methods. Is there a better one?
>>> import pandas as pd
>>> df = pd.DataFrame({'A': [5, 6, 7], 'B': [7, 8, 9]})
df.to_numpy().sum()
df.values
is the underlying NumPy array, so
df.values.sum()
calls NumPy's own sum method, which is faster than going through pandas.
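As a quick sanity check, here is a minimal sketch run on the example frame from the question. It assumes an all-numeric DataFrame; the variable names (arr, total_numpy, and so on) are only illustrative. It also includes df.sum().sum() as a pure-pandas baseline, which is an addition for comparison and not something taken from the posts above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 7], 'B': [7, 8, 9]})

arr = df.to_numpy()             # 2-D NumPy array holding the frame's data
total_numpy = arr.sum()         # NumPy reduction over every element
total_values = df.values.sum()  # same idea via the older .values attribute
total_pandas = df.sum().sum()   # pandas: per-column sums, then sum of those

print(type(arr).__name__, total_numpy, total_values, total_pandas)
# ndarray 42 42 42
```

All three agree here. With mixed dtypes (for example a string column) the array becomes object-typed and the results may differ or fail, so this only applies to numeric frames.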
Adding some numbers to support this:
import timeit
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(int(1e6)).reshape(500000, 2), columns=list("ab"))

def pandas_test():
    # Sum one column with the pandas Series.sum() method
    return df['a'].sum()

def numpy_test():
    # Convert the column to a NumPy array first, then sum with ndarray.sum()
    return df['a'].to_numpy().sum()

timeit.timeit(numpy_test, number=1000)   # 0.5032469799989485
timeit.timeit(pandas_test, number=1000)  # 0.6035906639990571
So on my machine, going through NumPy is about 20% faster (roughly 0.50 s vs. 0.60 s over 1000 runs), and that is just for summing a single Series!
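The timing above covers a single column, while the question asks about the whole DataFrame. The sketch below applies the same timeit approach to the full frame. The candidates dictionary and the choice of number=100 are assumptions made for illustration, and df.sum().sum() is again included only as a pandas baseline. No timings are shown, since they depend on the machine and on the pandas and NumPy versions.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(int(1e6)).reshape(500000, 2), columns=list("ab"))

# Hypothetical set of candidates to compare; each one sums every cell.
candidates = {
    "df.to_numpy().sum()": lambda: df.to_numpy().sum(),
    "df.values.sum()": lambda: df.values.sum(),
    "df.sum().sum()": lambda: df.sum().sum(),
}

# Compute each result once so the approaches can be checked against each other.
results = {name: fn() for name, fn in candidates.items()}

# Time each candidate and print them from fastest to slowest.
timings = {name: timeit.timeit(fn, number=100) for name, fn in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:<22} {seconds:.4f} s")
```

It is worth running this a few times before drawing conclusions, because differences of this size can be swamped by noise on a busy machine.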