There are a few articles showing that MATLAB prefers column operations to row operations, and that, depending on how you lay out your data, the performance can vary significantly.
In [38]: data = numpy.random.rand(10000,10000)
In [39]: %timeit data.sum(axis=0)
10 loops, best of 3: 86.1 ms per loop
In [40]: %timeit data.sum(axis=1)
10 loops, best of 3: 101 ms per loop
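
For anyone who wants to reproduce this kind of comparison outside IPython, here is a rough sketch. NumPy arrays are row-major (C order) by default, whereas MATLAB stores data column-major, so the sketch times both reductions on both layouts. The helper names `layout_info` and `time_axis_sums`, the array size, and the repeat count are my own choices for illustration, not anything from the session above, and the timings will depend on your machine and NumPy version.

```python
import timeit

import numpy as np


def layout_info(order="C", n=4):
    """Return (C-contiguous?, F-contiguous?, strides) for an n x n float64 array."""
    a = np.zeros((n, n), dtype=np.float64, order=order)
    return bool(a.flags["C_CONTIGUOUS"]), bool(a.flags["F_CONTIGUOUS"]), a.strides


def time_axis_sums(n=2000, order="C", repeats=3):
    """Return best-of-`repeats` seconds for sum(axis=0) and sum(axis=1)."""
    rng = np.random.default_rng(0)
    # Copy into the requested memory layout: "C" is row-major, "F" is column-major.
    data = np.asarray(rng.random((n, n)), order=order)
    t_axis0 = min(timeit.repeat(lambda: data.sum(axis=0), number=1, repeat=repeats))
    t_axis1 = min(timeit.repeat(lambda: data.sum(axis=1), number=1, repeat=repeats))
    return t_axis0, t_axis1


if __name__ == "__main__":
    for order in ("C", "F"):
        t0, t1 = time_axis_sums(order=order)
        print(f"order={order}: sum(axis=0) {t0 * 1e3:.2f} ms, sum(axis=1) {t1 * 1e3:.2f} ms")
```

The strides from `layout_info` show which axis is contiguous in memory, which is the thing to keep in mind when comparing the two timings.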