As @joris pointed out, iterrows is much slower than itertuples: itertuples is approximately 100 times faster. I tested the speed of both methods on a DataFrame with 5027505 records. The result was 1200 it/s for iterrows and 120000 it/s for itertuples.
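A comparison of this kind can be reproduced with a rough sketch like the one below. It is not the script used for the numbers above: the frame is a small hypothetical one (10,000 rows instead of 5027505), the helper names are made up for illustration, and the measured rates will vary by machine and pandas version.

```python
import time

import numpy as np
import pandas as pd


def sum_with_iterrows(df):
    """Sum col1 by looping over iterrows (each row is a Series)."""
    total = 0
    for _, row in df.iterrows():
        total += row["col1"]
    return total


def sum_with_itertuples(df):
    """Sum col1 by looping over itertuples (each row is a namedtuple)."""
    total = 0
    for row in df.itertuples():
        total += row.col1
    return total


def rows_per_second(func, df):
    """Return (result, rows processed per second) for a single call of func."""
    start = time.perf_counter()
    result = func(df)
    elapsed = time.perf_counter() - start
    return result, len(df) / elapsed


# Hypothetical small frame; the test described above used 5027505 records.
bench_df = pd.DataFrame(
    {"col1": np.arange(10_000), "col2": np.linspace(0.0, 1.0, 10_000)}
)

iterrows_total, iterrows_rate = rows_per_second(sum_with_iterrows, bench_df)
itertuples_total, itertuples_rate = rows_per_second(sum_with_itertuples, bench_df)

print(f"iterrows:   {iterrows_rate:,.0f} it/s")
print(f"itertuples: {itertuples_rate:,.0f} it/s")
```

Both loops compute the same sum, so the only thing being compared is the per-row overhead of each iterator.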
If you use itertuples, note that every element in the for loop is a namedtuple, so you get the value of each column by attribute access, as in the following example:
>>> import pandas as pd
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
...                   index=['a', 'b'])
>>> df
   col1  col2
a     1   0.1
b     2   0.2
>>> for row in df.itertuples():
...     print(row.col1, row.col2)
...
1 0.1
2 0.2
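Beyond attribute access, the namedtuples can be read in a few other ways. The sketch below reuses the same small DataFrame and shows the index field, positional access, lookup by a column name held in a variable (the `column` variable is a hypothetical stand-in for a name chosen at runtime), and the `index` and `name` parameters of itertuples.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [0.1, 0.2]}, index=["a", "b"])

# By default the first field of each namedtuple is the index, exposed as `Index`.
labels = [row.Index for row in df.itertuples()]

# Fields can also be read by position, or by name via getattr when the
# column name is only known at runtime.
column = "col2"  # hypothetical: a name chosen at runtime
first = next(df.itertuples())
by_position = first[1]  # col1, because position 0 holds the index
by_name = getattr(first, column)

# index=False drops the index field; name=None yields plain tuples.
plain = list(df.itertuples(index=False, name=None))

print(labels)       # ['a', 'b']
print(by_position)  # 1
print(by_name)      # 0.1
print(plain)        # [(1, 0.1), (2, 0.2)]
```

Plain tuples avoid building a namedtuple class, which may help a little in tight loops, at the cost of losing access by column name.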