If you're using pandas.read_csv, you can sample directly while loading the data by passing the row numbers you don't want to the skiprows parameter, so the full file never has to be held in memory as a DataFrame. Here is a short article I've written on this: https://nikolaygrozev.wordpress.com/2015/06/16/fast-and-simple-sampling-in-pandas-when-loading-data-from-files/
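For illustration, here is a minimal sketch of how this could look. It is not taken verbatim from the article; the helper name sample_csv and its parameters (n_rows, fraction, seed) are made up for this example, and it assumes the file has a single header line and that you already know the number of data rows (for a real file you would count the lines first).

```python
import io
import random

import pandas as pd


def sample_csv(source, n_rows, fraction, seed=None):
    """Load a random sample of a CSV's data rows, always keeping the header.

    `sample_csv`, `n_rows`, `fraction` and `seed` are hypothetical names
    used only for this sketch.
    """
    rng = random.Random(seed)
    n_keep = int(n_rows * fraction)
    # Line 0 is the header, so the data rows are lines 1..n_rows.
    keep = set(rng.sample(range(1, n_rows + 1), n_keep))
    # skiprows takes 0-indexed line numbers to skip while parsing.
    skip = [i for i in range(1, n_rows + 1) if i not in keep]
    return pd.read_csv(source, skiprows=skip)


# Small in-memory demo: 100 data rows, keep 10% of them.
csv_text = "id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(100))
sample = sample_csv(io.StringIO(csv_text), n_rows=100, fraction=0.1, seed=0)
print(len(sample))
```

In place of the StringIO you would pass a file path. If you don't know the row count up front, skiprows also accepts a callable that receives the line index and returns True for lines to skip, which lets you keep each row with some probability instead of drawing an exact sample size.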