Python amateur here... let's say I have a snippet of an example CSV file:
Country, Year, GDP, Population
Country1
This is an approach that will enable you to do one scan of the file to get the top 10 for each country...
It is possible to do this without pandas by using the heapq module. The following is untested, but it should give you a base from which to consult the relevant documentation and adapt it for your purposes:
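import csv
import heapq
from itertools import islice

freqs = {}
with open('yourfile', newline='') as fin:
    csvin = csv.reader(fin)
    # Skip the header row, then prepend GDP per capita (GDP / population) to every
    # row that has both values, so rows compare on that figure first.
    rows_with_gdp = ([float(row[2]) / float(row[3])] + row for row in islice(csvin, 1, None) if row[2] and row[3])
    # The generator is lazy, so it must be consumed while the file is still open.
    for row in rows_with_gdp:
        cnt = freqs.setdefault(row[2], [[]] * 10)  # 2 = year, 10 = num to keep
        heapq.heappushpop(cnt, row)  # push the row, then drop the smallest entry

for year, vals in freqs.items():
    # filter(None, ...) drops any empty placeholder slots that were never replaced
    print(year, [row[1:] for row in sorted(filter(None, vals), reverse=True)])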
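If it helps to see the same bounded-heap idea in a form you can run without a file, here is a minimal sketch. The function name `top_n_per_year`, the `SAMPLE` data, and the use of `skipinitialspace=True` are my own assumptions for illustration and are not taken from your data. It groups by year as the snippet above does, and it assumes the column order Country, Year, GDP, Population from your header.

```python
import csv
import heapq
import io


def top_n_per_year(lines, n=10):
    """Return {year: [(gdp_per_capita, country), ...]}, highest first, at most n each."""
    heaps = {}
    # skipinitialspace=True strips the space after each comma (an assumption
    # based on the header line shown in the question).
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 4 or not row[2] or not row[3]:
            continue  # skip short rows and rows missing GDP or population
        country, year = row[0], row[1]
        entry = (float(row[2]) / float(row[3]), country)
        heap = heaps.setdefault(year, [])
        if len(heap) < n:
            heapq.heappush(heap, entry)
        else:
            # Push the new entry and pop the smallest, so the heap always
            # holds the n largest entries seen so far for this year.
            heapq.heappushpop(heap, entry)
    return {year: sorted(heap, reverse=True) for year, heap in heaps.items()}


# Made-up sample data, for illustration only.
SAMPLE = """Country, Year, GDP, Population
A, 2000, 100, 10
B, 2000, 300, 10
C, 2000, 200, 10
A, 2001, 50, 10
B, 2001, , 10
"""

result = top_n_per_year(io.StringIO(SAMPLE), n=2)
for year, rows in sorted(result.items()):
    print(year, rows)
```

Starting each heap empty and only calling `heappushpop` once it is full avoids the placeholder entries, so nothing needs filtering out at the end. To use it on a real file, pass the open file object instead of the `io.StringIO` wrapper.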