I am looking for a a way to read just the header row of a large number of large CSV files.
Using Pandas, I have this method available, for each csv file:
I've used iglob as an example to search for the .csv files, but one way is to use a set, then adjust as necessary, eg:
import csv
from glob import iglob
unique_headers = set()
for filename in iglob('*.csv'):
with open(filename, 'rb') as fin:
csvin = csv.reader(fin)
unique_headers.update(next(csvin, []))