I have a CSV file with a few hundred rows and 26 columns, but the last few columns only have a value in a few rows, and those rows are towards the middle or end of the file. Wh
Suppose you have a file like this:
a,b,c
1,2,3
1,2,3,4
You could use csv.reader to clean the file first:
import csv
with open('file.csv', newline='') as f:
    lines = list(csv.reader(f))
header, values = lines[0], lines[1:]
# zip(*values) transposes the rows into columns
data = {h: v for h, v in zip(header, zip(*values))}
and get (the extra 4 in the last row is dropped, because zip stops at the shortest row):
{'a' : ('1','1'), 'b': ('2','2'), 'c': ('3', '3')}
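If the stray values need to be kept rather than dropped, one possible variation (not part of the approach above) is to pad the short rows with itertools.zip_longest. This is only a sketch: the io.StringIO object stands in for the real file, and the 'extra_N' column names are a hypothetical naming scheme for columns that have no header.

```python
import csv
import io
from itertools import zip_longest

# Stand-in for open('file.csv'); same contents as the sample file above.
sample = io.StringIO("a,b,c\n1,2,3\n1,2,3,4\n")
lines = list(csv.reader(sample))
header, values = lines[0], lines[1:]

# Pad short rows with None so every column has one entry per row.
columns = list(zip_longest(*values, fillvalue=None))

# Hypothetical names for columns beyond the header: 'extra_3', 'extra_4', ...
names = header + ['extra_%d' % i for i in range(len(header), len(columns))]
padded = dict(zip(names, columns))
```

With the sample file, padded gains a fourth key whose column holds None for the short row and '4' for the long one.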
If you don't have a header (so values is simply lines), you could use the column indices as keys:
data = {h: v for h, v in zip(map(str, range(number_of_columns)), zip(*values))}
and then you can convert the dictionary to a DataFrame with:
import pandas as pd
df = pd.DataFrame.from_dict(data)
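Putting the steps together, a minimal end-to-end sketch might look like the following. It assumes Python 3 and an installed pandas, and uses an io.StringIO object as a stand-in for the real file so that it runs on its own.

```python
import csv
import io

import pandas as pd

# Stand-in for open('file.csv'); same contents as the sample file above.
sample = io.StringIO("a,b,c\n1,2,3\n1,2,3,4\n")

lines = list(csv.reader(sample))
header, values = lines[0], lines[1:]

# zip(*values) stops at the shortest row, so the stray fourth field is dropped.
data = {h: v for h, v in zip(header, zip(*values))}

# Each dictionary key becomes a column; each tuple supplies that column's rows.
df = pd.DataFrame.from_dict(data)
```

Note that csv.reader yields strings, so the resulting columns hold strings until they are converted, for example with df.astype(int).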