I have data in different columns but I don't know how to extract it to save it in another variable.
index a b c
1 2 3 4
2 3 4 5
Assuming your column names (df.columns) are ['index','a','b','c'], then the data you want is in the 3rd & 4th columns. If you don't know their names when your script runs, you can do this:
newdf = df[df.columns[2:4]] # Remember, Python is 0-offset! The "3rd" entry is at slot 2.
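As EMS points out in his answer, df.ix slices columns a bit more concisely (note that .ix was removed in pandas 1.0; df.iloc[:, 2:4] is the positional equivalent in current versions), but the .columns slicing interface might be more natural because it uses the vanilla 1-D Python list indexing/slicing syntax.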
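Here is a minimal, self-contained sketch of the slicing above. It assumes pandas is installed and rebuilds the question's sample data by hand; the .iloc line is an addition for comparison, not part of the original answer.

```python
import pandas as pd

# Sample data from the question; 'index' here is an ordinary column.
df = pd.DataFrame({
    "index": [1, 2],
    "a": [2, 3],
    "b": [3, 4],
    "c": [4, 5],
})

# Slice the column labels like a plain list, then select those columns.
newdf = df[df.columns[2:4]]

# Positional equivalent: all rows, columns at positions 2 and 3.
newdf_iloc = df.iloc[:, 2:4]

print(list(newdf.columns))       # ['b', 'c']
print(newdf.equals(newdf_iloc))  # True
```

Both forms select by position, so neither needs to know the column names ahead of time.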
WARN: 'index' is a bad name for a DataFrame column. That same label is also used for the real df.index attribute, an Index array. So your column is returned by df['index'] and the real DataFrame index is returned by df.index. An Index is an immutable, array-like sequence optimized for lookup of its elements' values. For df.index, it's for looking up rows by their label. The df.columns attribute is also a pd.Index array, for looking up columns by their labels.
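To make the distinction concrete, here is a short sketch on the same sample data (again assuming pandas is installed). The replacement column name 'id' is purely hypothetical; any name that does not clash with a DataFrame attribute would do.

```python
import pandas as pd

df = pd.DataFrame({"index": [1, 2], "a": [2, 3], "b": [3, 4], "c": [4, 5]})

col = df["index"]       # the column that happens to be named 'index'
row_labels = df.index   # the real row index of the DataFrame

print(type(col).__name__)                 # Series
print(list(row_labels))                   # [0, 1]
print(isinstance(df.columns, pd.Index))   # True

# Renaming the column sidesteps the confusion ('id' is a hypothetical name).
renamed = df.rename(columns={"index": "id"})
print(list(renamed.columns))              # ['id', 'a', 'b', 'c']
```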