I'm working my way through Pandas for Data Analysis and learning a ton. However, one thing keeps coming up. The book typically refers to columns of a dataframe as df[
For setting values, you need to use df['column'] = series. Once this is done, however, you can refer to that column in the future with df.column, assuming it's a valid Python name that doesn't clash with an existing DataFrame attribute or method. (So df.column works, but df.6column isn't valid Python, and that column would still have to be accessed with df['6column'].)
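Here is a small sketch of that behaviour. The column names ("a", "total", "6column") and the values are made up for illustration and are not from the book:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

# Create new columns with bracket assignment.
df["total"] = pd.Series([10, 20, 30])
df["6column"] = pd.Series([7, 8, 9])

# "total" is a valid Python identifier, so attribute access reads the column.
same_values = df.total.equals(df["total"])

# df.6column would be a SyntaxError, so brackets are the only way to read it.
six = df["6column"].tolist()

print(same_values)
print(six)
```

Bracket access works for every column name, so it is the safer default when names come from data you don't control.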
I think the subtle difference here is that when you set something with df['column'] = ser, pandas goes ahead and adds it to the columns and does some other bookkeeping (I believe by overriding the functionality in __setitem__). If you do df.column = ser for a name that isn't already a column, it's just like adding a new field to any existing object, which goes through __setattr__, and pandas does not seem to turn that into a column.
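To see the difference, here is a minimal sketch. The names "b" and "c" are hypothetical, and the warning filter is only there because, as far as I know, recent pandas versions emit a UserWarning on this kind of attribute assignment:

```python
import warnings

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
ser = pd.Series([4, 5, 6])

# Bracket assignment goes through __setitem__, so "b" becomes a real column.
df["b"] = ser

with warnings.catch_warnings():
    # Silence the warning some pandas versions raise for the next line.
    warnings.simplefilter("ignore")
    # Attribute assignment goes through __setattr__; "c" is stored as a
    # plain attribute on the object, not as a column.
    df.c = ser

columns = list(df.columns)
has_attr = hasattr(df, "c")

print(columns)
print(has_attr)
```

The attribute is still reachable as df.c, but it never shows up in df.columns, which is why the new "column" appears to be missing later on.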