Question
Using pandas, I want to do something like this while looping through data frames:
    for body_part, columns in zip(self.body_parts, usecols_gen()):
        body_part_df = self.read_csv(usecols=columns)
        if self.normalize:
            body_part_df[r'x(\.\d)?'] = body_part_df[r'x(\.\d)?'].apply(lambda x: x/x_max)
        print(body_part_df)
        result[body_part] = body_part_df
I use a regular expression because the column names I refer to are mangled: x, x.1, x.2, ..., x.n.
This raises a KeyError, and I don't understand why. Please help. Thanks in advance.
Answer 1:
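You cannot select DataFrame columns by passing a regular expression to the [] indexer. pandas treats the string as a literal column label, and since no column is literally named x(\.\d)?, the lookup raises a KeyError. What you can do instead is iterate over the columns and apply your function to the ones that match, i.e.: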
    import re
    # ...
    for body_part, columns in zip(self.body_parts, usecols_gen()):
        body_part_df = self.read_csv(usecols=columns)
        if self.normalize:
            # Iterating over a DataFrame yields its column labels.
            for column in body_part_df:
                # fullmatch() requires the whole name to match, so "xyz" is skipped;
                # \d+ also accepts multi-digit suffixes such as x.10.
                if re.fullmatch(r"x(\.\d+)?", column):
                    body_part_df[column] = body_part_df[column] / x_max
        print(body_part_df)
        result[body_part] = body_part_df
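As an alternative to the explicit loop, the matching columns can probably be selected in one step with DataFrame.filter(regex=...) and then divided all at once. The following is only a minimal sketch: the sample frame and the value of x_max are made up for illustration and stand in for the CSV data and the normalization constant from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV read in the question.
body_part_df = pd.DataFrame({
    "x": [1.0, 2.0],
    "y": [5.0, 6.0],
    "x.1": [3.0, 4.0],
})
x_max = 4.0  # assumed normalization constant

# filter(regex=...) matches with re.search, so anchor the pattern with ^ and $
# to avoid picking up names that merely contain an "x".
x_columns = body_part_df.filter(regex=r"^x(\.\d+)?$").columns

# Divide all matching columns at once; the other columns are left untouched.
body_part_df[x_columns] = body_part_df[x_columns] / x_max

print(body_part_df)
```

Selecting the labels first and assigning back through them keeps the non-matching columns (here y) in the frame, which a plain filter() call on its own would drop.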
Source: https://stackoverflow.com/questions/52402886/call-specific-columns-with-regular-expression-pandas