Question
I want to create multiple DataFrames whose names are the same as the values in one of the columns. I would like this code to work like this:
import pandas as pd
data = pd.read_csv('athlete_events.csv')
Sports = data.Sport.unique()
for S in Sports:
    name = str(S)
    name = data.loc[data['Sport'] == S]
Answer 1:
You can do this by modifying globals(), but that is not really advisable.
for S in Sports:
    globals()[str(S)] = data.loc[data['Sport'] == S]
Below is a self-contained example:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'sport': ['football', 'football', 'tennis'],
   ...:                    'value': [1, 2, 3]})
In [3]: df
Out[3]:
      sport  value
0  football      1
1  football      2
2    tennis      3
In [4]: for name in df.sport.unique():
   ...:     globals()[name] = df.loc[df.sport == name]
   ...:
In [5]: football
Out[5]:
      sport  value
0  football      1
1  football      2
While this is a direct answer to your question, I would recommend sacul's answer instead. Dictionaries are meant for exactly this (storing keys and values), and variable names inserted via globals() are usually not a good idea to begin with.
Imagine someone else, or yourself in the future, reading your code: all of a sudden you are using football like a pd.DataFrame that you never explicitly defined. How is the reader supposed to know what is going on?
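One further practical problem with the globals() approach can be sketched with made-up data. A column value that is not a valid Python identifier (here a hypothetical sport name containing a space) cannot be read back as a plain variable, while a dictionary key has no such restriction. The DataFrame contents and the name frames below are assumptions for illustration only.

```python
import pandas as pd

# Made-up data for illustration; "table tennis" contains a space.
df = pd.DataFrame({'sport': ['football', 'table tennis', 'table tennis'],
                   'value': [1, 2, 3]})

# A dictionary accepts any hashable key, including strings with spaces.
frames = {name: df.loc[df['sport'] == name] for name in df['sport'].unique()}

# str.isidentifier() reports whether a string could be used as a variable name.
print('football'.isidentifier())       # True
print('table tennis'.isidentifier())   # False

# The dictionary lookup works regardless of how the key is spelled.
print(len(frames['table tennis']))     # 2
```

With globals(), the 'table tennis' frame would only be reachable through globals()['table tennis'], which is a dictionary lookup anyway.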
Answer 2:
Use a dictionary to organize your DataFrames, and groupby to split them. You can iterate through the groupby object with a dict comprehension.
Example:
>>> data
      Sport  random_data
0    soccer            0
1    soccer            3
2  football            1
3  football            1
4    soccer            4
>>> frames = {i: dat for i, dat in data.groupby('Sport')}
You can then access your frames as you would any other dictionary value:
>>> frames['soccer']
    Sport  random_data
0  soccer            0
1  soccer            3
4  soccer            4
>>> frames['football']
      Sport  random_data
2  football            1
3  football            1
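For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the example data shown above by hand (the real athlete_events.csv is not assumed to be available) and shows that each group keeps the row labels of the original DataFrame.

```python
import pandas as pd

# Rebuild the small example table from the answer above.
data = pd.DataFrame({'Sport': ['soccer', 'soccer', 'football', 'football', 'soccer'],
                     'random_data': [0, 3, 1, 1, 4]})

# Iterating over a groupby object yields (key, sub-DataFrame) pairs,
# so a dict comprehension maps each sport name to its rows.
frames = {sport: group for sport, group in data.groupby('Sport')}

print(sorted(frames))                             # ['football', 'soccer']
print(frames['soccer']['random_data'].tolist())   # [0, 3, 4]
print(frames['football'].index.tolist())          # [2, 3]
```

If the original row labels are not needed, calling reset_index(drop=True) on each group gives every frame a fresh 0-based index.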
Source: https://stackoverflow.com/questions/51620014/pandas-set-names-of-dataframes-in-loop