Question
I have a DataFrame in which each row holds one line of text scraped from the web, containing sports selection information (all in a single column). I am trying to transpose the data so that:
print(df):
Col A
Random text sentence
Random text sentence
Random text sentence
Race 1 - Handicap
14 - NAME
3 - NAME
5 - NAME
6 - NAME
Race Overview: lorem ipsum etc etc
Race 2 - Sprint
12 - NAME
10 - NAME
8 - NAME
11 - NAME
Race Overview: Second lorem ipsum etc etc
Becomes:
Race Name | Selection No | Selection | Race Overview
Race 1 - Handicap | 1 | 14 - Name | Race Overview: lorem ipsum etc etc
Race 1 - Handicap | 2 | 3 - Name | Race Overview: lorem ipsum etc etc
Race 1 - Handicap | 3 | 5 - Name | Race Overview: lorem ipsum etc etc
Race 1 - Handicap | 4 | 6 - Name | Race Overview: lorem ipsum etc etc
Race 2 - Sprint | 1 | 12 - Name | Race Overview: Second lorem ipsum etc etc
Race 2 - Sprint | 2 | 10 - Name | Race Overview: Second lorem ipsum etc etc
Race 2 - Sprint | 3 | 8 - Name | Race Overview: Second lorem ipsum etc etc
Race 2 - Sprint | 4 | 11 - Name | Race Overview: Second lorem ipsum etc etc
I'm thinking it needs a loop that searches for the keyword (a row beginning with "Race") and then transposes the 5 rows underneath. The text is always listed in the subsequent 5 rows. Any help or pointers to resources would be great! Thanks
Answer 1:
If your data repeats every 6 rows in a fixed pattern (race name, four selections, race overview), with no extra rows before the first race, you can do something like:
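import pandas as pd

# Assumes df['Col A'] holds only complete 6-row race blocks.
(
    pd.DataFrame(data=df['Col A'].values.reshape(-1, 6))  # one row per race
    .set_index([0, 5])  # column 0 = race name, column 5 = race overview
    .stack()  # remaining columns 1-4 become one row per selection
    .rename_axis(index=['Race Name', 'Race Overview', 'Selection No'])
    .to_frame('Selection')
    .reset_index()
)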
This will give you the results below:
           Race Name                              Race Overview  Selection No  Selection
0  Race 1 - Handicap         Race Overview: lorem ipsum etc etc             1  14 - NAME
1  Race 1 - Handicap         Race Overview: lorem ipsum etc etc             2   3 - NAME
2  Race 1 - Handicap         Race Overview: lorem ipsum etc etc             3   5 - NAME
3  Race 1 - Handicap         Race Overview: lorem ipsum etc etc             4   6 - NAME
4    Race 2 - Sprint  Race Overview: Second lorem ipsum etc etc             1  12 - NAME
5    Race 2 - Sprint  Race Overview: Second lorem ipsum etc etc             2  10 - NAME
6    Race 2 - Sprint  Race Overview: Second lorem ipsum etc etc             3   8 - NAME
7    Race 2 - Sprint  Race Overview: Second lorem ipsum etc etc             4  11 - NAME
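The reshape approach relies on the column containing nothing but complete 6-row blocks, so the three "Random text sentence" rows in the question's sample would have to be dropped first. As an alternative, here is a minimal sketch that locates the race headers by pattern instead of by position. It is not part of the original answer, and it assumes that every header looks like "Race <number> ..." and that every overview row starts with "Race Overview"; the names `is_header`, `is_overview`, `block`, `keep` and `tidy` are made up for illustration.

```python
import pandas as pd

# Sample data mirroring the question, including the unrelated leading rows.
df = pd.DataFrame({"Col A": [
    "Random text sentence",
    "Random text sentence",
    "Random text sentence",
    "Race 1 - Handicap",
    "14 - NAME", "3 - NAME", "5 - NAME", "6 - NAME",
    "Race Overview: lorem ipsum etc etc",
    "Race 2 - Sprint",
    "12 - NAME", "10 - NAME", "8 - NAME", "11 - NAME",
    "Race Overview: Second lorem ipsum etc etc",
]})

s = df["Col A"]

# Assumption: headers look like "Race <number> ...", which keeps
# "Race Overview: ..." rows from being mistaken for headers.
is_header = s.str.match(r"Race \d+")
is_overview = s.str.startswith("Race Overview")

# Each header starts a new block; rows before the first header land in block 0.
block = is_header.cumsum()

# Selection rows: inside a race block, and neither header nor overview.
keep = (block > 0) & ~is_header & ~is_overview

tidy = pd.DataFrame({
    # Broadcast each block's header and overview text to all rows of the block.
    "Race Name": s.where(is_header).groupby(block).transform("first"),
    # Running count of selection rows within each block (1, 2, 3, ...).
    "Selection No": keep.astype(int).groupby(block).cumsum(),
    "Selection": s,
    "Race Overview": s.where(is_overview).groupby(block).transform("first"),
})[keep].reset_index(drop=True)

print(tidy)
```

Because the blocks are derived from the header rows rather than from a fixed stride, this version should also cope with races that have more or fewer than four selections, as long as the two pattern assumptions hold.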
Source: https://stackoverflow.com/questions/61170184/search-and-return-rows-underneath-in-python-dataframe-and-transpose