Question
Background
I have the following sample df
import pandas as pd
df = pd.DataFrame({'Text' : ['Jon J Mmith is Here from **BLOCK** until **BLOCK**',
'No P_Name Found here',
'Jane Ann Doe is Also here until **BLOCK** ',
'**BLOCK** was **BLOCK** Tom Tcker is Not here but **BLOCK** '],
'P_ID': [1,2,3,4],
'P_Name' : ['Mmith, Jon J', 'Hder, Mary', 'Doe, Jane Ann', 'Tcker, Tom'],
'N_ID' : ['A1', 'A2', 'A3', 'A4']
})
#rearrange columns
df = df[['Text','N_ID', 'P_ID', 'P_Name']]
df
Text N_ID P_ID P_Name
0 Jon J Mmith is Here from **BLOCK** until **BLOCK** A1 1 Mmith, Jon J
1 No P_Name Found here A2 2 Hder, Mary
2 Jane Ann Doe is Also here until **BLOCK** A3 3 Doe, Jane Ann
3 **BLOCK** was **BLOCK** Tom Tcker is Not here but A4 4 Tcker, Tom
Goal
1) In the Text column, replace the value (e.g. Jon J Mmith) that corresponds to the value found in P_Name with **BLOCK**
Desired Output
Text N_ID P_ID P_Name
0 **BLOCK** is Here from **BLOCK** until **BLOCK** A1 1 Mmith, Jon J
1 No P_Name Found here A2 2 Hder, Mary
2 **BLOCK** is Also here until **BLOCK** A3 3 Doe, Jane Ann
3 **BLOCK** was **BLOCK** **BLOCK** is Not here but A4 4 Tcker, Tom
The desired output can go in the same Text column, or a new column (new_col) can be created
Question
How do I achieve my desired output?
Answer 1:
One way:
>>> df['Text'].replace(df['P_Name'].str.split(', *').apply(lambda l: ' '.join(l[::-1])).tolist(), '**BLOCK**', regex=True)
0 **BLOCK** is Here from **BLOCK** until **BLOCK**
1 No P_Name Found here
2 **BLOCK** is Also here until **BLOCK**
3 **BLOCK** was **BLOCK** **BLOCK** is Not here but **...
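You can use inplace=True to do this in place, or assign the result back with df['Text'] = (or df['new_col'] = for a new column) followed by the above. What this does is split each P_Name value on the comma, join the pieces in reverse order with a space (so 'Mmith, Jon J' becomes 'Jon J Mmith'), and replace every occurrence of those names in your Text column with **BLOCK**.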
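The steps described above can be sketched as a self-contained script. This is only an illustration under a few assumptions: every P_Name has the 'Last, First' form, the helper names to_first_last and block_own_name are hypothetical (they are not part of the answer), and the names are passed to replace as a plain list of escaped patterns. The second variant is not from the answer; it replaces only the name that belongs to the same row, which may be closer to what is wanted if one person's name can appear in another person's text.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'Text': ['Jon J Mmith is Here from **BLOCK** until **BLOCK**',
             'No P_Name Found here',
             'Jane Ann Doe is Also here until **BLOCK** ',
             '**BLOCK** was **BLOCK** Tom Tcker is Not here but **BLOCK** '],
    'P_Name': ['Mmith, Jon J', 'Hder, Mary', 'Doe, Jane Ann', 'Tcker, Tom'],
})


def to_first_last(p_name):
    """Turn 'Last, First' into 'First Last' (hypothetical helper).

    Assumes the value contains a comma; a value without one would raise.
    """
    last, first = re.split(r',\s*', p_name, maxsplit=1)
    return f'{first} {last}'


def block_own_name(text, p_name):
    """Replace only this row's own name with **BLOCK** (literal match)."""
    return text.replace(to_first_last(p_name), '**BLOCK**')


# Step 1: reorder the names, e.g. 'Mmith, Jon J' -> 'Jon J Mmith'.
names = df['P_Name'].map(to_first_last)

# Step 2: replace every name wherever it occurs in any row.
# re.escape keeps characters such as '.' in a name from acting as regex syntax.
patterns = [re.escape(name) for name in names]
df['new_col'] = df['Text'].replace(patterns, '**BLOCK**', regex=True)

# Variant (an assumption, not the answer's method): row-wise replacement,
# so each row is only checked against its own P_Name.
df['new_col_rowwise'] = [
    block_own_name(text, p_name)
    for text, p_name in zip(df['Text'], df['P_Name'])
]

print(df[['new_col', 'new_col_rowwise']])
```

On this sample data both columns come out the same; they would differ only if a row's text mentioned a name listed in a different row.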
Source: https://stackoverflow.com/questions/57029538/alter-text-in-pandas-column-based-on-names