Question
I have a pandas DataFrame as follows:
   col1  col2      col3
0     1     3   ABCDEFG
1     1     5  HIJKLMNO
2     1     2   PQRSTUV
I want to add another column which should be a substring of col3, from the position indicated in col1 to the position indicated in col2. Something like col3[(col1-1):(col2-1)], which should result in:
   col1  col2      col3 new_col
0     1     3   ABCDEFG     ABC
1     1     5  HIJKLMNO    HIJK
2     1     2   PQRSTUV      PQ
I tried with the following:
my_df['new_col'] = my_df.col3.str.slice(my_df['col1']-1, my_df['col2']-1)
and
my_df['new_col'] = my_df['col3'].str[(my_df['col1']-1):(my_df['col2']-1)]
Both of them result in a column of NaN, while if I insert two numerical values (i.e. my_df['col3'].str[1:3]) it works fine. I checked and the types are correct (int64, int64 and object). Also, outside such a context (e.g. using a for loop) I can get the job done, but I'd prefer a one-liner that exploits the DataFrame. What am I doing wrong?
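Answer 1:
Use apply with axis=1, because each row has to be processed separately. The slice ends at col2 rather than col2-1 because Python slices exclude the end position: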
my_df['new_col'] = my_df.apply(lambda x: x['col3'][x['col1']-1:x['col2']], axis=1)
print(my_df)
   col1  col2      col3 new_col
0     1     3   ABCDEFG     ABC
1     1     5  HIJKLMNO   HIJKL
2     1     2   PQRSTUV      PQ
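For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the example frame from the question and applies the row-wise slice. It assumes col1 and col2 hold 1-based, inclusive positions, which is what the output above implies. The helper name slice_by_columns is hypothetical, and the zip-based list comprehension is an alternative that is not part of the original answer.

```python
import pandas as pd

# Example frame rebuilt from the question.
my_df = pd.DataFrame({
    "col1": [1, 1, 1],
    "col2": [3, 5, 2],
    "col3": ["ABCDEFG", "HIJKLMNO", "PQRSTUV"],
})


def slice_by_columns(df, text_col, start_col, end_col):
    """Hypothetical helper: slice each string with 1-based, inclusive positions."""
    return [
        text[start - 1:end]
        for text, start, end in zip(df[text_col], df[start_col], df[end_col])
    ]


# Row-wise apply, as in the answer above.
via_apply = my_df.apply(
    lambda row: row["col3"][row["col1"] - 1:row["col2"]], axis=1
)

# Alternative sketch (an assumption, not from the answer): iterate over the
# three columns in parallel instead of building a Series for every row.
my_df["new_col"] = slice_by_columns(my_df, "col3", "col1", "col2")

print(my_df)
```

Both variants should give the same new_col values. The list comprehension avoids constructing a row object per record, which may matter on larger frames, though that depends on the data and is worth measuring rather than assuming.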
Source: https://stackoverflow.com/questions/47395993/pandas-dataframe-use-column-value-to-slice-string-in-another-column