Pandas DataFrame: use column value to slice string in another column

Posted by 笑着哭i on 2019-12-19 04:12:43

Question


I have a pandas DataFrame as follows:

     col1  col2  col3
0    1     3     ABCDEFG
1    1     5     HIJKLMNO
2    1     2     PQRSTUV

I want to add another column containing the substring of col3 that runs from the position given in col1 to the position given in col2. Something like col3[(col1-1):(col2-1)], which should result in:

     col1  col2  col3       new_col
0    1     3     ABCDEFG    ABC
1    1     5     HIJKLMNO   HIJKL
2    1     2     PQRSTUV    PQ

I tried with the following:

my_df['new_col'] = my_df.col3.str.slice(my_df['col1']-1, my_df['col2']-1)

and

my_df['new_col'] = my_df['col3'].str[(my_df['col1']-1):(my_df['col2']-1)]

Both of them result in a column of NaN, whereas if I insert two numerical values (e.g. my_df['col3'].str[1:3]) it works fine. I checked and the dtypes are correct (int64, int64 and object). Also, outside this context (e.g. using a for loop) I can get the job done, but I'd prefer a one-liner that exploits the DataFrame. What am I doing wrong?


Answer 1:


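Use apply along the rows, because each row has to be processed separately (str.slice and the .str[] accessor take scalar bounds, not a Series). The slice stops at col2 rather than col2-1 because a Python slice excludes its end index: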

# axis=1 passes each row to the lambda as a Series
my_df['new_col'] = my_df.apply(lambda x: x['col3'][x['col1']-1:x['col2']], axis=1)
print(my_df)
   col1  col2      col3 new_col
0     1     3   ABCDEFG     ABC
1     1     5  HIJKLMNO   HIJKL
2     1     2   PQRSTUV      PQ
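
As an alternative that is not part of the original answer, the same per-row slicing can be sketched with a list comprehension over the three columns. This is a minimal, self-contained example that assumes the sample data from the question, treats col1 as 1-based and col2 as inclusive, and assumes col3 holds only strings (no NaN).

```python
import pandas as pd

# Sample frame copied from the question.
my_df = pd.DataFrame({
    "col1": [1, 1, 1],
    "col2": [3, 5, 2],
    "col3": ["ABCDEFG", "HIJKLMNO", "PQRSTUV"],
})

# Walk the three columns in parallel. col1 is 1-based and col2 is inclusive,
# so the equivalent Python slice is [start - 1:stop].
my_df["new_col"] = [
    text[start - 1:stop]
    for text, start, stop in zip(my_df["col3"], my_df["col1"], my_df["col2"])
]
print(my_df)
```

On larger frames a comprehension like this often avoids some of the per-row overhead of apply, but that depends on the data, so it is worth timing both.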


Source: https://stackoverflow.com/questions/47395993/pandas-dataframe-use-column-value-to-slice-string-in-another-column
