Efficiently search for first character of a string in a pandas dataframe

Submitted by 淺唱寂寞╮ on 2020-01-03 05:39:07

Question


I have a pandas data frame column and I need to modify any entry of that column that starts with a 2. Right now, I'm using this which works, but is very, very slow:

for i, row in df.iterrows():
    if df['IDnumber'][i].startswith('2') == True:
       '''Do some stuff'''

I feel (read: know) there's a more efficient way to do this without using a for loop, but I can't seem to find it.

Other things I've tried:

if df[df['IDnumber'].str[0]] == '2':
   '''Do some stuff'''

if df[df['IDnumber'].str.startswith('2')] == True:
    '''Do some stuff'''

Which respectively give the errors:

KeyError: "['2' '2' '2' ..., '1' '1' '1'] not in index"
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answer 1:


Do you mean you want to filter rows where the value from a string column starts with some character?

>>> df
   foobar
0    0foo
1    1foo
2    2foo
3    3foo
4    4foo
5    5foo
6    0bar
7    1bar
8    2bar
9    3bar
10   4bar
11   5bar

>>> df.loc[(df.foobar.str.startswith('2'))]
  foobar
2   2foo
8   2bar

If so, select the matching rows first and loop over only those:

>>> beginning_with_2 = df.loc[(df.foobar.str.startswith('2'))]
>>> for i, row in beginning_with_2.iterrows():
...    print(row.foobar)

2foo
2bar
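
Since the question asks to modify the matching entries rather than just list them, the same boolean mask can also be used for assignment through .loc, which avoids the row loop entirely. The following is only a sketch under assumptions: the sample data is made up, and the modification (swapping the leading '2' for a '9') is a hypothetical stand-in for the unspecified "Do some stuff". The na=False argument tells str.startswith to treat missing values as non-matches, so the mask stays purely boolean.

```python
import pandas as pd

# Hypothetical sample data; only the column name IDnumber comes from the question.
df = pd.DataFrame({"IDnumber": ["2001", "1002", "2003", None, "3004"]})

# Boolean mask: True where the ID starts with '2'.
# na=False makes missing values count as "no match" instead of NaN.
mask = df["IDnumber"].str.startswith("2", na=False)

# Assumed modification (stand-in for "Do some stuff"):
# replace the leading '2' with '9' on the matching rows only.
df.loc[mask, "IDnumber"] = "9" + df.loc[mask, "IDnumber"].str[1:]

print(df)
```

Rows where the mask is False, including the missing value, are left untouched. Any other vectorized expression can go on the right-hand side, as long as it yields one value per matching row.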


Source: https://stackoverflow.com/questions/47080315/efficiently-search-for-first-character-of-a-string-in-a-pandas-dataframe
