Drop rows in pandas if they contain “???”

I'm trying to drop rows in pandas that contain "???". It works for every other value except "???", and I do not know what the problem is.

This is my code (I have tried both types):

df = df[~df["text"].str.contains("?????", na=False)]
df = df[~df["text"].str.contains("?????")]

The error that I'm getting:

re.error: nothing to repeat at position 0

It works for every other value except "???". I have googled it and looked all over this website, but I couldn't find any solutions.

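By default, str.contains treats its pattern as a regular expression, hence the re.error: ? is a quantifier, and at the start of a pattern it has nothing to repeat. You can either escape each ? inside the expression (using a raw string so the backslashes reach the regex engine unchanged) like this: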

df = df[~df["text"].str.contains(r"\?\?\?\?\?")]

Or set regex=False, as Vorsprung suggested:

df = df[~df["text"].str.contains("?????", regex=False)]
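A third option, not mentioned in the original answers, is to let re.escape do the escaping. The sketch below reproduces the error outside pandas and then applies the escaped pattern. The sample DataFrame and the names error_message, pattern and cleaned are made up for illustration.

```python
import re

import pandas as pd

# Hypothetical sample data; the real "text" column will differ.
df = pd.DataFrame({"text": ["hello", "what?????", "?????", "fine"]})

# A leading "?" is a quantifier with nothing before it to repeat,
# so compiling the pattern fails before pandas looks at any row.
error_message = None
try:
    re.compile("?????")
except re.error as exc:
    error_message = str(exc)
print(error_message)

# re.escape backslash-escapes regex metacharacters, so the result
# matches the literal text "?????".
pattern = re.escape("?????")
cleaned = df[~df["text"].str.contains(pattern, na=False)]
print(cleaned)
```

This is mainly useful when the search string comes from user input or has to be combined with other regex syntax; for a plain substring test, regex=False is simpler.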

Let's convert this into running code:

import pandas as pd

data = {'A': ['abc', 'cxx???xx', '???'], 'B': ['add', 'ddb', 'c']}
df = pd.DataFrame.from_dict(data)
df

Output:

          A    B
0       abc  add
1  cxx???xx  ddb
2       ???    c

Now select the rows that contain the literal string '???':

df[df['A'].str.contains('???',regex=False)]

Output:

          A    B
1  cxx???xx  ddb
2       ???    c

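You need to tell contains() that your search string is not a regex. Note that the expression above selects the matching rows; to drop them instead, negate the mask with ~, as in the question's code.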
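Putting the pieces together for the original goal of dropping rows, here is a minimal sketch. The extra row with a missing value is an assumption added to show what na=False does, and the names mask and kept are illustrative.

```python
import pandas as pd

# Same data as above, plus one hypothetical row with a missing value in 'A'.
data = {'A': ['abc', 'cxx???xx', '???', None], 'B': ['add', 'ddb', 'c', 'e']}
df = pd.DataFrame(data)

# regex=False makes '???' a plain substring; na=False counts missing
# values as "no match", so the mask is purely boolean.
mask = df['A'].str.contains('???', regex=False, na=False)

# ~ inverts the mask: keep only rows that do NOT contain '???'.
kept = df[~mask]
print(kept)
```

Without na=False, the mask would hold a missing value for the last row, and negating and indexing with it can fail or behave unexpectedly, which is presumably why the question's first attempt included it.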

Source: https://stackoverflow.com/questions/61034172/drop-rows-in-pandas-if-they-contains
