I have a dataframe and I am trying to select the rows where one of the columns contains some string. The df looks like
member_id,event_path,event_time,event_duration
30595,\"2016-
At least one of the regex patterns in urls must contain a capturing group, that is, a parenthesized subpattern such as (.*).
str.contains only returns True or False for each row in df['event_time'] --
it does not make use of the capturing group. Thus, the UserWarning is alerting you
that the regex uses a capturing group but the match is not used.
If you wish to remove the UserWarning, you could find and remove the capturing group from the regex pattern(s). No capturing groups appear in the regex patterns you posted, but there must be at least one in your actual file. Look for parentheses outside of character classes.
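One way to track down the offending pattern(s) is to compile each one and check how many capturing groups it reports. The following is only a minimal sketch: the helper name patterns_with_groups and the sample list are made up for illustration, and your real patterns would come from the urls DataFrame.

```python
import re


def patterns_with_groups(patterns):
    """Return the patterns that contain at least one capturing group."""
    # A compiled pattern exposes the number of capturing groups as .groups.
    return [p for p in patterns if re.compile(p).groups > 0]


# Hypothetical sample patterns, not the ones from the question.
sample = ['g(.*)', 'g.*', 'st[(]ilton', '(?:a|b)c']
offenders = patterns_with_groups(sample)
print(offenders)  # ['g(.*)']
```

Parentheses inside a character class ([(]) and non-capturing groups ((?:...)) are not counted, so only patterns that would actually trigger the warning are reported.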
Alternatively, you could suppress this particular UserWarning by putting
import warnings
warnings.filterwarnings("ignore", 'This pattern has match groups')
before the call to str.contains.
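If you would rather not silence the warning for the whole program, the filter can be limited to the one call with warnings.catch_warnings. This is a sketch under some assumptions: the sample data is invented, and the '.*' in the message pattern is there because the exact wording of the warning may differ between pandas versions (worth checking against the version you have installed).

```python
import warnings

import pandas as pd

df = pd.DataFrame({'event_time': ['gouda', 'stilton', 'gruyere']})
pattern = 'g(.*)'  # capturing group, would normally trigger the UserWarning

with warnings.catch_warnings():
    # The message argument is a regex matched against the start of the
    # warning text; the filter is discarded when the with-block exits.
    warnings.filterwarnings(
        'ignore',
        message=r'This pattern .*has match groups',
        category=UserWarning,
    )
    mask = df['event_time'].str.contains(pattern, regex=True)

matched = df[mask]
print(matched)
```

Outside the with block, the warning filters are back to whatever they were before, so other UserWarnings are still reported.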
Here is a simple example which demonstrates the problem (and solution):
# import warnings
# warnings.filterwarnings("ignore", 'This pattern has match groups') # uncomment to suppress the UserWarning
import pandas as pd
df = pd.DataFrame({'event_time': ['gouda', 'stilton', 'gruyere']})
urls = pd.DataFrame({'url': ['g(.*)']})  # With a capturing group, there is a UserWarning
# urls = pd.DataFrame({'url': ['g.*']})  # Without a capturing group, there is no UserWarning. Uncommenting this line avoids the UserWarning.
substr = urls.url.values.tolist()
df[df['event_time'].str.contains('|'.join(substr), regex=True)]
emits the warning
script.py:10: UserWarning: This pattern has match groups. To actually get the groups, use str.extract.
df[df['event_time'].str.contains('|'.join(substr), regex=True)]
Removing the capturing group from the regex pattern:
urls = pd.DataFrame({'url': ['g.*']})
avoids the UserWarning.
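If the parentheses are needed for grouping (for example around an alternation), a non-capturing group (?:...) keeps the grouping without creating a match group. And if the captured text is what you actually want, str.extract returns it, as the warning suggests. A short sketch with made-up patterns and the same sample data:

```python
import re
import warnings

import pandas as pd

df = pd.DataFrame({'event_time': ['gouda', 'stilton', 'gruyere']})

# (?:...) groups the alternation without capturing anything.
non_capturing = 'g(?:ou|ru)'
print(re.compile(non_capturing).groups)  # 0

# Record any warnings raised by the call so we can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    mask = df['event_time'].str.contains(non_capturing, regex=True)

group_warnings = [w for w in caught if 'match groups' in str(w.message)]
matched = df[mask]['event_time'].tolist()
print(matched, len(group_warnings))

# To get the captured text itself, use str.extract with a capturing group.
# Rows that do not match get a missing value.
captured = df['event_time'].str.extract('g(.*)', expand=False)
print(captured)
```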