Searching for a matching string pattern in a DataFrame column in Python pandas

离开以前 2020-12-17 04:05

I have a DataFrame like the one below:

 name         genre
 satya      |ACTION|DRAMA|IC|
 satya      |COMEDY|BIOPIC|SOCIAL|
 abc        |CLASSICAL|
 xyz        |ROMA         


        
2 Answers
  •  一整个雨季
    2020-12-17 04:51

    I think you can escape | as \| in the regex, because an unescaped | is interpreted as OR. From the Python re documentation:

    '|'

    A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the '|' in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by '|' are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy. To match a literal '|', use \|, or enclose it inside a character class, as in [|].
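
    To see the effect described in the quote, here is a minimal sketch using only the standard re module. The genres list is copied from the question's sample rows; the variable names are illustrative and not part of the original answer.

```python
import re

# Sample genre strings copied from the question (fully visible rows only).
genres = ["|ACTION|DRAMA|IC|", "|COMEDY|BIOPIC|SOCIAL|", "|CLASSICAL|"]

# Unescaped: '|IC|' is read as three alternatives: '', 'IC' and ''.
# The empty alternative matches at the start of any string.
unescaped = [bool(re.search("|IC|", g)) for g in genres]

# Escaped: r'\|IC\|' matches only the literal text '|IC|'.
escaped = [bool(re.search(r"\|IC\|", g)) for g in genres]

# Character-class form, the other option named in the documentation.
char_class = [bool(re.search(r"[|]IC[|]", g)) for g in genres]

print(unescaped)   # every row matches
print(escaped)     # only the row that really contains |IC|
print(char_class)  # same result as the escaped pattern
```

    Note that BIOPIC and CLASSICAL both contain the letters IC, which is why the surrounding pipes have to be part of the pattern.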

    print(df['genre'].str.contains(r'\|IC\|'))
    0     True
    1    False
    2    False
    3    False
    4     True
    5     True
    Name: genre, dtype: bool
    
    print(df[df['genre'].str.contains(r'\|IC\|')])
        name                        genre
    0  satya            |ACTION|DRAMA|IC|
    4    def  |DISCOVERY|SPORT|COMEDY|IC|
    5    ghj                         |IC|
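
    As an alternative to escaping by hand, the sketch below uses re.escape and the regex=False option of Series.str.contains. The frame is rebuilt from the rows visible in the question and in the output above, leaving out the xyz row because its genre value is cut off; token, mask_escaped and mask_literal are illustrative names.

```python
import re

import pandas as pd

# Rows reconstructed from the question and the answer's output.
# Assumption: the truncated 'xyz' row is omitted rather than guessed.
df = pd.DataFrame({
    "name": ["satya", "satya", "abc", "def", "ghj"],
    "genre": [
        "|ACTION|DRAMA|IC|",
        "|COMEDY|BIOPIC|SOCIAL|",
        "|CLASSICAL|",
        "|DISCOVERY|SPORT|COMEDY|IC|",
        "|IC|",
    ],
})

token = "|IC|"

# Option 1: let re.escape add the backslashes before the regex search.
mask_escaped = df["genre"].str.contains(re.escape(token))

# Option 2: skip regex interpretation and do a plain substring search.
mask_literal = df["genre"].str.contains(token, regex=False)

print(df[mask_literal])
```

    Both masks select the same rows. regex=False is probably the simpler choice when the search text is a fixed string, since no character in it is treated as special.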
    
