Python remove stop words from pandas dataframe

走了就别回头了 2020-11-29 02:08

I want to remove the stop words from my column "tweets". How do I iterate over each row and each item?

pos_tweets = [('I love this car', 'positive'),

4 answers
  •  抹茶落季
    2020-11-29 02:52

    Check out pd.DataFrame.replace(); it might work for you:

    In [42]: test.replace(to_replace='I', value="",regex=True)
    Out[42]:
                                  tweet     class
    0                     love this car  positive
    1              This view is amazing  positive
    2           feel great this morning  positive
    3   am so excited about the concert  positive
    4              He is my best friend  positive
    

    Edit: with regex=True, replace() matches the string anywhere, including inside other words. For example, if rk is a stop word, it would strip rk from work, which is usually not what you want.

    Hence the word-boundary regex (\b) here:

    # `stop` is the list of stop words; \b restricts each match to a whole word
    for word in stop:
        test = test.replace(to_replace=r'\b%s\b' % word, value="", regex=True)
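
    The loop above makes one pass over the DataFrame per stop word. As a rough alternative sketch, the stop words can be joined into a single word-boundary pattern and applied to the tweet column in one pass. The sample rows and the `stop` list below are made up for illustration (they are not the asker's data), and the case-insensitive matching and whitespace cleanup are my own additions:

```python
import re
import pandas as pd

# Hypothetical sample data and stop-word list, for illustration only.
test = pd.DataFrame({
    "tweet": ["I love this car", "This view is amazing", "Hard work pays off"],
    "class": ["positive", "positive", "positive"],
})
stop = ["i", "this", "is", "rk"]

# One alternation wrapped in \b...\b so only whole words match;
# re.escape guards against stop words containing regex metacharacters.
pattern = r"\b(?:" + "|".join(re.escape(w) for w in stop) + r")\b"

test["tweet"] = (
    test["tweet"]
    .str.replace(pattern, "", flags=re.IGNORECASE, regex=True)
    .str.replace(r"\s+", " ", regex=True)  # collapse leftover runs of spaces
    .str.strip()                           # drop leading/trailing spaces
)
```

    Here "work" is left alone even though rk is in the list, because \b only matches at a word edge. Only the tweet column is touched, so the class labels cannot be altered by accident.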
    
