Remove punctuations in pandas [duplicate]

邮差的信 提交于 2020-01-01 04:17:45

问题


code: df['review'].head()
        index         review
output: 0      These flannel wipes are OK, but in my opinion

I want to remove punctuations from the column of the dataframe and create a new column.

code: import string 
      def remove_punctuations(text):
          return text.translate(None,string.punctuation)

      df["new_column"] = df['review'].apply(remove_punctuations)

Error:
  return text.translate(None,string.punctuation)
  AttributeError: 'float' object has no attribute 'translate'

I am using python 2.7. Any suggestions would be helpful.


回答1:


Using Pandas str.replace and regex:

df["new_column"] = df['review'].str.replace('[^\w\s]','')



回答2:


You can build a regex using the string module's punctuation list:

df['review'].str.replace('[{}]'.format(string.punctuation), '')



回答3:


I solved the problem by looping through the string.punctuation

def remove_punctuations(text):
    for punctuation in string.punctuation:
        text = text.replace(punctuation, '')
    return text

You can call the function the same way you did and It should work.

df["new_column"] = df['review'].apply(remove_punctuations)


来源:https://stackoverflow.com/questions/39782418/remove-punctuations-in-pandas

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!