How to remove newline in pandas dataframe columns?

Submitted by 柔情痞子 on 2020-05-16 22:15:06

Question


I want to shorten and clean up a CSV file to use it in ElasticSearch, but there are line breaks in some DataFrame cells, so the CSV cannot be parsed into ElasticSearch. I have now shortened the CSV with pandas and tried to remove the newlines, but it is not working.

The code is the following:

import pandas as pd

f=pd.read_csv("test.csv")

keep_col = ["Plugin ID","CVE","CVSS","Risk","Host","Protocol","Port","Name","Synopsis","Description","Solution",]

new_f = f[keep_col].replace('\\n',' ', regex=True)
new_f.to_csv("newFile.csv", index=False)

The shortening works, but I still have newlines in Description, Synopsis and Solution. Any idea how to solve this with Python / pandas? The CSV has about 100k entries, so the line breaks have to be removed in every entry.


Answer 1:


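From what I've learnt, a replacement count is the third parameter of Python's str.replace(), not of DataFrame.replace(), so regex=True is not a count and should stay. Without regex=True, DataFrame.replace() only swaps cells whose entire value equals the pattern, so a newline inside longer text is left alone. If the file was written on Windows, the cells may also contain carriage returns (\r), which the pattern \n does not match, so match both:

new_f = f[keep_col].replace(r'[\r\n]+', ' ', regex=True)

This should help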
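
To check how DataFrame.replace() treats the regex flag, here is a minimal sketch on a small made-up frame. The sample values are assumptions standing in for test.csv; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv
df = pd.DataFrame({
    "Name": ["plugin a", "plugin b"],
    "Description": ["first line\nsecond line", "windows\r\nline break"],
})

# Without regex=True, replace() only matches cells whose whole value is "\n",
# so nothing changes here.
unchanged = df.replace("\n", " ")

# With regex=True the pattern is applied inside each string;
# [\r\n]+ turns any run of CR/LF characters into a single space.
cleaned = df.replace(r"[\r\n]+", " ", regex=True)

print(unchanged.equals(df))
print(cleaned["Description"].tolist())
```

The first print shows that the non-regex call left the frame untouched, and the second shows both descriptions on a single line.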




Answer 2:


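If using a pandas DataFrame is not compulsory, you can do it in plain Python. Use the csv module rather than a line-by-line string replace, so that a line break inside a quoted field is not confused with the end of a record:

import csv

# newline='' lets the csv module handle line endings itself
with open('test.csv', 'r', newline='') as txtReader:
    with open('new_test.csv', 'w', newline='') as txtWriter:
        writer = csv.writer(txtWriter)
        for row in csv.reader(txtReader):
            # replace line breaks inside each field with a space
            writer.writerow([' '.join(field.splitlines()) for field in row])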
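
A plain text replace cannot tell a newline that ends a record from one that sits inside a quoted field, which is why a CSV-aware reader helps. The sketch below tries this on an in-memory sample instead of real files; the helper name strip_line_breaks and the sample rows are made up for illustration.

```python
import csv
import io

def strip_line_breaks(src, dst):
    """Copy CSV rows from src to dst, replacing line breaks inside fields with a space."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        # splitlines() splits on \n, \r\n and \r; joining with a space keeps words apart
        writer.writerow([" ".join(field.splitlines()) for field in row])

# Made-up sample: the quoted Description field spans two physical lines.
sample = 'Name,Description\nplugin a,"first line\nsecond line"\n'
out = io.StringIO()
strip_line_breaks(io.StringIO(sample), out)
print(out.getvalue())
```

The record that spanned two physical lines comes out as one line, while the newline that ends each record is kept.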


Source: https://stackoverflow.com/questions/55510574/how-to-remove-newline-in-pandas-dataframe-columns
