How to record bad lines skipped by pandas

Submitted by 你离开我真会死。 on 2019-11-30 09:28:55

Question


I'm reading a CSV file with pandas with

error_bad_lines=False

A warning is printed when a bad line is encountered. However, I want to keep a record of all the bad line numbers to feed into another program. Is there an easy way of doing that?

I thought about iterating over the file with a

chunksize=1

and catching the CParserError that ought to be thrown for each bad line encountered. When I do this, though, no CParserError is thrown for bad lines, so I can't catch them.


Answer 1:


The warnings are written to the standard error stream. You can capture them in a file by redirecting sys.stderr while the CSV is being read.

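from contextlib import redirect_stderr

import pandas as pd

# Redirect stderr only inside the block, so it is restored afterwards
# instead of being left pointing at a closed file.
with open('bad_lines.txt', 'w') as fp, redirect_stderr(fp):
    # In pandas 1.3 and later, on_bad_lines='warn' replaces error_bad_lines=False.
    pd.read_csv('my_data.csv', error_bad_lines=False)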
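
Since the goal is a list of bad line numbers for another program, the captured warnings still need to be parsed. Here is a minimal sketch of that step. It assumes each warning contains text of the form "Skipping line 3: expected 3 fields, saw 4", which may differ between pandas versions; the helper name bad_line_numbers and the sample text are made up for illustration.

```python
import re

# Assumed warning shape: "Skipping line 3: expected 3 fields, saw 4".
# Check this against what your pandas version actually writes.
SKIP_PATTERN = re.compile(r"Skipping line (\d+)")


def bad_line_numbers(log_text):
    """Return the line numbers mentioned in captured pandas warnings."""
    return [int(number) for number in SKIP_PATTERN.findall(log_text)]


# Hypothetical sample of captured output, for illustration only.
sample = (
    "Skipping line 3: expected 3 fields, saw 4\n"
    "Skipping line 7: expected 3 fields, saw 5\n"
)
print(bad_line_numbers(sample))
```

To use it on the file written above, read bad_lines.txt into a string and pass that string to bad_line_numbers. The pattern only looks for the "Skipping line N" part, so any extra wrapping around the message is ignored.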


Source: https://stackoverflow.com/questions/42856255/how-to-record-bad-lines-skipped-by-pandas
