I have CSV files, tab-separated, fields not wrapped in quotes, where field data can contain characters like single quotes, double quotes, pipes and backslashes.
(Added as a new answer since I don't have the reputation yet to comment.)
For the record, since I've been struggling with the same issue: you can use tr to remove \b, instead of just hoping it's not anywhere in your text.
tr -d '\010' < filename.csv > newfile.csv
(Here \010 is the octal representation of \b, the backspace character.)
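If you want to confirm that the file really contains backspace characters, and that tr removed them without touching the quotes, pipes, or backslashes, a quick check along these lines should work. This is only a sketch: the function names make_sample, strip_backspace, and count_backspace are ones I made up for illustration, and the sample row is invented test data.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; the names are not from any tool.

# Emit one tab-separated sample row containing a single quote, a pipe,
# a backslash, and one stray backspace byte (octal 010).
make_sample() {
  printf 'it'\''s\ta|b\tc\\d\010e\n'
}

# Remove every backspace byte from stdin.
strip_backspace() {
  tr -d '\010'
}

# Count backspace bytes on stdin: delete everything that is NOT \010,
# then count the bytes that remain (spaces trimmed for BSD wc).
count_backspace() {
  tr -cd '\010' | wc -c | tr -d ' '
}

make_sample | count_backspace                    # backspaces before cleaning
make_sample | strip_backspace | count_backspace  # backspaces after cleaning
```

With the sample row above, the first count should be 1 and the second 0. On a real file, replace make_sample with a redirect such as count_backspace < filename.csv.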
Since COPY supports reading from STDIN, you can ease the I/O impact by piping tr's output straight into it instead of writing an intermediate file:
tr -d '\010' < filename.csv | psql -c "COPY <tablename> FROM STDIN WITH CSV DELIMITER E'\t' QUOTE E'\b' NULL AS ''"