All,
I am stuck again trying to get my data in a format that I need it in. I have a text field that looks like this.
\"deangelo 001 deangelo
In the case you need to remove line breaks from the begin or end of the string, you may use this:
UPDATE table
SET field = regexp_replace(field, E'(^[\\n\\r]+)|([\\n\\r]+$)', '', 'g' );
Have in mind that the hat ^
means the begin of the string and the dollar sign $
means the end of the string.
Hope it help someone.
select regexp_replace(field, E'[\\n\\r\\u2028]+', ' ', 'g' )
I had the same problem in my postgres d/b, but the newline in question wasn't the traditional ascii CRLF, it was a unicode line separator, character U2028. The above code snippet will capture that unicode variation as well.
Update... although I've only ever encountered the aforementioned characters "in the wild", to follow lmichelbacher's advice to translate even more unicode newline-like characters, use this:
select regexp_replace(field, E'[\\n\\r\\f\\u000B\\u0085\\u2028\\u2029]+', ' ', 'g' )
select regexp_replace(field, E'[\\n\\r]+', ' ', 'g' )
read the manual http://www.postgresql.org/docs/current/static/functions-matching.html
OP asked specifically about regexes since it would appear there's concern for a number of other characters as well as newlines, but for those just wanting strip out newlines, you don't even need to go to a regex. You can simply do:
select replace(field,E'\n','');
I think this is an SQL-standard behavior, so it should extend back to all but perhaps the very earliest versions of Postgres. The above tested fine for me in 9.4 and 9.2