We are currently migrating one of our oracle databases to UTF8 and we have found a few records that are near the 4000 byte varchar limit. When we try and migrate these reco
Answer given by Francisco Hayoz is the best. Don't use pl/sql functions if sql can do it for you.
Here is the simple test in Oracle 11.2.03
select s
, regexp_replace(s,'[^'||chr(1)||'-'||chr(127)||']','') "rep ^1-127"
, dump(regexp_replace(s,'['||chr(127)||'-'||chr(225)||']','')) "rep 127-255"
from (
select listagg(c, '') within group (order by c) s
from (select 127+level l,chr(127+level) c from dual connect by level < 129))
And "rep 127-255" is
Typ=1 Len=30: 226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255
i.e for some reason this version of Oracle does not replace char(226) and above. Using '['||chr(127)||'-'||chr(225)||']' gives the desired result. If you need to replace other characters just add them to the regex above or use nested replace|regexp_replace if the replacement is different then '' (null string).