DB2/iSeries SQL clean up CR/LF, tabs etc

╄→гoц情女王★ submitted on 2019-12-07 07:04:58

Question


I need to find and clean up line breaks, carriage returns, tabs and "SUB"-characters in a set of 400k+ string records, but this DB2 environment is taking a toll on me.

I thought I could do some searching and replacing with the REPLACE() and CHR() functions, but CHR() does not seem to be available on this system (Error: CHR in *LIBL type *N not found). Escape sequences such as \t, \r and \n don't seem to work either. The characters can be in the middle of strings or at the end of them.

DBMS = DB2
System = iSeries
Language = SQL
Encoding = Not sure, possibly EBCDIC

Any hints on what I can do with this?


Answer 1:


If you want to clean up specific characters such as carriage return (EBCDIC x'0D') and line feed (EBCDIC x'25'), look up each character's EBCDIC code point, then use the TRANSLATE() function to replace it with a space.

If you just want to remove undisplayable characters, look for anything below x'40' (the EBCDIC space).
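
As a rough sketch of the "anything under x'40'" idea, the query below lists rows that contain at least one such byte. It assumes a single-byte EBCDIC character column; `mylib.mytable` and `mycol` are placeholder names, not from the original post. TRANSLATE() maps every byte listed in the from-string (x'00' through x'3F') to a blank, so a row whose translated value differs from the original holds at least one control character.

```sql
-- Placeholder names: mylib.mytable and mycol are assumptions for illustration.
-- Assumes mycol is a single-byte EBCDIC character column.
SELECT mycol
FROM   mylib.mytable
WHERE  mycol <> TRANSLATE(
           mycol,
           ' ',   -- to-string; shorter than the from-string, so it is padded with blanks
           x'000102030405060708090A0B0C0D0E0F' CONCAT
           x'101112131415161718191A1B1C1D1E1F' CONCAT
           x'202122232425262728292A2B2C2D2E2F' CONCAT
           x'303132333435363738393A3B3C3D3E3F'
       );
```

The same TRANSLATE() expression could be used in an UPDATE ... SET once the SELECT shows only the rows you expect.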




Answer 2:


I used this SQL to find x'25' and x'0D':

SELECT 
     <field>
    , LOCATE(x'0D', <field>) AS "0D" 
    , LOCATE(x'25', <field>) AS "25" 
    , length(trim(<field>)) AS "Length"
FROM <file> 
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0 

And I used this SQL to replace them:

UPDATE <file> 
SET <field> = REPLACE(REPLACE(<field>, x'0D', ' '), x'25', ' ')
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0 



Answer 3:


Here is a sample script that replaces X'41' with X'40', something that was creating issues at our shop:

UPDATE [yourfile]
SET [yourfield] = TRANSLATE([yourfield], X'40', X'41')
WHERE [yourfield] LIKE '%' CONCAT X'41' CONCAT '%'

If you need to replace more than one character, extend the "to" and "from" hexadecimal strings to the values you need in the TRANSLATE function.
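
For instance, a sketch that extends the strings to cover the characters mentioned in the question might look like the following. The table and column names are placeholders, and the code points assume a common EBCDIC code page such as CCSID 37, so check them against your own data first.

```sql
-- Placeholder names: mylib.mytable and mycol are assumptions for illustration.
-- From-string bytes (EBCDIC): 41 = required space, 0D = CR, 25 = LF,
-- 05 = horizontal tab, 3F = SUB. Each one maps to the blank (40) at the
-- same position in the to-string.
UPDATE mylib.mytable
SET    mycol = TRANSLATE(mycol, X'4040404040', X'410D25053F')
WHERE  mycol <> TRANSLATE(mycol, X'4040404040', X'410D25053F');
```

The WHERE clause restricts the update to rows that would actually change. Running the same TRANSLATE() in a SELECT first is a cheap way to preview the result.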




Answer 4:


Try TRANSLATE or REPLACE.

The brute force method involves using POSITION to find the errant character, then SUBSTR before and after it. CONCAT the two substrings (less the undesirable character) to re-form the column.
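
A minimal sketch of that brute-force approach, shown as a SELECT so nothing is modified. The names `mylib.mytable` and `mycol` are placeholders, and x'0D' stands in for whichever character you are chasing. To stay on safe ground it only handles a character that sits strictly inside the value, and only the first occurrence per row.

```sql
-- Placeholder names: mylib.mytable and mycol are assumptions for illustration.
-- Removes the first x'0D' in each row by gluing together the text before
-- and after it. The WHERE clause keeps only rows where the character is
-- neither the first nor the last position, so both SUBSTR calls receive an
-- in-range start and a length of at least one.
SELECT mycol,
       SUBSTR(mycol, 1, POSITION(x'0D' IN mycol) - 1)
       CONCAT
       SUBSTR(mycol, POSITION(x'0D' IN mycol) + 1) AS cleaned
FROM   mylib.mytable
WHERE  POSITION(x'0D' IN mycol) > 1
  AND  POSITION(x'0D' IN mycol) < LENGTH(mycol);
```

Because each pass removes a single occurrence, rows with several bad characters need repeated runs, which is one reason TRANSLATE or REPLACE is usually the easier route.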

The character encoding is almost certainly one of the EBCDIC character sets. Depending on how the table got loaded in the first place, the CR may be x'0D' and the LF x'15' or x'25'. An easy way to find out is to get to a green screen and do a DSPPFM against the table. Press F10 then F11 to view the table in raw, hexadecimal (over/under) format.
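
If a green screen is not handy, a similar hex view can be obtained from SQL with the HEX() scalar function. A minimal sketch, with placeholder table and column names:

```sql
-- Placeholder names: mylib.mytable and mycol are assumptions for illustration.
-- HEX() returns two hexadecimal digits per byte, so codes such as 0D, 15
-- and 25 can be spotted in the output.
SELECT mycol,
       HEX(mycol) AS mycol_hex
FROM   mylib.mytable
FETCH FIRST 20 ROWS ONLY;
```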




Answer 5:


For details on the available functions see the DB2 for i5/OS SQL Reference.




Answer 6:


Perhaps the TRANSLATE() function will serve your needs.

    TRANSLATE( data, tochars, fromchars )

...where fromchars is the set of characters you don't want, and tochars is the corresponding set of characters you want them replaced with. You may have to write these out in hex format, as x'nnnnnn...', and you will need to know what character set you are working with. Using the DSPFFD command on your table should show the CCSID of your fields.
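
As an alternative to DSPFFD, the CCSID can probably also be read from the SQL catalog. This sketch assumes the QSYS2.SYSCOLUMNS catalog view is accessible to you; MYLIB and MYTABLE are placeholder names.

```sql
-- Placeholder names: MYLIB and MYTABLE are assumptions for illustration.
-- QSYS2.SYSCOLUMNS is the DB2 for i catalog view describing columns;
-- the CCSID value is the one that matters for character columns.
SELECT column_name, data_type, length, ccsid
FROM   qsys2.syscolumns
WHERE  table_schema = 'MYLIB'
  AND  table_name   = 'MYTABLE';
```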



Source: https://stackoverflow.com/questions/7573321/db2-iseries-sql-clean-up-cr-lf-tabs-etc
