Escaping control characters in Oracle XDB

Submitted by 筅森魡賤 on 2019-11-28 09:44:09

Question


I'm completely new to Oracle's XDB, in particular using it to generate XML output from a database table, and am working on an application which is moving from 9i (Oracle9i Enterprise Edition Release 9.2.0.5.0 - Production) to 11g (Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production). Here's a small test case which illustrates the problem I'm having:

select xmlelement("test", test) from (select 'a' test from dual);

This works and gives me:

<test>a</test>

However, in 11g, if I swap 'a' for an invalid character such as U+0013, I get the following error:

ORA-31061: XDB error: special char to escaped char conversion failed.

Under 9i the same thing works successfully, with no error.

Obviously the ideal answer is to have some validation in place to prevent control characters getting into the simple character data that I'm trying to convert into XML, but unfortunately that's outside the scope of what I'm doing.

Is this something anyone else has experienced, and if so, is there a simple change I can make to my XML generating script, or do I need to do some other kind of cleansing? Or just manually fix the problem on the rare occasions that it happens (which would be a perfectly reasonable option for my needs).

Many thanks.


Answer 1:


U+0013 is not a valid Unicode code point for XML: below U+0020, XML 1.0 allows only tab (U+0009), line feed (U+000A) and carriage return (U+000D). See e.g. Valid characters in XML. So 11g correctly raises an exception.

SQL> select xmlelement("test", unistr('a\0013b')) from dual;
ERROR:
ORA-31061: XDB error: special char to escaped char conversion failed.

no rows selected

SQL> select xmlelement("test", unistr('a\00aeb')) from dual;

XMLELEMENT("TEST",UNISTR('A\00AEB'))
--------------------------------------------------------------------------------
<test>a®b</test>

SQL> 

I have no idea why this passes in 9i (I don't have that version available), but it is probably because Oracle's implementation has become more standards-conformant and/or the standard itself has evolved.

Your fix is correct.




Answer 2:


While fixing the data at the source is always the best solution, I found this useful in cases where I cannot control the data at the source:

select xmlelement("test", test) from (select regexp_replace(unistr('a\0013b'), '[[:cntrl:]]', '') test from dual);

The important piece is regexp_replace(your_field, '[[:cntrl:]]', ''), which removes control characters from the data.
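
One caveat with [[:cntrl:]]: it also matches tab, line feed and carriage return, which XML 1.0 does allow. If those need to survive, a sketch along these lines might work instead. It uses TRANSLATE to drop only the characters with codes 1 to 8, 11, 12 and 14 to 31, and it assumes an ASCII-based database character set so that chr(n) returns those control characters. The inline test value is made up for illustration.

```sql
-- Remove only the control characters that XML 1.0 forbids, keeping
-- tab (9), line feed (10) and carriage return (13).
-- The leading 'x' maps to itself; it is there so that the third argument
-- of TRANSLATE is not an empty string, which Oracle treats as NULL.
-- Assumption: ASCII-based database character set.
select xmlelement("test",
         translate(test,
           'x' || chr(1)  || chr(2)  || chr(3)  || chr(4)
               || chr(5)  || chr(6)  || chr(7)  || chr(8)
               || chr(11) || chr(12)
               || chr(14) || chr(15) || chr(16) || chr(17)
               || chr(18) || chr(19) || chr(20) || chr(21)
               || chr(22) || chr(23) || chr(24) || chr(25)
               || chr(26) || chr(27) || chr(28) || chr(29)
               || chr(30) || chr(31),
           'x')) as xml_out
from (select 'a' || chr(19) || 'b' as test from dual);
```

Characters listed in the second argument that have no counterpart in the third are deleted, so the U+0013 is gone before XMLELEMENT sees the value. This does not cover every code point XML rejects (U+0000, for instance, is not handled here).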




Answer 3:


Just to follow up on this for anyone interested: as far as I can tell, 9i simply passed the invalid character through, producing invalid XML. 11g throws an error, which is probably the more correct behaviour, even if it is annoying in my case.

The only reasonable solution I found was to fix the content at source.
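
For anyone taking the fix-at-source route, a rough way to locate candidate rows is sketched below. The sample_data rows and the id column are hypothetical stand-ins for a real table. Note that [[:cntrl:]] is a coarse filter that also flags tab, line feed and carriage return, which are legal in XML, so the hex output from DUMP is there to show which character each row actually contains.

```sql
-- Hypothetical sample rows; replace sample_data with the real table.
with sample_data as (
  select 1 as id, 'clean value' as test from dual
  union all
  select 2, 'a' || chr(19) || 'b' from dual          -- contains U+0013
  union all
  select 3, 'tab' || chr(9) || 'separated' from dual -- contains a tab
)
select id,
       dump(test, 16) as bytes_hex  -- format 16 lists each byte in hexadecimal
from sample_data
where regexp_like(test, '[[:cntrl:]]');
```

With this sample data, rows 2 and 3 should be flagged, but only row 2 holds a character that XMLELEMENT rejects. In the hex dump, U+0013 shows up as byte 13 and the tab as byte 9.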



Source: https://stackoverflow.com/questions/7270445/escaping-control-characters-in-oracle-xdb
