ORA-31011: XML parsing failed - invalid characters (oracle sql)

Submitted by 流过昼夜 on 2019-12-24 06:48:09

Question


I'm producing an XML document using SQL on an Oracle 11g database, but I'm having a problem with one field: the title field holds many characters, some of which XML sees as invalid. I'm trying to use the statement below to catch as many as possible and replace them with NULL.

REGEXP_REPLACE (title, '’|£|&|*|@|-|>|/|<|;|\', '', 1, 0, 'i') as title

I'm still getting the parse error, so there must be more invalid characters that I've missed. I know it's failing on this field because when I replace the field with the string literal 'title' (as below), the document parses and everything works fine.

REGEXP_REPLACE ('title', '’|£|&|*|@|-|>|/|<|;|\', '', 1, 0, 'i') as title

I'm using XML version="1.0" encoding="UTF-8". Is there an easy way around this, or do I have to locate the failing records, which could be any of 2 million? The title field holds song titles from all over the world. Could I use REGEXP_REPLACE to keep a range of characters, let's say between char(32) and char(255), and replace anything outside that range with NULL?

Or is there another solution?

Thanks in advance, guys.


Answer 1:


Have you considered keeping only the characters you want? I don't know what they are, but something like this (applied to the title column, not the string literal 'title'):

REGEXP_REPLACE(title, '[^a-zA-Z0-9 ,.!]', '', 1, 0, 'i') as title
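
A whitelist this narrow would also strip accented and non-Latin letters, which matters for song titles from all over the world. Before deciding what to remove, it may help to find the rows that actually contain suspect characters. The query below is a minimal sketch rather than a confirmed diagnosis: the table name songs and the column id are hypothetical (the question names neither), and it assumes the culprits are control characters, which XML 1.0 does not allow.

```sql
-- Hypothetical table "songs" with columns "id" and "title" (names assumed).
-- Lists rows whose title contains a control character, together with a
-- hex dump of the stored bytes so the offending code can be identified.
SELECT id,
       title,
       DUMP(title, 1016) AS title_bytes
FROM   songs
WHERE  REGEXP_LIKE(title, '[[:cntrl:]]');
```

If this returns no rows, the failure is probably caused by something else, such as unescaped markup characters.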



Answer 2:


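The only characters that must be escaped in XML text are &, < and > (as well as " or ' in attributes). Note that XML 1.0 also forbids most control characters below U+0020 (everything except tab, line feed and carriage return); those cannot be escaped and have to be removed.

You can escape such characters with an Oracle function.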

Example:

select DBMS_XMLGEN.CONVERT(title) from ...

Details: https://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_xmlgen.htm#i1013100
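
Putting the two ideas together, one possible cleanup is to remove control characters first and then entity-encode what remains. This is a sketch under assumptions: the table name songs is hypothetical (the example above elides the FROM clause), and the flag value 0 is the numeric value of DBMS_XMLGEN.ENTITY_ENCODE, written as a literal because package constants cannot be referenced directly from SQL in 11g.

```sql
-- Hypothetical table "songs" (name assumed; the original elides the FROM clause).
-- Step 1: drop control characters, which XML 1.0 cannot carry even when escaped.
-- Step 2: entity-encode the rest; flag 0 corresponds to DBMS_XMLGEN.ENTITY_ENCODE.
SELECT DBMS_XMLGEN.CONVERT(
         REGEXP_REPLACE(title, '[[:cntrl:]]', ''),
         0
       ) AS title
FROM   songs;
```

If the document is built with a function that already escapes text content (XMLELEMENT does, for instance), applying CONVERT as well would double-escape the values, so in that case only the REGEXP_REPLACE step would be needed.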



Source: https://stackoverflow.com/questions/40493076/ora-31011-xml-parsing-failed-invalid-characters-oracle-sql
