Escaping special characters when generating XML in Java

ぃ、小莉子 submitted on 2019-12-02 20:14:50

You can use the Apache Commons Lang library to escape a string:

org.apache.commons.lang.StringEscapeUtils

String escapedXml = StringEscapeUtils.escapeXml("the data might contain & or ! or % or ' or # etc");
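If pulling in a library is not an option, the same idea can be sketched with the JDK alone. The class and method names below are hypothetical, and this is not the Commons Lang implementation: it only replaces the five characters that XML predefines entities for. Note that `!`, `%` and `#` from the example string need no escaping in XML, so they pass through unchanged.

```java
final class XmlEscapeSketch {

    /** Replaces the five characters XML treats as markup with their predefined entities. */
    static String escapeXml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);        // everything else is copied as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Only '&' and the apostrophe are rewritten in this input.
        System.out.println(escapeXml("the data might contain & or ! or % or ' or # etc"));
    }
}
```

This sketch does not filter control characters that are illegal in XML; a library routine is the safer choice when the input is untrusted.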

But what you are actually looking for is a way to convert an arbitrary string into a valid XML tag name. Restricted to ASCII characters, an XML tag name must begin with one of _:a-zA-Z, followed by any number of characters from _:a-zA-Z0-9.-
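The ASCII rule above can be expressed as a regular expression. The following is a minimal sketch, not a full implementation of the XML Name production (which also allows many non-ASCII characters); the class and both helper names are hypothetical, and the replacement strategy (substituting `_`) is just one possible choice.

```java
import java.util.regex.Pattern;

final class XmlNameSketch {

    // ASCII-only subset of valid XML names: one start character, then any name characters.
    private static final Pattern ASCII_NAME =
            Pattern.compile("[_:A-Za-z][_:A-Za-z0-9.\\-]*");

    /** Returns true if s is a valid tag name under the ASCII-only rule. */
    static boolean isAsciiXmlName(String s) {
        return ASCII_NAME.matcher(s).matches();
    }

    /**
     * Hypothetical conversion: every disallowed character becomes '_', and a
     * leading '_' is added when the first character may not start a name.
     * The mapping is lossy, so two different inputs can yield the same name.
     */
    static String toAsciiXmlName(String s) {
        String cleaned = s.replaceAll("[^_:A-Za-z0-9.\\-]", "_");
        if (cleaned.isEmpty() || !cleaned.substring(0, 1).matches("[_:A-Za-z]")) {
            cleaned = "_" + cleaned;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(isAsciiXmlName("unit price (USD)")); // false: spaces and parentheses
        System.out.println(toAsciiXmlName("unit price (USD)"));
        System.out.println(toAsciiXmlName("0.0"));
    }
}
```

The colon is accepted here because the rule above lists it, but in namespace-aware documents it separates prefix and local name, so it is usually better avoided in generated names.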

As far as I know, there is no library that does this for you, so you either have to implement your own function that converts an arbitrary string to match this pattern, or alternatively put the string into the value of an attribute:

<property name="no more need to be encoded, it should be handled by XML library">0.0</property>
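To illustrate the attribute approach, here is a minimal sketch that builds such a `<property>` element with the JDK's DOM and Transformer APIs, so that the serializer takes care of escaping. The class name, method name and sample values are assumptions made for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class PropertyAttributeSketch {

    /** Builds <property name="...">value</property>; the serializer escapes both strings. */
    static String buildProperty(String name, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element property = doc.createElement("property");
        property.setAttribute("name", name);   // raw string, no manual escaping
        property.setTextContent(value);
        doc.appendChild(property);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The ampersand in the name is written as &amp; in the serialized output.
        System.out.println(buildProperty("price & tax", "0.0"));
    }
}
```

Because the arbitrary string is now data rather than a tag name, none of the name restrictions apply to it.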
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class RssParser {
    int length;
    URL url;
    URLConnection urlConn;
    NodeList nodeList;
    Document doc;
    Node node;
    Element firstEle;
    NodeList titleList;
    Element ele;
    NodeList txtEleList;
    String retVal, urlStrToParse, rootNodeName;

    public RssParser(String urlStrToParse, String rootNodeName) {
        this.urlStrToParse = urlStrToParse;
        this.rootNodeName = rootNodeName;

        try {
            // URL of the feed to parse
            url = new URL(this.urlStrToParse);
            urlConn = url.openConnection();

            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();

            String s;
            try (InputStream in = urlConn.getInputStream()) {
                s = isToString(in);
            }
            // Escape ampersands so the parser accepts them. Caution: this also
            // rewrites ampersands that already belong to an entity such as &amp;.
            s = s.replace("&", "&amp;");
            // Add an XML declaration only if the feed does not already start with one;
            // a second declaration would make the document invalid.
            if (!s.startsWith("<?xml")) {
                s = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" + s;
            }
            System.out.println("STR: \n" + s);

            // Parse the escaped text. The connection's stream has already been
            // consumed, so it must not be read a second time.
            doc = db.parse(new InputSource(new StringReader(s)));
            // Elements named rootNodeName; each contains the inner element nodes
            nodeList = doc.getElementsByTagName(this.rootNodeName);
            length = nodeList.getLength();
            firstEle = doc.getDocumentElement();
        } catch (ParserConfigurationException pce) {
            System.out.println("Could not configure XML parser: " + pce.getMessage());
        } catch (SAXException se) {
            System.out.println("Invalid XML: " + se.getMessage());
        } catch (IOException ioe) {
            System.out.println("Could not read XML: " + ioe.getMessage());
        } catch (Exception e) {
            System.out.println("Error: " + e);
        }
    }

    /** Reads the whole stream and decodes it as UTF-8. */
    public String isToString(InputStream in) throws IOException {
        // Collect the bytes first so that multi-byte characters split across
        // two reads are decoded correctly.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] b = new byte[512];
        for (int i; (i = in.read(b)) != -1;) {
            out.write(b, 0, i);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Returns the text of the i-th element named param, or null if there is none. */
    public String getVal(int i, String param) {
        retVal = null;
        if (nodeList == null) {
            return null; // parsing failed in the constructor
        }
        node = nodeList.item(i);
        if (node != null && node.getNodeType() == Node.ELEMENT_NODE) {
            System.out.println("Param: " + param);
            titleList = firstEle.getElementsByTagName(param);
            if (firstEle.hasAttribute("id")) {
                System.out.println("Root element has an id attribute");
            } else {
                System.out.println("Root element has no id attribute");
            }
            ele = (Element) titleList.item(i);
            System.out.println("ele: " + ele);
            if (ele == null || ele.getFirstChild() == null) {
                return null;
            }
            txtEleList = ele.getChildNodes();
            retVal = txtEleList.item(0).getNodeValue();
            System.out.println("retVal: " + retVal);
        }
        return retVal;
    }
}
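One caveat with the `s.replace("&", "&amp;")` call in the class above: it also rewrites ampersands that already belong to an entity, so `&amp;` becomes `&amp;amp;`. A possible refinement, sketched here with a hypothetical helper and a regular expression of my own choosing, escapes only the ampersands that do not start a named or numeric reference.

```java
import java.util.regex.Pattern;

final class BareAmpersandSketch {

    // An '&' that is NOT followed by "name;", "#digits;" or "#xhex;".
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z_][A-Za-z0-9._\\-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    /** Escapes stray ampersands and leaves existing entity references alone. */
    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // Only the first ampersand is rewritten; "&amp;" and "&#38;" are kept.
        System.out.println(escapeBareAmpersands("Tom & Jerry &amp; co &#38; x"));
    }
}
```

This remains a heuristic: it would also touch ampersands inside CDATA sections, and it does not make undeclared named entities (for example `&nbsp;` without a DTD) parseable.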

Use the code below to escape the characters in a String for XML. StringEscapeUtils is available in the Apache Commons Lang 3 jar (commons-lang3):

StringEscapeUtils.escapeXml11("String to be escaped");