StAX - get XML node as string

孤街浪徒 2021-01-05 15:49

The XML looks like this:


   
      ...stuff...
   
   
              


        
5 answers
  •  南方客 (original poster)
    2021-01-05 16:32

    I had a similar task, and although the original question is more than a year old, I couldn't find a satisfying answer. The most interesting answer so far was Blaise Doughan's, but I couldn't get it to run on the XML I am expecting (maybe some parameters for the underlying parser could change that?). Here is the XML, very simplified:

    
        
            ...
            

    Lorem ipsum...

    Devils inside... ...

    My solution:

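    import java.io.StringWriter;
    import javax.xml.stream.XMLEventReader;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.events.XMLEvent;

    // Returns everything between an already consumed START_ELEMENT and its
    // matching END_ELEMENT as an XML string. The matching END_ELEMENT itself
    // is left in the stream.
    public static String readElementBody(XMLEventReader eventReader)
        throws XMLStreamException {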
        StringWriter buf = new StringWriter(1024);
    
        int depth = 0;
        while (eventReader.hasNext()) {
            // peek event
            XMLEvent xmlEvent = eventReader.peek();
    
            if (xmlEvent.isStartElement()) {
                ++depth;
            }
            else if (xmlEvent.isEndElement()) {
                --depth;
    
                // reached END_ELEMENT tag?
                // break loop, leave event in stream
                if (depth < 0)
                    break;
            }
    
            // consume event
            xmlEvent = eventReader.nextEvent();
    
            // print out event
            xmlEvent.writeAsEncodedUnicode(buf);
        }
    
        return buf.toString();
    }
    

    Usage example:

    XMLEventReader eventReader = ...;
    while (eventReader.hasNext()) {
        XMLEvent xmlEvent = eventReader.nextEvent();
        if (xmlEvent.isStartElement()) {
            StartElement elem = xmlEvent.asStartElement();
            String name = elem.getName().getLocalPart();
    
            if ("DESCRIPTION".equals(name)) {
                String xmlFragment = readElementBody(eventReader);
                // do something with it...
                System.out.println("'" + xmlFragment + "'");
            }
        }
        else if (xmlEvent.isEndElement()) {
            // ...
        }
    }
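
    To try the method end to end, here is a minimal, self-contained sketch. The class name ReadElementBodyDemo, the helper extractFirst, and the sample XML string are made up for illustration; readElementBody is the method from above with only cosmetic changes. It assumes the StAX implementation that ships with the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class ReadElementBodyDemo {

    // Same logic as readElementBody above: copy events until the END_ELEMENT
    // that closes the element whose START_ELEMENT was already consumed.
    static String readElementBody(XMLEventReader eventReader)
            throws XMLStreamException {
        StringWriter buf = new StringWriter(1024);
        int depth = 0;
        while (eventReader.hasNext()) {
            XMLEvent xmlEvent = eventReader.peek();
            if (xmlEvent.isStartElement()) {
                ++depth;
            } else if (xmlEvent.isEndElement()) {
                --depth;
                if (depth < 0) {
                    break; // closing tag of the enclosing element, leave it in the stream
                }
            }
            xmlEvent = eventReader.nextEvent();
            xmlEvent.writeAsEncodedUnicode(buf);
        }
        return buf.toString();
    }

    // Hypothetical helper: returns the body of the first element with the
    // given local name, or null if no such element exists.
    static String extractFirst(String xml, String elementName)
            throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement() && elementName.equals(
                        event.asStartElement().getName().getLocalPart())) {
                    return readElementBody(reader);
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample input, invented for this sketch
        String xml = "<ITEM><DESCRIPTION><P>Lorem ipsum</P>"
                + "<P>Devils <B>inside</B></P></DESCRIPTION></ITEM>";
        System.out.println("'" + extractFirst(xml, "DESCRIPTION") + "'");
    }
}
```

    How tags and attributes are serialized by writeAsEncodedUnicode is up to the StAX implementation, so the exact text of the fragment may differ between parsers.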
    

    Note that the extracted XML fragment contains the complete body content of the element, including whitespace and comments. Filtering those on demand and making the buffer size configurable have been left out for brevity:

    '
        
            ...
            

    Lorem ipsum...

    Devils inside... ...
    '
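
    The filtering mentioned above could look roughly like the sketch below. It is one possible approach, not part of the original solution: comments are detected via XMLStreamConstants.COMMENT and whitespace-only text via Characters.isWhiteSpace(). The names FilteredBodyDemo, readElementBodyFiltered, and extractFiltered are hypothetical, and the sample XML is invented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class FilteredBodyDemo {

    // Variant of readElementBody that drops comments and whitespace-only text.
    static String readElementBodyFiltered(XMLEventReader eventReader)
            throws XMLStreamException {
        StringWriter buf = new StringWriter();
        int depth = 0;
        while (eventReader.hasNext()) {
            XMLEvent event = eventReader.peek();
            if (event.isStartElement()) {
                ++depth;
            } else if (event.isEndElement()) {
                --depth;
                if (depth < 0) {
                    break; // leave the enclosing END_ELEMENT in the stream
                }
            }
            event = eventReader.nextEvent();

            boolean isComment =
                    event.getEventType() == XMLStreamConstants.COMMENT;
            boolean isBlankText =
                    event.isCharacters() && event.asCharacters().isWhiteSpace();
            if (!isComment && !isBlankText) {
                event.writeAsEncodedUnicode(buf);
            }
        }
        return buf.toString();
    }

    // Hypothetical helper: filtered body of the first element with this name.
    static String extractFiltered(String xml, String elementName)
            throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement() && elementName.equals(
                        event.asStartElement().getName().getLocalPart())) {
                    return readElementBodyFiltered(reader);
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<ITEM><DESCRIPTION>\n  <!-- internal note -->\n"
                + "  <P>Lorem ipsum</P>\n</DESCRIPTION></ITEM>";
        System.out.println("'" + extractFiltered(xml, "DESCRIPTION") + "'");
    }
}
```

    Dropping whitespace-only text is only safe when that whitespace is pure indentation. In mixed content, a single space between two inline elements is significant and would be lost as well.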
