Java Transformer how to ignore namespaces

Posted by 隐身守侯 on 2021-01-28 02:05:28

Question


I have to transform XML to XHTML, but the XML declares a namespace, xmlns='http://www.lotus.com/dxl', which is never used anywhere in the document, so the transformer doesn't match anything ...

Is there a way to ignore namespaces? I am using the Oracle Java transformer (import javax.xml.transform.Transformer; import javax.xml.transform.TransformerFactory).

Or are there any better libraries?


Answer 1:


No, you can't ignore namespaces.

If the namespace declaration xmlns='http://www.lotus.com/dxl' appears in the outermost element, then you can't say it "isn't used anywhere" - on the contrary, it is used everywhere! It effectively changes every element name in the document to a different name. There's no way you can ignore that.
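To see the effect of the default namespace concretely, here is a minimal sketch using the JDK's namespace-aware DOM parser. The element names (database, document, item) are hypothetical stand-ins, not taken from the asker's actual DXL file. Every element, including the nested ones that carry no xmlns attribute of their own, reports the DXL namespace URI.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DefaultNamespaceDemo {
    // Hypothetical sample input; only the namespace URI comes from the question.
    static final String XML =
        "<database xmlns='http://www.lotus.com/dxl'>"
      + "<document><item/></document>"
      + "</database>";

    // Returns the namespace URI of the first element with the given local name.
    static String namespaceOf(String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            // "*" matches elements in any namespace.
            Element element =
                (Element) doc.getElementsByTagNameNS("*", localName).item(0);
            return element.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The root declares the namespace; the nested element inherits it.
        System.out.println(namespaceOf("database"));
        System.out.println(namespaceOf("item"));
    }
}
```

Both calls return the same URI, which is why an XPath or match pattern that names plain item selects nothing: the element's full name is the pair (http://www.lotus.com/dxl, item).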

If you were using XSLT 2.0, then you would be able to say in your stylesheet xpath-default-namespace="http://www.lotus.com/dxl" which would pretty much do what you want: it says that any unprefixed name in a match pattern or XPath expression should be interpreted as referring to a name in namespace http://www.lotus.com/dxl. Sadly, you've chosen an XSLT processor that doesn't implement XSLT 2.0. So you'll have to do it the hard way (which is described in about 10,000 posts that you will find by searching for "XSLT default namespace").
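For reference, the "hard way" in XSLT 1.0 usually means binding the source namespace to a prefix in the stylesheet and using that prefix in every match pattern and XPath expression. The following is a minimal sketch with the JDK's built-in transformer. The prefix dxl, the element names document and item, and the output layout are assumptions made for illustration, not the asker's real data.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class DxlToXhtml {
    // Hypothetical input document; only the namespace URI is from the question.
    static final String XML =
        "<document xmlns='http://www.lotus.com/dxl'><item>Hello</item></document>";

    // XSLT 1.0 stylesheet. The default namespace makes the literal result
    // elements XHTML; the dxl prefix (an arbitrary name) is bound to the
    // source namespace and used in all patterns.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:dxl='http://www.lotus.com/dxl'"
      + " xmlns='http://www.w3.org/1999/xhtml'"
      + " exclude-result-prefixes='dxl'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/dxl:document'>"
      + "<html><body><xsl:apply-templates select='dxl:item'/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='dxl:item'>"
      + "<p><xsl:value-of select='.'/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the serialized result.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML));
    }
}
```

The prefix in the stylesheet does not have to match anything in the source document; only the namespace URI matters. exclude-result-prefixes keeps the DXL namespace declaration out of the XHTML output.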




Answer 2:


You can't ignore namespaces easily, and it won't be pretty, but it is possible. Be warned that the trick is implementation dependent: it relies on getting the right part of the Transformer implementation to write out the prefixes without getting flustered.

OK then, this works for me going from a Node to a StringWriter:

public static String nodeToString(Node node) throws TransformerException {
  StringWriter results = new StringWriter();
  Transformer transformer = createTransformer();
  transformer.transform(new DOMSource(node), new StreamResult(results) {
    @Override 
    public Writer getWriter() {
      Field field = findFirstAssignable(transformer.getClass());
      try {
        field.setAccessible(true);
        field.set(transformer, new TransletOutputHandlerFactory(false) {
          @Override 
          public SerializationHandler getSerializationHandler() throws 
            IOException, ParserConfigurationException {

            SerializationHandler handler = super.getSerializationHandler();
            SerializerBase base = (SerializerBase) handler.asDOMSerializer();
            base.setNamespaceMappings(new NamespaceMappings() {
              @Override 
              public String lookupNamespace(String prefix) {
                return prefix;
              }
            });
            return handler;
          }
        });
      } catch(IllegalAccessException e) {
        throw new AssertionError("Must not happen", e);
      }
      return super.getWriter();
    }
  });
  return results.toString();
}
private static <E> Field findFirstAssignable(Class<E> clazz) {
  return Stream.<Class<? super E>>iterate(clazz, Convert::iteration)
    .flatMap(Convert::classToFields)
    .filter(Convert::canAssign).findFirst().get();
}
private static <E> Class<? super E> iteration(Class<? super E> c) {
  return c == null ? null : c.getSuperclass();
}
private static boolean canAssign(Field f) {
  return f == null || 
    f.getType().isAssignableFrom(TransletOutputHandlerFactory.class);
}
private static <E> Stream<Field> classToFields(Class<? super E> c) {
  return c == null ? Stream.of((Field) null) : 
    Arrays.stream(c.getDeclaredFields());
}

What this is doing is pretty much just customizing the mapping of namespaces to prefixes. Every prefix is mapped to a namespace identified by its prefix, so there shouldn't even be any conflicts. The rest of it is fighting the API.

To make the example complete, here are the methods converting to and from the XML as well:

public static Transformer createTransformer() 
  throws TransformerFactoryConfigurationError, 
    TransformerConfigurationException {

  TransformerFactory factory = TransformerFactory.newInstance();
  Transformer transformer = factory.newTransformer();
  transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
  transformer.setOutputProperty(OutputKeys.INDENT, "no");
  return transformer;
}
public static ArrayList<Node> parseNodes(String uri, String expression)
  throws ParserConfigurationException, SAXException, 
    IOException, XPathExpressionException {

  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
  factory.setNamespaceAware(false);
  DocumentBuilder builder = factory.newDocumentBuilder();
  Document doc = builder.parse(uri);
  XPathFactory xPathfactory = XPathFactory.newInstance();
  XPath xpath = xPathfactory.newXPath();
  XPathExpression expr = xpath.compile(expression);
  NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
  ArrayList<Node> nodes = new ArrayList<>();
  for(int i = 0; i < nl.getLength(); i++) {
    nodes.add(nl.item(i));
  }
  return nodes;
}
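
The parseNodes method above relies on a namespace-unaware parser, so XPath expressions can use plain element names. The sketch below isolates that one idea on an in-memory document so the difference is easy to try out. The class name, the sample XML, and the countMatches helper are hypothetical and not part of the answer's code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceUnawareXPath {
    // Hypothetical sample input; only the namespace URI is from the question.
    static final String XML =
        "<document xmlns='http://www.lotus.com/dxl'>"
      + "<item>a</item><item>b</item>"
      + "</document>";

    // Parses XML with or without namespace awareness and counts the nodes
    // selected by the given XPath expression.
    static int countMatches(String expression, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Namespace-unaware: xmlns is treated as an ordinary attribute.
        System.out.println(countMatches("/document/item", false));
        // Namespace-aware: unprefixed names only match no-namespace elements.
        System.out.println(countMatches("/document/item", true));
    }
}
```

With the JDK's default XPath implementation, the namespace-unaware parse should select both item elements, while the namespace-aware parse selects none. This only helps when you query the DOM yourself; an XSLT stylesheet fed from a namespace-aware source still needs the namespace handled in its patterns.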


Source: https://stackoverflow.com/questions/30918197/java-transformer-how-to-ignore-namespaces
