Remove whitespaces in XML string

我在风中等你 2020-11-29 04:53

How can I remove whitespace and line breaks from an XML string in Python 2.6? I tried the following packages:

etree: This snippet keeps the original whitespace:

8 answers
  •  挽巷 (original poster)
    2020-11-29 05:07

    If the goal is to remove whitespace-only text nodes from "non-leaf" nodes (elements that have at least one child element), the following function will do it (recursively if specified):

    from xml.dom import Node
    
    def stripNode(node, recurse=False):
        nodesToRemove = []
        nodeToBeStripped = False
    
        for childNode in node.childNodes:
            # list empty text nodes (to remove if any should be)
            if (childNode.nodeType == Node.TEXT_NODE and childNode.nodeValue.strip() == ""):
                nodesToRemove.append(childNode)
    
            # only remove empty text nodes if not a leaf node (i.e. a child element exists)
            if childNode.nodeType == Node.ELEMENT_NODE:
                nodeToBeStripped = True
    
        # remove flagged text nodes
        if nodeToBeStripped:
            for childNode in nodesToRemove:
                node.removeChild(childNode)
    
        # recurse if specified
        if recurse:
            for childNode in node.childNodes:
                stripNode(childNode, True)
    

    However, Thanatos is correct: whitespace can be meaningful data in XML, so use this with caution.
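
As a rough usage sketch, here is how a function like the one above might be applied to a document parsed with `xml.dom.minidom`. The name `strip_node` is a hypothetical, compact restatement of `stripNode` so that the example runs on its own, and the sample XML is made up for illustration. It also shows why the caution matters: whitespace inside leaf elements is left untouched.

```python
from xml.dom import Node
from xml.dom import minidom


def strip_node(node, recurse=False):
    # Hypothetical compact restatement of stripNode from the answer above.
    children = list(node.childNodes)
    # Only strip when the node is not a leaf, i.e. it has a child element.
    has_element_child = any(c.nodeType == Node.ELEMENT_NODE for c in children)
    if has_element_child:
        for child in children:
            if child.nodeType == Node.TEXT_NODE and child.nodeValue.strip() == "":
                node.removeChild(child)
    if recurse:
        for child in node.childNodes:
            strip_node(child, True)


# Made-up sample input: indentation between elements, padding inside leaves.
xml_text = """<root>
  <item>  padded text  </item>
  <blank>   </blank>
</root>"""

doc = minidom.parseString(xml_text)
strip_node(doc.documentElement, recurse=True)
result = doc.documentElement.toxml()
print(result)
# <root><item>  padded text  </item><blank>   </blank></root>
```

The indentation between `<root>`, `<item>` and `<blank>` is gone, while the text inside the leaf elements, including the whitespace-only content of `<blank>`, is preserved.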
