Get the inner HTML of an element in lxml

粉色の甜心 2020-12-13 19:10

I am trying to get the HTML content of a child node with lxml and XPath in Python. As shown in the code below, I want to find the HTML content of each of the product nodes. Does i

6 answers
  •  甜味超标
    2020-12-13 19:28

    Simple function to get innerHTML or innerXML.
    Try it out directly: https://pyfiddle.io/fiddle/631aa049-2785-4c58-bf82-eff4e2f8bedb/

    function

    
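    from lxml import etree


    def innerXML(elem):
        # name(/*) is an absolute XPath: it returns the name of the document's
        # root element, so elem is expected to be that root element.
        elemName = elem.xpath('name(/*)')
        resultStr = ''
        for e in elem.xpath('/' + elemName + '/node()'):
            if isinstance(e, str):
                # Text nodes are skipped: the text that follows a child is
                # already emitted by tostring() below (with_tail defaults to
                # True). Text before the first child is not included.
                continue
            resultStr += etree.tostring(e, encoding='unicode')
        return resultStr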
    
    

    invocation

    XMLElem = etree.fromstring("
    I amJhon Corner.I work as software engineer.
    ")
    print(innerXML(XMLElem))

    Logic Behind

    • Get the name of the outermost element first.
    • Then get all of its child nodes.
    • Convert each child node to a string using tostring.
    • Concatenate them.
