Replacing elements with lxml.html

Submitted by 魔方 西西 on 2019-12-23 08:34:24

Question


I'm fairly new to lxml and HTML parsers as a whole. I was wondering if there is a way to replace an element within a tree with another element.

For example I have:

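import lxml.html
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import guess_lexer

body = """<code> def function(arg): print arg </code> Blah blah blah <code> int main() { return 0; } </code> """

doc = lxml.html.fromstring(body)
codeblocks = doc.cssselect('code')

for block in codeblocks:
    lexer = guess_lexer(block.text_content())
    hilited = highlight(block.text_content(), lexer, HtmlFormatter())
    doc.replace(block, hilited)  # raises TypeError: hilited is a str, not an element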

I want to do something along those lines, but this results in a "TypeError" because "hilited" isn't an lxml.etree._Element.

Is this feasible?

Regards,


Answer 1:


Regarding lxml:

In doc.replace(block, hilited)

block is an lxml Element object, but hilited is a string, so replace() cannot substitute one for the other.

There are two ways to work around that:

block.text = hilited

or

body = body.replace(block.text, hilited)
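
Both options above work on strings: block.text = hilited stores the markup as escaped text inside the existing <code> element, and body.replace(...) edits the original source string rather than the parsed tree. To swap the element itself, one approach is to parse the highlighted string into an element first and then call replace() on the parent of block. What follows is a minimal sketch, not a definitive implementation. The function name highlight_code_blocks is hypothetical, and iter("code") is used in place of cssselect only to avoid the extra cssselect dependency.

```python
import lxml.html
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import guess_lexer


def highlight_code_blocks(html_source):
    """Return html_source with each <code> element replaced by Pygments markup.

    Hypothetical helper written for illustration only.
    """
    # create_parent="div" wraps the fragment so every <code> has a parent.
    root = lxml.html.fragment_fromstring(html_source, create_parent="div")

    # Build the list up front, because the tree is modified inside the loop.
    for block in list(root.iter("code")):
        source = block.text_content()
        markup = highlight(source, guess_lexer(source), HtmlFormatter())

        # replace() needs an element, so parse the highlighted string first.
        new_element = lxml.html.fragment_fromstring(markup)

        # Text following </code> lives on block.tail; carry it over,
        # otherwise it leaves the tree together with the old element.
        new_element.tail = block.tail
        block.getparent().replace(block, new_element)

    return lxml.html.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<code>def function(arg): print arg</code> Blah blah blah"
    print(highlight_code_blocks(sample))
```

Copying block.tail matters here because lxml stores the text after a closing tag on the element itself, so "Blah blah blah" would otherwise be lost with the old <code> element.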



Answer 2:


If you're new to Python HTML parsers, you might try out BeautifulSoup, an HTML/XML parser that lets you modify the parse tree easily.
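
For comparison, here is a rough sketch of what replacing elements could look like with BeautifulSoup (the bs4 package). It is only an illustration under stated assumptions: the function name wrap_code_blocks is hypothetical, and a plain <pre class="highlight"> tag stands in for real highlighter output so the example stays focused on the tree manipulation.

```python
from bs4 import BeautifulSoup


def wrap_code_blocks(html_source):
    """Replace each <code> element with a <pre class="highlight"> element.

    Hypothetical helper; the <pre> tag is a stand-in for highlighter output.
    """
    soup = BeautifulSoup(html_source, "html.parser")

    # find_all returns a list-like snapshot, so replacing inside the loop is safe.
    for block in soup.find_all("code"):
        replacement = soup.new_tag("pre")
        replacement["class"] = "highlight"
        replacement.string = block.get_text()

        # replace_with swaps the element in place; surrounding text is untouched.
        block.replace_with(replacement)

    return str(soup)


if __name__ == "__main__":
    print(wrap_code_blocks("<code>print arg</code> Blah blah blah"))
```

Unlike lxml, BeautifulSoup keeps the text between elements as separate nodes, so there is no tail to copy when an element is swapped out.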



Source: https://stackoverflow.com/questions/1812764/replacing-elements-with-lxml-html
