Replace text with HTML tag in LXML text element

本小妞迷上赌 submitted on 2019-12-22 07:00:11

Question


I have some lxml element:

>>> lxml_element.text
  'hello BREAK world'

I need to replace the word BREAK with an HTML break tag, <br />. I tried a simple text replacement:

lxml_element.text.replace('BREAK', '<br />')

but it inserts the tag with escaped symbols, like &lt;br/&gt;. How do I solve this problem?


Answer 1:


Here's how you could do it. Setting up a sample lxml from your question:

>>> import lxml.etree
>>> some_data = "<b>hello BREAK world</b>"
>>> root = lxml.etree.fromstring(some_data)
>>> root
<Element b at 0x3f35a50>
>>> root.text
'hello BREAK world'

Next, create a subelement tag <br>:

>>> childbr = lxml.etree.SubElement(root, "br")
>>> childbr
<Element br at 0x3f35b40>
>>> lxml.etree.tostring(root)
'<b>hello BREAK world<br/></b>'

But that's not all you want. You have to take the text before the <br> and place it in .text:

>>> root.text = "hello"
>>> lxml.etree.tostring(root)
'<b>hello<br/></b>'

Then set the .tail of the child to contain the rest of the text:

>>> childbr.tail = "world"
>>> lxml.etree.tostring(root)
'<b>hello<br/>world</b>'



Answer 2:


I don't think you want to change only the text node of the element. Instead, do three things: set the element's text node to the part of the string before the marker, add a SubElement named br to your lxml_element, and set the tail attribute of that subelement to the second part of the string you are parsing. I found the tutorial at http://lxml.de/tutorial.html#the-element-class very useful.
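
A minimal sketch of those three steps, assuming the marker appears at most once and the element has no other children (SubElement appends the new tag as the last child). The sample markup and variable names are illustrative only, and the fallback import assumes the standard library's ElementTree is acceptable when lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib ElementTree shares the .text/.tail model
    import xml.etree.ElementTree as etree

element = etree.fromstring("<p>hello BREAK world</p>")

# Split the text once around the marker.
before, marker, after = element.text.partition("BREAK")
if marker:  # only touch the tree when the marker was actually found
    element.text = before                 # text node before the new tag
    br = etree.SubElement(element, "br")  # new <br> appended as the last child
    br.tail = after                       # text that follows the new tag

serialized = etree.tostring(element, encoding="unicode")
print(serialized)
```

Because the tag is created as a real element rather than written into the string, the serializer has nothing to escape.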



Source: https://stackoverflow.com/questions/7236798/replace-text-with-html-tag-in-lxml-text-element
