Question
I need to add an ATTLIST declaration to the DOCTYPE tag in HTML documents.
After reading the documentation and googling, this is what I've come up with:
from bs4 import BeautifulSoup, Doctype
# minimal html document
doc = """<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html/>"""
soup = BeautifulSoup(doc, features='html.parser')
# the modified doctype tag
doctype = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
[<!ATTLIST span bodyref CDATA #IMPLIED>] >"""
dt = BeautifulSoup(doctype, features='html.parser')
for item in soup.contents:
    if isinstance(item, Doctype):
        item.replace_with(dt)
        break
print(soup.prettify(formatter=None))
This produces the desired result, but it feels a bit "hacky". I'd like to just insert the ATTLIST part into the tag, and not replace the whole thing, as I've done here. Does anyone know how to do that?
Answer 1:
A small improvement would be to build a Doctype object and replace the existing one with that, for example:
from bs4 import BeautifulSoup, Doctype
# minimal html document
doc = """<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html/>"""
# the modified doctype tag
doctype = """html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
[<!ATTLIST span bodyref CDATA #IMPLIED>]"""
soup = BeautifulSoup(doc, features='html.parser')
for item in soup.contents:
    if isinstance(item, Doctype):
        item.replace_with(Doctype(doctype))
        break
print(soup.prettify(formatter=None))
Giving:
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
[<!ATTLIST span bodyref CDATA #IMPLIED>]>
<html>
</html>
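To get closer to "just inserting" the ATTLIST part, the new Doctype can be derived from the existing one rather than typed out again. Doctype is a string subclass, and Python strings are immutable, so a replacement is still needed, but the original identifier text does not have to be repeated. The following is a minimal sketch under that assumption; the helper name `add_internal_subset` is made up for illustration and is not part of bs4.

```python
from bs4 import BeautifulSoup, Doctype

# minimal html document
doc = """<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html/>"""


def add_internal_subset(soup, subset):
    """Hypothetical helper: swap the first Doctype in `soup` for a copy
    that has `subset` appended in square brackets.

    Returns the new Doctype, or None if the document has no Doctype.
    """
    for item in soup.contents:
        if isinstance(item, Doctype):
            # The Doctype holds the text between "<!DOCTYPE " and ">",
            # so reuse it and append the internal subset.
            new_text = f"{str(item).rstrip()}\n[{subset}]"
            new_doctype = Doctype(new_text)
            item.replace_with(new_doctype)
            return new_doctype
    return None


soup = BeautifulSoup(doc, features='html.parser')
new_doctype = add_internal_subset(soup, "<!ATTLIST span bodyref CDATA #IMPLIED>")
print(soup.prettify(formatter=None))
```

The `rstrip()` call only drops the whitespace that preceded the closing `>` in the original declaration; the rest of the existing doctype text is carried over unchanged.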
Source: https://stackoverflow.com/questions/55910216/editing-doctype-tag-with-beautifulsoup