BeautifulSoup innerhtml?


猫巷女王i 2020-11-27 13:28

Let's say I have a page with a div. I can easily get that div with soup.find().

Now that I have the result, I'd like to print the WHOLE <

6 answers

攒了一身酷 (OP)

2020-11-27 14:20
For just the text, use Beautiful Soup 4's `get_text()`.

If you only want the human-readable text inside a document or tag, you can use the get_text() method. It returns all the text in a document or beneath a tag, as a single Unicode string:
```
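from bs4 import BeautifulSoup

# Sample markup: a link that contains an italic element.
markup = '<a href="http://example.com/">\nI linked to <i>example.com</i>\n</a>'
soup = BeautifulSoup(markup, 'html.parser')

soup.get_text()
# '\nI linked to example.com\n'
soup.i.get_text()
# 'example.com'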
```
You can specify a string to be used to join the bits of text together:
```
soup.get_text("|")
# '\nI linked to |example.com|\n'
```
You can tell Beautiful Soup to strip whitespace from the beginning and end of each bit of text:
```
soup.get_text("|", strip=True)
# 'I linked to|example.com'
```
But at that point you might want to use the .stripped_strings generator instead, and process the text yourself:
```
[text for text in soup.stripped_strings]
# ['I linked to', 'example.com'] 
```
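As a minimal sketch of what "process the text yourself" could look like, the stripped pieces can be joined with whatever separator suits the output. The sample markup here (a link containing an `<i>` element) is an assumption chosen for illustration, and `joined` is a hypothetical variable name.

```python
from bs4 import BeautifulSoup

# Assumed sample markup: a link containing an italic element.
markup = '<a href="http://example.com/">\nI linked to <i>example.com</i>\n</a>'
soup = BeautifulSoup(markup, 'html.parser')

# stripped_strings yields each text node with surrounding whitespace
# removed and skips nodes that are empty after stripping.
joined = " ".join(soup.stripped_strings)
print(joined)  # I linked to example.com
```

Joining by hand like this gives single spaces between pieces, which `get_text()` with a separator and `strip=True` would also give; the generator is mainly useful when each piece needs filtering or transformation first.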
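As of Beautiful Soup version 4.9.0, when lxml or html.parser are in use, the contents of `<script>`, `<style>`, and `<template>` tags are not considered to be text, since those tags are not part of the human-visible content of the page.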
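A small sketch of the 4.9.0 behaviour, assuming Beautiful Soup 4.9.0 or later and the `html.parser` backend: the contents of embedded `<script>` and `<style>` tags are left out of `get_text()`. The HTML snippet and the `text` variable are made up for illustration.

```python
from bs4 import BeautifulSoup

# Assumed sample document mixing visible text with script and style content.
html = (
    '<div><p>Hello</p>'
    '<script>var x = 1;</script>'
    '<style>p { color: red; }</style></div>'
)
soup = BeautifulSoup(html, 'html.parser')

# On 4.9.0+ with html.parser, only the human-visible text is returned.
text = soup.get_text()
print(text)
```

On older versions the script and stylesheet source would likely appear in the result as well, so it is worth checking `bs4.__version__` if the output looks unexpected.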
