I'm using PHP Simple HTML DOM Parser to get text from a webpage. The page I need to manipulate looks something like this:
<html>
<head>
<title>title</title>
</head>
<body>
<div id="content">
<h1>HELLO</h1>
Hello, world!
</div>
</body>
</html>
I need to get the h1 element and the text that has no tags.
To get the h1, I use this code:
$html = file_get_html("remote_page.html");
foreach($html->find('#content') as $text){
echo "H1: ".$text->find('h1', 0)->plaintext;
}
But how do I get the other text? I also tried this inside the foreach, but I get the full text:
$text->plaintext;
but it also returned the text of the h1 tag...
It looks like $text->find('text', 2); gets what you're looking for; however, I'm not sure how well that will work when the number of text nodes is unknown. I'll keep looking.
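If the number of text nodes is not known in advance, one option is to select only the text nodes that are direct children of the #content element. The sketch below is a minimal illustration that swaps in PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM, so that it runs without any external include. The inlined $markup string stands in for the remote page and is an assumption.

```php
<?php
// Sample markup mirroring the page in the question (assumed, inlined so the sketch runs offline).
$markup = '<html><head><title>title</title></head><body>'
        . '<div id="content"><h1>HELLO</h1> Hello, world! </div>'
        . '</body></html>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them; real pages are rarely valid HTML.
libxml_use_internal_errors(true);
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// text() matches only text nodes that are direct children of #content,
// so the text nested inside <h1> is not picked up.
$parts = [];
foreach ($xpath->query('//div[@id="content"]/text()') as $node) {
    $piece = trim($node->nodeValue);
    if ($piece !== '') {
        $parts[] = $piece;
    }
}
$bareText = implode(' ', $parts);

// The heading is fetched separately, as in the question.
$heading = $xpath->query('//div[@id="content"]/h1')->item(0);
$h1Text = $heading !== null ? trim($heading->textContent) : '';

echo "H1: " . $h1Text . "\n";
echo "Text: " . $bareText . "\n";
```

Because the query collects every direct text child, it does not depend on a fixed index such as 2.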
You can simply strip the HTML tags using strip_tags:
<?php
// Removes every tag except <br>; $input is the HTML string to clean.
echo strip_tags($input, '<br>');
?>
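To illustrate what the second argument does, here is a small sketch. The sample string is made up for the example. Tags listed in the second argument are kept, and every other tag is removed.

```php
<?php
// Hypothetical input, chosen only to show the effect of the allowed-tags argument.
$input = '<p>Line one<br>Line two</p>';

// <p> is removed, <br> is kept because it is listed as allowed.
$result = strip_tags($input, '<br>');

echo $result . "\n";
```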
Use strip_tags, as @Peachy pointed out. However, passing <br> as the second argument means strip_tags will keep <br> tags, which is unnecessary. In your case,
<?php
echo strip_tags($text);
?>
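would remove the tags, given that you are only selecting content inside the element with the id content. Note that strip_tags removes only the tags themselves, so the text of the h1 is still part of the result.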
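Here is a quick check of what strip_tags returns for markup like the one in the question. The inner HTML of #content is inlined as a plain string (an assumption, so the snippet runs on its own). The preg_replace step is one possible workaround for dropping the h1 first; it is not taken from the answers above, and a regex like this is fragile on more complex HTML.

```php
<?php
// Inner HTML of the #content div from the question, inlined as an assumption.
$inner = "<h1>HELLO</h1>\nHello, world!\n";

// strip_tags removes the <h1> tags but keeps the text between them.
$stripped = trim(strip_tags($inner));

// Possible workaround: delete the whole h1 element before stripping the remaining tags.
$withoutH1 = preg_replace('#<h1\b[^>]*>.*?</h1>#is', '', $inner);
$bare = trim(strip_tags($withoutH1));

echo $stripped . "\n";
echo $bare . "\n";
```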
Source: https://stackoverflow.com/questions/9854154/get-text-with-php-simple-html-dom-parser