Simple HTML Dom: How to remove elements?

Submitted by 不问归期 on 2019-11-26 05:35:06

Question


I would like to use Simple HTML DOM to remove all images from an article so I can easily create a short text snippet for a news ticker, but I haven't figured out how to remove elements with it.

Basically, I would:

  1. Get content as HTML string
  2. Remove all image tags from content
  3. Limit content to x words
  4. Output.

Any help?


Answer 1:


There is no dedicated method for removing elements. Just find all the img elements and set each one's outertext to an empty string:

$e->outertext = '';



Answer 2:


When you only clear the outer text, you delete the HTML content itself, but if you run another find for the same elements, they will still appear in the results. The reason is that the Simple HTML DOM object still holds its internal structure for the element, only without its actual content. To really delete the element, simply reload the HTML as a string into the same variable. That way the object is recreated without the deleted content, and the Simple HTML DOM tree is built without it.

Here is an example function:

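public function removeNode($selector)
{
    // Blank out every element matching the selector.
    foreach ($this->find($selector) as $node)
    {
        $node->outertext = '';
    }

    // Re-parse the saved string so the internal tree no longer contains them.
    $this->load($this->save());
}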

Put this function inside the simple_html_dom class and you're good.




Answer 3:


I think you are having difficulties because you forgot to save (that is, to dump the internal DOM tree back into a string).

Try this:

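// Assumes simple_html_dom.php is on the include path.
require_once 'simple_html_dom.php';

$html = file_get_html("http://example.com");

// Blank out every <img> element.
foreach ($html->find('img') as $item) {
    $item->outertext = '';
}

// save() returns the DOM tree dumped back into a string.
echo $html->save();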
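
Once the images are gone, the remaining step from the question (limiting the content to x words) could be sketched roughly as below. This is only a sketch under assumptions: limitWords is a hypothetical helper, not part of Simple HTML DOM, and it relies on PHP's built-in strip_tags(), which strips every tag rather than just images. The sample string stands in for whatever $html->save() returns.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): reduce an HTML
// string to plain text and keep at most $maxWords words.
function limitWords(string $html, int $maxWords): string
{
    // strip_tags() removes all markup, including any leftover <img> tags.
    $text = trim(strip_tags($html));
    if ($text === '' || $maxWords <= 0) {
        return '';
    }

    // Split on runs of whitespace to get the individual words.
    $words = preg_split('/\s+/', $text);
    if (count($words) <= $maxWords) {
        return implode(' ', $words);
    }

    // Keep the first $maxWords words and mark the cut with an ellipsis.
    return implode(' ', array_slice($words, 0, $maxWords)) . '...';
}

// Sample input standing in for the string returned by $html->save().
$content = '<p>Breaking news: the quick brown fox jumps over the lazy dog.</p>';
echo limitWords($content, 5), PHP_EOL;
```

Counting words on the stripped text rather than on the raw HTML keeps tag names and attributes from being counted as words.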



Answer 4:


I could not figure out where to put the function so I just put the following directly in my code:

$html->load($html->save());

It essentially commits the changes made in the loop back into the HTML, as described above.




Answer 5:


This works for me:

foreach($html->find('element') as $element){
   $element = NULL;
}



Answer 6:


The proposed solutions are quite expensive and practically unusable in a big loop or any other kind of repetition.

I prefer to use "soft deletes":

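foreach ($html->find('somecondition') as $item) {
    // somecheck is a placeholder for your own condition.
    if (somecheck) {
        $item->setAttribute('softDelete', true); // set a marker to check in later code
    }
    $item->outertext = '';
}

// Later, skip anything that was marked as soft-deleted.
foreach ($foo as $bar) {
    if (!$bar->getAttribute('softDelete')) {
        // do something
    }
}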



Answer 7:


Adding a new answer, since removeNode is definitely a better way of removing elements:

$html->removeNode('img');

This method was probably not available when the accepted answer was marked. You do not need to loop over the HTML to find each element; this call removes them all.



Source: https://stackoverflow.com/questions/8227481/simple-html-dom-how-to-remove-elements
