DOM manipulation in PHP

Happy的楠姐 2020-12-16 03:16

I am looking for good methods of manipulating HTML in PHP. For example, the problem I currently have is dealing with malformed HTML.

I am getting input that looks so

4 answers
  • 2020-12-16 03:38

    PHP has a PECL extension that gives you access to the features of HTML Tidy. Tidy is a pretty powerful library that should be able to take code like that and close tags in an intelligent manner.

    I use it to clean up malformed XML and HTML sent to me by a classified ad system prior to import.
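
    A minimal sketch of that approach, assuming the tidy extension is installed and enabled. The sample input and the configuration values are illustrative choices, not something the original poster supplied:

```php
<?php
// Assumption: the tidy extension is installed and enabled (extension=tidy).
// Illustrative input: neither <div> nor <b> is ever closed.
$html = '<div>This is some <b>text';

$config = [
    'show-body-only' => true,   // emit only the repaired fragment, not a full document
    'indent'         => false,  // keep the output on as few lines as possible
];

$tidy = new tidy();
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();           // let Tidy close the dangling tags

$clean = tidy_get_output($tidy);
echo $clean, PHP_EOL;
```

    Without 'show-body-only', Tidy wraps the result in a complete document with a doctype, head, and body, which is usually not wanted when cleaning fragments before import.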

  • 2020-12-16 03:38

    I've found PHP Simple HTML DOM to be the most useful and straightforward library so far. I would say it is easier to work with than the PECL Tidy extension.

    I've written an article on how to use it to scrape MySpace artist tour dates (just an example). Here's a link to the PHP Simple HTML DOM parser.

  • 2020-12-16 03:58

    The DOM extension, which is now built into PHP, can solve this problem easily. The loadHTML method will accept malformed HTML, while the load method expects well-formed XML and will not.

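    $d = new DOMDocument;
    // loadHTML() tolerates malformed markup and closes the open tags
    $d->loadHTML('<div>This is some <b>text');
    // saveHTML() only returns the string, so echo it to see the result
    echo $d->saveHTML();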
    

    The output will be (indented here for readability):

    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html>
      <body>
        <div>This is some <b>text</b></div>
      </body>
    </html>
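
    If only the repaired fragment is wanted rather than the whole generated document, one option is to pass a node to saveHTML. The following is a sketch under stated assumptions: the 'highlight' class name is hypothetical and is there only to show a manipulation step after parsing.

```php
<?php
// Collect libxml parse warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$d = new DOMDocument;
$d->loadHTML('<div>This is some <b>text');
libxml_clear_errors();

// Manipulate the repaired tree: tag the <b> element with a
// hypothetical class name chosen for this example.
$b = $d->getElementsByTagName('b')->item(0);
$b->setAttribute('class', 'highlight');

// Passing a node to saveHTML() serializes just that subtree,
// so the doctype and the <html>/<body> wrappers are left out.
$div = $d->getElementsByTagName('div')->item(0);
$fragment = $d->saveHTML($div);
echo $fragment, PHP_EOL;
```

    Serializing a single node avoids having to strip the doctype and wrapper elements from the string afterwards.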
    
  • 2020-12-16 03:58

    For manipulating the DOM, I think that what you're looking for is this. I've used it to parse HTML documents from the web and it worked fine for me.
