DOMDocument for parsing HTML (instead of regex)

无人共我 2020-12-12 01:58

I am trying to learn how to use DOMDocument to parse HTML.

I am just doing some simple work. I already liked gordon's answer on scraping data using regex and simpl

2 answers

执念已碎 (original poster)

2020-12-12 02:51
Don't bother with the raw DOMDocument interface. Instead, use one of the jQuery-style libraries for extraction; see "How to parse HTML with PHP?".

QueryPath seems to work fine if you use more specific selectors:
```
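include "qp.phar";  // QueryPath, loaded from its phar archive
// Fetch the page and parse it with QueryPath's lenient HTML loader
$qp = htmlqp("http://www.nu.nl/internet/1106541/taalunie-keurt-open-sourcewoordenlijst-goed.html");

// Print the headline text
print $qp->find(".header h1")->text();
// Go back to the document root, then print the article body as XHTML
print $qp->top()->find(".article .content")->xhtml();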
```
However, you might need to strip the intermingled JavaScript first (->find("script")->remove()).
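
For comparison, the same extraction can be sketched with PHP's built-in DOMDocument and DOMXPath, including the script-stripping step. This is a minimal sketch under stated assumptions, not the answer's method: the inline `$html` sample and the `hasClass()` helper are hypothetical, and the XPath expressions are only rough equivalents of the `.header h1` and `.article .content` selectors.

```php
<?php
// Hypothetical inline markup standing in for the fetched page.
$html = <<<HTML
<html><body>
<div class="header"><h1>Example title</h1></div>
<div class="article"><div class="content"><p>Body text.</p><script>var x = 1;</script></div></div>
</body></html>
HTML;

// Real-world HTML is rarely valid, so collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Strip <script> elements first; iterate over a copy of the result list while removing nodes.
foreach (iterator_to_array($xpath->query('//script')) as $script) {
    $script->parentNode->removeChild($script);
}

// Hypothetical helper: builds an XPath predicate that matches one CSS class name.
function hasClass(string $name): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' " . $name . " ')";
}

// Rough equivalents of ".header h1" and ".article .content".
$title   = $xpath->query('//*[' . hasClass('header') . ']//h1')->item(0);
$content = $xpath->query('//*[' . hasClass('article') . ']//*[' . hasClass('content') . ']')->item(0);

$titleText   = $title !== null ? trim($title->textContent) : '';
$contentHtml = $content !== null ? $doc->saveHTML($content) : '';

print $titleText . "\n";
print $contentHtml . "\n";
```

The class predicate is verbose because XPath 1.0 has no class selector, which is part of why the jQuery-style wrappers are more comfortable for this kind of work.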