Using regular expressions to extract the first image source from html codes?

深忆病人 2020-12-05 01:07

I would like to know how this can be achieved.

Assume: There's a lot of HTML code containing tables, divs, images, etc.

Problem: How can I get matches

10 Answers

长情又很酷 (OP)

2020-12-05 01:27
I don't know whether you MUST use regex to get your results. If not, you could try SimpleXML and XPath, which would be much more reliable for your goal:

First, import the HTML into a DOMDocument object. If you get errors, turn error reporting off for this part and be sure to turn it back on afterward:
```
 $dom = new DOMDocument();
 $dom->loadHTMLFile("filename.html");
```
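One way to turn errors off and back on around the load step is libxml's internal error handling. The following is a minimal sketch, not necessarily what the answer's author had in mind; the inline `$html` string is a made-up sample standing in for the contents of `filename.html`.

```php
<?php
// Hypothetical sample markup; real pages often contain tags libxml complains about.
$html = '<div><img src="a.png"><nav><img src="b.jpg"></nav></div>';

// Collect parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);

// Discard the collected errors and restore the earlier setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);
```

Saving and restoring the previous setting keeps the change local to the load step, so error reporting elsewhere in the script is unaffected.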
Next, import the DOM into a SimpleXML object, like so:
```
 $xml = simplexml_import_dom($dom);
```
Now you can use a few methods to get all of your image elements (and their attributes) into an array. XPath is the one I prefer, because I've had better luck traversing the DOM with it:
```
 $images = $xml->xpath('//img/@src');
```
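XPath is only one of the available methods. As a rough sketch of another, the DOM API alone can collect the same values with `getElementsByTagName`; the sample markup and the `$sources` variable name are illustrative assumptions.

```php
<?php
// Hypothetical sample markup with one <img> that has no src attribute.
$dom = new DOMDocument();
$dom->loadHTML('<div><img src="a.png"><img alt="no source"><img src="b.jpg"></div>');

// Walk every <img> element and keep only those that carry a src attribute.
$sources = [];
foreach ($dom->getElementsByTagName('img') as $img) {
    if ($img->hasAttribute('src')) {
        $sources[] = $img->getAttribute('src');
    }
}
```

This variant skips the SimpleXML import, at the cost of filtering attributes by hand instead of letting the XPath expression do it.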
This variable can now be treated like an array of your image URLs:
```
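 foreach ($images as $image) {
     // Each match is a SimpleXMLElement for a src attribute; cast it to get the URL.
     echo '<img src="' . htmlspecialchars((string) $image) . '" /><br />';
 }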
```
Presto, all of your images, none of the fat.

Here's the non-annotated version of the above:
```
 $dom = new DOMDocument();
 $dom->loadHTMLFile("filename.html");

 $xml = simplexml_import_dom($dom);

 $images = $xml->xpath('//img/@src');

 foreach ($images as $image) {
     echo '<img src="' . htmlspecialchars((string) $image) . '" /><br />';
 }
```
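Since the question asks for only the first image source, the same approach can be narrowed with a positional XPath predicate. This is a sketch under assumptions: `firstImageSrc` is a hypothetical helper name, and it takes an HTML string rather than a file.

```php
<?php
// Hypothetical helper: returns the first <img> src in document order, or null.
function firstImageSrc(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects empty input
    }

    // Silence parse warnings for this step only, then restore the setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded || $dom->documentElement === null) {
        return null;
    }

    $xml = simplexml_import_dom($dom);
    if ($xml === null) {
        return null;
    }

    // The parentheses make [1] apply to the whole result set, not per parent.
    $matches = $xml->xpath('(//img/@src)[1]');

    return empty($matches) ? null : (string) $matches[0];
}
```

For example, `firstImageSrc('<div><img src="a.png"><img src="b.jpg"></div>')` returns `a.png`, and markup without any image yields `null`.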