How do you find the “main” picture of a website, given the URL?

眼角桃花 2021-02-05 13:59

Let's say you're given http://nytimes.com. How would you pull out the "main" image?

The reason I'm asking is that Flipboard is able to grab the main image from a

4 answers
  •  难免孤独
    2021-02-05 14:24

    There really isn't anything that is considered the "main" image in a web page: nothing in HTML or elsewhere distinguishes it. Not to mention that you'd probably also have to read the images referenced from CSS (background images and so on). But if I had to do this, here is what I would do:

    1. First I would decide on a suitable image size, let's say a 400x400 minimum. (I don't want to pick just any old image; something really small would likely scale horribly.)

    2. I would then iterate through each image on the page.

    3. For each image I encountered I would check its size. If it was 400x400 (my predefined size) or larger, I would use this image. If it wasn't, I would check whether it's the largest image I've found so far and, if so, keep its information stored off to the side.

    4. Once I had checked a predefined number of images (for argument's sake let's say 10, though you'd probably go much higher), I'd use the largest image I'd found (stored off to the side), because I wouldn't want to scan the page indefinitely looking for images!
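
The four steps above could be sketched roughly as follows. This is only an illustration under some loud assumptions: it trusts the `width` and `height` attributes on `<img>` tags instead of downloading each image to measure it, it ignores CSS background images entirely, and it compares "largest" by pixel area. The names `ImgCollector`, `pick_main_image`, `MIN_WIDTH`, `MIN_HEIGHT` and `MAX_IMAGES` are made up for this example.

```python
from html.parser import HTMLParser

MIN_WIDTH, MIN_HEIGHT = 400, 400  # step 1: minimum acceptable size
MAX_IMAGES = 10                   # step 4: stop scanning after this many


class ImgCollector(HTMLParser):
    """Collect (src, width, height) for every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        try:
            # Assumption: the size is declared in the markup as plain integers.
            width, height = int(a.get("width", 0)), int(a.get("height", 0))
        except (TypeError, ValueError):
            width = height = 0  # missing or non-numeric size, e.g. "100%"
        if a.get("src"):
            self.images.append((a["src"], width, height))


def pick_main_image(images, min_size=(MIN_WIDTH, MIN_HEIGHT), max_images=MAX_IMAGES):
    """Return the src of the first big-enough image, else the largest one seen."""
    best = None  # largest image so far, "stored off to the side"
    for checked, (src, width, height) in enumerate(images, start=1):  # step 2
        # Step 3: take the first image that meets the minimum size.
        if width >= min_size[0] and height >= min_size[1]:
            return src
        if best is None or width * height > best[1] * best[2]:
            best = (src, width, height)
        # Step 4: give up after a fixed number of images.
        if checked >= max_images:
            break
    return best[0] if best else None


page = (
    '<img src="logo.png" width="50" height="50">'
    '<img src="hero.jpg" width="800" height="600">'
)
collector = ImgCollector()
collector.feed(page)
print(pick_main_image(collector.images))  # hero.jpg
```

In a real crawler you would first fetch the page, resolve relative `src` values against the page URL, and most likely measure the actual image files, since many pages omit the size attributes.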
