Identifying two identical images using Java

别那么骄傲 2020-12-28 21:10

I have a problem in my web crawler, where I am trying to retrieve images from a particular website. The problem is that I often see images that are exactly the same but different in

10 answers
  •  没有蜡笔的小新
    2020-12-28 21:37

    Depending on how thorough you want to be:

    • Download the image.
    • As you download it, generate a hash of its bytes.
    • Create a directory whose name is the hash value (if the directory does not already exist) and keep the image there.
    • If the directory contains 2 or more files, compare the file sizes.
    • If the file sizes are the same, do a byte-by-byte comparison of the new image against the other images in that directory.
    • If no other file has the same bytes, you have a new image.
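
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the class `ImageStore`, its method `storeIfNew`, and the `image-N` file naming are hypothetical, SHA-256 is picked as the hash (the answer does not name one), and the image bytes are assumed to be fully in memory already rather than hashed while streaming. It also checks the directory before writing instead of after, which should give the same result.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical helper: keeps each distinct image once, in a directory named after its hash. */
final class ImageStore {
    private final Path root;

    ImageStore(Path root) {
        this.root = root;
    }

    /** Returns true if the bytes were new and got stored, false if an identical file exists. */
    boolean storeIfNew(byte[] imageBytes) throws IOException {
        // One directory per hash value; created only if it does not exist yet.
        Path bucket = root.resolve(sha256Hex(imageBytes));
        Files.createDirectories(bucket);

        int existingCount = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(bucket)) {
            for (Path existing : files) {
                existingCount++;
                // Cheap size check first, then the full byte-by-byte comparison.
                if (Files.size(existing) == imageBytes.length
                        && Arrays.equals(Files.readAllBytes(existing), imageBytes)) {
                    return false;
                }
            }
        }
        // No file in the bucket has the same bytes, so this is a new image.
        Files.write(bucket.resolve("image-" + existingCount), imageBytes);
        return true;
    }

    /** Hex-encoded SHA-256 digest of the data (the hash choice is an assumption). */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

A crawler could call `new ImageStore(Paths.get("images")).storeIfNew(downloadedBytes)` for each download and skip the image whenever the call returns `false`.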

    Regardless of whether you do all of that, you need to:

    • Download the images.
    • Do a byte-by-byte comparison of the images.
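
For the bare minimum described here, a byte-by-byte comparison of two downloaded files might look like the sketch below. The class name `ByteCompare` and the method `sameContent` are made up for illustration, and it assumes both images have already been saved to disk.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for comparing two downloaded files byte by byte. */
final class ByteCompare {
    /** Returns true only if both files have the same length and identical bytes. */
    static boolean sameContent(Path first, Path second) throws IOException {
        // Files of different sizes cannot be identical, so skip reading them.
        if (Files.size(first) != Files.size(second)) {
            return false;
        }
        try (InputStream a = new BufferedInputStream(Files.newInputStream(first));
             InputStream b = new BufferedInputStream(Files.newInputStream(second))) {
            int byteFromA;
            while ((byteFromA = a.read()) != -1) {
                if (byteFromA != b.read()) {
                    return false; // first differing byte found
                }
            }
            // Both streams must be exhausted at the same point.
            return b.read() == -1;
        }
    }
}
```

Reading through buffered streams keeps memory use small even for large images, at the cost of a little more code than loading both files into arrays.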

    To detect exact duplicates there is no need to rely on any special imaging libraries: images are just bytes.
