Java program to download images from a website and display the file sizes

北城以北 posted on 2021-02-09 11:12:15


I'm creating a Java program that will read an HTML document from a URL and display the sizes of the images it references. I'm not sure how to go about achieving this, though.

I don't need to actually download and save the images; I just need their sizes and the order in which they appear on the webpage.

For example, a webpage has 3 images:

<img src="dog.jpg" /> //which is 54kb
<img src="cat.jpg" /> //which is 75kb
<img src="horse.jpg"/> //which is 80kb

I would need the output of my Java program to display those sizes in the same order.


Any ideas where I should start?

P.S. I'm a bit of a Java newbie.


If you're new to Java you may want to leverage an existing library to make things a bit easier. Jsoup allows you to fetch an HTML page and extract elements using CSS-style selectors.

This is just a quick and very dirty example, but I think it shows how easy Jsoup can make such a task. Please note that error handling and response-code handling were omitted; I merely wanted to pass on the general idea:

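import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Fetch and parse the page (put the page URL between the quotes)
Document doc = Jsoup.connect("").get();

// Select every <img> element that has a src attribute, in document order
Elements imgElements = doc.select("img[src]");
// LinkedHashMap keeps insertion order, so the images print in page order
Map<String, String> fileSizeMap = new LinkedHashMap<String, String>();

for(Element imgElement : imgElements){
    // "abs:src" resolves a relative src against the page URL
    String imgUrlString = imgElement.attr("abs:src");
    URL imgURL = new URL(imgUrlString);
    HttpURLConnection httpConnection = (HttpURLConnection) imgURL.openConnection();
    String contentLengthString = httpConnection.getHeaderField("Content-Length");
    if(contentLengthString == null)
        contentLengthString = "Unknown";

    fileSizeMap.put(imgUrlString, contentLengthString);
}

for(Map.Entry<String, String> mapEntry : fileSizeMap.entrySet()){
    String imgFileName = mapEntry.getKey();
    System.out.println(imgFileName + " ---> " + mapEntry.getValue() + " bytes");
}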

You might also consider looking at Apache HttpClient. I find it generally preferable over the raw URLConnection/HttpURLConnection approach.
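
Whichever HTTP client is used, a HEAD request is one way to read the Content-Length header without transferring the image itself. Below is a minimal sketch using only the standard library. It assumes the server answers HEAD requests and reports a Content-Length, which not every server does. The names ImageSizeProbe, contentLength and describe are made up for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

class ImageSizeProbe {

    // Asks for the size of a resource without fetching its body.
    // Returns -1 when no Content-Length is reported.
    static long contentLength(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        if (connection instanceof HttpURLConnection) {
            HttpURLConnection http = (HttpURLConnection) connection;
            http.setRequestMethod("HEAD"); // headers only, no image data
            try {
                return http.getContentLengthLong();
            } finally {
                http.disconnect();
            }
        }
        // Non-HTTP URLs (for example file:) have no HEAD method
        return connection.getContentLengthLong();
    }

    // Turns the numeric result into display text, with a fallback
    // for servers that do not send a Content-Length header.
    static String describe(long contentLength) {
        return contentLength < 0 ? "Unknown" : contentLength + " bytes";
    }
}
```

Calling ImageSizeProbe.describe(ImageSizeProbe.contentLength(imgURL)) for each image URL, in the order the images were found, would give one output line per image.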


You should break your problem into 3 subproblems:

  1. Download the HTML document
  2. Parse the HTML document and find the images
  3. Download the images and determine their sizes
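
For step 3, the size can be measured without saving anything by reading the image stream and counting the bytes. This is a rough sketch rather than a complete solution; the class name StreamSize and the method name countBytes are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

class StreamSize {

    // Reads the stream to the end and returns how many bytes it contained.
    // The data is discarded as it is read, so memory use stays constant.
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }
}
```

With a java.net.URL named imgURL, this could be used as try (InputStream in = imgURL.openStream()) { long size = StreamSize.countBytes(in); }. Unlike reading a header, this does transfer the whole image, so it is probably best kept as a fallback for when the server reports no size.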


You can use regular expressions to find the <img> tags and extract the image URLs. After that you'll need the HttpURLConnection class to get the image data and measure its size.
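
A rough sketch of the regular-expression approach is shown here. The pattern is an assumption about what the page looks like: it expects a quoted src attribute and will also match attributes such as data-src, so it is fragile on real-world HTML. ImgSrcFinder and findImageSources are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImgSrcFinder {

    // Matches <img ... src="..."> or src='...'; group 2 is the attribute value.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the src values in the order the tags appear in the HTML.
    static List<String> findImageSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            sources.add(matcher.group(2));
        }
        return sources;
    }
}
```

The values come back exactly as written in the page, so a relative src such as dog.jpg still has to be resolved against the page URL, for example with new URL(pageUrl, src), before it can be opened.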


You can do this:

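// URL, URLConnection and MalformedURLException are in java.net; IOException is in java.io
try {
    URL urlConn = new URL("");
    URLConnection urlC = urlConn.openConnection();
} catch (MalformedURLException e) {
    // The URL string could not be parsed
} catch (IOException e) {
    // The connection could not be opened
}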
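
Opening the connection is only the first step; the page text still has to be read from it. One possible way is sketched here, assuming the page is encoded as UTF-8 (a real page may declare a different charset). PageReader, readAll and fetch are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class PageReader {

    // Drains a Reader into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            text.append(buffer, 0, read);
        }
        return text.toString();
    }

    // Opens the URL and returns the response body as text.
    // Assumption: the response is UTF-8 encoded.
    static String fetch(URL pageUrl) throws IOException {
        URLConnection connection = pageUrl.openConnection();
        try (Reader reader = new BufferedReader(new InputStreamReader(
                connection.getInputStream(), StandardCharsets.UTF_8))) {
            return readAll(reader);
        }
    }
}
```

The returned string can then be searched for image tags.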

