Java program to download images from a website and display the file sizes

Submitted by 北城以北 on 2021-02-09 11:12:15

Question


I'm creating a Java program that reads an HTML document from a URL and displays the sizes of the images referenced in it. I'm not sure how to go about this, though.

I don't need to download and save the images; I just need their sizes and the order in which they appear on the webpage.

For example, a webpage has 3 images:

<img src="dog.jpg" /> //which is 54kb
<img src="cat.jpg" /> //which is 75kb
<img src="horse.jpg"/> //which is 80kb

I would need my Java program to print:

54kb
75kb
80kb

Any ideas where I should start?

P.S. I'm a bit of a Java newbie.


Answer 1:


If you're new to Java you may want to leverage an existing library to make things a bit easier. Jsoup allows you to fetch an HTML page and extract elements using CSS-style selectors.

This is a quick and very dirty example, but I think it shows how easy Jsoup can make such a task. Error handling and response-code handling are omitted; I only want to convey the general idea:

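import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The statements below belong inside a method that declares "throws IOException".
Document doc = Jsoup.connect("http://stackoverflow.com/questions/14541740/java-program-to-download-images-from-a-website-and-display-the-file-sizes").get();

// Select every <img> element that has a src attribute, in document order.
Elements imgElements = doc.select("img[src]");
// LinkedHashMap keeps insertion order, so the sizes print in page order.
Map<String, String> fileSizeMap = new LinkedHashMap<String, String>();

for (Element imgElement : imgElements) {
    // "abs:src" resolves a relative src against the page URL.
    String imgUrlString = imgElement.attr("abs:src");
    URL imgURL = new URL(imgUrlString);
    HttpURLConnection httpConnection = (HttpURLConnection) imgURL.openConnection();
    // HEAD asks for headers only; servers that reject HEAD need a normal GET instead.
    httpConnection.setRequestMethod("HEAD");
    String contentLengthString = httpConnection.getHeaderField("Content-Length");
    if (contentLengthString == null) {
        contentLengthString = "Unknown";
    }
    httpConnection.disconnect();

    fileSizeMap.put(imgUrlString, contentLengthString);
}

for (Map.Entry<String, String> mapEntry : fileSizeMap.entrySet()) {
    String imgFileName = mapEntry.getKey();
    System.out.println(imgFileName + " ---> " + mapEntry.getValue() + " bytes");
}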

You might also consider looking at Apache HttpClient. I find it generally preferable over the raw URLConnection/HttpURLConnection approach.
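
The loop above prints raw byte counts, while the question asks for values like 54kb. A minimal sketch of that conversion follows. The class name SizeFormatter and both method names are hypothetical, and it assumes 1 kb means 1024 bytes, rounded to the nearest whole number; the original answer does not specify either.

```java
public class SizeFormatter {

    // Converts a byte count to a string such as "54kb".
    // Assumption: 1 kb = 1024 bytes, rounded to the nearest whole kb.
    public static String toKb(long bytes) {
        if (bytes < 0) {
            return "Unknown"; // negative values are treated as "size not reported"
        }
        return Math.round(bytes / 1024.0) + "kb";
    }

    // Accepts the raw Content-Length header value, which may be null or malformed.
    public static String fromHeader(String contentLength) {
        if (contentLength == null) {
            return "Unknown";
        }
        try {
            return toKb(Long.parseLong(contentLength.trim()));
        } catch (NumberFormatException e) {
            return "Unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(fromHeader("55296")); // 54 * 1024 bytes, prints 54kb
        System.out.println(fromHeader(null));    // prints Unknown
    }
}
```

In the loop, fromHeader(contentLengthString) could replace the manual null check, if this rounding convention suits the output you want.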




Answer 2:


You should break your problem into 3 subproblems:

  1. Download the HTML document
  2. Parse the HTML document and find the images
  3. Download the images and determine their sizes
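
As a rough illustration of how these steps might map onto plain JDK code, here is a sketch that covers step 1, the URL-resolution part of step 2, and step 3. The class name ImageSizeSteps, its method names, and the example.com address are hypothetical; it assumes Java 9+ (for readAllBytes) and a UTF-8 page. Finding the img tags themselves is left to a parser of your choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class ImageSizeSteps {

    // Step 1: download the HTML document (assumes the page is UTF-8 encoded).
    public static String downloadHtml(URI pageUri) throws IOException {
        try (InputStream in = pageUri.toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Part of step 2: turn a src value such as "dog.jpg" into an absolute URI.
    public static URI resolveAgainstPage(URI pageUri, String src) {
        return pageUri.resolve(src);
    }

    // Step 3: read a stream to the end and count its bytes without saving them.
    public static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        URI page = URI.create("http://example.com/pets/index.html"); // hypothetical page
        URI image = resolveAgainstPage(page, "dog.jpg");
        System.out.println(image); // http://example.com/pets/dog.jpg

        // A real run would pass image.toURL().openStream(); an in-memory
        // stream stands in here so the sketch runs without network access.
        try (InputStream in = new ByteArrayInputStream(new byte[54 * 1024])) {
            System.out.println(countBytes(in) + " bytes"); // 55296 bytes
        }
    }
}
```

Counting bytes this way transfers the whole image, so it is the fallback for servers that do not report a Content-Length header.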



Answer 3:


You can use regular expressions to find the <img> tags and extract the image URLs. After that, you'll need the HttpURLConnection class to fetch the image data and measure its size.
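
A minimal sketch of the regular-expression step, using the three tags from the question as input. The class name ImgSrcExtractor and the pattern are illustrative assumptions, not part of the original answer. The pattern only handles quoted src attributes, and regular expressions in general are fragile on real-world HTML, so treat this as a starting point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImgSrcExtractor {

    // Matches <img ... src="..."> or <img ... src='...'>; group 2 is the URL.
    // The \s before "src" keeps attributes such as data-src from matching.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the src values in the order the tags appear in the HTML.
    public static List<String> extractSrcs(String html) {
        List<String> srcs = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            srcs.add(matcher.group(2));
        }
        return srcs;
    }

    public static void main(String[] args) {
        String html = "<img src=\"dog.jpg\" />\n"
                + "<img src=\"cat.jpg\" />\n"
                + "<img src=\"horse.jpg\"/>";
        System.out.println(extractSrcs(html)); // [dog.jpg, cat.jpg, horse.jpg]
    }
}
```

The extracted values may be relative, so they still need to be resolved against the page URL before opening a connection.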




Answer 4:


You can do this:

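import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

try {
    URL imageUrl = new URL("http://yoururl.com/cat.jpg");
    URLConnection connection = imageUrl.openConnection();
    // Size in bytes from the Content-Length header, or -1 if the server does not report it.
    System.out.println(connection.getContentLength());
} catch (IOException e) {
    // MalformedURLException is a subclass of IOException, so one catch covers both.
    e.printStackTrace();
}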


Source: https://stackoverflow.com/questions/14541740/java-program-to-download-images-from-a-website-and-display-the-file-sizes
