In Java, is it possible to determine the size of a web page before downloading it?

Submitted by 青春壹個敷衍的年華 on 2019-12-11 06:37:45

Question


I want to determine the size of a web page before downloading it, so that I can decide whether to download it or not depending on whether it is larger than some limit (e.g. 5 MB). Can I get this information?


Answer 1:


You can do a decent approximation with:

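// The URL needs a protocol; without one, new URL(...) throws MalformedURLException.
HttpURLConnection content = (HttpURLConnection) new URL("http://www.example.com").openConnection();
// getContentLength() returns -1 if the server does not send a Content-Length header.
System.out.println(content.getContentLength());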

However, this will only tell you the length of the specific resource you're requesting (e.g. the HTML at the base of the URL). You will also need to go through the HTML in the page, look at all the resources that it references (scripts from other sites, images, video, etc.) and total them all up.
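As a rough illustration of the "go through the HTML and total things up" step, here is a minimal sketch. The class and method names are made up for this example, and the regex is deliberately naive: it only picks up src attributes, so it misses stylesheets, CSS-referenced images and anything loaded by scripts. A real HTML parser would do a better job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class ResourceLinks {
    // Naive pattern: matches src="..." or src='...' attributes only.
    private static final Pattern SRC =
            Pattern.compile("\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns the raw src values found in the HTML, in document order.
    static List<String> findSrcUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    // Adds up the sizes that are known; negative entries (length not reported) are skipped.
    static long totalKnown(long[] sizes) {
        long total = 0;
        for (long size : sizes) {
            if (size >= 0) {
                total += size;
            }
        }
        return total;
    }
}
```

Each value returned by findSrcUrls would still have to be resolved against the page's base URL and queried for its own content length before being passed to totalKnown.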

That will get you fairly close to a total size, but even then you won't get a perfect count, because (a) not all URLs are going to return this information and you don't have any control over that, and (b) depending on how the content is loaded (such as through AJAX calls that hide the specifics) you won't be able to know ahead of time the complete list of resources to be downloaded.
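For point (a), a missing length shows up as -1 from getContentLength(). The sketch below shows one way to handle that, assuming you send a HEAD request so that only headers are transferred. The class name, method names and the "treat unknown as not within the limit" policy are assumptions for illustration, and some servers answer HEAD differently from GET or omit Content-Length entirely.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of any library.
class SizeCheck {
    // Sends a HEAD request and returns the advertised Content-Length in bytes,
    // or -1 if the server does not report one.
    static long reportedLength(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("HEAD");
        try {
            return conn.getContentLengthLong();
        } finally {
            conn.disconnect();
        }
    }

    // True only when the length is known and does not exceed maxBytes.
    // An unknown length (-1) is treated as "not within the limit" here;
    // that is a policy choice, not something the server tells you.
    static boolean withinLimit(long reportedLength, long maxBytes) {
        return reportedLength >= 0 && reportedLength <= maxBytes;
    }
}
```

With a 5 MB cap, the check would look something like withinLimit(reportedLength("http://www.example.com"), 5L * 1024 * 1024).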

Alternatively, if a URL doesn't return a length, I think Giacomo was suggesting the use of a CounterInputStream. Not a bad idea. You could combine the suggestion above with the CounterInputStream to keep a running count of the bytes received so far, and stop the transfer once it reaches a specified maximum. That way you'd have a predicted size (say a site tells you it's going to be 3.3 MB), and if you find while downloading that it's actually 6 MB and still going, you can decide not to download any more than that.
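The CounterInputStream mentioned above isn't shown in this thread, so the following is a stand-in sketch of the same idea rather than that class: count bytes while copying and give up once a cap is exceeded. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of any library.
class CappedDownload {
    // Copies from in to out while counting bytes. Returns the number of bytes
    // copied, or -1 if the source turned out to hold more than maxBytes.
    // When -1 is returned, out may already contain a partial copy.
    static long copyWithLimit(InputStream in, OutputStream out, long maxBytes) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            if (total + n > maxBytes) {
                return -1; // over the cap: stop reading
            }
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

In practice the input would be the stream from conn.getInputStream(), and the caller would close the stream and disconnect when -1 comes back.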




Answer 2:


I may be wrong, but can't you just use

HttpURLConnection conn = (HttpURLConnection) new URL("http://www.google.com").openConnection();
System.out.println(conn.getContentLength());

?



Source: https://stackoverflow.com/questions/5902306/in-java-its-possible-determine-the-size-of-a-web-page-before-download
