Here is the question.
Given the URL http://www.example.com, can we read only the first N bytes of the page?
You can do it natively with the following curl command (no need to download the whole document). According to the curl man page:
RANGES

HTTP 1.1 introduced byte-ranges. Using this, a client can request to get only one or more subparts of a specified document. `curl` supports this with the `-r` flag.

Get the first 100 bytes of a document:

    curl -r 0-99 http://www.get.this/

Get the last 500 bytes of a document:

    curl -r -500 http://www.get.this/

`curl` also supports simple ranges for FTP files as well. Then you can only specify start and stop position.

Get the first 100 bytes of a document using FTP:

    curl -r 0-99 ftp://www.get.this/README
It works for me even with a Java web app deployed to GigaSpaces.
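If you need the same thing from a script rather than the command line, the idea carries over: send a `Range` header with the request. Below is a minimal sketch in Python using only the standard library. The helper names `build_range_request` and `read_first_bytes` are made up for this example. Keep in mind that a server is allowed to ignore the `Range` header and answer with the whole document (status 200 instead of 206), so the sketch also stops reading after N bytes.

```python
import urllib.request


def build_range_request(url, n):
    """Build a GET request asking for bytes 0..n-1 of the resource (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Byte ranges are inclusive, so the first n bytes are 0-(n-1),
    # the same convention as `curl -r 0-99` for the first 100 bytes.
    return urllib.request.Request(url, headers={"Range": f"bytes=0-{n - 1}"})


def read_first_bytes(url, n, timeout=10):
    """Return at most the first n bytes of the resource at url (hypothetical helper)."""
    request = build_range_request(url, n)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 means the server honored the range; 200 means it ignored it
        # and is sending the full document, so we cap the read at n bytes.
        return response.read(n)
```

Usage would look something like `read_first_bytes("http://www.example.com", 100)`. Whether the transfer is actually cut short on the wire depends on the server supporting range requests.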