Question
I am trying to retrieve the final location of a given URL (String ref) as follows:
HttpURLConnection con = (HttpURLConnection)new URL(ref).openConnection();
con.setInstanceFollowRedirects(true);
con.setRequestProperty("User-Agent","");
int responseCode = con.getResponseCode();
return con.getURL().toString();
It works in most cases, but occasionally returns a URL that itself still redirects somewhere else.
What am I doing wrong here?
Why do I get responseCode = 3xx, even after calling setInstanceFollowRedirects(true)?
UPDATE:
OK, responseCode can sometimes be 3xx.
If it happens, then I will return con.getHeaderField("Location") instead.
The code now is:
HttpURLConnection con = (HttpURLConnection)new URL(ref).openConnection();
con.setInstanceFollowRedirects(true);
con.setRequestProperty("User-Agent","");
int responseType = con.getResponseCode()/100;
while (responseType == 1)
{
Thread.sleep(10);
responseType = con.getResponseCode()/100;
}
if (responseType == 3)
return con.getHeaderField("Location");
return con.getURL().toString();
I would appreciate comments if anyone sees anything wrong with the code above.
UPDATE:
- Removed the handling of code 1xx, as according to most commenters it is not necessary.
- Testing if the Location header exists before returning it, in order to handle code 304.
HttpURLConnection con = (HttpURLConnection)new URL(ref).openConnection();
con.setInstanceFollowRedirects(true);
con.setRequestProperty("User-Agent","");
if (con.getResponseCode()/100 == 3)
{
String target = con.getHeaderField("Location");
if (target != null)
return target;
}
return con.getURL().toString();
Answer 1:
HttpURLConnection will not follow redirects if the protocol changes, such as http to https or https to http. In that case, it will return the 3xx code and you should be able to get the Location header. You may need to open a new connection in case that new URL also redirects. So basically, use a loop and break out of it when you get a non-redirect response code. Also, watch out for infinite redirect loops: you could set a limit on the number of iterations or check whether each new URL has been visited already.
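The loop described above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name FinalUrlResolver, its method names, and the maxRedirects parameter are all made up for this example. It assumes that every URL in the chain is http or https, and it treats only 301, 302, 303, 307 and 308 as redirects.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; none of these names come from the original post.
final class FinalUrlResolver {

    /** True for the status codes that normally carry a Location header. */
    public static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Resolves a possibly relative Location value against the current URL. */
    public static String resolveLocation(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    /** Follows redirects by hand, one fresh connection per hop. */
    public static String resolveFinalUrl(String startUrl, int maxRedirects)
            throws IOException {
        Set<String> visited = new HashSet<>();
        String current = startUrl;
        for (int hop = 0; hop <= maxRedirects; hop++) {
            // Seeing the same URL twice means the chain loops back on itself.
            if (!visited.add(current)) {
                throw new IOException("Redirect cycle detected at " + current);
            }
            HttpURLConnection con =
                    (HttpURLConnection) URI.create(current).toURL().openConnection();
            // Handle every hop ourselves, including http <-> https changes.
            con.setInstanceFollowRedirects(false);
            try {
                int status = con.getResponseCode();
                String location = con.getHeaderField("Location");
                if (!isRedirect(status) || location == null) {
                    return current; // non-redirect response: this is the final URL
                }
                current = resolveLocation(current, location);
            } finally {
                con.disconnect();
            }
        }
        throw new IOException("More than " + maxRedirects + " redirects from " + startUrl);
    }
}
```

With this sketch, a call such as FinalUrlResolver.resolveFinalUrl(ref, 10) would return the first URL in the chain that does not answer with a redirect, or throw an IOException if the chain loops or is longer than 10 hops. Automatic following is switched off so that protocol changes and ordinary redirects go through the same code path.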
Answer 2:
If you just want the redirect url, the response header should give you that:
if (con.getResponseCode() == 301) {
String redirectUrl = con.getHeaderField("Location");
}
Answer 3:
There can easily be multiple levels of redirection: imagine a bit.ly link pointing to a youtu.be address, which in turn points to youtube.com. You probably need to loop until you get a 200 OK or until you hit a redirection cycle.
I have trouble locating the source code to check, but I believe what I said is true. See e.g. "java urlconnection get the final redirected URL".
You also might need to handle protocol redirects, e.g. HTTP -> HTTPS; see "URLConnection Doesn't Follow Redirect".
Answer 4:
I think I now understand what you want: you are trying to retrieve the final address, not the content of the final address. Please correct me if my assumption is wrong.
For doing this (not the content, but the address), you need a different approach. You need to switch off follow-redirects and then follow the redirects iteratively on your own until you find a non-redirecting response. Bear in mind that you cannot reuse a URLConnection.
The approach for finding the final address and the approach for retrieving the content of the final address are so different because URLConnection does not reveal the followed-to address if you switch on follow-redirects.
In your code, you seem to expect URLConnection.getURL() to return the followed-to address. This is not the behavior of this method. It returns the original URL which you used to create the URLConnection. It does this whether or not you switch on follow-redirects.
However, if you switch it on, you will not be able to get the followed-to URL address. This is because getHeaderField("Location"), with follow-redirects, makes no sense: it returns the redirection target of the final redirect, which should not exist, since it's the final address.
Source: https://stackoverflow.com/questions/20806617/retrieve-the-final-location-of-a-given-url-in-java