How do you programmatically download a webpage in Java?

无人共我 2020-11-22 11:20

I would like to be able to fetch a web page's HTML and save it to a String, so I can do some processing on it. Also, how could I handle various types of compr

11 Answers
  •  借酒劲吻你
    2020-11-22 12:00

    Bill's answer is very good, but you may also want to control parts of the request, such as compression or the user agent. The following code shows how to accept compressed responses and decode them.

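    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.Inflater;
    import java.util.zip.InflaterInputStream;

    URL url = new URL(urlStr);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection(); // cast is safe for http/https URLs
    // follow redirects on this connection; the static setFollowRedirects
    // only affects connections created after it is called
    conn.setInstanceFollowRedirects(true);
    // allow both GZip and Deflate encodings
    conn.setRequestProperty("Accept-Encoding", "gzip, deflate");
    // reading a response header opens the connection
    String encoding = conn.getContentEncoding();
    InputStream inStr = null;

    // create the appropriate stream wrapper based on
    // the encoding type
    if (encoding != null && encoding.equalsIgnoreCase("gzip")) {
        inStr = new GZIPInputStream(conn.getInputStream());
    } else if (encoding != null && encoding.equalsIgnoreCase("deflate")) {
        // Inflater(true) expects raw deflate data with no zlib header;
        // use new Inflater() if the server sends zlib-wrapped data
        inStr = new InflaterInputStream(conn.getInputStream(),
          new Inflater(true));
    } else {
        inStr = conn.getInputStream();
    }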
    

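    To also set the User-Agent header, add the following line before the connection is opened (that is, before calling getContentEncoding() or getInputStream()):

    conn.setRequestProperty("User-Agent", "my agent name");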
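
    Since the question asks for the page as a String, here is a minimal sketch that combines the pieces above and reads the decoded stream into a String. The class name PageFetcher and the methods wrap, readAll and fetch are hypothetical names chosen for illustration, and the sketch assumes the page is UTF-8 encoded, which a real client should check against the Content-Type header.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class; names are illustrative only.
class PageFetcher {

    // Wraps the raw response stream according to the Content-Encoding value.
    static InputStream wrap(InputStream raw, String encoding) throws IOException {
        if ("gzip".equalsIgnoreCase(encoding)) {
            return new GZIPInputStream(raw);
        }
        if ("deflate".equalsIgnoreCase(encoding)) {
            // Same choice as the answer above: raw deflate, no zlib header.
            return new InflaterInputStream(raw, new Inflater(true));
        }
        // No (or unknown) encoding: pass the stream through unchanged.
        return raw;
    }

    // Reads the stream to the end and decodes the bytes with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), charset);
    }

    // Fetches a page as a String. Assumption: the body is UTF-8.
    static String fetch(String urlStr) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlStr).openConnection();
        conn.setInstanceFollowRedirects(true);
        conn.setRequestProperty("Accept-Encoding", "gzip, deflate");
        conn.setRequestProperty("User-Agent", "my agent name");
        try (InputStream in = wrap(conn.getInputStream(), conn.getContentEncoding())) {
            return readAll(in, StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

    Calling PageFetcher.fetch(urlStr) would then return the decoded HTML. Keeping wrap and readAll separate from the network call makes the decoding logic easy to exercise with in-memory data.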
    
