Jsoup Cookies for HTTPS scraping

不知归路 2020-11-27 12:41

I am experimenting with this site to learn Jsoup and Android: I want to read my username from the welcome page. I am using the following code:

Connection.Response res = J         


        
3 answers
  •  青春惊慌失措
    2020-11-27 13:37

    I always do this in two steps, the way a person using a browser would:

    1. Read the login page (GET, and keep the cookies it sets)
    2. Submit the form together with those cookies (POST, without modifying the cookies)

    Example:

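    import org.jsoup.Connection;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    // Step 1: GET the login page so the server can set its initial cookies.
    Connection.Response response = Jsoup.connect("http://www.mikeportnoy.com/forum/login.aspx")
            .method(Connection.Method.GET)
            .execute();

    // Step 2: POST the login form, sending back the cookies from step 1.
    response = Jsoup.connect("http://www.mikeportnoy.com/forum/login.aspx")
            .data("ctl00$ContentPlaceHolder1$ctl00$Login1$UserName", "username")
            .data("ctl00$ContentPlaceHolder1$ctl00$Login1$Password", "password")
            .cookies(response.cookies())
            .method(Connection.Method.POST)
            .execute();

    // Step 3: request a protected page with the cookies returned by the login POST.
    Document homePage = Jsoup.connect("http://www.mikeportnoy.com/forum/default.aspx")
            .cookies(response.cookies())
            .get();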
    

    And always pass the cookies from the previous response to the next request using

             .cookies(response.cookies())
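
    One caveat: as far as I know, `response.cookies()` contains only the cookies set by that particular response, so a session cookie issued by the GET may be absent from the map returned by the POST. A minimal sketch of one way to carry everything forward is to accumulate the cookies in a single map. `SessionCookies` is a hypothetical helper name, not part of Jsoup.

    ```java
    import java.util.Collections;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical helper (not part of Jsoup): accumulates cookies across responses.
    class SessionCookies {
        private final Map<String, String> cookies = new LinkedHashMap<>();

        // Add or overwrite cookies from the latest response,
        // e.g. session.merge(response.cookies()) after each execute().
        SessionCookies merge(Map<String, String> fromResponse) {
            cookies.putAll(fromResponse);
            return this;
        }

        // Read-only view to send with the next request,
        // e.g. .cookies(session.asMap()).
        Map<String, String> asMap() {
            return Collections.unmodifiableMap(cookies);
        }
    }
    ```

    With this, each request would send `.cookies(session.asMap())` and each response would be folded in with `session.merge(response.cookies())`, so later cookies overwrite earlier ones of the same name while the rest are kept.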
    

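    SSL is not important here. If you have problems with certificates, call this method to skip certificate and hostname checks. It affects every `HttpsURLConnection` in the process, so use it only for testing.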

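    import java.security.SecureRandom;
    import java.security.cert.CertificateException;
    import java.security.cert.X509Certificate;
    import javax.net.ssl.HostnameVerifier;
    import javax.net.ssl.HttpsURLConnection;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.SSLSession;
    import javax.net.ssl.X509TrustManager;

    public static void trustEveryone() {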
        try {
            HttpsURLConnection.setDefaultHostnameVerifier(new HostnameVerifier() {
                public boolean verify(String hostname, SSLSession session) {
                    return true;
                }
            });
    
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, new X509TrustManager[]{new X509TrustManager() {
                public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException { }
    
                public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException { }
    
                public X509Certificate[] getAcceptedIssuers() {
                    return new X509Certificate[0];
                }
            }}, new SecureRandom());
            HttpsURLConnection.setDefaultSSLSocketFactory(context.getSocketFactory());
        } catch (Exception e) { // should never happen
            e.printStackTrace();
        }
    }
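
    Since `trustEveryone()` replaces process-wide defaults, a narrower variant may be preferable: build the permissive socket factory and hand it only to the connection that needs it. This is a sketch of an alternative, not the method from the answer above, and `PermissiveTls` is a hypothetical class name. It still accepts any certificate, so it is for testing only.

    ```java
    import java.security.GeneralSecurityException;
    import java.security.SecureRandom;
    import java.security.cert.X509Certificate;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.SSLSocketFactory;
    import javax.net.ssl.TrustManager;
    import javax.net.ssl.X509TrustManager;

    // Hypothetical helper: creates a trust-all socket factory
    // without installing it as the JVM default.
    class PermissiveTls {
        static SSLSocketFactory socketFactory() {
            // Trust manager that accepts every certificate chain (testing only).
            TrustManager trustAll = new X509TrustManager() {
                @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
                @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
                @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
            };
            try {
                SSLContext context = SSLContext.getInstance("TLS");
                context.init(null, new TrustManager[] { trustAll }, new SecureRandom());
                return context.getSocketFactory();
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException("Could not create TLS context", e);
            }
        }
    }
    ```

    The returned factory can be set on a single `HttpsURLConnection` through its `setSSLSocketFactory(...)` instance method. I believe recent Jsoup versions also accept one via `Connection.sslSocketFactory(...)`, but check the Javadoc for your version. Hostname verification stays enabled in this variant.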
    
