Open a connection with Jsoup, get status code and parse document

春和景丽 2021-01-02 19:26

I'm creating a class using jsoup that will do the following:

  1. The constructor opens a connection to a url.
  2. I have a method that will check the status
4 Answers
  •  盖世英雄少女心
    2021-01-02 20:09

    As stated in the jsoup documentation for the Connection.Response type, there is a parse() method that parses the response body as a Document and returns it. Once you have the Document, you can do whatever you want with it.

    For example, see the implementation of getUrls():

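    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.jsoup.Connection;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public class ParsePage {
       private final String path;
       private Connection.Response response = null;

       // path is the URL to fetch, for example a sitemap
       public ParsePage(String path) {
          this.path = path;
          try {
             response = Jsoup.connect(path)
                .userAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 (KHTML, like Gecko) Chrome/19.0.1042.0 Safari/535.21")
                .timeout(10000)
                // keep 4xx/5xx responses instead of throwing, so the status can be read
                .ignoreHttpErrors(true)
                .execute();
          } catch (IOException e) {
             System.out.println("io - " + e);
          }
       }

       // response stays null if the connection itself failed
       public int getSitemapStatus() {
          return response.statusCode();
       }

       public List<String> getUrls() throws IOException {
          List<String> urls = new ArrayList<>();
          // parse() reads the body of the response that was already fetched
          Document doc = response.parse();
          // do whatever you want, for example retrieving the <loc> values from the sitemap
          for (Element url : doc.select("url")) {
             urls.add(url.select("loc").text());
          }
          return urls;
       }
    }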
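
To try the extraction step from getUrls() without a network connection, here is a minimal sketch that parses an inline sitemap string. The class name SitemapLocDemo, the helper extractLocs, and the example.com URLs are made up for illustration. It assumes jsoup is on the classpath and uses Parser.xmlParser() explicitly, since there is no HTTP response here to supply a content type.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

// Hypothetical demo class, not part of the original answer.
public class SitemapLocDemo {
    // Collects the text of the <loc> element inside every <url> element.
    static List<String> extractLocs(String sitemapXml) {
        // Use the XML parser so the sitemap tags are not normalized as HTML.
        Document doc = Jsoup.parse(sitemapXml, "", Parser.xmlParser());
        List<String> locs = new ArrayList<>();
        for (Element url : doc.select("url")) {
            locs.add(url.select("loc").text());
        }
        return locs;
    }

    public static void main(String[] args) {
        // Made-up sitemap content for illustration only.
        String xml = "<urlset>"
            + "<url><loc>https://example.com/a</loc></url>"
            + "<url><loc>https://example.com/b</loc></url>"
            + "</urlset>";
        System.out.println(extractLocs(xml));
    }
}
```

The same select("url") / select("loc") loop should apply to the Document returned by response.parse() once the response has been fetched.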
    
