Jsoup.connect cannot get correct html contents

試著忘記壹切 submitted on 2019-12-25 02:08:44

Question


I use Jsoup to extract specific data from a website:

try {
    Document doc = Jsoup.connect("http://example/search/").get();
} catch (IOException e) {
    System.out.println("error");
}

But it fails, and the output is "error".

When I open this address in Mozilla or another browser, it loads successfully. Any ideas? Please help.

Best regards


Answer 1:


If you display the exception message from your IOException, you will see:

org.jsoup.HttpStatusException: HTTP error fetching URL. Status=500, URL=...

Solution: You need to set the user agent for the request to the mobile website:

Document doc = Jsoup.connect("http://m.tokobagus.com/search/province")
        .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) "
                + "Chrome/15.0.874.120 Safari/535.2")
        .get();

More importantly, remember to display those exception messages:

} catch (IOException ioe) {
    ioe.printStackTrace();
}
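
To see why the user agent can change the result, here is a minimal sketch that uses only the JDK, not Jsoup. It starts a hypothetical local server that answers 500 unless the User-Agent header starts with "Mozilla/" (an assumption made purely for illustration; what the real site checks is not known), then requests it with the JDK default user agent and with a browser-like one. The class and method names (UserAgentDemo, startPickyServer, fetchStatus) are made up for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class UserAgentDemo {

    // Hypothetical server rule: only clients that look like a browser get a 200.
    public static HttpServer startPickyServer() throws IOException {
        // Port 0 lets the OS pick a free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/search", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean looksLikeBrowser = ua != null && ua.startsWith("Mozilla/");
            byte[] body = (looksLikeBrowser ? "<html>ok</html>" : "error")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(looksLikeBrowser ? 200 : 500, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Sends a GET and returns the HTTP status; pass null to keep the JDK default user agent.
    public static int fetchStatus(String url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        if (userAgent != null) {
            conn.setRequestProperty("User-Agent", userAgent);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startPickyServer();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/search";
            System.out.println("default UA: " + fetchStatus(url, null));
            System.out.println("browser UA: "
                    + fetchStatus(url, "Mozilla/5.0 (Windows NT 6.1; WOW64)"));
        } finally {
            server.stop(0);
        }
    }
}
```

Against this toy server the first request gets status 500 and the second gets 200. With Jsoup the same idea applies: a non-2xx status surfaces as the HttpStatusException shown above, so printing it is the quickest way to tell a server-side rejection from a network problem.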


Source: https://stackoverflow.com/questions/22329238/jsoup-connect-cannot-get-correct-html-contents
