jsoup - extract text from Wikipedia article

猫巷女王i 2021-01-06 12:58

I'm writing some Java code to perform NLP tasks on text from Wikipedia. How can I use jsoup to extract all the text of a Wikipedia article (for example all the

3 Answers
  •  無奈伤痛
    2021-01-06 13:27

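    // needs org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element
    // get() executes the request and returns the parsed Document
    Document doc = Jsoup.connect("http://en.wikipedia.org/wiki/Boston").timeout(5000).get();
    
    // select() returns Elements; selectFirst() returns the single Element with this ID (or null)
    Element iamcontaningIDofintendedTAG = doc.selectFirst("#iamID");
    
    System.out.println(iamcontaningIDofintendedTAG.toString());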
    

    OR

    Elements iamcontaningCLASSofintendedTAG = doc.select(".iamCLASS");
    
    System.out.println(iamcontaningCLASSofintendedTAG.toString());
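
To make the placeholders above concrete, here is a minimal sketch that collects the paragraph text of an article instead of printing raw HTML. It assumes the article body sits in a container with the ID `mw-content-text`, which is an assumption about Wikipedia's current markup and not something stated in the question; the class name `WikiText` and the helper `articleText` are hypothetical names chosen for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class WikiText {
    // Joins the text of every non-empty <p> inside the article body.
    // "#mw-content-text" is an assumed container ID, not taken from the question.
    static String articleText(Document doc) {
        StringBuilder sb = new StringBuilder();
        for (Element p : doc.select("#mw-content-text p")) {
            String t = p.text().trim(); // text() drops the tags and keeps the words
            if (!t.isEmpty()) {
                sb.append(t).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Requires network access and jsoup on the classpath.
        Document doc = Jsoup.connect("https://en.wikipedia.org/wiki/Boston")
                .timeout(5000)
                .get();
        System.out.print(articleText(doc));
    }
}
```

Calling `text()` rather than `toString()` is the relevant design choice for NLP input: `toString()` returns the element's HTML, while `text()` returns only the visible words. If the container ID differs on the pages you target, only the selector string needs to change.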
    
