How do you parse a web page and extract all the href links?

情话喂你 2021-01-01 19:18

I want to parse a web page in Groovy and extract all of the href links, along with the text associated with each one.

If the page contained these links:



        
7 answers
  •  一向 (OP)
    2021-01-01 19:28

    Use XmlSlurper to parse the HTML as an XML document (this requires the page to be well-formed XML). Then walk the tree with depthFirst() and call findAll with a closure that selects the a tags; find would return only the first match, while findAll gives you a list of all of them. For each matching node, text() returns the link text and @href returns the href attribute.
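A minimal sketch of that approach, assuming Groovy 3 or later (where XmlSlurper lives in groovy.xml) and a page that is already well-formed XML. The sample markup, the example.com URLs, and the variable names are made up for illustration and are not taken from the question.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x the class is groovy.util.XmlSlurper and needs no import

// Hypothetical sample page. A plain XmlSlurper needs well-formed XML.
def html = '''<html>
  <body>
    <a href="http://example.com/one">First link</a>
    <p>Some text <a href="http://example.com/two">Second link</a></p>
  </body>
</html>'''

def page = new XmlSlurper().parseText(html)

// Walk the whole tree and keep only the <a> elements.
def anchors = page.depthFirst().findAll { it.name() == 'a' }

// Pair each href attribute with the text inside the tag.
def links = anchors.collect { [href: it.@href.text(), text: it.text()] }

links.each { println "${it.href} -> ${it.text}" }
```

Real-world HTML is often not well-formed, in which case parseText will throw a parse error. One common workaround, assuming the TagSoup library is on the classpath, is to pass a lenient SAX reader to the constructor, for example new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()), and leave the rest of the code unchanged.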
