Get link and href text from html doc with Nokogiri & Ruby?

广开言路 2020-12-28 11:02

I'm trying to use the nokogiri gem to extract all the URLs on a page, as well as their link text, and store each link text and URL pair in a hash.


2 answers
  •  萌比男神i
    2020-12-28 11:30

    Here's a one-liner:

    Hash[doc.xpath('//a[@href]').map {|link| [link.text.strip, link["href"]]}]
    
    #=> {"Foo"=>"#foo", "Bar"=>"#bar"}
    

    Split up a bit to be arguably more readable:

    h = {}
    doc.xpath('//a[@href]').each do |link|
      h[link.text.strip] = link['href']
    end
    puts h
    
    #=> {"Foo"=>"#foo", "Bar"=>"#bar"}
    
