Can you provide examples of parsing HTML?

走了就别回头了 2020-11-22 13:49

How do you parse HTML with a variety of languages and parsing libraries?


When answering:

Individual comments will be linked to in answers to questions

29 answers
  •  -上瘾入骨i
    2020-11-22 14:42

    Language: Racket

    Library: (planet ashinn/html-parser:1) and (planet clements/sxml2:1)

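    ;; PLaneT versions of the HTML parser and the SXML/SXPath tools
    (require net/url
             (planet ashinn/html-parser:1)
             (planet clements/sxml2:1))
    
    ;; Fetch the page and parse it into SXML
    (define the-url (string->url "http://stackoverflow.com/"))
    (define doc (call/input-url the-url get-pure-port html->sxml))
    ;; Collect the href value of every <a> element
    (define links ((sxpath "//a/@href/text()") doc))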
    

    The same example using packages from the new package system, html-parsing and sxml:

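    ;; html-parsing provides html->xexp; sxml provides sxpath
    (require net/url
             html-parsing
             sxml)
    
    ;; Fetch the page and parse it into an SXML/xexp tree
    (define the-url (string->url "http://stackoverflow.com/"))
    (define doc (call/input-url the-url get-pure-port html->xexp))
    ;; Collect the href value of every <a> element
    (define links ((sxpath "//a/@href/text()") doc))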
    

    Note: install the required packages from the command line with raco:

    raco pkg install html-parsing
    

    and:

    raco pkg install sxml
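
    With both packages installed, the same pipeline can be tried without a network request. This is a minimal sketch that assumes html->xexp accepts a string as well as an input port; the sample-html markup is a made-up stand-in for a real page, not part of the original answer.

    ```racket
    #lang racket
    (require html-parsing   ; provides html->xexp
             sxml)          ; provides sxpath

    ;; Hypothetical sample markup, used in place of a fetched page.
    (define sample-html
      "<html><body><a href=\"/questions\">Questions</a> <a href=\"/tags\">Tags</a></body></html>")

    ;; Parse the string into an SXML/xexp tree.
    (define doc (html->xexp sample-html))

    ;; Same query as above: the href value of every <a> element.
    (define links ((sxpath "//a/@href/text()") doc))

    ;; Print one link per line.
    (for-each displayln links)
    ```

    Parsing a fixed string makes it easy to check the query in isolation before pointing it at a live URL.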
    
