Can you provide examples of parsing HTML?

走了就别回头了 2020-11-22 13:49

How do you parse HTML with a variety of languages and parsing libraries?


When answering:

Individual comments will be linked to in answers to questions

29 answers
  •  一整个雨季
    2020-11-22 14:27

    Language: Clojure
    Library: Enlive (a selector-based (à la CSS) templating and transformation system for Clojure)


    Selector expression:

    ;; Parse test-html and select every <a> element.
    (def test-select
      (html/select (html/html-resource (java.io.StringReader. test-html)) [:a]))
    

    Now we can do the following at the REPL (I've added line breaks to the test-select output):

    user> test-select
    ({:tag :a, :attrs {:href "http://foo.com/"}, :content ["foo"]}
     {:tag :a, :attrs {:href "http://bar.com/"}, :content ["bar"]}
     {:tag :a, :attrs {:href "http://baz.com/"}, :content ["baz"]})
    user> (map #(get-in % [:attrs :href]) test-select)
    ("http://foo.com/" "http://bar.com/" "http://baz.com/")
    

    You'll need the following to try it out:

    Preamble:

    (require '[net.cgrand.enlive-html :as html])
    

    Test HTML:

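    ;; The markup inside the string literals was lost when this page was extracted;
    ;; the tags below are reconstructed to match the REPL output shown above.
    (def test-html
      (apply str (concat ["<html><body>"]
                         (for [link ["foo" "bar" "baz"]]
                           (str "<a href=\"http://" link ".com/\">" link "</a>"))
                         ["</body></html>"])))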
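
    As a small extension of the href extraction above, the sketch below pairs each link's href with its visible text. It is only an illustration and assumes Enlive is already on the classpath. The names sample-html, links, and href-text-pairs are hypothetical and not part of the original answer; html/text is Enlive's helper for reading a node's text content.

    ```clojure
    ;; Assumption: Enlive is available on the classpath.
    (require '[net.cgrand.enlive-html :as html])

    ;; Hypothetical sample markup, written out literally for this sketch.
    (def sample-html
      (str "<html><body>"
           "<a href=\"http://foo.com/\">foo</a>"
           "<a href=\"http://bar.com/\">bar</a>"
           "</body></html>"))

    ;; Parse the string and select every <a> element.
    (def links
      (html/select (html/html-resource (java.io.StringReader. sample-html)) [:a]))

    ;; Pair each link's href attribute with its text content.
    (def href-text-pairs
      (map (fn [node] [(get-in node [:attrs :href]) (html/text node)])
           links))
    ```

    Evaluating href-text-pairs at the REPL should give one [href text] vector per link, in document order.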
    
