Question
I am trying to extract data from a site to build a database.
I want to extract everything from "h2#1" up to the line before "h2#2"
and put it into an Element, so that the data is easier to handle.
The data shown in the picture is inside a div with id="left".
The page I am trying to extract data from:
http://koryaku.fullbokko.drecom.jp/quests/sp/eiketsu_sinka_no_hihou/netureinokishi/#1
Answer 1:
Try this CSS selector:
h2#1 ~ *:not(h2#2 ~ *):not(h2#2)
DEMO
http://try.jsoup.org/~T29QSXFbJqwJx2a_If4qUeD1cnU
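For use from Java, here is a minimal sketch of how the selector might be applied and the matches gathered into a single Element, as the question asks. The class name, the `extract` helper and the sample markup are made up for illustration; the real page's div#left will look different.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class BetweenHeadings {
    // Hypothetical markup standing in for the real page's div#left.
    static final String SAMPLE_HTML =
        "<div id=\"left\">"
        + "<h2 id=\"1\">First</h2>"
        + "<p>alpha</p>"
        + "<ul><li>beta</li></ul>"
        + "<h2 id=\"2\">Second</h2>"
        + "<p>gamma</p>"
        + "</div>";

    // Returns a detached <div> holding copies of every element between h2#1 and h2#2.
    static Element extract(String html) {
        Document doc = Jsoup.parse(html);
        Elements between = doc.select("h2#1 ~ *:not(h2#2 ~ *):not(h2#2)");
        Element container = doc.createElement("div");
        for (Element e : between) {
            // Clone so the parsed document itself is left unchanged.
            container.appendChild(e.clone());
        }
        return container;
    }

    public static void main(String[] args) {
        Element section = extract(SAMPLE_HTML);
        for (Element child : section.children()) {
            System.out.println(child.tagName() + ": " + child.text());
        }
    }
}
```

To run it against the live page, `Jsoup.connect(url).get()` would take the place of `Jsoup.parse(html)`. Note that the selector only matches elements, so loose text sitting directly between the two headings would not be picked up.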
DESCRIPTION
h2#1 ~ *         /* Select every sibling element that comes after h2#1 ... */
:not(h2#2 ~ *)   /* ... that is not a sibling coming after h2#2 ... */
:not(h2#2)       /* ... and that is not h2#2 itself. */
Tested on Jsoup 1.8.3
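If the selector is hard to read, roughly the same range can be collected with an explicit sibling walk. This is only a sketch of an alternative, not something from the demo above: the class name, the `between` helper and the sample markup are assumptions. Unlike the selector, it stops at any element whose id matches the end id, whatever its tag.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SiblingWalk {
    // Hypothetical markup standing in for the real page's div#left.
    static final String SAMPLE_HTML =
        "<div id=\"left\">"
        + "<h2 id=\"1\">First</h2>"
        + "<p>alpha</p>"
        + "<ul><li>beta</li></ul>"
        + "<h2 id=\"2\">Second</h2>"
        + "<p>gamma</p>"
        + "</div>";

    // Collects the element siblings after #startId, stopping at #endId
    // or at the end of the parent if #endId is never reached.
    static Elements between(Document doc, String startId, String endId) {
        Elements result = new Elements();
        Element start = doc.getElementById(startId);
        if (start == null) {
            return result; // start marker not found: nothing to collect
        }
        for (Element cur = start.nextElementSibling();
             cur != null;
             cur = cur.nextElementSibling()) {
            if (cur.id().equals(endId)) {
                break; // reached the end marker, which is excluded
            }
            result.add(cur);
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        for (Element e : between(doc, "1", "2")) {
            System.out.println(e.tagName() + ": " + e.text());
        }
    }
}
```

The loop makes the start and end conditions explicit and takes the ids as parameters, which may be handy if the page has more sections (h2#2 to h2#3, and so on).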
Source: https://stackoverflow.com/questions/34723544/how-to-extract-any-nodes-between-a-node-a-and-a-node-b-with-jsoup