Xpath Table Within Table

我与影子孤独终老i submitted on 2019-12-13 03:06:56

Question


I am having a bit of trouble scraping a table-heavy page with DOMXPath.

The layout is really ugly: I am trying to get content out of a table within a table within a table. Firebug's FirePath gives me the following path for the table cell I want:

 html/body/table/tbody/tr[3]/td/table[1]/tbody/tr[2]/td[1]/table[1]/tbody/tr[3]/td[4]

After endless experimenting I found out that, with a standalone table, I need to remove the "tbody" steps from the path to make it work. But this doesn't seem to be enough for tables within tables. So my question is: how do I best get content out of tables within tables within tables?

I uploaded the file which I am trying to scrape here:1
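
A minimal sketch of the "tbody" observation, assuming hypothetical sample markup (three nested tables written without tbody, not the uploaded file). Browsers add tbody elements to the DOM when the source omits them, so a path copied from FirePath contains them, while PHP's DOMDocument::loadHTML keeps the structure as written. The variable names and the shortened path are illustrative only.

```php
<?php
// Hypothetical sample markup: three nested tables, written without <tbody>.
$html = <<<HTML
<html><body>
<table>
  <tr><td>outer
    <table>
      <tr><td>middle
        <table>
          <tr><td>a</td><td>b</td></tr>
          <tr><td>c</td><td>GRABME</td></tr>
        </table>
      </td></tr>
    </table>
  </td></tr>
</table>
</body></html>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Browser-style path (with tbody) versus the same path with every tbody step removed.
$withTbody = '/html/body/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[2]/td[2]';
$withoutTbody = str_replace('/tbody', '', $withTbody);

$countWith = $xpath->query($withTbody)->length;
$cell = $xpath->query($withoutTbody)->item(0);
$cellText = $cell ? trim($cell->textContent) : null;

echo "with tbody: $countWith match(es)\n";
echo "without tbody: $cellText\n";
```

In this sketch the tbody steps have to be removed at every nesting level, not only the first. If a real page still returns nothing after that, its source markup may differ from the browser's repaired DOM in other ways, which this example does not cover.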


Answer 1:


I went through the same problem as you, scraping complicated, poorly formatted HTML where I wanted to get the values from a table nested inside other tables.

I came up with the approach of targeting the part I want with a series of functions like this:

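class Scraper { // hypothetical wrapper so that $this->xpath exists
    private $xpath;
    function __construct($html) {
        $dom = new DOMDocument();
        @$dom->loadHTML($html); // @ silences warnings about malformed markup
        $this->xpath = new DOMXPath($dom);
    }
    function parse_html() { // picks the specific part of the page I chose to extract
        $ids = $this->xpath->query('//tr[@data-eventid]/@data-eventid'); // IDs of the rows I want
        $this->parse_table();
    }
    function parse_table() { // extracts the content of one row
        $titles = $this->xpath->query('//tr[@data-eventid="405412"]/td[@class="impact"]/span[@title]/@title');
        // ...etc
        $this->parseEvaluate();
    }
    function parseEvaluate() {
        // ...verifying that the values are correct
    }
}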

Just giving you the idea.




Answer 2:


How about:

//*[contains(text(),"GRABME")]

I know that's probably not what you want, but you get the idea. Identify a pattern and use that pattern to construct the XPath.
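
As a rough sketch of this pattern idea in PHP, assuming hypothetical sample markup in which the wanted cell sits in the innermost of three tables (the markup and variable names are made up for illustration). It compares contains(text(), ...) with contains(., ...) and then queries relative to the matched cell instead of spelling out the full path.

```php
<?php
// Hypothetical sample markup: the wanted cell sits in the innermost of three tables.
$html = <<<HTML
<table><tr><td>outer
  <table><tr><td>middle
    <table>
      <tr><td>label</td><td>GRABME</td></tr>
    </table>
  </td></tr></table>
</td></tr></table>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// contains(text(), ...) tests a cell's first own text node, so only the innermost cell matches.
$own = $xpath->query('//td[contains(text(), "GRABME")]');

// "." is the full string value including nested tables, so every enclosing cell matches too.
$any = $xpath->query('//td[contains(., "GRABME")]');

// From the matched cell, step to its nearest enclosing table and query relative to it.
$cell = $own->item(0);
$table = $xpath->query('ancestor::table[1]', $cell)->item(0);
$cells = [];
foreach ($xpath->query('./tr/td', $table) as $td) {
    $cells[] = trim($td->textContent);
}

echo $own->length, ' ', $any->length, ' ', implode(',', $cells), "\n";
```

Passing a context node as the second argument of DOMXPath::query keeps each expression short, which may be easier to maintain on deeply nested layouts than one long absolute path.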



Source: https://stackoverflow.com/questions/13870103/xpath-table-within-table
