Question
I have an HTML page that contains a number of <tr><td> elements like this:
<tr>
<td class="notextElementLabel width100">address:</td>
<td style="width: 100%" colspan="1" class="formFieldelement"><b>12284,CA</b></td>
</tr>
Let's say the above <tr> is in the 4th position, meaning there are 3 more <tr> elements before it.
Now I want to get the value of the address, so I am doing:
$doc = new DOMDocument();
@$doc->loadHTML($this->siteHtmlData);
$tdElements = $doc->getElementsByTagName("td");
$i = 0;
foreach ($tdElements as $node) {
    if (trim($node->nodeValue) == 'address:') {
        echo "\n\ngot it\n\n";
    } else {
        echo "\n\n---no ---\n\n";
    }
}
How can I get the value "12284,CA"? Please advise.
Thanks
Answer 1:
In your case, the logic behind your query is simple enough that it can be expressed entirely in XPath syntax:
//td[text()="address:"]/following-sibling::td/b/text()
This finds any <td> node whose text equals "address:", takes the following sibling <td>, goes into the <b> inside it, and returns the text it finds there.
That means you can do
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
echo $xpath->evaluate('string(//td[text()="address:"]/following-sibling::td/b)');
It will immediately output the result you are looking for.
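The same XPath idea can be wrapped in a small helper. This is only a sketch under assumptions: the function name findFieldValue and the sample $html string are hypothetical and not part of the original page. It uses normalize-space() rather than text()= so that stray whitespace around the label does not prevent a match, and it uses libxml_use_internal_errors(true) in place of the @ suppression operator.

```php
<?php
// Hypothetical helper: returns the text of the first <td> that follows the
// <td> whose trimmed text equals $label, or null when no such cell exists.
function findFieldValue(string $html, string $label): ?string
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumes $label contains no double quote; XPath 1.0 cannot escape one.
    $query = sprintf(
        '//td[normalize-space(.)="%s"]/following-sibling::td[1]',
        $label
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Sample markup modeled on the question (an assumption, not the real page).
$html = '<table><tr>'
      . '<td class="notextElementLabel width100">address:</td>'
      . '<td class="formFieldelement"><b>12284,CA</b></td>'
      . '</tr></table>';

echo findFieldValue($html, 'address:'), "\n";
```

Returning null for a missing label lets the caller tell "field absent" apart from "field present but empty".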
Answer 2:
You have to get the <tr> elements, then iterate over each one's children, similar to:
$trElements = $doc->getElementsByTagName("tr");
foreach ($trElements as $node) {
    $children = $node->childNodes;
    foreach ($children as $child) {
        echo $child->textContent; // or $child->nodeValue
    }
}
This outputs: address: 12284,CA
Now, if there are more <tr> elements that are not the address, you will need to walk the $children list of nodes until you find "address:". Once you do, the next <td> child holds the value you're looking for.
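The search described above might look roughly like the following. It is a sketch, not a definitive implementation: the sample rows (including the "name:" row) and the $address variable are made up for illustration. It skips any child that is not a <td> element, because childNodes can also contain whitespace text nodes.

```php
<?php
// Sample markup modeled on the question (an assumption, not the real page).
$html = '<table>'
      . '<tr><td>name:</td><td><b>Example</b></td></tr>'
      . '<tr><td>address:</td><td><b>12284,CA</b></td></tr>'
      . '</table>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$address = null; // hypothetical variable holding the result
foreach ($doc->getElementsByTagName('tr') as $row) {
    $labelSeen = false;
    foreach ($row->childNodes as $child) {
        // Skip whitespace text nodes and anything that is not a <td>.
        if (!($child instanceof DOMElement) || $child->tagName !== 'td') {
            continue;
        }
        if ($labelSeen) {
            // The first <td> after the label cell holds the value.
            $address = trim($child->textContent);
            break 2; // leave both loops once the value is found
        }
        if (trim($child->textContent) === 'address:') {
            $labelSeen = true;
        }
    }
}

echo $address ?? 'not found', "\n";
```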
Answer 3:
I found the answer myself; it is similar to nickb's answer:
$tdElements = $doc->getElementsByTagName("td");
$tdCnt = $tdElements->length;
for ($idx = 0; $idx < $tdCnt; $idx++) {
    if (trim($tdElements->item($idx)->nodeValue) == 'address:') {
        echo $tdElements->item($idx + 1)->nodeValue;
    }
}
Hope it helps.
Source: https://stackoverflow.com/questions/11138158/domdocument-parse-html