Question
I have the following XML:
<datafield tag="007G">
<subfield code="c">GBV</subfield>
<subfield code="0">688845614</subfield>
</datafield>
and I am trying to extract the content of the <subfield code="0"> element:
688845614
This is my code:
@echo off
for /F "tokens=2 delims=>/<" %%i in ('findstr "007G" curlread.txt') do echo %%i
pause
but the only output I get is <datafield tag="007G">.
There may be many <datafield tag="007G"> elements in the XML document, and I need the <subfield code="0"> value from each of them.
Answer 1:
It's always better to parse structured markup language as hierarchical data, rather than as flat text to scrape.
To return the data from only the first <subfield code="0"> node, replace your findstr command with the following:
powershell "([xml](gc curlread.txt)).selectSingleNode('//subfield[@code=0]/text()').data"
If there are multiple <subfield code="0"> nodes and you want the data from all of them, use this instead:
powershell "([xml](gc curlread.txt)).selectNodes('//subfield[@code=0]/text()') | %%{ $_.data }"
XPath for the win. You can also restrict the match to <subfield code="0"> nodes that are children of <datafield tag="007G"> by modifying the XPath selector like this:
//datafield[@tag=\"007G\"]/subfield[@code=0]/text()
Important: Quotation marks in the XPath must be backslash escaped.
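Dropping that selector into the selectNodes command gives a complete batch line. This is a minimal sketch, assuming curlread.txt (the file name from the question) holds well-formed XML with a single root element:

```batch
@echo off
rem Sketch: print every code="0" subfield that sits under a 007G datafield.
rem Assumes curlread.txt is well-formed XML with one root element.
rem The quotation marks inside the XPath are backslash escaped.
powershell "([xml](gc curlread.txt)).selectNodes('//datafield[@tag=\"007G\"]/subfield[@code=0]/text()') | %%{ $_.data }"
pause
```

The doubled %% is only needed inside a batch file; at an interactive cmd prompt a single % is used. Note also that @code=0 is a numeric comparison in XPath 1.0, so a non-numeric code such as "c" never matches it.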
Edit: Given the XML you pasted in your comment below:
<datafield tag="007G">
<subfield code="c">GBV</subfield>
<subfield code="0">688845614</subfield>
</datafield>
<datafield tag="008G">
<subfield code="c">GBV</subfield>
<subfield code="0">68614</subfield>
</datafield>
... be advised that this is not well-formed XML. A well-formed XML document has exactly one root element. Before your data can be parsed, you'll have to enclose it in a root tag.
Here's an example of how to do that:
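@echo off & setlocal
set "xml=curlread.xml"
rem // Note that quotation marks in the XPath must be backslash escaped
set "xpath=//datafield[@tag=\"007G\"]/subfield[@code=0]/text()"
rem // Join the lines of the file into one string first: -f would otherwise
rem // treat each line as a separate argument and {0} would be line 1 only
for /f "delims=" %%I in (
    'powershell "([xml]('<r>{0}</r>' -f ((gc %xml%) -join ''))).selectNodes('%xpath%') | %%{$_.data}"'
) do (
    rem // Copy the loop variable before enabling delayed expansion
    set "subfield=%%I"
    setlocal enabledelayedexpansion
    echo something useful with !subfield!
    endlocal
)
pause
goto :EOF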
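If the values are needed after the loop rather than one at a time, they can be collected into numbered variables. The following is only a sketch: the count and id[n] variable names are made up for illustration, and it assumes the extracted values contain no exclamation marks (delayed expansion is enabled for the whole script here). The lines of the file are joined into a single string before being wrapped in the <r> root.

```batch
@echo off & setlocal enabledelayedexpansion
set "xml=curlread.xml"
set "xpath=//datafield[@tag=\"007G\"]/subfield[@code=0]/text()"
set /a count=0
for /f "delims=" %%I in (
    'powershell "([xml]('<r>{0}</r>' -f ((gc %xml%) -join ''))).selectNodes('%xpath%') | %%{$_.data}"'
) do (
    rem Store each match as id[1], id[2], ... (hypothetical names)
    set /a count+=1
    set "id[!count!]=%%I"
)
rem List everything that was collected
for /L %%N in (1,1,%count%) do echo id[%%N] = !id[%%N]!
pause
```

When no node matches, count stays at 0 and the second loop prints nothing.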
Source: https://stackoverflow.com/questions/41106654/how-can-i-get-the-content-of-the-subfield-with-batch-script