XML Data extraction

Submitted by 房东的猫 on 2019-12-02 12:12:48

Use a proper XML parser. For example, xsh:

open file.xml ;
ls //Filer//BusinessNameLine1 ;

XPath is your friend: the xmllint tool can evaluate XPath expressions.

xmllint --xpath '//Filer//BusinessNameLine1/text()' yourXML

output:

Stackoverflow

Tested on an example with a <Busn..> tag outside of <Filer>:

kent$  cat t.xml
<root>
        <Trash>
                <BusinessNameLine1>trash</BusinessNameLine1>
        </Trash>
        <Filer>
                <ID>123456789</ID>
                <Name>
                        <BusinessNameLine1>Stackoverflow</BusinessNameLine1>
                </Name>
                <NameControl>stack</NameControl>
                <USAddress>
                        <AddressLine1>123 CHERRY HILL LANE</AddressLine1>
                        <City>LA</City>
                        <State>CA</State>
                        <ZIPCode>90210</ZIPCode>
                </USAddress>
        </Filer>
</root>

kent$  xmllint --xpath '//Filer//BusinessNameLine1/text()' t.xml    
Stackoverflow
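
If xmllint is not available, the same lookup can be sketched with Python's standard library. This is only an illustration, not part of the original answers: the function name filer_business_names and the inline SAMPLE_XML (a trimmed copy of t.xml) are made up for the example. ElementTree supports only a subset of XPath, so the path is written relative to the root element, and it will not match if <Filer> is itself the root.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of t.xml: one BusinessNameLine1 inside <Filer>, one outside it.
SAMPLE_XML = """<root>
    <Trash>
        <BusinessNameLine1>trash</BusinessNameLine1>
    </Trash>
    <Filer>
        <ID>123456789</ID>
        <Name>
            <BusinessNameLine1>Stackoverflow</BusinessNameLine1>
        </Name>
    </Filer>
</root>"""


def filer_business_names(xml_text):
    """Return the text of every BusinessNameLine1 nested under a Filer element."""
    root = ET.fromstring(xml_text)
    # ElementTree paths are relative to the element they are called on,
    # so the leading "." plays the role of the document-level "//" in xmllint.
    nodes = root.findall(".//Filer//BusinessNameLine1")
    return [(node.text or "").strip() for node in nodes]


if __name__ == "__main__":
    for name in filer_business_names(SAMPLE_XML):
        print(name)
```

To read from a file instead of a string, ET.parse("t.xml").getroot() can replace the ET.fromstring call.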

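You could try this combination of awk and sed (a multi-character RS needs an awk that supports it, such as GNU awk, and -r is GNU sed's extended-regex flag):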

$ awk -v RS='</Filer>' '/^<Filer>/ {gsub (/\n/," "); print}' file | sed -r 's/.*<BusinessNameLine1>([^<]*)<\/BusinessNameLine1>.*/\1/g'
Stackoverflow