Removing certain XML file entries

Submitted by 自闭症网瘾萝莉.ら on 2020-01-06 07:47:38

Question


At the moment I am working with a huge file that contains hundreds of thousands of XML entries. After changing them I have to upload them to specific systems as a new database. The file contents look like this:

   <Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">Product</Data></Cell>
    <Cell><Data ss:Type="String">Home &gt; Connectors &gt; Power Entry</Data></Cell>
    <Cell><Data ss:Type="Number">10430</Data></Cell>
    <Cell><Data ss:Type="String">CAMDEN-BOSS CONTACT, 6AWG, 75A CBCAG14</Data></Cell>
    <Cell><Data ss:Type="String">CONTACT, 6AWG, 75A; Connector Mounting:Cable; Contact Termination:Crimp; Current Rating:75A; SVHC:No SVHC (18-Jun-2012); Series:CBC; Voltage Rating:600V; Flammability Rating:UL94 V0; Wire Area Size Max:11mm; Wire Size AWG Max:6AWG; Wire Size AWG Min:6AWG&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Price for pack of: 1&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Country Of Origin: CN&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://LALA.co.uk/datasheets/1508502.pdf&quot;&gt;&lt;img alt=&quot;&quot; src=&quot;/ekmps/shops/LALA/resources/Design/icon-pdf.gif&quot; style=&quot;width: 16px; height: 16px;&quot; /&gt;&amp;nbsp;Technical Data Sheet&lt;/a&gt;&lt;br /&gt;</Data></Cell>
   </Row>

My job is to remove all the entries that do not contain any link to a .pdf file. The example above has one, so it would be kept, but if "http://LALA.co.uk/datasheets/1508502.pdf" were not in the description, the whole row should be removed. I can work with different tools, from C# to.., so the type of solution doesn't really matter. Can anyone suggest something?


Answer 1:


In Notepad++, open the Replace dialog (Ctrl+H) and put this in "Find what":

<Row[^>]*>((?!\.pdf).)*?</Row>

Replace with

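(leave blank)

The "Regular expression" search mode has to be selected and the ". matches newline" box has to be checked. The pattern matches a whole Row element only when ".pdf" does not occur anywhere between its opening and closing tags, so rows that contain a .pdf link are left untouched.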

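If the file is too large to handle comfortably in an editor, the same regular expression can be applied from a script. The following is only a minimal sketch in Python, not part of the original answer: the function name `remove_rows_without_pdf` is made up for illustration, and it assumes the whole file fits in memory and that Row elements are never nested.

```python
import re

# Same pattern as in the Notepad++ answer, with a non-capturing group.
# re.DOTALL plays the role of the ". matches newline" checkbox.
ROW_WITHOUT_PDF = re.compile(r"<Row[^>]*>(?:(?!\.pdf).)*?</Row>", re.DOTALL)


def remove_rows_without_pdf(xml_text: str) -> str:
    """Delete every <Row>...</Row> block whose body never contains '.pdf'.

    Hypothetical helper for illustration. Rows that do contain '.pdf'
    are returned unchanged.
    """
    return ROW_WITHOUT_PDF.sub("", xml_text)
```

To use it, read the exported file into a string, pass it through the function, and write the result to a new file so the original stays intact. Like the editor version, this leaves the whitespace that surrounded each removed row in place.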


Source: https://stackoverflow.com/questions/13987291/removing-certain-xml-file-entries
