how to edit XML using bash script?

十年热恋 提交于 2019-12-17 05:14:08

问题


<root>
<tag>1</tag>
<tag1>2</tag1>
</root>

Need to change values 1 and 2 from bash


回答1:


To change tag's value to 2 and tag1's value to 3, using XMLStarlet:

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <old.xml >new.xml

Using your sample input:

xmlstarlet ed \
  -u '/root/tag' -v 2 \
  -u '/root/tag1' -v 3 \
  <<<'<root><tag>1</tag><tag1>2</tag1></root>'

...emits as output:

<?xml version="1.0"?>
<root>
  <tag>2</tag>
  <tag1>3</tag1>
</root>



回答2:


You can use the xsltproc command (from package xsltproc on Debian-based distros) with the following XSLT sheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:param name="tagReplacement"/>
  <xsl:param name="tag1Replacement"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>

  </xsl:template>
  <xsl:template match="tag">
    <xsl:copy>
      <xsl:value-of select="$tagReplacement"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="tag1">
    <xsl:copy>
      <xsl:value-of select="$tag1Replacement"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Then use the command:

xsltproc --stringparam tagReplacement polop \
         --stringparam tag1Replacement palap \
         transform.xsl input.xml

Or you could also use regexes, but modifying XML through regexes is pure evil :)




回答3:


my $0.02 in python because its on every server you will ever log in to

import sys, xml.etree.ElementTree as ET

data = ""
for line in sys.stdin:
    data += line

tree = ET.fromstring(data)

nodeA = tree.find('.//tag')
nodeB = tree.find('.//tag1')

tmp = nodeA.text
nodeA.text = nodeB.text
nodeB.text = tmp 

print ET.tostring(tree)

this reads from stdin so you can use it like this:

$ echo '<node><tag1>hi!</tag1><tag>this</tag></node>' | python xml_process.py 
<node><tag1>this</tag1><tag>hi!</tag></node>

EDIT - challenge accepted

Here's a working xmllib implementation (should work back to python 1.6). As I thought it would be more fun to stab my eyes with a fork. The only think I will say about this is it works for the given use case.

import sys, xmllib

class Bag:
    pass

class NodeSwapper(xmllib.XMLParser):
    def __init__(self):
    print 'making a NodeSwapper'
    xmllib.XMLParser.__init__(self)
    self.result = ''
    self.data_tags = {}
    self.current_tag = ''
    self.finished = False

    def handle_data(self, data):
    print 'data: ' + data

    self.data_tags[self.current_tag] = data
    if self.finished:
       return

    if 'tag1' in self.data_tags.keys() and 'tag' in self.data_tags.keys():
        b = Bag()
        b.tag1 = self.data_tags['tag1']
        b.tag = self.data_tags['tag']
        b.t1_start_idx = self.rawdata.find(b.tag1)
        b.t1_end_idx = len(b.tag1) + b.t1_start_idx
        b.t_start_idx = self.rawdata.find(b.tag)
        b.t_end_idx = len(b.tag) +  b.t_start_idx 
        # swap
        if b.t1_start_idx < b.t_start_idx:
           self.result = self.rawdata[:b.t_start_idx] + b.tag + self.rawdata[b.t_end_idx:]
           self.result = self.result[:b.t1_start_idx] + b.tag1 + self.result[b.t1_end_idx:]
        else:
           self.result = self.rawdata[:b.t1_start_idx] + b.tag1 + self.rawdata[t1_end_idx:]
           self.result = self.result[:b.t_start_idx] + b.tag + self.rresult[t_end_idx:]
        self.finished = True

    def unknown_starttag(self, tag, attrs):
    print 'starttag is: ' + tag
    self.current_tag = tag

data = ""
for line in sys.stdin:
    data += line

print 'data is: ' + data

parser = NodeSwapper()
parser.feed(data)
print parser.result
parser.close()



回答4:


Since you give a sed example in one of the comments, I imagine you want a pure bash solution?

while read input; do
  for field in tag tag1; do
    case $input in
      *"<$field>"*"</$field>"* )
        pre=${input#*"<$field>"}
        suf=${input%"</$field>"*}
        # Where are we supposed to be getting the replacement text from?
        input="${input%$pre}SOMETHING${input#$suf}"
        ;;
    esac
  done
  echo "$input"
done

This is completely unintelligent, and obviously only works on well-formed input with the start tag and the end tag on the same line, you can't have multiple instances of the same tag on the same line, the list of tags to substitute is hard-coded, etc.

I cannot imagine a situation where this would be actually useful, and preferable to either a script or a proper XML approach.



来源:https://stackoverflow.com/questions/6873070/how-to-edit-xml-using-bash-script

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!