Adding/removing XML tags using a bash script

Time: 2021-11-16 12:39:27

I have an XML file that I want to configure using a bash script. For example, if I had this XML:

<a>

  <b>
    <bb>
        <yyy>
            Bla 
        </yyy>
    </bb>
  </b>

  <c>
    <cc>
      Something
    </cc>
  </c>

  <d>
    bla
  </d>
</a>

(confidential info removed)

I would like to write a bash script that will remove section <b> (or comment it out) but keep the rest of the XML intact. I am pretty new to the whole scripting thing. I was wondering if anyone could give me a hint as to what I should look into.

I was thinking that sed could be used, except that sed is a line editor. I think it would be easy to remove the <b> tags themselves; however, I am unsure whether sed would be able to remove all the text between the <b> tags.

I will also need to write a script to add back the deleted section.

6 solutions

#1


23  

This would not be difficult to do in sed, as sed also works on ranges.

Try this (assuming xml is in a file named foo.xml):

sed -i '/<b>/,/<\/b>/d' foo.xml

-i writes the change back into the original file (use -i.bak to keep a backup copy of the original; BSD/macOS sed always expects a suffix argument after -i, which may be empty).

This sed command performs the action d (delete) on all of the lines selected by the range:

# all of the lines between a line that matches <b>
# and the next line that matches <\/b>, inclusive
/<b>/,/<\/b>/

So, in plain English, this command deletes every line from the line with <b> through the line with </b>, inclusive.
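To try the range delete without touching a real file, here is a minimal sketch. It assumes the sample XML from the question is recreated as foo.xml in a scratch directory; the directory and the restored.xml name are only for illustration. One caveat: sed looks for the closing pattern only on lines after the one that opened the range, so this relies on <b> and </b> sitting on separate lines, as they do in the question.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample document from the question.
cat > foo.xml <<'EOF'
<a>

  <b>
    <bb>
        <yyy>
            Bla
        </yyy>
    </bb>
  </b>

  <c>
    <cc>
      Something
    </cc>
  </c>

  <d>
    bla
  </d>
</a>
EOF

# Delete every line from the one containing <b> through the one containing </b>.
# -i.bak edits in place and keeps the original as foo.xml.bak.
sed -i.bak '/<b>/,/<\/b>/d' foo.xml

cat foo.xml   # the <b> section is gone, <c> and <d> are untouched

# The backup still holds the deleted section, so the simplest way to
# "add it back" is to copy the backup (restored.xml is an illustrative name).
cp foo.xml.bak restored.xml
```

Restoring from the backup brings back the whole original file, so any other edits made to foo.xml after the delete would be lost; that may or may not matter for a configuration toggle like this one.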

If you'd rather comment out the lines, try one of these:

# block comment
sed -i 's/<b>/<!-- <b>/; s/<\/b>/<\/b> -->/' foo.xml

# comment out every line in the range
sed -i '/<b>/,/<\/b>/s/.*/<!-- & -->/' foo.xml
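Commenting the section out instead of deleting it also covers the second half of the question, because the block comment can be reversed. The following is a rough sketch under a few assumptions: <b> occurs only once in the file, the section contains no "--" (which XML does not allow inside comments), and the section is not already commented out. The function names comment_b and uncomment_b are made up for illustration, and a temporary file is used instead of -i to sidestep differences between sed implementations.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Wrap the <b> section in a single XML block comment.
comment_b() {
    sed 's/<b>/<!-- <b>/; s/<\/b>/<\/b> -->/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Undo comment_b by stripping the comment markers again.
uncomment_b() {
    sed 's/<!-- <b>/<b>/; s/<\/b> -->/<\/b>/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Demo on a small scratch document.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '<a>' '  <b>' '    <bb>Bla</bb>' '  </b>' '  <c>Something</c>' '</a>' > foo.xml
cp foo.xml original.xml

comment_b foo.xml
cp foo.xml commented.xml    # keep a copy of the commented state for inspection
cat commented.xml           # the <b> section now sits inside <!-- ... -->

uncomment_b foo.xml
cmp -s foo.xml original.xml && echo "round trip OK"
```

Calling comment_b twice in a row would nest the comment markers, so a real script would probably want to check the current state first.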

#2


14  

Using xmlstarlet:

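# Alternative: delete only the <b> that is a direct child of the root <a>
#xmlstarlet ed -d "/a/b" file.xml > tmp.xml
# Delete every <b> element anywhere in the document
xmlstarlet ed -d "//b" file.xml > tmp.xml
mv tmp.xml file.xml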

#3


9  

You can use an XSLT stylesheet such as this one, which is a modified identity transform. It copies all of the content by default and has an empty template for b that does nothing (effectively deleting it from the output):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<!--Identity transform copies all items by default -->
<xsl:template match="@* | node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!--Empty template to match on b elements and prevent it from being copied to output -->
<xsl:template match="b"/>

</xsl:stylesheet>

Create a bash script that executes the transform using Java and the Xalan command-line utility like this:

java org.apache.xalan.xslt.Process -IN foo.xml -XSL foo.xsl -OUT foo.out

The result is this:

<?xml version="1.0" encoding="UTF-16"?><a><c><cc>
      Something
    </cc></c><d>
    bla
  </d></a>

EDIT: if you would prefer to have the b commented out, to make it easier to put back, then use this stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!--Identity transform copies all items by default -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--Match on b element, wrap in a comment and construct text representing XML structure by applying templates in "comment" mode -->
    <xsl:template match="b">
        <xsl:comment>
            <xsl:apply-templates select="self::*" mode="comment" />
        </xsl:comment>
    </xsl:template>

    <xsl:template match="*" mode="comment">
        <xsl:value-of select="'&lt;'"/>
            <xsl:value-of select="name()"/>
            <!-- Attributes belong inside the start tag, before the closing '>' -->
            <xsl:apply-templates select="@*" mode="comment" />
        <xsl:value-of select="'&gt;'"/>
            <xsl:apply-templates select="node()" mode="comment" />
        <xsl:value-of select="'&lt;/'"/>
            <xsl:value-of select="name()"/>
        <xsl:value-of select="'&gt;'"/>
    </xsl:template>

    <xsl:template match="text()" mode="comment">
        <xsl:value-of select="."/>
    </xsl:template>

    <xsl:template match="@*" mode="comment">
        <xsl:text> </xsl:text>
        <xsl:value-of select="name()"/>
        <xsl:text>="</xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>"</xsl:text>
    </xsl:template>

</xsl:stylesheet>

It produces this output:

<?xml version="1.0" encoding="UTF-16"?><a><!--<b><bb><yyy>
            Bla
        </yyy></bb></b>--><c><cc>
      Something
    </cc></c><d>
    bla
  </d></a>

#4


6  

If you want the most appropriate replacement for sed when working with XML data, it would be an XSLT processor. Like sed, it's a complex language, but one specialized for the task of XML-to-anything transformations.

On the other hand, this does seem to be the point at which I would seriously consider switching to a real programming language, like Python.

#5


3  

@OP, you can use awk, e.g. (the multi-character RS used here needs an awk that supports it, such as gawk):

$ cat file
<a>                              

some text before   <b>
    <bb>
        <yyy>
            Bla
        </yyy>
    </bb>
  </b> some text after

  <c>
    <cc>
      Something
    </cc>
  </c>

  <d>
    bla
  </d>
</a>

$ awk 'BEGIN{RS="</b>"}/<b>/{gsub(/<b>.*/,"")}1' file
<a>

some text before
 some text after

  <c>
    <cc>
      Something
    </cc>
  </c>

  <d>
    bla
  </d>
</a>
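Setting RS to "</b>" relies on multi-character record separators, which POSIX leaves unspecified. For an awk that lacks them, a line-by-line variant along the following lines may work instead. This is only a sketch: it assumes at most one <b> ... </b> section per line, a <b> tag without attributes, and it uses a shortened copy of the sample file above.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Shortened copy of the sample input from this answer.
cat > file <<'EOF'
<a>

some text before   <b>
    <bb>
        <yyy>
            Bla
        </yyy>
    </bb>
  </b> some text after

  <c>
    <cc>
      Something
    </cc>
  </c>
</a>
EOF

awk '
# Outside the section: pass lines through until one contains <b>.
!skip {
    start = index($0, "<b>")
    if (start == 0) { print; next }
    print substr($0, 1, start - 1)   # keep the text before <b>
    $0 = substr($0, start + 3)       # the rest of the line is inside the section
    skip = 1
}
# Inside the section: drop everything up to and including </b>.
skip {
    stop = index($0, "</b>")
    if (stop == 0) next
    print substr($0, stop + 4)       # keep the text after </b>
    skip = 0
}
' file > result

cat result
```

As with the RS version, the text before <b> and the text after </b> end up on separate lines, and a tag that sits alone on its line leaves a blank or whitespace-only line behind.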

#6


2  

# edit the file in place
xmlstarlet ed -L -d "//b" file.xml
