How do I extract a substring using sed?

Time: 2022-09-13 14:22:19

I have a file containing the following lines:

  <parameter name="PortMappingEnabled" access="readWrite" type="xsd:boolean"></parameter>
  <parameter name="PortMappingLeaseDuration" access="readWrite" activeNotify="canDeny" type="xsd:unsignedInt"></parameter>
  <parameter name="RemoteHost" access="readWrite"></parameter>
  <parameter name="ExternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="ExternalPortEndRange" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="InternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="PortMappingProtocol" access="readWrite"></parameter>
  <parameter name="InternalClient" access="readWrite"></parameter>
  <parameter name="PortMappingDescription" access="readWrite"></parameter>

I want to run a command on this file that extracts only the parameter names, as shown in the following output:

$sedcommand file.txt
PortMappingEnabled
PortMappingLeaseDuration
RemoteHost
ExternalPort
ExternalPortEndRange
InternalPort
PortMappingProtocol
InternalClient
PortMappingDescription

What could this command be?

5 solutions

#1


29  

You want awk.

This would be a quick and dirty hack:

awk -F "\"" '{print $2}' /tmp/file.txt

PortMappingEnabled
PortMappingLeaseDuration
RemoteHost
ExternalPort
ExternalPortEndRange
InternalPort
PortMappingProtocol
InternalClient
PortMappingDescription

#2


70  

grep was born to extract things:

grep -Po 'name="\K[^"]*'

Test with your data:

kent$  echo '<parameter name="PortMappingEnabled" access="readWrite" type="xsd:boolean"></parameter>
  <parameter name="PortMappingLeaseDuration" access="readWrite" activeNotify="canDeny" type="xsd:unsignedInt"></parameter>
  <parameter name="RemoteHost" access="readWrite"></parameter>
  <parameter name="ExternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="ExternalPortEndRange" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="InternalPort" access="readWrite" type="xsd:unsignedInt"></parameter>
  <parameter name="PortMappingProtocol" access="readWrite"></parameter>
  <parameter name="InternalClient" access="readWrite"></parameter>
  <parameter name="PortMappingDescription" access="readWrite"></parameter>
'|grep -Po 'name="\K[^"]*'
PortMappingEnabled
PortMappingLeaseDuration
RemoteHost
ExternalPort
ExternalPortEndRange
InternalPort
PortMappingProtocol
InternalClient
PortMappingDescription

#3


41  

sed 's/[^"]*"\([^"]*\).*/\1/'

does the job.
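
As a hedged variant of the command above: that pattern takes whatever sits between the first pair of double quotes on a line, so it relies on name being the first attribute. A minimal sketch that anchors on name=" instead and prints only the lines that match could look like this (the sample line and the variable names are assumptions made up for illustration):

```shell
# Sample line (assumed for illustration): name is deliberately NOT the first attribute.
line='<parameter access="readWrite" name="RemoteHost"></parameter>'

# -n suppresses the default output; the trailing p prints only lines where the substitution matched.
# .*name=" skips everything up to the attribute, \([^"]*\) captures the value up to the closing quote.
result=$(printf '%s\n' "$line" | sed -n 's/.*name="\([^"]*\)".*/\1/p')
printf '%s\n' "$result"
```

Lines without a name attribute produce no output here, instead of being passed through unchanged.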

#4


13  

You should not parse XML with tools like sed or awk; it's error-prone.

If the input changes, for example so that a newline rather than a space appears before the name attribute, these commands will fail some day and produce unexpected results.
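
To illustrate that failure mode, here is a small sketch using a made-up variant of one input line with a line break before the name attribute (the input and variable names are assumptions, not data from the question):

```shell
# Hypothetical input: the same element, but with a newline before the name attribute.
input='<parameter
name="RemoteHost" access="readWrite"></parameter>'

# cut passes through lines that contain no delimiter at all,
# so the bare tag line leaks into the output next to the real name.
out=$(printf '%s\n' "$input" | cut -d'"' -f2)
printf '%s\n' "$out"
```

The output then contains a stray "<parameter" line in addition to RemoteHost, which is the kind of silent breakage a real XML parser avoids.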

If you are really sure that your input will always be formatted this way, you can use cut. It's faster than sed and awk:

cut -d'"' -f2 < input.txt

It would be better to parse it first and extract only the name attribute of each parameter:

xpath -q -e //@name input.txt | cut -d'"' -f2

To learn more about xpath, see this tutorial: http://www.w3schools.com/xpath/

#5


0  

Explaining how you can use cut:

cat yourxmlfile | cut -d'"' -f2

It splits every line in the file on the " delimiter and takes the 2nd field, which is what you wanted.
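
To make the field numbering concrete, here is a small sketch on one line taken from the question; the variable names are only for illustration:

```shell
line='<parameter name="RemoteHost" access="readWrite"></parameter>'

# Splitting on " gives: field 1 = <parameter name=, field 2 = RemoteHost,
# field 3 =  access=, field 4 = readWrite, field 5 = ></parameter>
first=$(printf '%s\n' "$line" | cut -d'"' -f1)
second=$(printf '%s\n' "$line" | cut -d'"' -f2)
fourth=$(printf '%s\n' "$line" | cut -d'"' -f4)

printf '1: %s\n2: %s\n4: %s\n' "$first" "$second" "$fourth"
```

Field 2 is the name only because name happens to be the first quoted attribute on every line of the sample file.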
