Using awk or sed to output specific parts of a file to another file?

Date: 2021-03-26 08:57:49

I have a file that looks like this:

d "Text 1":6,64;1 /filesys1/db1.d2
d "Text 2":6,64;1 /filesys1/db1.d2 f 730
d "Text 3":6,64;1 /filesys1/db1.d2 
d "TextA":6,64;1 /filesys1/db1.d2 f 46000
d "TextB":6,64;1 /filesys1/db1.d2
d "TextC":6,64;1 /filesys1/db1.d2 f 120000
...

I need to extract the text between the quotes and the last 2 characters of each line, and write both to a new file. I can do each piece separately, but I can't combine them into a single command that works.

awk -F'"' '$0=$2' datatmp4 > dataout2

will get me:

Text 1
Text 2
Text 3
TextA
TextB
TextC

and

awk '{ print substr( $NF, length($NF) -1, length($NF) ) }' datatmp4 > dataout

will get me:

d2
30
d2
00
d2
00

What I need is:

Text 1 d2
Text 2 30
Text 3 d2
TextA 00
TextB d2
TextC 00

4 Solutions

#1


3  

You can print both pieces in one command: with a double quote as the field separator, $2 is the text between the quotes, and substr on the last field ($NF) gives its last 2 characters:

awk -F '"' '{print $2, substr($NF, length($NF)-1, length($NF))}' datatmp4 > dataout

#2


3  

You're making things too hard on yourself. There's no reason to care about or try to operate on the last field on the line ($NF) when all you want is the last 2 characters of the whole line:

$ awk -F'"' '{print $2, substr($0,length()-1)}' file
Text 1 d2
Text 2 30
Text 3 2
TextA 00
TextB d2
TextC 00

The third line of output ends in 2<blank> because that's what is in your input file. That doesn't match your posted desired output, though, so be clear: do you want the last 2 chars of each line, as I've shown and as you said you wanted, or the last 2 non-blank chars, as implied by your posted desired output?
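
If the second reading is the one wanted (the last 2 characters once trailing blanks are ignored), one possible sketch is to strip trailing spaces and tabs from a copy of the line before calling substr. The function name last2_nonblank is hypothetical and only used for illustration; the sample lines are the first three from the question, and the third one ends in a blank.

```shell
# Hypothetical helper: print the quoted text, then the last 2 characters
# of the line after trailing spaces and tabs have been removed.
last2_nonblank() {
    awk -F'"' '{
        line = $0                      # work on a copy so $0 and $2 stay intact
        sub(/[ \t]+$/, "", line)       # drop trailing spaces and tabs
        print $2, substr(line, length(line) - 1)
    }' "$@"
}

# First three sample lines from the question; "Text 3" ends in a blank.
printf '%s\n' \
    'd "Text 1":6,64;1 /filesys1/db1.d2' \
    'd "Text 2":6,64;1 /filesys1/db1.d2 f 730' \
    'd "Text 3":6,64;1 /filesys1/db1.d2 ' |
    last2_nonblank
# Text 1 d2
# Text 2 30
# Text 3 d2
```

Called with a file name instead of a pipe (for example last2_nonblank datatmp4 > dataout), it writes the result to a new file as in the question.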

#3


0  

$ awk -F"\"" '{match($NF,/..$/,a); print $2,a[0]}' last2
Text 1 d2
Text 2 30
Text 3 2
TextA 00
TextB d2
TextC 00
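
The three-argument form of match() used above is a GNU awk extension. As a sketch of the same idea for other awks, the two-argument match() sets RSTART and RLENGTH, which can be passed to substr. The function name last2_chars is hypothetical, and this version matches against $0 rather than $NF, which gives the same result whenever the last field has at least 2 characters.

```shell
# Hypothetical helper: same idea as match($NF, /..$/, a), but using only
# the two-argument match() plus RSTART and RLENGTH.
last2_chars() {
    awk -F'"' '{
        tail = ""
        if (match($0, /..$/))          # sets RSTART and RLENGTH on success
            tail = substr($0, RSTART, RLENGTH)
        print $2, tail
    }' "$@"
}

printf '%s\n' 'd "Text 2":6,64;1 /filesys1/db1.d2 f 730' | last2_chars
# Text 2 30
```

Like the original, this keeps a trailing blank if the input line ends in one.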

#4


-1  

With sed (BRE):

sed 's/^[^"]*"\([^"]*\).*\(.[^ ]\)/\1 \2/;' file

Another way with sed (ERE):

sed -E 's/^[^"]*"|"[^ ]*( ).*(.[^ ])/\1\2/g' file

With awk:

awk -F'"' '{ print $2 " " gensub(/.*(.[^ ])/, "\\1", 1)}' file

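The field separator is a double quote. gensub (a GNU awk extension) replaces everything on the line up to and including the last 2 characters whose second character is not a space with just those 2 characters; any blanks after them are left in place.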
