How can I extract a value from a string using regex and the shell?

Date: 2022-09-13 11:14:57

I am in a shell and I have this string: 12 BBQ ,45 rofl, 89 lol

Using the regexp: \d+ (?=rofl), I want 45 as a result.

Is it correct to use a regex to extract data from a string? The best I have managed is to highlight the value in some online regex editors. Most of the time the tools I try remove the value from my string instead of returning it.

I am investigating expr, but all I get is syntax errors.

How can I manage to extract 45 in a shell script?

6 answers

#1


44  

You can do this with GNU grep's perl mode:

echo "12 BBQ ,45 rofl, 89 lol"|grep -P '\d+ (?=rofl)' -o

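-P enables Perl-compatible regular expressions, and -o prints only the matching part of the line. With the pattern above the match includes the trailing space; write '\d+(?= rofl)' to get 45 alone.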
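Some grep builds lack -P. A minimal sketch of a fallback, assuming only that -o and -E are available (true of GNU and BSD grep, though -o is not in POSIX): first isolate the number together with its keyword, then keep only the digits. The function name extract_before is made up for this illustration.

```shell
# extract_before KEYWORD: read lines on stdin and print each number that is
# immediately followed by a space and KEYWORD. (Hypothetical helper name.)
# KEYWORD is pasted into the regex, so it should be a plain word.
extract_before() {
    grep -oE "[0-9]+ $1" | grep -oE '^[0-9]+'
}

echo "12 BBQ ,45 rofl, 89 lol" | extract_before rofl
```

The first grep emits "45 rofl" on its own line, and the second keeps the leading digits of that line.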

#2


8  

It seems that you are asking multiple things. To answer them:

  • Yes, it is fine to extract data from a string using regular expressions; that is what they are for.
  • You get errors: which ones, and which shell tool are you using?
  • You can extract the number by capturing it in parentheses:

    .*\D(\d+) rofl.*

    and using $1 (or \1, depending on the tool) to get the captured text out. The .* parts match the rest of the line before and after the number; the \D keeps the greedy leading .* from swallowing all but the last digit.

With sed, for example, you can replace every matching line in a file with just the number. sed does not understand \d and refers to groups as \1 rather than $1, so with extended regular expressions (-E) the command becomes:

sed -E 's/.*[^0-9]([0-9]+) rofl.*/\1/' inputFileName > outputFileName

or:

echo "12 BBQ ,45 rofl, 89 lol" | sed -E 's/.*[^0-9]([0-9]+) rofl.*/\1/'
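When the result is needed in a variable, it helps to print nothing for lines that do not match, since a plain s command passes them through unchanged. A sketch using -n with the p flag; the -E option is assumed to be available (GNU and BSD sed accept it), and the variable names are illustrative.

```shell
input="12 BBQ ,45 rofl, 89 lol"

# -n suppresses automatic printing; the trailing p prints only lines where
# the substitution succeeded, so a non-matching line yields an empty result.
value=$(printf '%s\n' "$input" | sed -nE 's/.*[^0-9]([0-9]+) rofl.*/\1/p')

if [ -n "$value" ]; then
    echo "extracted: $value"
else
    echo "no match" >&2
fi
```

The [^0-9] before the group stops the greedy .* from consuming the first digit of the number.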

#3


6  

Yes, regex can certainly be used to extract part of a string. Unfortunately, different flavours of *nix and different tools use slightly different regex variants.

This sed command should work on most flavours (tested on OS X and Red Hat):

echo '12 BBQ ,45 rofl, 89 lol' | sed  's/^.*,\([0-9][0-9]*\).*$/\1/g'
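The question mentions syntax errors with expr. For comparison, here is a sketch of the same extraction using expr's : operator, which takes a basic regular expression implicitly anchored at the start of the string and prints the text of the first \( \) group. The pattern is an adaptation keyed on the word rofl, not the command from this answer.

```shell
string="12 BBQ ,45 rofl, 89 lol"

# expr STRING : REGEX matches from the beginning of STRING, hence the
# leading .* ; the [^0-9] stops the greedy .* from eating leading digits.
# Each operand must be a separate argument, so the string is quoted.
value=$(expr "$string" : '.*[^0-9]\([0-9][0-9]*\) rofl')
echo "$value"
```

A common source of expr syntax errors is passing an unquoted string, which the shell splits into several operands.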

#4


0  

You can use the shell itself (bash, for example) with parameter expansion:

$ string="12 BBQ ,45 rofl, 89 lol"
$ echo ${string% rofl*}
12 BBQ ,45
$ string=${string% rofl*}
$ echo ${string##*,}
45
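The expansions can be wrapped in a small function. A sketch, assuming the number is separated from the preceding text by a comma or a space as in the sample string; number_before is a hypothetical name, and the variable s is not local because plain POSIX sh has no local keyword.

```shell
# number_before STRING KEYWORD: print the number that sits right before
# " KEYWORD" in STRING. (Hypothetical helper for illustration.)
number_before() {
    s=$1
    case $s in
        *" $2"*) ;;          # keyword present, carry on
        *) return 1 ;;       # keyword absent: print nothing, signal failure
    esac
    s=${s%" $2"*}            # drop the last " KEYWORD" and everything after it
    s=${s##*,}               # drop everything up to the last remaining comma
    s=${s##* }               # drop any text left before a remaining space
    printf '%s\n' "$s"
}

number_before "12 BBQ ,45 rofl, 89 lol" rofl
```

No regular expression is involved here; the patterns are shell globs, which keeps the function free of external tools.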

#5


-1  

You can certainly extract that part of a string, and that's a great way to parse out data. Regular expression syntax varies a lot between tools, so check the documentation for the one you're using. You might try a regular expression like:

[0-9]+ *[a-zA-Z]+ *, *([0-9]+) *[a-zA-Z]+ *, *[0-9]+ *[a-zA-Z]+

If your regex program can do string replacement then replace the entire string with the result you want and you can easily use that result.

You didn't mention if you're using bash or some other shell. That would help get better answers when asking for help.

#6


-1  

You can use rextract to extract using a regular expression and reformat the result.

Example:

[$] echo "12 BBQ ,45 rofl, 89 lol" | ./rextract '[,]([\d]+) rofl' '${1}'
45
