What regular expression returns the substring between two characters in a longer string?

Time: 2023-02-07 17:07:12

I have a string in Perl like: "Full Name (userid)" and I want to return just the userid (everything between the "()"'s).

What regular expression would do this in Perl?

5 solutions

#1


This matches one or more word (\w) characters between "(" and ")".

\w matches a word character (alphanumeric or _): not just [0-9a-zA-Z_], but also letters and digits from non-Roman scripts.

my($username) = $str =~ /\((\w+)\)/;
# or
$str =~ /\((\w+)\)/;
my $username  = $1;


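If you need it in an s///, refer to the capture as $1 in the replacement. \1 is meant for backreferences inside the pattern itself; it still works in the replacement, but Perl warns about it when warnings are enabled.

$str =~ s/\((\w+)\)/$1:$1/; # pointless example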
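
As a slightly less pointless illustration, here is a minimal sketch that rewrites the parentheses as square brackets while keeping the captured userid. The variable name $bracketed and the sample string are assumptions made up for this example.

```perl
use strict;
use warnings;

my $str = 'Full Name (userid)';

# Copy the string first, then substitute on the copy so $str stays intact.
# $1 in the replacement holds whatever the (\w+) group captured.
(my $bracketed = $str) =~ s/\((\w+)\)/[$1]/;

print "$bracketed\n";
```

Copying before substituting is only a convenience here; applying s/// directly to $str works the same way if you do not need the original.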


If the userid may contain characters other than \w, these work better because they accept anything up to the closing parenthesis:

my($username) = $str =~ /\(([^\)]+)\)/;
# or
my($username) = $str =~ /\((.+?)\)/;


If your regexp starts to get complicated, I recommend learning about the /x option, which lets you add whitespace and comments inside the pattern.

my($username) = $str =~ / \(  ( [^\)]+ )  \) /x;


Please see perldoc perlre for more information.

If you are just beginning to learn regexps, I recommend reading perldoc perlretut.

#2


Escape the brackets, capture the string in-between. Assuming user ids consist of \w characters only:

my ($userid) = $str =~ /\((\w+)\)/ ;

m// in list context returns the captured matches.
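
The parentheses around the variable are what put the match in list context. A small sketch of the difference, assuming the same sample string as the question (the names $matched and $missing are made up for the example):

```perl
use strict;
use warnings;

my $str = 'Full Name (userid)';

# List context: the match returns the list of captured groups.
my ($userid) = $str =~ /\((\w+)\)/;

# Scalar context: the match only reports success (1) or failure ('').
my $matched = $str =~ /\((\w+)\)/;

# No match in list context yields an empty list, so the variable stays undef.
my ($missing) = 'no parentheses here' =~ /\((\w+)\)/;

print "userid:  $userid\n";
print "matched: $matched\n";
print "missing: ", (defined $missing ? $missing : 'undef'), "\n";
```

Forgetting the parentheses is a common slip: the variable then receives 1 instead of the userid.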

More information on capturing can be found in perlretut:

C:\> perldoc perlretut

#3


When you search for something between brackets, e.g. '< > [ ] ( ) { }', or between more sophisticated delimiters such as xml/html tags, it is usually more robust to build the pattern this way:

opening bracket, something which is NOT closing bracket, closing bracket 

Of course, in your case the 'closing bracket' can be omitted, because the negated character class stops at the first ')' anyway:

my $str = 'Full Name (userid)';
my ($user_id) = $str =~ /\(([^\)]+)/;

#4


In addition to what has been said: if you know that your string has exactly this format, you can also do it without a regexp. If your string is in $s, you could do

chop $s; # throws away last character (by assumption must be closing parenthesis)
$username=substr($s, rindex($s,'(') + 1);

As for the regexp solutions, can you be sure that the full name cannot itself contain a pair of parentheses? If it can, it makes sense to anchor the closing ')' at the end of the string:

$s =~ / [(]     # open paren
        ([^(]+) # at least one non-open paren
        [)]     # closing paren
        $       # end of string
      /x and $username = $1;
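
To see why the anchor matters, here is a short sketch comparing an unanchored pattern with the anchored one. The sample string with a nickname in parentheses is an assumption invented for the example, as are the names $first and $last.

```perl
use strict;
use warnings;

# Hypothetical input where the full name contains its own parentheses.
my $s = 'Full (Nickname) Name (userid)';

# Unanchored: picks up the first parenthesised group it finds.
my ($first) = $s =~ /\(([^)]+)\)/;

# Anchored with $: only the group that closes the string can match.
my ($last) = $s =~ /\(([^(]+)\)$/;

print "first: $first\n";
print "last:  $last\n";
```

The anchored version relies on the userid being the last thing in the string; trailing text after the ')' would make it fail to match.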

#5


This captures anything between the parentheses, not just alphanumerics and _. That may not matter for your data, but \w will not match usernames containing dashes, pound signs, etc.

$str =~ /\((.*?)\)/ ;
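
A quick sketch of the difference, assuming a made-up userid that contains a dash and a pound sign (the names $word_only and $anything are likewise just for illustration):

```perl
use strict;
use warnings;

# Hypothetical userid with characters outside \w.
my $str = 'Full Name (user-id#42)';

# \w+ stops at the dash, so the closing paren is never reached: no match.
my ($word_only) = $str =~ /\((\w+)\)/;

# .*? lazily takes everything up to the first closing paren.
my ($anything) = $str =~ /\((.*?)\)/;

print "word_only: ", (defined $word_only ? $word_only : 'undef'), "\n";
print "anything:  $anything\n";
```

Note that .*? also matches an empty pair "()", giving an empty string; use .+? if an empty userid should count as no match.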
