I'm trying to write a regex that will parse out the directory and filename of a fully qualified path using matching groups.
So...
/var/log/xyz/10032008.log
would yield group 1 as "/var/log/xyz"
and group 2 as "10032008.log".
Seems simple but I can't get the matching groups to work for the life of me.
NOTE: As pointed out by some of the respondents this is probably not a good use of regular expressions. Generally I'd prefer to use the file API of the language I was using. What I'm actually trying to do is a little more complicated than this but would have been much more difficult to explain, so I chose a domain that everyone would be familiar with in order to most succinctly describe the root problem.
8 Answers
#1
25
Try this:
^(.+)/([^/]+)$
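As a quick check of this pattern, here is a minimal sketch in Python. The choice of Python is an assumption (the question does not name a language), and `split_path` is a hypothetical helper name.

```python
import re

# Greedy (.+) backtracks to the last "/", so group 2 is the final path component.
PATH_RE = re.compile(r"^(.+)/([^/]+)$")

def split_path(path):
    """Return (directory, filename), or None when the pattern does not match."""
    m = PATH_RE.match(path)
    return m.groups() if m else None

print(split_path("/var/log/xyz/10032008.log"))  # ('/var/log/xyz', '10032008.log')
print(split_path("10032008.log"))               # None: there is no "/" at all
print(split_path("/10032008.log"))              # None: nothing precedes the only "/"
```

Because the pattern requires at least one character before the last slash, a bare filename or a file directly under the root does not match.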
#2
12
In languages that support regular expressions with non-capturing groups:
((?:[^/]*/)*)(.*)
I'll explain the gnarly regex by exploding it...
(
  (?:
    [^/]*
    /
  )
  *
)
(.*)
What the parts mean:
( -- capture group 1 starts
(?: -- non-capturing group starts
[^/]* -- greedily match as many non-directory separators as possible
/ -- match a single directory-separator character
) -- non-capturing group ends
* -- repeat the non-capturing group zero-or-more times
) -- capture group 1 ends
(.*) -- capture all remaining characters in group 2
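The same breakdown can be written directly as a commented pattern in languages with a verbose regex mode. A minimal sketch in Python, assuming its `re.VERBOSE` flag (the name `PATH_RE` is only illustrative):

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows "#" comments.
PATH_RE = re.compile(
    r"""
    (               # capture group 1 starts
      (?:           # non-capturing group starts
        [^/]*       # as many non-separator characters as possible
        /           # a single directory-separator character
      )*            # repeat the non-capturing group zero or more times
    )               # capture group 1 ends
    (.*)            # capture all remaining characters in group 2
    """,
    re.VERBOSE,
)

for path in ["/var/log/xyz/10032008.log", "var/log/xyz/10032008.log",
             "10032008.log", "/10032008.log"]:
    # The pattern can match the empty string, so match() never returns None here.
    directory, filename = PATH_RE.match(path).groups()
    print(repr(directory), repr(filename))
```

Group 1 keeps its trailing slash and is empty when the input has no directory part.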
Example
To test the regular expression, I used the following Perl script...
#!/usr/bin/perl -w
use strict;
use warnings;

# Match one path and print the directory (group 1) and filename (group 2).
sub test {
    my $str = shift;
    my $testname = shift;
    # "#" is used as the regex delimiter so that "/" needs no escaping.
    $str =~ m#((?:[^/]*/)*)(.*)#;
    print "$str -- $testname\n";
    print " 1: $1\n";
    print " 2: $2\n\n";
}

test('/var/log/xyz/10032008.log', 'absolute path');
test('var/log/xyz/10032008.log', 'relative path');
test('10032008.log', 'filename-only');
test('/10032008.log', 'file directly under root');
The output of the script...
/var/log/xyz/10032008.log -- absolute path
1: /var/log/xyz/
2: 10032008.log
var/log/xyz/10032008.log -- relative path
1: var/log/xyz/
2: 10032008.log
10032008.log -- filename-only
1:
2: 10032008.log
/10032008.log -- file directly under root
1: /
2: 10032008.log
#3
8
Most languages have path parsing functions that will give you this already. If you have the ability, I'd recommend using what comes to you for free out-of-the-box.
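For instance, in Python (an assumed language here, since the question names none) the standard library's `posixpath.split` does this split without any regex:

```python
import posixpath

# posixpath always treats "/" as the separator, regardless of the host OS.
for path in ["/var/log/xyz/10032008.log", "10032008.log", "/10032008.log"]:
    directory, filename = posixpath.split(path)
    print(repr(directory), repr(filename))

# '/var/log/xyz' '10032008.log'
# '' '10032008.log'
# '/' '10032008.log'
```

Unlike the regex, it drops the trailing slash from the directory except when the directory is the root.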
Assuming / is the path delimiter...
^(.*/)([^/]*)$
The first group will be whatever the directory/path info is, the second will be the filename. For example:
- /foo/bar/baz.log: "/foo/bar/" is the path, "baz.log" is the file
- foo/bar.log: "foo/" is the path, "bar.log" is the file
- /foo/bar: "/foo/" is the path, "bar" is the file
- /foo/bar/: "/foo/bar/" is the path and there is no file.
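The cases listed above can be reproduced with a short Python sketch. Python is an assumption, and `split_keep_slash` is a hypothetical helper name.

```python
import re

# Group 1 is everything up to and including the last "/"; group 2 is the rest.
PATH_RE = re.compile(r"^(.*/)([^/]*)$")

def split_keep_slash(path):
    """Return (path_with_trailing_slash, filename), or None if there is no "/"."""
    m = PATH_RE.match(path)
    return m.groups() if m else None

for path in ["/foo/bar/baz.log", "foo/bar.log", "/foo/bar", "/foo/bar/", "baz.log"]:
    print(path, "->", split_keep_slash(path))
```

The last input contains no slash at all, so this pattern does not match it and the helper returns None.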
#4
4
What language? And why use regex for this simple task?
If you must:
^(.*)/([^/]*)$
gives you the two parts you wanted. You might need to escape the parentheses with backslashes:
^\(.*\)/\([^/]*\)$
depending on your language's regex syntax (POSIX basic regular expressions, as used by sed and by grep without -E, require the backslashes).
But I suggest you just use your language's string search function that finds the last "/" character, and split the string on that index.
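A minimal sketch of that non-regex approach in Python (an assumed language; `split_on_last_slash` is a hypothetical helper, not a library function):

```python
def split_on_last_slash(path):
    """Split a path at its last "/" and return (directory, filename)."""
    index = path.rfind("/")  # rfind returns -1 when "/" does not occur
    if index == -1:
        return "", path      # no directory part at all
    return path[:index], path[index + 1:]

print(split_on_last_slash("/var/log/xyz/10032008.log"))  # ('/var/log/xyz', '10032008.log')
print(split_on_last_slash("10032008.log"))               # ('', '10032008.log')
```

For a file directly under the root, this sketch returns an empty directory string; handle that case separately if "/" is wanted instead.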
#5
1
What about this?
[/]{0,1}([^/]+[/])*([^/]*)
Deterministic:
((/)|())([^/]+/)*([^/]*)
Strict:
^[/]{0,1}([^/]+[/])*([^/]*)$
^((/)|())([^/]+/)*([^/]*)$
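One thing worth checking with these patterns is what the repeated capturing group actually holds. In Python's `re` (an assumed engine; others may differ), a capturing group under `*` keeps only its last repetition. The wrapped variant below is an adjustment for illustration, not part of the answer above.

```python
import re

path = "/var/log/xyz/10032008.log"

# The first strict pattern from above: group 1 sits inside the "*" repetition.
strict = re.compile(r"^[/]{0,1}([^/]+[/])*([^/]*)$")
print(strict.match(path).groups())   # ('xyz/', '10032008.log')

# Illustrative variant: make the repeated group non-capturing and wrap the
# whole directory part in a single capturing group.
wrapped = re.compile(r"^([/]{0,1}(?:[^/]+[/])*)([^/]*)$")
print(wrapped.match(path).groups())  # ('/var/log/xyz/', '10032008.log')
```

With the wrapped form, group 1 is the full directory (including the trailing slash) and group 2 is the filename.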
#6
0
Try this:
/^(\/([^/]+\/)*)(.*)$/
It will leave the trailing slash on the path, though. Note that the filename ends up in group 3, because group 2 is the nested repeated group.
#7
0
A very late answer, but I hope this helps:
^(.+?)/([\w]+\.log)$
This uses a lazy quantifier before the /, and I just modified the accepted answer.
http://regex101.com/r/gV2xB7/1
#8
-4
I would avoid doing that with regex. I would use your language's built-in facilities for parsing path names, and reserve regex for the searches that genuinely require it.