Can a single regex capture group capture a phrase without some of its intermediate characters?

Time: 2022-09-13 09:27:00

I'm working on an XML file that lists regexes that are to be used as capture groups. Why it's done this way is a long story and not something I can change.

I've just come upon a situation where I want to capture a name that spans two lines, e.g. Bob\nJones. Is there any way for a Perl regex to capture that whole name into one capture group, without using any other capture groups? Basically, what I want is $1 = "Bob Jones", with the \n replaced by a space.

I'm thinking this isn't feasible and the right way would just be to use two capture groups, one for the first name and one for the last name (which I can't do in my case), but I figured I'd ask anyway before I give up on it. Any ideas?

2 answers

#1


6  

No.
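
To illustrate why the answer is no: a capture group always holds an exact, contiguous slice of the input, so the regex itself cannot swap the \n for a space. The sketch below assumes a hypothetical input string and a hypothetical "Name:" prefix, and it assumes you are allowed to post-process the captured value after the match, which may not be possible in the asker's XML-driven setup.

```perl
use strict;
use warnings;

# Hypothetical input: a name split across two lines.
my $text = "Name: Bob\nJones\nAge: 42\n";

my $name;
# One capture group that spans the line break.
if ($text =~ /Name: (\w+\n\w+)/) {
    $name = $1;          # "Bob\nJones": $1 is an exact slice of the input
    $name =~ tr/\n/ /;   # post-process the copy: newline becomes a space
}

print "$name\n";         # prints "Bob Jones"
```

The replacement happens on a copy of $1 after the match, not inside the capture group.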

#2


1  

Maybe you should look at some of the XML parser modules. XML::Simple is pretty ...well... simple and can parse the XML file better than you can with just regular expressions. As you found, sooner or later, you'll get to a point where the regular expressions start to get quite convoluted as you attempt to parse each and every possible combination.

I wish the standard Perl install came with XML, HTML, and LWP modules. A significant number of my Perl scripts need HTML access or XML parsing, and it's sometimes not possible to download and compile the modules you need from CPAN. I believe XML::Simple needs a few other XML modules in order to work (XML::SAX comes to mind), but there's no C code compilation.

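That means you can place the XML::Simple module in the directory with your Perl script. Before Perl 5.26, the @INC array contained the current directory by default; since 5.26 it no longer does, so on newer versions you need to add the directory yourself, for example with the use lib pragma.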
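
A minimal sketch of the use lib approach, assuming the bundled modules are copied into a lib subdirectory next to the script (the directory name is a choice made for this example, not something the answer specifies). FindBin is a core module that reports the directory of the running script, which is more reliable than depending on the current working directory.

```perl
use strict;
use warnings;
use FindBin qw($Bin);   # core module: directory containing the running script
use lib "$Bin/lib";     # prepend <script dir>/lib to @INC at compile time

# A bundled pure-Perl module would then load as usual, for example:
#   use XML::Simple;    # assuming it was copied to $Bin/lib/XML/Simple.pm

# Confirm the directory is on the module search path.
my $found = grep { $_ eq "$Bin/lib" } @INC;
print "bundled lib dir is on \@INC\n" if $found;
```

Any non-core dependencies of the bundled module would have to be copied into the same directory tree.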
