Perl Regex: Matching text that optionally spans multiple lines

Date: 2023-02-08 11:13:26

I'm trying to extract the object file lists from Linux Makefiles. Here are some examples:

Intel E1000E:

e1000e-objs := 82571.o ich8lan.o 80003es2lan.o \
       mac.o manage.o nvm.o phy.o \
       param.o ethtool.o netdev.o ptp.o

Chelsio T3:

cxgb3-objs := cxgb3_main.o ael1002.o vsc8211.o t3_hw.o mc5.o \
       xgmac.o sge.o l2t.o cxgb3_offload.o aq100x.o

Atheros ALX:

alx-objs := main.o ethtool.o hw.o

How can I write a regular expression that returns everything after :=, given that the list may or may not continue onto further lines, and may span more than two of them? Note that the backslashes are part of the Makefile content.

I only know how to hard-code the number of lines, with something like:

my $obj_files_no_ext = "e1000e";
# Captures exactly two lines: the text before the trailing backslash, then the next line
my @filestmp = ($Makefile_contents =~ m/$obj_files_no_ext-objs\s*[\+\:]= (.*)\\\s*\n(.*)/g);

2 Solutions

#1


You can try this:

$obj_files_no_ext-objs\s*:=\s*((?:(?:[^\s\\]*?\.o)[\s\n\r\\]*)+)

This captures, in group 1, the whole list of object files for the given $obj_files_no_ext (continuation backslashes and line breaks included).
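
Since group 1 is a single string with the separators still in it, a second match can turn it into individual file names. A minimal sketch, assuming the Makefile text is already in a string; the sub name `objs_for`, the combined sample text, and the `\Q...\E` quoting of the prefix are additions for illustration, not part of the answer:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the object files listed for "<prefix>-objs".
sub objs_for {
    my ($makefile, $prefix) = @_;

    # The pattern from this answer; \Q...\E only quotes the prefix literally.
    my ($list) = $makefile =~
        m/\Q$prefix\E-objs\s*:=\s*((?:(?:[^\s\\]*?\.o)[\s\n\r\\]*)+)/
        or return;

    # Group 1 still contains backslashes and newlines; keep only the names.
    return $list =~ m/([^\s\\]+\.o)/g;
}

# Two of the question's examples, joined into one string for the demo.
my $makefile = <<'EOF';
e1000e-objs := 82571.o ich8lan.o 80003es2lan.o \
       mac.o manage.o nvm.o phy.o \
       param.o ethtool.o netdev.o ptp.o

alx-objs := main.o ethtool.o hw.o
EOF

print join(" ", objs_for($makefile, "e1000e")), "\n";
print join(" ", objs_for($makefile, "alx")), "\n";
```

The helper returns an empty list when no `<prefix>-objs` assignment is found, so the caller can test the result directly.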

#2


You can try this pattern:

(?>$obj_files_no_ext-objs\s*:=|\G)\s*\K(?>[^\s.]++|\.(?!o(?:\s|$)))++\.o

Pattern details:

(?>                    # open an atomic group
    $obj_files_no_ext  # prefix of the variable name
    -objs\s*:=         # literal "-objs", optional whitespace, then ":="
  |                    # OR
    \G                 # the position where the previous match ended
)                      # close the atomic group
\s*\K                  # optional whitespace; \K drops everything matched so far from the result
(?>                    # open an atomic group (characters allowed in a file name)
    [^\s.]++           # one or more characters that are neither whitespace nor a dot
  |                    # OR
    \.(?!o(?:\s|$))    # a dot, unless it begins a final ".o" (followed by whitespace or end of string)
)++                    # repeat the atomic group one or more times
\.o                    # the ".o" extension

Example:

#!/usr/bin/perl
use strict;
use warnings;

# Sample Makefile text (note: this sample has no continuation backslashes)
my $Makefile_contents = q{e1000e-objs := 82571.o ich8lan.o 80003es2lan.o
   mac.o manage.o nvm.o phy.o
   param.o ethtool.o netdev.o ptp.o};

my $obj_files_no_ext = "e1000e";
my $reg = qr/(?>$obj_files_no_ext-objs\s*:=|\G)\s*\K(?>[^\s.]++|\.(?!o(?:\s|$)))++\.o/;
# In list context with /g, each match (the text after \K) becomes one list element
my @filestmp = $Makefile_contents =~ /$reg/g;
print join(" ", @filestmp), "\n";
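
The sample string in this example has no continuation backslashes. With the backslashes from the question, the pattern as written stops after the first line: `\s*` does not skip a `\`, and `[^\s.]++` then consumes it as if it began a file name. Below is a hedged variant that also skips backslashes. The changes (`\Q...\E`, `\G(?!\A)`, `[\s\\]*`, and excluding `\` from the name) are assumptions added here, not part of the answer, and the sample text combines two of the question's examples for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two of the question's examples, joined into one string for the demo.
my $Makefile_contents = <<'EOF';
cxgb3-objs := cxgb3_main.o ael1002.o vsc8211.o t3_hw.o mc5.o \
       xgmac.o sge.o l2t.o cxgb3_offload.o aq100x.o

alx-objs := main.o ethtool.o hw.o
EOF

my $obj_files_no_ext = "cxgb3";

# Changes relative to the answer's pattern (all assumptions of this sketch):
#   \Q...\E     quotes the prefix literally
#   \G(?!\A)    keeps \G from matching at the very start of the string
#   [\s\\]*     skips continuation backslashes as well as whitespace
#   [^\s.\\]++  keeps a backslash out of the file name
my $reg = qr/(?>\Q$obj_files_no_ext\E-objs\s*:=|\G(?!\A))[\s\\]*\K(?>[^\s.\\]++|\.(?!o(?:\s|$)))++\.o/;

# In list context with /g, each match (the text after \K) is one element.
my @filestmp = $Makefile_contents =~ /$reg/g;
print join(" ", @filestmp), "\n";
```

The chain of `\G` matches breaks at `alx-objs`, since that word is not followed by `.o`, so only the cxgb3 files are returned.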
