Java regex: matching every group that starts with a specific string

Date: 2022-09-13 16:44:56

I have a string like a1wwa1xxa1yya1zz.

I would like to get every group starting with a1, up to but excluding the next a1. (In my example, that would be: a1ww, a1xx, a1yy and a1zz.)

If I use:

Matcher m = Pattern.compile("(a1.*?)a1").matcher("a1wwa1xxa1yya1zz");
while(m.find()) {
  String myGroup = m.group(1);
}

myGroup captures only every other group.
So in my example, I only capture a1ww and a1yy.

Does anyone have a good idea?

3 solutions

#1


5  

Split is a good solution, but if you want to remain in the regex world, here is a solution:

Matcher m = Pattern.compile("(a1.*?)(?=a1|$)").matcher("a1wwa1xxa1yya1zz");
while (m.find()) {
  String myGroup = m.group(1);
  System.out.println("> " + myGroup);
}

I used a positive lookahead to ensure the capture is followed by a1 or, alternatively, by the end of the input.

Lookaheads are zero-width assertions, i.e. they verify a condition without advancing the match cursor, so the text they check remains available to the next match.

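To make the difference concrete, here is a minimal sketch that runs both patterns from this thread against the same input and collects group(1) of each match. The class name LookaheadDemo and the helper groups are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {

    // Collects group(1) of every match of the given regex in the input.
    // (Hypothetical helper, not part of the original answer.)
    static List<String> groups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a1wwa1xxa1yya1zz";

        // The trailing "a1" is part of the match, so it is consumed and
        // the next find() starts after it: every other group is skipped.
        System.out.println(groups("(a1.*?)a1", input));       // [a1ww, a1yy]

        // The trailing "a1" is only checked by the lookahead, so the
        // next find() starts right at it and no group is skipped.
        System.out.println(groups("(a1.*?)(?=a1|$)", input)); // [a1ww, a1xx, a1yy, a1zz]
    }
}
```

The only difference between the two calls is whether the delimiter sits inside the match or inside the lookahead.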

#2


3  

You can use the split() method, then prepend "a1" to each of the split elements:

String str = "a1wwa1xxa1yya1zz";
String[] parts = str.split("a1");
String[] output = new String[parts.length - 1];

for (int i = 0; i < output.length; i++)
    output[i] = "a1" + parts[i + 1];

for (String p : output)
    System.out.println(p);

Output:

a1ww
a1xx
a1yy
a1zz
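
As a possible variation that combines split() with the lookahead idea from answer #1, one could split on a zero-width lookahead so that the "a1" prefix is never removed in the first place. This is only a sketch, assuming Java 8 or later, where a zero-width match at the start of the string does not produce a leading empty element; the class name SplitLookahead and the method splitKeepingPrefix are hypothetical.

```java
import java.util.Arrays;

public class SplitLookahead {

    // Splits at every position that is followed by "a1".
    // The lookahead matches zero characters, so nothing is removed.
    // (Hypothetical helper; assumes Java 8+ split() semantics.)
    static String[] splitKeepingPrefix(String input) {
        return input.split("(?=a1)");
    }

    public static void main(String[] args) {
        String[] parts = splitKeepingPrefix("a1wwa1xxa1yya1zz");
        System.out.println(Arrays.toString(parts)); // [a1ww, a1xx, a1yy, a1zz]
    }
}
```

This avoids the second array and the index shift, at the cost of depending on the Java 8+ behaviour noted above.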

#3


0  

I would use an approach like this:

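    String str = "a1wwa1xxa1yya1zz";
    String[] parts = str.split("a1");
    // parts[0] is the empty string before the first "a1", so start at index 1
    for (int i = 1; i < parts.length; i++) {
        String found = "a1" + parts[i];
        System.out.println(found);
    }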
