Java String.split with a multi-character delimiter

Date: 2022-09-29 16:06:13

I have strings that I need to parse that look like this:

"(1,0,quote),(1,0,place),(1,0,hall),(2,0,wall)"

I want to split the string into triplets so that I get:

1,0,quote 
1,0,place 
1,0,hall 
2,0,wall 

How can I do this with String.split? If I use a comma as the delimiter, it would split the fields inside each triplet too. I want to split on the delimiter "),(" instead. How do I do this?

Thanks

3 answers

#1


2  

With split, a pattern that also matches the leading ( leaves an empty string as the first element of the array. Use the Pattern and Matcher classes instead.

Try this code instead:

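import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "(1,0,quote),(1,0,place),(1,0,hall),(2,0,wall)";
// Match each "digits,digits,text" triplet; [^)]+ stops before the closing parenthesis.
Pattern p = Pattern.compile("\\d+,\\d+,[^)]+");
Matcher m = p.matcher(s);

List<String> l = new ArrayList<>();
while (m.find()) {
    l.add(m.group());
}

System.out.println(l);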

Output:

[1,0,quote, 1,0,place, 1,0,hall, 2,0,wall]
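
The empty first element mentioned above can be reproduced with a small sketch. The regex used here is only an assumption about the kind of pattern the answer has in mind (one that matches "),(" as well as a lone parenthesis); the class and method names are hypothetical.

```java
import java.util.Arrays;

class SplitLeadingEmptyDemo {
    // Hypothetical helper: splits on "),(" or on any single parenthesis.
    static String[] splitOnParens(String s) {
        return s.split("\\),\\(|[()]");
    }

    public static void main(String[] args) {
        String s = "(1,0,quote),(1,0,place),(1,0,hall),(2,0,wall)";
        String[] parts = splitOnParens(s);
        // The leading "(" is a delimiter match at index 0, so the text before it
        // (an empty string) becomes the first element. The empty string after the
        // trailing ")" is dropped, because split discards trailing empty strings.
        System.out.println(parts.length);           // 5
        System.out.println(Arrays.toString(parts)); // [, 1,0,quote, 1,0,place, 1,0,hall, 2,0,wall]
    }
}
```

Callers would have to skip parts[0], which is why matching the triplets directly tends to be simpler than splitting here.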

#2


3  

If you split your string on ),( you will not remove the ( at the start or the ) at the end of the string. Consider using the Pattern and Matcher classes to find the elements between ( and ).

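import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "(1,0,quote),(1,0,place),(1,0,hall),(2,0,wall)";

// Capture everything between "(" and the next ")" in group 1.
Pattern p = Pattern.compile("\\(([^)]+)\\)");
Matcher m = p.matcher(text);
while (m.find()) {
    System.out.println(m.group(1));
}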

Output:

1,0,quote
1,0,place
1,0,hall
2,0,wall

If you really want to split on ),( you will need to remove the first ( and the last ) yourself, because split only removes the text that matches the delimiter. You will also have to escape the parentheses, because ( and ) are regex metacharacters (used, for example, to create groups). To do this you can add \\ before each ) and ( yourself, or surround ),( with \\Q and \\E to mark the characters between them as literals. But you don't have to do this by hand: Pattern.quote produces a regex in which every metacharacter is treated as a literal, and you can pass its result to split, like this:

// Assumes the leading `(` and trailing `)` have already been removed from `text`
String[] array = text.split(Pattern.quote("),("));
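
Putting the two steps together, a minimal end-to-end sketch might look like the following. It assumes the input always starts with ( and ends with ); the class and method names are hypothetical and not part of the answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class QuoteSplitDemo {
    // Hypothetical helper: assumes text starts with "(" and ends with ")".
    static String[] splitTriplets(String text) {
        // Drop the first and last character, then split on the literal "),(".
        String inner = text.substring(1, text.length() - 1);
        return inner.split(Pattern.quote("),("));
    }

    public static void main(String[] args) {
        String text = "(1,0,quote),(1,0,place),(1,0,hall),(2,0,wall)";
        String[] quoted = splitTriplets(text);

        // The manually escaped regex is equivalent to the Pattern.quote version.
        String[] escaped = text.substring(1, text.length() - 1).split("\\),\\(");

        System.out.println(Arrays.toString(quoted));       // [1,0,quote, 1,0,place, 1,0,hall, 2,0,wall]
        System.out.println(Arrays.equals(quoted, escaped)); // true
    }
}
```

Pattern.quote is the safer choice when the delimiter comes from a variable, since nothing in it can be misread as regex syntax.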

#3


1  

As you mentioned, you can split at ),( and then, when iterating over the result array, you only need to take into account that array[0] contains an extra ( and array[n-1] contains an extra ).

You could also first apply a regex to remove the leading and trailing parentheses, or take the substring from index 1 to n-2, where n is the length of the string (in Java, str.substring(1, str.length() - 1)), before splitting, etc.
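
Both variants from this answer can be sketched roughly as follows, assuming well-formed input that starts with ( and ends with ). The class and method names are hypothetical.

```java
import java.util.Arrays;

class SplitThenTrimDemo {
    // Variant 1: split first, then fix up the first and last elements.
    static String[] splitThenTrim(String s) {
        String[] parts = s.split("\\),\\(");
        int last = parts.length - 1;
        // parts[0] still begins with "(": drop its first character.
        parts[0] = parts[0].substring(1);
        // parts[last] still ends with ")": drop its last character.
        parts[last] = parts[last].substring(0, parts[last].length() - 1);
        return parts;
    }

    // Variant 2: remove the outer parentheses with a regex, then split.
    static String[] stripThenSplit(String s) {
        String inner = s.replaceAll("^\\(|\\)$", "");
        return inner.split("\\),\\(");
    }

    public static void main(String[] args) {
        String s = "(1,0,quote),(1,0,place),(1,0,hall),(2,0,wall)";
        System.out.println(Arrays.toString(splitThenTrim(s)));  // [1,0,quote, 1,0,place, 1,0,hall, 2,0,wall]
        System.out.println(Arrays.toString(stripThenSplit(s))); // [1,0,quote, 1,0,place, 1,0,hall, 2,0,wall]
    }
}
```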
