Get a set of numbers from a string

Time: 2023-02-05 19:53:46

I have a file that contains strings, which in turn contain 10-digit numbers. I need to extract the numbers with a regex and put them in an array. I think I can use \d{10}, but I'm not sure how to apply that in Java.

There is an additional complication: when there are many numbers, some may appear in a different form, such as 123456745-9 or 123456745-95, signifying a range. I'd like to extract those as well. (I can expand the ranges in Java myself; regex is not needed for that.)

Any tips would be appreciated!

2 solutions

#1


3  

You could split on runs of non-digit characters while keeping the hyphen (-):

String[] numbers = input.split("[^\\-\\d]+");

Example:

// requires: import java.util.Arrays;
String input = "bla bla bla 123456789 bla bla 123456789 bla bla 123456765-9 bla bla bla 123456767-89 bla bla";
// remove any leading non-digits so that split() does not return an empty first element
input = input.replaceFirst("^[^\\-\\d]*", "");
// split on every run of characters that are neither digits nor '-'
String[] numbers = input.split("[^\\-\\d]+");
System.out.println(Arrays.toString(numbers));

outputs:

[123456789, 123456789, 123456765-9, 123456767-89]
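
One caveat with splitting this way: any hyphen in the surrounding text (as in "well-known") survives as a token of its own. Below is a minimal sketch of one way to guard against that, assuming every wanted token looks like "digits" or "digits-digits". The class name SplitAndFilter and the helper extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SplitAndFilter {
    // Splits on anything that is neither a digit nor '-', then keeps only
    // tokens shaped like "digits" or "digits-digits" (an assumed format).
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        for (String token : input.split("[^\\-\\d]+")) {
            if (token.matches("\\d+(-\\d+)?")) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a well-known id 123456789 - then 123456765-9 and --";
        System.out.println(extract(input)); // prints [123456789, 123456765-9]
    }
}
```

Because empty and hyphen-only tokens are filtered out, the replaceFirst step is not needed in this variant.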

#2


0  

The regex is simpler than you think. You just need to match one or more digits, optionally followed by a hyphen and one or more further digits.

Example:

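// requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String line = "a line with some digits 123456745-9 and maybe some more 343-34 and a single 1 99 ";
// group 1: the number; group 3: the optional part after the hyphen
String regexpattern = "(\\d+)(-(\\d+))?";
Pattern pattern = Pattern.compile(regexpattern);
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
    System.out.println("number= '" + matcher.group(1) + "'");
    // group(3) is null when the number has no "-digits" suffix
    if (matcher.group(3) != null) {
        System.out.println("ranges to '" + matcher.group(3) + "'");
    }
}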

The output would be the following:

number= '123456745'
ranges to '9'
number= '343'
ranges to '34'
number= '1'
number= '99'
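
Since the question asks for an array, here is a rough sketch of collecting the matches instead of printing them, using the same pattern. The range expansion rests on an assumption the question does not confirm: that in 123456745-9 the suffix replaces the trailing digits of the start value, so the range ends at 123456749. The names NumberExtractor, extract and expand are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberExtractor {
    // Same pattern as above: digits, optionally followed by "-digits".
    private static final Pattern NUMBER = Pattern.compile("(\\d+)(-(\\d+))?");

    // Collects every whole match (e.g. "123456745" or "123456745-9") into an array.
    static String[] extract(String line) {
        List<String> found = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(line);
        while (matcher.find()) {
            found.add(matcher.group()); // group() with no argument is the whole match
        }
        return found.toArray(new String[0]);
    }

    // ASSUMPTION: the suffix replaces the trailing digits of the start value,
    // so "123456745-9" runs from 123456745 to 123456749 inclusive.
    // Returns an empty list if the token is malformed or the end is below the start.
    static List<Long> expand(String token) {
        List<Long> values = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(token);
        if (!matcher.matches()) {
            return values;
        }
        String start = matcher.group(1);
        String suffix = matcher.group(3);
        long first = Long.parseLong(start);
        long last = first; // a plain number is a range of one
        if (suffix != null && suffix.length() <= start.length()) {
            String prefix = start.substring(0, start.length() - suffix.length());
            last = Long.parseLong(prefix + suffix);
        }
        for (long value = first; value <= last; value++) {
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "digits 123456745-9 and a single 1 99";
        for (String token : extract(line)) {
            System.out.println(token + " -> " + expand(token));
        }
    }
}
```

To accept only 10-digit numbers, as in the original question, the first group could be tightened to \\d{10}. Note that Long.parseLong throws NumberFormatException for digit runs too long to fit in a long.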
