How can I use a regular expression to extract data from a string value?

Time: 2022-09-13 11:10:22

Hello, I have the following string:

Country number Time Status USA B30111 11:15 ARRIVED PARIS NC0120 14:40 ON TIME DUBAI RA007 14:45 ON TIME

I need to extract the following info:

country = USA
number = B30111
time = 11:15
status = ARRIVED

country = PARIS
number = NC0120
time = 14:40
status = ON TIME

How can I use regex to extract the above data from it?


2 solutions

#1



You can try this:


(?: (\w+) ([\w\d]+) (\d+\:\d+) (ARRIVED|ON TIME))

Explanation

Because a status can contain more than one word (for example, ON TIME), this pattern has no way to tell where the status ends and the next country begins. You therefore have to list every possible status as an alternative, separated by |, in the regex.

Java Source:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Groups: 1 = country, 2 = number, 3 = time, 4 = status (one of the listed alternatives)
final String regex = "(?: (\\w+) ([\\w\\d]+) (\\d+\\:\\d+) (ARRIVED|ON TIME))";
final String string = "Country number Time Status USA B30111 11:15 ARRIVED PARIS NC0120 14:40 ON TIME DUBAI RA007 14:45 ON TIME\n\n\n";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);

// Each successful find() moves on to the next record in the input
while (matcher.find()) {
    System.out.println("country =" + matcher.group(1));
    System.out.println("number =" + matcher.group(2));
    System.out.println("time =" + matcher.group(3));
    System.out.println("status =" + matcher.group(4));
    System.out.println();
}

Output:

country =USA
number =B30111
time =11:15
status =ARRIVED

country =PARIS
number =NC0120
time =14:40
status =ON TIME

country =DUBAI
number =RA007
time =14:45
status =ON TIME
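
If listing every status by hand is impractical, a possible alternative (not part of the answer above) is to match the status lazily and stop it with a lookahead at the start of the next record. This is only a sketch under stated assumptions: every record is a one-word country, a one-word number and a time, followed by at least one status word, and no status contains a word pair followed by a time. The class name FlightRegexSketch and the parse helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlightRegexSketch {
    // Groups: 1 = country, 2 = number, 3 = time, 4 = status.
    // The status is matched lazily and must be followed either by the start
    // of the next record (" word word time") or by the end of the input.
    private static final Pattern RECORD = Pattern.compile(
            "(\\w+) (\\w+) (\\d+:\\d+) (.+?)(?= \\w+ \\w+ \\d+:\\d+|\\s*$)");

    // Returns one {country, number, time, status} array per record found.
    static List<String[]> parse(String input) {
        List<String[]> records = new ArrayList<>();
        Matcher m = RECORD.matcher(input);
        while (m.find()) {
            records.add(new String[] {m.group(1), m.group(2), m.group(3), m.group(4)});
        }
        return records;
    }

    public static void main(String[] args) {
        String input = "Country number Time Status USA B30111 11:15 ARRIVED "
                + "PARIS NC0120 14:40 ON TIME DUBAI RA007 14:45 ON TIME";
        for (String[] r : parse(input)) {
            System.out.println("country = " + r[0]);
            System.out.println("number = " + r[1]);
            System.out.println("time = " + r[2]);
            System.out.println("status = " + r[3]);
            System.out.println();
        }
    }
}
```

The header row is skipped without special handling because "Time" and "Status" never satisfy the time group, so the first match starts at USA.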

#2



If you split the string with the split function, you get an array that contains each word.

String[] splitted = str.split(" ");

Then, to check the result, try this:

for (String test : splitted) {
    System.out.println(test);
}

This looks more like a CSV file.
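
Splitting on spaces only gives the individual words; they still have to be regrouped into records, because a status such as ON TIME spans two array elements. A minimal sketch of that regrouping follows, assuming the first four tokens are the header, each record has a one-word country, a one-word number, a time and at least one status word, and no status word looks like a time. The class SplitRecordsSketch and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SplitRecordsSketch {
    // A token counts as a time if it looks like digits:digits, e.g. 11:15.
    static boolean isTime(String token) {
        return token.matches("\\d+:\\d+");
    }

    // Regroups the split tokens into {country, number, time, status} arrays.
    static List<String[]> parse(String str, int headerTokens) {
        String[] t = str.trim().split("\\s+");
        List<String[]> records = new ArrayList<>();
        int i = headerTokens; // skip "Country number Time Status"
        while (i + 3 < t.length) {
            String country = t[i];
            String number = t[i + 1];
            String time = t[i + 2];
            i += 3;
            StringBuilder status = new StringBuilder();
            // Keep consuming status words until the current token starts the
            // next record, which is the case when the token two places ahead is a time.
            while (i < t.length && !(i + 2 < t.length && isTime(t[i + 2]))) {
                if (status.length() > 0) {
                    status.append(' ');
                }
                status.append(t[i++]);
            }
            records.add(new String[] {country, number, time, status.toString()});
        }
        return records;
    }

    public static void main(String[] args) {
        String str = "Country number Time Status USA B30111 11:15 ARRIVED "
                + "PARIS NC0120 14:40 ON TIME DUBAI RA007 14:45 ON TIME";
        for (String[] r : parse(str, 4)) {
            System.out.println(String.join(" | ", r));
        }
    }
}
```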

