How to extract a substring from a string by a specific pattern

Time: 2022-09-13 13:17:43

I have a String object like DUMMY_CONTENT_DUMMY. The parts before and after the underscores are gibberish; what I need is the part between the two underscores. Is there a way to extract that content in Java? Do I have to write a regex?

4 solutions

#1


In this case you do not need regex.

String str = "DUMMY_CONTENT_DUMMY";
String content = str.split("_")[1];

#2


String x = "AA_BB_CC";    
String[]  arr = x.split("_");  
String middle = arr[1];

Here, middle holds the middle part, which is "BB" in this case.
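
Indexing arr[1] directly assumes the input really contains two underscores. As a minimal sketch of a more defensive variant, the hypothetical helper below (the class name SplitMiddle and method name middleOf are illustrative, not from the answers) checks the number of parts first. It passes a limit of -1 to split so that trailing empty parts are kept and counted.

```java
final class SplitMiddle {
    // Hypothetical helper: returns the text between the first and second underscore.
    static String middleOf(String input) {
        // A negative limit keeps trailing empty strings, so "A_B_" yields 3 parts.
        String[] parts = input.split("_", -1);
        if (parts.length < 3) {
            throw new IllegalArgumentException("expected at least two underscores: " + input);
        }
        return parts[1];
    }

    public static void main(String[] args) {
        System.out.println(middleOf("DUMMY_CONTENT_DUMMY")); // CONTENT
        System.out.println(middleOf("AA_BB_CC"));            // BB
    }
}
```

Throwing on malformed input is only one possible design choice; returning an empty string or an Optional would work just as well if bad input is expected.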

#3


If you want to do it with a regex, you can use the underscores, together with the characters before and after them, as anchors for lookbehind and lookahead. Working from Friedl's book on regular expressions, I put together this code as an example.

    /* example method from reference
     *   REF:  Friedl, J. Mastering Regular Expressions.  Ch8: Java.  circa p.371.
     */
    public static void simpleRegexTest() {

        String myText = "DUMMY_CONTENT_DUMMY";
        String myRegex = "(?<=.*_)(.*)(?=_.*)";

        // compile the regex into a pattern that can be used repeatedly
        java.util.regex.Pattern p = java.util.regex.Pattern.compile(myRegex);
        // prepare the pattern to be applied to a given string
        java.util.regex.Matcher m = p.matcher(myText);

        // find() looks for the next match in the input:
        // "if (m.find())" would handle the first match only,
        // "while (m.find())" keeps going until there are no more matches
        while (m.find()) {
            // group() gives the text that was matched;
            // in this particular instance, we are interested in group 1
            String matchedText = m.group(1);
            // start() gives the index where the match begins
            int matchedFrom = m.start();
            // end() gives the index just past the end of the match
            int matchedTo = m.end();

            System.out.println("matched [" + matchedText + "] " +
                    " from " + matchedFrom +
                    " to " + matchedTo + " .");
        }
    }

When I ran it, the results returned were: matched [CONTENT] from 6 to 13 .
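
The same span can also be found without lookarounds, by letting a capturing group pick out the text between the underscores. The following is a minimal sketch under that assumption; the class name GroupFind, the method name firstMiddle, and the pattern `_([^_]*)_` are illustrative and do not come from the answer above. It uses start(1) and end(1) so that the reported offsets refer to the group rather than to the whole match, which here also includes the two underscores.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupFind {
    // Underscore, then a run of non-underscore characters (group 1), then underscore.
    private static final Pattern BETWEEN_UNDERSCORES = Pattern.compile("_([^_]*)_");

    // Hypothetical helper: describes the first underscore-delimited span,
    // or returns null when the text contains no such span.
    static String firstMiddle(String text) {
        Matcher m = BETWEEN_UNDERSCORES.matcher(text);
        if (!m.find()) {
            return null;
        }
        // start(1)/end(1) are the offsets of group 1, not of the whole match.
        return m.group(1) + " [" + m.start(1) + ", " + m.end(1) + ")";
    }

    public static void main(String[] args) {
        System.out.println(firstMiddle("DUMMY_CONTENT_DUMMY")); // CONTENT [6, 13)
    }
}
```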

#4


Here's how you do it using RegEx.

String middlePart = yourString.replaceAll("[^_]*_([^_]*)_[^_]*", "$1");
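
One thing to keep in mind with replaceAll is that a string the pattern does not match comes back unchanged, so the caller cannot tell a successful extraction from a failed one. As a sketch only, the hypothetical helper below (the names StrictMiddle and middleOf are assumptions for illustration) applies the same pattern with Matcher.matches(), which requires the whole string to have the prefix_middle_suffix shape, and reports a failed match as an empty Optional.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StrictMiddle {
    // Same pattern as in the replaceAll call: exactly two underscores, middle part in group 1.
    private static final Pattern SHAPE = Pattern.compile("[^_]*_([^_]*)_[^_]*");

    // Hypothetical helper: the middle part if the whole input has the expected shape.
    static Optional<String> middleOf(String input) {
        Matcher m = SHAPE.matcher(input);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(middleOf("DUMMY_CONTENT_DUMMY")); // Optional[CONTENT]
        System.out.println(middleOf("NOUNDERSCORE"));        // Optional.empty
        // For comparison, replaceAll returns non-matching input as-is:
        System.out.println("NOUNDERSCORE".replaceAll("[^_]*_([^_]*)_[^_]*", "$1")); // NOUNDERSCORE
    }
}
```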
