How do I convert CamelCase into human-readable names in Java?

Time: 2022-11-24 21:12:19

I'd like to write a method that converts CamelCase into a human-readable name.

Here's the test case:

public void testSplitCamelCase() {
    assertEquals("lowercase", splitCamelCase("lowercase"));
    assertEquals("Class", splitCamelCase("Class"));
    assertEquals("My Class", splitCamelCase("MyClass"));
    assertEquals("HTML", splitCamelCase("HTML"));
    assertEquals("PDF Loader", splitCamelCase("PDFLoader"));
    assertEquals("A String", splitCamelCase("AString"));
    assertEquals("Simple XML Parser", splitCamelCase("SimpleXMLParser"));
    assertEquals("GL 11 Version", splitCamelCase("GL11Version"));
}

13 answers

#1


299  

This works with your test cases:

static String splitCamelCase(String s) {
   return s.replaceAll(
      String.format("%s|%s|%s",
         "(?<=[A-Z])(?=[A-Z][a-z])",
         "(?<=[^A-Z])(?=[A-Z])",
         "(?<=[A-Za-z])(?=[^A-Za-z])"
      ),
      " "
   );
}

Here's a test harness:

    String[] tests = {
        "lowercase",        // [lowercase]
        "Class",            // [Class]
        "MyClass",          // [My Class]
        "HTML",             // [HTML]
        "PDFLoader",        // [PDF Loader]
        "AString",          // [A String]
        "SimpleXMLParser",  // [Simple XML Parser]
        "GL11Version",      // [GL 11 Version]
        "99Bottles",        // [99 Bottles]
        "May5",             // [May 5]
        "BFG9000",          // [BFG 9000]
    };
    for (String test : tests) {
        System.out.println("[" + splitCamelCase(test) + "]");
    }

It uses a zero-length matching regex with lookbehind and lookahead to find where to insert spaces. There are three patterns, and I use String.format to join them with | so that each one stays readable on its own line.

The three patterns are:

UC behind me, UC followed by LC in front of me

  XMLParser   AString    PDFLoader
    /\        /\           /\

non-UC behind me, UC in front of me

 MyClass   99Bottles
  /\        /\

Letter behind me, non-letter in front of me

 GL11    May5    BFG9000
  /\       /\      /\
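
As a rough sketch of the same idea, the three alternatives can be compiled once into a java.util.regex.Pattern and annotated one by one. The class name CamelCaseSplitter and the constant BOUNDARY are made up for this illustration; the three patterns are the ones from the answer.

```java
import java.util.regex.Pattern;

class CamelCaseSplitter {
    // The three zero-width alternatives from the answer, compiled once.
    private static final Pattern BOUNDARY = Pattern.compile(
          "(?<=[A-Z])(?=[A-Z][a-z])"       // UC behind me, UC followed by LC in front of me
        + "|(?<=[^A-Z])(?=[A-Z])"          // non-UC behind me, UC in front of me
        + "|(?<=[A-Za-z])(?=[^A-Za-z])"    // letter behind me, non-letter in front of me
    );

    static String splitCamelCase(String s) {
        // Every match is an empty position, so replacing it only inserts a space.
        return BOUNDARY.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(splitCamelCase("SimpleXMLParser")); // Simple XML Parser
        System.out.println(splitCamelCase("GL11Version"));     // GL 11 Version
        System.out.println(splitCamelCase("99Bottles"));       // 99 Bottles
    }
}
```

Compiling the pattern up front is only a convenience when the method is called often; String.replaceAll with the same pattern string should give the same results.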

#2


91  

You can do it using org.apache.commons.lang.StringUtils:

StringUtils.join(
     StringUtils.splitByCharacterTypeCamelCase("ExampleTest"),
     ' '
);

#3


9  

If you don't like "complicated" regexes and aren't at all bothered about efficiency, I've used this approach to achieve the same effect in three stages.

String name = 
    camelName.replaceAll("([A-Z][a-z]+)", " $1") // Words beginning with UC
             .replaceAll("([A-Z][A-Z]+)", " $1") // "Words" of only UC
             .replaceAll("([^A-Za-z ]+)", " $1") // "Words" of non-letters
             .trim();

It passes all the test cases above, including those with digits.

As I say, this isn't as good as using the one regular expression in some other examples here - but someone might well find it useful.

#4


6  

You can use org.modeshape.common.text.Inflector.

Specifically:

String humanize(String lowerCaseAndUnderscoredWords,
    String... removableTokens) 

Capitalizes the first word and turns underscores into spaces and strips trailing "_id" and any supplied removable tokens.

Maven artifact is: org.modeshape:modeshape-common:2.3.0.Final

on JBoss repository: https://repository.jboss.org/nexus/content/repositories/releases

Here's the JAR file: https://repository.jboss.org/nexus/content/repositories/releases/org/modeshape/modeshape-common/2.3.0.Final/modeshape-common-2.3.0.Final.jar

#5


2  

A neater, shorter solution:

StringUtils.capitalize(StringUtils.join(StringUtils.splitByCharacterTypeCamelCase("yourCamelCaseText"), StringUtils.SPACE)); // Your Camel Case Text

#6


1  

The following regex can be used to identify the capitals inside words:

"((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))"

It matches every capital letter that is either preceded by a lowercase letter or digit, or preceded by a capital and followed by a lowercase letter, as well as every digit that follows a letter.

How to insert a space before them is beyond my Java skills =)

Edited to include the digit case and the PDF Loader case.
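
For what it's worth, inserting the space in Java can probably be done with String.replaceAll and a $1 back-reference to the matched character. The sketch below assumes the pattern is meant to have a single closing bracket after [0-9]; the class and method names are invented for the example.

```java
class CapitalSpacer {
    // Pattern from this answer, assuming a single ']' after [0-9].
    // Group 1 is the character that should get a space in front of it.
    private static final String BREAK_CHAR =
        "((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))";

    static String insertSpaces(String s) {
        // In a Java replacement string, $1 stands for the text captured by group 1.
        return s.replaceAll(BREAK_CHAR, " $1");
    }

    public static void main(String[] args) {
        System.out.println(insertSpaces("PDFLoader"));   // PDF Loader
        System.out.println(insertSpaces("GL11Version")); // GL 11 Version
    }
}
```

Because the first character of the input never has anything behind it, no leading space is produced and no trim() is needed.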

#7


1  

I think you will have to iterate over the string and detect changes from lowercase to uppercase, uppercase to lowercase, alphabetic to numeric, and numeric to alphabetic. On every change you detect, insert a space, with one exception: on a change from uppercase to lowercase, insert the space one character earlier.
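
A minimal sketch of that loop might look like the following. The names (CamelCaseWalker, Kind, breakBefore) are invented for the illustration, and it assumes the input contains no whitespace of its own. The "one character earlier" rule is expressed as a one-character lookahead inside an uppercase run, so a lone leading capital as in "Class" gets no space.

```java
class CamelCaseWalker {
    private enum Kind { LOWER, UPPER, DIGIT, OTHER }

    private static Kind kindOf(char c) {
        if (Character.isLowerCase(c)) return Kind.LOWER;
        if (Character.isUpperCase(c)) return Kind.UPPER;
        if (Character.isDigit(c)) return Kind.DIGIT;
        return Kind.OTHER;
    }

    /** True when a space belongs immediately before index i (i >= 1). */
    private static boolean breakBefore(String s, int i) {
        Kind prev = kindOf(s.charAt(i - 1));
        Kind cur = kindOf(s.charAt(i));
        if (prev == Kind.UPPER && cur == Kind.LOWER) {
            return false; // the exception: this break goes one character earlier
        }
        if (prev != cur) {
            return true;  // lower->upper, letter->digit, digit->letter
        }
        // Inside an uppercase run, break before the capital that starts a new word.
        return cur == Kind.UPPER
                && i + 1 < s.length()
                && kindOf(s.charAt(i + 1)) == Kind.LOWER;
    }

    static String splitCamelCase(String s) {
        StringBuilder out = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            if (i > 0 && breakBefore(s, i)) {
                out.append(' ');
            }
            out.append(s.charAt(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(splitCamelCase("SimpleXMLParser")); // Simple XML Parser
        System.out.println(splitCamelCase("GL11Version"));     // GL 11 Version
    }
}
```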

#8


1  

This works in .NET... optimize to your liking. I added comments so you can understand what each piece is doing. (RegEx can be hard to understand)

public static string SplitCamelCase(string str)
{
    str = Regex.Replace(str, @"([A-Z])([A-Z][a-z])", "$1 $2");  // Capital followed by capital AND a lowercase.
    str = Regex.Replace(str, @"([a-z])([A-Z])", "$1 $2"); // Lowercase followed by a capital.
    str = Regex.Replace(str, @"(\D)(\d)", "$1 $2"); // Non-digit followed by a digit.
    str = Regex.Replace(str, @"(\d)(\D)", "$1 $2"); // Digit followed by a non-digit.
    return str;
}

#9


0  

For the record, here is an almost (*) compatible Scala version:

  object Str { def unapplySeq(s: String): Option[Seq[Char]] = Some(s) }

  def splitCamelCase(str: String) =
    String.valueOf(
      (str + "A" * 2) sliding (3) flatMap {
        case Str(a, b, c) =>
          (a.isUpper, b.isUpper, c.isUpper) match {
            case (true, false, _) => " " + a
            case (false, true, true) => a + " "
            case _ => String.valueOf(a)
          }
      } toArray
    ).trim

Once compiled it can be used directly from Java if the corresponding scala-library.jar is in the classpath.

(*) It fails for the input "GL11Version", for which it returns "G L11 Version".

#10


0  

I took the Regex from polygenelubricants and turned it into an extension method on objects:

    /// <summary>
    /// Turns a given object into a sentence by:
    /// Converting the given object into a <see cref="string"/>.
    /// Adding spaces before each capital letter except for the first letter of the string representation of the given object.
    /// Makes the entire string lower case except for the first word and any acronyms.
    /// </summary>
    /// <param name="original">The object to turn into a proper sentence.</param>
    /// <returns>A string representation of the original object that reads like a real sentence.</returns>
    public static string ToProperSentence(this object original)
    {
        Regex addSpacesAtCapitalLettersRegEx = new Regex(@"(?<=[A-Z])(?=[A-Z][a-z]) | (?<=[^A-Z])(?=[A-Z]) | (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);
        string[] words = addSpacesAtCapitalLettersRegEx.Split(original.ToString());
        if (words.Length > 1)
        {
            List<string> wordsList = new List<string> { words[0] };
            wordsList.AddRange(words.Skip(1).Select(word => word.Equals(word.ToUpper()) ? word : word.ToLower()));
            words = wordsList.ToArray();
        }
        return string.Join(" ", words);
    }

This turns everything into a readable sentence. It does a ToString on the object passed. Then it uses the Regex given by polygenelubricants to split the string. Then it ToLowers each word except for the first word and any acronyms. Thought it might be useful for someone out there.

#11


-1  

I'm not a regex ninja, so I'd iterate over the string, keeping the indexes of the current position being checked & the previous position. If the current position is a capital letter, I'd insert a space after the previous position and increment each index.

#12


-2  

A regex should work, something like ([A-Z]{1}). This captures every capital letter; after that you could replace them with \1, or however you refer to regex groups in Java.
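
In Java, a captured group is written as $1 in the replacement string rather than \1. A small sketch of the suggestion, with invented names and assuming a space is meant to go before each captured capital, also shows its limits: every capital gets a space, so acronyms are pulled apart and digits are ignored.

```java
class NaiveCapitalSplit {
    static String spaceBeforeCapitals(String s) {
        // $1 is Java's replacement syntax for capture group 1;
        // trim() removes the space inserted before a leading capital.
        return s.replaceAll("([A-Z])", " $1").trim();
    }

    public static void main(String[] args) {
        System.out.println(spaceBeforeCapitals("MyClass"));     // My Class
        System.out.println(spaceBeforeCapitals("HTML"));        // H T M L
        System.out.println(spaceBeforeCapitals("GL11Version")); // G L11 Version
    }
}
```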

#13


-3  

http://code.google.com/p/inflection-js/

You could chain the String.underscore().humanize() methods to take a CamelCase string and convert it into a human-readable string.
