Is a Java String really immutable?

Time: 2022-10-15 16:24:25

We all know that String is immutable in Java, but check the following code:

String s1 = "Hello World";  
String s2 = "Hello World";  
String s3 = s1.substring(6);  
System.out.println(s1); // Hello World  
System.out.println(s2); // Hello World  
System.out.println(s3); // World  

Field field = String.class.getDeclaredField("value");  
field.setAccessible(true);  
char[] value = (char[])field.get(s1);  
value[6] = 'J';  
value[7] = 'a';  
value[8] = 'v';  
value[9] = 'a';  
value[10] = '!';  

System.out.println(s1); // Hello Java!  
System.out.println(s2); // Hello Java!  
System.out.println(s3); // World  

Why does this program behave like this? And why are the values of s1 and s2 changed, but not s3?

15 answers

#1


390  

String is immutable* but this only means you cannot change it using its public API.

What you are doing here is circumventing the normal API, using reflection. The same way, you can change the values of enums, change the lookup table used in Integer autoboxing etc.
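
The "lookup table used in Integer autoboxing" is the small cache behind Integer.valueOf. A minimal sketch of the sharing that would make tampering with that cache visible everywhere; the class name IntegerCacheDemo is made up for illustration, and nothing here uses reflection:

```java
// Illustrative only: IntegerCacheDemo is a hypothetical name, not part of the original answer.
final class IntegerCacheDemo {

    // Autoboxing goes through Integer.valueOf, which is documented to cache -128..127.
    static boolean smallValuesShareOneObject() {
        Integer a = 127;
        Integer b = 127;
        return a == b; // same cached instance, so reference equality holds
    }

    // Boxed values compare equal by value whether or not they are cached.
    static boolean equalByValue(int n) {
        Integer a = Integer.valueOf(n);
        Integer b = Integer.valueOf(n);
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(smallValuesShareOneObject()); // true
        System.out.println(equalByValue(100_000));       // true
    }
}
```

Because every boxed 127 is the same object, corrupting that one object through reflection would change what 127 means across the whole JVM, which is the same effect the question shows for an interned String.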

The reason s1 and s2 both change is that they refer to the same interned string. The compiler arranges this (as mentioned in other answers).

That s3 does not change was actually a bit surprising to me, as I thought it would share the value array (it did in earlier versions of Java, before Java 7u6). However, looking at the source code of String, we can see that the value character array for a substring is actually copied (using Arrays.copyOfRange(..)). This is why it goes unchanged.

You can install a SecurityManager to prevent malicious code from doing such things. But keep in mind that some libraries depend on these kinds of reflection tricks (typically ORM tools, AOP libraries, etc.).

*) I initially wrote that Strings aren't really immutable, just "effective immutable". This might be misleading in the current implementation of String, where the value array is indeed marked private final. It's still worth noting, though, that there is no way to declare an array in Java as immutable, so care must be taken not to expose it outside its class, even with the proper access modifiers.
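
The point about arrays can be sketched with defensive copies. This is only an illustration under assumptions: Word is a hypothetical class, not JDK code, and it shows one common way to keep an internal array from escaping:

```java
import java.util.Arrays;

// Hypothetical class for illustration; not taken from the JDK.
final class Word {
    private final char[] letters; // final fixes the reference, not the array contents

    Word(char[] letters) {
        this.letters = letters.clone(); // copy in, so the caller cannot mutate our state later
    }

    char[] toCharArray() {
        return letters.clone(); // copy out, so the internal array never escapes
    }

    @Override
    public String toString() {
        return new String(letters);
    }

    public static void main(String[] args) {
        char[] source = {'J', 'a', 'v', 'a'};
        Word word = new Word(source);

        source[0] = 'L';             // mutates only the caller's array
        word.toCharArray()[1] = 'u'; // mutates only a returned copy

        System.out.println(word);                    // Java
        System.out.println(Arrays.toString(source)); // [L, a, v, a]
    }
}
```

Without the two clone() calls, either write above would have changed the Word, even though the field is private final.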

As this topic seems overwhelmingly popular, here's some suggested further reading: Heinz Kabutz's Reflection Madness talk from JavaZone 2009, which covers a lot of the issues in the OP, along with other reflection... well... madness.

It covers why this is sometimes useful. And why, most of the time, you should avoid it. :-)

#2


93  

In Java, if two String variables are initialized with the same literal, the same reference is assigned to both variables:

String test1 = "Hello World";
String test2 = "Hello World";
System.out.println(test1 == test2); // true

That is the reason the comparison returns true. The third string is created using substring(), which makes a new string instead of pointing to the same one.

When you access a string using reflection, you get the actual pointer:

Field field = String.class.getDeclaredField("value");
field.setAccessible(true);

So a change made through this pointer changes every string holding a pointer to the same data, but since substring() created s3 as a new string, s3 does not change.

#3


50  

You are using reflection to circumvent the immutability of String - it's a form of "attack".

There are lots of examples you can create like this (e.g. you can even instantiate a Void object), but it doesn't mean that String is not "immutable".

There are use cases where this type of code may be used to your advantage and be "good coding", such as clearing passwords from memory at the earliest possible moment (before GC).
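
For comparison, a reflection-free way to get the same effect is to keep the secret in a char[] from the start and overwrite it when done. A minimal sketch, assuming the password never needs to be a String; PasswordWipe, use and wipe are hypothetical names:

```java
import java.util.Arrays;

// Hypothetical helper for illustration; the names are not from the original answer.
final class PasswordWipe {

    // Stand-in for real work such as hashing; here it only reports the length.
    static int use(char[] password) {
        return password.length;
    }

    // Overwrites every character so the secret no longer sits in this array.
    static void wipe(char[] password) {
        Arrays.fill(password, '\0');
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        try {
            System.out.println(use(password)); // 6
        } finally {
            wipe(password); // runs even if use(...) throws
        }
        System.out.println(password[0] == '\0'); // true
    }
}
```

This only clears the one array the code controls; any copies made elsewhere are unaffected.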

Depending on the security manager, you may not be able to execute your code.

#4


30  

You are using reflection to access the "implementation details" of the String object. Immutability is a feature of the public interface of an object.

#5


24  

Visibility modifiers and final (i.e. immutability) are not a measure against malicious code in Java; they are merely tools to protect against mistakes and to make the code more maintainable (one of the big selling points of the system). That is why you can access internal implementation details like the backing char array for Strings via reflection.

The second effect you see is that all Strings change while it looks like you only change s1. It is a certain property of Java String literals that they are automatically interned, i.e. cached. Two String literals with the same value will actually be the same object. When you create a String with new it will not be interned automatically and you will not see this effect.
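
The interning behaviour described here can be checked without any reflection. A small sketch using only the public String API; InternDemo is a made-up class name:

```java
// Hypothetical demo class; only java.lang.String API is used.
final class InternDemo {

    static boolean literalsShareOneObject() {
        String a = "Hello World";
        String b = "Hello World";
        return a == b; // both refer to the single interned literal
    }

    static boolean newStringIsDistinct() {
        String a = "Hello World";
        String b = new String("Hello World");
        return a != b && a.equals(b); // different object, same contents
    }

    static boolean internReturnsPooledInstance() {
        String a = "Hello World";
        String b = new String("Hello World").intern();
        return a == b; // intern() hands back the pooled instance
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());      // true
        System.out.println(newStringIsDistinct());         // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```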

#substring until recently (Java 7u6) worked in a similar way, which would have explained the behaviour in the original version of your question. It didn't create a new backing char array but reused the one from the original String; it just created a new String object that used an offset and a length to present only a part of that array. This generally worked as Strings are immutable - unless you circumvent that. This property of #substring also meant that the whole original String couldn't be garbage collected when a shorter substring created from it still existed.
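
The two substring strategies can be modelled with a toy class. This is a sketch of the idea only: Text is a hypothetical class, not java.lang.String, and its array field is left non-private so the demo can write to it directly instead of going through reflection:

```java
import java.util.Arrays;

// Toy model of the two substring strategies; Text is a hypothetical class, not java.lang.String.
final class Text {
    final char[] value; // package-private on purpose so the demo can mutate it
    final int offset;
    final int count;

    Text(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Old style (before 7u6): new object, same backing array, different window.
    Text sharedSubstring(int begin) {
        return new Text(value, offset + begin, count - begin);
    }

    // Current style: the selected range is copied into a fresh array.
    Text copiedSubstring(int begin) {
        char[] copy = Arrays.copyOfRange(value, offset + begin, offset + count);
        return new Text(copy, 0, copy.length);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        Text s1 = new Text("Hello World".toCharArray(), 0, 11);
        Text shared = s1.sharedSubstring(6);
        Text copied = s1.copiedSubstring(6);

        // Simulate the reflective write from the question.
        "Java!".getChars(0, 5, s1.value, 6);

        System.out.println(s1);     // Hello Java!
        System.out.println(shared); // Java!
        System.out.println(copied); // World
    }
}
```

The shared variant also shows the memory trade-off: as long as shared is reachable, the whole original array stays alive.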

As of current Java and your current version of the question there is no strange behaviour of #substring.

#6


11  

String immutability is from the interface perspective. You are using reflection to bypass the interface and directly modify the internals of the String instances.
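
The same bypass can be tried on a class of your own, which sidesteps the restrictions newer JDKs place on reflecting into java.lang internals. A minimal sketch; Label and ReflectionDemo are hypothetical names, and Label merely stands in for String:

```java
import java.lang.reflect.Field;

// Hypothetical demo; Label stands in for String and is not JDK code.
final class ReflectionDemo {

    // Immutable through its public interface: no setters, private final state.
    static final class Label {
        private final char[] value;

        Label(String text) {
            this.value = text.toCharArray();
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    // Reads the private array through reflection and overwrites its first character.
    static void corrupt(Label label) {
        try {
            Field field = Label.class.getDeclaredField("value");
            field.setAccessible(true); // bypasses the private modifier
            char[] chars = (char[]) field.get(label);
            chars[0] = 'J';
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Label label = new Label("Hava");
        corrupt(label);
        System.out.println(label); // Java
    }
}
```

Nothing in Label's public interface allows this change; only the reflective access to its internals does.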

s1 and s2 are both changed because they are both assigned to the same "intern" String instance. You can find out a bit more about that part from this article about string equality and interning. You might be surprised to find out that in your sample code, s1 == s2 returns true!

#7


10  

Which version of Java are you using? From Java 1.7.0_06, Oracle has changed the internal representation of String, especially the substring.

Quoting from Oracle Tunes Java's Internal String Representation:

In the new paradigm, the String offset and count fields have been removed, so substrings no longer share the underlying char [] value.

With this change, it may happen without reflection (???).

#8


7  

There are really two questions here:

  1. Are strings really immutable?
  2. Why is s3 not changed?

To point 1: Except for ROM there is no immutable memory in your computer. Nowadays even ROM is sometimes writable. There is always some code somewhere (whether it's the kernel or native code sidestepping your managed environment) that can write to your memory address. So, in "reality", no they are not absolutely immutable.

To point 2: This is because substring is probably allocating a new string instance, which is likely copying the array. It is possible to implement substring in such a way that it won't do a copy, but that doesn't mean it does. There are tradeoffs involved.

For example, should holding a reference to reallyLargeString.substring(reallyLargeString.length() - 2) cause a large amount of memory to be held alive, or only a few bytes?

That depends on how substring is implemented. A deep copy will keep less memory alive, but it will run slightly slower. A shallow copy will keep more memory alive, but it will be faster. Using a deep copy can also reduce heap fragmentation, as the string object and its buffer can be allocated in one block, as opposed to 2 separate heap allocations.

In any case, it looks like your JVM chose to use deep copies for substring calls.

#9


5  

To add to @haraldK's answer: this is a security hack which could have a serious impact on the app.

The first thing is a modification to a constant string stored in the String pool. When a string is declared as String s = "Hello World";, it is placed into a special object pool for potential reuse. The issue is that the compiler places a reference to the pooled string at compile time, and once the user modifies the string stored in this pool at runtime, all references in the code point to the modified version. This would result in the following bug:

System.out.println("Hello World"); 

Will print:

Hello Java!

There was another issue I experienced when I was implementing a heavy computation over such risky strings. There was a bug which happened about 1 in 1,000,000 times during the computation, which made the result nondeterministic. I was able to find the problem by switching off the JIT: I always got the same result with the JIT turned off. My guess is that the cause was this String security hack, which broke some of the JIT optimization contracts.

#10


5  

According to the concept of pooling, all String variables initialized with the same literal point to the same memory address. Therefore s1 and s2, both containing the same value of “Hello World”, point to the same memory location (say M1).

On the other hand, s3 contains “World”, hence it will point to a different memory allocation (say M2).

So what is happening now is that the value of s1 is being changed (through the char[] value). So the value at memory location M1, pointed to by both s1 and s2, has been changed.

Hence as a result, memory location M1 has been modified which causes change in the value of s1 and s2.

But the value of location M2 remains unaltered, hence s3 contains the same original value.

#11


3  

The reason s3 does not actually change is because in Java when you do a substring the value character array for a substring is internally copied (using Arrays.copyOfRange()).

s1 and s2 are the same because in Java they both refer to the same interned string. It's by design in Java.

#12


2  

String is immutable, but through reflection you're allowed to change the String class. You've just redefined the String class as mutable in real-time. You could redefine methods to be public or private or static if you wanted.

#13


1  

[Disclaimer: this is a deliberately opinionated style of answer, as I feel a more "don't do this at home, kids" answer is warranted.]

The sin is the line field.setAccessible(true); which says to violate the public API by allowing access to a private field. That's a giant security hole, which can be locked down by configuring a security manager.

The phenomena in the question are implementation details which you would never see without using that dangerous line of code to violate the access modifiers via reflection. Clearly two (normally) immutable strings can share the same char array. Whether a substring shares the same array depends on whether it can and whether the developer thought to share it. Normally these are invisible implementation details which you should not have to know unless you shoot the access modifier through the head with that line of code.

It is simply not a good idea to rely upon such details which cannot be experienced without violating the access modifiers using reflection. The owner of that class only supports the normal public API and is free to make implementation changes in the future.

Having said all that, the line of code is really very useful when you have a gun held to your head forcing you to do such dangerous things. Using that back door is usually a code smell telling you that you need to upgrade to better library code where you don't have to sin. Another common use of that dangerous line of code is to write a "voodoo framework" (ORM, injection container, ...). Many folks get religious about such frameworks (both for and against them), so I will avoid inviting a flame war by saying nothing other than that the vast majority of programmers don't have to go there.

#14


1  

Strings are created in the permanent area of the JVM heap memory. So yes, it's really immutable and cannot be changed after being created. In the JVM, there are three types of heap memory: 1. Young generation 2. Old generation 3. Permanent generation.

When any object is created, it goes into the young generation heap area and the PermGen area reserved for String pooling.

For more detail, see: How Garbage Collection works in Java.

#15


1  

You can get a clear view behind the question "Why the String class is designed to be immutable" by reading the reason in detail from here

Exploring the String class would get you a clear view on how it is designed to become immutable Click Here to Explore the String Class
