Is there a way to make the JRuby runtime intern all strings?

Time: 2021-02-16 17:08:50

We have a Java/JRuby webapp running under Tomcat, and I have been analyzing the number of objects and the memory the app uses at runtime. After startup, the class org.jruby.RubyString had 1,118,000 instances of the string "". Empty strings alone use 65 MB of heap, which seems ridiculous to me because that is 15% of the memory used by the webapp. The empty string is only one example of many string values with this problem; I worked out that if I could intern all the JRuby strings, I could save about 130 MB.

I know that in Java, when a string literal is evaluated, the JVM checks whether the value already exists in the string pool and reuses it if it does. I am wondering whether JRuby has an option for the same optimization. If so, how do I enable it?

Example in JRuby:

v1 = "a"
v2 = "a"
puts v1.object_id # => 3352
puts v2.object_id # => 3354

Example in Java:

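String v1 = "a";
String v2 = "a";

// hashCode() is computed from the contents, so it cannot show identity;
// == compares references and confirms both literals are one pooled object.
System.out.println(v1.hashCode() == v2.hashCode()); // => true (97 == 97)
System.out.println(v1 == v2); // => true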

4 answers

#1


5  

I understand the motivation behind this, but there's really no such "magic" switch in JRuby ...

From a Java background it feels tempting to save on strings, but you can't expect strings to behave the same way in JRuby as they do in Java. First of all, they're a completely different kind of object. I would go as far as to say that a Ruby String is more like a Java StringBuilder.
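
To make the StringBuilder comparison concrete, here is a minimal sketch. The helper name append_in_place is made up for illustration; it only shows that << changes the receiver in place rather than producing a new object.

```ruby
# Appends in place and reports whether the receiver is still the same object.
# (append_in_place is a hypothetical helper, not part of Ruby or JRuby.)
def append_in_place(str, suffix)
  id_before = str.object_id
  str << suffix                  # mutates str, much like StringBuilder.append
  str.object_id == id_before
end

s = String.new('a')
puts append_in_place(s, 'bc')    # => true (no new object was created)
puts s                           # => abc
```

Because any holder of a reference can change the contents this way, the runtime cannot silently share one instance between unrelated pieces of code.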

It's certainly a waste to have so many "" instances lying around, but if that code, as you mention, is third-party code, there's not much you can do about it, unless you feel like monkey patching a lot. I would try to identify the places most of the instances come from and refactor those, but remember there are some "tricky" parts to saving strings, e.g. with Hash:

{ 'foo' => 'bar' }

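You would guess this creates 3 objects (the Hash, 'foo' and 'bar'), but you'd be wrong; it actually creates two copies of 'foo'. Since a String is mutable (unless frozen?), Hash dups the string and freezes the copy when it is used as a key. There's a good reason for that: a key that was mutated after insertion would no longer match the hash value it was stored under.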
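
The dup-and-freeze behaviour can be observed directly. This is a small sketch; stored_key_for is a hypothetical helper, and String.new is used so the key is definitely an ordinary mutable String regardless of how literals are treated.

```ruby
# Returns the key object that the Hash actually stores for `key`.
# (stored_key_for is a hypothetical helper for this illustration.)
def stored_key_for(key)
  h = {}
  h[key] = 'bar'
  h.keys.first
end

key = String.new('foo')        # an ordinary, mutable String
stored = stored_key_for(key)

puts stored == key             # => true  (same contents)
puts stored.equal?(key)        # => false (the Hash kept its own copy)
puts stored.frozen?            # => true  (the copy is frozen)
puts key.frozen?               # => false (the original is untouched)
```

If the key is already frozen when it is inserted, no copy is needed, which is one cheap way to avoid the extra allocation.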

Also remember to refactor "intelligently": profile the bits you're changing to make sure you do not slow things down by trying to economize on allocated instances.
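
As a rough idea of what such a refactoring could look like, the sketch below swaps a per-call allocation for one shared frozen constant. EMPTY_STRING and both method names are invented for this example, and String.new stands in for any code path that builds a fresh empty string.

```ruby
# Hypothetical shared constant: one frozen empty String for the whole app.
EMPTY_STRING = ''.freeze

# Before: every call allocates a fresh, mutable empty String.
def default_label_allocating
  String.new
end

# After: every call hands back the same frozen instance.
def default_label_shared
  EMPTY_STRING
end

puts default_label_allocating.equal?(default_label_allocating) # => false
puts default_label_shared.equal?(default_label_shared)         # => true
puts default_label_shared.frozen?                              # => true
```

The catch is that any caller which later appends to the result now has to call dup first, so this only pays off where the value is read and never modified.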

#2


2  

v1 = v2 = v3 = "a"

Will only create one object in Ruby, not three.

v1 = v2 = v3 = "a" # => "a"
v1.object_id # => 10530560
v2.object_id # => 10530560
v1 << "ll the same" # => "all the same"
v2 # => "all the same"

Before doing something as drastic as interning all the strings, I'd check with other Tomcat users whether this is the best way of dealing with this problem. I don't use Tomcat or JRuby, but I strongly suspect this isn't the best approach.

Edit: If every object built from an "a" were the same object, then modifying one of them would modify all of the others. That would be a side-effect nightmare.

#3


1  

The only way to intern a String in JRuby is to call to_sym or intern (they are aliases of each other), which turns it into a Symbol. As you mentioned, that doesn't quite help with third-party gems. As far as I'm aware, there isn't any other way.

This is in line with MRI behaviour:

sebastien@greystones:~$ rvm ruby-1.9.3-p0
sebastien@greystones:~$ irb
1.9.3p0 :001 > a = "Hello World" 
 => "Hello World" 
1.9.3p0 :002 > b = "Hello World"
 => "Hello World" 
1.9.3p0 :003 > a.object_id
 => 20126420 
1.9.3p0 :004 > b.object_id
 => 19289920 
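
For contrast, the to_sym/intern route mentioned above could be sketched like this. The helper name same_symbol? is made up for the example; note that the result is a Symbol, so it is not a drop-in replacement for a String.

```ruby
# True when both strings map to the very same Symbol object.
# (same_symbol? is a hypothetical helper for this illustration.)
def same_symbol?(x, y)
  x.to_sym.equal?(y.to_sym)
end

a = String.new('Hello World')
b = String.new('Hello World')

puts a.equal?(b)                 # => false (two String objects)
puts same_symbol?(a, b)          # => true  (one shared Symbol)
puts a.intern.equal?(a.to_sym)   # => true  (intern and to_sym are aliases)
puts a.to_sym.class              # => Symbol
```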

#4


0  

This is now the default behaviour in JRuby. From version 9.1 all frozen string literals (e.g. 'hello'.freeze) return the same instance, and the same goes for literal strings used as hash keys (e.g. stuff['thing']) and a few other cases. See JRuby issue #3491.

If you want to aggressively freeze all string literals you can run both JRuby (9.1+) and Ruby (2.3+) with --enable-frozen-string-literal, but prepare for things to break since most gems assume that strings are mutable.
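
A quick way to see both the sharing and the breakage risk is sketched below. The method names are invented for the example, and explicit .freeze calls are used so the script does not depend on any command-line flag.

```ruby
# Two separate call sites that each use the same frozen literal.
# (All method names here are hypothetical, chosen for this illustration.)
def frozen_hello
  'hello'.freeze
end

def other_frozen_hello
  'hello'.freeze
end

# Tries to append to a frozen literal and reports the error class name, if any.
def try_mutation
  frozen_hello << ' world'
  nil
rescue RuntimeError => e   # FrozenError (Ruby 2.5+) is a RuntimeError subclass
  e.class.name
end

puts frozen_hello.equal?(other_frozen_hello)  # true on JRuby 9.1+ and MRI 2.1+
puts frozen_hello.frozen?                     # => true
puts try_mutation.inspect                     # "FrozenError" on Ruby 2.5+, "RuntimeError" before
```

The rescued error is exactly what a gem would hit if it appended to a literal after the global flag was switched on, so it is worth running the test suite with the flag before relying on it in production.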
