I have a simple ActiveRecord model called Student with 100 records in the table. I do the following in a rails console session:
ObjectSpace.each_object(ActiveRecord::Base).count
# => 0
x = Student.all
ObjectSpace.each_object(ActiveRecord::Base).count
# => 100
x = nil
GC.start
ObjectSpace.each_object(ActiveRecord::Base).count
# => 0 # Good!
Now I do the following:
ObjectSpace.each_object(ActiveRecord::Base).count
# => 0
x = Student.all.group_by(&:last_name)
ObjectSpace.each_object(ActiveRecord::Base).count
# => 100
x = nil
GC.start
ObjectSpace.each_object(ActiveRecord::Base).count
# => 100 # Bad!
Can anyone explain why this happens and whether there is a smart way to solve this without knowing the underlying hash structure? I know I can do this:
x.keys.each{|k| x[k]=nil}
x = nil
GC.start
and it will remove all Student objects from memory correctly, but I'm wondering if there is a general solution (my real-life problem is widespread and involves more intricate data structures than the hash shown above).
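The measurement used in both snippets can be wrapped in a small helper so it is easy to repeat outside the console. This is only a sketch: `instances_after` and `Probe` are hypothetical names introduced here, and the count is a lower bound on what the GC could free, since a conservative collector may keep unreferenced objects around for a while.

```ruby
# Hypothetical helper: run a block, force a GC pass, and report how many
# instances of `klass` are still on the heap afterwards.
def instances_after(klass)
  yield
  GC.start
  ObjectSpace.each_object(klass).count
end

class Probe; end

kept = []
# The three Probe objects stay referenced from `kept`, so they must survive.
retained = instances_after(Probe) { 3.times { kept << Probe.new } }
puts retained
```

Any count above the number of objects you still hold on purpose points at a reference you have not found yet.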
I'm using Ruby 1.9.3-p0 and Rails 3.1.0.
UPDATE (SOLVED)
Per Oscar Del Ben's explanation below, a few ActiveRecord::Relation objects are created in the problematic code snippet. (They are actually created in both snippets, but for some reason they "misbehave" only in the second one. Can someone shed light on why?) These relations keep references to the ActiveRecord objects in an instance variable called @records. Calling the reset method on an ActiveRecord::Relation clears that variable. You have to make sure to do this on all the relation objects:
ObjectSpace.each_object(ActiveRecord::Base).count
# => 100
ObjectSpace.each_object(ActiveRecord::Relation).each(&:reset)
GC.start
ObjectSpace.each_object(ActiveRecord::Base).count
# => 0
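To illustrate why a cached result list keeps records alive, here is a minimal plain-Ruby sketch. `CachingRelation` is a hypothetical stand-in written for this example, not ActiveRecord code, and the `Student` class below is a bare placeholder; the only point is that anything stored in `@records` stays reachable until the holder clears it.

```ruby
# Hypothetical stand-in for a relation that memoizes what it loaded.
class CachingRelation
  def initialize(&loader)
    @loader = loader
    @records = nil
  end

  # Loads once, then serves the cached array on every later call.
  def to_a
    @records ||= @loader.call
  end

  # Drops the cached array so the records are no longer reachable from here.
  def reset
    @records = nil
    self
  end
end

class Student; end  # placeholder, not an ActiveRecord model

relation = CachingRelation.new { Array.new(3) { Student.new } }
relation.to_a.group_by(&:class)  # result discarded, but the cache remains

live_while_cached = ObjectSpace.each_object(Student).count
relation.reset
cleared = relation.instance_variable_get(:@records).nil?
```

After `reset` the records are merely eligible for collection; when they actually disappear depends on the collector.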
Note: You can also use Mass.detach (from the ruby-mass gem Oscar Del Ben referenced), though it will be much slower than the code above. Also note that the code above does not remove a few ActiveRecord::Relation objects from memory; these seem to be pretty insignificant, though. You can try:
Mass.index(ActiveRecord::Relation)["ActiveRecord::Relation"].each{|x| Mass.detach Mass[x]}
GC.start
This removes some of the ActiveRecord::Relation objects, but not all of them. I'm not sure why; the ones that are left have no Mass.references, which is weird.
2 Answers
#1
10
My first guess: Ruby 1.9's GC never frees symbols, and I suspected the same applied to the string keys returned by group_by. That turned out not to be the cause; see the updates below.
UPDATE:
It seems like the problem is not with Rails itself. I tried using group_by alone, and sometimes the objects would not get garbage collected:
oscardelben~/% irb
irb(main):001:0> class Foo
irb(main):002:1> end
=> nil
irb(main):003:0> {"1" => Foo.new, "2" => Foo.new}
=> {"1"=>#<Foo:0x007f9efd8072a0>, "2"=>#<Foo:0x007f9efd807250>}
irb(main):004:0> ObjectSpace.each_object(Foo).count
=> 2
irb(main):005:0> GC.start
=> nil
irb(main):006:0> ObjectSpace.each_object(Foo).count
=> 0
irb(main):007:0> {"1" => Foo.new, "2" => Foo.new}.group_by
=> #<Enumerator: {"1"=>#<Foo:0x007f9efb83d0c8>, "2"=>#<Foo:0x007f9efb83d078>}:group_by>
irb(main):008:0> GC.start
=> nil
irb(main):009:0> ObjectSpace.each_object(Foo).count
=> 2 # Not garbage collected
irb(main):010:0> GC.start
=> nil
irb(main):011:0> ObjectSpace.each_object(Foo).count
=> 0 # Garbage collected
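One factor that may contribute to the transcript above, offered as an observation rather than a confirmed diagnosis: irb stores the result of the last evaluated line in `_`, and only updates it after the next line has finished. In the first run a count call sits between creating the hash and GC.start, so `_` no longer points at the hash; in the second run GC.start comes directly after the group_by line, so `_` still holds the enumerator while the collector runs. The sketch below imitates that with an ordinary variable; `last_value` is a hypothetical stand-in for `_`.

```ruby
class Foo; end

# Stand-in for irb's `_`: it still holds the previous line's result
# (an Enumerator wrapping the hash) when the GC runs.
last_value = { "1" => Foo.new, "2" => Foo.new }.group_by

GC.start
held = ObjectSpace.each_object(Foo).count  # both Foo objects are still reachable

# Once the stand-in is overwritten, nothing here references the hash.
last_value = nil
GC.start
after_release = ObjectSpace.each_object(Foo).count  # may still be non-zero on a conservative GC
```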
I've dug through the GC internals (which are surprisingly easy to understand), and this seems like a scope issue. Ruby starts from the objects reachable from the current scope and marks the ones it thinks are still in use; after that, it goes through all the objects in the heap and frees the ones that have not been marked.
In this case I think the hash is still being marked even though it's out of scope. There are many reasons why this may be happening. I'll keep investigating.
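The mark-and-sweep idea described above can be sketched as a toy model. This is not how MRI implements it: objects are just symbols here, `references` is a made-up adjacency map, and `mark` and `sweep` are hypothetical helper names. It only shows why anything reachable from a root survives.

```ruby
require "set"

# Mark phase: follow references outward from the roots.
def mark(roots, references)
  marked = Set.new
  stack = roots.dup
  until stack.empty?
    name = stack.pop
    next unless marked.add?(name)  # add? returns nil if already marked
    stack.concat(references.fetch(name, []))
  end
  marked
end

# Sweep phase: everything on the heap that was not marked gets freed.
def sweep(heap, marked)
  heap.reject { |name| marked.include?(name) }
end

heap = [:relation, :student_a, :student_b, :orphan]
references = { relation: [:student_a, :student_b] }

freed_with_root = sweep(heap, mark([:relation], references))
freed_without_root = sweep(heap, mark([], references))
```

With `:relation` as a root, only `:orphan` is freed; the two students survive solely because the relation still points at them.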
UPDATE 2:
I've found what's keeping references to the objects. To do that I used the ruby-mass gem. It turns out that the Active Record relation keeps track of the objects it returns.
User.limit(1).group_by(&:name)
GC.start
ObjectSpace.each_object(ActiveRecord::Base).each do |obj|
p Mass.references obj # {"ActiveRecord::Relation#70247565268860"=>["@records"]}
end
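For readers without the gem, a rough plain-Ruby approximation of this kind of lookup is sketched below. It is not ruby-mass's implementation: `referrers_of`, `Holder`, and `Record` are hypothetical names, and the search only inspects instance variables that point at the target directly or through an Array.

```ruby
# Hypothetical helper: find objects whose instance variables reference
# `target`, either directly or as an element of an Array.
def referrers_of(target)
  found = Hash.new { |hash, key| hash[key] = [] }
  ObjectSpace.each_object(Object) do |holder|
    next if holder.equal?(target)
    ivars = begin
      holder.instance_variables
    rescue StandardError
      next  # skip objects that do not support normal introspection
    end
    ivars.each do |name|
      value = holder.instance_variable_get(name)
      direct = target.equal?(value)
      in_array = Array === value && value.any? { |item| target.equal?(item) }
      found["#{holder.class}##{holder.object_id}"] << name.to_s if direct || in_array
    end
  end
  found
end

class Record; end

class Holder
  def initialize(records)
    @records = records
  end
end

record = Record.new
holder = Holder.new([record])
refs = referrers_of(record)
p refs
```

Scanning the whole heap like this is slow, so it is better suited to a debugging session than to production code.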
Unfortunately, calling reset on the relation didn't seem to help, but hopefully this is enough information for now.
#2
2
I do not know the answer.
But I tried inspecting the heap as described at http://blog.headius.com/2010/07/browsing-memory-jruby-way.html
I have attached a screenshot at https://skitch.com/deepak_kannan/en3dg/java-visualvm. It was a simple program:
class Foo; end
f1 = Foo.new
f2 = Foo.new
GC.start
I then used jvisualvm as described above, running the program in irb.
It seems as if JRuby is tracking the object's scope. The object will not get GC'ed if there are any non-weak references to it.
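The difference between a strong and a weak reference can be seen with the standard library's WeakRef. This is a small sketch under the assumption that it runs as a plain script; the variable names are illustrative only.

```ruby
require "weakref"

class Foo; end

strong = Foo.new
weak = WeakRef.new(strong)  # a weak reference does not keep the object alive by itself

GC.start
# The object cannot be collected here because `strong` still refers to it.
alive_while_held = weak.weakref_alive?

strong = nil
GC.start
# From this point the object is only weakly referenced and is eligible for
# collection; whether it is already gone depends on the VM and its GC.
```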