Exploring the JVM (Part 2)

Date: 2023-03-10 03:48:28
Let's first go over the common garbage collection algorithms, the principles behind them, and how a collection actually happens.

Returning back to Garbage Collection, there is a term that you should know before learning about GC. The term is "stop-the-world." Stop-the-world will occur no matter which GC algorithm you choose. Stop-the-world means that the JVM is stopping the application from running to execute a GC. When stop-the-world occurs, every thread except for the threads needed for the GC will stop their tasks. The interrupted tasks will resume only after the GC task has completed. GC tuning often means reducing this stop-the-world time.

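Whichever garbage collector you choose, the application has to be stopped (stop-the-world) for a collection to run. The length of that pause is usually what GC tuning tries to reduce.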
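To get a feel for how much collecting a program causes, the standard java.lang.management API can be queried from inside the application. The sketch below is only an illustration: the class name GcStats and the allocation loop are made up for this example, and the reported collection time is a rough proxy for stop-the-world time, since some collectors do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    // Total number of collections reported by all collectors.
    // getCollectionCount() returns -1 when the value is undefined, so those are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds, summed over all collectors.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Create a lot of short-lived garbage so the JVM has a reason to collect.
        long bytes = 0;
        for (int i = 0; i < 500_000; i++) {
            byte[] junk = new byte[1024];
            bytes += junk.length;
        }

        System.out.println("allocated bytes: " + bytes);
        System.out.println("collections:     " + (totalCollections() - countBefore));
        System.out.println("GC time (ms):    " + (totalCollectionMillis() - millisBefore));
    }
}
```

The numbers depend on heap size and on the collector in use, so they are best compared between runs of the same program rather than read as absolute values.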

Young generation: Most of the newly created objects are located here. Since most objects soon become unreachable, many objects are created in the young generation, then disappear. When objects disappear from this area, we say a "minor GC" has occurred. 

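The young generation exists so that objects which quickly become unreachable can be reclaimed cheaply. A collection in the young generation is called a minor GC.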

Old generation: The objects that did not become unreachable and survived from the young generation are copied here. It is generally larger than the young generation. As it is bigger in size, the GC occurs less frequently than in the young generation. When objects disappear from the old generation, we say a "major GC" (or a "full GC") has occurred. 

The old generation is where objects that survived the young generation are placed. A collection here is called a major GC, or full GC.

[Two figures: heap layout showing the Young Generation, Old Generation, and Permanent Generation]

The Young Generation and Old Generation in the two figures above have already been introduced, so next comes the Permanent Generation:

The permanent generation from the chart above is also called the "method area," and it stores classes or interned character strings. So, this area is definitely not for objects that survived from the old generation to stay permanently. A GC may occur in this area. The GC that took place here is still counted as a major GC. 

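The permanent generation is also called the method area. It stores classes and interned strings, so it is not a place where objects that survived the old generation stay permanently. GC can happen in this area too, and it also counts as a major GC (full GC), though presumably it reclaims far fewer objects than a young-generation collection does.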
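The generations described so far show up as memory pools that a running JVM can list. The sketch below uses the standard java.lang.management API; the class name MemoryPools is made up for this example. The pool names it prints depend on the JVM version and the collector in use, and on Java 8 and later there is no permanent generation any more (class metadata lives in Metaspace), so do not expect a fixed set of names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Names of the pools that belong to the Java heap (Eden, Survivor, old generation, ...).
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();           // may be null for an invalid pool
            long usedKb = (usage == null) ? 0 : usage.getUsed() / 1024;
            System.out.printf("%-35s %-16s used=%d KB%n",
                    pool.getName(), pool.getType(), usedKb);
        }
    }
}
```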

The passage below explains how the case of an old-generation object referencing a young-generation object is handled:

Some people may wonder:

What if an object in the old generation needs to reference an object in the young generation?

To handle these cases, there is something called a "card table" for the old generation, in which each entry (a card) covers a 512-byte chunk of memory. Whenever an object in the old generation references an object in the young generation, it is recorded in this table. When a GC is executed for the young generation, only this card table is searched to find old-generation references into the young generation, instead of checking the references of all the objects in the old generation. This card table is managed with a write barrier, a mechanism that makes minor GC faster. Though a bit of overhead occurs because of this, the overall GC time is reduced.


Figure 2: Card Table Structure.
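A toy model can make the card table idea more concrete. The sketch below is not HotSpot code: the class name CardTableSketch, the 0/1 encoding for clean and dirty cards, and the use of plain integer offsets in place of real heap addresses are all assumptions made for illustration. It only shows how a write barrier marks the 512-byte card that contains the written field, so that a minor GC can look at dirty cards instead of the whole old generation.

```java
class CardTableSketch {
    static final int CARD_SHIFT = 9;      // 2^9 = 512 bytes of old generation per card
    private final byte[] cards;           // toy encoding: 0 = clean, 1 = dirty

    CardTableSketch(int oldGenBytes) {
        int cardSize = 1 << CARD_SHIFT;
        cards = new byte[(oldGenBytes + cardSize - 1) >>> CARD_SHIFT];
    }

    // "Write barrier": run on every reference store into the old generation.
    // fieldOffset is the offset of the written field from the start of the old generation.
    void writeBarrier(int fieldOffset) {
        cards[fieldOffset >>> CARD_SHIFT] = 1;
    }

    int totalCards() {
        return cards.length;
    }

    int dirtyCards() {
        int dirty = 0;
        for (byte card : cards) {
            dirty += card;
        }
        return dirty;
    }

    // What a minor GC would do: visit only the dirty cards, then mark them clean again.
    // Returns the number of cards that had to be scanned.
    int scanAndClean() {
        int scanned = 0;
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == 1) {
                scanned++;                // a real collector would scan this 512-byte range here
                cards[i] = 0;
            }
        }
        return scanned;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(1 << 20);   // 1 MB old generation
        table.writeBarrier(0);
        table.writeBarrier(100);          // same card as offset 0
        table.writeBarrier(512);          // next card
        table.writeBarrier(1_000_000);    // a card far away
        System.out.println("cards: " + table.totalCards() + ", dirty: " + table.dirtyCards());
    }
}
```

The extra store in writeBarrier is the "bit of overhead" the quoted text mentions; the payoff is that scanAndClean touches a handful of cards rather than every object in the old generation.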




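At this point it becomes clear why GC is split into two kinds, minor GC and major GC: objects live for different lengths of time. Short-lived objects are kept in the young generation and long-lived ones in the old generation, so each area can use a collection strategy suited to it.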

The young generation has 3 spaces in total: one Eden space and two Survivor spaces. The order of execution for each space is as follows:

  1. The majority of newly created objects are located in the Eden space. (Objects start out in Eden.)
  2. After one GC in the Eden space, the surviving objects are moved to one of the Survivor spaces. (A GC happens when Eden fills up, and the survivors go into the first Survivor space.)
  3. After a GC in the Eden space, the objects are piled up into the Survivor space, where other surviving objects already exist.
  4. Once a Survivor space is full, surviving objects are moved to the other Survivor space. Then, the Survivor space that is full will be changed to a state where there is no data at all.
  5. The objects that survived these steps that have been repeated a number of times are moved to the old generation.

As you can see by checking these steps, one of the Survivor spaces must remain empty. If data exists in both Survivor spaces, or the usage is 0 for both spaces, then take that as a sign that something is wrong with your system.

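This explanation is fairly rough and needs a number of additions, such as how object age is calculated and what happens when the old generation runs out of space.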

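1. Allocation guarantee: the old generation acts as a guarantor during a minor GC, taking in surviving objects that do not fit into the Survivor space.

2. During a minor GC, if the old generation does not have enough space for the objects to be promoted, a Full GC is run instead. If it does have enough space, the HandlePromotionFailure setting is checked: if failure is allowed, only the minor GC runs; if it is not allowed, a Full GC may be run instead as well.

3. Object age: an object's age goes up by one for every minor GC it survives.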
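The copying steps and the notes on age and promotion can be put together in a small simulation. This is a sketch under simplifying assumptions, not how HotSpot is implemented: the class MinorGcSketch and its fields are invented for this example, Survivor capacity is counted in objects rather than bytes, reachability is a plain boolean flag, and the tenuring threshold is fixed instead of being recomputed.

```java
import java.util.ArrayList;
import java.util.List;

class MinorGcSketch {
    static final class Obj {
        final String name;
        boolean live = true;   // stands in for "still reachable"
        int age = 0;           // number of minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int survivorCapacity;     // assumed: capacity counted in objects
    final int tenuringThreshold;

    MinorGcSketch(int survivorCapacity, int tenuringThreshold) {
        this.survivorCapacity = survivorCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);               // new objects start out in Eden
        return o;
    }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) {
                continue;          // unreachable objects are simply not copied
            }
            o.age++;               // survived one more minor GC
            if (o.age > tenuringThreshold || to.size() >= survivorCapacity) {
                old.add(o);        // promoted: too old, or "to" is full (allocation guarantee)
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;      // swap roles: the filled space becomes "from",
        from = to;                 // the emptied one becomes "to"
        to = tmp;
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch(2, 2);
        heap.allocate("a");
        heap.allocate("b");
        heap.allocate("c");
        heap.allocate("d").live = false;
        for (int i = 1; i <= 3; i++) {
            heap.minorGc();
            System.out.println("after GC " + i + ": from=" + heap.from.size()
                    + " to=" + heap.to.size() + " old=" + heap.old.size());
        }
    }
}
```

Note that after every call to minorGc one of the two Survivor lists is empty, which is the invariant the quoted text points out.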

The process of data piling up into the old generation through minor GCs can be shown as in the below chart:


Figure 3: Before & After a GC.

Note that in HotSpot VM, two techniques are used for faster memory allocations. One is called "bump-the-pointer," and the other is called "TLABs (Thread-Local Allocation Buffers)."

Bump-the-pointer and TLABs are memory allocation techniques whose purpose is to make allocation faster.

The bump-the-pointer technique tracks the last object allocated in the Eden space. That object is located at the top of the Eden space. When a new object is created afterwards, the JVM only checks whether the size of the object fits in the remaining Eden space. If it does, the object is placed in the Eden space and becomes the new top. So, when new objects are created, only the last added object needs to be checked, which allows much faster memory allocation. However, it is a different story in a multithreaded environment. To allocate objects from multiple threads in the Eden space in a thread-safe way, a lock is unavoidable, and performance drops due to lock contention. TLABs are the solution to this problem in HotSpot VM. They give each thread a small portion of the Eden space as its own share. As each thread can only access its own TLAB, even the bump-the-pointer technique can allocate memory without a lock.
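The two techniques can be sketched in a few lines. The code below is a simplified model rather than the HotSpot allocator: the class names Eden and Tlab, the use of long offsets in place of addresses, and the chunk sizes are assumptions for illustration. The shared Eden hands out space under a lock, while each thread's Tlab takes one chunk from Eden and then bumps its own pointer without any synchronization.

```java
class BumpPointerSketch {
    // Shared Eden: every allocation bumps one pointer, so it has to be synchronized.
    static final class Eden {
        private final long capacity;
        private long top = 0;

        Eden(long capacity) { this.capacity = capacity; }

        // Returns the start offset of the new object, or -1 if Eden is full.
        synchronized long allocate(long size) {
            if (top + size > capacity) {
                return -1;
            }
            long start = top;
            top += size;           // bump the pointer
            return start;
        }
    }

    // Thread-local allocation buffer: owned by exactly one thread.
    static final class Tlab {
        private final Eden eden;
        private final long tlabSize;
        private long top = 0;      // next free offset inside the current chunk
        private long end = 0;      // end of the current chunk (0 means no chunk yet)

        Tlab(Eden eden, long tlabSize) {
            this.eden = eden;
            this.tlabSize = tlabSize;
        }

        long allocate(long size) {
            if (size > tlabSize) {
                return eden.allocate(size);          // too big for a TLAB: use shared Eden
            }
            if (top + size > end) {                  // chunk exhausted: fetch a new one (locks once)
                long chunk = eden.allocate(tlabSize);
                if (chunk < 0) {
                    return -1;
                }
                top = chunk;
                end = chunk + tlabSize;
            }
            long start = top;
            top += size;                             // lock-free bump inside the thread's own chunk
            return start;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Eden eden = new Eden(1 << 20);
        Runnable work = () -> {
            Tlab tlab = new Tlab(eden, 4096);        // one TLAB per thread
            long last = -1;
            for (int i = 0; i < 1000; i++) {
                last = tlab.allocate(32);
            }
            System.out.println(Thread.currentThread().getName() + " last offset: " + last);
        };
        Thread t1 = new Thread(work, "t1");
        Thread t2 = new Thread(work, "t2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

With 32-byte objects and 4096-byte chunks, each thread takes the Eden lock once per 128 allocations instead of once per allocation, which is the point of the design.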

One more point here: a newly created object is generally allocated in the Eden space.


The above covers most of how the young generation is collected. Next comes the old generation, whose collection is mainly based on the mark-compact algorithm.
The common garbage collectors are:
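Serial  Single-threaded young-generation collector. The default choice in client mode.
ParNew  Multi-threaded young-generation collector. The usual young-generation choice in server mode, because it can work together with CMS.
Parallel Scavenge  Multi-threaded young-generation collector that focuses on throughput. It offers -XX:MaxGCPauseMillis for the maximum GC pause time and
-XX:GCTimeRatio for the throughput target. (It cannot be combined with CMS.)

Serial Old  Single-threaded old-generation collector. Used in client mode; in server mode it is paired with Parallel Scavenge, or serves as the fallback when CMS fails.
Parallel Old  Multi-threaded old-generation collector. For throughput-oriented, CPU-sensitive applications, use the Parallel Scavenge + Parallel Old combination, that is: -XX:+UseParallelOldGC
CMS  Old-generation collector based on the mark-sweep algorithm. It aims for the shortest possible collection pause.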
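To see which of these collectors a given JVM is actually running with, the names of its GarbageCollectorMXBeans can be printed. In the sketch below, the class CollectorNames and the describe helper are made up for this example, and the bean names in the switch are the ones HotSpot builds commonly report; treat that mapping as an assumption, since other JVMs and newer collectors use different names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class CollectorNames {
    // Maps a GC bean name to the collector names used in this article.
    // The bean names are assumed from common HotSpot builds.
    static String describe(String beanName) {
        switch (beanName) {
            case "Copy":
                return "Serial (young generation)";
            case "MarkSweepCompact":
                return "Serial Old (old generation)";
            case "ParNew":
                return "ParNew (young generation)";
            case "ConcurrentMarkSweep":
                return "CMS (old generation)";
            case "PS Scavenge":
                return "Parallel Scavenge (young generation)";
            case "PS MarkSweep":
                return "old-generation collector paired with Parallel Scavenge";
            default:
                return "not covered in this article";
        }
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + describe(gc.getName()));
        }
    }
}
```

Running it with different flags, for example -XX:+UseSerialGC and then -XX:+UseParallelGC, shows the young and old collectors changing as a pair.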

A summary from a fellow student, comparing three categories of collector:
serial collector
parallel collector (throughput collector)
concurrent collector (concurrent low pause collector)
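Introduction

serial collector (single-threaded):
Does all GC work on a single thread. With no communication between threads, this is relatively efficient.

parallel collector:
Uses multiple threads to exploit multiple CPUs and make GC more efficient.
Its main goal is to reach a given throughput.

concurrent collector:
Uses multiple threads to exploit multiple CPUs and make GC more efficient.
Does most of its work concurrently with the application, which keeps GC pauses short.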

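Applicable scenarios

serial collector:
Single-processor machines with no pause time requirement.

parallel collector:
Suited to scientific computing and background processing.
Applications with medium to large data sets that run on multiprocessor machines and care about throughput.

concurrent collector:
Applications with medium to large data sets, application servers, and the telecom field.
Cares about response time rather than throughput.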

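Usage

serial collector:
Default in client mode.
Can be forced with -XX:+UseSerialGC.
Advantages: none to speak of for server applications.
Disadvantages: slow; cannot make full use of the hardware.

parallel collector:
Default in server mode (YGC: PS, FGC: Parallel MSC).
Can be forced with -XX:+UseParallelGC or -XX:+UseParallelOldGC.
  ParallelGC means the FGC is Parallel MSC.
  ParallelOldGC means the FGC is Parallel Compacting.
Advantages: efficient.
Disadvantages: as the heap grows, the pauses it causes become fairly long.

concurrent collector:
Can be forced with -XX:+UseConcMarkSweepGC.
Advantages:
Very short application pauses when collecting the old generation, which suits latency-sensitive applications.
Disadvantages:
1. Memory fragmentation and floating garbage.
2. Inefficient memory allocation in the old generation.
3. The collection as a whole takes a long time.
4. Competes with the application for CPU.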
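What triggers a collection

serial collector:
YGC
  Eden runs out of space.
FGC
  The old generation runs out of space.
  The perm generation runs out of space.
  An explicit call to System.gc(), including the periodic calls made by RMI and the like.
  The pessimistic policy at YGC time.
  Dumping live memory information (jmap -dump:live).

parallel collector:
YGC
  Eden runs out of space.
FGC
  The old generation runs out of space.
  The perm generation runs out of space.
  An explicit call to System.gc(), including the periodic calls made by RMI and the like.
  The pessimistic policy at YGC time, checked both before and after the YGC.
  Dumping live memory information (jmap -dump:live).

concurrent collector:
YGC
  Eden runs out of space.
CMS GC
  1. Old Gen usage reaches a given ratio (92% by default).
  2. CMSClassUnloadingEnabled is set and Perm Gen usage reaches a given ratio (92% by default).
  3. HotSpot decides to trigger it based on its own estimates.
  4. System.gc() is called explicitly while ExplicitGCInvokesConcurrent is set.
Full GC (Serial MSC)
  On promotion failed or concurrent mode failure.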
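What happens when a collection is triggered

serial collector:
YGC
  Frees the memory held by all unreferenced objects in eden + from.
  Copies all live objects in eden + from into to.
  Some objects are promoted to old along the way:
    those that do not fit into to
    those that have survived more times than the tenuring threshold
  Recomputes the tenuring threshold.
  All of this runs on a single thread.
  The application is paused throughout.
FGC
  If CollectGen0First is set, a YGC is triggered first.
  Frees unreferenced objects in the heap, plus the class information in permgen loaded by classloaders that have been unloaded.
  All of this runs on a single thread.
  The application is paused throughout.

parallel collector:
YGC
  Basically the same as serial, except:
  1. It runs on multiple threads.
  2. At the end of the YGC it not only recomputes the tenuring threshold but also resizes Eden and From.
FGC
  1. If ScavengeBeforeFullGC is set (the default), a YGC is triggered first (??).
  2. MSC: frees unreferenced objects in the heap, plus the class information in permgen loaded by classloaders that have been unloaded, and compacts.
  3. Compacting: frees some of the unreferenced objects in the heap, plus the class information in permgen loaded by classloaders that have been unloaded, and partially compacts.
  All of this runs on multiple threads.

concurrent collector:
YGC
  Basically the same as serial, except:
  1. It runs on multiple threads.
CMS GC
  1. When the old gen reaches its ratio, only the space held by unreferenced objects in the old gen is freed.
  2. When the perm gen reaches its ratio, only the class information loaded by classloaders that have already been cleared is freed.
FGC
  Same as serial.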
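Detailed parameters

serial collector:
-XX:+UseSerialGC forces its use.
-XX:SurvivorRatio=x controls the sizes of eden/s0/s1.
-XX:MaxTenuringThreshold controls the maximum number of times an object can survive in the young generation.
-XX:PretenureSizeThreshold=x controls the size in bytes above which objects are allocated directly in old.

parallel collector:
-XX:SurvivorRatio=x controls the sizes of eden/s0/s1.
-XX:MaxTenuringThreshold controls the maximum number of times an object can survive in the young generation.
-XX:-UseAdaptiveSizePolicy turns off the dynamic adjustment of eden, from, and the tenuring threshold after a YGC.
-XX:ParallelGCThreads sets the number of parallel GC threads.

concurrent collector:
-XX:CMSInitiatingOccupancyFraction sets the old gen usage ratio that triggers a collection.
-XX:CMSInitiatingPermOccupancyFraction sets the Perm Gen usage ratio that triggers a collection.
-XX:+UseCMSInitiatingOccupancyOnly stops HotSpot from triggering CMS GC on its own.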
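The values these flags actually have in a running JVM can be read back through HotSpot's diagnostic MXBean. The sketch below assumes a HotSpot-based JDK, where com.sun.management.HotSpotDiagnosticMXBean is available; the class FlagProbe and its read helper are made up for this example. Flags that the running JVM does not know, such as the CMS ones on JDK versions where CMS has been removed, are reported as "unavailable" instead of failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class FlagProbe {
    // Returns the current value of a -XX flag, or "unavailable" if this JVM does not expose it.
    static String read(String flag) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(flag).getValue();
        } catch (RuntimeException e) {
            // getVMOption throws IllegalArgumentException for unknown flags
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        String[] flags = {
            "SurvivorRatio",
            "MaxTenuringThreshold",
            "PretenureSizeThreshold",
            "UseAdaptiveSizePolicy",
            "ParallelGCThreads",
            "CMSInitiatingOccupancyFraction",
            "UseCMSInitiatingOccupancyOnly"
        };
        for (String flag : flags) {
            System.out.printf("%-32s %s%n", flag, read(flag));
        }
    }
}
```

Running it once with no options and once with, say, -XX:MaxTenuringThreshold=5 is a quick way to confirm that a setting was picked up.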

Note: