新HashMap(int)和番石榴Maps.newHashMapWithExpectedSize(int)之间的区别

时间:2021-09-26 20:46:58

In Java, you can create a new HashMap to hold a specific number of items like so:

在Java中,您可以创建一个新的HashMap来保存特定数量的项目,如下所示:

Map m = new HashMap(100);

Guava provides a Maps.newHashMapWithExpectedSize(int) method, which I would expect to simply call HashMap(int). But it doesn't do this, instead it calculates its own capacity and uses that.

Guava提供了一个Maps.newHashMapWithExpectedSize(int)方法,我希望它简单地调用HashMap(int)。但它不会这样做,而是计算自己的容量并使用它。

Why does newHashMapWithExpectedSize do its own thing, and why would I want to use it over calling new HashMap(int) directly?

为什么newHashMapWithExpectedSize做了自己的事情,为什么我要直接调用新的HashMap(int)呢?

3 个解决方案

#1


Have you read the method's Javadoc?

你读过方法的Javadoc了吗?

Creates a HashMap instance, with a high enough "initial capacity" that it should hold expectedSize elements without growth.

创建一个HashMap实例,具有足够高的“初始容量”,它应该保持expectedSize元素而不增长。

Note that the new HashMap(int) constructor's "initial size" parameter specifies the initial size of the hash table that entries are stored in, which is basically an implementation detail that you shouldn't have to care about. The hash table will resize when it exceeds the map's load factor (which defaults to 0.75), which means that if you specify an initial capacity of 16 and then add 16 entries to the map, the hash table will almost certainly be resized.

请注意,新的HashMap(int)构造函数的“initial size”参数指定了条目存储的哈希表的初始大小,这基本上是您不应该关心的实现细节。当哈希表超出地图的加载因子(默认为0.75)时,它将调整大小,这意味着如果指定初始容量为16,然后向地图添加16个条目,则几乎可以肯定地调整哈希表的大小。

With Guava's method, if you specify an expected size of 16 and then add 16 entries, the hash table should not resize.

使用Guava的方法,如果指定预期大小为16,然后添加16个条目,则哈希表不应调整大小。

#2


The HashMap constructor argument is the capacity of the map, i.e. the number of buckets.

HashMap构造函数参数是映射的容量,即桶的数量。

So, if you pass 10 as argument, and store 8 keys in the map, the rehash threshold (75% by default) will be reached and the map will rehash.

因此,如果您将10作为参数传递,并在地图中存储8个键,则将达到重新连接阈值(默认为75%),并且地图将重新散列。

On the other hand, the argument passed to newHashMapWithExpectedSize() is the expected size of the map. So if you pass 10, Guava will create a map with enough buckets to make sure the map doesn't rehash when inserting 10 elements: at least 14 buckets.

另一方面,传递给newHashMapWithExpectedSize()的参数是地图的预期大小。因此,如果你传递10,Guava将创建一个具有足够桶的地图,以确保在插入10个元素时地图不会重新散列:至少14个桶。

#3


Guava just multiplies the size passed by 2 (in a safe way) and calls the regular hashmap constructor. This makes it more sparse so there are less collisions while hashing.

Guava只是将传递的大小乘以2(以安全的方式)并调用常规的hashmap构造函数。这使得它更稀疏,因此在散列时碰撞更少。

The javadoc on the capacity calculation mentions that it calculates a value for the capacity so that the hashmap is between 25% and 50% full, which is far away from the threshold which would trigger a resizing.

容量计算中的javadoc提到它计算容量的值,以便散列映射在25%到50%之间,这远远超过触发调整大小的阈值。

The standard library rounds the expected size up to the nearest power of 2 and allocates that as the size and then sets the threshold for resizing to 75%. If we would randomly ask for sizes, the standard library would resize in 50% of cases.

标准库将预期大小四舍五入到最接近的2的幂,并将其分配为大小,然后将调整大小的阈值设置为75%。如果我们随机询问尺寸,标准库将在50%的情况下调整大小。

If avoiding the threshold would be the only consideration, multiplying with 1.34 would be enough to have enough space to avoid resizing on filling it with the expected size of elements.

如果避免阈值是唯一的考虑因素,乘以1.34就足以有足够的空间来避免在填充预期大小的元素时调整大小。

It looks like the typical speed vs space tradeof and the Google engineers being more speed freaks and the Sun/Oracle engineers more space nuts.

它看起来像典型的速度与空间交易,谷歌工程师更加速度怪,而太阳/甲骨文工程师更多的空间坚果。

#1


Have you read the method's Javadoc?

你读过方法的Javadoc了吗?

Creates a HashMap instance, with a high enough "initial capacity" that it should hold expectedSize elements without growth.

创建一个HashMap实例,具有足够高的“初始容量”,它应该保持expectedSize元素而不增长。

Note that the new HashMap(int) constructor's "initial size" parameter specifies the initial size of the hash table that entries are stored in, which is basically an implementation detail that you shouldn't have to care about. The hash table will resize when it exceeds the map's load factor (which defaults to 0.75), which means that if you specify an initial capacity of 16 and then add 16 entries to the map, the hash table will almost certainly be resized.

请注意,新的HashMap(int)构造函数的“initial size”参数指定了条目存储的哈希表的初始大小,这基本上是您不应该关心的实现细节。当哈希表超出地图的加载因子(默认为0.75)时,它将调整大小,这意味着如果指定初始容量为16,然后向地图添加16个条目,则几乎可以肯定地调整哈希表的大小。

With Guava's method, if you specify an expected size of 16 and then add 16 entries, the hash table should not resize.

使用Guava的方法,如果指定预期大小为16,然后添加16个条目,则哈希表不应调整大小。

#2


The HashMap constructor argument is the capacity of the map, i.e. the number of buckets.

HashMap构造函数参数是映射的容量,即桶的数量。

So, if you pass 10 as argument, and store 8 keys in the map, the rehash threshold (75% by default) will be reached and the map will rehash.

因此,如果您将10作为参数传递,并在地图中存储8个键,则将达到重新连接阈值(默认为75%),并且地图将重新散列。

On the other hand, the argument passed to newHashMapWithExpectedSize() is the expected size of the map. So if you pass 10, Guava will create a map with enough buckets to make sure the map doesn't rehash when inserting 10 elements: at least 14 buckets.

另一方面,传递给newHashMapWithExpectedSize()的参数是地图的预期大小。因此,如果你传递10,Guava将创建一个具有足够桶的地图,以确保在插入10个元素时地图不会重新散列:至少14个桶。

#3


Guava just multiplies the size passed by 2 (in a safe way) and calls the regular hashmap constructor. This makes it more sparse so there are less collisions while hashing.

Guava只是将传递的大小乘以2(以安全的方式)并调用常规的hashmap构造函数。这使得它更稀疏,因此在散列时碰撞更少。

The javadoc on the capacity calculation mentions that it calculates a value for the capacity so that the hashmap is between 25% and 50% full, which is far away from the threshold which would trigger a resizing.

容量计算中的javadoc提到它计算容量的值,以便散列映射在25%到50%之间,这远远超过触发调整大小的阈值。

The standard library rounds the expected size up to the nearest power of 2 and allocates that as the size and then sets the threshold for resizing to 75%. If we would randomly ask for sizes, the standard library would resize in 50% of cases.

标准库将预期大小四舍五入到最接近的2的幂,并将其分配为大小,然后将调整大小的阈值设置为75%。如果我们随机询问尺寸,标准库将在50%的情况下调整大小。

If avoiding the threshold would be the only consideration, multiplying with 1.34 would be enough to have enough space to avoid resizing on filling it with the expected size of elements.

如果避免阈值是唯一的考虑因素,乘以1.34就足以有足够的空间来避免在填充预期大小的元素时调整大小。

It looks like the typical speed vs space tradeof and the Google engineers being more speed freaks and the Sun/Oracle engineers more space nuts.

它看起来像典型的速度与空间交易,谷歌工程师更加速度怪,而太阳/甲骨文工程师更多的空间坚果。