字典是否被破坏或GetHashCode()仅基于不可变成员?

时间:2022-05-23 16:51:03

When an object is added to the .NET System.Collections.Generic.Dictionary class the hashcode of the key is stored internally and used for later comparisons. When the hashcode changes after its initial insertion into the dictionary it often becomes "inaccessible" and may surprise its users when an existence check, even using the same reference, returns false (sample code below).

将对象添加到.NET System.Collections.Generic.Dictionary类时,密钥的哈希码将在内部存储,并用于以后的比较。当哈希码在其初始插入字典后发生变化时,它经常变得“不可访问”,并且当存在检查(即使使用相同的引用)返回false(下面的示例代码)时,可能会使其用户感到惊讶。

The GetHashCode documentation says:

GetHashCode文档说:

The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method.

只要没有对对象状态的修改来确定对象的Equals方法的返回值,对象的GetHashCode方法必须始终返回相同的哈希代码。

So, according to the GetHashCode docs, the hashcode may change whenever equality-determining state is changed, yet the Dictionary implementation does not support this.

因此,根据GetHashCode文档,只要更改了相等确定状态,哈希码就可能会更改,但是Dictionary实现不支持这一点。

Is the current .NET dictionary implementation broken in that it incorrectly ignore the hashcode allowances? Should GetHashCode() only be based on immutable members? Or, is there something else to break a possible false dichotomy?

当前的.NET字典实现是否被打破,因为它错误地忽略了哈希码限额? GetHashCode()只应基于不可变成员吗?或者,还有什么可以打破可能的错误二分法吗?

class Hashable
{
    public int PK { get; set; }

    public override int GetHashCode()
    {
        if (PK != 0) return PK.GetHashCode();
        return base.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Hashable);
    }

    public virtual bool Equals(Hashable other)
    {
        if (other == null) return false;
        else if (ReferenceEquals(this, other)) return true;
        else if (PK != 0 && other.PK != 0) return Equals(PK, other.PK);
        return false;
    }

    public override string ToString()
    {
        return string.Format("Hashable {0}", PK);
    }
}

class Test
{
    static void Main(string[] args)
    {
        var dict = new Dictionary<Hashable, bool>();
        var h = new Hashable();
        dict.Add(h, true);

        h.PK = 42;
        if (!dict.ContainsKey(h)) // returns false, despite same reference
            dict.Add(h, false);
    }
}

2 个解决方案

#1


26  

No, you just shouldn't mutate a key (in a material way) after inserting it into a dictionary. This is by design, and the way that every hash table I've ever used works. The docs even specify this:

不,你不应该在将它插入字典后改变键(以物质方式)。这是设计,以及我曾经使用的每个哈希表的工作方式。文档甚至指定了这个:

As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value. Every key in a Dictionary<TKey, TValue> must be unique according to the dictionary's equality comparer. A key cannot be null, but a value can be, if the value type TValue is a reference type.

只要对象在Dictionary 中用作键,就不能以任何影响其哈希值的方式进行更改。根据字典的相等比较器,Dictionary 中的每个键必须是唯一的。如果值类型TValue是引用类型,则键不能为null,但值可以是。 ,tvalue> ,tvalue>

So it's only going to surprise users who don't read documentation :)

因此,只会让那些不阅读文档的用户感到惊讶:)

#2


8  

To add to Jon's answer, I would just add emphasis to a certain part of the docs that you quoted:

要添加Jon的答案,我只想强调你引用的文档的某些部分:

The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method.

只要没有对对象状态的修改来确定对象的Equals方法的返回值,对象的GetHashCode方法必须始终返回相同的哈希代码。

Now, right there you've broken the rules. You changed PK, which doesn't affect the outcome of Equals (because you have that ReferenceEquals check in there), but the result of your GetHashCode does change. So that's the simple answer.

现在,就在那里,你违反了规则。您更改了PK,这不会影响Equals的结果(因为您在那里检查了ReferenceEquals),但GetHashCode的结果确实发生了变化。这就是简单的答案。

Taking the more conceptual approach, I think you can look at it like this: if you have overridden the Equals and GetHashCode behavior for your type, then you've taken ownership of the concept of what it means for one instance of this type to be equal to another. And in fact you have defined it in a way that a Hashable object can be changed into something completely different; i.e., something that is no longer usable in the same way it was previously (because its hash code has changed).

采用更概念化的方法,我认为你可以这样看:如果你已经覆盖了你的类型的Equals和GetHashCode行为,那么你已经拥有了对于这种类型的一个实例意味着什么的概念。等于另一个。实际上,您已经将Hashable对象更改为完全不同的方式进行了定义;即,某些东西不再像以前那样可用(因为它的哈希码已经改变)。

Considered from this perspective, after you do dict.Add(h, true), and then you change h.PK, the dictionary doesn't contain the object referenced by h anymore. It contains something else (which doesn't really exist anywhere). It's kind of like the type you've defined is a snake that has shed its skin.

从这个角度考虑,在你执行dict.Add(h,true)之后,然后你改变h.PK,字典不再包含h引用的对象。它包含其他东西(在任何地方都不存在)。它有点像你定义的类型是一条脱落皮肤的蛇。

#1


26  

No, you just shouldn't mutate a key (in a material way) after inserting it into a dictionary. This is by design, and the way that every hash table I've ever used works. The docs even specify this:

不,你不应该在将它插入字典后改变键(以物质方式)。这是设计,以及我曾经使用的每个哈希表的工作方式。文档甚至指定了这个:

As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value. Every key in a Dictionary<TKey, TValue> must be unique according to the dictionary's equality comparer. A key cannot be null, but a value can be, if the value type TValue is a reference type.

只要对象在Dictionary 中用作键,就不能以任何影响其哈希值的方式进行更改。根据字典的相等比较器,Dictionary 中的每个键必须是唯一的。如果值类型TValue是引用类型,则键不能为null,但值可以是。 ,tvalue> ,tvalue>

So it's only going to surprise users who don't read documentation :)

因此,只会让那些不阅读文档的用户感到惊讶:)

#2


8  

To add to Jon's answer, I would just add emphasis to a certain part of the docs that you quoted:

要添加Jon的答案,我只想强调你引用的文档的某些部分:

The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method.

只要没有对对象状态的修改来确定对象的Equals方法的返回值,对象的GetHashCode方法必须始终返回相同的哈希代码。

Now, right there you've broken the rules. You changed PK, which doesn't affect the outcome of Equals (because you have that ReferenceEquals check in there), but the result of your GetHashCode does change. So that's the simple answer.

现在,就在那里,你违反了规则。您更改了PK,这不会影响Equals的结果(因为您在那里检查了ReferenceEquals),但GetHashCode的结果确实发生了变化。这就是简单的答案。

Taking the more conceptual approach, I think you can look at it like this: if you have overridden the Equals and GetHashCode behavior for your type, then you've taken ownership of the concept of what it means for one instance of this type to be equal to another. And in fact you have defined it in a way that a Hashable object can be changed into something completely different; i.e., something that is no longer usable in the same way it was previously (because its hash code has changed).

采用更概念化的方法,我认为你可以这样看:如果你已经覆盖了你的类型的Equals和GetHashCode行为,那么你已经拥有了对于这种类型的一个实例意味着什么的概念。等于另一个。实际上,您已经将Hashable对象更改为完全不同的方式进行了定义;即,某些东西不再像以前那样可用(因为它的哈希码已经改变)。

Considered from this perspective, after you do dict.Add(h, true), and then you change h.PK, the dictionary doesn't contain the object referenced by h anymore. It contains something else (which doesn't really exist anywhere). It's kind of like the type you've defined is a snake that has shed its skin.

从这个角度考虑,在你执行dict.Add(h,true)之后,然后你改变h.PK,字典不再包含h引用的对象。它包含其他东西(在任何地方都不存在)。它有点像你定义的类型是一条脱落皮肤的蛇。