[ACCEPTED]-Keys are not unique for a python dictionary!-hashtable

Accepted answer
Score: 16

This can happen if you violate a requirement 16 of dict, and change its hash.

When an object 15 is used in a dict, its hash value must not change, and 14 its equality to other objects must not change. Other 13 properties may change, as long as they don't 12 affect how it appears to the dict.

(This 11 does not mean that a hash value is never allowed 10 to change. That's a common misconception. Hash 9 values themselves may change. It's only 8 dict which requires that key hashes be immutable, not 7 __hash__ itself.)

The following code adds an object 6 to a dict, then changes its hash out from 5 under the dict. q[a] = 2 then adds a as a new key 4 in the dict, even though it's already present; since 3 the hash value changed, the dict doesn't 2 find the old value. This reproduces the 1 peculiarity you saw.

class Test(object):
    def __init__(self, h):
        self.h = h
    def __hash__(self):
        return self.h

a = Test(1)
q = {}
q[a] = 1
a.h = 2
q[a] = 2

print q

# True:
print len(set(q.keys())) != len(q.keys())
Score: 1

The underlying code for dictionaries and 8 sets is substantially the same, so you can 7 usually expect that len(set(d.keys()) == len(d.keys()) is an invariant.

That 6 said, both sets and dicts depend on __eq__ and 5 __hash__ to identify unique values and to 4 organize them for efficient search. So, if 3 those return inconsistent results (or violate 2 the rule that "a==b implies hash(a)==hash(b)", then 1 there is no way to enforce the invariant:

>>> from random import randrange
>>> class A():
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return bool(randrange(2))
    def __hash__(self):
        return randrange(8)
    def __repr__(self):
        return '|%d|' % self.x

>>> s = [A(i) for i in range(100)]
>>> d = dict.fromkeys(s)
>>> len(d.keys())
>>> len(set(d.keys()))

More Related questions