[ACCEPTED]-What is a sensible prime for hashcode calculation?-primes
I recommend using 92821. Here's why.
To give a 35 meaningful answer to this you have to know 34 something about the possible values of i
and 33 j
. The only thing I can think of in general 32 is, that in many cases small values will 31 be more common than large values. (The odds 30 of 15 appearing as a value in your program 29 are much better than, say, 438281923.) So 28 it seems a good idea to make the smallest 27 hashcode collision as large as possible 26 by choosing an appropriate prime. For 31 25 this rather bad - already for i=-1
and j=31
you 24 have the same hash value as for i=0
and j=0
.
Since 23 this is interesting, I've written a little 22 program that searched the whole int range 21 for the best prime in this sense. That is, for 20 each prime I searched for the minimum value 19 of Math.abs(i) + Math.abs(j)
over all values of i,j
that have the same 18 hashcode as 0,0
, and then took the prime where 17 this minimum value is as large as possible.
Drumroll: the 16 best prime in this sense is 486187739 (with 15 the smallest collision being i=-25486, j=67194
). Nearly as 14 good and much easier to remember is 92821 13 with the smallest collision being i=-46272 and j=46016
.
If you 12 give "small" another meaning and want to 11 be the minimum of Math.sqrt(i*i+j*j)
for the collision as 10 large as possible, the results are a little 9 different: the best would be 1322837333 8 with i=-6815 and j=70091
, but my favourite 92821 (smallest 7 collision -46272,46016
) is again almost as good as the 6 best value.
I do acknowledge that it is quite 5 debatable whether these calculation make 4 much sense in practice. But I do think that 3 taking 92821 as prime makes much more sense 2 than 31, unless you have good reasons not 1 to.
Actually, if you take a prime so large that 7 it comes close to INT_MAX
, you have the same problem 6 because of modulo arithmetic. If you expect 5 to hash mostly strings of length 2, perhaps 4 a prime near the square root of INT_MAX
would be 3 best, if the strings you hash are longer 2 it doesn't matter so much and collisions 1 are unavoidable anyway...
Collisions may not be such a big issue... The 10 primary goal of the hash is to avoid using 9 equals for 1:1 comparisons. If you have 8 an implementation where equals is "generally" extremely 7 cheap for objects that have collided hashs, then 6 this is not an issue (at all).
In the end, what 5 is the best way of hashing depends on what 4 you are comparing. In the case of an int 3 pair (as in your example), using basic bitwise 2 operators could be sufficient (as using 1 & or ^).
You need to define your range for i and 1 j. You could use a prime number for both.
public int hashCode() {
http://primes.utm.edu/curios/ ;)
return 97654321 * i ^ 12356789 * j;
}
I'd choose 7243. Large enough to avoid collissions 2 with small numbers. Doesn't overflow to 1 small numbers quickly.
I just want to point out that hashcode has 3 nothing to do with prime. In JDK implementation 2
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
I found if you replace 31 with 27, the result 1 are very similar.
More Related questions
We use cookies to improve the performance of the site. By staying on our site, you agree to the terms of use of cookies.