A hash collision occurs when a hash function maps more than one distinct key to the same value in a hash table of size — this is one of the fundamental problems with hash tables.

We have a few mitigation strategies: open and closed hashing.

Open hashing

Open hashing (also hashing with chaining) is the idea that we can have array values actually be pointers to linked lists with keys that map to the same entry.

Worst-case time complexity is , but with an ideal hash function , the average list length is 1 and so the complexity would be .

How do we get there though? So that the average list length is 1?

  • We can make the hash table bigger
  • Use better hashing functions
    • Making table a prime number
    • Or having the hash function multiply by a large prime number before taking the modulus
      • i.e.,

Closed hashing

The idea with closed hashing (also open addressing) approaches is that no two keys will be saved to the same value.

  • Linear probing proposes we do: if there’s a collision at .
    • Could lead to clustering, where inserted keys tend to group together. This increases the average search time.
    • In practice, deletion seems complicated without using a special deleted value.
  • Quadratic probing proposes a similar idea: if there’s a collision.
  • Double hashing proposes: