A hash table stores key-value pairs by hashing each key into an array index. That lets the table jump near the right location instead of scanning every stored item one by one.
That is why hash tables are often close to O(1) for lookup, insertion, and deletion on average. The conditions matter: the hash function has to spread keys reasonably well, and the table still needs a rule for collisions.
What A Hash Table Does
A hash table stores key-value pairs such as "name" -> "Ada" or 42 -> "blue". It has two main parts:
- an array of slots or buckets
- a hash function that maps each key to one of those slots
If the array has m slots, you can think of the hash function as producing an index from 0 to m - 1.
For a small integer example, the rule might be h(k) = k mod 10.
Then the key 42 goes to slot 2 because 42 mod 10 = 2.
Real hash functions are designed more carefully, especially for strings and larger datasets. But the core idea stays the same: compute an index quickly, then check the item stored there.
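As a minimal sketch of that core idea (the 10-slot size and modulo rule are the same simple example assumptions used above, not a production hash function):

```python
def bucket_index(key, num_slots=10):
    """Map an integer key to a slot index in [0, num_slots - 1]."""
    return key % num_slots

# The key 42 lands in slot 2 because 42 % 10 == 2.
print(bucket_index(42))  # → 2
```

For non-integer keys, Python programs typically feed the built-in hash() through the same modulo step, e.g. `hash("name") % num_slots`.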
Why Hash Collisions Are Unavoidable
A collision happens when two different keys map to the same slot.
That is normal because the table usually has fewer slots than the number of possible keys. With
h(k) = k mod 10,
the keys 2, 12, and 22 all map to slot 2. So even though the function is working exactly as defined, the table still has a conflict to resolve.
So a hash table does not promise a unique slot for every key. It promises a fast way to narrow the search, provided collisions are handled well.
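A quick check makes the pigeonhole effect concrete (assuming the same modulo-10 rule as the example above):

```python
def bucket_index(key, num_slots=10):
    return key % num_slots

# Three distinct keys, one shared slot: a collision is guaranteed
# whenever there are more possible keys than slots.
print([bucket_index(k) for k in (2, 12, 22)])  # → [2, 2, 2]
```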
Worked Example: One Table, One Collision
Suppose a table has 10 slots and uses
h(k) = k mod 10.
Insert the keys 12, 22, and 7.
First, 12 goes to slot 2 because 12 mod 10 = 2.
Next, 22 also hashes to slot 2, so a collision happens.
Then 7 goes to slot 7, so that insertion does not collide.
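That trace can be reproduced directly. The table size 10, the rule k mod 10, and the keys 12, 22, and 7 are assumed example values; the collision is only detected here, since resolving it is the topic of the next sections:

```python
def slot_for(key, num_slots=10):
    return key % num_slots

occupied = {}                      # slot -> key already stored there
for key in (12, 22, 7):
    s = slot_for(key)
    if s in occupied and occupied[s] != key:
        print(f"key {key} -> slot {s}: collision with {occupied[s]}")
    else:
        occupied[s] = key
        print(f"key {key} -> slot {s}: placed")
```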
The useful pattern is always the same:
- Hash the key.
- Go to the suggested slot.
- If a different key is already there, apply the collision rule.
Once that pattern is clear, the two main collision strategies are easier to understand.
Chaining Vs. Open Addressing
Chaining keeps a bucket at each slot
In chaining, each slot stores a small collection of entries instead of just one.
If 12 and 22 both hash to slot 2, bucket 2 might store both key-value pairs in a short list.
To find key 22, the table jumps to bucket 2 and compares only the entries in that bucket.
Chaining is conceptually simple, and deletion is usually easier than in open addressing. The tradeoff is that long buckets make operations slower.
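A minimal chaining table might look like the following sketch (the class name and the fixed size of 10 are assumptions for illustration, not a reference implementation):

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining: one list per slot."""

    def __init__(self, num_slots=10):
        self.buckets = [[] for _ in range(num_slots)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # new key: append to the chain

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:                  # compare the actual key, not just the hash
                return v
        raise KeyError(key)

    def delete(self, key):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]             # deletion is a plain list removal
                return
        raise KeyError(key)
```

Note that get and delete still compare the stored key, because different keys can share a bucket.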
Open addressing stays inside the array
In open addressing, each array slot holds at most one entry. If the home slot is full, the table probes other slots according to a fixed rule.
One common rule is linear probing: if slot i is occupied, try i + 1, then i + 2, then i + 3, wrapping around if needed.
This avoids separate bucket lists, but nearby filled slots can build up into clusters. Another detail is that lookup and deletion have to respect the same probing rule used during insertion.
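A linear-probing sketch under the same assumptions (fixed size of 10, hashable keys). Deletion uses a tombstone marker so that later lookups keep probing past removed entries:

```python
_EMPTY, _DELETED = object(), object()   # sentinel markers for slot states

class ProbingHashTable:
    """Open addressing with linear probing in a fixed-size array."""

    def __init__(self, num_slots=10):
        self.slots = [_EMPTY] * num_slots

    def put(self, key, value):
        n = len(self.slots)
        i = hash(key) % n
        first_free = None                           # first tombstone seen, if any
        for step in range(n):
            j = (i + step) % n                      # wrap around the array
            entry = self.slots[j]
            if entry is _EMPTY:
                target = first_free if first_free is not None else j
                self.slots[target] = (key, value)
                return
            if entry is _DELETED:
                if first_free is None:
                    first_free = j
            elif entry[0] == key:
                self.slots[j] = (key, value)        # overwrite existing key
                return
        if first_free is not None:
            self.slots[first_free] = (key, value)
            return
        raise RuntimeError("table is full")

    def get(self, key):
        n = len(self.slots)
        i = hash(key) % n
        for step in range(n):
            j = (i + step) % n
            entry = self.slots[j]
            if entry is _EMPTY:                     # a truly empty slot ends the probe
                break
            if entry is not _DELETED and entry[0] == key:
                return entry[1]
        raise KeyError(key)

    def delete(self, key):
        n = len(self.slots)
        i = hash(key) % n
        for step in range(n):
            j = (i + step) % n
            entry = self.slots[j]
            if entry is _EMPTY:
                break
            if entry is not _DELETED and entry[0] == key:
                self.slots[j] = _DELETED            # tombstone keeps probe chains intact
                return
        raise KeyError(key)
```

The tombstone is the detail the text warns about: lookup and deletion must follow the same probe sequence as insertion, so a removed entry cannot simply become an empty slot.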
When O(1) Is A Fair Claim
Hash tables are often described as having O(1) lookup, insertion, and deletion on average. That average-case statement depends on conditions:
- the hash function spreads keys reasonably well
- the load factor stays under control
- the collision strategy is implemented correctly
The load factor is the ratio α = n / m, where n is the number of stored entries and m is the number of slots.
If the table gets too full, collisions become more common, and performance gets worse. That is why many implementations resize the array as it fills.
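The load factor and a resize trigger can be sketched in a few lines (the 0.75 threshold is a common but assumed choice; real implementations vary):

```python
def load_factor(num_entries, num_slots):
    """alpha = n / m: stored entries divided by available slots."""
    return num_entries / num_slots

def needs_resize(num_entries, num_slots, threshold=0.75):
    # Many implementations grow the array (often doubling it and
    # rehashing every key) once alpha passes a threshold.
    return load_factor(num_entries, num_slots) > threshold

print(needs_resize(8, 10))  # 0.8 exceeds 0.75, so True
```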
Worst-case performance can still degrade toward O(n) if too many keys pile into the same area.
Common Hash Table Mistakes
Assuming collisions mean the hash function failed
They do not. Collisions are expected. The real question is whether the table handles them efficiently.
Treating O(1) as unconditional
O(1) is usually an average-case statement, not a promise for every input and every moment.
Confusing hashing here with encryption
A hash function in a basic data-structure context is mainly about fast indexing, not secrecy. Those are different goals.
Forgetting to compare the actual key
Two keys can share the same hash result. After landing in the right bucket or probe position, the table still has to check whether the key actually matches.
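A two-line check shows why the final key comparison is needed (slot count 10 assumed):

```python
key_a, key_b = 7, 17           # two distinct keys
slot_a, slot_b = key_a % 10, key_b % 10

assert slot_a == slot_b        # same slot index...
assert key_a != key_b          # ...but different keys, so the table must compare keys
```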
When Hash Tables Are Used
Hash tables are used whenever fast key-based lookup matters. Common examples include dictionaries, symbol tables, caches, indexing, and counting frequencies in data.
They are a strong fit when exact lookup by key is more important than keeping items in sorted order. If you need ordered traversal, range queries, or nearest values, a different structure may be better.
Try A Similar Collision Example
Take a small array with 5 slots and use h(k) = k mod 5. Insert a few keys such as 3, 8, 13, and 6. Then resolve the collisions once with chaining and once with linear probing.
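To check your work on the exercise (the slot count 5 and the keys 3, 8, 13, and 6 are assumed example values):

```python
keys = [3, 8, 13, 6]
slots = {k: k % 5 for k in keys}
print(slots)   # 3, 8, and 13 all land in slot 3; 6 gets slot 1 to itself
```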
That one exercise makes the main idea concrete very quickly: collisions are unavoidable, and the collision rule changes how the table behaves. If you want a useful next step, try your own version with a different table size and see how the collisions change.