from Leetcode tutorial
Hash Table
hash functions
in order to support quick insertion and searchhash set
: implementation of set
data structure to store no repeated values
hash map
: implementation of map
data structure to store (key, value)
pairshash function
is key to performance of insertion and searchhash function
to map keys to bucketskey
, hash function
decides which bucket key
should be assigned and key
will be stored in corresponding bucketkey
, the hash table will use same hash function
to find corresponding bucket and search only in specific bucket =====================
|| ||
|| ------------------> 0
|| | ||
0 --------------| ||
|| || 1
1987 -------------------| ||
|| | ||
|| |---------> 2
24 ----------| | ||
|| | | ||
|| | | || 3
2 ----------------------| ||
|| | ||
|| |--------------------> 4
|| ||
===================
x % 5 = y
as hash function
key
s through hash function
to map them to corresponding bucketi.e.
1987 is assigned to bucket 2 -- 1987 % 5 = 2
24 is assigned to bucket 4 -- 24 % 6 = 4
hash function
and search only in specific bucketi.e.
when searching for 1987, the same `hash function` is used to map 1987 to 2, and so we search in bucket 2 and successfully find 1987 in that bucket.
when searching for 23, will map 23 to 3 and therefore search in bucket 3. However, 23 is not in bucket 3, meaning that 23 is not in the hash table.
Two essential factors: the hash function
and collision resolution
hash function
is used depends on the range of key values and the number of bucketsKey Type | Key Range | Number of buckets | Hash Function Example |
---|---|---|---|
integer | 0 to 100,000 | 100 | y = x % 1000 |
char | 'a' to 'z' | 26 | y = x - 'a' |
array of integers | size <= 10, each number member of [0,1] | 1024 | y = x0 * 2^0 + x1 * 2^1 + ... + x9 * 2^9 |
array of integers | size <= 5, each number member of [0,3] | 1024 | y = x0 * 4^0 + x1 * 4^1 + ... + x4 * 4^4 |
NOTE: x represents key and y represents hash results
hash function
is an open problemhash function
would result in one-one mapping between key and bucketCollisions from result of hash function
are basically inevitable. Collision resolution algorithms should solve following concerns:
These concerns are related to capacity of bucket
and number of keys
which might be mapped into same bucket
according to hash function
.
Assume that a given bucket, which holds maximum number of keys, has N
keys.
If N
is constant and small, we could simply use array
to store keys in same bucket. If N
is variable or large, might need to use height-balanced binary search tree
instead.