from Leetcode tutorial
Hash Tablehash functions in order to support quick insertion and searchhash set: implementation of set data structure to store no repeated valueshash map: implementation of map data structure to store (key, value) pairshash function is key to performance of insertion and searchhash function to map keys to bucketskey, hash function decides which bucket key should be assigned and key will be stored in corresponding bucketkey, the hash table will use same hash function to find corresponding bucket and search only in specific bucket =====================
|| ||
|| ------------------> 0
|| | ||
0 --------------| ||
|| || 1
1987 -------------------| ||
|| | ||
|| |---------> 2
24 ----------| | ||
|| | | ||
|| | | || 3
2 ----------------------| ||
|| | ||
|| |--------------------> 4
|| ||
===================x % 5 = y as hash functionkeys through hash function to map them to corresponding bucketi.e.
1987 is assigned to bucket 2 -- 1987 % 5 = 2
24 is assigned to bucket 4 -- 24 % 6 = 4hash function and search only in specific bucketi.e.
when searching for 1987, the same `hash function` is used to map 1987 to 2, and so we search in bucket 2 and successfully find 1987 in that bucket.
when searching for 23, will map 23 to 3 and therefore search in bucket 3. However, 23 is not in bucket 3, meaning that 23 is not in the hash table.Two essential factors: the hash function and collision resolution
hash function is used depends on the range of key values and the number of buckets| Key Type | Key Range | Number of buckets | Hash Function Example |
|---|---|---|---|
| integer | 0 to 100,000 | 100 | y = x % 1000 |
| char | 'a' to 'z' | 26 | y = x - 'a' |
| array of integers | size <= 10, each number member of [0,1] | 1024 | y = x0 * 2^0 + x1 * 2^1 + ... + x9 * 2^9 |
| array of integers | size <= 5, each number member of [0,3] | 1024 | y = x0 * 4^0 + x1 * 4^1 + ... + x4 * 4^4 |
NOTE: x represents key and y represents hash results
hash function is an open problemhash function would result in one-one mapping between key and bucketCollisions from result of hash function are basically inevitable. Collision resolution algorithms should solve following concerns:
These concerns are related to capacity of bucket and number of keys which might be mapped into same bucket according to hash function.
Assume that a given bucket, which holds maximum number of keys, has N keys.
If N is constant and small, we could simply use array to store keys in same bucket. If N is variable or large, might need to use height-balanced binary search tree instead.