Data Structures and Algorithms - Old Questions
10. Why do we need Hashing? Discuss linear probing in detail.
Hashing is a well-known technique to search any particular element among several elements. It minimizes the number of comparisons while performing the search.
In hashing, an array data structure called as Hash table is used to store the data items. Hash table is a data structure used for storing and retrieving data very quickly. Insertion of data in the hash table is based on the key value. Hence every entry in the hash table is associated with some key. Using the hash key the required piece of data can be searched in the hash table by few or more key comparisons.
Fig: Hashing Mechanism
Hash function is a function which is used to put the data in the hash table. Hence one can use the same hash function to retrieve the data from the hash table. Thus hash function is used to implement the hash table. The integer returned by the hash function is called hash key.
In a hash table, data is stored in an array format, where each data value has its own unique index value. Access of data becomes very fast if we know the index of the desired data. Thus, using hashing insertion and search operations are very fast irrespective of the size of the data. So we need hashing.
Linear Probing
Linear probing is the method of handling collision. When collision occurs i.e. when two records demand for the same home bucket in the hash table then collision can be solved by placing the second record linearly down whenever the empty bucket is found. When use linear probing, the hash table is represented as a one-dimensional array with indices that range from 0 to the desired table size-1. Before inserting any elements into this table, we must initialize the table to represent the situation where all slots are empty. This allows us to detect overflows and collisions when we inset elements into the table. Then using some suitable hash function the element can be inserted into the hash table.
For example:
Consider that following keys are to be inserted in the hash table
131, 4, 8, 7, 21, 5, 31, 61, 9, 29
Initially, we will put the following keys in the hash table.
We will use Division hash function. That means the keys are placed using the formula
H(key) = key % tablesize
H(key) = key % 10
For instance the element 131 can be placed at
H(key) = 131 % 10 = 1
Index 1 will be the home bucket for 131. Continuing in this fashion we will place 4, 8, 7.
Now the next key to be inserted is 21. According to the hash function
H(key)=21%10 = 1
But the index 1 location is already occupied by 131 i.e. collision occurs. To resolve this collision we will linearly move down and at the next empty location we will prob the element. Therefore 21 will be placed at the index 2.