Message on Whatsapp 8879355057 for DSA(OA + Interview) + Fullstack Dev Training + 1-1 Personalized Mentoring to get 10+LPA Job

Introduction to Hashing (Find frequency of each number in the array) || Hashing-Part-1

Ask a question:

Author : Kumar K  

All Tutorials of Hashing are listed here : https://www.desiqna.in/tag/hashing_tutorials 

PART 1 (Introduction to Hashing) : 

So, most of us have heard about this term “Hashing” a lot in DSA and have seen people in fear as soon as they listen to it.

But I will make you fall in love with this topic . 


Important points:-

0. Make sure your programming basics in any one of these programming languages are clear(C/C++/Java/Python). Solve 30-40 problems from this list to brush up your fundamentals before starting with Hashing , in case you are a beginner : https://docs.google.com/document/d/1nFt_jEgy1X-UaKU2bu3OnC365ADCL-ls_cpXUMJrvx0/edit

1. To be good at Hashing, all you have to do is solve classical Hashing-problems and their variations.

2. Recognize patterns and improve upon your thinking skills.

3. Love Hashing, it will love you back !

4. It is one of the most important topics asked in coding interviews and OA of top companies.

5. Follow all the questions and solutions in this tutorial properly to gain mastery in this topic.

There are many variations of Hashing and I’ll discuss them all one by one so you can become a pro!


Introduction to Hashing :-

We will discuss about the theoretical jargon later , as it is not useful while solving the coding problems in interviews or contests. Let me explain the concept in most basic manner possible for now. 

Hashing is basically used to store details and information of a data structure(say array) in an efficient way which helps to solve a problem efficiently. 

That’s it!

Example :- 

Say I give you an array : [1,3,3,4,1,4,4,4,4] . And ask you(query) to find the frequency of some number in this array. I have multiple queries of this form. 

Meaning of frequency in this problem : Number of times a number appears in the array. 

Your final answer will look like this :- 

Query input number : (1) - Frequency of [1] = 2 

Query input number : (3) - Frequency of [3] = 2 

Query input number : (4) - Frequency of [4] = 5 

One way to solve it , is to run a for loop for each number in the query and calculate its frequency like this : (C++ code)

       int a[9] = [1,3,3,4,1,4,4,4,4] ; 
       int i = 1 ;
       while(i<=q){
           int query;
           cin>>query;
           int frequency = 0 ;
           int j = 0 ; 
           while(j<=8){
               if(a[j]==query){
                   frequency++;
               }
               j++;
           }
           i++;
       }     

Python : jdoodle.com/ia/CCZ 

Java : Code 



Python : Code 

Time Complexity : This method iterates over the full array of size 'N' for each of the 'Q' queries , so the time complexity will be O(N*Q) .

Can we improve it with the help of hashing ? 

Hell yea!

We discuss it below


Problem-1 : We are given an array of positive integers(a[n]) . We are given multiple queries of the form : [x] which means we need to output the frequency of integer 'x' in the array. All numbers inside the array are from range [0,N-1].

Analysis : Running a loop for each query[O(N)] and finding the frequency is a good idea but not very efficient as it takes O(N*Q) time.

Hashing Method

So , our given input array 'a' , let it be a[9] = {1,3,3,4,1,4,4,4,4} for example and demonstration purpose. Size of this array is 9.

Let us create a new array 'b' , lets its size be 9 , same as that of array 'a'. Let this array be empty , let it be filled with all zeroes.

b[9] = {0,0,0,0,0,0,0,0,0}

Main Idea : We will store frequency of all integers from array 'a' into array 'b'. 

This is the main idea of Hashing , we store the information of array 'a' into array 'b'(efficiently) which helps solving the problem efficiently. :-) 



What is the meaning of b[3]? 

- It will tell us about frequency of 3 in array 'a'.

What is the meaning of b[4]? 

- It will tell us about frequency of 4 in array 'a'.

What is the meaning of b[5]? 

- It will tell us about frequency of 5 in array 'a'.


Algorithm : We will travel the whole array , say , we encounter , a[i] , lets set , x = a[i] , then lets increment value at index 'x' in array 'b' , by doing this : 

                             x = a[i] 

                             b[x] = b[x] + 1 

After the algorithm is performed , this is how the final arrays look like : 

a[9] = {1,3,3,4,1,4,4,4,4

b[9] = {0,2,0,2,5,0,0,0,0}

We , can now clearly see , that b[1] = 2 , b[3] = 2 , b[4] = 5

It means , we have pre-calculated frequencies of all numbers in array 'a' by just running a single for loop.

Voila!!


Time Complexity : As we are running a single for loop , O(N) time is required , to answer all the queries , we iterate over the queries and answer each one of them in O(1) time as we have all the answers pre calculated :-) So , total time taken : O(N+Q).

Space Complexity : We created an extra array of size 'N' , which is array 'b' . So space taken is O(N) by the algorithm.


We can call array 'b' as hash-table or frequency array in this problem :) 


code(C++) :- 


Try it yourself ❯

Python : jdoodle.com/ia/CD3 



Java : Code 




But what if the range of numbers is large like 1000000000 ?? In our case , range was [0...N-1] so all the numbers could easily get their place in our frequency array which we were using as hash tables :) :) 

The solution lies in part two and part three of the Hashing Tutorial series :) :)

All Tutorials of Hashing are listed here : https://www.desiqna.in/tag/hashing_tutorials 

Next ❯

                                                 


Get best answers to any doubt/query/question related to programming , jobs, gate, internships and tech-companies. Feel free to ask a question and you will receive the best advice/suggestion related to anything you ask about software-engineering , development and programming problems .