Google Interview Question for Interns
- 0of 0 votes
I was asked in an interview: You are given a dump file of IPv4 addresses. You are to find 4 most common occurring subnets. Lets say an IP address if of type a.b.c.d you have to find most common occurring four subnets of the form,
Here * matches anything.
My first solution was build an in memory hashtable. Given an IP address a.b.c.d split it as ["a","b","c","d"] and add "a", "a.b", "a.b.c", "a.b.c.d" to the hash table and count it. [There are optimizations possible like considering the entire IP address as a 32 bit unsigned integer and count it with masks and shifts]
Then the question got extended: "assume you can never hold everything in memory, how would you solve it?" Now, the very first solution that I could say was to do an external sort and then count it.
The next solution I gave was to split the IP addresses into buckets. The algorithm was,
while there is an IP IP <- an IP address a <- first quadruple push IP to bucket[a]
The bucket which has maximum elements would give me the a.*.*.* solution. Now take each bucket and do the same. Even though this might give the correct result, in worst case I might end up having 255^4 buckets.- miscanon July 15, 2016 in United States
This is indeed an open ended question with more than one correct answer. What would be the best way to solve this?
| Report Duplicate | Flag | PURGE
Google Intern Algorithm
Interview Type: In-Person
Open Chat in New Window