Design a system where you can

Amazon Interview Question for Software Engineer / Developers

8 Answers
Design a system where you can return the top 20 queries made in the last 24 hours to users.

Think on the scale of Google and Yahoo. How would you store the data? What would your data structures be, and what algorithm would you use to get that data? Describe your assumptions, etc.

For simplicity, you can assume that every web server creates a log file with the query and a timestamp.
- aks October 11, 2012 in India
Amazon Software Engineer / Developer Application / UI Design

Country: India

Use MapReduce.

- Anonymous January 14, 2013
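
The MapReduce suggestion above is terse, so here is a minimal single-process sketch of what it might look like. It is not a real MapReduce job: `map_phase`, `shuffle`, `reduce_phase` and `top_queries` are hypothetical names, and the log format (a list of `(timestamp, query)` pairs) is an assumption based on the log file described in the question.

```python
from collections import defaultdict
import heapq

def map_phase(log_lines, cutoff):
    """Emit (query, 1) for every log entry at or after the cutoff timestamp."""
    for timestamp, query in log_lines:
        if timestamp >= cutoff:
            yield query, 1

def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the emitted counts for each query."""
    return {query: sum(values) for query, values in groups.items()}

def top_queries(log_lines, cutoff, k=20):
    """Count queries newer than the cutoff and return the k most frequent."""
    counts = reduce_phase(shuffle(map_phase(log_lines, cutoff)))
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])
```

In a real cluster the map and reduce steps would run on many machines, with the shuffle routing each query to a single reducer so that its count is complete before the top 20 are selected.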

Before answering, everyone, please consider the scale: Google handles more than 3 billion search queries daily.

- aks October 20, 2012

What about reading the logs, checking the date, and putting the search queries into a hash table with counters? Every time a query (or a similar query) comes in, increase its counter. Also keep a min-heap of the top queries; when a query's counter exceeds the heap's minimum, update the heap.
As the data will be huge, this work can be partitioned by first letter. To get the most-searched queries overall, we would then need to merge the heaps from the different letters.
Would appreciate any feedback.

- MG October 28, 2012
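
The idea in the comment above (count per partition, then merge the per-letter winners) can be sketched roughly as follows. The function names are hypothetical, everything runs in one process, and the partitions here are plain lists rather than separate machines.

```python
from collections import Counter, defaultdict
import heapq

def partition_by_first_letter(queries):
    """Split queries so that each distinct query lands in exactly one partition."""
    parts = defaultdict(list)
    for query in queries:
        parts[query[:1].lower()].append(query)
    return parts

def local_top_k(queries, k):
    """Count one partition and keep only its k most frequent queries."""
    return heapq.nlargest(k, Counter(queries).items(), key=lambda item: item[1])

def global_top_k(queries, k=20):
    """Merge the per-partition winners into the overall top k."""
    candidates = []
    for part in partition_by_first_letter(queries).values():
        candidates.extend(local_top_k(part, k))
    return heapq.nlargest(k, candidates, key=lambda item: item[1])
```

Because every distinct query belongs to a single partition, its local count is also its global count, so merging the local top-k lists gives an exact answer. First letters are unevenly distributed, so a hash of the query would probably spread load more evenly; that substitution is a suggestion, not something the comment proposes.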

In each data center, have a set of "count" servers that receive each incoming query and store it. These servers communicate with each other and with the count servers in other data centers. They can use a distributed algorithm with Lamport logical clocks to order the search queries and store them. They can then use a sliding-window algorithm to get the top 20 over the last 24 hours.

- Anonymous January 25, 2013
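
The sliding-window part of the comment above could look something like this on a single count server. It is only a sketch: `SlidingWindowCounter` is a hypothetical class, the hourly bucket size is an assumption, and the cross-server communication and Lamport ordering are left out entirely.

```python
from collections import Counter, deque

class SlidingWindowCounter:
    """Approximate per-query counts over a trailing window, using time buckets.

    Assumes timestamps (in seconds) arrive in non-decreasing order.
    """

    def __init__(self, window_seconds=24 * 3600, bucket_seconds=3600):
        self.window_seconds = window_seconds
        self.bucket_seconds = bucket_seconds
        self.buckets = deque()   # (bucket_start, Counter), oldest first
        self.totals = Counter()  # running counts over all live buckets

    def _expire(self, now):
        """Drop buckets that lie entirely outside the window ending at `now`."""
        limit = now - self.window_seconds
        while self.buckets and self.buckets[0][0] + self.bucket_seconds <= limit:
            _, old = self.buckets.popleft()
            for query, count in old.items():
                self.totals[query] -= count
                if self.totals[query] <= 0:
                    del self.totals[query]

    def record(self, query, timestamp):
        """Add one occurrence of `query` to the bucket covering `timestamp`."""
        self._expire(timestamp)
        start = timestamp - timestamp % self.bucket_seconds
        if not self.buckets or self.buckets[-1][0] != start:
            self.buckets.append((start, Counter()))
        self.buckets[-1][1][query] += 1
        self.totals[query] += 1

    def top(self, now, k=20):
        """Return the k most frequent queries in the current window."""
        self._expire(now)
        return self.totals.most_common(k)
```

The window is only accurate to one bucket width, since a bucket is dropped only once all of it is older than 24 hours. Smaller buckets tighten that at the cost of more memory.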

Since there could be many servers, each with its own log file, we can avoid processing the log files one by one. Instead, each server maintains a min-heap of its top requests, plus a hash map that indexes each query and is used to update that server's min-heap.
Then combine the results from all the servers, i.e. compare the min-heaps of all servers and consolidate them into one larger heap to produce the response.

- ashish June 27, 2016
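
One caveat on the comment above: if the same query can reach several servers, merging only each server's local top-20 heap can miss a query that is moderately popular everywhere but top-ranked nowhere. A minimal sketch of a safer variant, assuming each server can report its full per-query counts (the function names are hypothetical):

```python
from collections import Counter

def merge_server_counts(per_server_counts):
    """Sum per-query counts reported by every server."""
    total = Counter()
    for counts in per_server_counts:
        total.update(counts)  # Counter.update adds counts rather than replacing
    return total

def top_k(per_server_counts, k=20):
    """Select the k most frequent queries after the counts are merged."""
    return merge_server_counts(per_server_counts).most_common(k)

# "y" is not the local winner on either server, yet it wins globally.
server_1 = {"x": 5, "y": 4}
server_2 = {"z": 5, "y": 4}
print(top_k([server_1, server_2], k=1))
```

Shipping full counts is expensive at this scale. The usual alternative is to route each query to one counting server by key, so that local heaps can be merged safely.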

Use a min-heap of size 20 to keep the most frequently used queries, ordered by count.

- Dinesh October 13, 2012
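
For reference, a size-bounded min-heap as suggested above might be sketched like this. It assumes the per-query counts have already been computed into a dictionary, which the comment does not spell out; `top_k_by_count` is a hypothetical name.

```python
import heapq

def top_k_by_count(counts, k=20):
    """Return the k (count, query) pairs with the highest counts, largest first."""
    heap = []  # min-heap; heap[0] is the weakest entry among the current top k
    for query, count in counts.items():
        if len(heap) < k:
            heapq.heappush(heap, (count, query))
        elif count > heap[0][0]:
            # The new query beats the current minimum, so it takes that slot.
            heapq.heapreplace(heap, (count, query))
    return sorted(heap, reverse=True)
```

The heap never holds more than k entries, so selection costs O(n log k) time and O(k) extra memory for n distinct queries. The counting step, not the heap, is the hard part at the scale the question describes.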

Please note that Google handles more than 3 billion search queries daily. Do you want to add some details?

- aks October 20, 2012

Can we solve this using knapsack-problem logic?

- xian_7 November 07, 2012

CareerCup
