Groupon Interview Question for Software Engineer / Developers


Country: United States
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
0
of 0 vote

You can store an atomic counter for each time stamp. If url is hit and entry already exists for this timestamp then just increase the counter. When you need to run query just add all the counters between two timestamps. Query will be straightforward as you have only one entry for one timestamp

- msingh March 17, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

are you limiting your timestamp to hours, mins and seconds? or are you inclusing milliseconds as well? if we have a counter for each timestamp then we might end up having enormous number of counters as we will end up having a new timestamp for each millisecond or second. Please correct my assumption if wrong.

- vmzi March 30, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

One more thing that can be done is..create an extra counter for a day. In this case if you have to do large queries just add up counters for whole day + some extra rows (max 86400)
e.g.
between 01-01-2013-8:00:00 and 31-01-2013 8:00 you will have to scan
28 rows + 2* 8 * 60 * 60 rows to get actual number of hits

- m March 17, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

We create following data structure

create a b+tree like structure, where we use date to create child, 
        at first level we use year, 
        second level we use months,
        and at third level we use day, 
        and then to hours.

Now these are standard, we join these hours node by linked list. so that we can move easily to another hours.

After that we create categories for URL's, It may be domain's.
here actually we create a trie kind of structure to store URL's in minimum space, and to improve search.

Now the first part is used to find from where to where,we have to search. and once we find interval in tree, we go searching and print output.

- sonesh March 17, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I would like to have one log directory for each URL, and one log file for each day. For each log line in the log file with time in seconds and the accumulated number of access since the start of the log. All the access with same time in second is shared in the same line of the log in the file.
For statistics of the number of access time for the specified URL, you just find the specified directory for the special URL, and find all the log file within the time frame, for each log file, you can find the number of the access by subtracting the earlies time of access number in that log file from the latest one in the log file within the time frame, and sum all these number to formal the final result. Here is some concern on the design:
1> distributed effort for each URL and its time frame
2> Accumulated number of access within each day, so omitting the going through of all file.

- chenlc626 March 17, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

The problem bogs down to design a data structure wherein
1) A time stamp can map multiple URLs i.e Multiple URLs can be opened in a particular time stamp
2) A single URL map to multiple timestamps i.e A single URL can be opened at multiple timestamps.

This is what I have deduced. Can someone please let me know what can be used in such a situaltion (many to many relationship)

- DashDash April 25, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Trie+BinarySearch. Store URLs in Trie, at the end store the logged times in a list, they have to be chronological (otherwise, the concept of time between is inconsistent i guess)
So this list is sorted.
Given a URL and a query. Just get to the URL's list and use binary search to retrieve the required value.
This is a better than partitioning time when the number of logs per user is sparse and the users are high.

- Gowtham Rangarajan January 22, 2016 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Trie (to store/retrieve URL) + Binary search , Timestamp is Unix time stamp.

- Gowtham Rangarajan January 22, 2016 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More