Twitter Interview Question
Data EngineersCountry: United States
What are we designing for? Space? Time?
If space is available and we can keep all data:
Solution #1
- Simple solution keep sorted list O(nlogn)
- return proper slice of list for the queried time O(logN)
Solution #2
- Create BST and insert into tree O(logN)
- return subtree whos time is less than the queried time O(logN)
If for space, assume were feed the hashtags one at time
Solution #1
- N = 10
- Create three sorted bins of size N elements. For 1min, 10min, 1hr
- create helper function refresh:
- In bin1Min, remove expired items, and bubble up to bin10Min or bin1hr if applicable
- in bin10Min, remove expired items, and bubble up to bin1hr if applicable
- in bin1hr, remove expired items.
- on insert
- call refresh()
- insert new item into proper bin based on current time difference
- find insert position( selection sort since N is small )
- if over threshold of elements, insert new item in front, and pop last
- if poped, bubble up if applicable
Do you guys think the above solution would work? Would love some feedback.
This is nothing but LRU cache.
- steephen August 29, 2017