Adobe Interview Question
Data Scientists
Country: India
Interview Type: In-Person
The tricky part of this problem is making random() O(1). Use a set/hashmap to get O(1) insertions and deletions, and an auxiliary list so that random() can pick a random index in O(1). The other tricky part is keeping the list in sync with the hashmap/set. To do this, when random() selects a quote, swap it with the last element of the list, pop it off the end, and update the hashmap so the moved element points to its new index.
I have provided a toy example which represents this idea:
Solution in Python Below:
from random import randint

class QuoteManager:
    def __init__(self):
        self.quoteToIndicesMap = {}
        self.quotes = []

    def add(self, quote):
        if quote not in self.quoteToIndicesMap:
            # Add it
            self.quotes.append(quote)
            self.quoteToIndicesMap[quote] = len(self.quotes) - 1  # index
            return True
        return False  # Already there, return False

    def random(self):
        if len(self.quotes) == 0:
            print('All quotes are exhausted!')
            return 'INVALID!!'
        # Key trick is to swap with last element
        # and then delete it
        # Swap it with last position
        lastQuote = self.quotes[-1]
        randomIndex = randint(0, len(self.quotes) - 1)
        self.quotes[randomIndex], self.quotes[-1] = self.quotes[-1], self.quotes[randomIndex]
        # Delete it from list and the hashmap
        quoteToPop = self.quotes.pop()
        self.quoteToIndicesMap[lastQuote] = randomIndex
        del self.quoteToIndicesMap[quoteToPop]
        # Return the random quote
        return quoteToPop
Test code:
# Test code
q = QuoteManager()
print(q.add('Quote 1')) # True
print(q.add('Quote 2')) # True
print(q.add('Quote 3')) # True
print(q.add('Quote 4')) # True
print(q.add('Quote 1')) # False
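# The order below is random; each quote is returned exactly once
print(q.random()) # e.g. Quote 2
print(q.random()) # e.g. Quote 4
print(q.random()) # e.g. Quote 3
print(q.random()) # e.g. Quote 1
print(q.random()) # prints 'All quotes are exhausted!' and returns 'INVALID!!'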
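The solution above only consumes quotes through random(), so the indices stored in the hashmap are never actually read. As a rough sketch of the O(1) deletion mentioned earlier, here is a hypothetical variant (the class name QuoteSet and the remove() method are my own, not part of the original answer) in which random() does not consume the quote and remove() uses the stored index to do the swap-with-last trick:

```python
from random import randrange

class QuoteSet:
    """Hypothetical variant: quotes stay available until explicitly removed."""

    def __init__(self):
        self.index_of = {}  # quote -> position in self.quotes
        self.quotes = []

    def add(self, quote):
        if quote in self.index_of:
            return False
        self.index_of[quote] = len(self.quotes)
        self.quotes.append(quote)
        return True

    def remove(self, quote):
        # Look up (and forget) the position of the quote in O(1)
        pos = self.index_of.pop(quote, None)
        if pos is None:
            return False
        last = self.quotes.pop()
        if pos < len(self.quotes):
            # The removed quote was not the last one:
            # move the old last element into the freed slot
            self.quotes[pos] = last
            self.index_of[last] = pos
        return True

    def random(self):
        # Non-destructive: returns a random quote without removing it
        if not self.quotes:
            return None
        return self.quotes[randrange(len(self.quotes))]
```

All three operations are O(1) on average, since each one does a constant number of dictionary lookups and list operations at the end of the list.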
The question asks for O(1) access to a random element, and you are allowed to preprocess. To pick a random element in O(1), you need to know how many quotes there are, generate a random number, and access that quote in O(1). If the quotes do not change, traverse them once and store the offset of each quote in a second file (the index file). You can read the index into memory or random-access it (fixed record length). If you do not want to build an index yourself, use a DBMS. If you need to maintain the index after adding/deleting quotes, use a B-tree. If the data doesn't fit on a single machine, shard.
- cabbagesoup June 08, 2018
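The index-file idea in the answer above could look roughly like this. This is only a sketch under my own assumptions: one quote per line in a UTF-8 text file, and an index made of fixed-length 8-byte offsets. The function names build_index and random_quote are made up for illustration.

```python
import os
import random
import struct

RECORD = struct.Struct("<Q")  # assumed format: one 8-byte offset per quote

def build_index(quotes_path, index_path):
    """Traverse the quotes file once and write the byte offset of each line."""
    count = 0
    offset = 0
    with open(quotes_path, "rb") as src, open(index_path, "wb") as idx:
        for line in src:
            idx.write(RECORD.pack(offset))
            offset += len(line)  # length in bytes, including the newline
            count += 1
    return count

def random_quote(quotes_path, index_path):
    """Pick a random quote with two seeks, independent of the file size."""
    # Fixed record length means the quote count is just size / record size
    count = os.path.getsize(index_path) // RECORD.size
    if count == 0:
        return None
    k = random.randrange(count)
    with open(index_path, "rb") as idx:
        idx.seek(k * RECORD.size)
        (offset,) = RECORD.unpack(idx.read(RECORD.size))
    with open(quotes_path, "rb") as src:
        src.seek(offset)
        return src.readline().decode("utf-8").rstrip("\n")
```

Because every index record has the same length, the k-th offset sits at byte k * 8 of the index file, so no scan is needed. For a small index it would be just as reasonable to load all offsets into a list in memory once.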