Google Interview Question
Software Engineers
Country: United States
If I understand the answer properly, we are given the frequency of each query and use it as the weight when choosing which query to return. In Python, I have assumed the data is a list of tuples, each containing a query and its count, e.g. [('dog', 5), ('cat', 7), ('cow', 2)]
from random import randrange

def rand_query(queries):
    # Give each query a half-open range [start, end) whose width is its count.
    data = []
    x = 0
    for query in queries:
        data.append((query, x, x + query[1]))
        x += query[1]
    # x is now the sum of all counts; draw an integer in [0, x).
    rand = randrange(x)
    for entry in data:
        if entry[1] <= rand < entry[2]:
            return entry[0][0]

if '__main__' == __name__:
    print(rand_query([('dog', 5), ('cat', 7), ('cow', 2)]))
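As a cross-check on the weighting idea, Python's standard library can do the weighted pick directly with random.choices (Python 3.6+). The sketch below is only an illustration under the same data-format assumption as above; the name rand_query_stdlib and the rng parameter are made up for this example.

```python
import random
from collections import Counter

def rand_query_stdlib(queries, rng=random):
    """Pick one query with probability proportional to its count."""
    names = [name for name, _ in queries]
    counts = [count for _, count in queries]
    # random.choices returns a list of k picks; take the single pick.
    return rng.choices(names, weights=counts, k=1)[0]

if __name__ == "__main__":
    queries = [('dog', 5), ('cat', 7), ('cow', 2)]
    # Over many draws the tallies should be roughly in a 5:7:2 ratio.
    tally = Counter(rand_query_stdlib(queries) for _ in range(14000))
    print(tally)
```

Tallying many draws like this is also a quick way to spot off-by-one mistakes in hand-written range code, since a wrong boundary shows up as a skewed ratio.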
function QueryGenerator(queries) {
    // Calculate the total of all counts.
    var sum = queries
        .map(function (x) { return x[1]; })
        .reduce(function (a, x) { return a + x; }, 0);
    // score[query] = how many times the query was returned in the current cycle.
    var score = {};
    var scoreSum = 0;
    return function getQuery() {
        while (true) {
            if (scoreSum === sum) {
                // Every query has used up its count: start a new cycle.
                score = {};
                scoreSum = 0;
            }
            // Math.floor gives every index the same chance; Math.round
            // would pick the first and last index half as often.
            var rand = Math.floor(Math.random() * queries.length);
            var query = queries[rand][0];
            var freq = queries[rand][1];
            var used = score[query] || 0;
            // Return the query only while it still has quota left in this cycle.
            if (used < freq) {
                score[query] = used + 1;
                scoreSum++;
                return query;
            }
        }
    };
}
Simply put, since each query has a count, we build an array of ranges in which each range represents one query. We then take rand() % countSum, where countSum is the sum of the counts of all queries, binary-search for the range that contains this value, and return the query that the range represents.
Here is the code, O(N)
- LaithBasilDotNet May 18, 2015
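The range-plus-binary-search idea in the last answer can be sketched in Python as follows. This is not the poster's code, only a minimal illustration: the class name WeightedQueryPicker is hypothetical, counts are assumed to be non-negative integers, and randrange stands in for rand() % countSum. Building the running totals is O(N); each pick is then O(log N).

```python
import bisect
import itertools
import random

class WeightedQueryPicker:
    """Hypothetical helper: weighted pick via running totals and binary search."""

    def __init__(self, queries):
        self.names = [name for name, _ in queries]
        # Running totals: for counts 5, 7, 2 this is [5, 12, 14].
        self.cumulative = list(
            itertools.accumulate(count for _, count in queries))
        if not self.cumulative or self.cumulative[-1] <= 0:
            raise ValueError("need at least one query with a positive count")

    def pick(self, rng=random):
        # Draw r with 0 <= r < countSum.
        r = rng.randrange(self.cumulative[-1])
        # Binary search for the first running total strictly greater than r;
        # its index is the query whose range contains r.
        return self.names[bisect.bisect_right(self.cumulative, r)]

if __name__ == "__main__":
    picker = WeightedQueryPicker([('dog', 5), ('cat', 7), ('cow', 2)])
    print(picker.pick())
```

With these counts, draws 0 to 4 map to dog, 5 to 11 to cat, and 12 to 13 to cow, so each query's share of the 14 possible values equals its count.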