## Google Interview Question

Software Engineer / Developers

**Country:** United States

**Interview Type:** In-Person

We can assume that the probabilities given in the array correspond to integer frequencies, and then sample from the cumulative (prefix-sum) frequencies:

```
// calculate the cumulative frequency: ps[i] = freq[0] + ... + freq[i]
ps[0] = freq[0];
for (int i = 1; i < n; i++)  // start at 1 so ps[i-1] is always valid
    ps[i] = ps[i-1] + freq[i];
int x = rand() % ps[n-1]; // x is an integer in [0, ps[n-1] - 1]
// find the first index whose prefix sum is strictly greater than x
int idx = upper_bound(ps.begin(), ps.end(), x) - ps.begin();
return a[idx]; // return the element associated with that index
```
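For readers who want something compilable, here is a minimal, self-contained sketch of the same prefix-sum idea. The function names (`BuildPrefixSums`, `IndexForDraw`, `SampleIndex`) and the use of `<random>` instead of `rand()` are assumptions made for illustration, not part of the original answer. It also assumes every frequency is a non-negative integer and at least one is positive.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Build prefix sums: ps[i] = freq[0] + ... + freq[i].
std::vector<int> BuildPrefixSums(const std::vector<int>& freq) {
    std::vector<int> ps(freq.size());
    int running = 0;
    for (std::size_t i = 0; i < freq.size(); ++i) {
        running += freq[i];
        ps[i] = running;
    }
    return ps;
}

// Map a draw x in [0, ps.back() - 1] to the index whose bucket contains it,
// i.e. the first index whose prefix sum is strictly greater than x.
std::size_t IndexForDraw(const std::vector<int>& ps, int x) {
    return static_cast<std::size_t>(
        std::upper_bound(ps.begin(), ps.end(), x) - ps.begin());
}

// Draw one index i with probability freq[i] / sum(freq).
std::size_t SampleIndex(const std::vector<int>& ps, std::mt19937& gen) {
    std::uniform_int_distribution<int> dist(0, ps.back() - 1);
    return IndexForDraw(ps, dist(gen));
}
```

Separating `IndexForDraw` from the random draw keeps the lookup deterministic, so it can be checked by hand: with frequencies {1, 3, 2} the prefix sums are {1, 4, 6}, and draws 1 through 3 all land on index 1.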

This version assumes that probs[i] is the probability of getting strings[i], which means probs.size() == strings.size() and sum(probs) == 1.0.

```
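#include <random>
#include <string>
#include <vector>

std::string GetRandomString(const std::vector<std::string>& strings,
                            const std::vector<double>& probs) {
    // Cumulative distribution: density[i] = probs[0] + ... + probs[i].
    std::vector<double> density;
    density.push_back(probs[0]);
    for (std::size_t i = 1; i < probs.size(); ++i)
        density.push_back(probs[i] + density.back());
    // Draw r uniformly from [0, 1).
    static std::mt19937 gen{std::random_device{}()};
    auto r = std::uniform_real_distribution<double>(0.0, 1.0)(gen);
    // Linear scan; the bound check guards against a sum that rounds slightly below 1.
    std::size_t index = 0;
    while (index + 1 < density.size() && r >= density[index])
        ++index;
    return strings[index];
}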
```
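As an alternative to building the cumulative array by hand, the C++ standard library provides `std::discrete_distribution`, which normalizes a list of weights and draws an index accordingly. The sketch below is one possible way to use it here; the function name `PickWithDiscreteDistribution` and the choice to pass the generator in as a parameter are assumptions for illustration.

```cpp
#include <random>
#include <string>
#include <vector>

// Returns strings[i] with probability weights[i] / sum(weights).
// Assumes strings.size() == weights.size() and at least one positive weight.
std::string PickWithDiscreteDistribution(const std::vector<std::string>& strings,
                                         const std::vector<double>& weights,
                                         std::mt19937& gen) {
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    return strings[dist(gen)];
}
```

Because the weights are normalized internally, they do not have to sum to exactly 1.0. If many draws are needed from the same distribution, constructing the `std::discrete_distribution` once and reusing it would avoid repeating the setup work on every call.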

The term "arbitrary probability distribution" over the given string set requires further clarification, because the distribution determines how we simulate the randomness used to pick the next string. For instance, suppose the probability array defines a uniform distribution over the strings. In that setting, we can simulate a random variable that gives the next string's index by mapping the output of a random number generator rnd() onto the range [1...n], where n is the total number of strings.

However, if the probability distribution defined over the set of strings is not uniform, then we must replace that uniform index generator with one that suits the given distribution.
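For the uniform case, a generator that returns an index in [1...n] could be sketched as follows. The function name `UniformIndex` and the use of `std::uniform_int_distribution` are assumptions for illustration. The common shortcut `rnd() % n + 1` is slightly biased whenever the generator's range is not a multiple of n, which is why a distribution object is used here.

```cpp
#include <random>

// Returns an integer drawn uniformly from [1, n]. Assumes n >= 1.
int UniformIndex(int n, std::mt19937& gen) {
    std::uniform_int_distribution<int> dist(1, n);
    return dist(gen);
}
```

Note that the result is 1-based, matching the [1...n] range in the text, so it would need to be reduced by one before indexing into a zero-based array or `std::vector`.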

- Anonymous September 26, 2013