Bloomberg LP Interview Question
Software Engineer / Developersassume you have a memory space B stores the number you pick
1. starting from the very first from the stream A, assign B = A_1
2. move to the next one, if rand() > 1/2, assign B = A_2
3. move to the next one, if rand() > 2/3 assign B = A_3
...
k. move to the next one, if rand() > (k-1)/k assign B = A_k
...
you are guaranteed to give equal probability to each element
The probabily with with you are keeping the number is <= k-1/k. This value increases as k increases.
For eg, 1/2 is 0.5 10/11 is 0.90..
So how can you claim equal probability ?
So acoader, you say that when we get k = 1000, prob of keeping a number is 999/1000.
Hehehe, you just missed the big & small picture.
Prob of keeping 1st number number on iteration k is 1*(1/2)*(2/3)*(3/4)*...*((k-1)/k)=1/k
Prob of keeping nth number on it. k (n<k) is
(1/n)*((n)/(n+1))*((n+1)/(n+2))*...*((k-2)/(k-1))*(k-1)/(k) = 1/k
So, as colin said, it is guaranteed the equal probability.
colin is right. The algorithm he used has a name: Reservoir Sampling.
To CyberPhoenix, foobar, wiwi: Huh?
interesting question - i think that's the problem here is that the count in infinite.
If you assume that rand() returns a number from the infinite range 0 - 1 we an simply use that - the decimal part can be construed as index into this stream.
But theres something wrong with the above solution and I don't yet know how to improve it.
Well, rand() is a computer based function, and computers work with finite numbers (i.e. floating point is not infinite math).
Answer is simple....
It is a continuous stream, just keep track of the size of the data received.
Suppose currently you have 10 bytes, with this information you can easily pick number with equal probability. Assuming rand(10) picks number with equal probability rand(current_size) will do the trick.
int total= 0,size = 0;
while(true)
size = read(socket,buff,sizeof(buff));
total +=size;
}
rand(total); will give the number ....
Take the first number off the stream, use its absolute value as an index in the stream. If the numbers themselves are infinite, this should answer it.
- foobar November 24, 2008