Infosys Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: Written Test
This : "n + n*3/4 + n*(3/4)^2 + ... = O(n)" is just wrong. How long is the ... in this equation . This is probably O(N^2) . Even if you use the median for partitioning you have log n elements in the equation in the worst case . still O(N*log n).
@M: Take it to infinity if you want.
n + n*r + n*r^2 + ....(upto infinity) = n/(1-r) = O(n)
for 0 < r < 1.
I believe it's called a converging geometric series.
Suppose we want to sum from the 0th to the k-th term of a geometric series.
Let x = 1 + 3/4 + (3/4)^2 + ... + (3/4)^k
Then 3/4 x = 3/4 + (3/4)^2 + ... + (3/4)^(k+1)
subtracting, 1/4 x = 1 - (3/4)^(k+1) = 1 in the limit that k is large
x=4 in the limit, so the algo just ends up being 4*O(n) which is still O(n).
We'll want to be taking the smallest elements first. However, we want to avoid sorting to avoid having an O(n log n) algorithm. Of course if we could sort this would render the algorithm trivial: in the sorted array, take all the negative elements and zeros, and then as many positive elements as you can, starting from the smallest, until taking the next one would make your sum > M.
- eugene.yarovoi October 03, 2012We need an O(n) algo, so we'll have to do something different, something similar to quickselect. First, partition the list into one half having negative and zero elements, and the other half having positive elements. Take all the negative and zero elements and include them in your subset. Suppose their sum is S (which is negative or zero). You now need to find the largest subset of the positive integers that sums to T = N-S or less. Now we apply the following for the positive integers:
Pick a random pivot element (or pick the median using a deterministic method if you need determinism). Partition the set into two parts, one part less than and one part greater than or equal to the pivot. Sum the entire less than part. If the sum <= T, include all of these elements in the total count and look for the largest subset <= T-Sum in the greater than or equal part. If the sum is >T, don't include any elements in this step and repeat this step just on the less than part, with the same target T. Whenever we reach a set of size 1, we determine whether or not that one element is small enough to be included , and then stop (base case).
This is O(N) in the expectation because partioning the elements will first take O(n). Then the algo is repeated on just one of the partitions. Under perfect conditions, we'd get partition sizes of n/2 and n/2, but in the average case we might see something like 3n/4 and n/4. Even if it's more likely that the element we're looking for is in the 3n/4 part, we have n + n*3/4 + n*(3/4)^2 + ... = O(n). This is the expectation (on all inputs), but the worst case scenario if alll the dice roll the wrong way can be O(N^2) or even infinite depending on the precise implementation. However, if you use a deterministic O(N) algo for partitioning an array, this algo will then also be deterministic and worst-case O(N).