Amazon Interview Question
Software Engineer / DevelopersCountry: India
Interview Type: Phone Interview
A weakness of this is that you store the whole stream - your storage will grow without bound.
@JeffD - it's not possible to solve this problem exactly as stated without storing the whole stream. Proof: consider how any number whatsoever seen at any time in the stream could become the median at some later point, depending on what values are observed later.
what if all are coming lesser than median in burst at some stage in that case u will end up shifting lot of elements from your max heap to min heap leading to a higher complexity.
I have below mentioned a logic of doing it with BST. Do provide your comments
I can only think, if the numbers are of bounded range (and limits known beforehand!) keep an array, one for each number. Keep a count in each position of the number of times that that number has been seen. Keep a variable 'median index' which is one of the indexes, and 'how far up' it is, of an imaginary pile of the value at that index. Eg if the median is at 5, and 5 (or the number at 5, if our range is offset) has been seen 6 times, our 'how far up' might be 4 (we're trying to model, as though our input was spread out. '4' means, the median is 4/6 of the way through the pile of sightings of '5'). When a number comes in we increment or decrement 'how far up' by 0.5, and if we have to move off that number either higher or lower then we do, that's the new median. This should be O(1), only it requires storage for the full range, of every /different/ number we expect to see. (I suppose it could be a sparse, linked list, too; for a number not previously seen, you'd have to do a binary search as to where to splice it in, which is O(lg number-of-different-numbers); it's independent of N itself at least. )
yup i think the heap logic is best :) best complexity i have thought the sam e logic also .
also without the heap it is possible in each iteration takes O(n) via insertion sort which is online sorting .in this insertion logic u can implement by the doubly linked list with prev and next pointers just create node only when the any input comes so the space never exhausted :)
We can do this BST also with same complexity beside no need to maintain two heaps.
For example we know the current median value[let it be node p] and current size of tree is odd.
Lets assume 10 new data comes up and 3 goes to left of p and 7 goes to right and still current size of tree is odd.
So we have to shift our p to p.successor four times and that will be our new median.
Besides if 9 data comes up and 3 goes to left and 6 goes to right of median and making current size of tree as even.
We need to shift out p to the right three times.
p=p.successor.successor.successor
and new median will be
(p+p.successor)/2
We can do this BST also with same complexity beside no need to maintain two heaps.
For example we know the current median value[let it be node p] and current size of tree is odd.
Lets assume 10 new data comes up and 3 goes to left of p and 7 goes to right and still current size of tree is odd.
So we have to shift our p to p.successor four times and that will be our new median.
Besides if 9 data comes up and 3 goes to left and 6 goes to right of median and making current size of tree as even.
We need to shift out p to the right three times.
p=p.successor.successor.successor
and new median will be
(p+p.successor)/2
We can do this BST also with same complexity beside no need to maintain two heaps.
For example we know the current median value[let it be node p] and current size of tree is odd.
Lets assume 10 new data comes up and 3 goes to left of p and 7 goes to right and still current size of tree is odd.
So we have to shift our p to p.successor four times and that will be our new median.
Besides if 9 data comes up and 3 goes to left and 6 goes to right of median and making current size of tree as even.
We need to shift out p to the right three times.
p=p.successor.successor.successor
and new median will be
(p+p.successor)/2
One solution is to use two priority heaps: a max heap for the values below the median, and a min heap for the values above the median. The median will be largest value of the max heap. When a new value arrives it is placed in the below heap if the value is less than or equal to the median, otherwise it is placed into the above heap. The heap sizes can be equal or the below heap has one extra. This constraint can easily be restored by shifting an element from one heap to the other. The median is available in constant time, so updates are O(lg n).
- Ali_BABA January 22, 2012private Comparator<Integer> maxHeapComparator, minHeapComparator;
private PriorityQueue<Integer> maxHeap, minHeap;
public void addNewNumber(int randomNumber) {
if (maxHeap.size() == minHeap.size()) {
if ((minHeap.peek() != null) &&
randomNumber > minHeap.peek()) {
maxHeap.offer(minHeap.poll());
minHeap.offer(randomNumber);
} else {
maxHeap.offer(randomNumber);
}
}
else {
if(randomNumber < maxHeap.peek()){
minHeap.offer(maxHeap.poll());
maxHeap.offer(randomNumber);
}
else {
minHeap.offer(randomNumber);
}
}
}
public static double getMedian() {
if (maxHeap.isEmpty()) return minHeap.peek();
else if (minHeap.isEmpty()) return maxHeap.peek();
if (maxHeap.size() == minHeap.size()) {
return (minHeap.peek() + maxHeap.peek()) / 2;
} else if (maxHeap.size() > minHeap.size()) {
return maxHeap.peek();
} else {
return minHeap.peek();
}
}