Flipkart Interview Question
Software Engineer / DevelopersIf we want a solution better than nlogn then there is only one option left out is to find Kth largest element where K=(n+1)/2. This could be O(N^2) also in worst case but average case is still O(N).
@xedgenes: I am not agree with u that "median is also the average of the highest and the lowest element in the array."
Using a lazy-enough merge sort it's possible to find Kth largest element in O(n+K*log n) time. The idea is to build the binary tree of merge operations from the elements in O(n) time and getting the next element from the tree in O(log n) time. There is an algorithm how to do it in the lazy functional languages. But, probably, it will require some effort to implement it in Java.
we can use modified quick sort here.
after the first partition we need to check the index of pivot, if it is less than size/2 then we go far partitioning right side else left.do this recursively till the index of pivot not becomes size/2 i.e. median(kinda quick sort).
hence the total order will be O(n)+O(n/2)+O(n/4).... i.e. O(n).
correct me if i am wrong.
From my understanding all quicksort based solutions to find the median have expected (average) O(n) time complexity and the worst O(n2) time depending how good the pivot will be chosen. There is the median of medians algorithm with best/worst performance O(n) which is always better than O(n log n).
you can use the below algorithm:
1. take the number as it appears in the array ... say arr[0] and then arr[1] ... and so on.
2. maintain two heaps (min and max)," max heap" in left side of the median and "min heap" to the right side of the median, just to remember and visualize it better.
3. for arr[0] put it in max heap(I will call it now MAX_HEAP)
4. update your "median" also with arr[0]
5. take the next number arr[1], if it is greater than "median" put it in MIN_HEAP
6. maintain two variables(size_max_heap and size_min_heap) which keeps the number of element in the corresponding heaps
7. if (size_max_heap == size_min_heap) then median = ( "root element of max heap" + "root element of min heap")/2
8. else the root element of the heap which is bigger in size.
Well median would basically be the middle element of a sorted array or the average of the 2 middle elements in case of an even number of elements.
But it is also the average of the highest and the lowest element in the array.
So one simple solution is to keep a max and min variable and parse the array once to find out the true max and min in the array and get the average.
I was also thinking, we do 1 parse of the array and make a Search tree out of it. This takes O(n) time. Then find the left most leaf node and the right most leaf node (basically the highest and the lowest element in the tree) and find their average.
Does this seem right ?
You can't have it both ways.
If the numbers are 1, 2, 3, 4, 7, is the median 3 (middle element) or 4 (average of 1 and 7).
Constructing the tree would cost you O(n lg n) and not O(n).
When you are parsing the tree to get the leftmost and rightmost values, in the worst case, the tree formed would can be a linear linked list i.e. in the case when the array is sorted.
balancing the tree would cost you additional. Even constructing the tree from a sorted array would make it cost O(n squared), as the tree would cost n to reach the terminal node everytime and not log(n).
if we have an array like 1,3,5,7,31 5 is the median not 16. so median is not the average of max and min
"But it is also the average of the highest and the lowest element in the array."
This is only true if the array has all the elements present from start_element to end_element..more specifically,
1,2,3,4,5 = (1+5)/2=3; It ONLY works for this case.
So, we really cant go with that approach.
Use "Median of Medians" algorithm to select the pivot. Rest all is same as Quick-SELECT.
- iamnotiam April 20, 2011