Big Fish Amazon Interview Question
iOS Developers SDE-2sCountry: United States
Interview Type: In-Person
I don't get something. You wrote:
If X > Y, the median cannot be in the left of X, and cannot be in the right of Y
and
If X < Y, we eliminate the right side of X and left side of Y
}
Now consider the following arrays:
X {10,11,12}
Y {1,2,3}
If my understanding is correct the median should be (10+3) / 2. Right?
So the statements shouldn't be as follows:
If X > Y, the median cannot be in the right of X, and cannot be in the left of Y
and
If X < Y, we eliminate the left side of X and left right of Y
}
?
Maintain separate indices for both arrays and do a "merge" without merging -- just keeping track of the current highest value (and prev in case arrays are same size and we need to compute mean as average of the middle two elements).
import java.util.Arrays;
public class TwoArrayMedian {
/**
* Find the median of two sorted arrays of 8 bit ints.
*/
public static void main(String[] args) {
byte[] a = {1, 3, 5, 7, 9, 11};
byte[] b = {2, 4, 6, 8, 10, 12};
int len = a.length + b.length;
int aIndex = 0;
int bIndex = 0;
boolean evenCount = (len % 2) == 0;
int mid = len / 2;
if (evenCount) mid += 1;
int curr = 0;
int prev = 0;
for (int i = 0; i < mid; i++) {
prev = curr;
if (aIndex > a.length || a[aIndex] > b[bIndex]) {
curr = b[bIndex++];
}
else {
curr = a[aIndex++];
}
}
double median;
if (evenCount) {
median = (curr + prev) / 2.0;
}
else {
median = curr;
}
System.out.println("Median of " + Arrays.toString(a) + " and " + Arrays.toString(b) + " is " + median + ".");
}
}
Rune time is closer to n/2 -> O(n) where n is combined length of both arrays.
Say these two sorted arrays are a[n] and b[m].
If n + m is odd, then the median should be the (n + m) / 2 smallest number in the merged array.
Else the median is the average of the (n + m - 1)/2 smallest and (n + m + 1)/2 smallest.
Now let's see how we can find the Kth(0, 1, ... n+m-1) number of two sorted arrays without merging them:
(0)say its possible range is a[i, j], middle index m = (i + j) / 2
(1)binary search for equal range of a[m] in b[], say it's [left, right), which means we can insert a[m] into b[] at index range [left, right] while keeping b[] sorted.
(2)If m + left > K, which means if the Kth item is in a[], it must be smaller than a[m] so it should be in range a[i, m-1];
Else if m + right < K, which means if Kth item is in a[], it must be larger than a[m] so it should be in range a[m+1, j];
Else a[m] is the Kth item.
(3)If i > j then the Kth item must be in b[], where we can surely find it by similar method as (1) and (2)
Due to dual binary search, this search process has complexity as log(n)*log(m)
My O(logn) solution:
Let X be the median of the first list, Y be the median of the second list. We can find X, Y in O(1) time, since lists are SORTED.
L1 = {-------X-------}
L2 = {-------Y-------}
We have: numbers in the left of X (or Y) are not bigger than X (or Y); and the number in the right of X (or Y) are not smaller than X (or Y).
So, if X == Y we can conclude that the median should be equal to X.
If X < Y, the median cannot be in the left of X, and cannot be in the right of Y. Thus we can eliminate the left side of X and right side of Y, and recursively do for two remaining lists.
If X > Y, we eliminate the right side of X and left side of Y, and recursively do for two remaining lists.
We can stop when each list remains no more than 2 numbers, and find the median separately, in constant time. (I chose 2 to avoid boundary cases, you can choose 1 if you want).
This WORKs because each time we eliminate the same number of elements in right side and in left side, thus makes sure that the median always stays in the remaining lists.
This takes O(logn) because each time we eliminate half of the numbers...
Pseudo code may look like:
findMedian(st1, ed1, st2, ed2):
{
if (ed1-st1 <=2 ) return find_Median_By_Hand(st1, ed1, st2, ed2);
mid1 = (st1+ed1)/2;
X = L1[mid1];
mid1 = (st2+ed2)/2;
Y = L2[mid2];
if (X==Y) return X;
else
if (X>Y) return findMedian(st1, mid1, mid2, ed2);
else
if (X<Y) return findMedian(mid1, ed1, st2, mid2);
}
The answer is:
res = findMedian (0, n, 0, n);
Its simple to finding kth element in the merged array in logarithmic time.
take pivot in two arrays such that sum of pivot 1 + pivot 2 =n.
compare the pivot elements and search the correcponding array half if not equal
int getk(int a[],int l1,int m,int b[],int l2,int n,int k)
{
if(m<=0 || k<=0 || k<=0 || k>(m+n))
return -1;
int i = m * (k-1) / (m+n);
int j = k-i-1;
int ai_1 = (i==0)?-100:a[l1 + i-1];
int bj_1 = (j==0)?-100:b[l2 + j-1];
int ai = (i==m)?100:a[l1 + i];
int bj = (j==n)?100:b[l2 + j];
if(bj_1<=ai && ai<=bj)
return ai;
if(ai_1<=bj && bj<=ai)
return bj;
if(ai<bj)
return getk(a,l1+i+1,m-i-1,b,l2,j,k-i-1);
return getk(a,l1,i,b,l2+j+1,n-j-1,k-j-1);
}
int getmed(int a[],int m,int b[],int n)
{
if((m+n)%2!=0)
return getk(a,0,m,b,0,n,(m+n)/2+1);
return (getk(a,0,m,b,0,n,(m+n)/2) + getk(a,0,m,b,0,n,(m+n)/2+1))/2;
}
This problem has many solutions if the two arrays are of equal size, if not then there is a really complex solution which is iterative which considers three cases where one array lies completely inside another array and similar two other cases. There is a solution given by an MIT Prof for unequal arrays which is recursive and very elegant.
The OP has not mentioned what is the constraint on array sizes.
#include<stdio.h>
int max(int, int); /* to get maximum of two integers */
int min(int, int); /* to get minimum of two integeres */
int median(int [], int); /* to get median of a sorted array */
/* This function returns median of ar1[] and ar2[].
Assumptions in this function:
Both ar1[] and ar2[] are sorted arrays
Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
int m1; /* For median of ar1 */
int m2; /* For median of ar2 */
/* return -1 for invalid input */
if (n <= 0)
return -1;
if (n == 1)
return (ar1[0] + ar2[0])/2;
if (n == 2)
return (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1])) / 2;
m1 = median(ar1, n); /* get the median of the first array */
m2 = median(ar2, n); /* get the median of the second array */
/* If medians are equal then return either m1 or m2 */
if (m1 == m2)
return m1;
/* if m1 < m2 then median must exist in ar1[m1....] and ar2[....m2] */
if (m1 < m2)
{
if (n % 2 == 0)
return getMedian(ar1 + n/2 - 1, ar2, n - n/2 +1);
else
return getMedian(ar1 + n/2, ar2, n - n/2);
}
/* if m1 > m2 then median must exist in ar1[....m1] and ar2[m2...] */
else
{
if (n % 2 == 0)
return getMedian(ar2 + n/2 - 1, ar1, n - n/2 + 1);
else
return getMedian(ar2 + n/2, ar1, n - n/2);
}
}
/* Function to get median of a sorted array */
int median(int arr[], int n)
{
if (n%2 == 0)
return (arr[n/2] + arr[n/2-1])/2;
else
return arr[n/2];
}
/* Driver program to test above function */
int main()
{
int ar1[] = {1, 2, 3, 6};
int ar2[] = {4, 6, 8, 10};
int n1 = sizeof(ar1)/sizeof(ar1[0]);
int n2 = sizeof(ar2)/sizeof(ar2[0]);
if (n1 == n2)
printf("Median is %d", getMedian(ar1, ar2, n1));
else
printf("Doesn't work for arrays of unequal size");
getchar();
return 0;
}
We suppose that the array after merging two sorted array is C.
We can solve this problem with time complexity O(log(lengthA + lengthB)).
This problem can be converted to the problem of finding kth element, k is (A’s length + B’ Length)/2. If we can solve the problem of finding kth element, we can solve this problem.
Fist suppose the number of A and B'elements are greater than K/2. We compare the A[k/2-1] and B[k/2-1]. If A[k/2-1] < B[k/2-1], this show that the elements from A[0] to A[k/2-1] are in the C's first K elements. So we can discard the elements from A[0] to A[k/2-1]. Then use recursion we can solve this problem.
// Assuming arr[ ]and brr[ ] are two sorted arrays. merge these two arrays and stored in a third array crr[ ]. use Insertion sort, as it will not shuffle the elements , which are already sorted.
code:
------
public int median(int[] arr, int[] brr)
{
int[] crr=new int[arr.length+brr.length];
int i,tmp;
for(i=0;i<arr.length;i++)
crr[i]=arr[i];
for(int j=0;j<brr.length;j++)
crr[i++]=brr[j];
for(int k=1,l;i<crr.length;k++)
{
tmp=crr[k];
for(l=k;tmp<crr[l] && l>0;l--)
crr[l]=crr[l-1];
crr[l]=tmp;
}
int mid= crr.length/2;
int median;
if(crr.length%2==1)
median=crr[mid];
else
median= (crr[mid-1]+crr[mid])/2;
return median;
}
This is a duplicated question with 5433150784143360
I had an O(logn) algorithm, posted in the original question.
Let X be the median of the first list, Y be the median of the second list. We can find X, Y in O(1) time, since lists are SORTED.
We have: numbers in the left of X (or Y) are not bigger than X (or Y); and the number in the right of X (or Y) are not smaller than X (or Y).
So, if X == Y we can conclude that the median should be equal to X.
If X < Y, the median cannot be in the left of X, and cannot be in the right of Y. Thus we can eliminate the left side of X and right side of Y, and recursively do for two remaining lists.
If X > Y, we eliminate the right side of X and left side of Y, and recursively do for two remaining lists.
We can stop when each list remains no more than 2 numbers, and find the median separately, in constant time. (I chose 2 to avoid boundary cases, you can choose 1 if you want).
This WORKs because each time we eliminate the same number of elements in right side and in left side, thus makes sure that the median always stays in the remaining lists.
This takes O(logn) because each time we eliminate half of the numbers...
Pseudo code may look like:
- ninhnnsoc May 02, 2014