Microsoft Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
Note that when given an input with no matching pairs, this code returns every element in the array as a separate solution, but this is easily fixed by initializing maxLen to 2.
Very elegant solution.
why is [9,10,8,-2] not the only solution ?
beacause, it is a maximal sub-array and also the sum of all pairs is more than 4
@sourav.hitk, those 4 elements are not a valid subarray of given array (its elements are not contiguous)
I am not getting one thing.. in heap solution, when we have to delete elements from j to j', how can we make sure that it takes O(log n) time? because we may have to delete any node of the heap and the search of the node a[j] to a[j'] can be worst case O(n) time and then deletion will take O(log n) time after finding that element.
@arwin: I hope i understood your question properly. Here is the response:
When we have to delete elements from j to j' it doesn't take O(logN) time. It takes (j' - j)*O(logN).
However, if you count the elements that are deleted for all the steps the algorithm performs then there are at most N elements to delete (because when you delete an element you increase j', it is never decremented and it goes up to N).
Or to put it in another way: throughout the running of the algorithm, you heap.remove() each element of the array at most once.
No my question is what is the complexity of deleting one element from the heap? we may have to delete intermediate nodes in the heap when we dont know the exact position of that node in the heap. So should not it take O(n) time to find it first and then to delete it will take O(log n) time?
For that you need to use additional data (like a hash map) for determining effficiently the position in the heap. Try googling for how decrease key is implemented efficiently for a heap.
However I recommend that you use a std::set (actually multiset or map in order to deal with duplicate elements) instead of a heap. That will make implementation of the algorithm much easier. I used the term "heap" in the solution description because of its main purpose (to keep track of the minimum).
how abt this?
find the maximum subarray length which sum>=k, assume it is L. Now create a dequeue of length L, now just and add 1 element and remove one element check the same, as we will store the sum in temporary variable it will be O(N) algorithm
@cristian.botau
Your solution is right.
Yet I still cannot deque solution. It seems like a sliding window.
Could you give further explanation for this by using some picture? Thank you.
I don't undertstand the question.
For k = 4 and array -4 9 10 4 -3 8 9 -2, why isn't 9 10 4 -3 8 9 answer?
Because not all pairs in
9 10 4 -3 8 9
add to > k
e.g. 4 + -3 = 1 which is less than k = 4
Still it is not clear
[ -4,9 ]
[9, 10]
[10,4]
[8,9]
[9, -2]
are not part of answers, is there any constraint on length of sub array ?
@OptimumsPrime,
[-4,9] is not a solution as it is not maximal - it is a subset of a larger solution [-4 9 10]
Use two pointers with a min-heap. Everytime when inserting into heap, check if new element + min-element < k. If so, print from begin to end. Update begin to min-element position. Update end to new-element position. If not, add it to heap, update end position.
With code it should be more clear:
int main(void){
int s=0, e=0, k;
int a[1000];
pq<int> p;
map<int,int> m;
int n;
scanf("%d%d",&n,&k);
for(int i=0;i<n;i++)
{
scanf("%d",&a[i]);
m[a[i]] = i;
}
p.push(-1*a[0]);
p.push(-1*a[1]);
s = 0;
e = 1;
while(e!=n)
{
e++;
int mn = -1*p.top();
if(mn + a[e] < k)
{
for(int i=s;i<e;i++)
cout << a[i] << " " ;
cout << endl;
p.pop();
s = m[mn]+1;
}
p.push(-1*a[e]);
}
int mn1 = -1*p.top();
p.pop();
int mn2 = -1*p.top();
if(mn1+mn2 >= k)
{
for(int i=s;i<e;i++)
cout << a[i] << " ";
cout << endl;
}
return 0;
}
Interesting use of min-heap. Could you explain your solution a bit in english - it is easier to follow than code.
Where is the heap in your code? Oh, I see, you are using priority_queue as heap. Interesting
Also, why are you multiplying items with -1?
You are only printing one solution. There may be multiple subarrays. See the question - it has an example
I still could not understand the problem. For the first example
suppose k = 4
-4 9 10 4 -3 8 9 -2
Answer should be:
-4 9 10 4 --> -4+9, 9+10, 10+4 are all greater than k=4
-3 8 9 -2 --> -3+8, 8+9, 9+-2 are all greater than k=4
in my opinion. Am I correct?
You have to look at all pairs (not just adjacent pairs) in the subarray.
-4 9 10 4 is not a solution as -4 + 4 < k = 4
@bambam:
Using '\0' to end an array of ints is a little bit creepy. Why not just use 0 instead? (it's basically the same value and you don't force the compiler to cast your char to int)
I haven't analyzed your solution in depth but those inner loops makes me a little bit skeptic about the O(N) complexity you're claiming.
Suppose A being a subarray which already holds the criteria, and p is the adjacent member to A, currently being checked.
if p+min1 >= k /*and p+min2 >= k*/
add p to the A,
update min1 /*and min2 values. */
where min1 is the smallest member of A
/* min2 is the smallest member of A which is greater than or equal to min1.
*/
O(N)
(min2 >= min1) and (p + min1 >= k) implies that (p + min2 >= k)
Hence use of min2 is redundant.
void PrintMax(int *a, int length, int k)
{
int start = 0;
int delta = k-a[start];
int end = 0;
int i = 1;
int maxend = 0;
int counter=0;
while(i < length)
{
counter++;
int cur = a[i];
if(cur >= delta )
{
end = i;
int tempDelta = k-cur;
if(tempDelta > delta)
delta = tempDelta;
i++;
}
else
{
if(end > maxend)
{
cout<<"\n Sub array...";
for(int j = start;j<=end; j++)
cout<<" "<<a[j];
maxend =end;
}
start++;
i = start+1;
delta = k- a[start];
}
}
if(end > maxend)
{
cout<<"\n Sub array...";
for(int j = start;j<=end; j++)
cout<<" "<<a[j];
}
}
Added minor correction to use the gained knowledge for starting next subarray search.
void PrintMax(int *a, int length, int k)
{
int start = 0;
int delta = k-a[start];
int end = 0;
int i = 1;
int maxend = 0;
int counter=0;
while(i < length)
{
counter++;
int cur = a[i];
if(cur >= delta )
{
end = i;
int tempDelta = k-cur;
if(tempDelta > delta)
delta = tempDelta;
i++;
}
else
{
if(end > maxend)
{
cout<<"\n Sub array...";
for(int j = start;j<=end; j++)
cout<<" "<<a[j];
maxend =end;
}
if(cur > a[start] )
start++;
else
start = i;
i = start+1;
delta = k- a[start];
}
}
if(end > maxend)
{
cout<<"\n Sub array...";
for(int j = start;j<=end; j++)
cout<<" "<<a[j];
}
}
i made a code to work it out .. complexity is surely less than O(n^2) . Please prove my code wrong because i haven't used either min heap or dequeue to solve it . :)
#include <iostream>
using namespace std;
void display(int B[],int n)
{
if(n>1)
{
for(int i=0;i<n;i++)
cout<<B[i]<<" ";
}
cout<<"\n";
}
int main()
{
int A[12]= {1,2,3,4,5,-5,1,2,3,4,5,-5};
int B[10];
int k,i=1;
int check=1;
int min=0,count=1,flag=0;
cout<<"Enter k\n";
cin>>k;
B[0]=A[0];
while(i<12) // 12 is harcoded .. take n instead
{
if(A[min]+A[i]>k) //to check if the condition satisfies
{
if(i>=check) flag=1; //minimum index to counter(to exclude any minimal subset)
if(A[i]<A[min]) //update min position
min=i;
B[count]=A[i];
count++; //total count in sub-array
i++;}
else
{ if(flag==1) //if displayable ( new elements added)
{display(B,count);flag=0;check=i;} // it should not be displayed till the index i is rediscovered again
{min=min+1;B[0]=A[min];i=min+1;count=1;}
}
}
if(flag==1)
display(B,count);
system("pause");
return 0;
}
"Please prove my code wrong because i haven't used either min heap or "
The burden of proof is on you to prove the correctness of your code, not on other people to prove it wrong.
Well the code does this :-
Step 1 It keeps the track of the minimum element index all the time..
Step 2 Checks if the condition holds and keeps storing them in another array B till it holds
Step3 When condition fail the current B array is displayed and then step 1 and 2 are repeated starting from (minimum+1) index because all the elements before that index are irrelevant now
Step 4 if all the elements are traversed exit and display again
As you can see the logic is simple ... keep track of the minimum element. keep going till the condition holds and if it fails the minimum element and all the elements before it are irrelevant now because the condition would fail if we include them. so restart from minimum + 1 instead.
This is a standard maximum sub array problem, running time O(nlogn):
Algorithm:
_ recurse on the left_array from low to mid
_ recurse on the right_array from mid+1 to high
_ Find maxArray recursively on the mid_array from low to high
_ compare each of the value to k and see which one is bigger than k, return
findMaxArray for mid:
start from mid to low find the largest sum
start from mid + 1 to high find the largest sum
return totalsum and the index of left and right.
Here is the concrete example written in python:
def findMaxSubArray(array,k):
def helper(array,low,high):
if high == low:
return array[low],low,high
mid = (low+high)/2
left_array,left_left_index,left_right_index = helper(array,low,mid)
right_array,right_left_index,right_right_index = helper(array,mid+1,high)
mid_array,mid_left_index,mid_right_index = findMid(array,low,mid,high)
if left_array >= k:
return left_array,left_left_index,left_right_index
elif right_array >= k:
return right_array,right_left_index,right_right_index
elif mid_array >= k:
return mid_array,mid_left_index,mid_right_index
else:
return -1,0,0
return helper(array,0,len(array)-1)
def findMid(array,low,mid,high):
left = float("-inf")
left_sum = 0
max_left = 0
max_right = 0
for i in range(mid,low-1,-1):
left_sum += array[i]
if left_sum >= left:
left = left_sum
max_left = i
right = float("-inf")
right_sum = 0
for i in range(mid+1,high):
right_sum += array[i]
if right_sum >= right:
right = right_sum
max_right = i
else:
break
return left + right,max_left,max_right
import random
f = lambda x:-1 if x%3 == 0 else 1
a = [random.randint(0,20)*f(i) for i in range(20)]
print a
print findMaxSubArray(a,20)
print a[findMaxSubArray(a,20)[1]:findMaxSubArray(a,20)[2]]
This algorithm only return meaningful array only if there is negative number in the array, otherwise, it will be O(n) just go from left to right until you find the array that have sum greater than k
Vikas, what is DP? Dynamic programming?
and as others mentioned, it does seem like they want the longest such subarray, right?
o(n*logn) for contiguous sub-array:
int array[MAXLEN], size, threshold;
struct info {
int min;
int leftend;
int rightend;
int mark;
int count;
};
static int SortFunc(const void *in1, const void *in2) {
int ind1=*((int*)in1), ind2=*((int*)in2);
if (array[ind1]<array[ind2]) return 1;
else if (array[ind1]>array[ind2]) return -1;
else return 0;
}
void find() {
struct info *arrayinfo;
int *arrind, x, y, ind, min, count, left, right, max;
arrayinfo=(struct info *)malloc(sizeof(struct info)*size);
arrind=(int *)malloc(sizeof(int)*size);
bzero(arrayinfo, sizeof(struct info)*size);
bzero(arrind, sizeof(int)*size);
for (x=0; x<size; x++) {
arrind[x]=x;
arrayinfo[x].leftend=arrayinfo[x].rightend=-1;
}
qsort(arrind, size, sizeof(int), SortFunc);
for (x=0; x<size; x++) {
ind=arrind[x];
if ((ind==0 || arrayinfo[ind-1].mark==0) && (ind==size-1 || arrayinfo[ind+1].mark==0)) {
arrayinfo[ind].mark=1;
arrayinfo[ind].count=0;
arrayinfo[ind].leftend=ind;
arrayinfo[ind].rightend=ind;
arrayinfo[ind].min=array[ind];
}
else {
if (ind>0 && arrayinfo[ind-1].mark==1 && array[ind]+arrayinfo[ind-1].min > threshold) {
min=(array[ind] > arrayinfo[ind-1].min) ? arrayinfo[ind-1].min : array[ind];
left=arrayinfo[ind-1].leftend;
if (arrayinfo[ind-1].count==0)
count=2;
else count=arrayinfo[ind-1].count+1;
arrayinfo[ind].mark=1;
arrayinfo[ind].leftend=arrayinfo[ind-1].leftend;
arrayinfo[ind].rightend=ind;
arrayinfo[ind].min=min;
arrayinfo[ind].count=count;
arrayinfo[left].rightend=ind;
arrayinfo[left].min=min;
arrayinfo[left].count=count;
}
if (ind<size-1 && arrayinfo[ind+1].mark==1 && array[ind]+arrayinfo[ind+1].min > threshold) {
min=(array[ind] > arrayinfo[ind+1].min) ? arrayinfo[ind+1].min : array[ind];
right=arrayinfo[ind+1].rightend;
if (arrayinfo[ind].count==0) {
if (arrayinfo[ind+1].count==0)
count=2;
else count=arrayinfo[ind+1].count+1;
}
else if (arrayinfo[ind+1].count==0)
count=arrayinfo[ind].count+1;
else count=arrayinfo[ind].count+arrayinfo[ind+1].count;
if (arrayinfo[ind].leftend==-1) {
left=ind;
arrayinfo[ind].leftend=ind;
}
else left=arrayinfo[ind].leftend;
arrayinfo[ind].mark=1;
arrayinfo[left].min=min;
arrayinfo[left].count=count;
arrayinfo[left].rightend=arrayinfo[ind+1].rightend;
arrayinfo[right].leftend=arrayinfo[ind].leftend;
arrayinfo[right].min=min;
arrayinfo[right].count=count;
}
}
}
max=-1;
for (x=0; x<size; x++)
if (arrayinfo[x].count>max) max=arrayinfo[x].count;
for (x=0; x<size; x++)
if (arrayinfo[x].count==max) break;
right=arrayinfo[x].rightend;
for (; x<=right; x++)
printf(" %d", array[x]);
printf("\n");
free(arrind);
free(arrayinfo);
}
This is incorrect. When min+a[i]<=k, you can't simply make j=i and min=a[i] and continue. You need to start over from j=j+1.
For example, k=7, input array is 145535, at index 4(value: 3), you can't make j=i (4). As you can see, the longest sub array is 5535. When you make j jump to index 4 directly, you will never find this sub array. In your code, you only find 455 and 35 while miss 5535.
We can sort the array in n log(n) time and then for each element 'a' in the array we can use binary search to find 'k-a' in the array in log(n) time...even if we dont find the 'k-a' the elements on the right of the index at the end of the search would have sum greater than k....the order of the algo will be n log(n).
Your worst case can't be what you claim it is. What if the longest subarray is much smaller than the total size of the array? You have to look at all, or at least most of, the elements. Your complexity has to be at least O(input size).
Your complexity is not o(n) nor o(2n). Every time you meet a value smaller than n/2, you "start over" from the position of previous member which is less than n/2. In the worst case, your complexity is o(n^2).
No - It only starts over once, because you can only have one value smaller than k/2. So at most it goes across the largest subarray twice, thus 2n
Eugene, you are correct. So the worst case is 2n, with n as the length of the input array.
Michael, you can only have one member smaller than n/2 in the sub-array, but there can be lots of members in the input array which is smaller than n/2. For example, k=7 and the input array is 983562......, when you reach 2 at index 5, you need to start over from index 3 (value:5). This makes your code o(n^2) where n is the length of the input array.
This can be done in O(N*log(N)) time using a min-heap or O(N) using a dequeue.
Basically, the algorithm works like this: for each index i in the array computes the longest subarray that ends at position i and satisfies the requested condition.
Now, let's consider we're at index i, and [j ... i-1] is the longest subarray found in the previous iteration (for i-1). In order to compute the subarray for this iteration we need to find the smallest j' >= j such that min(a[j'], .., a[i-1]) + a[i] >= K.
Now, the trick is how to find j' efficiently.
A first approach is to use a min-heap and start with j' = j and then increment j' and remove element a[j'] from heap until the condition holds (or you reach i). Since j is incremented at most N times => there are a total of N calls to heap.remove_element. Since i is incremented N times => there are N calls to heap.insert_element. => final complexity O(N*log(N)).
A second approach, which is a little bit trickier (I suggest getting a pen and paper for this) is using a deque instead of heap. The constructed deque will have these important properties:
- in the front of the deque is index of the minimum element in seq [j..i-1] (just like the heap)
- the second element is the index of the minimum element in the sequence that remains after removing the first minimum along with the elements in front of it;
- and so on.
So basically if dequeue = [m1, m2, ...] then the initial sequence looks like this [j ... m1 ... m2 ... i-1], and:
- m1 is the index of minimum of sequence [j .. i-1],
- m2 is the index of minimum of sequence (m1 .. i-1] (please note that the interval is open at m1)
I won't explain how you perform the operations on the dequeue in order to prserve those properties (try to think them yourself or look at the code / if you have any questions feel free to ask). You have the implementation below for the time-optimal (dequeue) solution. The methods for updating the deque are push(i) - updates the deque by adding element a[i] and popBadMins() which removes minimums from dequeue and returns the new j'.
Friendly advice: If you're not familiar with dequeue trick, I suggest you try to understand it because it proved to be helpful in programming contests.
Note: Didn't test this thoroughly so I might have missed some corner cases.
- cristian.botau October 31, 2012Oh, and sorry for the long post.