Google Interview Question
Software Engineer / Developers1) Create a hash table for the m elements in B and assign a negative value, say -1, to the elements mapped.
2)In array A, we are going to make use of 3 pointers to find the solution:
Pointer X - To maintain the starting position of the window.
Pointer Y - Maintains the current position. Starting at Position zero and linear scans the list while checking for each element in the Hashtable.
Pointer Z - To the maintain the ending position of the window.
We also maintain a LL to store every window(X,Z) that was found.
Start with X=0, Y=0, Z=m; (Z=m,since the window can't be less than m)
Scan A from left to right and check for each element in the array with the hash table. If an element is found in the hash table, we are going to update the hash table value with the current index value.
The scanning is done in the following manner. Y is the index of the current element:
1. Until the first element is found keep increasing x, z by 1.
2. For every new element A[y](Hash[A[y]] == -1) after the first element found, update hash value with current index.
3. For every old element(Hash[A[y]]!= -1) found in the hash table, increase Z by 1. Then, starting from A[X] check for the first element present in the hash table and update X with that index.(Make sure this new A[X] does not repeat itself until a second value starting from A[X] is found in the hash table)
4. For every element not found, increase Z by 1.
4.Repeat 2,3 till Y = Z. When Y = Z and if Z<n, add (X,Z) to the LL.
5.Repeat 2,3,4 till Y=n. If the LL is empty and if the any of the hash table value is -1, there is no such solution window. Else return the least element from the (Z-X) value from the LL.
The time complexity of this solution O(3n): O(n) for y + O(n-m) for x + O(n-m) for z + O(m) to build hash table + O(LL) to build the LL.
If B contains repetition it will not be possible to create a hash table for the m elements in B.
The idea behind is good. But it took time to understand the explanations.
Here is the same in other words:
0. A.Length = N, B.Length = K. Hash[i] = -1, where i belongs to [0..K-1]. x = 0. z = K - 1.
1. Until the first element of B array is found in A (A[x]) keep increasing x and z.
2. y = x + 1
3. For A[y] till y == z AND y < N AND z < N:
* if Hash[A[y]] does't exist (A[y] is not an element of B), z = z + 1
if Hash[A[y]] exists AND (Hash[A[y]] == -1 OR Hash[A[y]] < x) (A[y] is an element of B AND (we haven't met it before or we last we met it before current start index) ), update Hash[A[y]] = y.
if Hash[A[y]] exists AND Hash[A[y]] != 1 AND Hash[A[y]] >= x (A[y] is an element of B AND we have met this element within the current [x,z] range), then z = z + 1, update Hash[A[y]] = y. If Hash[A[x]] == x, then z = z + 1. Scan for the first A[q] which is element of B. x = q.
* y = y + 1. Go to 3.
5. If y == z push interval [x,z] to intervals list. y = y + 1. Go to 3.
6. If y == z == N - 1: If Hash[A[x]] == x, then z = z + 1. Scan for the first A[q] which is element of B. x = q. Go to 3.
Main idea here is that we always try to keep so many elements of array A in a set between x and z indexes, that we potentially can find all B elements in that set.
Hopefully, this is a little clearer.
By window, I am assuming that is a contiguous set of numbers in A.
Brute Force Algo
There are a total of n-m+1 such subsets of A whose length is m. Since the length of the required window is greater than or equal to m, then the number of windows to evaluate is sigma(n-k+1) where k runs from m to n which is quadratic in nature.
Sounds like we could use greedy strategies to intelligently eliminate looking at all subsets depending on the knowledge of what we have gained till then.
One possible input to the greedy measure is that since the windows is the smallest, the starting and end elements of A should be a member of B.
One possible input to the greedy measure is that since the windows is the smallest, the starting and end elements of A should be a member of B.
create a hashtable structure which will hold linked lists for elements - structA
also put all elements of B in simple hashset B.
now iterate through elements of array A:
if elements of A is present in hashset B, then add the position in structA.
once all elements of A are iterated,
if structA contains zero length list of any element, then the array is not found.
else
find the maximum of all 1st elements in structA - this gives the upper bound location.
to find the starting point, first find the maximum in each StructA list, the minimum from this list gives us starting point.
The array from starting point to upper bound is the smallest set in A which contains all elements from B.
Time complexity should be = n (first scan) + log(m) (search upper bound) + n (scan for maximums - it's last elem in each list) + log(m)
I did not understand the question. Can some one post the sample input and the sample output
I am assuming that all the elements of Array B will be present in array A, but I think this assumption makes the question very simple.
What am I missing here ?
e.g B(1)@A{2,5,7};
B(2)@A{3,6,9,10};
B(3)@A{11,12,17,21}
we just need to find smallest window of B{1,2,3}
-----> the above enumeration of the locations of elements of B in set A can be done in O(n)
next, we just need to sort the enumerations of each element's position, e.g for B(3) sort 11,12,17,21 in ascending order
and take the difference of the largest 1st number to the smallest first number in these series, e.g diff(2,11) as the min window size.
Overall O(nlogn) Solution
/*Two assumption I am taking
1) B has unique integers
2) There is one windows exists which contain all the elements */
package google5;
import java.util.HashMap;
public class MinWindow {
public static void main(String args[]) {
int a[] = { 8, 0, 1, 2, 3, 89, -90, 1 };
int b[] = { 1, 3, -90 };
getMinimumWindow(a, b);
}
private static void getMinimumWindow(int[] a, int[] b) {
// TODO Auto-generated method stub
int n = a.length;
int m = b.length;
HashMap<Integer, Integer> aMap = new HashMap<Integer, Integer>();
init(aMap, b);
int i = 0;
int j = 0;
int k = m - 1;
while (j <= k && k < n) {
int key = a[j];
boolean isExists = aMap.containsKey(key);
int value = 0;
if (isExists) {
value = aMap.get(key);
}
if (!isExists) {
//increament i j k until first match
if (i == j) {
i++;
j++;
k++;
} else {
j++;
k++;
}
}
//if exists count the frequency
if (isExists) {
value = value + 1;
aMap.put(key, value);
j++;
}
// So increament i if we have frequency of a[i]>1
while (aMap.containsKey(a[i])) {
value = aMap.get(a[i]);
if (value > 1) {
aMap.put(a[i], value - 1);
i++;
} else {
break;
}
}
}
System.out.println(i);
System.out.println(j);
}
Initialize the Map count frequency of integer in current window
private static void init(HashMap<Integer, Integer> aMap, int[] b) {
// TODO Auto-generated method stub
for (int i : b) {
aMap.put(i, 0);
}
}
}
/*
Very complex logic even I can not code it again but I try please test it or give some input
*/
import java.util.HashMap;
public class MinWin2 {
public static void main(String args[]) {
int a[] = { 1, 2, 3, 4, 8, 3, 2, 3, 3, 3 };
int b[] = { -90, 1, 3, -90, 10 };
System.out.println("First Case");
getMinimumWindow(a, b);
int c[] = { -90, 1, 3, 0, 10, -90, 3, -90, 5, 10 };
System.out.println("Second Case");
getMinimumWindow(c, b);
}
private static void getMinimumWindow(int[] a, int[] b) {
// TODO Auto-generated method stub
int n = a.length;
int m = b.length;
if (m > n)
return;
HashMap<Integer, int[]> aMap = new HashMap<Integer, int[]>();
init(aMap, b);
int i = 0;
int j = 0;
int start = 0;
int end = n - 1;
int match = 0;
while (true) {
int key = a[j];
boolean isExists = aMap.containsKey(key);
int[] value = null;
if (isExists) {
value = aMap.get(key);
value[0] = value[0] + 1;
j++;
if (value[0] <= value[1])
match++;
} else {
if (match == 0)
i++;
if (j < n - 1)
j++;
}
if (match == m) {
if (end - start > j - i) {
end = j;
start = i;
}
value = aMap.get(a[i]);
value[0]--;
i++;
if (value[0] < value[1])
match--;
while (!aMap.containsKey(a[i])) {
i++;
}
}
if (j == n - 1 && match < m)
break;
}
System.out.println(start);
System.out.println(end);
}
// Initialize the Map count frequency of integer in current window
private static void init(HashMap<Integer, int[]> aMap, int[] b) {
// TODO Auto-generated method stub
for (int i : b) {
int[] value;
if (aMap.containsKey(i)) {
value = aMap.get(i);
value[1]++;
} else {
value = new int[2];
value[0] = 0;
value[1] = 1;
}
aMap.put(i, value);
}
}
}
<pre lang="java" line="1" title="CodeMonkey36280" class="run-this">/* The class name doesn't have to be Main, as long as the class is not public. */
class Foo
{
public final String ERROR = "Error: ";
public String findMinimumWindow(String A, String B){
int min = 0;
int max = 0;
if (A != null && A.length() != 0) {
min = A.length();
max = -1;
if (B != null && B.length() != 0){
for ( int i=0; i < B.length(); i++)
{
int index = -1;
index = A.indexOf(B.charAt(i));
if (index == -1)
return ERROR + "'" + B.charAt(i) + "' not found" ;
if (min > index)
min = index;
if (max < index)
max = index;
}
}else {
return ERROR + "B is empty";
}
}else {
return ERROR + "A is empty";
}
return A.substring(min, max+1);
}
public static void main(String[] args) {
Foo foo = new Foo();
System.out.println(foo.findMinimumWindow("", null));
System.out.println(foo.findMinimumWindow(null, ""));
System.out.println(foo.findMinimumWindow("", ""));
System.out.println(foo.findMinimumWindow("abcdefgh", "dfc"));
System.out.println(foo.findMinimumWindow("a", "b"));
}
}
</pre><pre title="CodeMonkey36280" input="yes">1
2
10
42
11
</pre>
My solution:
a. Create a hash table with m keys, one for each value in B. Each key in H maps to a dynamic array of sorted indices containing indices in A that are equal to B[i]. This takes O(n) time. We go through each index j in A. If key A[i] exists in H (O(1) time) then add an value containing the index j of A to the list of indices that H[A[i]] maps to.
At this point we have 'binned' n elements into m bins. However, total storage is just O(n).
b. The 2nd part of the algorithm involves maintaining a ‘left’ index and a ‘right’ index for each list in H. Lets create two arrays of size m called L and R that contain these values. Initially in our example,
L = (0, 0, 0, 0)
R = (1, 1, 1, 1)
We also keep track of the “best” minimum window.
We then iterate over the following actions on L and R which are inherently greedy:
i. In each iteration, we compute the minimum and maximum values in L and R.
For L, Lmax - Lmin is the window and for R, Rmax - Rmin is the window. We update the best window if one of these windows is better than the current best window. We use a min heap to keep track of the minimum element in L and a max heap to keep track of the largest element in R. These take O(m*log(m)) time to build.
ii. From a ‘greedy’ perspective, we want to take the action that will minimize the window size in each L and R. For L it intuitively makes sense to increment the minimum index, and for R, it makes sense to decrement the maximum index.
We want to increment the array position for the minimum value until it is larger than the 2nd smallest element in L, and similarly, we want to decrement the array position for the largest value in R until it is smaller than the 2nd largest element in R.
Next, we make a key observation:
If L[i] is the minimum value in L and R[i] is less than the 2nd smallest element in L, ie, if R[i] were to still be the minimum value in L if L[i] were replaced with R[i], then we are done. We now have the “best” index in list i that can contribute to the minimum window. Also, all the other elements in R cannot contribute to the best window since their L values are all larger than L[i]. Similarly if R[j] is the maximum element in R and L[j] is greater than the 2nd largest value in R, we are also done by setting R[j] = L[j]. Any other index in array i to the left of L[j] has already been accounted for as have all indices to the right of R[j], and all indices between L[j] and R[j] will perform poorer than L[j].
Otherwise, we simply increment the array position L[i] until it is larger than the 2nd smallest element in L and decrement array position R[j] (where R[j] is the max in R) until it is smaller than the 2nd largest element in R. We compute the windows and update the best window if one of the L or R windows is smaller than the best window. We can do a Fibonacci search to optimally do the increment / decrement. We keep incrementing L[i] using Fibonacci increments until we are larger than the 2nd largest element in L. We can then perform binary search to get the smallest element L[i] that is larger than the 2nd largest element in L, similar for the set R. After the increment / decrement, we pop the largest element from the max heap for R and the minimum element for the min heap for L and insert the new values of L[i] and R[j] into the heaps. This is an O(log(m)) operation.
Step ii. would terminate when Lmin can’t move any more to the right or Rmax can’t move any more to the left (as the R/L values are the same). Note that we can have scenarios in which L[i] = R[i] but if it is not the minimum element in L or the maximum element in R, the algorithm would still continue.
Runtime analysis:
a. Creation of the hash table takes O(n) time and O(n) space.
b. Creation of heaps: O(m*log(m)) time and O(m) space.
c. The greedy iterative algorithm is a little harder to analyze. Its runtime is really bounded by the distribution of elements. Worst case, we cover all the elements in each array in the hash table. For each element, we perform an O(log(m)) heap update.
Worst case runtime is hence O(n*log(m)) for the iterative greedy algorithm. In the best case, we discover very fast that L[i] = R[i] for the minimum element in L or the maximum element in R…run time is O(1)*log(m) for the greedy algorithm!
Average case seems really hard to analyze. What is the average “convergence” of this algorithm to the minimum window. If we were to assume that the Fibonacci increments / binary search were to help, we could say we only look at m*log(n/m) elements (every list has n/m elements) in the average case. In that case, the running time of the greedy algorithm would be m*log(n/m)*log(m).
Total running time
Best case: O(n + m*log(m) + log(m)) time = O(n) assuming m << n
Average case: O(n + m*log(m) + m*log(n/m)*log(m)) time = O(n) assuming m << n.
Worst case: O(n + n*log(m) + m*log(m)) = O(n*log(m)) assuming m << n.
Space: O(n + m) (hashtable and heaps) always.
My solution:
a. Create a hash table with m keys, one for each value in B. Each key in H maps to a dynamic array of sorted indices containing indices in A that are equal to B[i]. This takes O(n) time. We go through each index j in A. If key A[i] exists in H (O(1) time) then add an value containing the index j of A to the list of indices that H[A[i]] maps to.
At this point we have 'binned' n elements into m bins. However, total storage is just O(n).
b. The 2nd part of the algorithm involves maintaining a ‘left’ index and a ‘right’ index for each list in H. Lets create two arrays of size m called L and R that contain these values. Initially in our example,
L = (0, 0, 0, 0)
R = (1, 1, 1, 1)
We also keep track of the “best” minimum window.
We then iterate over the following actions on L and R which are inherently greedy:
i. In each iteration, we compute the minimum and maximum values in L and R.
For L, Lmax - Lmin is the window and for R, Rmax - Rmin is the window. We update the best window if one of these windows is better than the current best window. We use a min heap to keep track of the minimum element in L and a max heap to keep track of the largest element in R. These take O(m*log(m)) time to build.
ii. From a ‘greedy’ perspective, we want to take the action that will minimize the window size in each L and R. For L it intuitively makes sense to increment the minimum index, and for R, it makes sense to decrement the maximum index.
We want to increment the array position for the minimum value until it is larger than the 2nd smallest element in L, and similarly, we want to decrement the array position for the largest value in R until it is smaller than the 2nd largest element in R.
Next, we make a key observation:
If L[i] is the minimum value in L and R[i] is less than the 2nd smallest element in L, ie, if R[i] were to still be the minimum value in L if L[i] were replaced with R[i], then we are done. We now have the “best” index in list i that can contribute to the minimum window. Also, all the other elements in R cannot contribute to the best window since their L values are all larger than L[i]. Similarly if R[j] is the maximum element in R and L[j] is greater than the 2nd largest value in R, we are also done by setting R[j] = L[j]. Any other index in array i to the left of L[j] has already been accounted for as have all indices to the right of R[j], and all indices between L[j] and R[j] will perform poorer than L[j].
Otherwise, we simply increment the array position L[i] until it is larger than the 2nd smallest element in L and decrement array position R[j] (where R[j] is the max in R) until it is smaller than the 2nd largest element in R. We compute the windows and update the best window if one of the L or R windows is smaller than the best window. We can do a Fibonacci search to optimally do the increment / decrement. We keep incrementing L[i] using Fibonacci increments until we are larger than the 2nd largest element in L. We can then perform binary search to get the smallest element L[i] that is larger than the 2nd largest element in L, similar for the set R. After the increment / decrement, we pop the largest element from the max heap for R and the minimum element for the min heap for L and insert the new values of L[i] and R[j] into the heaps. This is an O(log(m)) operation.
Step ii. would terminate when Lmin can’t move any more to the right or Rmax can’t move any more to the left (as the R/L values are the same). Note that we can have scenarios in which L[i] = R[i] but if it is not the minimum element in L or the maximum element in R, the algorithm would still continue.
Runtime analysis:
a. Creation of the hash table takes O(n) time and O(n) space.
b. Creation of heaps: O(m*log(m)) time and O(m) space.
c. The greedy iterative algorithm is a little harder to analyze. Its runtime is really bounded by the distribution of elements. Worst case, we cover all the elements in each array in the hash table. For each element, we perform an O(log(m)) heap update.
Worst case runtime is hence O(n*log(m)) for the iterative greedy algorithm. In the best case, we discover very fast that L[i] = R[i] for the minimum element in L or the maximum element in R…run time is O(1)*log(m) for the greedy algorithm!
Average case seems really hard to analyze. What is the average “convergence” of this algorithm to the minimum window. If we were to assume that the Fibonacci increments / binary search were to help, we could say we only look at m*log(n/m) elements (every list has n/m elements) in the average case. In that case, the running time of the greedy algorithm would be m*log(n/m)*log(m).
Total running time
Best case: O(n + m*log(m) + log(m)) time = O(n) assuming m << n
Average case: O(n + m*log(m) + m*log(n/m)*log(m)) time = O(n) assuming m << n.
Worst case: O(n + n*log(m) + m*log(m)) = O(n*log(m)) assuming m << n.
Space: O(n + m) (hashtable and heaps) always.
Solved example for my post:
A[5, 1, 1, 5, 6, 1, 1, 5]
B[5, 6]
H:
{
5 => {1, 4, 8}
6 => {5}
}
Greedy Algorithm:
L => {1, 1}
R => {3, 1}
Iteration 1:
a. Lmin = 1 (since H{5}[1] < H{6}[1]), Lmax = 5. Window: 5 - 1 + 1= 5
Increment Lmin pointer, it now becomes 2.
L => {2, 1}
Rmin = H{6}[1] = 5, Rmax = H{5}[3] = 8. Window = 8 - 5 + 1 = 4. Best window so far = 4 (less than 5 computed above).
We also note the indices in A (5, 8) for the best window.
Decrement Rmax, it now becomes 2 and the value is 4.
R => {2, 1}
b. Now, Lmin = 4 (H{5}[2]) and the index i in L is 1. Lmax = 5 (H{6}[1]) and the index in L is 2.
We can't increment Lmin since L[1] = R[1] = 2. Thus we just compute the window now.
The window = Lmax - Lmin + 1 = 2 which is the best window so far.
Thus, the best window in A = (4, 5).
Supppose the size of larger array is m and size of smaller array is n.
First sort the smaller array(O(nlogn))).
Then for each element in larger array perform a binary search in that sorted array.
When the first match occurs associate a two pointers say low and high with that array that will keep track of the end of the window.
When the next match is found just move the high pointer to that position.
Cover the whole larger array in this manner and u get a range between those two pointers which is the required window.
Algo
- aseemagarwal19 October 03, 2010First build an array for storing the positions of the elements of B as they occur in A
e.g A={7,6,3,2,1,4,8,7,9,10,5,3}
B={3,5,7,9}
then we have {2,11}
{10}
{0,7}
{8}
Then we find each permutation of path recursively
2,10,0,8 diff-10
2,10,7,8 diff-10
11,10,0,8 diff-11
11,10,7,8 diff-4
answer is the minimum path