Interview Question
InternsCountry: United States
Interview Type: In-Person
Your solution is fundamentally correct, but how could a_k or a_l be the same as a_i or a_j? We know a_i <> a_j, implying a_j - a_i <> 0. We then know that a_k = a_i + 2*(a_j - a_i) = a_j + (a_j - a_i), so clearly a_k is distinct from both a_i and a_j. Same logic applies for a_l. We actually do know all 4 numbers are distinct.
Also, instead of using a tree, a simpler and lower-constant factor method would be to just to have a sorted copy of the input array and do a binary search when you want to find an element.
Yes, a_j <> a_i implies all are distinct. You are right.
And correct about sorted array too.
Here is python code for the sorting + binary search idea
import bisect
def findAS(ints):
sints = sorted(ints)
n = len(sints)
for i in range(0,n):
for j in range(i+1,n):
b, l = check(sints, sints[i], sints[j])
if b:
return b,l
return (False, [])
def check(sints, ai, aj):
if (ai == aj):
return (False, [])
n = len(sints)
ak = ai + 2*(aj - ai)
al = ai + 3*(aj - ai)
pos = bisect.bisect_left(sints, ak)
if pos >= n or sints[pos] != ak:
return (False, [])
pos = bisect.bisect_left(sints, al)
if pos >= n or sints[pos] != al:
return (False, [])
return (True, [ai, aj, ak, al])
def main():
l = [31, 8, 2, 9, 5, 6, 12, 11, 20]
print findAS(l)
if __name__ == "__main__":
main()
@zyfo2: As far as I understand, seems like no one is claiming it will be. But thanks for pointing out.
@Anonymous someone said "instead of using a tree, a simpler and lower-constant factor method would be to just to have a sorted copy of the input array and do a binary search when you want to find an element." I was replying to him.
@zyfo: That statement does not claim hashtable is better. All it says is that compared to a tree, sorting and binary search might be better.
first sort the input.
a(0)<=a(1)<=a(2)...
we need to find the four term with (a(j)-a(i)) == (a(k) - a(j)) == (a(l) - a(k))
for a(0) we check if a(1) can form the four term, we need to search a(0)+2*(a(1)-a(0)) and a(0)+3*(a(1)-a(0)) from the input.
this process should take O(n^2logn) and seems work.
Sort the array.
Then for every combination of i and j, check if such k and l exist.
public static boolean FourTermSequence(int[] array)
{
Sort(array);
int length=array.length(),ai,aj,ak,al;
for(int i=0;i<length;i++)
{
for(int j=0;j<length;j++)
{
if(i==j)
continue;
ak=a[i]+2*(a[j]-a[i]);
al=a[i]+3*(a[j]-a[i]);
if(BinarySearch(array,ak)&&BinarySearch(array,al)
return true;
}
}
return false;
}
public static boolean BinarySearch(int[] array, int find)
{
int low=0,high=array.length()-1,mid;
while(low<high)
{
mid=(low+high)/2;
if(array[mid]==find)
return true;
else if(array[mid]>find)
high=mid-1;
else
low=mid+1;
}
return false;
}
Complexity is O(n2 logn)
Are there any improvements possible?
In order for this to be correct, you should also issue a continue statement if(a[i]==a[j]).
You can cut the constant factor in roughly 2 by only checking ascending sequences in the sorted array. This is correct because let's say there's a descending sequence x, x-y, x-2*y, x-3*y. Then there also exists an ascending sequence, namely (x- 3*y), (x- 3*y) + y, (x- 3*y) + 2*y, (x- 3*y) + 3y (that is, the same sequence backwards). Use
for(int i=0;i<length;i++)
{
for(int j=i+1;j<length;j++)
{
instead of
for(int i=0;i<length;i++)
{
for(int j=0;j<length;j++)
{
Your binary search also contains subtle problems. while(low<high) should be while(low <= high), because your ranges are inclusive on both ends, and you could be at the point where you've narrowed the search down to one element, but that one element could still be the element you're looking for. Also, it's recommended to use mid = low + (high - low)/2; instead of mid=(low+high)/2; to avoid the possibility that low+high could overflow (for very large arrays).
For every possible combination of (a_j, a_i), check if a_i + 2(a_j - a_i) and a_i + 3(a_j - a_i) are present in the input, say using a hastable or a tree.
- Anonymous January 25, 2013Note that the constraint is only a_j <> a_j, this implies a_k and a_l are distinct, but could possibly be same as a_i or a_j.
So either O(n^2) (expected, if using hashtable) or Theta(n^2log n) is using a balanced tree.