Amazon Interview Question
Testing / Quality Assurances"S1/n1 = S2/n2 = (S1+S2)/(n1+n2) = S/n"
wrong, this s1/n1 = s2/n2 doesn't imply (s1+s2)/(n1+n2)
@sinvija - s1/n1 = s2/n2 doesn't imply (s1+s2)/(n1+n2)
You are mistaken in assuming that by saying "S1/n1 = S2/n2 = (S1+S2)/(n1+n2) = S/n"
the writer wants to say that s1/n1 = s2/n2 implies (s1+s2)/(n1+n2).
Instead this derives from the conditions.
average is same for two parts, so s1/n1 = s2/n = S/n.
Now just replace S with (s1+s2) & n with (n1+n2).
So what, I often ask to solve NP problems on interviews. Many NP problems have trivial solutions that give you close to optimal results in average case etc. This is DP question BTW.
@Anonymous: Yes partition problem is NP, and you can't find exact solution for NP problems if problem set becomes larger, however, there are subset of NP problems which can be solved in pseudo linear time using DP if space constraint is satisfied, and they are called weak NP problems. Partition problem and subset-sum problem happens to be one of those weak NP problems which can be solved in pseudo linear time using DP
{{
private static boolean makeSameAverage(int... a) {
int sum = 0;
for (int n : a)
sum += n;
boolean dp[][] = new boolean[sum][a.length];
dp[0][0] = true;
for(int n : a){
for (int s = n; s < sum; ++s) {
for (int i = 0; i < a.length - 1; ++i) {
if (dp[s - n][i]) {
dp[s][i + 1] = true;
// found something?
if (i > 0 && i < a.length && sum * i == s * a.length)
return true;
}
}
}
}
return false;
}
}}
private static boolean makeSameAverage(int... a) {
int sum = 0;
for (int n : a)
sum += n;
boolean dp[][] = new boolean[sum][a.length];
dp[0][0] = true;
for(int n : a){
for (int s = n; s < sum; ++s) {
for (int i = 0; i < a.length - 1; ++i) {
if (dp[s - n][i]) {
dp[s][i + 1] = true;
// found something?
if (i > 0 && i < a.length && sum * i == s * a.length)
return true;
}
}
}
}
return false;
}
If both parts have the same average and you add the 2 segments together the average is the same as the whole.
So do this:
1. Go over the array and get the length and sum, then divide them to get average. O(n)
2. Then use this O(n^2) algorithm to find a subset that has the same average.
public List foundSetWithAverage(int[] arr, float avg)
{
return foundSetWithAverageHelper(arr, avg, 0, 0, 0);
}
private List findSetWithAverageHelper(int[] arr, int pos, float avg, int sum, int len)
{
if (pos == arr.length)
{
return null;
}
List list = findSetWithAverageHelper(arr, pos + 1, avg, sum, len);
if (list != null)
{
return list;
}
sum = sum + arr[pos];
len = len + 1;
if (sum / len == avg)
{
list = new List();
list.add(pos);
return list;
}
list = findSetWithAverageHelper(arr, pos + 1, avg, sum, len);
if (list != null)
{
list.add(pos);
return list;
}
return null;
}
3. Then split the original one into 2 other ones O(n)
S= S1 union S2
be the partition
s1 is summation of S1
s2 is summation of S2
s1/n1 =s2/n2 = (s1+s2)/(n1+n2)=s/n where s is summation of all elements of S
let P(i,j) denote there is a subset of 1..i such that summation=j
so find P(i,j) such that j=s/n
and
P(i,j) = max { P(i-1,j) ,P(i-1,j-ai) } ai is the ith element
if P(n,s/n)=1 then return "yes"
else return "no partition"
I think following recursion will be helpful:
bool func(number_of_elements, required_sum, range)
{
if (range < number_of_elements) return false;
if (required_sum < 0) return false;
return func(number_of_elements, required_sum, range - 1) || func(number_of_elements - 1, required_sum - arr[range - 1], range - 1);
}
I would not probably use DP specifically for this problem. Memoization seems like a better idea, as I am guessing that most of the sub-problems will not get used.
FuncResults = new [N][S][N] // where N is number of elements in the array. S is the overall sum of the array.
The array will be initialized to -1 representing mot calculated so far.
bool func(number_of_elements, required_sum, range)
{
if (range < number_of_elements) return false;
if (required_sum < 0) return false;
// not calculated so now is the time to calculate.
if (FuncResults[number_of_elements][required_sum][range] == -1)
{
// other values are 0 or 1.
FuncResults[number_of_elements][required_sum][range] = func(number_of_elements, required_sum, range - 1) || func(number_of_elements - 1, required_sum - arr[range - 1], range - 1);
}
return FuncResults[number_of_elements][required_sum][range];
}
Now let's come to the partition.
All sums could be in range of Min(arr) to S. Number of elements could be one partition could be i and N-i.
for (int s = Min(arr) ; s < S; ++s)
{
for (int elements = 1; elements <= N/2; ++elements)
{
if (s * (N - elements) == (S - s) * elements) // if (s/elements == (S -s) / (N - elements)
AND (func(elements, s, N)))
{
return true;
}
}
return false;
}
Overall cost will be n*n*S
//partition the array into parts such that the difference of avg of two partitions is minimum
//DP problem
// S[i][j][k] = true if there exists some subset of size j from index [0...(k-1)] such that the sum of subset equals i
// S[i][j][k] = S[i][j][k-1] || S[(i-array[k-1])][j-1][k-1]
// E[i][j] is used to store the array elements which would help in finding the elements of the partition
// E[i][j] = array[k] such that S[i-array[k]][j-1][k] = true
// after finding S[i][j][k] for 0<=i<=(sum of array) , 0<=j,k<=size ,
// we need to find S[i][j][size] such that ( (overall avg) - i/j ) is minimum , so check S for 1<=i<=sum , 1<=j<=size/2 , k=size
#include<iostream>
#include<cstdlib>
#include<limits>
#include<cmath>
using namespace std;
int main(){
int *array,size,sum,x,mark_i,mark_j;
int ***S,**E;
double avg,min,tmp_avg,diff;
cin>>size;
array=new int[size];
sum=0;
for(int i=0;i<size;i++){
cin>>array[i];
sum+=array[i];
}
avg=sum*1.0/size;
S=new int**[sum+1];
E=new int*[sum+1];
for(int i=0;i<=sum;i++){
S[i]=new int*[size+1];
E[i]=new int[size+1];
for(int j=0;j<=size;j++){
S[i][j]=new int[size+1];
S[i][j][0]=0;
E[i][j]=-1;
if(j==0){
for(int k=0;k<=size;k++){
if(i==0) S[i][j][k]=1;
else S[i][j][k]=0;
}
}
}
}
S[0][0][0]=1;
for(int i=0;i<=sum;i++){
for(int j=1;j<=size;j++){
for(int k=1;k<=size;k++){
if((x=i-array[k-1])>=0){
if(S[x][j-1][k-1]){
if(E[i][j]==-1) E[i][j]=array[k-1];
}
S[i][j][k] = S[i][j][k-1] || S[x][j-1][k-1];
}
else S[i][j][k] = S[i][j][k-1];
}
}
}
min=numeric_limits<float>::max();
for(int i=0;i<=sum;i++){
for(int j=1;j<=size/2;j++){
if(S[i][j][size]){
tmp_avg=i*1.0/j;
if((diff=fabs(avg-tmp_avg)) < min){
min=diff;
mark_i=i;
mark_j=j;
}
}
}
}
cout<<"overall avg. = "<<avg<<"\nclosest avg. of partition = "<<mark_i*1.0/mark_j;
cout<<"\nthe elements in a partition are:\n";
while(mark_j!=0){
cout<<E[mark_i][mark_j]<<" ";
mark_i = mark_i - E[mark_i][mark_j--];
}
cout<<endl;
}
I have thought of one approach, it is simple so i am thinking i must be missing something. Please break it for me:
Steps:
1) Sort the array in non-decreasing order
2) Make 2 min-queue, let's name them q1,q2
3) 2 variable s1,s2 to keep running sum of both queue elements
4) start from the end of array assign the number to a queue which will make |s1-s2| minimum, in case of tie just insert element in q1
5) so in the end we will have elements in 2 queue such that the diff of sum is minimum ( as this is how we partitioned till now)
6) now shift |e1-e2|/2 elements from min-queue which has more elements to min-queue which has less element.
we are done-------------please comment on this as i am stuck to crash this approach for a test case.
This task could be solved using dynamic programming.
public final class ArrayPartitioner {
public ArrayPartitioner(int[] arr) {
if( arr == null ){
throw new IllegalArgumentException("array parameter is NULL");
}
this.arr = arr;
}
@SuppressWarnings("serial")
public List<List<Integer>> partition(){
final Queue<SubTask> workingQueue = new LinkedList<SubTask>();
workingQueue.add( SubTask.createAddedToFirst(arr[0]) );
workingQueue.add( SubTask.createAddedToSecond(arr[0]) );
final Queue<SubTask> oneStepQueue = new LinkedList<SubTask>();
for( int i = 1; i < arr.length; i++ ){
int value = arr[i];
while( ! workingQueue.isEmpty() ){
SubTask task = workingQueue.poll();
oneStepQueue.add( task.addToFirst(value) );
oneStepQueue.add( task.addToSecond(value) );
}
workingQueue.addAll( oneStepQueue );
oneStepQueue.clear();
}
for( final SubTask task : workingQueue ){
if( task.isAverageSame() ){
return new ArrayList<List<Integer>>(){{
add(task.first);
add(task.second);
}};
}
}
return createEmptyResult();
}
@SuppressWarnings("serial")
private List<List<Integer>> createEmptyResult(){
return new ArrayList<List<Integer>>(){{
add( new ArrayList<Integer>() );
add( new ArrayList<Integer>() );
}};
}
private final int[] arr;
//=== NESTED ====
static class SubTask {
SubTask(){
super();
}
SubTask(List<Integer> first, List<Integer> second, int firstSum,
int secondSum) {
this();
this.first = first;
this.second = second;
this.firstSum = firstSum;
this.secondSum = secondSum;
}
List<Integer> first = new ArrayList<Integer>();
List<Integer> second = new ArrayList<Integer>();
int firstSum = 0;
int secondSum = 0;
double getAverage(boolean useFirst){
if( useFirst ) {
if( first.size() == 0 ){
return 0;
}
return ((double)firstSum)/first.size();
}
if( second.size() == 0 ){
return 0;
}
return ((double)secondSum)/second.size();
}
boolean isAverageSame(){
return Double.compare( getAverage(true),
getAverage(false) ) == 0;
}
SubTask addToFirst(int value){
List<Integer> firstCopy = new ArrayList<Integer>( first );
firstCopy.add( value );
return new SubTask( firstCopy, new ArrayList<Integer>(second),
firstSum + value, secondSum );
}
SubTask addToSecond(int value){
List<Integer> secondCopy = new ArrayList<Integer>( second );
secondCopy.add( value );
return new SubTask( new ArrayList<Integer>(first),
secondCopy, firstSum, secondSum + value );
}
static SubTask createAddedToFirst(int value){
return new SubTask().addToFirst(value);
}
static SubTask createAddedToSecond(int value){
return new SubTask().addToSecond(value);
}
}
}
it is crazy question. if you learn it in the school, you know it, otherwise, it is very hard to solve it in the interview, it a DP question.
That dude with C# example is on steroids or something... he forgot to add CORBA IDL interface for it :-)
Code should be only 10-15 lines at most.
Here is a doc with simple example in C from Cornell univ: www . cs . cornell . edu / ~wdtseng / icpc / notes / dp3.pdf
bool T[10240];
bool partition( vector< int > C ) {
// compute the total sum
int n = C.size();
int N = 0;
for( int i = 0; i < n; i++ ) N += C[i];
// initialize the table
T[0] = true;
for( int i = 1; i <= N; i++ ) T[i] = false;
// process the numbers one by one
for( int i = 0; i < n; i++ )
for( int j = N C[i]; j >= 0; j )
if( T[j] ) T[j + C[i]] = true;
return T[N / 2];
we need to find partition with equal average not with equal sum...this question is different than balanced partition.....can someone formulate the recurssion for this question??
- sort the given array: O(nlogn)
- find the average of the 'sum': O(n)
- if 'sum' is even, then there may be a valid output exists. half the 'sum', say 'x': O(1)
- find a subset from teh array such that there exists a subset which adds to give 'x'. this is optimal coin change problem. if there is no subset exists, then there is no valid output exists: O(nx)
In the above scenario, We'll only get the cases when both the partitions will be having equal sum, but it'll leave the cases where sum isn't equal, only average is equal.
I think the above algorithm will work fine .. .
sort the array , find the average ... sum shud be even to get equal averages otherwise we will not have a solution ..
for example array = [ 1, 2 ]
u cant partition this into 2 parts such that both have same avg ..
As others pointed out that we've to find partition such that average of two partitions are same, it seems that we need to solve the problem of finding all possible sum for a subset of size k (for 1<=k<=n/2), where n is total # of items.
Let S1 = sum of partition 1
n1 = # of items in partition 1
S2 = sum of partition 2
n2 = # of items in partition 2
S = S1+S2
n = n1+n2
Since, S1/n1 = S2/n2, it implies S1/n1 = S2/n2 = (S1+S2)/(n1+n2) = S/n
Now, we populate a table of size (N/2) x N x S, where each entry T[k][i][j] is 1 if we there exists some subset of size k using the set of elements {1,2,...,i} that sum to j. During the computation of this table entries, if we ever found that j/k = S/n, there exists a way to make a partition with equal average. T[k][i][j] can be generated as a recurrence like :
An example to follow above idea:
TC would be O(N^2*S), space could be reduced to O(N*S). Is there any catch in this approach?
- buried.shopno May 02, 2011