Interview Question
Country: United States
First of all, the example output in the question is wrong. Here's the correct output
[4 6 8] [7 6 8] [9 4 8]
[3 4 8] [4 8] [9 7 8]
[3 7 8] [7 8] [3 9 6]
[9 6] [3 6] [3 9]
No of groups: 12
OK, let's analyze the problem. The silly part of the problem is that we have to find and report ALL
the groups. That yields to non-polynomial time complexity algorithm in worst case.
If we had to find any group with a given criteria or prove that no such group exists or solve optimization problem
(e.g. find the group of a maximum length divisible by X) then we could provide a nice polynomial (or as some people say pseudo-polynomial) time
algorithm based on dynamic programming with the complexity O(X*N).
For example, if we want to maximise the length of the group, we could use the following recursive relationship for DP:
dp[i][s] = MAX( dp[i-1][r], dp[i-1][(x+s-arr[i-1])%x] + 1)
where dp[i][s] holds the maximum length of subset formed from the first i elements of the array with a sum s modulo x.
Note, that in my solution all sums are computed using modulo X. so s is a remainder of a division of a sum to X.
dp[arr.size()][0] will contain the length of a largest subset with a sum divisible by x.
Ok, let's solve the original problem. We want to print all groups efficiently and try to keep running time (pseudo)polynomial
if there is a few groups to report. The algorithm below is a hybridization of a dynamic programming with a
search with pruning and backtracking. dp[i][s] relationship mentioned above is used to avoid recursing into the costly branches
which don't lead to printable results.
Space complexity O(X*N), time complexity O(X*(N+k)) where k is the # of groups which will be found and printed.
If no groups exist with the given criteria or O(k) == O(N) then algorithm terminates on O(X*N) time.
If O(k) = O(N^c) for some constant c the algorithm terminates in polynomial time O(X*N^c)
Code:
class FindGroupsWithSumDivisibleByX
{
int x;
const vector<int>& arr;
int count;
// dp[i][j] holds maximum length of a subsequence of a first i elements of x
// which sum to j modulo X
vector<vector<int>> dp;
public:
FindGroupsWithSumDivisibleByX(const vector<int>& arr, int x): arr(arr), x(x)
{
dp.resize(arr.size() + 1, vector<int>(x, INT_MIN));
}
void PrintAllGroups()
{
count = 0;
vector<int> output(arr.size());
Backtrack(output, 0, arr.size(), 0);
cout << "No of groups: " << count << endl;
}
int Backtrack(vector<int>& output, int pos, int i, int sumModuloX)
{
if (i == 0) {
PrintGroup(output, pos, sumModuloX);
return 0;
}
int& cache = dp[i][sumModuloX];
if (cache != INT_MIN) {
if (pos >= x || cache == 0 || (pos == 0 && cache == 1)) {
// print accumulated group (if matches the criteria)
PrintGroup(output, pos, sumModuloX);
// pruning the current recursive search branch
return cache;
}
}
output[pos] = arr[i - 1];
// case a: adding arr[i-1] to the group
int a = Backtrack(output, pos + 1, i - 1, (x + sumModuloX - (arr[i - 1] % x)) % x) + 1;
// case b: not adding arr[i-1] to the group
int b = Backtrack(output, pos, i - 1, sumModuloX);
cache = max(a, b);
return cache;
}
void PrintGroup(vector<int>& output, int pos, int sumModuloX)
{
if (sumModuloX == 0 && pos > 1 && pos <= x) {
for (int j = pos - 1; j >= 0; j--)
cout << output[j] << " ";
cout << endl;
count++;
}
}
};
Some notes:
1) negative numbers: the code supports negative numbers, however according to C++ standard the sign of the % result for negative param is
implementation defined.
As written above, the code will work on mainstream compilers but if you care about portability
you have to replace arr[i]%x to abs(arr[i%x])*sgn(arr[i]).
2) space complexity can be optimized further: technically the program uses only 2 bits of the values stored in dp array,
so the storage can be reduced. This can be even reduced to a single bit-matrix without changing the asymptotic complexity
but at the cost of larger constants
3) I also took an assumption that numbers in array are unique. If this assumption isn't correct we need to also take care to avoid reporting duplicated groups unless this is permitted
Assumptions:
- Each number in the array can only be used once
- X is <= length of the array
This can be solved with a costly backtracking algorithm.
- zortlord December 31, 2014