Interview Question
Software Engineer / Developers
That would be my approach as well. We can't get a better time complexity, because there are n*m numbers in the matrix and we need to read each of them at least once.
The space complexity is also O(n*m), to store the matrix. I think we can't avoid it, because we need the whole matrix available to resolve collisions. The hash table itself needs an extra O(n) space, which is dominated by the O(n*m) term.
My approach is similar to the solutions mentioned above, but with a bit more on the "hash" value. Here goes:
For each row, read the integer in each column; call this number N and set the Nth bit in a bitset. Use this bitset as the hash key and store a count in a hashmap. Every other row that resembles this one gets the same hash key, so we retrieve its count, increment it, and store it back.
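A minimal Java sketch of this bitset keying (assuming nonnegative entries; note that rows which are permutations of each other, or which differ only in repeated values, produce the same key, so a key match still requires comparing the actual rows):

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class BitsetRowKeys {
    public static void main(String[] args) {
        int[][] matrix = {
            {0, 1, 2, 3, 4},
            {2, 3, 4, 2, 1},
            {0, 1, 2, 3, 4}
        };
        // Key each row by the set of values it contains; count occurrences of each key.
        Map<BitSet, Integer> counts = new HashMap<>();
        for (int[] row : matrix) {
            BitSet key = new BitSet();
            for (int value : row) {
                key.set(value); // set the value-th bit; requires value >= 0
            }
            counts.merge(key, 1, Integer::sum);
        }
        // {0,1,2,3,4} occurs twice; {2,3,4,2,1} reduces to the set {1,2,3,4}
        System.out.println(counts.size());
    }
}
```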
Consider each row as a string of m numbers, so you have n strings. Now sort them using mergesort: O(n log n) comparisons, where each comparison of two strings costs O(m).
So the overall complexity is O(n*m*log n).
Consider the first strings, sorted, as "12345" and "34567": now with string comparison we could fail to find the duplicates...
The best method is to build a tree from the rows, one node per column value (effectively a trie). As each row is traversed, extend the tree; if the entire row is already present, you can print the index of the matching row. Space complexity is O(n*m) in the worst case, when all the rows are different. Searching a single row takes O(m) time, so the whole matrix is processed in O(n*m).
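Reading the "tree" above as a prefix tree (trie) over the rows, the idea can be sketched like this (class and field names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class RowTrie {
    // A node maps the next column's value to a child; firstRow marks a completed row.
    static class Node {
        Map<Integer, Node> children = new HashMap<>();
        int firstRow = -1; // index of the first row ending here, or -1 if none
    }

    public static void main(String[] args) {
        int[][] matrix = {
            {0, 1, 2, 3},
            {0, 1, 9, 3},
            {0, 1, 2, 3}
        };
        Node root = new Node();
        for (int i = 0; i < matrix.length; i++) {
            Node cur = root;
            for (int value : matrix[i]) {
                cur = cur.children.computeIfAbsent(value, k -> new Node());
            }
            if (cur.firstRow >= 0) {
                System.out.println("row " + i + " duplicates row " + cur.firstRow);
            } else {
                cur.firstRow = i;
            }
        }
    }
}
```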
Here's a solution along Mintaka's lines. For each column, it groups the rows that share that column's value into a set, then makes a recursive call on each set for the next column. As it progresses through the columns, the rows are progressively filtered into distinct duplicate sets.
The runtime is O(n*m), since it looks at each value in the matrix exactly once.
package a1;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FindDuplicateRows {

    public static void main(String[] args) {
        int[][] matrix = {
            {0, 1, 2, 3, 4}, // 0
            {2, 3, 4, 2, 1}, // 1
            {0, 1, 2, 3, 5}, // 2
            {1, 0, 2, 3, 4}, // 3
            {2, 3, 4, 2, 1}, // 4
            {0, 1, 2, 3, 4}, // 5
            {0, 1, 2, 3, 4}  // 6
        };
        findDupes(matrix);
    }

    public static void findDupes(int[][] matrix) {
        Set<Integer> allRows = new HashSet<>();
        for (int i = 0; i < matrix.length; i++) {
            allRows.add(i);
        }
        findDupes(matrix, 0, allRows);
    }

    // Partitions the row indices in 'rows' by their value in column 'at',
    // then recurses on each partition for the next column.
    private static void findDupes(int[][] matrix, int at, Set<Integer> rows) {
        if (rows.size() < 2) {
            return; // a lone row cannot be a duplicate; also avoids printing singletons
        }
        if (at == matrix[0].length) {
            // all columns matched: every index in 'rows' is a duplicate of the others
            for (int j : rows) {
                System.out.print(j + ", ");
            }
            System.out.println();
            return;
        }
        Map<Integer, Set<Integer>> partitions = new HashMap<>();
        for (int j : rows) {
            partitions.computeIfAbsent(matrix[j][at], k -> new HashSet<>()).add(j);
        }
        for (Set<Integer> s : partitions.values()) {
            findDupes(matrix, at + 1, s);
        }
    }
}
Here is a simple solution for this:
private void printDuplicates(Integer[][] int2darray) {
    // A HashSet gives O(1) lookups; List.contains() would make this O(n*n*m).
    Set<String> seen = new HashSet<>();
    for (int i = 0; i < int2darray.length; i++) {
        StringBuilder sb = new StringBuilder();
        for (Integer value : int2darray[i]) {
            // the ':' delimiter keeps e.g. [1, 23] and [12, 3] from producing the same key
            sb.append(value).append(':');
        }
        if (!seen.add(sb.toString())) {
            System.out.println("Row " + i + " is a duplicate");
        }
    }
}
1. Sort the rows lexicographically: compare element by element, so [1, 10] is smaller than [2, 0]. Since it would be costly to swap whole rows, we can sort an array of pointers (indices) to the rows instead.
2. After sorting, any duplicate rows are placed next to each other, so a single linear scan over adjacent pairs finds them.
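A sketch of this pointer-sort idea in Java, sorting an index array instead of the rows themselves (Arrays.compare on int arrays needs Java 9+):

```java
import java.util.Arrays;

public class SortRowPointers {
    public static void main(String[] args) {
        int[][] matrix = {
            {2, 0},
            {1, 10},
            {2, 0}
        };
        // Sort row indices (the "pointers") by lexicographic row content: [1,10] < [2,0].
        Integer[] order = new Integer[matrix.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Arrays.compare(matrix[a], matrix[b]));
        // Duplicates are now adjacent: one linear scan finds them.
        for (int k = 1; k < order.length; k++) {
            if (Arrays.equals(matrix[order[k - 1]], matrix[order[k]])) {
                System.out.println("rows " + order[k - 1] + " and " + order[k] + " are equal");
            }
        }
    }
}
```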
The naive approach would just compare each row to all other rows. This requires O(n*n*m) time and O(1) space.
- ks4394 September 10, 2013

A better approach would be to use a hash table. For each row we compute its hash value and check whether it already exists in the table. This requires O(n*m) time and O(n) space.
This approach has to deal with collisions: if the hash value already exists in the table, we must check whether the rows are actually equal (or only their hash values collide), and this affects the time complexity.
Here is an implementation of this approach in C++: coliru.stacked-crooked.com/a/7a47232f32677be5
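For comparison, a Java sketch of the same hash-table approach, using Arrays.hashCode as the row hash and Arrays.equals to resolve collisions:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashRowDupes {
    public static void main(String[] args) {
        int[][] matrix = {
            {0, 1, 2, 3, 4},
            {2, 3, 4, 2, 1},
            {0, 1, 2, 3, 4}
        };
        // Bucket row indices by hash; rows in the same bucket are compared element-wise,
        // so a hash collision between unequal rows is never reported as a duplicate.
        Map<Integer, List<Integer>> buckets = new HashMap<>();
        for (int i = 0; i < matrix.length; i++) {
            List<Integer> bucket =
                buckets.computeIfAbsent(Arrays.hashCode(matrix[i]), k -> new ArrayList<>());
            for (int j : bucket) {
                if (Arrays.equals(matrix[j], matrix[i])) { // true duplicate, not a collision
                    System.out.println("row " + i + " duplicates row " + j);
                    break;
                }
            }
            bucket.add(i);
        }
    }
}
```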
Is there a better approach in terms of time or space complexity?