Write a program to remove dupl

Amazon Interview Question for Software Engineer / Developers

0

of 0 votes

42
Answers
Write a program to remove duplicate elements from array.(Array contains elements in range 1...N). Algorithm must be O(N) in time and O(1) in space. Come up with as many test cases as you can.
- omkar November 02, 2010 | Report Duplicate | Flag | PURGE
Amazon Software Engineer / Developer Arrays

Email me when people comment.

An error occurred in subscribing you.

More Questions from This Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 4 vote

We will traverse array and will make a[abs(a[i])] negative if its +ve.. if a[a[i]] is already negative .. means no is repeated we will replace it with zero and so on .. o(n) ...

in next traversala we will remove all the zeros from the array and will make all -ve values +ve .. o(n)

o(n)+o(n) =o(n)

- Ashish December 03, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Works only if the max value is less than or equal to the size of the array. Else it refers to invalid space!

- Avenger April 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

you cannot do it in O(1) space with O(N) time if there are more than one duplicated elements in the array. If there was one, i would just sum it up and get the difference from arithmetic series of 1..N

- Anonymous November 02, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Yes you can! LOL
-Obama

- chennavarri November 03, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

/* The question is find and delete duplicates in an array of size N with a max value N in time complexity O(N), and space O(1)
The time complexity of this program is O(2n + n +n ) = O(4n) i.e. O(n). Space complexity = O(1).
Assuming the input array can store an element of value '2n'.
From the problem we know that the elements range 1 to n so we place elements in their corresponding positions.

Pseudo code:
for each element a[i]
   if(a[i]==i+1 || a[i]>n)
      continue //i.e. either inplace or duplicate
   else
      while(true)
	   if(a[i]>n) //i.e. a duplicate that is already detected
		break
           if(a[a[i]-1]==a[i])
		a[i]+=n	//i.e. a duplicate is detected
		break
           swap a[i] and a[a[i]-1]
           if(a[a[i]-1]==a[i]) //i.e. swapped inplace
		break

Complexity analysis:
Consider for an element a[i], k elements are swapped. i.e. k elements are inplace.
which means each of these k elements require atmost 1 comparison i.e. O(k) and the max value k
can take is n-1 => O(n-1 + n) == O(2n) ==> O(n)
For example, element a[0], the worst case would be, doing a maximum swaps of (n-1) which
means that n-1 elements are in place. so for each of these n-1 elements we only do one comparison i.e. a[i]==i+1
Therefore, atmost 2n comparisons are made
Once we place all elements in place, all elements greater than n are duplicates,
move duplicate elements to the end, so parse once again and swap duplicates to the end.
=> Complexity = O(2n + n + n) = O(4n) ==> O(n)
Test case:

1 5 3 8 4 3 8 5 2 // i=0,
1 4 3 8 5 3 8 5 2 // i=1 swap a[1] and a[a[1]-1] i.e. swap '5' and '4'
1 8 3 4 5 3 8 5 2 // i=1 swap a[1] and a[3] i.e. swap '4' and '8',
1 5 3 4 5 3 8 8 2 // i=1 swap a[1] and a[6] i.e. swap '8' and '5',
1 14 3 4 5 3 8 8 2 // i=1 Since, a[5-1]==5 && a[1]==5, a[i]+=n => 14
1 14 3 4 5 3 8 8 2 // i=2, since a[2] = 2+1 do nothing
1 14 3 4 5 3 8 8 2 // i=3, since a[3] = 3+1 do nothing
1 14 3 4 5 3 8 8 2 // i=4, since a[4] = 4+1 do nothing
1 14 3 4 5 12 8 8 2 // i=5 Since, a[3-1]==a[i], a[i]=n+a[i] => 12
1 14 3 4 5 12 17 8 2 // i=6 Since, a[8-1]==a[i], a[i]=n+a[i] => 17
1 14 3 4 5 12 17 8 2 // i=7 Since, a[7]==8; //inplace so continue
1 2 3 4 5 12 17 8 14 // i=8 swap a[1] and a[8] i.e. swap '14' and '2'

//All duplicates are greater than n with a value a[i]-n i.e. 12,17,14 i.e. 3,8,5
*/

#include <iostream>
using namespace std;
//!Function to swap values, pass by reference
void swap(int *x,int *y){
	*x = *y + *x;
	*y = *x - *y;
	*x = *x - *y;
	cout<<"swapping '"<<*x<<"' and '"<<*y;
}

int main(){
	int a[] = {3,3,3,3,3,2,2,2,2,2,2,3,3,3,3,3,2,2,2,2,2,2,4};
	int n = sizeof(a)/sizeof(int);
	int numComparisons = 0;
	for (int i=0;i<n;++i){
		++numComparisons;
		if(a[i]==i+1 || a[i]>n)
			continue; 			//i.e. either inplace or duplicate
		else{
			while(i<n){
							++numComparisons;
				if(a[i]>n) 		//i.e. a duplicate that is already detected
					break;
				int u=a[a[i]-1];
				if (u==a[i])
				{ 
					a[i]+=n;	//i.e. a new duplicate is detected
					break;
				}
				cout<<endl<<"swapping a["<<i<<"] and a["<<(a[i]-1)<<"] i.e. ";swap(&a[i],&a[a[i]-1]);
				if(a[a[i]-1]==a[i])
					break;
			}
		}
	}

	//Move duplicates to the end, though there are two while loops, 
	//they break on a same condition i.e. when both pivots meet and 
	//pivots are ALWAYS MOVING INWARDS so the maximum number of iterations for these while loops together is N
	int endPivot = sizeof(a)/sizeof(int) -1;
	int beginPivot = 0;
	while(beginPivot<=endPivot){
		++numComparisons;
		if(a[beginPivot]>n){
			while(beginPivot<=endPivot){
				++numComparisons;
				if(a[endPivot]<n){
					swap(&a[beginPivot],&a[endPivot]);
					break;
				}
				--endPivot;
			}
		}
		++beginPivot;
	}
	//Final array
	cout<<endl<<"Final Array : ";
	for (int k=0;k<n;++k){
		cout<<a[k]<<",";
	}
	//Print duplicates
	cout<<endl<<"Duplicates are :: ";
	for (int i=beginPivot;i<n;++i){
		++numComparisons;
		if(a[i]>n)
			cout<<a[i]-n<<",";
	}
	cout<<endl<<"Number of comparisons"<<numComparisons<<endl;
}

- chennavarri November 03, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

can u plzZ explain the logic of ur algo.. How does it lead to the correct solution..??

- Anonymous November 03, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

The pseudocode and example explains it.
The logic is:
Since we know there are elements from 1 to N, if no duplicates then it should be exactly 1 to N. So if we place every element as we read into its correct place i.e. a[0]=1,a[1]=2,a[2]=3,a[3]=4,a[4]=5......etc then we should have 1 to N.
And consider, if we found an element that is already in place then we ignore it. And if we found an element that we want to move to its correct position but it already exists in the correct position then it's a duplicate. And we need something to mark this as duplicate so I increase the value by N so any value x greater than N are duplicates with value x-N.

So in the example we have 9 elements => n=9 so we have values from 1 to 9 from the problem statement.
first we start at a[0] and we find that a[0]==1 i.e. 1 is in its place.
next element a[1] which is 5, this has to be at position 4, check a[4]==5 or not, its not. so we swap a[1] with a[4].
In while loop, we check if the swapped element is in place i.e. a[1]==2? its not, so we should move a[1] i.e. '4' to its correct position i.e. a[3]. is a[3]==4?? its not, so swap a[1] and a[3] i.e. swap '4' and '8'.
again in while loop, we check if a[1]==2? its not, so we swap '8' and '5' the same way as above.
again in while loop,check if a[1]==2? its not, so we should move a[1] to the correct position i.e. a[a[1]-1] => a[5-1] => a[4], BUT a[4] is already == 5 so this is a duplicate, we mark it by increasing the value by n i.e. 9+5 =14. break out of while loop
next element a[2], a[2]==3 , so it is in place, do nothing.
next element a[3], a[3]==4 , so it is in place, do nothing....so on

- chenna November 03, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

first of all you are just detecting duplicates, the question says remove duplicates. You should be reading the question twice before posting intelligent answers

- extinct November 04, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

the problem says numbers are in range of 1...N, but not necessary contain all number from 1 to N. Without this strong assumption, your method will be incorrect.

- papaya November 05, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

The algorithm works to find duplicates between 1..N, the algorithm is built for the problem.
A solution is written for a problem, if a problem says "build an algorithm", then there is no problem statement at all. I hope people who make comments think once before they talk.

- chenna November 05, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

You replied with some non-sense, but without answering papaya's question. You made wrong assumption, pointed out by others, but you didn't try to fix or explain, rather, you just insist you were correct. How could such a thing happen on the earth??

- nano9527 November 07, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

papaya says, "the problem says numbers are in range of 1...N, but not necessary contain all number from 1 to N."
Yes, I know that. I am not assuming that.
what am saying is if there are numbers between 1 to N in an array and complexity is O(N) that means the size of array cannot be greater than N and no element is greater than N, right?
Now, if no duplicates exist then the array has to be of size N right?
Now if duplicates exist then after moving the elements to their correct positions you should be left with duplicates. If you try the following array
{3,3,3,3,3,2,2,2,2,2,2,4};
The algorithm will give,
Final Array :15,2,3,4,15,15,14,14,14,14,14,15,
Duplicates are :: 3,3,3,2,2,2,2,2,3,
Correct me if am wrong.

- chennavarri November 07, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@extinct: I was just lazy to write the deletion code, but anyways, added the code for moving duplicates to the end of the array, now, you can delete or truncate or .... anything.
again the complexity remains the same O(n) or O(4n), space complexity O(1).

- chennavarri November 07, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@chenna
it's not O(1) in space as you assume the array can be indexed upto N. Extra space you need N - n(number of elements).
space complexity O(N).

- MN November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@MN space complexity is the additional memory required. As chennas algo uses the input array and not an additional array i.e the algo is inplace. You should think before you post.

- Anonymous November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

that's what I wrote above... if your array has only n (<N) elements ...why would it have more space allocated? you should perhaps read my comment carefully

- MN November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

buddy..you are not getting the point. the point is not about n or N. The point is whatever may be the size of array, the algorithm doesn't use any extra space.
For example: consider n = n1, if the program takes n1+k bytes of memory. then for n = n2, the program takes n2+k bytes of memory, the algorithm takes k bytes of memory irrespective of value of n. so the space complexity = O(k) ==> O(1). I hope this helps

- chennavarri November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

ok... let's work through one example here.
suppose R = {1 to 10^6} and Arr = [1,2,10,100,10^4, 10^6,10^6,10^4}.
you get n=8 and N = 10^6

now because of the fact you use "if(a[a[i]-1]==a[i])" you need Arr to be accessible upto N.
you are right you always need constant space but that could be at most N. I don't think this would qualify for a O(1) solution.

- MN November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Edit : yeah, I guess I realize the merit in your argument now. It's O(1) as the space required is constant N. It does uses extra memory, so it is still not inplace but that's not what we are looking for.
Correct me if I am wrong?

- MN November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I guess I was working under the assumption that N<=n because the problem doesn't make sense if N>n consider the following
N=5 and array = 1 1 1 2 1 4 3 1...... 10000 elements
Now as per the question find an algorithm O(5) for an array O(10000)
So I deduced that N<=n. Thanks

- Anonymous November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

so The question is find and delete duplicates in an array of size N with a max value N in time complexity O(N), and space O(1).
I wrote the right solution, there's no 'n'. There's only N which is the size of the array and max of the array.
damn it.. someone got me confused somewhere

- chennavarri November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

you could probably just use -1 to mark duplicate elements.

- sharma November 14, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Bloody hell. I'm not cut out for it.

- Astrologer November 03, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I think the problem stmt intends to convey that the solution should have linear time complexity. So if the length of the array is m ,m could be >> N (a relatively small number)

Let's define this frequency array F[1...N] that would hold the frequencies of the element k in the original array A. So e.g. if N = 5, and the original array is
[2,1,3,1,1,1,3,5,2,4], the frequency array F would be [4,2,2,1,1]. Observe that regardless of the size of the original array, the size of F would always be the same. So F would actually occupy constant space for any value of m.

The algo now gets very simple. A single pass over the original array A

initialize the elements of F to zero
for (i=1...N){
if( F[a[i]] == 0){
F[a[i]]++;
}else{ // if we have already encountered the *value* a[i] in an earlier iteration
a[i]=NULL;
}
}

Run the above algo on [2,1,3,1,1,1,3,5,2,4]. At the end of the pass,the array become
[2,1,3,null,null,null,null,5,null,4], which is what's the goal ..enjoy !!

- random November 04, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Dude...the question says space complexity of O(1)

- AV November 05, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Dude, space complexity would indeed be O(1). Even if the size of F is N, it's constant (because N is a finite number)and is independent of the size (m) of the original array.

- random November 06, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

In fact, I already explained the constancy of the additional space used in my original post.Put some thought and read through the post before shooting a comment

- random November 06, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

In fact, I already explained the constancy of the additional space used in my original post.Put some thought and read through the post before shooting a comment

- random November 06, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

What I actually meant was n==N. Or else the question doesn't make sense

- Anonymous November 11, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

@above
wat absurd..!!

- Anonymous November 04, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I don't know what is the problem if the array size is >N.
If size of array is m
for(i =1 to m){
a[a[i]]=a[i];
}
when size of array is smaller than N than
for(i=1 to m){
a[a[i]%m]=a[i];
}

- nitin November 09, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

getting the numbers position using mod is tricky. say the size of array is 5, and range is 1 to 1000. position of 7 is 7%5=2. but so is position of 12. so you might overwrite some elements

- xenon November 23, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

If you need to preserve the order of numbers than we need to think more.

- nitin November 09, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Visit coders-stop.blogspot.com/

- Anonynomus November 18, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@chenna
laundiya hoshihaar hai .....
machak !!

- asja November 25, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

<pre lang="java" line="1" title="CodeMonkey50426" class="run-this"> int countVector = 0;
int length = 0;
for(int i = 0 ; i< 10 ; i++)
{
if((countVector & (1<<a[i])) != 0)
continue;
else
{
a[length++] = a[i];
countVector = countVector | 1<<a[i];
}
}

for(int i = 0 ; i<length ; i++)
cout<<a[i];
cout<<endl;</pre><pre title="CodeMonkey50426" input="yes">
</pre>

- Anonymous December 08, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

it'll work only for non-negative integer array.

- Paras December 08, 2010 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

#include <stdio.h>

int main()
{
int a[]={1,2,2,3,3,3,4,5,6,6,9};

int counter, lastIndex;
int length = sizeof(a)/sizeof(int);

// Set 0 for duplicates
for(counter=1,lastIndex=0;counter < length;counter++)
{
if(a[counter] == a[lastIndex])
{
a[counter] = 0;
continue;
}
lastIndex=counter;
}

// Print the whole array
for(counter=0,lastIndex=0;counter<length;counter++)
printf("%d ", a[counter]);
printf("\n");

// Move the 0(s) to end of the list.
for(counter=1,lastIndex=0;counter<length;counter++)
{
printf("lastIndex=%d counter=%d\n",lastIndex,counter);
if(a[counter] == 0)
{
//Set the lastIndex for the firsttime.
if(lastIndex == 0) lastIndex = counter;
continue;
}
if(lastIndex == 0) continue;

a[lastIndex]=a[counter];
a[counter]=0;
//For the first time lastIndex will be set above..
lastIndex = lastIndex+1;
}

for(counter=0,lastIndex=0;counter<length;counter++)
printf("%d ", a[counter]);

getchar();
return 0;
}

- chukka.swathi December 20, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

void removeDups (int a[], int n)
{
    for (int i = 0; i < n; i++)
    {
        if (a[i] != i)
        {
            if (a[a[i]] != a[i])
                swap (a[i], a[a[i]]);
        }
    }

    for (int i = 0, j = 0; j < n; i++, j++)
    {
        while (a[j] != j)
            j++;
        a[i] = a[j];
    }
    a[i] = '\0';
}

- Siraj December 29, 2010 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

that is perfect.

- yuweihui2 January 31, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

is dat O(n) ?

- RaZ March 05, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

for test case:
5 3 1 2 4
it will give wrong answer...........
it will iterate as:-
1) 4 3 1 2 5
2) 4 1 3 2 5
3) 4 2 3 1 5
4) 4 2 3 1 5

and in 2nd loop
2 3 5....
so wrong chutiye.....

- tera baapp August 21, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

package org.com.array;
// Removing Duplicate item from the array
public class P3
{
 static void duplicate(int a[])
 {
	 int t= a.length;
	 int index=0;
	 for(int i=0;i<t;i++)
	 {
		 for(int j=i+1;j<t;j++)
		 if(a[i]==a[j])
			 index=i;
		    }
	 for(int i=index;i<t-1;i++)
	 {
		 a[i]=a[i+1];
		     
	 }
	t--;
	 for(int i=0;i<t;i++)
		 System.out.print(a[i] + " ");
 }
	
	public static void main(String[] args)
	{
		int a [] ={1,2,3,4,2,6,7,3,9,10,10};
		duplicate(a);
	
		

	}

}

- Anonymous March 08, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

package org.com.array;
// Removing Duplicate item from the array
public class P3
{
static void duplicate(int a[])
{int t= a.length;
int index=0;
for(int i=0;i<t;i++)
{ for(int j=i+1;j<t;j++)
if(a[i]==a[j])
index=i;}
for(int i=index;i<t-1;i++)
{a[i]=a[i+1];}
t--;
for(int i=0;i<t;i++)
System.out.print(a[i] + " ");}
public static void main(String[] args)
{int a [] ={1,2,3,4,2,6,7,3,9,10,10};
duplicate(a);}}

- Anonymous March 08, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

package org.com.array;
// Removing Duplicate item from the array
public class P3
{
 static void duplicate(int a[])
 {int t= a.length;
	 int index=0;
	 for(int i=0;i<t;i++)
	 { for(int j=i+1;j<t;j++)
		 if(a[i]==a[j])
			 index=i;}
	 for(int i=index;i<t-1;i++)
	 {a[i]=a[i+1];}
	t--;
	 for(int i=0;i<t;i++)
		 System.out.print(a[i] + " ");}
	public static void main(String[] args)
	{int a [] ={1,2,3,4,2,6,7,3,9,10,10};
		duplicate(a);}}

- Anonymous March 08, 2015 | Flag Reply

CareerCup

Amazon Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews