Given an array of unsorted int

Amazon Interview Question for Analysts

0

of 0 votes

21
Answers
Given an array of unsorted integers , how to find duplicates in O(n)
- Gopi August 10, 2011 | Report Duplicate | Flag | PURGE
Amazon Analyst Algorithm

Email me when people comment.

An error occurred in subscribing you.

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 0 vote

Use hashes. Store frequency of each number in the hash. If it is 0 then no duplicates if >= 1 then duplicate exist.

- Anonymous August 10, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

but i am not privilized to use extra space. I have given same reply but he is not satisfied

- Anonymous August 10, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

is the data is bigger than size of the array?

- Anonymous August 10, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You would need more information to solve this in O(n). How big is the array ? What is the range of the numbers we are looking at ?

One of the derivative of this problem is that there is only duplicate which can be solved if the nos are unsorted but in sequence then you can sum the array where sum should be n(n+1)/2. The difference would be no. which is duplicate or zero otherwise.

- Neo August 10, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Use two bit arrays. If numbers are n bit then log(n) bits can cover this range and ceil(log(n) / 8) is the number of integers required to act as bit arrays. If number is not present in first bit array mark it. If number is present in another bit array then mark it in another bit array. Finally have another pass on the array and if it is absent in second array that means they are not repeated! Since bit arrays are significantly less than the range they cover it may not be considered extra memory.

- Ashish Kaila August 10, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Hi,
could you please explaoin this line " If numbers are n bit then log(n) bits can cover this range and ceil(log(n) / 8) is the number of integers required to act as bit arrays." ..

- Anonymous August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Well if you have n bits it can represent 2^n numbers. Hence if you know the range, the number of bits required to represent it is log2(n).

- Ashish Kaila August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

And so number of bytes required = bits / 8 = log(n) / 2. However since an int is 4 bytes long, you only need an integer array of size ceil(log(n) / 32).

- Ashish Kaila August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Hi Ashish Kaila,
Please correct me if I am wrong.
lets assume I have numbers like 1,2,3 and given one bit array..
so to represent 1 , we need to set 0th bit ... (001) , to set 2 , we need to set 1st bit (010) .. for 3 , we need to set 0th and 1st bits (0011) ..so now if we look at the bit array , we will see it as 0011... so we cannot make it out which numbers are set ...
so could you please explain how to use bit array in this case.

- gopi.komanduri August 11, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

My concept was to have two bit arrays:
Lets assume we have two bit arrays A for keeping track of which numbers we have seen and B to record duplicates:

Consider the input: 23453

n = 2     3       4       5       3 (repeat! set bit in B)
A = 1     110    1110   11110   11110
B = 0     000    0000   00000   00100

So scan B and you will find 3 repeated more than once!

- Ashish Kaila August 11, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Hi Ashish,
Thanks for the explanation.. but still that is not pretty much clear to me..
n = 2 3 4 5 3 (repeat! set bit in B)
A = 1 110 1110 11110 11110
B = 0 000 0000 00000 00100

as we have taken 2 bit arrays...
lets take one bit array to scan for first time.
bit myarray[10] // given 10 in random
Input : 23453
for first time , we get 2
so we set myarray[2] bit (assumed array starts with 0 .. and will set corresponding bit in array..
next we set 3 .. now our array contains .. 0011 ( first zero as number didnt start with zero .. next 0 as no 1 .. next 1 as we have 2 .. next 1 as we have 3)
next we set myarray[4] .. next 5 ...
now next 3.. as bit 3 is already set .. we are declaring as duplicate..
is this what you are trying to say?
if so ,
lets assume we have array like 142,100,98,1500,12,150,142,1500 ..
could you please explain how to apply the above concept and fetch the dup in o(n)

- gopi.komanduri August 11, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

1) You could use additional memory ( bit vector or hashtable ).
2) You could use sorting in O(N) - counting sort, for example.

- max August 10, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

for 1. Option , I have used additional memory and gave solution , then the interviewer , came back asking me to do in O(1) space complexity and O(n) time complexity.

2)... I think we can use counting sort if and only if we know the range and also requires to use additional memory. (please correct me if I am wrong )
But that is not acceptable as per the interviewer.

- Gopi August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

According to pp2 counting(disturbtion) sort requires additional memory and range of numbers should be small (we could determine the range in O(N) - just store minValue and maxValue).

- max August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

as to the O(1) space, if the array of integers has a range, e.g. from 1 to i_max, we can define a bitset with size of i-max. it is O(1) space.

- Anonymous August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

as to the O(1) space, if the array of integers has a range, e.g. from 1 to i_max, we can define a bitset with size of i-max. it is O(1) space.

- Anonymous August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

sorry if this question is dumb..
Bitset means , using one interger (of some data type for get considerable bits) .. as it will have 32 bits as using that as counter array .. similar to count sort..?
if so , then the array should have less than 32 elements .. please correct me if I am wrong.

- Anonymous August 10, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@Gopi: tell the interviewer that NO solution exist for general inputs (no range specified) that solves it in O(n) time and O(1) space.

If you could solve it, you could claim yourself as a great scientist (and crappy jobs @ amazon would be too trivial for you); 'coz O(nlogn) is lower bound for array distinctness problem (related to comparison based sorting).

- anon August 14, 2011 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Sort it first. It will take nlogn. after sorting scan the array to find if it has duplicates. It will take one pass. So O(n) for this.

Without sorting and without using external memory I dont think that there is any way to find the duplicates in O(n) time.

- Anonymous August 10, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

unless the elements are in range (0,n] . theres no way for o(n) time & o(1) space

- blueskin August 12, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If we use Binary Search Tree with the array, each time we hit the node, we can check if this is a duplicate, we can build the BST and mark the duplicates at the same time, it should be O(n) then.

CareerCup

Amazon Interview Question for Analysts

Books

Videos

Resume Review

Mock Interviews