Amazon Interview Question
AnalystsYou would need more information to solve this in O(n). How big is the array ? What is the range of the numbers we are looking at ?
One of the derivative of this problem is that there is only duplicate which can be solved if the nos are unsorted but in sequence then you can sum the array where sum should be n(n+1)/2. The difference would be no. which is duplicate or zero otherwise.
Use two bit arrays. If numbers are n bit then log(n) bits can cover this range and ceil(log(n) / 8) is the number of integers required to act as bit arrays. If number is not present in first bit array mark it. If number is present in another bit array then mark it in another bit array. Finally have another pass on the array and if it is absent in second array that means they are not repeated! Since bit arrays are significantly less than the range they cover it may not be considered extra memory.
Hi,
could you please explaoin this line " If numbers are n bit then log(n) bits can cover this range and ceil(log(n) / 8) is the number of integers required to act as bit arrays." ..
Well if you have n bits it can represent 2^n numbers. Hence if you know the range, the number of bits required to represent it is log2(n).
And so number of bytes required = bits / 8 = log(n) / 2. However since an int is 4 bytes long, you only need an integer array of size ceil(log(n) / 32).
Hi Ashish Kaila,
Please correct me if I am wrong.
lets assume I have numbers like 1,2,3 and given one bit array..
so to represent 1 , we need to set 0th bit ... (001) , to set 2 , we need to set 1st bit (010) .. for 3 , we need to set 0th and 1st bits (0011) ..so now if we look at the bit array , we will see it as 0011... so we cannot make it out which numbers are set ...
so could you please explain how to use bit array in this case.
My concept was to have two bit arrays:
Lets assume we have two bit arrays A for keeping track of which numbers we have seen and B to record duplicates:
Consider the input: 23453
n = 2 3 4 5 3 (repeat! set bit in B)
A = 1 110 1110 11110 11110
B = 0 000 0000 00000 00100
So scan B and you will find 3 repeated more than once!
Hi Ashish,
Thanks for the explanation.. but still that is not pretty much clear to me..
n = 2 3 4 5 3 (repeat! set bit in B)
A = 1 110 1110 11110 11110
B = 0 000 0000 00000 00100
as we have taken 2 bit arrays...
lets take one bit array to scan for first time.
bit myarray[10] // given 10 in random
Input : 23453
for first time , we get 2
so we set myarray[2] bit (assumed array starts with 0 .. and will set corresponding bit in array..
next we set 3 .. now our array contains .. 0011 ( first zero as number didnt start with zero .. next 0 as no 1 .. next 1 as we have 2 .. next 1 as we have 3)
next we set myarray[4] .. next 5 ...
now next 3.. as bit 3 is already set .. we are declaring as duplicate..
is this what you are trying to say?
if so ,
lets assume we have array like 142,100,98,1500,12,150,142,1500 ..
could you please explain how to apply the above concept and fetch the dup in o(n)
1) You could use additional memory ( bit vector or hashtable ).
2) You could use sorting in O(N) - counting sort, for example.
for 1. Option , I have used additional memory and gave solution , then the interviewer , came back asking me to do in O(1) space complexity and O(n) time complexity.
2)... I think we can use counting sort if and only if we know the range and also requires to use additional memory. (please correct me if I am wrong )
But that is not acceptable as per the interviewer.
According to pp2 counting(disturbtion) sort requires additional memory and range of numbers should be small (we could determine the range in O(N) - just store minValue and maxValue).
as to the O(1) space, if the array of integers has a range, e.g. from 1 to i_max, we can define a bitset with size of i-max. it is O(1) space.
as to the O(1) space, if the array of integers has a range, e.g. from 1 to i_max, we can define a bitset with size of i-max. it is O(1) space.
sorry if this question is dumb..
Bitset means , using one interger (of some data type for get considerable bits) .. as it will have 32 bits as using that as counter array .. similar to count sort..?
if so , then the array should have less than 32 elements .. please correct me if I am wrong.
@Gopi: tell the interviewer that NO solution exist for general inputs (no range specified) that solves it in O(n) time and O(1) space.
If you could solve it, you could claim yourself as a great scientist (and crappy jobs @ amazon would be too trivial for you); 'coz O(nlogn) is lower bound for array distinctness problem (related to comparison based sorting).
Use hashes. Store frequency of each number in the hash. If it is 0 then no duplicates if >= 1 then duplicate exist.
- Anonymous August 10, 2011