Palantir Technology Interview Question
Software Engineer / DevelopersTeam: Engineer
Country: United States
Interview Type: Phone Interview
public static boolean findDuplicateWithingN(int[] array, int n) {
HashSet<Integer> set = new HashSet<Integer>(n);
int count = 0;
for (int i = 0; i < array.length; i++) {
Integer current = new Integer(array[i]);
// Duplicate found
if (set.contains(current)) {
return true;
}
// Always keep set size as n
if (count < n) {
count++;
}
else {
set.remove(new Integer(array[i - n]));
}
set.add(current);
}
return false;
}
But what if the item you're removing occurs again some elements later? E.g if you had 4 4 2 3 4 with a distance of k = 2. When you remove the 4 first 4 after getting to 3 (since it's now more than 2 distant) you won't remember the second 4 you added to the set.
@Anon So what, thesecond 4 wold already have matches the first one. If youa re looking to find all possible matches than thetime a first match occurs you can add back another element.
@jbai,
The Count business in unnecessary
Replace the condition with
if (i < n) {
// do Nothing
}
else {
set.remove(new Integer(array[i - n]));
}
set.add(current);
}
Yes, it seems the count is not necessary. Thanks
public static boolean findDuplicateWithinN(int[] array, int n) {
HashSet<Integer> set = new HashSet<Integer>(n);
for (int i = 0; i < array.length; i++) {
Integer current = new Integer(array[i]);
// Duplicate found
if (set.contains(current)) {
return true;
}
// Always keep set size as n
if (i >= n) {
set.remove(new Integer(array[i - n]));
}
set.add(current);
}
return false;
}
You didn't answer the second part - if there are two values within +/-1 of each other within k distance of each other ? Using a HashSet won't work for this question, because the adjacent values might be shared with other values from the array (e.g. 7 is shared by 6 and 8).
The second part of the question is easily added to this solution:
set.add(current)
set.add(current + 1)
set.add(current - 1)
Then when you are removing, you do the same thing:
set.remove(new Integer(array[i - n]))
set.remove(new Integer(array[i - n] - 1))
set.remove(new Integer(array[i - n] + 1))
- Keep a HashMap of each element of the array and the index that they last appeared.
- Iterate through the array, and whenever we see the element is already there in the hashMap, get it's 'last found index', subtract the current index to get the difference (the add 1, as indexes are 0 based), and then compare with the given threshold number.
Here is the code:
public static boolean findDuplicateWithingN(int[] nums, int i) {
HashMap<Integer, Integer> lastFoundIndx = new HashMap<Integer, Integer>();
for(int k=0;k<nums.length;k++){
Integer lastIndx = lastFoundIndx.remove(nums[k]);
if(lastIndx != null){
if((k - lastIndx) + 1 <= i){
return true;
}
}
lastFoundIndx.put(nums[k], k);
}
return false;
}
public static boolean anyDuplicateWithinKIndices(int[] arr, int k) {
if (arr == null || arr.length < 2) {
return false;
}
Set<Integer> set = new HashSet<>();
for (int i = 0, span = 2 * k + 1, len = arr.length; i < len; ++i) {
if (i >= span) {
set.remove(arr[i - span]);
}
if (set.contains(arr[i])) {
System.out.println(i + " " + arr[i]);
return true;
} else {
set.add(arr[i]);
}
}
return false;
}
So my understanding of question is to find duplicate
1. within distance k
2. No number between thm should be more than duplicate+(-)value
First one is easy. I am thinking about how can we achieve second condition. Since we can use k extra space so that can we used to store min/max between ( index, index+k) . But how can we do this in orderof(n)
1) Keep first index counter at 0 element in an array.
2) Keep second index counter at (0+k) element in an array.
3) if(first_index_counter_element && second_index_counter_element == same) make note of both the indexes.
4) keep incrmenting both the first_index_counter and second_index_counter unill second_index_counter==SIZE_OF_ARRAY
Second part of the question is very unclear.If someone can help with an example it would be great.
My solution:
public boolean findDuplicateWithinK(int[] arr, int k){
return findLDistanceWithinK(arr, k, 0);
}
public boolean findLDifferenceWithinK(int[] arr, int k, int l){
HashSet<Integer> set = new HashSet<Integer>();
for (int i = 0; i < arr.length; i++){
if (set.contains(arr[i] + l)){
return true;
}
if (l != 0 && set.contains(arr[i] - l)){
return true;
}
if (i >= k){
HashSet.remove(arr[i - k]));
}
set.add(arr[i]);
}
return false;
}
This is My solution. I believe this code works for both cases. For first case is necessary to use l = 0.
bool duplicate(vector<int> v, int k, int l) {
for (int i = 0; i < v.size(); i++) {
int lower = v[i+k] - l;
int upper = v[i+k] + l;
if (v[i] >= lower && v[i] <= upper) {
return true;
}
}
return false;
}
function duplicateWithinK(_array, k, I) {
var i, length,
map = {
_range: {},
_data : {},
_index: 0,
_counter: [],
add: function(key, value) {
if (this._counter.length === k) {
delete this._range[
(this._counter[this._index] -
I) + ':' + (this._counter[
this._index] + I)];
delete this._data[this._counter[
this._index]];
this._counter[this._index] = key;
this._range[(key - I) +
':' + (key + I)] = key;
this._data[key] = value,
this._index += 1;
if (this._index ===
this._counter.length) {
this._index = 0;
}
} else {
this._range[(key - I) + ':' +(key + I)] =
key;
this._counter.push(key);
this._data[key] = value;
}
},
test: function(value) {
var key, k;
for (key in this._range) {
if (this._range.hasOwnProperty(key)){
k = key.split(':');
if (value >= k[0] && value <= k[1]) {
return true;
}
}
}
return false;
}
};
for (i = 0, length = _array.length; i < length; i += 1){
key = _array[i];
if (!map.test(key) && !map._data[key]) {
map.add(key, {});
} else {
return true;
}
}
return false;
}
duplicateWithinK([1,3,7,1,9], 2, 2);
So this is O(2K) space which reduces to O(k) and O(kn) time which likewise reduces to
O(n), I believe.
?? Testing n elements against k ranges is only O(kn) this reduces to O(n) since k is a constant that drops out.
Create two synced maps that maintain the invariant that they contain no more than the last k elements. One map is keyed to the value, the other to the range value - I : value + I. Once the maps contain k elements adding a k + 1 element to the map should remove/replace the keys for the next element now outside of the range k, before inserting element k + 1.
For each element compare it's value to the k ranges, if it does not fall within any of the ranges and it is not in the map add it to the map else return true,
else return false;
int find_rep_int(int a[] , int k, int N) {
using namespace __gnu_cxx;
int i=0;
hash_map <int,int> htable;
i=0;
while(i<N) {
cout<<"Test"<<endl;
htable[a[i]]+=1;
if(htable[a[i]] > 1)
return a[i];
if(i > k ) {
htable.erase(a[i-k]);
}
for(hash_map<int,int>::iterator ii = htable.begin() ; ii!=htable.end(); ++ii)
cout<<(*ii).first<<" : "<<(*ii).second<<endl;
i++;
}
return 0;
}
int find_rep_int(int a[] , int k, int N) {
using namespace __gnu_cxx;
int i=0;
hash_map <int,int> htable;
i=0;
while(i<N) {
cout<<"Test"<<endl;
htable[a[i]]+=1;
if(htable[a[i]] > 1)
return a[i];
if(i > k ) {
cout<<"erasing:"<<a[i-k]<<" ind:"<<i-k<<endl;
htable.erase(a[i-k]);
}
for(hash_map<int,int>::iterator ii = htable.begin() ; ii!=htable.end(); ++ii)
cout<<(*ii).first<<" : "<<(*ii).second<<endl;
i++;
}
return 0;
}
Here's a C++ implementation of the second problem that has complexity O(n) and space O(k). Original question implies a boolean return value, but it could easily be made into a count or return the index of the duplicate (although not the original)
bool findRepeatingWithinOne(int[] arr, int k) {
int N= sizeof(arr)/sizeof(int);
std::map<int,int> hash;
int counter = 0;
int startIdx = 0;
for (int i = 0; i < N; i++) {
if (hash[arr[i]] > 0)
return true;
hash[arr[i]] += 1;
hash[arr[i]+1] += 1;
hash[arr[i]-1] += 1;
counter++;
if (counter > k) {
hash[arr[startIdx]] -= 1;
hash[arr[startIdx]+1] -= 1;
hash[arr[startIdx]-1] -= 1;
startIdx++;
}
}
}
First part is fairly easy. For the second part, here's what you do.
Note that we can map each integer to a bucket of length L. So values from 0 to L are in bucket 0, L+1 to 2L+1 are in bucket 1, etc..
We note that if any two integers map to the same bucket, we have two values within L. So, no bucket will every have more than 1 entry (we will only maintain the last k indices). Also note that a value can only be within L of an item in the same bucket, previous bucket, or next bucket.
Therefore, the algorithm basically works like this:
for each item, figure out bucket/offset. If there is already an item in same bucket or an item in previous bucket with greater/equal offset or an item in next bucket with smaller/equal offset, return true.
Else, add to fixed queue and remove expired entry from bucket if necessary. Then add new entry to bucket.
Use a Hash_map of size k.
Now hash the first k values into this map, and anytime you try to insert a duplicate, return the answer (check if value already present, if not then insert; because most hash maps will not report duplicate insertions)
Once you have mapped this initial set of k values, iterate over the rest of the elements by:
1) remove the value from the map that was at the beginning of your k-window (you may keep a pointer/index to this location and increment the location in every iteration)
2) map the next value in the array (again, you can keep a pointer/index to the current 'end' of the k-window
3) like above, check if duplicate, else keep iterating
Space: O(k) {using hash map of size k)
Time: O(n) {iterating over all the elements of array only once! }
Edge cases: think of how you can take care of the end: your k window will start skrinking so you must be careful of keeping track of the 'end' index
Test cases: arrays of size: 0, 1, k-1, k, k+1, k*2; arrays with no dups, with dups at the edges of array; with continuous dups (2 4 4 4 4 4 3); array with only dups (4 4 4 4 4)
Try this, passed automated tests !
def duplicate(A, K, I):
buckets = {}
for i in xrange(len(A)):
if i > K:
del buckets[A[i - K - 1]/I]
if ((A[i]/I) in buckets) or \
((A[i]/I-1) in buckets and buckets[A[i]/I-1] - A[i] >= -I) or \
((A[i]/I+1) in buckets and buckets[A[i]/I+1] - A[i] <= I):
return True
buckets[A[i]/I] = A[i]
return False
The first part can be solved in space complexity O(1) and time complexity O(kn)
Instead of defining the hashset, use the array itself.
public static boolean containsDuplicate(int[] array, int k){
for (int i=0; i<k;i++){
for(int j=i+1; j<k;j++){
if(array[i]==array[j])
return true;
}
}
for (int i = k+1; i<array.length; i++){
for(int j=i-k;j<i-1;j++){
if(array[i-1]==array[j]){
return true;
}
}
}
return false;
}
bool dup(const vector<int>& a, int k)
{
if (k > a.size()) return true;
int n = a.size();
for (int i=0; i < k; i++) h[a[i]]++;
for (int i=0; i <= n-k ; i++)
{
if (h.count(a[i]) > 0) return false;
h[a[i]]--;
if (i+k < n) h[a[i+k]]++;
}
}
public boolean findDups(int[] n, int k, int tol) {
HashSet<Integer> values = new HashSet<Integer>(k*(2*tol + 1));
for (int i = 0; i < n.length; i++) {
int nVal = n[i];
for (int t = nVal - tol; t < nVal + tol; t++) {
if (values.contains(t)) {
return true;
} else {
values.add(t);
}
}
if (i >= k) {
nVal = n[i-k];
for (int t = nVal - tol; t < nVal + tol; t++) {
values.remove(t);
}
}
}
return false;
}
Complexity O((2*tol +1)n) time, O((2*tol+1)k) space. If we know that the tolerance is reasonably small, this approaches O(n)/O(k) as requested.
Update:
HashSet<Integer> values = new HashSet<Integer>(k*(tol + 1));
for (int i = 0; i < n.length; i++) {
int nVal = n[i];
for (int t = nVal - tol; t <= nVal; t++) {
if (values.contains(t)) {
return true;
} else {
values.add(t);
System.out.println("Added: " + t);
}
}
if (i >= k) {
nVal = n[i-k];
for (int t = nVal - tol; t <= nVal; t++) {
values.remove(t);
}
}
}
return false;
Now O((tol+1)n) time and O((tol +1)k) space. We only need to "expand" each number in one direction to find an overlap (the next number will cover the other direction). For example:
n = {1, 8, 3}, k = 2, tolerance = 2
first run -1, 0, 1 are added to the set
second run 6, 7, 8 are added to the set
third run 1, 2, 3 are added to the set and we have a match
Something like this should work -
Keep a synced linked list and hash table. On each insert add all values from x +/- L to the hashtable by incrementing how much they're there. (example: if list is 1,4,5... and L is 2, and k > 3 then table would contain at the end {-1:1, 0:1, 1:1, 2:2, 3:3, 4:2, 5:2, 6:2, 7:1})
Hash table has constant time lookups. For first part, just use L=0. Time is O(LN) (actually 2LN if its +/- L) and space is O(K). Here's some python code that does that (hasn't been thoroughly tested, but should work.
class LimitedLL(object):
def __init__(self, k, l):
self.head = None
self.tail = None
self.length = 0
self.k = k
self.l = l
self.table = {}
def add(self, x):
def insert(x):
if x in self.table:
self.table[x] += 1
else:
self.table[x] = 1
for i in xrange(1,self.l):
if x + i in self.table:
self.table[x+i] += 1
else:
self.table[x+i] = 1
if x - i in self.table:
self.table[x-i] += 1
else:
self.table[x-i] = 1
if self.head == None:
self.head = {'v': x, 'next': None}
self.tail = self.head
self.length = 1
insert(x)
return
self.tail['next'] = {'v': x, 'next': None}
insert(x)
self.length += 1
if self.length > self.k:
temp_x = self.head['v']
self.head = self.head['next']
for i in xrange(-self.l, self.l):
self.table[temp_x + i] -= 1
if self.table[temp_x + i] == 0:
del self.table[temp_x + i]
def contains(self, x):
return x in self.table
def find_dups(arr, k, l):
ll = LimitedLL(k,l)
for el in arr:
if ll.contains(el):
return True
else:
ll.add(el)
return False
static void Find_Duplicates_In_Array(int a[],int k){
int count=0;
Set<Integer> s=new HashSet<Integer>();
for(int i=0;i<k;i++){
boolean r=s.add(a[i]);
if(r==false){
System.out.println("Duplicate element"+a[i]+"at position:"+(i+1) );
count++;
}
}
System.out.println("Found "+count+"number of duplicates");
}
Can you please give an example
- DashDash May 19, 2013