Google Interview Question
Software Engineer in TestsCountry: United States
Interview Type: In-Person
This I like better than the other upvoted solution.
This seems to have locality of reference...
I agree with the point raised by anonymous. I have written a program to verify this with array size 10^7 and there is clearly performance difference between the two cases. Here is the code:
#include<stdio.h>
#include<stdlib.h>
#define SIZE 10000000
void remove_duplicates(int *arr);
void remove_duplicates_cached(int *arr);
int main(int argc, char **argv)
{
int *arr =(int *) malloc(SIZE*sizeof(int));
int i;
for(i=0;i<SIZE;i++)
{
arr[i]= rand()%1000;
}
if(strcmp("cached",argv[1]))
{
remove_duplicates(arr);
printf("\n Without cache running time");
}
else
{
remove_duplicates_cached(arr);
printf("\n With cache running time");
}
return 0;
}
void remove_duplicates(int *arr)
{
int itr=0;
int prev_val=arr[itr++];
int index=1;
for(;itr<SIZE;itr++)
{
if(arr[itr] != prev_val)
{
arr[index++] = arr[itr];
prev_val=arr[itr];
}
}
printf("\n Index is %d",index);
}
void remove_duplicates_cached(int *arr)
{
int i,index=1;
for(i=1;i<SIZE;i++)
{
if(arr[i]!=arr[i-1])
arr[index++] = arr[i];
}
printf("\n Index in cached is %d",index);
}
and output of time command is:
$ time a.exe uncached
Index is 9990184
Without cache running time
real 0m0.699s
user 0m0.016s
sys 0m0.031s
$ time a.exe cached
Index in cached is 9990184
With cache running time
real 0m0.659s
user 0m0.015s
sys 0m0.000s
public void removeDuplicate(int[] array) {
int unique = 0;
int dup = 1;
for (; dup < array.length; dup++) {
if (array[unique] != array[dup]) {
// swap two elements
unique++;
int tmp = array[unique];
array[unique] = array[dup];
array[dup] = tmp;
}
}
// delete the duplicated numbers
for (; unique < array.length - 1; unique++) {
array[unique + 1] = null;
}
}
May be we can just skip the duplicates and copy them back....Not so cost effective but does inplace sort.
/*Remove Duplicates in a already sorted array*/
#include<stdio.h>
main()
{
int i,arr[10],j,n,p;
printf("Enter the number of elements:\n");
scanf("%d",&n);
for(i=0;i<n;i++)
scanf("%d",&arr[i]);
for(i=0;i<n;i++)
{
for(j=i;arr[i]==arr[j] && j < n;j++);
for(p=i+1;j<n;p++,j++)
arr[p]=arr[j];
n=p;
}
for(i=0;i<n;i++)
printf("%d\n",arr[i]);
}
Assuming you have a variable representing the size of the array, do the following:
run through the array, for each element, if it has the same value as the previous element, swap/overwrite it with the last element, decrease the array size variable by one, and try that same element again, so something like
for(i = 0; i < size; i++){
if(i != 0){
if (a[i] == a[-1]){
a[i] = a[size-1]; // replace with last element
size--; //decrease size by 1
i--; //so not to skip newly added in case
}
}
}
This will be of O(n);
Then you just sort everything that's covered by the size variable, with whatever O(nlogn) algorithm you want. STL sort will do the trick
total O = O(n+nlogn) = O(nlogn);
This does not use any additional data types, and does in-place sort
just use two pointers, old,new
old=A;
while(*old!=*(old+1))
old++;
new++=++old;
for(int i=0;i<size-(new-A);i++){
if(*old!=*new){
*old=*new;
old++;
}
new++;
}
while takes old to original of the1st duplicate and then new++=++old make old point to 1st duplicate and new to next element.
now we copy from new to old if new is not a duplicate.
What if we modify insertion sort slightly to accommodate duplicates as below
int[] InsertionSortWithoutDuplicates(int[] arr)
{
int len = arr.length;
for (int i = 2; i < len; i++)
{
int save = arr[i];
// find the position of the cur i in the sorted array
for (j = i - 1; (j >= 0) && (arr[j] > arr[i]); j--);
if (arr[j] == arr[i]) // duplicate
{
// shift the duplicate to the end and reduce the array len
int temp = arr[len - 1];
arr[len - 1] = arr[i];
arr[i] = temp;
len--;
i--; // we have to consider current i again
}
else if (j + 1 != i)
{
// shift the array to accomodate the new element in sorted pos
// like regular insertion sort
for (int k = i - 1; k >= j; k--)
arr[k + 1] = arr[k];
arr[j] = save;
}
}
int[] sortedArr = new int[len];
System.arraycopy(arr, 0, sortedArr, 0, len);
return sortedArr;
}
package google;
import java.util.Arrays;
public class InlineRemovalOfRepeatedValuesInSortedArray {
/**
* Question :- given an array of sorted integers with duplicate values , sort the array so that there are only unique values
* left in sorted order ( do not use any additional data type , do inplace sort )
*
* No Need of pseudo code
*/
public static void main(String[] args) {
int[] input = {1, 2, 3, 3,3, 4, 4,4};
Arrays.sort(input);
int i = 0, j = i+1;
for(; j < input.length; ){
if(input[i] == input[j])
break;
else{
i++; j++;
}
}
for(; j < input.length;){
if(input[i] !=input[j]){
input[++i] = input[j++];
}else{
j++;
}
}
int k = 0;
while(k <= i && k < input.length){
System.out.println(input[k]);
k++;
}
}
}
Here we have taken a Input array and sorted it(Just to make sure that we start with a sorted array as mentioned in the problem). The last for loop is just to printout the end output, though we can optimize it by printing it while copying a value is previous for loop. The algo is in place and a O(n) time complex
#include<stdio.h>
int main(){
int a[] = {1,1,1,2,2,2,2,2,2,3,3,3};
int i=0,j=0,n,k;
int len = sizeof(a)/sizeof(int);
while(1){
n = a[i];
while(a[j]==n && j<len){
j++;
}
if(j>=len){
break;
}
a[++i] = a[j];
}
for(k=0;k<=i;k++){
printf("%d ",a[k]);
}
return 0;
}
awesomecoder.blogspot.com/2012/08/remove-duplicates-from-sorted-array-of.html
A C++ version of the algorithm. This will work even if the iterator is neither random access nor bidirectional, a little more generic than what was asked, but works and does a little more :) For the example I use the new C++11 forward_list to demonstate this point. The return value is one past the last value of the sorted list with the duplicates removed, as is the C++ style for iterators and algorithms. Let me know what you think :)
#include <iostream>
#include <forward_list>
template <typename Iter>
Iter remove_duplicates(Iter first, Iter last)
{
if(first == last)
return last;
Iter prev;
Iter new_last;
prev = first;
std::advance(first,1);
new_last = first;
for(; first != last; ++first, ++prev)
if(*first != *prev)
*new_last++ = *first;
return new_last;
}
int main()
{
std::forward_list<int> numbers = {1,1,2,2,3,3,4,4,5,6,6,7,7};
auto e = remove_duplicates(numbers.begin(), numbers.end());
for (auto i = numbers.begin(); i != e; ++i)
std::cout << *i << ((std::next(i) != e) ? ", " : "\n");
return 0;
}
O(n^2) solution but is a stable Sort
public class UniqueSortedIntegerLeft {
public static Integer[] sort(final Integer[] input) {
int duplicateIndex = -1;
for (int i = 1; i < input.length; ){
if (input[i-1] == input[i]) {
if (duplicateIndex == -1) {
duplicateIndex = i;
}
i++;
}
else {
if (duplicateIndex != -1) {
shiftArray(input, duplicateIndex, i);
duplicateIndex = -1;
} else {
i++;
}
}
}
return input;
}
private static void shiftArray(final Integer[] array, int startIndex, int copyFromIndex) {
int copyValue = array[startIndex];
while (copyFromIndex < array.length) {
array[startIndex++] = array[copyFromIndex++];
}
while (startIndex < array.length) {
array[startIndex++] = copyValue;
}
}
private static void printArray(final Integer[] array) {
for (Integer value: array) {
System.out.print(value + " ");
}
System.out.println();
}
private static void swap(final Integer[] array, int from, int to) {
int tmp = array[from];
array[from] = array[to];
array[to] = tmp;
}
public static void main(String args[]) {
printArray(sort(new Integer[]{1,1,1,2,3}));
printArray(sort(new Integer[]{1,1,2,3,4,4,5,6}));
printArray(sort(new Integer[]{1,1,1,3,4}));
}
}
Output:
1 2 3 1 1
1 2 3 4 5 6 1 4
1 3 4 1 1
1 #include <iostream>
2
3 using namespace std;
4
5 int main() {
6 int arr[] = {1,2,2,2,3,3,3,4,4,5,5,6};
7 int len = sizeof(arr)/sizeof(int);
8
9 int i, j = 0;
10 for(i = 1; i < len; i++) {
11 if(arr[i]==arr[j]) {
12 } else {
13 arr[++j] = arr[i];
14 }
15 }
16 for(int i = j+1; i < len; i++)
17 arr[i] = 0;
18
19 for(int i = 0; i < len; i++)
20 cout << arr[i] << " ";
21 cout << endl;
22
23 return 0;
24 }
package com.snips.test;
public class ArrayCrop {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
int a[]={1,1,1,1,1,1,2,2,2,2,3,4,4,4,6,6,8,9,9,9,12,12,23,23,65,65,65,65};
int sortedIndex=0;
int marker = a[0];
for (int i=1;i<a.length;i++){
if(a[i] != marker){
marker = a[i];
a[++sortedIndex] = marker;
}
}
System.out.println("SortedIndex = "+sortedIndex);
for(int i=0;i<sortedIndex+1;i++){
System.out.println(a[i]);
}
}
}
python:
def removeDuplicatesFromSortedArray(A):
if len(A) <= 0:
return A
i = 0
j = i+1
last_seen = A[i]
while j < len(A):
if A[j] != last_seen:
last_seen = A[j]
i += 1
temp = A[i]
A[i] = A[j]
A[j] = temp
j += 1
for n in xrange(j-1, i, -1):
del A[n]
return A
if __name__ == '__main__':
A = [1,1,2,3,4,5,6,6,6,6,6,6,7,7]
print removeDuplicatesFromSortedArray(A)
A = [1,1,1,1,1,1,1,1,1,1,1,1,1]
print removeDuplicatesFromSortedArray(A)
A = []
print removeDuplicatesFromSortedArray(A)
A = [1,1,1,1,1,1,1,1,1,1,1,1,1,6]
print removeDuplicatesFromSortedArray(A)
A = [1,2,3,4,5,6]
print removeDuplicatesFromSortedArray(A)
C++ solution. I know C++ is horrible with arrays, thus the messy handling of setting the unnecessary space to default values, and having to pass in a size, etc.
O(n)
// uniquefy the sorted array with duplicates
int* uniquefyArr(int* arr, int size) {
int tgtIndex = 0;
for (int i = 1; i < size; i++) {
if (arr[i] != arr[tgtIndex]) { // move the tgtIndex pointer
tgtIndex++;
arr[tgtIndex] = arr[i];
}
}
tgtIndex++;
// clean-up
while (tgtIndex < size) {
arr[tgtIndex] = NULL;
tgtIndex++;
}
return arr;
}
int main() {
int arr[17] = {1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8, 8, 9, 9, 9};
// print initial array
cout << "{";
for (int i : arr) {
cout << i << ", ";
}
cout << "}" << endl;
// print the result
int* uniquified = uniquefyArr(arr, 17);
cout << "{";
for (int j = 0; j < 17; j++) {
cout << uniquified[j] << ", ";
}
cout << "}" << endl;
return 0;
}
Sort the numbers in a tree skipping the numbers if repeated, then do an in order traversal
If we need to keep the original array R we need two traversals:
1.) count the number of duplicates D
2.) create new array[R - D] and add only the non repeated elements
return the array from 2.)
Here is a sample Java code:
public int[] getUnique(int[] sortedArr) {
if (srotedArr = null || sortedArr.length = 0)
return sortedArr;
int[] clearedArr = new int[sortedArr.length - countDuplicate(sortedArr)];
int insIdx = 0;
Integer lastValue = null;
for (int i = 0; i < sortedArr.length; i++)
if (lastValue == null || lastValue != sortedArr[i])
clearedArr[insIdx++] = sortedArr[i];
lastValue = sortedArr[i];
}
return clearedArr;
}
private int countDuplicate(int[] sortedArr) {
int cnt = 0;
Integer lastValue = null;
for (int i = 0; i < sortedArr.length; i++)
if (lastValue != null && lastValue == sortedArr[i])
cnt++;
else
lastValue = sortedArr[i];
return cnt;
}
Are you sure your countDuplicate method is correct?
Consider the simple example [0, 1, 2]. Number of duplicates in this input is 0, but your method would return 2.
- NuclearTeen August 13, 2012