## Spotify Interview Question for Software Engineer / Developers

Country: United States
Interview Type: Phone Interview

Comment hidden because of low score. Click to expand.
2
of 2 vote

Use "external sort" and find duplicates. O(n logn)

Comment hidden because of low score. Click to expand.
1
of 1 vote

Modify the external sort and remove duplicate at the time of merging

Comment hidden because of low score. Click to expand.
0
of 0 vote

I suppose one way would be to break the set up into chunks so that two chunks fit into memory at the same time, then test the first chunk against every other chunk, removing duplicates. Then test the second chunk against all but the first, removing duplicates and so on. In the end you have tested every chunk against every other chunk, so there should be no duplicates.
It would take O(N^2) time.

Comment hidden because of low score. Click to expand.
0

Yeah , the solution seems correct ... just one thing is you need to consider each chunk itself if duplicates exist in that chunk itself

Comment hidden because of low score. Click to expand.
0
of 0 vote

lk.kh

Comment hidden because of low score. Click to expand.
0
of 0 vote

yi il,i

Comment hidden because of low score. Click to expand.
0
of 0 vote

Bucket sort or external sort

Comment hidden because of low score. Click to expand.
0
of 0 vote

``````import com.google.common.collect.ArrayListMultimap;

import java.util.*;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RemoveDuplicates {
private static final int MEMORY_SIZE = 4;

public static void main(String[] args) {
final List<String> values = Arrays.asList("1", "2", "4", "a", "2", "a", "9", "3", "1", "7", "3");
removeDuplicates(values.stream(), values.size(), hashFunction(), MEMORY_SIZE, 0);
}

private static final int[] primes = new int[] { 1, 2, 3, 5, 7, 11, 13, 17, 19, 23}; // Etc.

public static BiFunction<String, Integer, Integer> hashFunction() {
return (s, prime) -> s.hashCode() % prime;
}

public static <T> void removeDuplicates(final Stream<T> dataset, final int size, BiFunction<T, Integer, Integer> hashFunction, final int memorySize, int iteration) {
if(size > memorySize) { // We can also choose to do an external sort when iteration > MAX_ITERATION and then de-duplicate (but that is slower)
final ListMultimap<Integer, T> map = ArrayListMultimap.create();
dataset.forEach(el -> map.put(hashFunction.apply(el, primes[iteration]), el));
for(Integer key : map.keySet()) {
final Collection<T> value = map.get(key);
System.out.println("List (" + iteration + "): " + value);
removeDuplicates(value.stream(), value.size(), hashFunction, memorySize, iteration + 1);
}
} else {
final Set<T> set = dataset.distinct().collect(Collectors.toSet());
System.out.println(set);
}
}
}``````

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.