## Microsoft Interview Question

Software Engineer / Developerswhat about this approach ?

I am assuming that sorting needs to be done on "id" and its unique for each individual.

1. Lets split the Big file with million entries into n (f1,f2,..fn) small files.

2. Maintain the n small files in n min-heaps.

3. Now in each heap we know the smallest id in that heap, and we have n ids.

4. We take the root of each n heap in memory and find the min amongst them, lets say its xj.

5. Now we know xj is the smallest id present in the whole of file.

6. we swap xj from heap fj with the last element of f1 heap, reduce the size of f1 by one, and heapify f1 and fj from which x1 was originally part of.

Repeat 4, 5, 6 till we run out of heaps.

After that we will have n arrays (f1,f2..fn) which are sorted.

just append these arrays and we have the sorted file wrt to id.

would go for an external merge sort

- Anonymous January 21, 2008