Amazon Interview Question
SDE1sCountry: India
Interview Type: In-Person
1) Take 2 files in memory, pick the top 1 million entries from it (just like we do for merge sort) and store them in a 3rd file. Name this third file as 'X'.
2) Take one more file from disk, and X, and together follow the same step as in step 1).
Essentially, in each pass, we are getting the best 1 million of the two files involved for comparison.
Note: This is a naive solution. Thinking for better ones.
The condition is strong 10 files each has 10 million SORTED number. I will use merge sort.
3 million in memory to use: one million left for the sorted number; then 2 million cut into 10 sections, each has 0.2 million. Keep ten pointers to each of them. Get the values of the pointers pointing to and get the smallest, put into the "sorted" area. Read a new integer from the file which is related to the selected section and update the value the pointer pointed to and increase the pointer by 1. (For all pointers, it can not be more than 1 million: pointer % 0.2 million). For each value need no more than 9 comparison. Overall efficiency O(n).
1. create a file reader for each file
- Sniga June 07, 20142. create Min Heap and put first number from each file.
3. delete min from Min Heap. if the deleted element is from file 'X' then read next element from file 'X' only
4. repeat step 3 for 1 million times.