Amazon Interview Question
InternsCountry: India
Interview Type: In-Person
One way to do this is:
1.Read as many number as possible into RAM, sort them and write them to a file, say F1.
2. Do the above step until the array is exhausted. As a result we will have a number sorted files F1,F2,....Fm
3.Construct a mean-heap, using the first entry of each of these files.
4.Extract the root of the mean-heap, write it to the final destination, and replace it with a value from the file where the original root value belonged to and maintain mean-heap property.
5. Repeat 3-4 until all the files are written into the final destination.
Using bit manipulation array. That is if we have 2000 elements then using the bit operation they can be stored in 625 memory locations. Now their index is specified by element/32 and their bit position is specified by element%32.
Suppose we have 2000 elements .I am just taking some elements as:
arr[2000]={33,32,65,35};
now result array=33/32=1 and bit is 33%32= so bit one is set that is result[1]=2
now for 12/32=0 and bit is 32%32=0 so in index 0 bit 0 is set that is result[0]=1
for 65/32 index is 2 and bit is 1 set so result[2]=2
35 index is again 1 so 35%32=3 so bit result[1]=2|8=10
in this way the result array holds the numbers. Now to sort them
restore the array in sorted order. This one is good algo and was given by algos when i asked him the same question.
int i,j,index=0;
for(i=0;i<=625;i++)
for(j=0;j<32;j++)
{
array[index++]=i*32+j;
}
Using a bit-vector is a good idea if the elements to be sorted are all unique or if they contain duplicates up to a certain constant number, say the numbers all can have up to a maximum of 4 duplicates. However, manipulating the bit-vector can get unwieldy if the input contains arbitrary number of duplicates.
In that case, a min-heap is a good alternative to sort huge data that don't fit into main memory at once. Two limits - an upper limit and a lower limit can define the range of elements to be stored in the heap. Make (the required number of) passes over the data, capture the defined range of elements and print them out at the end of the pass.
For ex:if we have space to hold 1000 elements but need to sort 2000 elements, a heap to hold 500 elements can be used, the rest of the space being used to load elements from hard-disk, and then in 4-5 passes, we can capture, and print the elements in the ranges: 1-500, 501-1000, 1001-1500 & 1501 to 2000.
external merge sort
- algos August 02, 2013