Bloomberg LP Interview Question
Financial Software DevelopersI think answer of this question is not based on the complexity
becuase he given 1 millian of data
he just want to chack ur approach to sort huge amount of data
so it is not possibble to store all the data in memory at same time
so we cant able to sort it using quick sort
but we can use merge sort by dividing each portion in small parts and sort them individually and merge them by which can get final sorted result ..
so please verify me as am i right or not..??
Between mergesort and quicksort I'd vote for quicksort, because it's in-place as someone has already said, and because it's recursive, and so (I'm mostly guessing) has better cache behavior. Definitely C++'s std::sort is better than C's qsort because with qsort the values are passed via void*, ie you lose the type information, meaning the compiler misses possible opportunities to optimize. Also, C++'s sort is a template, it's possible that the comparison functor will be inlined, something that's not possible with C's qsort. All of this is in Stroustrup's blue book, where he discusses std::sort.
In fact, I think time complexity won't be a big issue here. We should consider the potential page fault. If we use merge-sort, the possibility for page fault will be apparently decrease and page fault can lead to a big efficiency problem. In addition optimization in C++, I think the more robust thread library will be a big advantage since merge-sort's efficiency can also be improved by the multithreading
Merge sort is preferred becoz of its complexity i.e. O(nlogn),but quick sort has limitation that number should not be already sorted.
as far as c Or c++. I don't think it really matters because under the hood c++ uses c.
Thats why i said about c,c++. but interviewer doesnt seemed pleased.
infact he never seemed pleased with any of my answers.
> as far as c Or c++. I don't think it really matters
No, C++ is faster, because C's 'qsort' function takes void* ptrs to what it's sorting, and those are opaque to the compiler. C++'s std::sort on the other hand, is templated by the type, and exposing that type information lets the compiler speed the code up.
since we have large set of data ,so can be done is using the concept of external merge sort i.e bring a chunk of data into the memory and sort it ...
i had read bout this logic sumtime back.. plz reply u find anything wrong in this ..
For sorting 1 million numbers, I guess quick sort is better as the merge sort's space complexity would be too bad
Theta(n) space complexity for merge sort
A million numbers would be a million bytes i.e. a megabyte. hmmmm thats the size of a picture on my camera.
Anyways stackoverflow has more discussion on Merge vs Quick. Check it out
As for me quicksort look preferable choise as not requires additional memory. If there's doubt that number can be already sorted then we can use randomization first.
If I was on the interview I would ask what type of numbers are in array. If integer, why not to use LSD sorting? If numbers in array are "generally" ordered, we can use Insertion sort - on partially sorted data it outperforms quicksort.
Merge Sort has a time complexity of only nlogn for the worst case while it is n^2 for quick sort. But,Quick sort is an in-place sort, but merge-sort is not. So extra space requirements of O(n) for merge sort.
With regard to C or C++, C++ is preferred due to its better optimization feature.
One question for u guys.. Is heap sort is the best sort for handling large no. of inputs?
As it has a worst case of O(nlogn) and no space requirements...
If whole data fits in memory and data is uniformly random, quicksort can perform better (please note, if data is already sorted, qsort performs O(n^2) operations). If data can be sorted and can be random, merge sort would be advantegous.
If whole data does not fit into memory (ie. you are sorting 1M data on a machine which has 512K memory) external merge sort with replacement selection algorithm performs better. Data split up into pieces and in each run a piece of data is sorted and output to disk.
If we consider the time complexity then Merge sort is better because
time complexity of merge sort is O(nlogn) in all cases but for quick sort in some cases the time complexity can be O(n^2)(i.e due to bad partitions)
I we consider the space complexity quick sort is better because it is a in place sort
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#include<time.h>
static int mergeCount=0;
int scanNumbers(int n, int a[n])
{
int i;
for(i=0;i<n;i++) {
//scanf("%d",&a[i]);
a[i]=rand()%1000+1;
}
}
void copyArray(int n, int a[n], int b[n]) {
int i;
for(i=0;i<n;i++) {
b[i]=a[i];
}
}
int printNumbers(int n, int a[])
{
int i;
for(i=0;i<n;i++) {
printf("%d ",a[i]);
}
printf("\n");
}
/*void insertionSort(int n, int a[]) {
int i,j,temp,count=0;
for(i=1;i<n;i++)
{
for(j=i;j>0;j--)
{
count++;
if(a[j-1] > a[j]) {
temp = a[j-1];
a[j-1] = a[j];
a[j] = temp;
}
}
}
printf("No. of iterations in insertion sort is %d\n", count);
}
void selectionSort(int num, int a[]) {
int i,j,temp,count=0;
for(i=0;i<num-1;i++) {
int j,minIndex=i,min=a[i];
for(j=i+1;j<num;j++) {
count++;
if (a[j] < min) {
min = a[j];
minIndex = j;
}
// Finds the smallest number in the subarray from i+1 to num
}
int temp=a[i];
a[i]=a[minIndex];
a[minIndex]=temp;
// This makes sure all the elements upto index i are found and sorted
}
printf("No. of iterations in selection sort is %d\n", count);
}
void bubbleSort(int num, int a[]) {
int i,j,temp,count=0;
for(i=0;i<num;i++)
{
for(j=i+1;j<num;j++)
{
count++;
if (a[j]<a[i])
{
temp=a[i];
a[i]=a[j];
a[j]=temp;
}
}
}
printf("No. of iterations in bubble sort is %d\n", count);
}
*/
void mergeSort(int start, int end, int a[]) {
mergeCount++;
if (start == end) {
return;
}
int half = (start+end)/2;
mergeSort(start, half, a);
mergeSort(half+1, end, a);
int b[end-start+1], count=0, left=start, right=half+1;
while (left<=half && right <=end) {
if ( a[left] < a[right]) {
b[count] = a[left];
left++;
} else {
b[count] = a[right];
right++;
}
count++;
}
if (right > end) {
while (left <=half) {
b[count++]=a[left++];
}
}
if (left > half) {
while (right <=end) {
b[count++]=a[right++];
}
}
int j=start;
count=0;
while (j<=end) {
a[j++]=b[count++];
}
}
int main() {
srand(time(0));
/*char ch;
printf("Select sorting algorithm to use:");
scanf("%c", &ch);
if ( ch != 'i' && ch != 's' && ch != 'b') {
printf("Wrong input %c", ch);
return 1;
}*/
int n;
printf("enter a number to sort: ");
scanf("%d",&n);
int a[n],b[n];
scanNumbers(n, a);
printNumbers(n, a);
/* copyArray(n, a, b);
printf("insertion sort started\n");
insertionSort(n, b);
printf("insertion sort completed\n");
copyArray(n, a, b);
printf("bubble sort started\n");
bubbleSort(n, b);
printf("bubble sort completed\n");
copyArray(n, a, b);
printf("selection sort started\n");
selectionSort(n, b);
printf("selection sort completed\n");*/
copyArray(n, a, b);
printf("Merge sort started\n");
mergeSort(0, n-1, b);
printf("Merge sort completed with iterations %d\n", mergeCount);
printNumbers(n, b);
/*if ( ch == 'i') {
printf("Sorting numbers using insertion sort\n");
insertionSort(n, a);
} else if ( ch == 'b') {
printf("Sorting numbers using bubble sort\n");
bubbleSort(n, a);
} else if ( ch == 's') {
printf("Sorting numbers using selection sort\n");
selectionSort(n, a);
}*/
//printNumbers(n, a);
return 0;
}
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#include<time.h>
static int mergeCount=0;
int scanNumbers(int n, int a[n])
{
int i;
for(i=0;i<n;i++) {
//scanf("%d",&a[i]);
a[i]=rand()%1000+1;
}
}
void copyArray(int n, int a[n], int b[n]) {
int i;
for(i=0;i<n;i++) {
b[i]=a[i];
}
}
int printNumbers(int n, int a[])
{
int i;
for(i=0;i<n;i++) {
printf("%d ",a[i]);
}
printf("\n");
}
void insertionSort(int n, int a[]) {
int i,j,temp,count=0;
for(i=1;i<n;i++)
{
for(j=i;j>0;j--)
{
count++;
if(a[j-1] > a[j]) {
temp = a[j-1];
a[j-1] = a[j];
a[j] = temp;
}
}
}
printf("No. of iterations in insertion sort is %d\n", count);
}
void selectionSort(int num, int a[]) {
int i,j,temp,count=0;
for(i=0;i<num-1;i++) {
int j,minIndex=i,min=a[i];
for(j=i+1;j<num;j++) {
count++;
if (a[j] < min) {
min = a[j];
minIndex = j;
}
// Finds the smallest number in the subarray from i+1 to num
}
int temp=a[i];
a[i]=a[minIndex];
a[minIndex]=temp;
// This makes sure all the elements upto index i are found and sorted
}
printf("No. of iterations in selection sort is %d\n", count);
}
void bubbleSort(int num, int a[]) {
int i,j,temp,count=0;
for(i=0;i<num;i++)
{
for(j=i+1;j<num;j++)
{
count++;
if (a[j]<a[i])
{
temp=a[i];
a[i]=a[j];
a[j]=temp;
}
}
}
printf("No. of iterations in bubble sort is %d\n", count);
}
void mergeSort(int start, int end, int a[]) {
mergeCount++;
if (start == end) {
return;
}
int half = (start+end)/2;
mergeSort(start, half, a);
mergeSort(half+1, end, a);
int b[end-start+1], count=0, left=start, right=half+1;
while (left<=half && right <=end) {
if ( a[left] < a[right]) {
b[count] = a[left];
left++;
} else {
b[count] = a[right];
right++;
}
count++;
}
if (right > end) {
while (left <=half) {
b[count++]=a[left++];
}
}
if (left > half) {
while (right <=end) {
b[count++]=a[right++];
}
}
int j=start;
count=0;
while (j<=end) {
a[j++]=b[count++];
}
}
int main() {
srand(time(0));
/*char ch;
printf("Select sorting algorithm to use:");
scanf("%c", &ch);
if ( ch != 'i' && ch != 's' && ch != 'b') {
printf("Wrong input %c", ch);
return 1;
}*/
int n;
printf("enter a number to sort: ");
scanf("%d",&n);
int a[n],b[n];
scanNumbers(n, a);
printNumbers(n, a);
/* copyArray(n, a, b);
printf("insertion sort started\n");
insertionSort(n, b);
printf("insertion sort completed\n");
copyArray(n, a, b);
printf("bubble sort started\n");
bubbleSort(n, b);
printf("bubble sort completed\n");
copyArray(n, a, b);
printf("selection sort started\n");
selectionSort(n, b);
printf("selection sort completed\n");*/
copyArray(n, a, b);
printf("Merge sort started\n");
mergeSort(0, n-1, b);
printf("Merge sort completed with iterations %d\n", mergeCount);
printNumbers(n, b);
/*if ( ch == 'i') {
printf("Sorting numbers using insertion sort\n");
insertionSort(n, a);
} else if ( ch == 'b') {
printf("Sorting numbers using bubble sort\n");
bubbleSort(n, a);
} else if ( ch == 's') {
printf("Sorting numbers using selection sort\n");
selectionSort(n, a);
}*/
//printNumbers(n, a);
return 0;
}
Quicksort should be used since it can be done in-place. Mergesort has a space requirement of O(N) and so, is not suitable in this case.
- Metta September 28, 2010