Google Interview Question for Software Engineer / Developers






Comment hidden because of low score. Click to expand.
3
of 3 vote

The answer above is almost right.

Basically you use a min heap (sorting increasing order) of size k, where k is the number of files.

First you read the first records from all the files into the heap

Loop until (no more records in any of the files/file list is empty)
Remove the min element from the heap, add it to output file
Read the next element from the file from which the min element came
if the file has no more records, delete the file from the file list
continue
add it to the min heap

Now you have one file with all elements sorted.

The complexity is n log k, where n is the total number of records.

Another question is what if the number of files itself is too big? The answer is simple. Select the first k files that you can manage using your memory resources. Produce an output file. Then work with the next k files to produce the next output file. So on and so forth. Once done, now recurse your method with the new set of files (the output files, they are all sorted). Do it until you output only one file.

- Rajesh Konda June 29, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

wonderful explaination!!

- rhino December 21, 2009 | Flag
Comment hidden because of low score. Click to expand.
2
of 2 vote

http //en(dot)wikipedia(dot)org/wiki/External_sorting

complete explaination

- Arpit April 17, 2011 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

This softwareinterviewqna person seems to be trying to advertise his/her site here!

He/She copies the question posted here, to the blog and then answers it there and posts a link to the blog here!

I don't see any reason why you need to post your text-only answers elsewhere and post a link to that here. You might as well post the answer here.

- Anonymous March 05, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Please... cut the self-righteous BS about sharing. If you want to share, just put it here. Anyone who visits this site comes with sharing in mind. I don't see them trying to advertise their blogs/sites here.

btw, how useful and valuable would it be, when your site goes away?

If you want to, you can include the answer here AND include a link to your blog. Putting just the link comes off as unsolicited spam. Maybe your intentions are good, but it sure does not look that way.

btw, have my posts increased traffic to your site? :-)

- Anonymous March 06, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

What does a sorted file mean? Sorted line by line? What?

Assuming each line contains an int and we sort on that,

Just use a Heap of size k, for k files.

Now if the list of files is so huge that the OS can't handle them, then we might need to break the list of files up, create intermediate merge files and re-merge them. We might end up with a tree like structure for the merges, if multiple steps of re-merge are involved, so we will have to write a recursive function for that.

Note that the basic step for each merge step will remain the same and that is split into a separate function.

// All data structures etc made up, but you should get the idea.

BasicMerge(FileList files, File outputFile)
{
    MinHeap h = new MinHeap();
  
    while (files.HasFiles()) {
    
        foreach (File file in files) {

            if (file.ReachedEnd()) {
                files.Remove(file);
                continue;
            }

            int val = file.ReadInt();
            h.Insert(val);
        }

        int value = h.GetMin();
        outputFile.WriteInt(value);
    }
}

// This does the merge of the whole list, could be 100,000 files
// if required.
// maxOsFiles = constant which determines a limit of OS.
void CompleteMerge(FileList files, File outputFile) {

    FileList [] brokenUpList;
 
    FileList MergeFileList = new FileList();

    if (files.NumberofFiles() > maxOSFiles) {
        brokenUpList = ComputeSplit(files, maxOSFiles);        
    }else{
        BasicMerge(files, outputFile);
        return;
    }

    foreach (FileList fileList in brokenUpList) {
        File tempOutput = GenerateTempFile();
        BasicMerge(fileList, tempOutput);
        MergeFileList.Add(tempOutput);
    }
    CompleteMerge(MergFileList, outputFile);
}

- Anonymous March 07, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

The BasicMerge above is incorrect, but the idea to use a heap is correct.

- Anonymous March 07, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Nice explanation !

- Anonymous December 27, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

We can use Map Reduce for this problem. Load the 1000 sorted files into HDFS . In the map phase output individual values as intermediate key value pair. Then the shuffling and sorting process would automatically sort the values and present it to reducer in which we can write values to file c.

- Gautham Karkala November 01, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

http://www.softwareinterviewqna.com/blog/?p=90

- Anonymous March 05, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

OK, I'm the "softwareinterviewqna person".

For me this internet thing is all about sharing. Gayle provided this great place for us to share questions, I love it. I provide my solution with careful consideration, thorough explanation, I contributed my value. I choose to put it on my site, it's just simple like that.

- eraera March 06, 2009 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More