Yahoo Interview Question for Software Engineer / Developers






Comment hidden because of low score. Click to expand.
1
of 1 vote

Each machine needs to be able to
(a) Sort the numbers:

Sort();

(b) Given a value x, return the number of elements < x, == x, > x.

int LessThanCount(int x);
int EqualCount(int x);
int BiggerThanCount(int x);

(c) Given indexes begin and last, return a median of the values between begin & last (inclusive).

int Median(int begin, int last);

All 3 need additional constant space which should be OK.

Now, the master machine will keep values begin[i] and last[i] for each machine (and 3 more for temporary storage). (2+3)*N = O(n) data.

Algorithm:

(0) Initially, set begin[i] = 0, last[i] = N-1 for all i, 0 <= i < N.
(1) Sort the data on each machine. Sort();

Repeat:
    (2) Find a machine k for which last[k]-begin[k] is maximum
    (3) Ask machine k for median, med = Median(begin[k], last[k]);
    (4) For each machine i, get 3 additional numbers:
            smaller[i] = LessThanCount(med);
            equal[i] = EqualCount(med);
            bigger[i] = BiggerCount(med);
    (5) Sum all these numbers to obtain
            TotalSmaller, TotalEqual, and TotalBigger
    (6) if (abs(TotalSmaller-TotalBigger) <= TotalEqual)
            We are done, return x.
    (7) if not, we have 2 choices:
       (a) TotalSmaller + TotalEqual < TotalBigger (i.e. x is too small)
           Set begin[i] = smaller[i] + equal[i].
       (b) TotalSmaller > TotalEqual + TotalBigger (i.e. x is too big)
           Set last[i] = smaller[i] - 1.

Note that at each step, we are either increasing begin[i] or decreasing last[i] and therefore the pool of possible values is being reduced. This is because the new x is picked from values that are bigger than x_s and smaller than x_b, where x_s is the most recent x that was too small and x_b is the most recent x that was too big.

The biggest concern here is obviously the space and there we used O(N) per machine as required.

The running time is something like
O(N*log_N) for sorting in paralel on all machines (Heapsort, no extra space)
+
O(N*log_N) for partitioning
-- We have total of O(log_n) iterations, at least on average. In each iteration, each machine will take at most O(n) time to return smaller[i], equal[i], and bigger[i] in parallel, plus we need to add these and assign new begin/last which also is O(n).

Total = O(N*log_N) time, O(N) space per machine.

- czpete October 11, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Beautifully described.

- cirus November 10, 2010 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

very nicely written indeed!

- abc January 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

very nicely written indeed!

- abc January 02, 2011 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

something similar to external merge sort??

- Mahesh September 16, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

What we just needed is median of median, find the median of each file and then find the median of the median of all the files.

- Abhishek Kothari June 23, 2014 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More