Two Sigma Interview Question for Software Engineer / Developers


Country: United States
Interview Type: Phone Interview




Comment hidden because of low score. Click to expand.
1
of 1 vote

A generic and general solution is hard (billions of integers, large integers (e.g. 64 bit) and the desire to be exact and fast in all situations, extremes), but you may have some constraints you can leverage e.g.:
- integer size is only 10 or 16 bit: you could use counting sort technique
- do you know something about the distribution: can you tell a probability the median being within a certain range after seeing a few values?
- you might get away with less precision for the median? Maybe precise in a certain range and less if far away from expected?
- size of the stream, can it fit into memory?
- can you assume after you've seen a sample (e.g. 10'000 values) the median is reasonable stable?

The two classics are
- counting sort if median range can somehow be limited
let's assume you have 10 bit values, you count for each of the 1024 possible values
the number of occurences. you insert in O(1) and if you need to know the median,
it's in O(1), too --> easy
- two heaps if values fit in memory and integers are large
Min-heap (for values >= median) and Max-heap (for values < median)
if the next value from the stream is < Min-heap, insert it to the min heap otherwise to the max heap
if the two heap sizes differ by more than 1, take the top from the larger heap and put it to the smaller heap
to get the median: if both heaps are equal size, take the average of the bot tops otherwise take the top of the larger heap
--> insert O(lg(N)), get median: O(1)

maybe interesting could be, if you work with the two heaps but clean up memory from time to time by removing the tails on both heaps. You could for example say, when your heap exceeds 10'000 items, you purge 5'000 tail items, in the min heap this would mean for example to remove the 5'000 values (first quartile) or median of max-heap and do the symetrically the same with the min-heap (forth quartile) ...
depending on the nature of the distribution, chances are, your median never moves out the window and you get a O(1), O(1) solution since you limit N to a constant ...

- Chris April 26, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

max heap, mean, and min heap would be one of possible solution.

- Anonymous April 27, 2017 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More