Facebook Interview Question
Software EngineersCountry: United States
Interview Type: In-Person
Sure. The point is to divide the x-axis into segments which indicate on which segments which producer is the closest one. If you take any 2 producers with distinct x-axis value, theres a unique point on the x-axis where the closest producer changes from one to the other. First the goal is to have a deque which stores the producer, and the point on the x-axis where this producer becomes the closest producer.
First we sort all the producers by the x-axis, and then start iterating over them. For each consecutive pair of producers we calculate the point on the x-axis where the right producer overtakes the left producer and becomes the closest, and we insert that producer along with that x-axis value to a deque. If this x-axis value is to the left of the previous value, we pop the previous producer/value pair out first, since now we know that that producer can never be the closest producer on the x-axis (since its always eclipsed by this new producer we're handling). We continue doing this until we've iterated through all the points, and end up with a deque which gives us the segments where each producer is the closest to the x-axis, after which determining the closest producer to each consumer is trivial
Hi, I wrote a checker for your program and it gave some counterexamples
Consumers = [5, 7, 8] Producers = [(1, 7), (3, 5), (9, 1)]
Expected [(9, 1), (9, 1), (9, 1)]
Expected distances: [17, 5, 2]
Got [(3, 5), (9, 1), (9, 1)]
Got distances: [29, 5, 2]
Here is a checker hxxp://ideone.com/9pYTpR
@emb you're right, there's a bug, line 4th to last line should have while, not if. Replace "if len(d) > 1 and c >= d[1][1]:" with "while len(d) > 1 and c >= d[1][1]:" and it should work.
Your solution is really elegant. Here is my way (slower but still O(n))
Suppose we need to find nearest(consumers, producers) and there are more consumers than producers. We know that nearest producers indices are increasing.
So we find a solution for nearest(consumers[::2], producers) and then using answers for even consumers, find answer for odd ones in O(len(consumers) + len(producers)).
Suppose now there are more producers than consumers.
Since for each consumer we need to find one producer, we can throw away len(producers) - len(consumers) producers (or more). That's how we do it:
We maintain a stack of producers: p[0], p[1], p[2], p[3], ..., p[top]
where p[0] is current best for consumer 0, p[1] is current best for consumer 1, and so on
For each producer we do the following: (a)
1)if stack is empty, push that producer on the stack
2)otherwise, suppose topmost producer is p[top]. If that producer is further than current producer to consumers[top], we pop p[top] and goto (a). If we didn't go to (a), insert producer at the top (only if top < len(consumers)
The resulting complexity is O(n + n / 2 + n / 4 + ...) = O(n)
Your soln is based on the fact that no two producers have the same x-coordinate.
It is quite easy to extend it to the general case: indeed, after sorting the producers by x-coordinate, we can take only the one with the y coordinate closest to 0 and safely ignore all the other "covertical" producers
Use some elimination rules to avoid traversing all producers
1. Sort consumers
2. Sort producers on X and then on Y for the ones that has the same X
for each consumer
3. Stop search when a producer is further on X axis than the distance to the closest producer found thus far
when moving to the next consumer,
4. start from the producer closest to the previous consumer
certainly not O(n) but some improvements here.. very curious to see the O(n) solution if one exists.
C#
public static List<Producer> GetNearestProducer(List<int> consumers, List<Producer> producers)
{
List<Producer> result = new List<Producer>();
if ((consumers != null) && (producers != null))
{
consumers.Sort();
producers.Sort();
int nearestProducerIndex = 0;
Producer nearest = null;
foreach(int c in consumers)
{
Double minDistance = Double.MaxValue;
int lastX = int.MaxValue;
//Start from the closest
for (int i = nearestProducerIndex; i < producers.Count(); i++)
{
Producer p = producers[i];
if (lastX != p.X) //skip all but smallest Y value for a given X
{
lastX = p.X;
if (Math.Abs(p.X - c) > minDistance)
{
/* distance along x axis alone is greater than minDistance
* for any point further down the producers list*/
break;
}
Double distance = p.GetDistance(c);
if (distance < minDistance)
{
minDistance = distance;
nearest = p;
nearestProducerIndex = i;
}
}
}
//this is the closest Pruducer
result.Add(nearest);
}
}
return result;
}
class Producer : IComparable
{
public int X;
public int Y;
public Producer (int x, int y)
{
this.X = x; this.Y = y;
}
public int CompareTo(Object o)
{
Producer p = o as Producer;
if (p == null)
{
throw new ArgumentException();
}
else
{
//first sort by X, then by Y
return this.X == p.X ?
Y.CompareTo(p.Y) : X.CompareTo(p.X);
}
}
public Double GetDistance(int x)
{
return Math.Sqrt(Math.Pow((X - x), 2) + Math.Pow(Y, 2));
}
}
Since the interviewer wanted an algorithm with a runtime less than O(n^2), I guess we can use the nearest neighbor search algorithm with a kd-tree. Finding the nearest point to a given input point takes O(log n), and since there are N points, we can get the solution in O(nlogn) time.
This would work but coding a kd tree in an interview seems nearly impossible and wouldnt solve the second problem with linear run time
#include <stdio.h>
#include <math.h>
typedef struct
{
double x;
double y;
} point;
int calculateDistance(point c, point p, double *distance)
{
point d1 = {(p.x)-(c.x), (p.y)-(c.y)};
*distance = sqrt(pow(d1.x, 2) + pow(d1.y, 2));
}
int main()
{
int sizeC, sizeP, i, j;
// inputting the sizes of consumers and producers
printf("Enter number of consumers: ");
scanf("%d", &sizeC);
printf("Enter number of producers: ");
scanf("%d", &sizeP);
// getting inputs for consumers
point consumer[sizeC];
for(i=0; i<sizeC; i++){
printf("Enter consumer(x) %d: ", i+1);
scanf("%lf", &(consumer[i].x));
consumer[i].y = 0.0;
}
// getting inputs for producers
point producer[sizeP];
for(i=0; i<sizeP; i++){
printf("Enter producer(x, y) %d: ", i+1);
scanf("%lf, %lf", &(producer[i].x), &(producer[i].y));
// calculating distances
double distance[sizeC][sizeP];
for(i=0; i<sizeC; i++)
for(j=0; j<sizeP; j++)
calculateDistance(consumer[i], producer[j], &distance[i][j]);
// displaying the least distance point from consumers
for(i=0; i<sizeC; i++){
printf("for %lf nearest producer is ", consumer[i]);
int leastIndex = 0;
for(j=0; j<sizeP; j++)
leastIndex = distance[i][leastIndex]>distance[i][j] ? j : leastIndex;
printf("(%.1lf, %.1lf)\n\n", producer[leastIndex].x, producer[leastIndex].y);
}
}
#include <stdio.h>
#include <math.h>
typedef struct
{
double x;
double y;
} point;
int calculateDistance(point c, point p, double *distance)
{
point d1 = {(p.x)-(c.x), (p.y)-(c.y)};
*distance = sqrt(pow(d1.x, 2) + pow(d1.y, 2));
}
int main()
{
int sizeC, sizeP, i, j;
// inputting the sizes of consumers and producers
printf("Enter number of consumers: ");
scanf("%d", &sizeC);
printf("Enter number of producers: ");
scanf("%d", &sizeP);
// getting inputs for consumers
point consumer[sizeC];
for(i=0; i<sizeC; i++){
printf("Enter consumer(x) %d: ", i+1);
scanf("%lf", &(consumer[i].x));
consumer[i].y = 0.0;
}
// getting inputs for producers
point producer[sizeP];
for(i=0; i<sizeP; i++){
printf("Enter producer(x, y) %d: ", i+1);
scanf("%lf, %lf", &(producer[i].x), &(producer[i].y));
// calculating distances
double distance[sizeC][sizeP];
for(i=0; i<sizeC; i++)
for(j=0; j<sizeP; j++)
calculateDistance(consumer[i], producer[j], &distance[i][j]);
// displaying the least distance point from consumers
for(i=0; i<sizeC; i++){
printf("for %lf nearest producer is ", consumer[i]);
int leastIndex = 0;
for(j=0; j<sizeP; j++)
leastIndex = distance[i][leastIndex]>distance[i][j] ? j : leastIndex;
printf("(%.1lf, %.1lf)\n\n", producer[leastIndex].x, producer[leastIndex].y);
}
}
Is the goal to print closest producer to every consumer or vice versa?
The question asks to find closest consumer to every producer, but the example answer lists the opposite.
typedef struct Point
{
int x;
int y;
};
int FindClosestConsumer(int array[], int firstIdx, int lastIdx, int numToFind)
{
if (firstIdx > lastIdx) return -1; //invalid inputs
int midIdx = -1;
while (firstIdx <= lastIdx)
{
midIdx = (firstIdx + lastIdx) / 2;
if (numToFind == array[midIdx]) break;
if (numToFind > array[midIdx])
{
firstIdx = midIdx + 1;
}
if (numToFind < array[midIdx])
{
lastIdx = midIdx - 1;
}
}
return array[midIdx];//note: midIdx cannot remain -1 since we are returning -1 error code if firstIdx<lastIdx and control will never reach this point.
}
int main()
{
unsigned int producerCount = 0, consumerCount = 0;
while (producerCount == 0)
{
cout << "Enter number of producers:\n";
cin >> producerCount;
if (producerCount == 0)
{
cout << "Please input producer count greater than 0.\n";
}
}
while (consumerCount == 0)
{
cout << "Enter number of consumers:\n";
cin >> consumerCount;
if (producerCount == 0)
{
cout << "Please input consumer count greater than 0.\n";
}
}
Point *producers = new Point[producerCount];
cout << "Enter " << producerCount << " producers(x y): \n";
for (int i = 0; i < producerCount; i++)
{
cin >> producers[i].x >> producers[i].y;
}
int *consumers = new int[consumerCount];
cout << "Enter " << consumerCount << " consumers(x): \n";
for (int i = 0; i < consumerCount; i++)
{
cin >> consumers[i];
}
for (int i = 0; i < producerCount; i++)
{
int closestConsumer = FindClosestConsumer(consumers, 0, consumerCount - 1, producers[i].x);
if (closestConsumer == -1)
{
cout << "Invalid inputs.\n";
}
else
{
cout << "Closest consumer for producer (" << producers[i].x << ", " << producers[i].y << ") =" << closestConsumer <<"\n";
}
}
delete[] consumers;
delete[] producers;
return 0;
}
Distance between consumer k and left of k-1's producer is not shorter than distance between consumer k and k-1's producer. Also, distance between consumer k and right of k+1's producer is not shorter than distance between consumer k and k+1's producer. We can solve this problem using binary search.
c++, binary search, O(n log n)
typedef unsigned long long UINT64;
const UINT64 MAX_DISTANCE = 18446744073709551615;
struct Point {
int x;
int y;
Point(int coordX, int coordY) {
x = coordX;
y = coordY;
}
bool operator<(const Point& a) const {
return x < a.x;
}
};
struct ConsumerData {
int index;
int left;
int right;
ConsumerData(int i, int l, int r) {
index = i;
left = l;
right = r;
}
};
int findNearestProducer(vector<Point>& producers, int consumer, int startIndex, int endIndex) {
int nearestIndex, i;
UINT64 minDistance, currDistance;
nearestIndex = 0;
minDistance = MAX_DISTANCE;
for (i = startIndex; i <= endIndex; i++) {
currDistance = (producers[i].x - consumer) * (producers[i].x - consumer) + producers[i].y * producers[i].y;
if (currDistance < minDistance) {
minDistance = currDistance;
nearestIndex = i;
}
}
return nearestIndex;
}
vector<Point> findNearestProducers(vector<Point>& producers, vector<int>& consumers) {
vector<Point> nearestProducers;
vector<int> nearestIndex;
queue<ConsumerData> q;
int i, index;
if (producers.size() == 0 || consumers.size() == 0) return nearestProducers;
sort(producers.begin(), producers.end());
sort(consumers.begin(), consumers.end());
nearestProducers.assign(consumers.size(), Point(0, 0));
nearestIndex.assign(consumers.size(), -1);
i = 0;
index = findNearestProducer(producers, consumers[i], 0, producers.size() - 1);
nearestIndex[i] = index;
nearestProducers[i] = producers[index];
if (consumers.size() == 1) return nearestProducers;
i = consumers.size() - 1;
index = findNearestProducer(producers, consumers[i], nearestIndex[0], producers.size() - 1);
nearestIndex[i] = index;
nearestProducers[i] = producers[index];
if (consumers.size() == 2) return nearestProducers;
q.push(ConsumerData(i / 2, 0, i));
while (!q.empty()) {
ConsumerData c = q.front();
q.pop();
i = c.index;
index = findNearestProducer(producers, consumers[i], nearestIndex[c.left], nearestIndex[c.right]);
nearestIndex[i] = index;
nearestProducers[i] = producers[index];
if (c.left < i - 1) {
q.push(ConsumerData((i + c.left) / 2, c.left, i));
}
if (c.right > i + 1) {
q.push(ConsumerData((i + c.right) / 2, i, c.right));
}
}
return nearestProducers;
}
Nice, you've got the O(n*log(n)) idea.
So we just take the middle consumer, find its nearest producer and then for consumers on the left/right we only need to search for producers on the left/right of nearest producer.
Next I was given the following hint:
However, for O(n) solution you need stronger observation - that is if we take some two consumers c_i and c_j (i < j) and some two producers p_k p_l (k < l) then if distance(c_i, p_k) > distance(c_i, p_l), then distance(c_j, p_k) > distance(c_j, p_l). Or, in english - if some consumer is closer to p_l than to p_k, then all consumers on the right are closer to p_l than to p_k
isn't this property already used in kyduke's O(nlgn) solution? Can anybody clarify how this observation be used for a O(n) solution?
/*assuming both producers and consumres are sorted by x. The premise is that in order to find the closest producer to consumer i, we only need to look at producer of consumer i-1 onwards on x axis*/
function findNearestProducer(producers, consumer, startIndex) {
var closestProducerIndex = startIndex,
producer,
distance,
closestDistance;
for(; startIndex < producers.length; startIndex++) {
producer = producers[startIndex];
distance = Math.pow(producer.x - consumer, 2) + Math.pow(producer.y,2);
if(closestDistance === undefined || distance < closestDistance) {
closestProducerIndex = startIndex;
closestDistance = distance;
}
}
return closestProducerIndex;
}
function printNearestProducers(producers, consumers) {
var i, j, consumer, producer, lastProducerIndex = 0;
for(i = 0; i < consumers.length; i++) {
consumer = consumers[i];
lastProducerIndex = findNearestProducer(producers, consumer, lastProducerIndex);
console.log(JSON.stringify(consumer) + " " + JSON.stringify(producers[lastProducerIndex]) + "\n");
}
}
printNearestProducers([{x: 0, y: 3},{x: 1, y: 1},{x: 3, y: 2},{x: 8, y: 10},{x: 9, y: 100}], [1, 5, 7]);
// O(n*m) where n is the no of consumer and m is the no of producers
1. Sort consumers
2. Sort producers on X and then on Y for the ones that has the same X
for each consumer
3. calculate the distance from consumer to producer. c^2=a^2+b^2 ( since the points for a triangle) unless consumer and producer have the same x-coordinate and in such case "distance = y-coordinate of producer"
import java.util.*;
/* P
* |\
* b| \c => c^2 = a^2 + b^2 i.e c =
* | \
* 0,0|____\C
* a
*
*/
public class ConsumerProducer {
//Integer = consumer , HashMap<Integer,List<Integer> => distance from the nearset producer List<Integer>
static HashMap<Integer,HashMap<Integer,Integer[]>> resultMap = new HashMap<Integer,HashMap<Integer,Integer[]>>();
static List<Integer[]> producer = new ArrayList<Integer[]>(); // Integer[] x,y
static List<Integer> consumer = new ArrayList<Integer>(); // Consumer x
public static int getDistance(Integer[] producer, int consumer){
if(producer[0] == consumer){ // both have same x coordinate , return y
return producer[1];
}else{
int a = Math.abs(consumer-producer[0]);
int b = producer[1];
int c = (int)Math.sqrt(a*a + b*b);
return c;
}
}
public static void getNearestPorducer(){
for(int i=0;i<=producer.size()-1;i++){
for(int j=0;j<=consumer.size()-1;j++){
Integer[] currProducer = producer.get(i);
Integer currConsumer = consumer.get(j);
Integer newDistance = getDistance(currProducer,currConsumer);
if(resultMap.containsKey(currConsumer)){ // if that consumer was already calculated via other producer
Integer oldDistance = (int)resultMap.get(currConsumer).keySet().toArray()[0];
if(newDistance < oldDistance){
resultMap.get(currConsumer).remove(oldDistance); // remove that key
resultMap.get(currConsumer).put(newDistance, currProducer); // add current producer
}
}else{
HashMap<Integer,Integer[]> newMap = new HashMap<Integer,Integer[]>();
newMap.put(newDistance,currProducer);
resultMap.put(currConsumer, newMap); // new current producer
}
}
}
for(int key : resultMap.keySet()){
System.out.println("Conumer : "+ key);
int nearestProducerDistance = (int)resultMap.get(key).keySet().toArray()[0];
System.out.println("Nearest Producer with distance : " + nearestProducerDistance + " -> " + Arrays.toString(resultMap.get(key).get(nearestProducerDistance)));
System.out.println();
}
}
public static void main(String[] args) {
producer.add(new Integer[]{0, 3});
producer.add(new Integer[]{1, 1});
producer.add(new Integer[]{3, 2});
producer.add(new Integer[]{8, 10});
producer.add(new Integer[]{9, 100});
consumer.add(1);
consumer.add(7);
consumer.add(5);
Collections.sort(consumer);
Collections.sort(producer, new Comparator<Integer[]>(){
public int compare(Integer[] o1, Integer[] o2) {
Integer o1x = o1[0];
Integer o1y = o1[1];
Integer o2x = o2[0];
Integer o2y = o2[1];
if(o1x < o2x){ // Sort on x and if x is same , on y
return -1;
}else if(o2x < o1x){
return 1;
}else{
return o1y.compareTo(o2y);
}
}
});
getNearestPorducer();
}
}
The idea is simple.
For all the points on the plane, just grab their x value. Then search in the consumer closest value to x. Then that is the closest point. The key here is consumer is a simple straight line. So the hypotenuse cannot be made smaller by picking far x values.
I wonder why they asked such a tough question, took me better part of an hour to figure out the algorithm and probably another to get the code right (though there might be an easier solution too, not sure). I'd guess in the actual interview they gave much more hints... or maybe they're just looking for algo researchers and competitive programmers, who knows.
Here's O(n) solution in python (or nlogn if sorting needed). The key observation is that when we iterate the producers from left to right, the new producer will become the closest one to the x-axis at some point eventually, unless a later producer completely eclipses it.
- Anonymous October 22, 2015