Facebook Interview Question
Software Engineer InternsCountry: UK
Interview Type: Phone Interview
If you're computing distance as euclidean distance, I believe you don't have to square root each calculation. I could have a lambda function for comparing distances like this:
{{
// C++
[](std::pair<int, int>& f, std::pair<int, int>& s) {
int first = pow(f.first, 2) + pow(f.second, 2);
int sec = pow(s.first, 2) + pow(s.second, 2);
return first < second;
}
}}
To sort the list of points.
Guys, use heap.
[ en.wikipedia.org/wiki/Heap_(data_structure) ]
You should be good to go. Get a k-heap, distance metric is Euclidean, I guess everyone knows that - and you should be good.
The code is so so trivial.
In Java, use a [ docs.oracle.com/javase/8/docs/api/java/util/PriorityQueue.html ]
1. O(k * n). We can go k times through the array. For i-th closest point, we find the closest point out of points i throug n. The closest point gets swapped with i-th point.
2. O(log k * n). We can use a heap (the code below).
3. Aa preprocessing, we can insert all of the points into a structure like R-Tree. Finding k closest point would be taking expected O(log n).
vector<Point> Closest(vector<Point> const &a, Point const &origin, int k)
{
vector<Point> closest;
if (a.size() >= k &&
k > 0)
{
priority_queue<PQEntry> pq;
for (int i = 0; i < a.size(); ++i) {
double dx = origin.x_ - a[i].x_;
double dy = origin.y_ - a[i].y_;
double distance = dx * dx + dy * dy;
bool insert = false;
if (i < k) {
insert = true;
} else if (distance < pq.top().distance_) {
pq.pop();
insert = true;
}
if (insert) {
pq.push(PQEntry(i, distance));
}
}
while (!pq.empty()) {
closest.push_back(a[pq.top().point_idx_]);
pq.pop();
}
}
return closest;
}
The best known algorithm that uses fast-selection is in O(n).
1. compute the distance of the points with the origin.
in order to find the k-smallest values, use the k-selection version that at every iteration the pivot is the median.
Please note that you need to use the fast-median algorithm (that partitions the data in groups of size 5 and uses the median of median ....)
Thus the running-time is:
T(n) = T(n/2)+ O(n)
--> T(n) = O(n)
Another way of implementation (to get rid of median finding) is to use randomized k-selection. I.e., shuffle the distances first and then follow the regular k-selection.
The EXPECTED running time of the algorithms is O(n)
--
p.s.: using the k-heap should be fast in practice.
C# solution
class Point{
private int x;
private int y;
private double distance;
public int X { get { return x; } set { x = value; } }
public int Y { get { return y; } set { y = value; } }
public double Distance { get { return distance; }}
public Point(int x, int y){
this.x = x;
this.y = y;
this.distance = Math.Sqrt(Math.Pow(x, 2) + Math.Pow(y, 2));
}
public static Point[] KClosest(Point[] points, int k){
if (points.Count() <= k) { return points; }
Point[] closest = new Point[k];
Array.Sort(points, (Point x, Point y) => x.Distance.CompareTo(y.Distance));
Array.Copy(points,closest,k);
return closest;
}
}
my C# solution
class Point{
private int x;
private int y;
private double distance;
public int X { get { return x; } set { x = value; } }
public int Y { get { return y; } set { y = value; } }
public double Distance { get { return distance; }}
public Point(int x, int y){
this.x = x;
this.y = y;
this.distance = Math.Sqrt(Math.Pow(x, 2) + Math.Pow(y, 2));
}
public static Point[] KClosest(Point[] points, int k){
if (points.Count() <= k) { return points; }
Point[] closest = new Point[k];
Array.Sort(points, (Point x, Point y) => x.Distance.CompareTo(y.Distance));
Array.Copy(points,closest,k);
return closest;
}
}
C# solution
class Point{
private int x;
private int y;
private double distance;
public int X { get { return x; } set { x = value; } }
public int Y { get { return y; } set { y = value; } }
public double Distance { get { return distance; }}
public Point(int x, int y){
this.x = x;
this.y = y;
this.distance = Math.Sqrt(Math.Pow(x, 2) + Math.Pow(y, 2));
}
public static Point[] KClosest(Point[] points, int k){
if (points.Count() <= k) { return points; }
Point[] closest = new Point[k];
Array.Sort(points, (Point x, Point y) => x.Distance.CompareTo(y.Distance));
Array.Copy(points,closest,k);
return closest;
}
}
My C# solution
class Point{
private int x;
private int y;
private double distance;
public int X { get { return x; } set { x = value; } }
public int Y { get { return y; } set { y = value; } }
public double Distance { get { return distance; }}
public Point(int x, int y){
this.x = x;
this.y = y;
this.distance = Math.Sqrt(Math.Pow(x, 2) + Math.Pow(y, 2));
}
public static Point[] KClosest(Point[] points, int k){
if (points.Count() <= k) { return points; }
Point[] closest = new Point[k];
Array.Sort(points, (Point x, Point y) => x.Distance.CompareTo(y.Distance));
Array.Copy(points,closest,k);
return closest;
}
}
We can solve it
- azambhrgn May 10, 20171- FInd the distance of each point from center- O(n)
2- Sort the distance array and output k element- O(nlogn)
Second point can be reduced to O(nlogk), by using heap of size k.
Hence total time complexity is O(nlogk) with heap