Amazon Interview Question for Backend Developers
- 0of 0 votes
AnswerThere is 3D space, limited with a cube, with edge=2000.
- John.Ackerman.jb May 17, 2017 in United States
The center of coordinate system is point (0; 0; 0), so the maximum/minimal coordinate value is 1000/-1000.
There are 10000 points generated with discrete uniform distribution inside of K spheres, located in the cube.
Radius (R) of each sphere is 250.
Centers of spheres are located at the distance of not less than 2*R.
It is required to determine which point related to which sphere.
Input: array of 10000 structures, like:
struct Point {
int number;
int x;
int y;
int z;
}
where number is unique id of the point, x,y,z - it's coordinates.
Output:
array of 10000 structures, like:
struct Point {
int number;
int cluster_id;
}
where cluster_id is unique cluster id of a sphere that point belongs to.
Initially I considered a following solution:
1) Find Euclidian distance for each point from center of coordinates (0;0;0) to point's coordinates.
2) Sort this array of distances in descending order.
3) Get the point from the sorted array of distances and put in a new Set of Cluster Maximals.
4) Compare following point from the array to each value from the Set of Cluster Maximals (initially 1 value).
If it's Euclidian distance less than or equal to 2*R, then
mark this point as belonging to Kth cluster (range=1..N), otherwise add the point to the Set of Cluster Maximals.
5) Repeat step 4.
Two concerns I have:
1) There is an issue that my algorithm would work only in case if Claster Maximals are located on the surface of the spheres.
2) Plus, according to the task requirements, there could be the case when 2 spheres can have 1 and only common point.
I think in case if point belongs to 2 spheres, it is ok to mark it with cluster_id of any of these 2 shperes.
Could you please provide a proper solution to the task?| Report Duplicate | Flag | PURGE
Amazon Backend Developer
Interview Type: In-Person
that's a tricky one, it looks like one of the cluster algorithms, such as k-mean should be applied, but if your study didn't cover that, I would assume you are lost in some way.
- Chris May 18, 20171. Start by picking a first arbitrary point from the list of points not assigned to a cluster.
2. Then search within R-distance of that point at least one second point. If found, estimate a sphere center by averaging all those points coordinates. Then around this center, add all points within R distance of the center. If not found, extend the search to 2R. If not found, put that single point into a single sphere and do not further consider that point. If found, average the two points, try to find more points that potentially go into this sphere...
3. Repeat until no points are left which gives an initial guess of spheres.
(initial center selection could be done by observing point density and guess centers from a group of dense points, e.g. by separating the space into R/2 cubes in order to get a space-based index etc.
or another approach would be to first separate the space into this cubes and start searching from outside in ...)
4. At this stage a number of spheres will probably overlap and points can be assigned to multiple spheres.
5. use the points in conflict and assign it to the sphere with more points in it
6. clean up spheres that are empty, recompute centers, repeat 4 until no changes happen *
* it's not clear it will converge, so maybe a upper limit of passes and/or error limit must be given