Google Interview Question
Backend DevelopersCountry: United States
Create a directed weighted graph out of the given list ( a-100->b , b -200->c, e-200->f, ...). Then you can either use BFS or dijkstra and stop after two followup. Keep track of the parent node and the distance.
I would prefer dijkstra (with a good datastructure) over BFS because dijkstra follows the shortest path while BFS that just all possible steps.
Dijkstra does not seem appropriate to me, for two reasons. First, it finds shortest (cheapest) paths from a given source to ALL reachable nodes. That makes it very wasteful, since we only want the cheapest path from x to y. Second, and more importantly, Dijkstra always adds ONE NODE per cycle, but it NEVER cares how many hops are needed. This problem is constrained to a maximum of 3 hops (two transfers).
To find the cheapest path from one given node to another given node, you want something more like a double-ended search. Basically BFS for one hop out from x, and BFS for one hop out from y (along INCOMING arcs), then scan through once more to link them up.
Dijkstra does not seem appropriate to me, for two reasons. First, it finds shortest (cheapest) paths from a given source to ALL reachable nodes. That makes it very wasteful, since we only want the cheapest path from x to y. Second, and more importantly, Dijkstra always adds ONE NODE per cycle, but it NEVER cares how many hops are needed. This problem is constrained to a maximum of 3 hops (two transfers).
To find the cheapest path from one given node to another given node, you want something more like a double-ended search. Basically BFS for one hop out from x, and BFS for one hop out from y (along INCOMING arcs), then scan through once more to link them up.
@Rob. That's correct. Using a bidirectional BFS it's easy to find the subgraph that contains the shortest path. But what if the subgraph still contains millions of links. Yes, there are only 3 hops, but each node can have 1 million weighted edges. It would be 1,000,000 ^ 3 paths to scan.
@Alex This is Rob replying. A few thoughts. In the actual hypothetical context of flights, google tells me there are 17,678 commercial airports in the entire world, so surely no node in this ACTUAL question has even 10^4 edges. But if we generalize to the abstract problem without thinking about actual airports/flights, then the algorithm I wrote was (degree)^2, not (degree)^3. My first step is to pre-process the list of all flights into a dictionary (if not originally given as such). Then, if there are 10^6 flights out of x, and 10^6 flights into y, I check the 10^12 possible "middle links". Each lookup in the dict takes O(1) time, so the overall running time would be O(degree^2). There is certainly room for significant optimizations, depending on what the actual data is like. For example, if you find a direct flight or a two-edge flight in the first two rounds, then you can prune your search significantly when sifting through the three-edge possibilities.
#flights = [(('a','b'), 200) , (('c','d'), 300), (('a','c'), 150) , (('b','d'), 600), (('a','e'), 250), (('e','f'), 3), (('f','d'), 100), (('a','f'), 600)]
#out: find_cheapest_flight('a', 'd', f, out , inc) --> (353, [('a', 'e'), ('e', 'f'), ('f', 'd')])
def process_flights(flights):
flight_dict = dict(flights)
outgoing = dict() # key = a, value = [(b, 200), (c, 100)]
incoming = dict() # key = c, value = [(a, 100)]
for flight in flights:
origin, dest = flight[0]
fare = flight[1]
if origin in outgoing:
outgoing[origin].append((dest, fare))
else:
outgoing[origin] = [(dest, fare)]
if dest in incoming :
incoming [dest ].append((origin, fare))
else:
incoming [dest ] = [(origin, fare)]
return flight_dict, outgoing , incoming
def find_cheapest_flight(x, y, flight_dict, outgoing , incoming):
if x not in outgoing or y not in incoming:
print 'Sorry, no flights service those cities.'
return None
cheapest_route = []
cheapest_fare = 10**10
if (x,y) in flight_dict:
cheapest_route = [(x,y)]
cheapest_fare = flight_dict[(x,y)]
for flight in outgoing[x]:
city1, fare = flight
if (city1, y) in flight_dict:
if fare + flight_dict[(city1, y)] < cheapest_fare :
cheapest_fare = fare + flight_dict[(city1, y)]
cheapest_route = [(x, city1), (city1, y)]
for flight in incoming[y]:
city2, fare3 = flight
if (city1, city2) in flight_dict:
fare2 = flight_dict[(city1, city2)]
if fare + fare2 + fare3 < cheapest_fare :
cheapest_fare =fare + fare2 + fare3
cheapest_route = [(x, city1), (city1, city2), (city2, y)]
return cheapest_fare, cheapest_route
@Rob. That's correct. But to me it looks simpler and more efficient to use Dijkstra, limiting number of hops to 2. To make it correct, when we reach a node, we consider 2 criteria - cost, and number of hops. E.g. if we reach the node with 1 hop and $50 cost, and the node already has a label of 2 hops and $10, the new node also survives, because it contains something better in the criteria.
Alternatively, we can use the bidirectional BST to find the 3-hop subgraph (just marking its nodes as enabled), and in the subgraph, we can use a regular Dijkstra to find the actual cheapest path.
This can be done by simple DFS, extending for only 2 edges.
Worst case complexity would be O(n2)
- sudip.innovates November 27, 2017