Facebook Interview Question
Software Developers
Country: United States
Interview Type: Phone Interview
I guess the first question I would ask is: what is the scale of this system? A local RPC server using ALPC on a regular desktop-class system can handle on the order of tens of thousands of concurrent synchronous RPC calls. If that is not sufficient, then the system needs to be distributed, and the next question becomes how it would be load balanced.
But the basic idea is to have a work dispatcher. If you use, say, RPC, it already creates the worker threads for you; if it is something like a REST call, then the HTTPD server is already creating threads to handle each HTTP client GET request.
The work on each thread can be simple: make the black box call and return if it succeeds; if not, sleep 2 seconds, try once more, and return regardless of success or failure.
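A minimal sketch of that per-thread logic, assuming a `call_black_box` callable that returns True on success (the function name and the injected callable are stand-ins, not from the original):

```python
import time

RETRY_DELAY_SECS = 2  # the 2-second wait from the description

def process_job(job, call_black_box):
    """Call the black box once; on failure, sleep 2 seconds and try
    once more. Returns the result of the final attempt either way."""
    if call_black_box(job):
        return True
    time.sleep(RETRY_DELAY_SECS)
    return call_black_box(job)
```

Injecting the black box call as a parameter also keeps the retry logic easy to test in isolation.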
If it is an async IPC mechanism, you still create a worker thread to process the request and could do the same as above. But if you are worried about wasting threads, the better approach is to use something like a thread-pool timer for the 2-second retry instead of sleeping.
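One way to sketch the timer-based variant is with Python's `threading.Timer`, which stands in here for a thread-pool timer: the worker thread returns immediately after a failed first attempt, and the single retry runs later on the timer's thread (the `on_done` callback and function names are assumptions for illustration):

```python
import threading

RETRY_DELAY_SECS = 2  # the 2-second wait from the description

def handle_request(job, call_black_box, on_done):
    """Make the first attempt inline; on failure, schedule one retry
    on a timer instead of sleeping, freeing the worker thread."""
    if call_black_box(job):
        on_done(job, True)
        return
    def retry():
        on_done(job, call_black_box(job))
    threading.Timer(RETRY_DELAY_SECS, retry).start()
```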
Note: Since this answer will be long, I have created a google doc instead of writing everything in the text box here.
- Saurabh, July 03, 2018
docs.google.com/document/d/190Ik3yauub4spoSFRldclBuwjyXuk7RLyxf1GFpPk_U/edit?usp=sharing
I will approach this problem using the following steps:
Step 1: Requirements and assumptions
Step 2: Interface / API design
Step 3: Back of the envelope estimation (if any)
Step 4: Design data model (classes etc)
Step 5: High-level architecture
Step 6: Detailed architecture
Step 7: Performance improvements (remove bottlenecks)
The problem description gives a good indication of what the design can look like:
1. Job_arrived() --> This can be an exposed web service / method that accepts a job.
2. Job_run() --> This can be a separate job processor that reads a job (from somewhere) and starts processing; in this case, it calls the external black box service.
3. External black box service --> The application needs a way to talk to external interfaces over REST/SOAP and through load balancers.
4. The system is to be large scale --> The system needs to be distributed in nature.
5. More than one job can be accepted each second --> The system should be scalable, so preferably the job receivers are separate from the job processors so that each can be scaled independently.
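The two entry points from the list above can be sketched as a minimal interface; the `Job` dataclass, the module-level queue, and the parameter names are assumptions for illustration:

```python
import queue
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    priority: int                               # lower value = served first
    payload: str = field(compare=False)
    failure_count: int = field(default=0, compare=False)

job_queue = queue.PriorityQueue()  # min priority queue, as suggested below

def job_arrived(priority, payload):
    """Receiver side: accept job information and enqueue it."""
    job_queue.put(Job(priority, payload))

def job_run(call_black_box):
    """Processor side: take the highest-priority job and call the
    external black box service with it."""
    job = job_queue.get()
    return call_black_box(job)
```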
So the overall design could be as follows (more details in the google doc):
==============
The application exposes a job-receiver web service. This service accepts the job information and puts the job on a queue (preferably a min priority queue). Job processors watch the queue; as soon as a job appears, they pick it up from the head of the queue and call the external black box system.
If the response comes back as a failure, they wait 2 seconds and put the job back on the priority queue. Each job maintains a failure count so that if it fails a second time, the processors do not re-enqueue it.
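The processor-side retry bookkeeping above can be sketched as follows; a plain FIFO queue stands in for the priority queue, jobs are plain dicts, and the names are assumptions for illustration:

```python
import queue
import time

MAX_ATTEMPTS = 2       # one initial attempt plus one retry
RETRY_DELAY_SECS = 2   # the 2-second wait from the description

def process_one(job_queue, call_black_box):
    """Pull one job (a dict carrying 'failure_count'), call the black
    box, and on failure re-enqueue it once after the delay."""
    job = job_queue.get()
    if call_black_box(job):
        return True
    job["failure_count"] = job.get("failure_count", 0) + 1
    if job["failure_count"] < MAX_ATTEMPTS:
        time.sleep(RETRY_DELAY_SECS)
        job_queue.put(job)
    return False
```

In a real deployment the delayed re-enqueue would be scheduled on a timer rather than blocking the processor, as discussed in the first answer.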
The reason for separating job receivers and processors is scalability: based on the load on the system (that is, the job queue size), the number of processors can be scaled up or down.
Please refer to the google doc for the rest of the design (link at the top).