Facebook Interview Question
Software Engineer / Developer
Country: United States
Interview Type: In-Person
because each request may reach a different server (due to load balancing across a big web tier ^_^).
Each server logs requests with a timestamp. Suppose a request reaches one server and we want to know its global number: we can simply count the requests with earlier timestamps across all servers, and that count gives its position in the overall sequence.
Let's say that there are N search servers, and let's assume that during any given day that N is constant. No servers fail, and no servers are added.
Let's assume that all servers are identical in performance, and that requests are load-balanced evenly across the servers.
At midnight we send a message to each of the servers. The message includes the server's allocated server number S, which is in the range 0 to N-1. Each server begins counting search requests from S*(1b/N) to (S+1)*(1b/N).
If a server's counter reaches 1b, then we have a winner. The search results returned from that search server would include some flag that indicates this, which would be used by the front end page rendering servers to mark up the page with the special winner banner.
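A rough sketch of this static-partition scheme (class and parameter names are illustrative, and the total is parameterized so the behavior is easy to check at small scale):

```python
# Sketch of the static-partition approach: each of n servers counts
# queries within its own allocated slice of [0, total), starting at
# s*(total/n). Assumes total is divisible by n for exact slices.

class SearchServer:
    def __init__(self, s: int, n: int, total: int = 1_000_000_000):
        # Server s starts its counter at s*(total//n); only the server
        # owning the last slice can reach the full total.
        self.counter = s * (total // n)
        self.limit = total

    def handle_query(self) -> bool:
        """Count one query; return True if the counter hits the target."""
        self.counter += 1
        return self.counter == self.limit
```

With even load balancing all counters advance in rough lockstep, so server N-1 hits the target at about the moment the fleet collectively serves its billionth query, without any cross-server traffic.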
Although highly performant, this solution is not very resilient to failures. But working through this approach is useful, as it forms the basis of a better solution.
Let's introduce a new server, of which there is only one, called the counter server. The counter server has a request/response interface that the search servers can use to request a range of numbers. The range allocated could be between 1 and 1b/N... but we'd pick a number greater than 1 to reduce the contention for the counter server, and less than 1b/N so that we were more resilient to failures. Perhaps we'd measure the QPS of the search server and then multiply that by 10s, 30s, or 60s... Let's call the range extent R.
Now if a search server fails or a search server is added, the change to N has little effect. We'll have to accept that this weakening of consistency has introduced some fuzziness: if a server fails having processed only R/2 of its allocated requests, the counter server still thinks that all R were counted. So the billionth query is now really the (1b - R/2)th query... which may, or may not, be acceptable...
But having a single counter server introduces a single point of failure, so we would actually want M counter servers, and we would want them to co-ordinate with each other to ensure that the failure of any individual server has no effect. Co-ordination is expensive in terms of latency: to reach agreement a server has to talk with M-1 servers, twice. But by tuning R to take account of M, we can build a system that is performant, reliable... and quite accurate.
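The range-allocation idea above can be sketched as follows (a minimal single-process model; names are illustrative and thread-safety, persistence, and the M-server replication are all elided):

```python
# Sketch of the counter server handing out disjoint ranges of extent R
# to search servers, which then number their own queries locally and
# only go back to the counter server once per R queries.

class CounterServer:
    """Hands out disjoint, increasing ranges of R sequence numbers."""
    def __init__(self, r: int):
        self.next_start = 0
        self.r = r

    def allocate(self) -> range:
        start = self.next_start
        self.next_start += self.r
        return range(start, start + self.r)

class SearchServer:
    def __init__(self, counter: CounterServer):
        self.counter = counter
        self.local = iter(())  # exhausted, forces an allocation first

    def handle_query(self) -> int:
        """Assign the next sequence number, fetching a new range when
        the local one runs out."""
        try:
            return next(self.local)
        except StopIteration:
            self.local = iter(self.counter.allocate())
            return next(self.local)
```

If a search server dies mid-range, the unused remainder of its range is simply lost, which is exactly the R/2 fuzziness discussed below.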
I would go for holding a counter in Memcached. It's distributed and scales very well. The only thing to consider is incrementing it properly:
<?php
$cache = new Memcache();
$cache->addServer('10.0.0.1', 11211);
$cache->addServer('10.0.0.2', 11211);
$cache->addServer('10.0.0.3', 11211);

$key = 'search_query_counter';

// Try to increment first: this is the hot path for every query after the first.
if (false !== ($counter = $cache->increment($key, 1))) {
    if (pow(10, 9) === $counter) {
        // show banner
    }
} else {
    // The counter does not exist yet (this is our first search query).
    if (false === $cache->add($key, 1, 0, 0)) {
        // On high-traffic sites another process/server may have created
        // the counter in the meantime, so fall back to incrementing.
        $cache->increment($key, 1);
    }
}
Have 2-3 count servers in each datacenter dedicated to counting. Whenever a request comes to a particular datacenter, the user info along with the search key is sent to one of these count servers in that datacenter. The count servers across all datacenters communicate with each other through a distributed protocol, using Lamport logical clocks to decide on the ordering of the searches. Each search key thus gets a sequence number; when the sequence number hits 1 billion, we know that user is the winner.

If multiple servers propose a sequence number at the same time, we can break ties with some agreed-upon "goodness" value. For example: if there are 10 servers and 3 of them send a sequence number of 10 to all the others, the remaining 7 will send 11, 12, 13 to those 3 servers based on which message arrived first. If all 3 servers received 13 from different servers, it is possible that all 3 will propose 13; we then order them by the goodness value agreed upon by all servers, so everyone sees the messages ordered with the first server at position 11, the second at 12, and the third at 13. This way we know exactly, in real time, when the 1 billionth message arrives, and the count server then notifies the server that received that search request so it can display the banner.
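The Lamport clock referenced here is simple to state in code. A minimal sketch (the `server_id` tiebreaker plays the role of the "goodness value" above; message transport is elided):

```python
# Minimal Lamport logical clock: increment on every local event, and on
# receipt take max(local, sender) + 1. Ordering events by the tuple
# (timestamp, server_id) yields a total order without physical clocks.

class LamportClock:
    def __init__(self, server_id: int):
        self.time = 0
        self.server_id = server_id

    def local_event(self) -> tuple:
        self.time += 1
        return (self.time, self.server_id)

    def receive(self, sender_time: int) -> tuple:
        # Jump ahead of the sender's clock, then tick.
        self.time = max(self.time, sender_time) + 1
        return (self.time, self.server_id)
```

Since every send/receive pair advances the receiver past the sender, causally related searches always order correctly, and concurrent ones are ordered deterministically by the tiebreaker.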
Here's a solution.
First of all, I'd explain to the interviewer that a precise solution is impossible (due to the impossibility of absolutely precise clock synchronization in a distributed system), so any solution will be an approximation anyway.
As a second step, I'd explain that aiming for high precision in finding the exact winner will unavoidably introduce an unwanted serialization point and kill the performance of the system (essentially it will kill the scale), so we need a practical, pragmatic solution that doesn't compromise on performance but relaxes the accuracy constraint. At the same time we have to ensure that the banner is shown to exactly one user. Or, more precisely, to at MOST one user (since the reply carrying the banner can be lost).
This can be achieved by keeping a local counter of requests on each server. The servers periodically send their counters to an Aggregation System, which is responsible for aggregation and global counting. They don't send updates very frequently; the interval is tuned so updates don't create a lot of network traffic. In response to each update the last known global counter is returned, so each server has an approximate view of the global state. Each server can then use a trivial statistical computation to estimate the growth rates of its local counter and of its view of the global counter, so it can make its own decision about which of ITS clients is the billionth user. A server makes such a decision for one and only one request. Only for that request do we delay processing and perform a real-time message exchange with another system, call it the Coordination System, to confirm whether the server may display the banner.
The Coordination System has to be globally central and fault-tolerant. It has to implement a Paxos-based lock-step state machine or a similar protocol to guarantee that there will be no more than 1 winner even in the presence of failures among the coordination servers. If we don't want such complexity, I guess FB has enough money to pay for N cars [for any N :)], so the system can be simplified and we can give probabilistic guarantees.
The Aggregation System should also be designed to be fault-tolerant, but the constraints here are relaxed and simple eventual consistency is enough. It makes sense for the Aggregation System to be hierarchical: one child branch of the system per FB datacenter, so we can localize the update traffic.
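The estimation step described above can be sketched as follows (a toy single-process model; the extrapolation rule, names, and numbers are assumptions for illustration, and the Coordination System round-trip is reduced to a local flag):

```python
# Sketch of a search server estimating, from a periodic global snapshot
# plus its own local count, whether the current request is the billionth.
# It claims at most one request, which would then be confirmed with the
# Coordination System before any banner is shown.

TARGET = 1_000_000_000

class SearchServer:
    def __init__(self, n_servers: int):
        self.n_servers = n_servers
        self.global_at_snapshot = 0   # last value from the Aggregation System
        self.local_since_snapshot = 0
        self.already_claimed = False

    def on_snapshot(self, global_count: int):
        """Periodic update from the Aggregation System."""
        self.global_at_snapshot = global_count
        self.local_since_snapshot = 0

    def handle_query(self) -> bool:
        self.local_since_snapshot += 1
        # Crude extrapolation: assume every server grew at the same rate
        # as this one since the last snapshot.
        estimate = (self.global_at_snapshot
                    + self.local_since_snapshot * self.n_servers)
        if not self.already_claimed and estimate >= TARGET:
            self.already_claimed = True  # claim one request, at most once
            return True  # would now be confirmed via the Coordination System
        return False
```

Because each server claims at most once and the Coordination System arbitrates among claimants, the "at MOST one winner" guarantee holds even though each server's estimate is fuzzy.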
search QPS = 1b DAU x 2 searches per day / 86,400 s ≈ 23k/s, call it 20k/s
peak is maybe 40k/s
One counter server can keep the sum in memory.
All web servers inform the counter server whenever a search request is received.
A second counter server does the same thing to prevent a single point of failure.
The slave reports to the master if it hits 1b.
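A toy sketch of this master/slave counter pair (the target is parameterized so the logic is checkable at small scale; failover and network transport are omitted, and names are illustrative):

```python
# Sketch of the two-counter design: every web server reports each search
# to both counters; the master decides the winner, the slave stays in
# sync so it can take over (and cross-check) if the master fails.

class Counter:
    def __init__(self, target: int = 1_000_000_000):
        self.count = 0
        self.target = target

    def record(self) -> bool:
        self.count += 1
        return self.count == self.target

class WebServer:
    def __init__(self, master: Counter, slave: Counter):
        self.master, self.slave = master, slave

    def handle_search(self) -> bool:
        hit = self.master.record()
        self.slave.record()  # kept in lockstep for failover
        return hit
```

At ~40k/s peak a single in-memory increment is trivially cheap; the real cost of this design is one extra RPC per search from every web server.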
I would use a separate database just for this purpose. Via a JavaScript include, an AJAX request is sent to the server for every search made. I assume the database server handles the concurrency of all incoming requests. Once the count reaches 1b, the request sent for that search returns a flag and the AJAX callback creates the banner.
The tricky part here is identifying the real winner. For example, the person who made the 1b-th search might not be the rightful winner because of latency issues. You cannot do much about this, as there is no way to verify when the request was actually made; it can even be forged from the client.
Guys,
It's impossible to rely on a single centralised counter/machine given the scale of facebook.
The easiest thing to do is perform the operation offline. At runtime, the system should log each search request along with a timestamp and the id of the user who performed the query. This will lead to many different log files across the different servers that handle search. After some time, one can process these files using MapReduce/Hadoop and get the billionth user.
Then good luck calling the lucky b*st*rd. ;-)
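The offline pass boils down to a k-way merge of per-server logs by timestamp (not real Hadoop code; the record layout `(timestamp, user_id)` and function name are assumptions for illustration):

```python
# Offline sketch: merge per-server search logs, already sorted by
# timestamp as append-only logs naturally are, and pick the n-th record.

import heapq

def billionth_user(log_files, n=1_000_000_000):
    """log_files: iterables of (timestamp, user_id) tuples, each sorted
    by timestamp. Returns the user who issued the n-th query overall,
    or None if there are fewer than n records."""
    merged = heapq.merge(*log_files)  # lazy k-way merge by timestamp
    for i, (_, user_id) in enumerate(merged, start=1):
        if i == n:
            return user_id
    return None
```

In a real MapReduce job the sort would be done by the shuffle phase, but the logic is the same; the caveat from other answers still applies, since cross-server timestamps are only approximately comparable.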
We could use Lamport's algorithm to maintain total order in distributed systems without using physical clocks.
A single counter will cause contention with millions of users trying to access the shared counter (if it is properly synchronized). I'd instead create n containers, each allocated 1 billion / n requests, all storing requests in parallel. Each stored request carries its timestamp. Once all containers are filled up, if they are kept sorted by timestamp, just take the maximum of each container; the latest of those n maximums is the billionth request, and done.
You are right that a single counter can't be used here. The problem with storing timestamps is memory consumption: a 27-bit tick value (one value per millisecond of the day) for 1 billion entries alone leads to ~3.4 GB of memory. Another problem is keeping the values sorted by timestamp: insertions are expensive in data structures that provide this capability. If we don't maintain the sort order "upfront" on each insert, then finding the 1b-th search query becomes "expensive", since 1b elements need to be sorted. Secondly, how would we know that this condition has been reached?
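The memory figure above is quick to verify (assuming tightly packed 27-bit ticks with no per-entry overhead):

```python
# Check of the memory estimate: 27-bit millisecond ticks suffice for one
# day (2^27 ≈ 134M > 86,400,000 ms), and a billion of them, bit-packed,
# come to about 3.4 GB before any data-structure overhead.

BITS_PER_TICK = 27
ENTRIES = 1_000_000_000

total_bytes = BITS_PER_TICK * ENTRIES / 8
total_gb = total_bytes / 1e9   # ≈ 3.375 GB
```

Any real container (sorted list, heap, B-tree) would add pointers and padding on top of this, so the true footprint would be considerably larger.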
One question I would clarify with the interviewer is how reliable the system needs to be. For example, what happens if a server goes down? Do we still need to be able to tell the "real" 1b-th user? Or do we restart? Or is "estimation" okay in these cases?
The more pertinent issue is perhaps how you get a user to believe a banner that says "You are the 1 billionth visitor to this page. Click here to claim your free car. "
- Anonymous September 23, 2012