Google Interview Question
SDE1sCountry: United States
Why cant we use a single master node that provides the counter.(something similar to the master node is a hadoop cluster). Each other server will ping the master and ask for the value of the counter.
We can have a request queue to take case of concurrency issues.
Then one question comes to mind that the master node becomes the single point of failure. we can do the same thing that cloudera did with master node and have multiple nodes/servers be passive master so that when one server/node goes down, some other node can take up the queue from there.
For the distributed counter it is important to ask about the requirements and restrictions of the system. How many counters do we need in total ? Is it going to have more frequent updates or more frequent reads? or both? How important is accuracy? Depending on these answers, there could be multiple solutions.
There are many ways to do so.
1> Cache Coherence technique is highly used in industry scale. Specially at hardware level.
2> Another approach which I could think of is as follows.
Make a Counter Base Class and maintain in one of the Server. Rest of the remaining server may act only as the supplement support , which holds the copy of the counter class object ( May be we can scale using Derived Class of this Base Class ).
Now, the counter update is not a simple counter increment but a queue, Once a counter gets incremented by any of the server, it process that by sending a increment info to the queue which is maintained by base server.
Base Server has to do 2 task,
1> Process the queue operation sequentially and in some routine time manner.
2> Once, the operation is completed, send the updated counter value to each of the corresponding associated servers.
@hprem991: nice answer but with couple of comments
Your 2 point: "Once, the operation is completed, send the updated counter value to each of the corresponding associated servers".
The above operation would take some time to update in all the servers.So you will have a time where one server will show x counter and other server will show y counter.So how to deal with that?
In embedded world we have memory barriers to take care of this i.e. once an instruction is executed the memory barrier api's will make sure that this updated variable is visible to all the cores i.e.
core 1 core 2
x=0
x++ if(x==1)
printf("got it");
in the above code "got it" may or may not be printed but if you place memory
barrier around the "x++" instruction then "got it" will always be printed.What happens is the memory barrier instruction will make the updated x variable value to be visible in all the cores.
So we need a way to achieve memory barrier functionality when you are updating the servers.
google for "memory barrier" if you want more information.
@aka :- Thats excellent approach you explained above at Embedded Level for overcoming Conflict. The Same can be applied as well.
In middleware there is yet another way of doing so is blocking of Critical Section approach, using threads or Semaphore. Over the time in which the section of code updation is going on, here so called Base Server has to ensure that everyone is in sync but meanwhile it can accept that all other incoming increment request and can accumulate in his queue.
What about other tierware solutions hprem99? The synergy of the memory barriers excludes the opportunistic cost of additional lower layer burdens. The mutex/semaphore abstraction isn't concrete enough for a solid foundation of rsync like protocols. We will be eventually consistent, but CAP theorem precludes us from doing any better.
@Anonymous. Thanks for pointing out CAP which is obviously cannot be beaten. Specially the "Partition Tolerance" coz the system like this has to be capable of overcoming fault tolerance if any one of the server goes down or got hacked.
Apart from that, I assume this is middleware approach to tackle this as you mentioned lower layer has even better protection.
The solution your looking for is a "distributed counter". You can search about this topic, because I cannot include links here. A good approach to solve the scalability design problems is that starting with a small scale version of the solution and then extend your solution for a large scale version.
- oOZz May 29, 2013