Facebook Interview Question
Software Engineer / Developers
Country: United States
Interview Type: In-Person
There are two broad ways to resolve conflicts in distributed systems: manual and automatic.
The manual way: preserve both conflicting versions and let the user decide their fate. You may have seen this in Google Drive. If you mount Google Drive on two different machines (PCs, Macs, Chromebooks) and edit the same file on both at the same time, Google Drive saves a second copy of the file with “-2” (or something of the sort) appended to the filename, so you end up with secretFile and secretFile-2. Google Drive mounted on multiple machines is a multi-master system.
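To make the "keep both copies" idea concrete, here is a minimal sketch in Python. It is not how Google Drive actually works; the `save` function, the per-file version counter, and the `based_on` parameter are all assumptions made up for illustration. A save that was based on a stale version is treated as a conflict and stored under a suffixed name instead of overwriting.

```python
def save(files, versions, name, contents, based_on):
    """Save an edit; on conflict, keep both copies (hypothetical helper).

    files:    dict of file name -> contents
    versions: dict of file name -> integer version counter
    based_on: the version the editor had read before editing
    Returns the name the contents were actually stored under.
    """
    current = versions.get(name, 0)
    if based_on == current:
        # No concurrent edit happened: overwrite in place.
        files[name] = contents
        versions[name] = current + 1
        return name
    # Stale base version: someone else saved first. Preserve both copies
    # by storing this edit under the next free "-N" suffix.
    n = 2
    while f"{name}-{n}" in files:
        n += 1
    copy_name = f"{name}-{n}"
    files[copy_name] = contents
    versions[copy_name] = 1
    return copy_name


files, versions = {}, {}
save(files, versions, "secretFile", b"original", based_on=0)
# Two machines both read version 1, then both edit and save.
name_a = save(files, versions, "secretFile", b"edit from A", based_on=1)
name_b = save(files, versions, "secretFile", b"edit from B", based_on=1)
```

The user is then left with both files and decides which one to keep, which is the manual part.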
Another way is to use the semantics of the data you are storing. This is well researched in the filesystem context (from what I know). For example, if delete(file2) and write(file2, buffer) conflict, their ordering does not matter. If the delete is applied first, the write is discarded because the file no longer exists: the end state is that file2 is deleted. If the write is applied first, the delete then removes the file: the end state is again that file2 is deleted. In both cases the end state is the same, so the ordering does not matter.
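The delete/write argument can be checked with a small sketch. The in-memory dict standing in for a filesystem and the `apply_op` and `end_state` helpers are assumptions for illustration; the rule that a write to a missing file is discarded is taken from the description above.

```python
def apply_op(files, op):
    """Apply one operation to a dict of file name -> contents."""
    kind, name, *rest = op
    if kind == "delete":
        files.pop(name, None)
    elif kind == "write":
        # As described above: a write to a file that no longer exists
        # is discarded rather than recreating the file.
        if name in files:
            files[name] = rest[0]


def end_state(initial, ops):
    """Return the state reached by applying ops in order to a copy of initial."""
    files = dict(initial)
    for op in ops:
        apply_op(files, op)
    return files


initial = {"file2": b"old"}
delete_op = ("delete", "file2")
write_op = ("write", "file2", b"new")

delete_first = end_state(initial, [delete_op, write_op])
write_first = end_state(initial, [write_op, delete_op])
```

Both orders end with file2 gone, so the system can pick either order without asking anyone.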
Another way is to let the application resolve the conflict. The application provides a conflictResolver handler that the system calls when a conflict is discovered. The conflictResolver could be registered per operation or per application.
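A rough sketch of what such a handler could look like follows. The `Store` class, its `merge` method, and the resolver signature are all hypothetical, and conflict detection is deliberately simplified: any incoming value that differs from the local one counts as a conflict. A real system would more likely compare versions or vector clocks.

```python
from typing import Callable, Dict, Optional

# A resolver receives the key and both conflicting values and returns the winner.
Resolver = Callable[[str, bytes, bytes], bytes]


class Store:
    def __init__(self, default_resolver: Resolver):
        # Per-application resolver, used when no per-operation one is given.
        self.default_resolver = default_resolver
        self.data: Dict[str, bytes] = {}

    def merge(self, key: str, remote: bytes,
              resolver: Optional[Resolver] = None) -> bytes:
        """Merge a value arriving from another replica into this store."""
        local = self.data.get(key)
        if local is None or local == remote:
            self.data[key] = remote
        else:
            # Conflict (simplified detection): call the per-operation
            # resolver if one was passed, else the per-application default.
            chosen = (resolver or self.default_resolver)(key, local, remote)
            self.data[key] = chosen
        return self.data[key]


def keep_longer(key: str, local: bytes, remote: bytes) -> bytes:
    """Example application-level policy: keep whichever value is longer."""
    return local if len(local) >= len(remote) else remote


store = Store(default_resolver=keep_longer)
first = store.merge("k", b"ab")
second = store.merge("k", b"abcd")  # resolved by keep_longer
third = store.merge("k", b"x", resolver=lambda key, l, r: l + r)  # per-operation
```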
If it's not a peer-to-peer system, I can think of two ways. One is that each master has a sequence number, and when a conflict occurs, the live master with the lowest sequence number takes over. The lowest-sequenced master can be elected by polling.
The other is that the masters elect a leader using something like Paxos. Yay! I think this answer is not bad.
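The first option above (lowest sequence number among live masters) could be sketched like this. The `pick_master` function and the shape of the polling result are assumptions for illustration; how liveness is actually polled is left out.

```python
def pick_master(poll_results):
    """Return the sequence number of the master that should take over.

    poll_results: dict of master sequence number -> True if the master
    answered the poll (is live), False otherwise. Hypothetical shape.
    """
    live = [seq for seq, alive in poll_results.items() if alive]
    if not live:
        raise RuntimeError("no live master answered the poll")
    # The live master with the lowest sequence number wins.
    return min(live)


# Master 1 is down, so master 2 takes over even though 1 has a lower number.
winner = pick_master({3: True, 1: False, 2: True})
```

Note that this only works if every node sees the same poll results; when they might not, something like the Paxos option is the safer choice.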
There are two broad ways to resolve conflicts in a distributed system: manual and automatic.
- NoName October 19, 2015
The manual way: preserve both conflicting copies and let the user decide what to do with them (you may have seen this in Google Drive). If you mount Google Drive on two different PCs (or Macs) and edit the same file at the same time from both machines, Google Drive creates a second file with "-2" (or something of the sort) added to the end of the filename. It is up to the user to decide what to do with it. (Remember that Google Drive mounted on multiple machines is a multi-master system.)
The other way is to know the semantics of the data being stored and resolve the conflict using that knowledge. This is well researched in filesystems (at least from what I know). For example, if a delete(file2) and a write(file2) conflict, their order does not matter. If the delete goes first, the write fails: the end state is that file2 is deleted. If the write goes first, the delete then removes the file: the end state is that file2 is deleted. This is the automatic way.
One more way is to let the application provide a conflictResolution handler, which is called when a conflict is detected. This is the application-specific approach. The conflictResolution handler can be provided per operation or per application. This is also an automatic way.