Uber Interview Question for Software Engineer / Developers

Country: United States
Interview Type: In-Person

1) Offline algorithm: check each file to see whether it has changed since the last sync, and keep the sync state on the master.
Easy and robust, but slow, with long replication delays.
2) Online, non-transactional: monitor filesystem changes and replicate ASAP. ASAP is the challenge if the slave is not reachable, so add a persistent queue on the master or on a reliable third system. Needs an initial setup and a full resync to initialize, and depends on the transaction safety of the queue.
3) Online, transactional: hook into the filesystem and only accept a master write if the slave write was accepted (two-phase commit). This is transactional, but it introduces a runtime dependency from master to slave, which does not scale.
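
A minimal sketch of option 1 in Python, assuming the sync state is simply a dict of relative path to SHA-256 digest kept on the master side. The names file_digest, snapshot and sync are made up for illustration, and the sketch ignores symlinks, permissions and empty directories:

```python
import hashlib
import os
import shutil


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each file's path (relative to root) to its content digest."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            state[os.path.relpath(full, root)] = file_digest(full)
    return state


def sync(master, slave, last_state):
    """Copy new or changed files to the slave, delete removed ones.

    last_state is the snapshot returned by the previous call ({} on the
    first run). Returns the new snapshot to store as the sync state.
    """
    current = snapshot(master)
    for rel, digest in current.items():
        if last_state.get(rel) != digest:
            dest = os.path.join(slave, rel)
            os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
            shutil.copy2(os.path.join(master, rel), dest)
    # Files present at the last sync but gone now were deleted on the master.
    for rel in last_state.keys() - current.keys():
        dest = os.path.join(slave, rel)
        if os.path.exists(dest):
            os.remove(dest)
    return current
```

Every run rehashes the whole tree, which is exactly the "slow" part of this option. Comparing size and mtime first would be cheaper but less robust.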

Alternative thoughts: issue transactions on both sides, use a quorum and vector clocks, then optimize by trading off write speed, read speed and reliability requirements. Hard to do generically (for a whole filesystem), as you might end up with full ACID requirements, so look at the specific use case if possible.
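
To make the vector clock idea concrete, here is a small sketch, assuming a clock is a plain dict of node name to counter. The helpers increment, merge and compare are hypothetical names, not from any library:

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum of two clocks (used after reconciling)."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Return 'equal', 'before', 'after', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"   # a happened before b
    if b_le_a:
        return "after"    # b happened before a
    return "concurrent"   # neither dominates: a conflict to resolve
```

If each file version carries such a clock, "before" and "after" mean one side can simply overwrite the other, while "concurrent" means both sides wrote independently and some conflict policy is needed.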

- Chris October 09, 2017
ChrisK, nice analysis. However, this was more of a coding question than a design question. I guess the emphasis was on picking the right data structure (Tree / LinkedHashMap) to represent the filesystem, so that change detection and copying can be done in an efficient way.

- Player October 09, 2017
How about using a Merkle Tree? It's used to detect changes in filesystems
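
A rough sketch of how that could look in Python, assuming each node is a (digest, children) tuple where children is a dict for directories and None for files. The names merkle and diff are made up for illustration, and files are read fully into memory to keep it short:

```python
import hashlib
import os


def merkle(path):
    """Return (digest, children) for a file or directory."""
    if os.path.isdir(path):
        children = {
            name: merkle(os.path.join(path, name))
            for name in sorted(os.listdir(path))
        }
        # A directory's hash covers the names and hashes of its entries.
        h = hashlib.sha256(b"dir")
        for name, (digest, _) in children.items():
            h.update(name.encode() + b"\0" + digest.encode())
        return h.hexdigest(), children
    with open(path, "rb") as f:
        return hashlib.sha256(b"file" + f.read()).hexdigest(), None


def diff(a, b, prefix=""):
    """Yield relative paths that differ between two Merkle nodes."""
    if a is not None and b is not None and a[0] == b[0]:
        return  # identical hashes: the whole subtree can be skipped
    a_kids = a[1] if a else None
    b_kids = b[1] if b else None
    if a_kids is None or b_kids is None:
        # A file changed, or an entry exists on one side only.
        yield prefix or "."
        return
    for name in sorted(a_kids.keys() | b_kids.keys()):
        yield from diff(a_kids.get(name), b_kids.get(name),
                        os.path.join(prefix, name))
```

Usage would be something like list(diff(merkle(master_dir), merkle(slave_dir))). The point is that unchanged subtrees are skipped after a single hash comparison, so the comparison cost scales with the number of changes rather than the size of the tree.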

- DoesItMatter June 18, 2018

