Amazon Interview Question for Principal Software Engineers


Country: United States
Interview Type: Phone Interview




1
of 1 vote

1) Use the rsync protocol for file synchronization, both between clients and between client and server.
2) Store files as blocks of equal size rather than as whole files. Have a file block server that keeps track of the blocks and the SHA-256 hash of each stored block.
3) Use Amazon S3 to store the file blocks.
4) Identify each client by its unique root namespace.
5) Store client metadata in a DB (MySQL) table. Each row represents a particular file. Its attributes should be the list of file blocks, the root namespace of the client who owns the file, and the file's path relative to that namespace, which locates the file in the client's Dropbox installation folder. Store other user metadata (settings, account configuration, access level, etc.) in a separate table in the same DB.
6) Have a metadata server to fetch results from the DB.
7) Run multiple instances of the metadata server and the file block server to handle a large number of requests. Amazon S3 will handle the file blocks themselves.
8) Use memcache and a load balancer in front of the metadata servers for efficiency.
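
As a rough illustration of the metadata table in point 5, here is a minimal sketch. It uses an in-memory SQLite database purely as a stand-in for MySQL, and the table name, column names, and helper functions are all hypothetical, not taken from any real system.

```python
import json
import sqlite3


def open_metadata_db():
    """Create an in-memory stand-in for the metadata DB (the answer uses MySQL)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE files (
            root_namespace TEXT NOT NULL,  -- identifies the owning client
            relative_path  TEXT NOT NULL,  -- path inside the client's sync folder
            block_hashes   TEXT NOT NULL,  -- JSON list of block hashes, in file order
            PRIMARY KEY (root_namespace, relative_path)
        )
        """
    )
    return db


def put_file(db, root_namespace, relative_path, block_hashes):
    """Insert or replace the row describing one file."""
    db.execute(
        "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
        (root_namespace, relative_path, json.dumps(block_hashes)),
    )


def get_block_hashes(db, root_namespace, relative_path):
    """Return the ordered block hash list for a file, or None if it is unknown."""
    row = db.execute(
        "SELECT block_hashes FROM files"
        " WHERE root_namespace = ? AND relative_path = ?",
        (root_namespace, relative_path),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Storing the hash list as JSON in one column keeps the sketch short; a real schema might use a separate file-to-block table instead.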

To upload a file, the client splits it into blocks of equal size (4 MB) and sends the metadata server the hashes of those blocks. The block server checks which blocks are not already in Amazon S3 and reports them to the metadata server, which tells the client to send only those blocks to the block server. Once the block server has stored them in Amazon S3, the metadata server adds an entry for the file to the MySQL DB, which represents an update in the client's namespace.
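
The client side of that upload could look roughly like the sketch below. The function names are hypothetical, and `stored_hashes` is a plain set standing in for whatever the block server knows is already in S3.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the block size named in the answer


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut the file contents into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hash(block):
    """SHA-256 hex digest used as the block's identity."""
    return hashlib.sha256(block).hexdigest()


def plan_upload(data, stored_hashes, block_size=BLOCK_SIZE):
    """Return (ordered hash list, {hash: block} for blocks the store lacks)."""
    hashes = []
    missing = {}
    for block in split_into_blocks(data, block_size):
        h = block_hash(block)
        hashes.append(h)
        if h not in stored_hashes:
            missing[h] = block  # identical blocks collapse into one upload
    return hashes, missing
```

The ordered hash list is what goes into the metadata row; only the blocks in `missing` need to travel over the network.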

To sync the file to a user's other clients, the metadata server informs those clients about the update in the client namespace. The clients ask the metadata server about the newly added file, and it sends back the list of hashes of the new file's blocks. The clients then give that list to the block server, retrieve the corresponding blocks, and combine them to regenerate the file.
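
The download side can be sketched the same way. Here `fetch_block` is a hypothetical callable standing in for a request to the block server; verifying each block against its hash is an added safety check, not something the answer above spells out.

```python
import hashlib


def assemble_file(block_hashes, fetch_block):
    """Rebuild file contents from an ordered list of SHA-256 block hashes.

    fetch_block(hash) is assumed to return the bytes of the block with that hash.
    """
    parts = []
    for expected in block_hashes:
        block = fetch_block(expected)
        # Reject a block whose contents do not match the hash that names it.
        if hashlib.sha256(block).hexdigest() != expected:
            raise ValueError("block does not match hash " + expected)
        parts.append(block)
    return b"".join(parts)
```

With an in-memory dict as the block store, `assemble_file(hashes, store.__getitem__)` returns the original bytes.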

AND THIS IS EXACTLY HOW DROPBOX WORKS.

- Cerberuz September 03, 2014 | Flag Reply
0
of 0 votes

Aren't you giving out proprietary information, if that last line is indeed the case?

- Bhaavan May 23, 2015 | Flag
0
of 0 votes

Amazon S3 provides object storage, not block storage! AWS EBS does provide block storage.

- Andrew June 20, 2019 | Flag
0
of 0 vote

- How can we make sure the files are in sync? Use a hash function like SHA-1 to produce a 160-bit value for each file. You can then easily compare this small value between the copy on the client computer and the copy on your back end.
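
A minimal sketch of that comparison, using Python's `hashlib`. The function names are hypothetical, and `server_digest` is assumed to be the hex digest the back end reports for its copy.

```python
import hashlib


def file_sha1(path, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def in_sync(local_path, server_digest):
    """True when the local file hashes to the digest the server reports."""
    return file_sha1(local_path) == server_digest
```

The hex digest is 40 characters long, which is the 160-bit value mentioned above.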

- Protocol? Use FTP over SSL. The client program connects to the server and, after authenticating, tries to sync the files. If it finds a mismatch, it compares the modification dates and either downloads or uploads the file.
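
The mismatch rule can be sketched as a small pure function, leaving the FTP transfer itself out. The function name and return values are hypothetical, and letting the server win when the timestamps are equal is an assumption, since the answer does not say what to do on a tie.

```python
def sync_action(local_hash, local_mtime, remote_hash, remote_mtime):
    """Decide what to do for one file: "none", "upload", or "download".

    A hash of None means the file does not exist on that side.
    Timestamps are modification times in seconds since the epoch.
    """
    if local_hash == remote_hash:
        return "none"  # contents already match (or file missing on both sides)
    if local_hash is None:
        return "download"  # only the server has the file
    if remote_hash is None:
        return "upload"  # only the client has the file
    # Contents differ: the newer copy wins; on a tie the server copy is kept.
    return "upload" if local_mtime > remote_mtime else "download"
```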

- A good hint for implementation: since most files do not change frequently, you can keep a database of the files' hash values and query it when you want to compare them. This reduces the traffic spent accessing the files on the server, because you only touch a file when it is actually needed. When a file is uploaded to the server, its hash value in the DB has to be updated as well. You should also put a web service layer between the DB and the client.
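
A minimal sketch of that hash index, with a plain dict standing in for the database table and hypothetical function names:

```python
def paths_to_transfer(local_hashes, server_index):
    """Return the paths whose local hash differs from, or is missing in, the index.

    Both arguments map a file path to its hash; only the index is consulted,
    so unchanged files never cause any access to the stored files themselves.
    """
    return sorted(
        path for path, h in local_hashes.items() if server_index.get(path) != h
    )


def record_upload(server_index, path, new_hash):
    """After a successful upload, update the indexed hash for that path."""
    server_index[path] = new_hash
```

Calling `record_upload` after each transfer is what keeps the index trustworthy; if it is skipped, the next comparison will flag the same file again.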

- Mohammad September 02, 2014 | Flag Reply
0
of 0 votes

How would you make this implementation scale? How can we keep downtime low if a datacenter goes down?

- paul September 03, 2014 | Flag

