How does one application imple

Amazon Interview Question for Principal Software Engineers

How would one implement an application similar to Dropbox? How can we make sure files stay in sync? How will you check that files have been downloaded? How will you download files? What protocols will you use?
- newbee September 02, 2014 in United States
Amazon Principal Software Engineer Knowledge Based

Country: United States
Interview Type: Phone Interview

1) Use the rsync protocol for file synchronization between clients and between client and server.
2) Store file blocks of equal size instead of whole files. Have a file block server that keeps track of the file blocks and the SHA-256 hash of each stored block.
3) Use Amazon S3 to store the file blocks.
4) Each client should be identified by its unique root namespace.
5) Store client metadata in a DB (MySQL) table. Each row in the table represents a particular file. The table attributes should be the list of file blocks, the root namespace of the client who owns the file, and the relative path of the file within that namespace, which gives the file's location in the client's Dropbox installation folder. Store other user metadata (settings, account configuration, access level, etc.) in the same DB in a separate table.
6) Have a metadata server to fetch results from the DB.
7) Run multiple instances of the metadata server and the file block server to handle a large number of requests. Amazon S3 will handle your file blocks.
8) Put memcache and a load balancer in front of the metadata server for efficiency.
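
The metadata table from point 5 could be sketched roughly as follows. This is only an illustration: SQLite stands in for MySQL so the snippet runs on its own, and the table name, column names, and helper functions (`open_metadata_db`, `record_file`, `lookup_blocks`) are hypothetical, not taken from any real Dropbox schema.

```python
import json
import sqlite3


def open_metadata_db():
    """Create an in-memory DB with one row per file (SQLite stands in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE files (
            namespace    TEXT NOT NULL,  -- root namespace of the owning client
            rel_path     TEXT NOT NULL,  -- path of the file relative to the namespace
            block_hashes TEXT NOT NULL,  -- ordered list of block hashes, stored as JSON
            PRIMARY KEY (namespace, rel_path)
        )
        """
    )
    return conn


def record_file(conn, namespace, rel_path, block_hashes):
    """Insert or overwrite the row describing one file."""
    conn.execute(
        "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
        (namespace, rel_path, json.dumps(block_hashes)),
    )
    conn.commit()


def lookup_blocks(conn, namespace, rel_path):
    """Return the ordered block-hash list for a file, or None if it is unknown."""
    row = conn.execute(
        "SELECT block_hashes FROM files WHERE namespace = ? AND rel_path = ?",
        (namespace, rel_path),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Keying rows on (namespace, relative path) means a new version of a file simply replaces its block list; a real system would probably also keep version history, which this sketch leaves out.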

To upload a file, the client splits it into blocks of equal size (4 MB) and sends the metadata server the information about the file to upload (the hashes of its blocks). If any block is not already in Amazon S3, the block server tells the metadata server, which in turn tells the client to send those blocks to the block server. The client sends only those blocks. Once the block server has stored them in Amazon S3, the metadata server adds an entry for the file to the MySQL DB, representing an update in the client's namespace.
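
A minimal sketch of that upload flow, assuming plain dictionaries in place of the real services: `block_store` stands in for the block server plus Amazon S3, and `metadata` stands in for the metadata server's table. The function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB blocks, as described above


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut the file contents into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def upload_file(path, data, block_store, metadata, block_size=BLOCK_SIZE):
    """Send only the blocks the store does not already hold, then record the file.

    Returns the hashes of the blocks that actually had to be transferred.
    """
    blocks = split_into_blocks(data, block_size)
    hashes = [hashlib.sha256(b).hexdigest() for b in blocks]

    sent = []
    for h, block in zip(hashes, blocks):
        if h not in block_store:      # block is new to the store
            block_store[h] = block    # "client sends block to block server"
            sent.append(h)

    # The file entry is written only after all of its blocks are stored.
    metadata[path] = hashes
    return sent
```

Because blocks are addressed by their hash, identical blocks are stored once, whether they repeat within a file or across files.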

To sync a file to a user's other clients, the metadata server informs those clients about the update in the client namespace. The clients ask the metadata server about the newly added file, and the metadata server sends back the list of hashes of the new file's blocks. The clients then talk to the block server, give it the list of block hashes, retrieve the corresponding blocks, and combine them to reconstruct the file.
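
The receiving side can be sketched in the same style. Again this is an assumption-laden illustration: `metadata` and `block_store` are dictionaries standing in for the metadata and block servers, and `local_cache` is a hypothetical per-client cache of blocks already on disk, which the answer above does not mention.

```python
import hashlib


def download_file(path, metadata, block_store, local_cache):
    """Rebuild a file from its block-hash list, fetching only missing blocks.

    Returns (file_bytes, hashes_fetched_from_the_block_store).
    """
    hashes = metadata[path]           # "ask metadata server for the hash list"
    fetched = []
    parts = []
    for h in hashes:
        if h not in local_cache:
            block = block_store[h]    # "retrieve block from block server"
            # Verify the block against its hash before trusting it.
            if hashlib.sha256(block).hexdigest() != h:
                raise ValueError("block %s failed verification" % h)
            local_cache[h] = block
            fetched.append(h)
        parts.append(local_cache[h])
    return b"".join(parts), fetched
```

Checking each block against its hash also answers the "how will you check that files are downloaded" part of the question: a file is complete once every hash in its list resolves to a verified block.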

And this is exactly how Dropbox works.

- Cerberuz September 03, 2014

Aren't you giving out proprietary information, if that last line is indeed the case?

- Bhaavan May 23, 2015

Amazon S3 provides object storage, not block storage! AWS EBS does provide block storage.

- Andrew June 20, 2019

- How can we make sure the files are in sync? Hash each file with a function such as SHA-1, which produces a 160-bit value. You can then cheaply compare this small value between the copy on the client computer and the copy on your back end.

- Protocol? Use FTP over SSL. The client program connects to the server and, after authentication, tries to sync the files. If it finds a mismatch, it compares the modification dates and either downloads or uploads the file.

- A good hint for implementation: since most files do not change frequently, you can keep the files' hash values in a database and query it when you want to compare them. This reduces the traffic spent accessing the files on the server; you only touch the files themselves when needed. When a file is uploaded to the server, its hash value in the DB has to be updated. You should also put a web service layer between the DB and the client.
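
The compare-then-decide step in this answer can be sketched as below. The function names and the "newest copy wins" rule are assumptions for illustration; the answer only says to compare dates on a mismatch. The hash and timestamp for the remote side would come from the hash database, not from reading the remote file.

```python
import hashlib


def file_digest(data):
    """Return the SHA-1 digest of the file contents as 40 hex characters (160 bits)."""
    return hashlib.sha1(data).hexdigest()


def sync_action(local_hash, local_mtime, remote_hash, remote_mtime):
    """Decide what to do for one file; None means that side has no copy."""
    if local_hash == remote_hash:
        return "none"                 # identical (or missing on both sides)
    if remote_hash is None:
        return "upload"               # server has never seen this file
    if local_hash is None:
        return "download"             # client has never seen this file
    # Hashes differ: assume the more recently modified copy wins.
    return "upload" if local_mtime > remote_mtime else "download"
```

For example, `sync_action(file_digest(b"v2"), 200, file_digest(b"v1"), 100)` yields `"upload"`, since the local copy differs and is newer.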

- Mohammad September 02, 2014

How would you make this implementation scale? How can we keep downtime low if a datacenter goes down?

- paul September 03, 2014
