Programming Interview Questions

Multiple issues:
1) Performance - assume summing 1M numbers takes 10 ms. Summing sequentially on a single machine then takes 10,000 * 10 ms = 100 s. Too slow.
2) Overflow - there are 10 billion integers in total, so you can't bank on the max int limit of ~2.15 billion. A long is safer, with a max of ~9.2 * 10^18, but in the worst case 10 billion values near the int max add up to ~2.1 * 10^19, which is beyond even a long. BigInteger is the safest choice. A long can be faster, so decide the trade-off based on the data range.
3) Beware of OOM due to heap size limits. With ints, each data set is 4 MB+ (4 B per int plus object overhead). With longs, a primitive long[] is 8 MB+ (8 B per long); budget ~16 MB per data set to leave headroom. For BigInteger, plan for more, as it internally uses an int[].
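
To make the overflow point concrete, here is a minimal Java sketch. The class and method names are made up for illustration, and it assumes each data set arrives as an int[] already in memory. One way to handle the trade-off, under those assumptions, is to sum each data set into a long (1M ints cannot exceed ~2.15 * 10^15, which fits) and to use BigInteger only when combining the per-data-set totals.

```java
import java.math.BigInteger;

// Hypothetical names; a sketch of the overflow trade-off, not a definitive implementation.
class OverflowSketch {
    // Naive int accumulator: wraps around silently past Integer.MAX_VALUE.
    static int sumAsInt(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // long accumulator: safe for one data set of 1M ints.
    static long sumAsLong(int[] values) {
        long total = 0L;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Combine per-data-set totals without any risk of overflow.
    static BigInteger combinePartials(long[] partials) {
        BigInteger total = BigInteger.ZERO;
        for (long p : partials) {
            total = total.add(BigInteger.valueOf(p));
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = {Integer.MAX_VALUE, 1};
        System.out.println("int sum:  " + sumAsInt(values));
        System.out.println("long sum: " + sumAsLong(values));
        System.out.println("combined: " + combinePartials(new long[] {Long.MAX_VALUE, Long.MAX_VALUE}));
    }
}
```

If the data range is known to be small enough, a plain long for the grand total would do; Math.addExact is an option if you would rather fail loudly than wrap.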

So, assuming the max heap is configured at 8 GB and 50% of it is available to you, you can process (4 GB / 16 MB) ~250 files per node in a multi-threaded implementation, so you need 40 nodes (10,000 / 250).
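
The sizing arithmetic can be written down as a small Java sketch. The names are hypothetical, and the inputs (4 GB usable heap, 16 MB per data set, 10,000 data sets) are just the assumptions stated above, in decimal units.

```java
// Hypothetical helper; reproduces the back-of-the-envelope sizing only.
class CapacitySketch {
    // How many data sets fit in the usable heap of one node.
    static long dataSetsPerNode(long usableHeapBytes, long bytesPerDataSet) {
        return usableHeapBytes / bytesPerDataSet;
    }

    // Ceiling division: a partially filled node still counts as a node.
    static long nodesNeeded(long totalDataSets, long perNode) {
        return (totalDataSets + perNode - 1) / perNode;
    }

    public static void main(String[] args) {
        long perNode = dataSetsPerNode(4_000_000_000L, 16_000_000L);
        System.out.println("data sets per node: " + perNode);
        System.out.println("nodes needed: " + nodesNeeded(10_000L, perNode));
    }
}
```
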
Assume each node takes 20-30 ms to do its share. Have workers/jobs delegate the work to the 40 machines along with the file paths (each machine can read its target files directly from a central mount location). Kick them off in parallel, then sum up the 40 results. With each node taking 20-30 ms, plus say 2-5 ms of network latency, you can get the final answer in ~35-40 ms.
Optimize further as per your perf SLA.
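
A rough Java sketch of the per-node work and the final combine step follows. All names are hypothetical. It assumes each data set is already loaded as an int[] (reading the files from the shared mount is left out), and it stands in for the network hop by combining per-node results in-process.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical names; a sketch under the assumptions above, not a definitive implementation.
class NodeSumSketch {
    // One task: sum a single data set into a long.
    static long sumOneDataSet(int[] values) {
        long total = 0L;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Per-node work: sum this node's data sets on a fixed thread pool.
    static BigInteger sumOnNode(List<int[]> dataSets, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (int[] dataSet : dataSets) {
                Callable<Long> task = () -> sumOneDataSet(dataSet);
                partials.add(pool.submit(task));
            }
            BigInteger total = BigInteger.ZERO;
            for (Future<Long> partial : partials) {
                total = total.add(BigInteger.valueOf(partial.get()));
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a summing task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Coordinator side: add up the results returned by the nodes.
    static BigInteger combine(List<BigInteger> nodeResults) {
        BigInteger total = BigInteger.ZERO;
        for (BigInteger r : nodeResults) {
            total = total.add(r);
        }
        return total;
    }

    public static void main(String[] args) {
        List<int[]> nodeA = new ArrayList<>();
        nodeA.add(new int[] {1, 2, 3});
        List<int[]> nodeB = new ArrayList<>();
        nodeB.add(new int[] {Integer.MAX_VALUE, Integer.MAX_VALUE});

        List<BigInteger> results = new ArrayList<>();
        results.add(sumOnNode(nodeA, 2));
        results.add(sumOnNode(nodeB, 2));
        System.out.println("grand total: " + combine(results));
    }
}
```

The thread count is a tuning knob: to get anywhere near the 20-30 ms per-node figure, a node has to work on many of its files at the same time.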

Comments welcome!!

- gg_poks June 11, 2018
