At your disposal you have 100 processing nodes. You have 1 billion rows of data, all located on the storage attached to one processor. Every processor can read from the storage of any other processor over the network. The objective is to sum one column of the data set at minimum cost (time).

Cost summary

a. 0.01 unit (of time) to read 1 row of data from a local processor

b. 0.05 unit (of time) to read 1 row of data from a remote processor

c. 1000 units (of time) to partition the file into two pieces

d. 0.1 unit (of time) to copy 1 row of data from a remote processor to a local processor

e. 1 unit (of time) to sum 1 row of data

What is the optimal configuration of the system to sum one column of the data file at minimum cost (time)? Consider the location of the data, how the data is partitioned, and the number of processors used.
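
One way to reason about the question is to write the costs down as a small model and compare configurations. The sketch below is only an illustration under assumptions the problem does not state: the rows are split evenly across the workers, splitting into k pieces takes k - 1 sequential two-way partitions, the workers run in parallel with no network contention, copied rows still have to be read locally before they are summed, and combining the partial sums is free. The names `elapsed_time`, `best_workers` and the two strategy labels are made up for this example.

```python
import math

ROWS = 1_000_000_000   # rows in the data set
NODES = 100            # processing nodes available

# Costs from the cost summary (units of time)
READ_LOCAL = 0.01      # a. read 1 row from local storage
READ_REMOTE = 0.05     # b. read 1 row from a remote processor
PARTITION = 1000       # c. split the file into two pieces
COPY_REMOTE = 0.1      # d. copy 1 row from a remote processor to local storage
SUM_ROW = 1.0          # e. sum 1 row

STRATEGIES = ("remote_read", "copy_then_read")


def elapsed_time(workers, strategy):
    """Estimate wall-clock time for `workers` nodes summing the column.

    Assumptions (not given in the problem): even split, k - 1 sequential
    partitions, perfect parallelism, free final combination of partial sums.
    """
    if not 1 <= workers <= NODES:
        raise ValueError("workers must be between 1 and NODES")
    if strategy not in STRATEGIES:
        raise ValueError("unknown strategy")

    rows_per_worker = math.ceil(ROWS / workers)

    # The node that owns the data always reads its share locally.
    owner_time = rows_per_worker * (READ_LOCAL + SUM_ROW)
    if workers == 1:
        return owner_time  # no partitioning, no network traffic

    if strategy == "remote_read":
        # Other nodes read their share over the network and sum it.
        other_time = rows_per_worker * (READ_REMOTE + SUM_ROW)
    else:
        # Other nodes copy their share, then read it locally and sum it.
        other_time = rows_per_worker * (COPY_REMOTE + READ_LOCAL + SUM_ROW)

    split_time = PARTITION * (workers - 1)
    # Workers run in parallel, so the slowest one sets the finish time.
    return split_time + max(owner_time, other_time)


def best_workers(strategy):
    """Worker count (1..NODES) with the lowest estimated time."""
    return min(range(1, NODES + 1), key=lambda k: elapsed_time(k, strategy))


if __name__ == "__main__":
    for strategy in STRATEGIES:
        k = best_workers(strategy)
        print(strategy, k, elapsed_time(k, strategy))
```

Under these assumptions the summing cost dominates everything else, so the model favors using all 100 nodes and having the 99 remote ones read their partitions directly over the network: roughly 10.6 million units, against about 1.01 billion units on the single node that holds the data. Copying first costs 0.1 + 0.01 per row where a remote read costs 0.05, so it only pays off if the data will be read more than once. A different reading of the partitioning cost or of network contention would change the numbers.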

