Pega Interview Question for Software Engineer / Developers

There is a job which would comprise millions of tasks. There are multiple JVMs. Design a system such that these tasks are shared across JVMs.
- HV October 02, 2012 in India
Pega Software Engineer / Developer Application / UI Design

Country: India
Interview Type: In-Person

What sorts of tasks? Can these tasks be done in any order? Does each JVM reside on a separate machine, or in such an environment that it gets its own resources?

- eugene.yarovoi October 02, 2012

I'm assuming there is no ordering of tasks. To ensure high throughput, you need a queue to hold pending tasks. Every VM runs an application that polls the queue for work, executes it, reports the result in some fashion, and loops forever. Of course, you would need someone to enqueue the tasks in the queue. You would also want to invent some concept of batching identical tasks together so that resource utilization is maximized. The good thing about a pool of executors is that you can scale up or down with the load very easily.
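A minimal sketch of the poll-and-execute loop described above, offered only as an illustration. It assumes an in-process `LinkedBlockingQueue` as a stand-in for the shared queue and threads as a stand-in for the JVMs; the class and method names (`WorkerPoolSketch`, `runAll`) are hypothetical. Unlike the "loops forever" workers in the comment, these workers stop once the queue is empty so the example terminates.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerPoolSketch {
    /** Runs `workers` threads that drain the pending tasks; returns how many tasks ran. */
    static int runAll(List<Runnable> tasks, int workers) {
        // Stand-in for the shared, fault-tolerant queue (assumption: in-memory only).
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(tasks);
        AtomicInteger done = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                Runnable task;
                // Each worker keeps polling; a slow task delays only its own worker.
                while ((task = queue.poll()) != null) {
                    task.run();
                    done.incrementAndGet(); // "report the result" is reduced to a counter here
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return done.get();
    }
}
```

Because workers pull work instead of having it assigned up front, adding or removing workers changes throughput without any repartitioning of the task set.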

If the tasks have priority, the simplest solution is to use multiple queues, one for each priority band. The task executors look at the highest-priority queue first, then the second, and so on. There is a risk of starvation though: low-priority tasks may never run while higher-priority queues keep receiving work.
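The priority-band idea can be sketched roughly as follows. This is an illustration under assumptions, not a reference design: `PriorityBands` is a hypothetical name, band 0 is taken to be the highest priority, and nothing here addresses the starvation risk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class PriorityBands<T> {
    // One FIFO queue per priority band; index 0 is the highest priority (assumption).
    private final List<Queue<T>> bands = new ArrayList<>();

    PriorityBands(int bandCount) {
        for (int i = 0; i < bandCount; i++) {
            bands.add(new ConcurrentLinkedQueue<>());
        }
    }

    void enqueue(int band, T task) {
        bands.get(band).add(task);
    }

    /** Returns the next task from the highest-priority non-empty band, or null if all are empty. */
    T next() {
        for (Queue<T> band : bands) {
            T task = band.poll();
            if (task != null) {
                return task;
            }
        }
        return null;
    }
}
```

A common mitigation for starvation, not shown here, is to occasionally serve a lower band out of turn or to promote tasks that have waited too long.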

The queue is interesting. It is required because there is no guarantee how long a particular task would take. Immediately, there's an impedance mismatch between the producers and the consumers. There will definitely be situations where the producer generates tasks faster than all the consumers combined can process them. The queue needs to be fault tolerant as well and needs to have a defined delivery guarantee. Typically such applications would use a queue implementation such as Amazon SQS. SQS guarantees at-least-once delivery, which means the same task can be delivered, and so requested, twice. The executors need to make sure a duplicate delivery does no harm (look up idempotency in distributed systems).
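One way to tolerate at-least-once delivery is to deduplicate on a task ID before doing the work. The sketch below is only an illustration: it assumes every task carries a unique string ID and keeps the set of claimed IDs in memory, so it protects a single JVM only. Across JVMs the claimed set would have to live in a shared store, which is not shown. `IdempotentExecutor` and `executeOnce` are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentExecutor {
    // IDs of tasks that have been claimed (running or finished) by this JVM.
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    /**
     * Runs `work` only if `taskId` has not been claimed before.
     * Returns true if the work ran, false if the delivery was a duplicate.
     */
    boolean executeOnce(String taskId, Runnable work) {
        // Set.add is atomic here: exactly one caller per ID gets `true`.
        if (!claimed.add(taskId)) {
            return false;
        }
        try {
            work.run();
            return true;
        } catch (RuntimeException e) {
            // Release the claim so a redelivery can retry the failed task.
            claimed.remove(taskId);
            throw e;
        }
    }
}
```

The alternative is to make the task itself idempotent, so that running it twice leaves the same result as running it once; then no bookkeeping is needed.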

eugene.yarovoi:
It actually doesn't matter if the VMs reside on the same host or not. If they did share a host, you would have a higher probability of correlated failure (one host going down takes more than one VM out of service). You need to design for reliability anyway.

- dr.house October 02, 2012

"You would also want to invent some concept of batching identical tasks together so that resource utilization is maximized."

Define "identical tasks". What exactly do you mean here?

"The queue is interesting. It is required because there is no guarantee how long a particular task would take."

Not sure I'm following your train of thought there. It's certainly not required, though maybe desired. What sort of situation are you contrasting this with? One where you can try to partition the tasks into roughly equal sets because they have known durations?

"It actually doesn't matter if the VMs reside on the same host or not."

I'm not so sure about that one. If the JVMs are all sitting on one machine that has no multiprocessing or anything like that, it might be wiser for it to not waste memory by loading the same classes in many different JVMs. I'm not saying this is a good strategy all the time, but I could see certain situations where running everything on one JVM might give the best performance. So the answer to such a question is not completely irrelevant. I agree, however, that the interviewer was probably looking for some sort of distribution strategy and was probably thinking of these JVMs being on multiple machines or at least on a machine with multiprocessing capabilities.

- eugene.yarovoi October 03, 2012

- These tasks can be executed in any order.

- Each JVM resides on a separate machine.

- The interviewer also had a follow-up question: At some point, if I want to abort this job, how can it be achieved?

- HV October 03, 2012

A queue approach like the one outlined above by dr.house seems reasonable then. What are your requirements when aborting? We could just flush the queue. Do you need to roll back to the state everything was in before any of the jobs took effect?
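The "just flush the queue" option might look roughly like this. It is a sketch under assumptions: the queue is an in-process `LinkedBlockingQueue`, tasks already running are left to finish, and nothing is rolled back. `AbortableJob` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

class AbortableJob {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final AtomicBoolean aborted = new AtomicBoolean(false);

    /** Enqueues a task; returns false if the job was already aborted. */
    boolean submit(Runnable task) {
        return !aborted.get() && queue.offer(task);
    }

    /** Worker step: runs one pending task; returns false if aborted or nothing is pending. */
    boolean runNext() {
        if (aborted.get()) {
            return false; // also skips any task that slipped in during abort()
        }
        Runnable task = queue.poll();
        if (task == null) {
            return false;
        }
        task.run();
        return true;
    }

    /** Marks the job aborted and discards pending tasks; returns how many were discarded. */
    int abort() {
        aborted.set(true);
        List<Runnable> discarded = new ArrayList<>();
        queue.drainTo(discarded);
        return discarded.size();
    }
}
```

If the requirement is to undo completed work as well, flushing is not enough; each task would need a compensating action or the results would have to be staged until the whole job commits.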

- eugene.yarovoi October 03, 2012

JMX is your answer. You create a JMX agent on every server; each agent carries out tasks and returns a status once they complete. It also has a function for cancelling a task. Each JVM node is configured with this agent along with other information, such as how many worker threads it should have. Then you create a central JVM with a JMX manager to which all these agents report. The manager initiates all agents and is responsible for allocating tasks to agents so that all JVMs are equally occupied. The manager can also provide a cancel function, which communicates with each agent to cancel and stop execution of a task.
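A rough sketch of what one such agent could look like as a standard MBean with a `cancel` operation. Everything named here is an assumption made for illustration: the `WorkerAgent` class, its attributes, and the `example.jobs` object name are hypothetical, and the "manager" call goes through the local platform MBean server. A real manager on another JVM would connect over a remote JMX connector instead, which is not shown.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxAgentSketch {
    /** Management interface; a standard MBean needs a public interface named <Class>MBean. */
    public interface WorkerAgentMBean {
        boolean isCancelled();      // exposed as attribute "Cancelled"
        int getCompletedTasks();    // exposed as attribute "CompletedTasks"
        void cancel();              // exposed as operation "cancel"
    }

    public static class WorkerAgent implements WorkerAgentMBean {
        private final AtomicBoolean cancelled = new AtomicBoolean(false);
        private final AtomicInteger completed = new AtomicInteger();

        @Override public boolean isCancelled() { return cancelled.get(); }
        @Override public int getCompletedTasks() { return completed.get(); }
        @Override public void cancel() { cancelled.set(true); }

        /** Runs one task unless the agent was cancelled; returns whether it ran. */
        boolean runTask(Runnable task) {
            if (cancelled.get()) {
                return false;
            }
            task.run();
            completed.incrementAndGet();
            return true;
        }
    }

    /** Registers a fresh agent under the given object name on the platform MBean server. */
    static WorkerAgent register(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(objectName);
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            WorkerAgent agent = new WorkerAgent();
            server.registerMBean(agent, name);
            return agent;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stand-in for the manager: invokes the agent's "cancel" operation by name through JMX. */
    static void cancelViaJmx(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.invoke(new ObjectName(objectName), "cancel", new Object[0], new String[0]);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that this flag only stops the agent from starting further tasks; interrupting a task that is already running would need cooperation from the task itself.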

- parag June 27, 2013