We are given a list of numbers

Microsoft Interview Question for Software Engineer / Developers

0

of 0 votes

28
Answers
We are given a list of numbers A= {5,9,12,16,25,30,45} and their fixed probability of occurance P={0.22,0.18,0.20,0.05,0.25,0.02,0.08} where sum(p) =1. The problem is to arrange the numbers in a binary search tree in a way that minimizes the expected total access time. Note: In a binary search tree, number of comparisions needed to access element at depth d is (1+d)
- samant.bit August 02, 2012 in India | Report Duplicate | Flag | PURGE
Microsoft Software Engineer / Developer Dynamic Programming

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 9 vote

Yes this problem can be solved by using dynamic programming in O(n3) time
C(i,j) = min { C(i,r-1) + p(r) + C(r+1,j) + sum_{k=i}^{r-1} p(k) + sum_{k=r+1}^{j} p(k) } i <=j
i<= r <=j
= 0 if i = j-1 here we have no occurrence of non existing element node , i.e we only search nodes which are present

To understand this problem more clearly you can watch this video
youtube.com/watch?v=DEOebw3vmXs

- Anonymous August 02, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Agreed to the comment above. The explanation could be more clear.

- Anonymous August 04, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 3 vote

The problem is very similar to matrix chain multiplication, the only difference is the calculation of new comparison from previous ones.
Now look at it very carefully, it is clear that we need to use DP, and in DP also we need to calculate(remember) a matrix. Now because the not a be placed at any position from i to j (if we are calculating MAT[ i ][ j ]), so the complexity will go to O(N*N*N)
Now Here is the algorithm :
M[i][j] = min(p[k] + M[i][k-1] + M[k+1][j] + sum(M[r][r] for r=i:j) for k=i:j), please consider boundary conditions,
we set M[i][i] = p[i], not that firstly we sort the array of number and maintain their corresponding probabilities also(sort can be in any order).

For example like above construct a upper diagonal matrix, here each entry actually store the comparisons only. Now if you want to construct the tree, you need another matrix(B od same size which store node data), which tells who to be chosen as parent node at each place, so M[0][N-1] will give you cost and B[0][N-1] will give you root node.

Please let me know if am not clear..

- sonesh November 22, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 8 vote

Just Build a Huffman Tree!

- cobra August 03, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

in huffman tree, the element with higher frequency or probability can be reached in short time than element with low frequency or probability..

to have details of huffman encoding.. see CORMEN's Introduction to Algorithm

- cobra August 03, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

They require to build binary search tree and not Huffman tree.

- Anonymous August 03, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

you are right..
1. sort the elements according to the probability => preorder of BST
2. build BST

so that Root element is the element with highest probability..
for the given case..by doing like this only drawback , 30 is searched faster than 16.

- cobra August 03, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

cobra: that's not correct. Consider a tree whose inorder traversal is 1, 2, 3, 4, 5, 6, .., N and whose probabilities are, for instance, .01, .02, .03, .035, .... Strictly ascending probabilities with respect to the in-order traversal.Your idea would place N as the root, N-1 as its left child, N-2 as the left child of that, ... Your tree would be a linked list. This can be worse in the expectation than a more balanced tree even if the root of the balanced tree isn't the node that has the highest probability.

- eugene.yarovoi August 04, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

i guess cobra's solution is correct. its just about to keep the probabilities as the keys to BST and info as satellite data.

- timedout August 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

It's not correct, for the reason I mentioned. Take a dataset like the one I mentioned. Run cobra's algorithm. Then run the dynamic programming algorithm described by another response. Then observe how the dynamic programming solution yields a better expected time. This shows that cobra's algorithm isn't optimal.

- eugene.yarovoi August 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Use a splay tree, and insert the elements one-by-one in increasing order of probability. In a splay tree (which is a special type of BST), the most recently inserted/accessed elements occur at the top of the tree.

Note; I'm not sure this solution is correct.

- Anonymous August 07, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

You haven't given a good rationale as to why it would be correct. Just because the most frequent elements are near the top, doesn't mean they're arranged in the most optimal way, or if they are, you should give a proof or at least an argument. I think that, generally speaking, when you post a solution to a non-trivial problem, the burden is on you to show that the solution is correct.

Try comparing your results with the results of the DP approach, whose optimality is easy to see and prove. You'll probably find cases where your trees are suboptimal.

- eugene.yarovoi August 07, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I feel the splay tree solution mentioned here is a good attempt ... but if nodes are considered in increasing order of probability for constructing the BST, then it will lead to a linked list like structure ... so this will not be an efficient BST ... Let me know if I am missing anything here

- Anand January 05, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Use OBST
cse.yorku.ca/~aaw/Gubarenko/BSTAnimation. html

- GKalchev August 08, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

It appears to me that it might be simply done by using sorting the elements on the basis of their probabilities, and then putting them in the tree in a level-order fashion, i.e., if the elements after sorting are e1, e2, e3, e4, e5, e6 and so on, then e1 will be root, e2, e3 would be its children, e4, e5 would be e2's children and so on.

Here's my logic for that:
The expected time is Sum{P(i)*T(i)} for all i, right? and we need to minimize it. So, simple enough, i say. Put the Highest P(i)'s with the lowest T(i)'s... as many as you can. the highest P(i) would be at root, with t=1 (lowest) , followed by e2 and e3, the next highests getting the next lowest time (t=2).

- prkhr August 08, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Then u may not end up in a bst!!

- madhu August 08, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Agreed with madhu. You need to ensure you form a BST. You'd be right if that was not a constraint here.

- eugene.yarovoi August 08, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Treap will solve this

- Anonymous August 18, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Just because a treap is a tree with some random structure doesn't mean that it's the solution to this question involving trees and probabilities. The solution here is deterministic because we're asked to minimize an expectation.

The DP approach is the right idea.

- eugene.yarovoi August 18, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Just build a priority queue using a heap where the priority is the probability of occurance in this case.

- Anonymous August 23, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

DP
Base Case:
OBST(i,0) = 0 OBST(0,j) = 0

Sub Optimal Problem:
OBST(i,j) = OBST(i,r-1)+P(r)+OBST(r+1,j) + Sum(pk)(i<=k<=j)
i<=r<=j

- Kevin March 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This is exactly the same as Optimal Binary Search tree problem..

- Sriram December 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Knuth has a section on this.

Dynamic programming problem.

- Anonymous August 02, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

is knuth a book on algorithms?

- Anonymous August 02, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

If you are trying to be funny, you succeeded in making me laugh.

I believe the answer refers to "The Art of Computer Programming" by Donald E Knuth. (One of the volumes I suppose, probably Vol II, on sorting and searching).

- Anonymous August 02, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

This problem can be solved using Huffman encoding

- AAH August 02, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 vote

@samant.bit do dey expect the o/p to be 25 12 45 5 16 30 9???

- codinglearner August 03, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Huffman tree is based on greedy algorithm, however that will not always give correct (optimal) solution for BST

- Anonymous August 03, 2012 | Flag

CareerCup

Microsoft Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews