## Morgan Stanley Interview Question for Technical Support Engineers

Country: India
Interview Type: In-Person

2 of 2 votes

How about a B-tree that stores the number of occurrences of each word?

http://en.wikipedia.org/wiki/B-tree

1 of 1 vote

I answered that I would use a trie to store the words with their counts.
Each node in the structure has a character, child pointers, and a count variable.
Traverse the book character by character, moving down the trie from the root through the corresponding child nodes. When the book yields a word boundary (a space or other whitespace), increment the counter on the node for the last character. In the end, each node's count gives the number of occurrences of the word spelled out by the path from the root to that node.
Once the structure is built, traverse the tree to find the maximum count, then traverse again from the root to recover that word.
Please post alternative and better solutions.
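The trie idea above can be sketched in a few lines. This is a minimal illustration (class and function names are my own, and a dict stands in for the fixed array of 26 child pointers):

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.count = 0      # occurrences of the word ending at this node

def most_frequent_word(text):
    root = TrieNode()
    # Build the trie: walk the book word by word, character by character;
    # at each word boundary, bump the count on the last character's node.
    for word in text.lower().split():
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.count += 1
    # Traverse the trie, tracking the node with the maximum count and
    # rebuilding the word from the root-to-node path as we go.
    best_word, best_count = None, 0
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if node.count > best_count:
            best_word, best_count = prefix, node.count
        for ch, child in node.children.items():
            stack.append((child, prefix + ch))
    return best_word, best_count

print(most_frequent_word("the cat and the dog and the bird"))  # ('the', 3)
```

Building the trie touches each character of the book once, and the final traversal visits each node once, so the whole thing is linear in the total number of characters.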

0 votes

I had the same solution. A hash table could also be a solution, but since the number of words would be huge and the range of possible words is effectively unbounded, a hash table is out of the picture here.

0 votes

What would be the character at the root? Are you proposing multiple trees or a single tree?

0 votes

You can do an external merge sort on the complete book (the merge logic depends on the RAM available, since the book cannot be loaded into memory all at once).

After the sorting is done, read through the sorted output once and track the word with the highest number of occurrences.
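After the external sort, equal words sit next to each other, so the second pass only needs to count runs of identical adjacent words. A sketch of that pass (an in-memory `sorted()` stands in for the external merge sort here):

```python
def longest_run_word(sorted_words):
    """Given words in sorted order, return (word, count) of the longest run."""
    best_word, best_count = None, 0
    current_word, current_count = None, 0
    for w in sorted_words:
        # Extend the current run, or start a new one on a word change.
        if w == current_word:
            current_count += 1
        else:
            current_word, current_count = w, 1
        if current_count > best_count:
            best_word, best_count = current_word, current_count
    return best_word, best_count

words = "to be or not to be".split()
print(longest_run_word(sorted(words)))  # ('be', 2)
```

The counting pass streams through the sorted data once and keeps only two words in memory at a time, so it works no matter how large the book is.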

0 votes

What is the complexity? Is it better than O(N*M)?
N = number of words in the file.
M = length of the word being searched.

0 of 0 votes

The root will not have any character.
It's a single tree; each node has at most 26 pointers, one named for each character it can point to. My solution was just an idea, not at implementation level; if someone can give a better solution with code, please do.

0 of 0 votes

Merge sort is O(n log n) comparisons, and each comparison may touch every character of the two words. With a trie, each character in the book is accessed just once, choosing among at most 26 children at each node. Isn't that better? Correct me if I am wrong.

0 of 0 votes

This is a tricky question and is expensive to solve using traditional methods.
One alternative method is to use a large array in which each element is a pointer to a string.
Read each word from the book, take a random number in the range 0 to Max_Array_Size, and replace the existing Array[randomNo] with the word just read.
After reading all the words from the book, scan the array and find the string that occurs the most times.

I was asked a similar question and argued that the solution would not work. The interviewer explained the following:
1. The algorithm works most of the time. I forgot the algorithm's name.
2. The array size has to be chosen according to a formula I don't remember.
3. The algorithm is very fast and efficient.
4. It can be used in search-type applications like Google/Bing, where you don't need 100% accurate results.
5. If a word occurs many times in the book, there is a higher probability that the word is present at multiple places in the array.
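As described, the scheme amounts to keeping a fixed-size random sample of the words and reporting the sample's mode. Since the commenter forgot the algorithm's name, this is only a sketch of the mechanics as stated (the array size and the accuracy guarantee are not worked out here):

```python
import random
from collections import Counter

def approx_most_frequent(words, array_size=1000, seed=None):
    rng = random.Random(seed)
    sample = [None] * array_size
    # Overwrite a uniformly random slot with each word read from the book.
    for w in words:
        sample[rng.randrange(array_size)] = w
    # A word that occurs often in the book is likely to survive in many
    # slots, so the mode of the sample approximates the true mode.
    counts = Counter(w for w in sample if w is not None)
    return counts.most_common(1)[0][0] if counts else None
```

The answer is probabilistic: rare words can win by luck if the array is too small relative to the skew of the word distribution, which is presumably what the forgotten sizing formula controls.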

0 votes

Bloom filter?

0 of 0 votes

Since it is a very big book, an in-memory sort or in-memory data structure may not work.

My solution is to split the whole book into K buckets and handle the buckets one by one.
For the first bucket, take the top 20% of words by frequency and build a max-heap.
When the next bucket comes in, update the max-heap with its counts.
After all buckets are read, the words at the top of the max-heap are the most frequent ones.

I don't know if this works; I would be very happy to be corrected.

0 of 0 votes

Another point on this answer: we can use MapReduce to speed up the execution.
Split the book into chunks (e.g., by chapter);
assign different machines/threads to walk through their chunk, logging and counting its words in a HashMap;
then summarize the results by merging the per-chunk maps, tracking the maximum as we merge so no extra pass is needed afterwards.
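The map/count/merge structure above can be sketched with Python's standard library. Here the "chunks" are just slices of the word list rather than chapters, and threads stand in for machines (real speedup would need separate processes or hosts; this only shows the shape of the computation):

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_chunk(words):
    """Map step: count the words within one chunk."""
    return Counter(words)

def most_frequent(text, n_workers=4):
    words = text.lower().split()
    size = max(1, len(words) // n_workers)
    chunks = [words[i:i + size] for i in range(0, len(words), size)]
    # Map in parallel, then reduce by summing the per-chunk counters.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        total = sum(pool.map(count_chunk, chunks), Counter())
    return total.most_common(1)[0]

print(most_frequent("the cat sat on the mat the end"))  # ('the', 3)
```

The reduce step is associative (counter addition), which is exactly what makes the word-count problem a natural fit for MapReduce.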

0 of 0 votes

You can use the median-of-medians selection algorithm to find the k-th largest count.

For details visit:
www.ardendertat.com/2011/10/27/programming-interview-questions-10-kth-largest-element-in-array/
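Median-of-medians makes selection worst-case O(n); a shorter sketch with a random pivot (expected O(n)) shows the same quickselect idea applied to picking the k-th largest count:

```python
import random

def kth_largest(values, k):
    """Select the k-th largest element (k=1 is the maximum).
    Random-pivot quickselect, expected O(n); choosing the pivot
    with median-of-medians would make the worst case O(n) too."""
    assert 1 <= k <= len(values)
    pivot = random.choice(values)
    greater = [v for v in values if v > pivot]
    equal = [v for v in values if v == pivot]
    if k <= len(greater):
        return kth_largest(greater, k)
    if k <= len(greater) + len(equal):
        return pivot
    smaller = [v for v in values if v < pivot]
    return kth_largest(smaller, k - len(greater) - len(equal))

print(kth_largest([7, 2, 9, 4, 9, 1], 2))  # 9
```

For this problem you would run it over the word counts with k=1, though for k=1 a single linear scan for the maximum is simpler; selection earns its keep when you want the top k words.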

0 of 0 votes

For natural-language words it is better to use a hash map (unordered_map in C++11) than a binary tree, since it is possible to design a good hash function that follows language rules (i.e., which vowels follow which consonants and vice versa in English), so the probability of collisions will be small.
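With a hash map the whole exercise collapses to a single counting pass. In Python this is just `collections.Counter` (no language-aware hash function here, only the standard string hash, which is usually good enough):

```python
import re
from collections import Counter

def top_word(text):
    # One pass: hash each word to its running count.
    words = re.findall(r"[a-z']+", text.lower())
    word, count = Counter(words).most_common(1)[0]
    return word, count

print(top_word("the quick brown fox jumps over the lazy dog the end"))
# ('the', 3)
```

This is O(total characters) expected time and O(distinct words) space; the earlier objection in this thread is that for a truly enormous vocabulary that space may not fit in memory, which is what motivates the external-sort and bucketing answers.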

