## Expedia Interview Question for Software Engineer / Developers

Country: United States

Construct a trie with an extra count field in each node during a first pass over the book. Once done, output the 10 most frequently occurring words. Maybe there are more optimal algorithms, but this is what struck me immediately. Assuming there are N words in the book, of which k are distinct: building the trie takes O(N) time (times the average word length), and finding the 10 most frequent words takes O(k), since each distinct word is visited once.
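This answer can be sketched in Python as follows; the node layout and the DFS over distinct words are illustrative choices, not part of the original answer:

```python
from heapq import nlargest

class TrieNode:
    """Trie node with the extra count field the answer describes."""
    def __init__(self):
        self.children = {}
        self.count = 0  # occurrences of the word ending at this node

def build_trie(words):
    root = TrieNode()
    for word in words:          # one pass over all N words
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.count += 1
    return root

def top_k(root, k=10):
    """Visit each of the k distinct words once, then keep the k largest."""
    found = []
    def dfs(node, prefix):
        if node.count:
            found.append((node.count, prefix))
        for ch, child in node.children.items():
            dfs(child, prefix + ch)
    dfs(root, "")
    return nlargest(k, found)

text = "the quick fox and the lazy dog and the fox".split()
print(top_k(build_trie(text), 3))  # → [(3, 'the'), (2, 'fox'), (2, 'and')]
```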

---
Something better than a trie, such as a ternary search tree, can be used. Also, since we only want the top 10, a list of the top 10 words can be maintained in real time, eliminating the extra pass to search for them.

---
It will be impossible to do this in real time, because the sort key (frequency) is not known ahead of time.

---
Since we are counting words from a predefined language (e.g. English), we can use a predefined dictionary of the language's words and their digests (hash values). We then place each word from the book into the bucket associated with its digest in constant time (computing the digest and placing the word into the appropriate bucket are both O(1)). Once we are done parsing the book, we end up with the word counts, and these can be sorted in O(lg n) time (base 2).
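The bucketing idea amounts to hashing words into a table of counts; a minimal sketch (a Python dict stands in for the digest/bucket structure, and the function names are my own):

```python
def count_words(words):
    """Hash each word into a bucket in amortized O(1) per word."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts

book = "to be or not to be".split()
counts = count_words(book)
# Note: sorting the k distinct entries afterwards costs O(k log k)
# with a comparison sort, not O(lg n).
top = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:10]
print(top)  # → [('to', 2), ('be', 2), ('or', 1), ('not', 1)]
```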

---
Which sorting algorithm gives O(lg n)? The best I know of is O(n).

---
A TreeMap data structure can be used: sort the words by count and store them, keeping only the top 10 per the condition specified.
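Python's standard library has no TreeMap, so as a stand-in the "store only the top 10" idea can be sketched with a size-bounded min-heap over (count, word) pairs; this is a substituted technique, not the commenter's exact structure:

```python
import heapq

def keep_top10(counts, k=10):
    """Keep only the k highest-count (count, word) pairs."""
    top = []
    for word, count in counts.items():
        heapq.heappush(top, (count, word))
        if len(top) > k:
            heapq.heappop(top)           # evict the current smallest
    return sorted(top, reverse=True)     # largest count first

counts = {"the": 12, "a": 9, "fox": 3, "dog": 2, "of": 7}
print(keep_top10(counts, 3))  # → [(12, 'the'), (9, 'a'), (7, 'of')]
```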

---
Step 1: If the file (the book) is too large to sort in memory, split it into chunks that can each be sorted in memory.

Step 2: For each sorted chunk, compute sorted pairs of (word, nr_occurrences); at this point you can discard the chunks, because you only need the sorted pairs.

Step 3: Merge the pairs from all chunks, summing counts, and always keep the current top ten.

Example:

Step 1:

a b a ab abb a a b b c c ab ab

split into :

chunk 1: a b a ab
chunk 2: abb a a b b
chunk 3: c c ab ab

Step 2:

chunk 1: a2, b1, ab1
chunk 2: a2, b2, abb1
chunk 3: c2, ab2

Step 3 (merge the chunks and keep the top ten appearances):

a4 b3 ab3 c2 abb1
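The three steps can be sketched as follows; the chunk boundaries here are uniform, unlike the worked example above, but the merged totals agree (function names are my own):

```python
from collections import Counter

def chunk_counts(words, chunk_size):
    """Steps 1+2: split into memory-sized chunks, count each independently."""
    for i in range(0, len(words), chunk_size):
        yield Counter(words[i:i + chunk_size])

def merge_top(counters, k=10):
    """Step 3: merge the per-chunk counts and keep the top k."""
    total = Counter()
    for c in counters:
        total.update(c)
    return sorted(total.items(), key=lambda kv: kv[1], reverse=True)[:k]

words = "a b a ab abb a a b b c c ab ab".split()
print(merge_top(chunk_counts(words, 5)))
# → [('a', 4), ('b', 3), ('ab', 3), ('c', 2), ('abb', 1)]
# i.e. the example's "a4 b3 ab3 c2 abb1"
```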

---
Add the words to a hashmap with the count as the value. Then insert the entries into a max-heap, so that you can directly pop the 10 maximum elements. Running time = insertion of n words into the hashtable + insertion of the entries into the max-heap + selecting the top 10 from the max-heap = O(n) + O(n log n) + O(10 log n) = O(n log n).
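A sketch of this answer using Python's `Counter` and `heapq` (negated counts simulate a max-heap, since `heapq` is a min-heap; the function name is my own):

```python
from collections import Counter
import heapq

def top10(words, k=10):
    """Count with a hashmap, then pop the k largest from a max-heap."""
    counts = Counter(words)                      # O(n) hashtable pass
    heap = [(-c, w) for w, c in counts.items()]  # negate counts for a max-heap
    heapq.heapify(heap)                          # O(k) for k distinct words
    return [(w, -c) for c, w in
            (heapq.heappop(heap) for _ in range(min(k, len(heap))))]

print(top10("b a b c b a".split()))  # → [('b', 3), ('a', 2), ('c', 1)]
```

Building the heap with `heapify` over the k distinct entries is O(k), slightly better than the O(n log n) of n individual insertions, though the overall bound is unchanged by the counting pass.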

