Yahoo Interview Question
Software Engineer / DevelopersOur map should be of the form <word, pair<count, pointer>>. This pointer should be an index to a heap structure of size k (5 here). We should maintain a min - heap, every time the count of a word is greater than root of the heap we remove the minimum count seen and replace it with the current value and max-heapify. Cost of max-heapify is lg k. Even if we have to max-heapify after reading every word, the complexity will come out to be O(n lg k) whereas in case of sorting it'd be O (n lg n).
1)read the document and build a tree of words with its count
2)the bst should contain words in lexicographical ordering
3)increment each count of the word if it is fount in the tree otherwise add it in the tree
4)sort this tree.
novel approach but building a tree and searching for the contents of the tree takes O(log n) time.. hashing is preferable? Insertion and searching takes O(1)
~ read the document - O(n)
- son_of_a_thread August 03, 2011~ maintain a hashmap, and hash the strings as keys with initial count of 1 if the current term isnt present else increment the bucket count by 1 - O(1)
~ sort the map by values - O(n log n)