Bloomberg LP Interview Question
Financial Software DevelopersIt seems to me that the phrase "a dictionary is given" is an ambiguous phrase. I would ask the interviewer what this phrase meant.
If it meant that a data structure representing the dictionary was provided, I would perform what ever lookup was appropriate for the data structure. I cannot imagine a complexity greater for such a lookup that would be greater than N and any attempt to fill an additional data structure would be at best N before the lookup could be performed.
If it meant that a file of words was provided, it gets pretty tricky. Are the number of entries in the file known? Is the format of the file known? Can the position of entry n/2 be easily calculated? If so, I would perform a binary search on the disk file. I am aware that this would require a lot of disk accesses, but I doubt that it would be expensive than reading the while disk file into some more accessible data structure.
Only if the format of the disk file made a binary search difficult would I read the data into a data structure.
If it were upto the candidate, I would represent the dictionary as a suffix tree.
In that case looking up any word is straight forward.
Actually the most query-efficient solution would be to have both a suffix and prefix tree. For an easy analysis, let's assume there we are in one of these cases:
1) 1 extra character somewhere,
2) 1 mispelled characters somewhere,
3) 1 missing character somewhere.
Therefore, the word can be represented as "(part contained in the suffix tree) (character x) (part contained in the prefix tree)". Walk down both trees and at some point we either won't be able to proceed further; intersect the sets containing the following characters and find out the value of x to fix the word.
I am going to use a simple greedy algorithm for the problem and gives possible predictions to choose from if the word is mispelled.
as the dictionary is given to us checking whether a word is mentioned in the dictionary is matter of time function check(word) say O(1).
Step1: check whether are there any matches for the word in the dictionary if not continue else end
Step 2:
for(i < length of word)
for( j from a to z)
word[i] = j
if( check(word) = = TRUE)
possibility[i][j] = word[i];
output possibility[i][j];
The Program runs for 26 * length of the word(n) i.e O(n)
I told I'll implement aa hash table for the dictionary. And check the misspelled word in the dictionary by this hashfunction:
- Siva December 30, 2009hashval = (127*hashval()+key.charAt(i))%N (N is a big prime number 16908799)