## Amazon Interview Question for Software Engineer / Developers

Country: United States

Comment hidden because of low score. Click to expand.
11
of 11 vote

here's a O(n*m) (n=no of words in the dictionary, m = max length of the word)complex solution with space O(52). Perform the following steps:
1. Take all the words from the lis with length same as the given word or +/- 1
2. Create a letter count array/map of the given word (for eg if the word is "coverage" then a=1, c=1, e=2, g=1, r=1, v=1) (say a[26] contains the letter count from a to z)
3. take each word from step 1. create its letter count array (say b[26]
4 for i = 1 to 26 sum += abs(a[i]-b[i]). each word with sum value = 1 or 2 will be the required word.

Comment hidden because of low score. Click to expand.
0

This looks like a cool solution. I was trying to work something out using 3 regular expressions, but I think your method is the way to go.

However, it will fail for terms which are anagrams. As an example, if we search for "Dictionary", the word 'Indicators' will be counted as a word which needs only 1 letter changed to get an exact match.

How can we change your technique to account for this?

Comment hidden because of low score. Click to expand.
0

@Michael - it is supposed to find "indicators" if input is "dictionary"

In the question, it is clearly mentioned that "spelling order is not important."

see my C# implementation below

Comment hidden because of low score. Click to expand.
0

I see now. In that case, the method which you and The Artist have posted is definitely the correct solution.

Comment hidden because of low score. Click to expand.
0

There is a small issue in the solution. If you are accepting words with sum = 2, you might end up taking words which are different by 2 letters also.

Comment hidden because of low score. Click to expand.
0

I think your algorithm does not consider "The list is already sorted alphabetically."
Sorted list must benefit your algorithm.

Comment hidden because of low score. Click to expand.
0

@Anjana: sum=2 should be considered when the size of the dictionary word is same as the given string

@Eric: that could be a decoy. I did not find much help from the sorted list.

Comment hidden because of low score. Click to expand.
0

Couldn't get the part where sum=2...i mean if we find exact match, sum=0, differ by 1 then sum=1. How does sum=2 fit in. Please explain.

This might be a little dumb question but nonetheless, would be great if you would explain.

Comment hidden because of low score. Click to expand.
0

@Jas: consider the example given in the question where you convert 'coverage' to 'converge'. In that case the sum will be two. Basically here you are taking one character of the given string and replacing it with any other alphabet.

Comment hidden because of low score. Click to expand.
0

Step 4 is wrong.
Suppose we have words "a" and "b".
The arrays would be a[][]=[1][0] and b[][]=[0][1].
The result would be sum=0 which means that "a" and "b" are completely equals.

Comment hidden because of low score. Click to expand.
0

I am not sure why this question would be asked(as you have to traverse the whole dictionary of million words). The better question to ask would be how would you design the dictionary so that you can print the words efficiently.
The answer to that would be :
Have your dictionary stored in key , value pair : key : count[26]array, value list of all strings that match (basically all anagrams will be in this string)
Now given word with count array G[]
There are the following three cases :
1) anagram of Given(G) + An extra alphabet
2) anagram - 1 alpha
3) change one alphabet in G
Now solution for above 3 cases :

``````1) for i = 0 to 25 : {G[i]++ ;  printlist(hash(G)); G[i]--; }
2) Take old G, not from 1) . for i = 0 to 25 : {if(G[i]) {G[i]-- ;  printlist(hash(G)); G[i]++; }
3) for i=0 to 25
{
G[]= given;
if(G[i])
{
G[i]--;
for j=0 to 25 -> skip j=i;
{
G[j]++;
printlist(hash(G))
G[j]--;
}
G[i]++;
}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

and i think my above solution will work for any difference just the sum value need to be modified

Comment hidden because of low score. Click to expand.
0
of 0 vote

public List<string> FindWord(List<string> dictionary, string word)
{
List<string> output = new List<string>();

int[] charCount = new int[26];
char[] input = word.ToCharArray();

for (int i = 0; i < input.Length; i++)
{
charCount[input[i] - 'a']++;
}

int[] temp = new int[26];

foreach (string _word in dictionary)
{
if (Math.Abs(_word.Length - input.Length) == 1 || Math.Abs(_word.Length - input.Length) == 0)
{
charCount.CopyTo(temp, 0);

char[] _input = _word.ToCharArray();

for (int i = 0; i < _input.Length; i++)
{
temp[_input[i] - 'a']--;
}

int sum = 0;

for (int i = 0; i < temp.Length; i++)
{
sum += Math.Abs(temp[i]);
}

if (sum <= 2)
{
}
}
}

return output;
}

Comment hidden because of low score. Click to expand.
0
of 0 vote

How about a backtracking solution? Lets say we build a trie of the one million words. lets say input word is of length n. We check all the branches and keep track of number of mismatches and if mismatch count is less than permitted we allow traversing the trie else we exit the branch and check another branch using backtracking.

This approach however checks for words with only same length. I am checking how this can be modified to words with different length.

``````backtracking(root, "coverage", 1);

backtracking(trie node,string input_word, int mismatch_allowed)
{
if (mismatch_allowed < 0)
return;
if (input_word.length == 0)
{
print current node details (i.e. complete word till this node);
return;
}

for character_branch in (all_branches_at_root)
{
if (character_branch == input_word[0])
{
// first char matches, check next chars
backtracking(character_branch_node, "overage", mismatch_allowed);
}
else {
// character mismatch reduce the count
backtracking(character_branch_node, "overage", --mismatch_allowed);
}
}
}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

I think the key to solving this problem is to use the information that the list of words is sorted (i.e.: you already have the word list preprocessed to help you with the query).

Consider (for complexity computation):
N - number of words in the word list (max. 1 million)
L - size of a word (max. 40)
A - size of the alphabet (= 26)

For 1 letter distance you can use the following algorithm:
1. Generate all the possible words that are 1 letter distance away from the query word
- this has the complexity O(L*A)
2. Look up each of the generated words in the word list using a binary search:
- the look up of an individual word is O(L*logN)
- we have O(L*A) lookups => final complexity is O(L^2*A*logN) which is roughly (not considering the hidden constant) about 832.000 operations which is better than O(L*N) which is roughly 40 million operations.

For distance = 2, this algorithm performs worse than the O(L*N) version.

Comment hidden because of low score. Click to expand.
0

i think you missed the point in the question where it says that the order does not match which can result upto a max of (40!)/(2^14) possible combinations which is humungous.

Comment hidden because of low score. Click to expand.
0

Although, if the order doesn't matter, I don't see how the fact that the list is sorted may be helpful.

Comment hidden because of low score. Click to expand.
0
of 0 vote

The question is more interesting and of practical use if the ordering of letters is important.

Comment hidden because of low score. Click to expand.
0
of 0 vote

@anjana

In the algo presented by The Artist

a minor modification is needed such that it doesnt take into account of the words such as (sumit with sumitas)

what we can do is make the changes in that space i.e O(26)
for example the word sumit has to be searched
so s-1,u-1,m-1,i-1,t-1

when the word sumita is encountered all the fields in O(26) is zero except a which is -1,
so this is our word.

now if we take sumi , then only value corresponding to field t is -1.

but if we take a word such as samit then field corresponding to u would be 1 and a would be -1,

in a iteration corresponding to the word length there can be only a pair of 1 & -1;
we wont consider if there is 1&1 or -1 & -1 which might arise in the example of (umi or sumitaf) which would be the case for difference in length corresponding to 2.

Comment hidden because of low score. Click to expand.
-1
of 1 vote

We can solve the problem by the approach artist suggested. But we can reduce the time complexity from O(mn) to O((number of words with size+number of words with size-1+number of words with+1))m.

Simply apply the binary search on the list given to find the first occurance of the word with size-1, It would take O(logn)(i.e. log one million= approximately 20). From that index onwards you can traverse the list to until you reach a word with size+2. Since the list is already sorted, We don't need to traverse the entire list.

Comment hidden because of low score. Click to expand.
0

Sorry for my ignorance. It doesn't work. I just confused of alphabetical sorting

Comment hidden because of low score. Click to expand.
-1
of 1 vote

Couldn't get the part where sum=2...i mean if we find exact match, sum=0, differ by 1 then sum=1. How does sum=2 fit in. Please explain.

This might be a little dumb question but nonetheless, would be great if you would explain.

Comment hidden because of low score. Click to expand.
-1
of 1 vote

Couldn't get the part where sum=2...i mean if we find exact match, sum=0, differ by 1 then sum=1. How does sum=2 fit in. Please explain.

This might be a little dumb question but nonetheless, would be great if you would explain.

Comment hidden because of low score. Click to expand.
0

@fallen_angel: see the comments to my original response.

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.