How can we implement spell che

Amazon Interview Question for SDE1s

2

of 2 votes

17
Answers
How can we implement spell checker.
- rapirapp July 04, 2013 in United States | Report Duplicate | Flag | PURGE
Amazon SDE1 Algorithm Data Structures

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: Phone Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 6 vote

Since we are required to just check whether a spelling is correct or not, I would use a hashmap. Just insert all the words from the dictionary in a hashmap and consequently the lookup would take contant time.
I would use trie only when I have to suggest the spelling using the string that user has entered as a prefix. Lookup in trie takes time proportional to the length of the query string.

- Epic_coder July 04, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

In this case hashmap will take a lot of memories.
Also take more time because of large No. of data in it.

Can we think for any optimal solution...??

- Rahul July 05, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 votes

Oxford English Dictionary is known to have about 200,000 words. If you assume the average word size is 5, you need to store 200,000*5 characters in the main memory which is about 1 megabyte of data. Lets say our hash table is really very sparse so add a factor of X10, this pushes the memory requirement to 10megabyte. Almost all modern data computers and mobile phones have at least this much main memory.

@Rahul, lookup in hashmap take O(1) time. Insertion may take larger time but it's a one time cost.

- Epic_coder July 05, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

TRIE is best for storing the words as there could be multiple words to one key, so retrieval won't be possible in constant time.

- seeker July 06, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

Trie is best in approach as its complexity will be of O(1) and same will be for hashMap but it will be faster in Trie if your enter next character with time gap as in Trie you have to traverse to next node where as in hashMap you have to do search again.

Correct me if I am wrong.

- Naveen July 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Time complexity of Trie is O (key length) on add and search operations where as in hashMap searching keys will be O(1) in the best case otherwise it depends on the number of elements attached to the same hashcode.

- mail.nuwan June 29, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Hashmap is used to store all the words and then the checking of spelling is done based on whether that value is in the hashmap or not.

- vgeek July 05, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Trie data structure can be used.IF that word's spelling is wrong then the function returns False.or else True

#include<stdio.h>

struct trie {
    struct trie *edges[26];
}

// Initailizes memory for trie t
trie *initializeTrie(struct trie *t) {
    t=(trie *)malloc(sizeof(trie));
    for(int index=0;index<26;index++)   {
        t->edges[index]=NULL;
    }
    return t;
}


// Adds a word to the existing trie
trie *addWord(struct trie *t,char *word)    {
    
    if(word[0]=='/0')   {
        
    }
    else    {
        int index=t[0]-'a';
        if(t->edges[index]==NULL)   {
            t->edges[index]=initializeTrie(t->edges[index]);
        }
        
        t->edges[index]=addWord(t->edges[index],++str);
    }
    return t;


}

// Checks whether the word's spelling is correct or not

bool spellChecker(trie *t,char *word)   {
    if(word[0]=='/0')
        return true;
    else    {
        int index=word[0]-'a';
        if(t->edges[index]!=NULL)   {
            spellChecker(t->edges[index],++word);
        }
        else
            return false;
    }
}

// Driver function to implement Spell Checker
int main()  {
    
    Trie *t=(trie *)malloc(sizeof(trie));
    // Suppose the dictionary consists of four words
    t=addWord(t,"Sibendu");
    t=addWord(t,"Hello");
    t=addWord(t,"Bhubaneswar");
    t=addWord(t,"India");
    bool check1=spellChecker(t,"Sibendu");  // Returns true
    bool check2=spellChecker(t,"Subendu");  // Returns false
    
}

- Sibendu Dey July 05, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

t->edges[index]=addWord(t->edges[index],++str <== typo);

- Anonymous July 12, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

Use Trie DataStructure :P

- Harry :) July 04, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Here is the comparison between hashmap and trie for a spell checker
1)Time complexity => o(length of the word) for trie always .For hashmap it will be mostly o(1) but it can be more than that also .Moreover we are reading the word in a spell checker so it trie will be better.
2)Space complexity => Hashmap will take more space because in Trie there will be words which are present in the path to other words which will save space e.g "for" will be present in the path of "formula" which will save space

- 3139a1m July 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

As mentioned by others, trie is better compare to hashmap from memory perspective.
- If number of words in consideration are low, hash map is good solution
- If the words used have common prefix, trie is good
- If you have huge number of words, ternary search tree is good options
Stolen from : stackoverflow.{}com/questions/10017808/best-data-structure-for-dictionary-implementation

- shsf July 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Easiest way is to create a list of words that are in a dictionary and one or two edits away from the word in question.

One edit away means that you can:
1. Insert an alphabet at any position in the string
2. Delete a string alphabet from any position in the string
3. Exchange a pair of string alphabets from any two positions in the string

The better way is to create an inverted index of character n-grams (typically bigrams) and use Jaccard similarity to obtain the top K suggestions

- Rediscovery July 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

BK tree looks promising. See here
blog.notdot.net /2007 /4 /Damn-Cool-Algorithms-Part-1-BK-Trees

- Mithya July 18, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 vote

trie will be having too much complexity implement it with kgrams and lehvenstein distance

- Sourabh July 05, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 vote

trie will be having too much complexity implement it with kgrams and lehvenstein distance

- Sourabh July 05, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-2

of 2 vote

trie will be having too much complexity implement it with kgrams and lehvenstein distance

- Sourabh July 05, 2013 | Flag Reply

CareerCup

Amazon Interview Question for SDE1s

Books

Videos

Resume Review

Mock Interviews