Microsoft Interview Question
Software Engineer / DevelopersTeam: Bing
Country: India
Interview Type: In-Person
Pre-associate hashcode for each character (say a prime number). Say 'post' is the string for which hashcode needs to be calculated. Then the hashcode can be something like h(p)+h(o)+h(s)+h(t).
So, for any anagram of 'post', say 'stop', hashcode will be the same.
Store the set of strings as a hashmap <hashcode, List<Strings>>
Now we can retrieve all the anagrams for the string by getting the hashcode for the strings and fetching all list of strings from the above hashmap. It would be O(1) complexity once the map is constructed.
Calculating the hash of a string has complexity proportional to the size of the string, not O(1).
1. Store each entry in a hashmap, where the key is the sorted string entry. list is for all the words with the same key.
Eg. Replays will be stored as map.put(, "aelprsy", list.add("replays"))
2.A = Sort the characters of the key string.
2. check to see if A exists in the hashmap.
You can do it with a lot less space by not even bothering with the hashish and just sorting each input string as you go and comparing it to a pre-sorted target string.
1 #define CHAR_COUNT 26
2 // Returns a pointer to beginning of anagram of str2 in str1 (NULL if not found)
3 const char *find_anagram(const char *str1, const char *str2) {
4 int *summary = new int[CHAR_COUNT];
5 memset(summary, 0, sizeof(int) * CHAR_COUNT);
6
7 // Compute length of str2, make sure str1 is not too short and compute
8 // summary of str2 (count of each character)
9 int len = 0;
10 while(str2[len])
11 if(!str1[len])
12 return NULL;
13 else
14 summary[toupper(str2[len++]) - 'A']++;
15
16 int *temp_summary = new int[CHAR_COUNT];
17 int base = 0, overflowed_chars = 0;
18 while(str1[base + len - 1]) {
19 memcpy(temp_summary, summary, sizeof(int) * CHAR_COUNT);
20 int tot = 0;
21
22 for(int i=0 ; ; i++) {
23 char c = toupper(str1[base + i]);
24
25 if(summary[c - 'A'] == 0) {
26 base += (i + 1);
27 break;
28 }
29
30 temp_summary[c - 'A']--;
31 if(temp_summary[c - 'A'] == -1) {
32 overflowed_chars++;
33 }
34
35 if((++tot == len) && (overflowed_chars == 0))
36 return str1 + base;
37
38 if(i - len > 0) {
39 char d = toupper(str1[base + i - len]);
40 temp_summary[d - 'A']++;
41 if(temp_summary[d - 'A'] == 0)
42 overflowed_chars++;
43
44 tot--;
45 }
46 }
47 }
48 return NULL;
49 }
1 #define CHAR_COUNT 26
2 // Returns a pointer to beginning of anagram of str2 in str1 (NULL if not found)
3 const char *find_anagram(const char *str1, const char *str2) {
4 int *summary = new int[CHAR_COUNT];
5 memset(summary, 0, sizeof(int) * CHAR_COUNT);
6
7 // Compute length of str2, make sure str1 is not too short and compute
8 // summary of str2 (count of each character)
9 int len = 0;
10 while(str2[len])
11 if(!str1[len])
12 return NULL;
13 else
14 summary[toupper(str2[len++]) - 'A']++;
15
16 int *temp_summary = new int[CHAR_COUNT];
17 int base = 0, overflowed_chars = 0;
18 while(str1[base + len - 1]) {
19 memcpy(temp_summary, summary, sizeof(int) * CHAR_COUNT);
20 int tot = 0;
21
22 for(int i=0 ; ; i++) {
23 char c = toupper(str1[base + i]);
24
25 if(summary[c - 'A'] == 0) {
26 base += (i + 1);
27 break;
28 }
29
30 temp_summary[c - 'A']--;
31 if(temp_summary[c - 'A'] == -1) {
32 overflowed_chars++;
33 }
34
35 if((++tot == len) && (overflowed_chars == 0))
36 return str1 + base;
37
38 if(i - len > 0) {
39 char d = toupper(str1[base + i - len]);
40 temp_summary[d - 'A']++;
41 if(temp_summary[d - 'A'] == 0)
42 overflowed_chars++;
43
44 tot--;
45 }
46 }
47 }
48 return NULL;
49 }
What does anagram of a sting implies?
- abhishekatuw January 22, 20121. The string is made of many words and the anagram is shuffled in terms of words.
Or
2. The string is shuffled in terms of characters.
For first case a trie can be used along with a small supporting array type of ds to keep count.
For second case a character based hash can be used with the count of occurance.