Interview Question
Country: United States
There is one flaw with this solution, what if the 2 strings contain the same letters but in different amounts?
"An anagram is a type of word play, the result of rearranging the letters of a word or phrase to produce a new word or phrase, using all the original letters exactly once"
So this would break your solution, which would return true.
str1 = "aaabbbccc"
str2 = "abc"
You need at least one more pass to check if there are any non-zero entries in your array.
This is essentially how I would do it with hashes in ruby
list1 = "this was it"
list2 = "it was this"
h = new Hash
list1.each do | s |
if h.containsKey(s)
h[s] = h[s] +1
else
h[s] = 1
end
end
list2.each do | s |
if h.containsKey(s)
if h[s] - 1 < 0
return false
end
h[s] = h[s] - 1
else
return false
end
end
# one last step to go through values in hash and make sure they are all 0
# since each key is unique we only need to check each letter once for count
list1.unique.each do | s |
if h[s] != 0
return false
end
end
return true
Making the solution O(n+n+m) where m is less then n so essentially O(n). Probably better space storage as well since we aren't storing 256 only letters that are used in the string.
1.sort the both string and in one loop compare each character, if each character matches, then both strings are anagram of each other. Time Complexity O(nlogn), Space complexity O(1)
2. Create a hashmap with key as character and value as counter. Traverse the first string and insert character in map if it does not exist, otherwise increment the counter.
Now, for second string, loop through each character, and search in the map and if any character does not exist, then return false, if exists then reduce the value of that key. once looping over second string is over, iterate through hashmap and all values stored in the map should be zero. then return true, otherwise return false.
psudo code:
bool isAnagram(char *str1, char *str2)
{
if (strlen(st1) != strlen(str2))
return false;
int charIndex[256] = {0};
int i = 0;
while (*str1 != '\0') {
charIndex[*str1] += 1;
charIndex[*str2] -= 1;
str1++, str2++;
}
while (i < 256) {
if (charIndex[i] != 0) {
return false;
}
i++;
}
return true;
}
what about the next algo
char sum = 0;
for(int i = 0;i < strlen(s1);i++) {
sum += (s1[i] ^ s2[i]);
}
if(sum == 0 )
return true;
else
return false;
So if s1 is just only permutation of s2 characters it meants that each symbol c1 in s1 has equal in s2. So c1 xor c2 == 0.
I'm not sure I understand this solution. Here's a counterexample:
s1 = "ab"
s2 = "ba"
a xor b = 1
b xor a = 1
So the sum = 2 which returns false? Maybe you meant to sort the strings first but that is the largest factor in the runtime.
Oh,thanks. I've found a mistake in my solution.
Actually i meant that
char sum = 0;
for(int i = 0;i< strlen(s1);i++){
sum ^= s1[i];
sum ^= s2[i];
}
So each element of s1 which present in s2 will exclude both from sum by xor. It was my idea.
So a ^ b ^ a ^ b == 0 in your example. It will be always true because the order of xor doesn't matter.
1. Check if the lengths are same. If so, continue to step 2, else return false.
2. Insert each character of first string into std::set<char>. This is C++ specific. Other languages might be having counterpart of set.
3. Iterate through each character of second string and do a std::set::find on the set created in step 2.
4. If std::set::find returns std::set::end at any time during iteration, then result is false and break the loop. Else return true.
Complexity nlogn
#include<iostream>
#include<stdio.h>
#include<string.h>
using namespace std;
int main()
{ int z=0,y=0,c=0,p,i,q;
char s1[]="ram is going to agra";
char s2[]="agra to going is ram";
if((q=strlen(s1))!=strlen(s2))
{printf("false");
}
p=q-1;
for(i=0;i<p;i++)
{
z+=s1[i];
y+=s2[p-i];
printf("%d, %d\n",z,y);
if((s1[i]==' '&&s2[p-i]==' '))
{ c++;
if(z==y)
{ c--;}
z=0;
y=0;
}
}
printf("%d",c);
if(c==0)
{printf("string is panagram");}
else{printf("not");}
system("pause");
}
The most efficient algo is actually going to depend on the size of your character set relative to the size of the string, there is a way to do it in O(n) time with O(1) space(much less than a hash map), you simply use the concept of "buckets", here is the code in Java:
import static org.junit.Assert.*;
import org.junit.Test;
public class AnagramFinder {
public static boolean areAnagrams(String s1, String s2) {
if(s1.length() != s2.length())
return false;
//Assume ascii set
int[] s1Counts=new int[256];
int[] s2Counts=new int[256];
for(int i=0;i<s1.length();i++) {
s1Counts[s1.charAt(i)]++;
s2Counts[s2.charAt(i)]++;
}
for(int i=0;i<s1Counts.length;i++) {
if(s1Counts[i] != s2Counts[i])
return false;
}
return true;
}
@Test
public void testAreAnagrams() {
assertTrue(areAnagrams("that","hatt"));
}
@Test
public void testNotAnagrams() {
assertFalse(areAnagrams("that","hart"));
}
}
This is faster and takes less space than the hashmap(though both are O(n) time and O(1) space, in practice the above is faster), but this tends to fall apart if you have larger char sets.
Another even less practical(though more space efficient) method is to assign each letter a prime value('a' is 2, 'b' is 3, 'c' is 5 etc) then multiply all those together. This just requires 2 BigNumber objects(and a lookaside table to translate the characters into their primes)....theoretically if you had to calculate a lot of anagrams simultaneously with a large ROM but very little RAM you could use that method, but somehow I doubt that's really practical :P
thats nice, but your idea is based on ASCII alphabet. How about if you using UTF16. a lot of languages and characters, So it means that your memory grows O(n), not the O(1).
Did you bother to read the comment before replying? Not only did I mention the character set TWICE, your comment is just plain wrong, memory does not grow O(n) if you use UTF-16, you still do it in constant memory, you just need a lot more of it. It has NOTHING to do with the length of the string.
public boolean anagrams(String a, String b)
- Anshul August 08, 2013{
if(a.length()!=b.length())
{
return false;
}
int [] letter=new int[256]; //assumption that character set was ASCII
char[] tmp=a.toCharArray();
for(char c:temp)
{
letter[c]++;
}
for(int i=0;i<b.length();i++)
{
int c=(int)b.charAt(i);
if(--letter[c]<0)
{
return false;
}
}
return true;
}