## Amazon Interview Question

**Country:**United States

Could u elaborate? I could only think of a O(n^2) solution with suffix tree. But this is easily achievable with DP in O(n^2).

Suffix tree is a good data structure for pattern matching. Look it up for thorough concept but in a nut shell, suffix tree stores all suffixes of a given string. for example : Algorithm has suffix : Algorithm, lgorithm, gorithm,...,hm,m. To build this tree takes O(n) time, n being length of string.

Now for checking if string(m) is substring of string(n), we do a string search on the suffix tree built earlier. this takes O(m) time. total O(n+m)

Suffix tree is an ok'ish (though more memory-consuming than necessary) algorithm in a general substring match, but a horrible algorithm for this problem since wildcards are allowed. How do you handle the wild cards in O(n) time?

Oh wait sorry I thought the problem statement said the string might contains wild cards in the sense that any character can replace it. In that case suffix tree is a good algorithm, though I'd prefer boyer-moore because it takes less space and, on average, is faster.

Couldn't you simply remove not alphabetic characters in a linear pass and then do a KMP substring search? O(N) time with O(P) extra space (Technically O(N+P)where N is the string to be searched and P is the substring with wild cards removed but since P must be strictly less than N it is O(N) overall

static boolean isSubString(String s1,String s2)

{

int start=0;

while(start<s1.length())

{

int cur=start;

int star=-1;

int mark=-1;

for(int i=0;i<s2.length();)

{

if(cur>=s1.length()) break;

if(s1.charAt(cur)==s2.charAt( i )||s2.charAt( i )=='.')

{

if(i==s2.length()-1)//last element

{

return true;

}

else

{

cur++;

i++;

}

}

else if(s2.charAt( i )!='*')

{

if(star!=-1)

{

i=start+1;

cur=mark+1;

mark++;

}

else break;

}

else//s2.charAt( i )=='*'

{

if(i==s2.length()-1)//last element

{

return true;

}

else//keep cur and advance i

{

star=i;

mark=cur;

i++;

}

}

}

start++;

}

return false;

}

Here's a generalized regex substring matcher. Supports things like a+ (1 or more 'a'), ?* and .* (zero or more wildcards), as well as .+ and ?+.

```
bool is_regex_substr(const char* s, const char* p)
{
if (0 == s || 0 == p) return false;
if (0 == *p) return true;
if (0 == *s && 0 != *p && '*' != *(p + 1)) return false;
if ('*' == *(p + 1))
{
if (is_regex_substr(s, p + 2)) return true;
while (*s != 0 && (*s == *p || *p == '.' || *p == '?'))
{
if (is_regex_substr(s + 1, p + 2)) return true;
++s;
}
return false;
}
if ('+' == *(p + 1))
{
while (*s != 0 && (*s == *p || *p == '.' || *p == '?'))
{
if (is_regex_substr(s + 1, p + 2)) return true;
++s;
}
return false;
}
if ('?' == *p || *s == *p) return is_regex_substr(s + 1, p + 1);
return false;
}
// usage
auto irs = is_regex_substr("careercup", "c?re+.*r+?+d*p");
```

Suffix tree

- AlgoAlgae December 02, 2014