Microsoft Interview Question
Software Engineer in TestsHave you thought KMP could help here? We need extra space equal to pattern (str) size and without storing the source string we could tell if str is present in it.
actually now that I think about it I think the interviewer was talking about KMP. (but he was being very vague with his requirements ).
you do still need extra space equal to the search pattern for the jump table. but every character that comes in you don't always need to do naive search on the order of the entire search pattern.
there is a version of KMP that works on streams.
In most of the cases of "stream of text" type of problem, the interviewer is asking if you can create a DFA/NFA for the pattern. All you have to do is create a state machine, whose final state is 'c' and on transition from B (which can be reached using transition from B).
State1 --transition--> State2
[start]--a-->[A]
[A]--b-->[B]
--anything else-->[start]
[B]--c-->[terminal]
[B]--a-->[A]
[B]--c-->[terminal]
--anything else -->[start]
[terminal]--anything-->[terminal]
@bytestorm: thanks for the idea. Can you please provide the code for the same also? As no additional space is needed, how creating a state machine would help in not having extra space?
This is some pseudo c of what it might look like if you were specifically coding for a pattern unlike abcabcb where you should match in abcabcabcb (this would not match because the second c kicks it back to the beginning of the state machine). Should get enough of the idea I suppose, or I'm just plain doing something wrong
char c;
int charCount = 0;
while(c=getchar()) {
switch(c) {
case pattern[charCount]:
charCount++;
break;
case pattern[0] :
charCount = 1;
break;
default:
charCount = 0;
break;
}
if(charCount == patternLength)
return getchar()>0;
}
return false;
I think this will not work perfectly because in your transition to the start state again you are losing some substrings in your stream.
Please try and trace this:
stream = aaab pattern = aab
The stream contains the following substrings
{aaa}, {aab}. The first two characters of substring {aab} are part of the substring {aaa}. The transition to the start state will not take care of that.
I think the main purpose of the question is to search for substring in a stream without storing the whole stream OR using another substring equal to the pattern you are looking for. Since The information you need will be in the pattern itself without using a second storage space.
int searchsub(char *str,char *sub)
{
int i=0,flag=1,index=-1;
int j=0,k=0;
for(i=0;str[i]!='\0';)
{
printf(" %d %c %c\t",i,str[i],sub[j]);
k=i;
if(str[i]!=sub[j])
i++;
else
while(sub[j]!='\0')
{
printf(" In loop %c %c\n",str[i],sub[j]);
if(str[i]!=sub[j])
{
j=0;
flag=2;
break;
}
else
{
i++;
j++;
flag=0;
}
}
if(flag==0)
{
index=k;
break;
}
}
return index;
}
//My code read only one char at a time from the stream and doesnot require any extra
//storage. I only store the pattern to search for.
//The main idea depends on transtion states and taking into account several
//different possible cases
static void Main()
{
Console.Write("Pattern: ");
string pattern = Console.ReadLine();
Console.WriteLine("Input your stream ended with Enter key");
char ch;
int patternIndex = 0;
int currentState = 0;
int stopState;
while ((ch = (char)Console.Read()) != '\r' && patternIndex !=
pattern.Length)
{
if (pattern[patternIndex] == ch)
patternIndex++;
else
{
stopState = patternIndex;
currentState++;
patternIndex = 0;
while (currentState < stopState)
{
int tempState = currentState;
while (pattern[tempState] == pattern[patternIndex] &&
tempState < stopState)
{
tempState++;
patternIndex++;
}
if (tempState == stopState && ch == pattern[patternIndex])
{
patternIndex++;
currentState--;
break;
}
else
{
currentState++;
patternIndex = 0;
}
}
if (currentState == stopState)
{
currentState = 0;
if (ch == pattern[patternIndex])
{
patternIndex++;
currentState--;
}
}
}
}
if (patternIndex == pattern.Length)
Console.WriteLine("Found....!!!!");
else
Console.WriteLine("NOT Found....!!!!");
}
Why not just do some thing like this
int hasContainsString(char *str){
int i = 0, prev=0, len = strlen(str);
char c;
while( c = getchar())
{
if( c == str[i] && prev || c == str[i] && i == 0 )
{
if (i == len)
return 1;
i++; prev=1;
}
else
{
i = 0; prev =0;
}
}
}
<pre lang="" line="1" title="CodeMonkey97107" class="run-this">public static void main(String []args)
{
String Pattern = "aaabbddddbadfbbbassb";
String strMatch = "aaa";
int strLength = strMatch.length();
int final_count=3;
int count=0,j=0,i=0;
while(true)
{
if(strMatch.charAt(j++)==Pattern.charAt(i))
{
count+=1;
if(count==strLength)
{
System.out.println("String found at index " + (i-1));
return;
}
i++;
}
else
{
j=0;
i++;
count=0;
continue;
}
}
}
</pre><pre title="CodeMonkey97107" input="yes">
</pre>
Here disregard int final_count=3; In addition to that instead of considering we have been given String Pattern = "aaabbddddbadfbbbassb";...We can assume that we are getting this string from the console or some file etc. I would like to have your views on this code friends. I am a naive programmer. will appreciate your feedback !! :=)
how about Rabin Karp method ... compute the hash of pattern and keep comparing it with hash of incoming stream ..
- choupsey October 17, 2011