Amazon Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: Phone Interview
I wonder if this is correct as it says in the problem that min is three not exactly and only three chars.
If there is a 4 char substring that is repeating, there will always be a 3 char substring within this 4 char substring that repeats.
How much time does Hashmap take to compare the current substring with all other keys present in the map?
It might be faster to use a Trie as suggested by Vincent.
Here is a trie implementation. Please point out the errors and suggest improvements.
This considers "xxx" to be repeated in "xxxx"
#include<iostream>
#include<cstring>
using namespace std;
struct node
{
char val;
bool is_last;
struct node *child[27];
};
bool insert(node **root, char *word);
int main()
{
char *str = "abcxxxxxxabc", word[4];
node *root = NULL;
int i, len;
bool flag;
len = strlen(str);
word[0] = str[0];
word[1] = str[1];
word[3] = '\0';
for(i=2;i<len;++i)
{
word[2] = str[i];
flag = insert(&root, word);
if(!flag)
{
cout<<"First repeated string is "<<word<<'\n';
break;
}
word[0] = word[1];
word[1] = word[2];
}
return 0;
}
bool insert(node **root, char *word)
{
int i,j;
node *temp;
if(!(*root))
{
*root = new node;
for(j=0;j<27;++j)
(*root)->child[j] = NULL;
}
//cout<<"new root\n";
temp = *root;
for(i=0;word[i];++i)
{//cout<<i<<'\n';
//cin>>nn;
if(temp->child[word[i]-'a'])
{
if(i==2)
return false;
temp = temp->child[word[i]-'a'];
}
else
{
temp->child[word[i]-'a'] = new node;
temp = temp->child[word[i]-'a'];
temp->is_last = false;
for(j=0;j<27;++j)
temp->child[j] = NULL;
}
}
temp->is_last = true;
return true;
}
just modified a bit of the code for trie.
So there are 2 more things I'd like to mention:
1. is_last is redundant here
2. insert() function can insert strings of all lengths, but I have put checks according to length 3 only as checking for repeated string of length will suffice.
Better Use a Suffix tree and check if any node at depth >=3 has more than 1 children. I am not sure if in the trie implementation above, people are actually implementing suffix tree or not.
Oh, I get it, you will be only printing size 3 strings. takes O(N). Neat. But yes, if you have to print the max length common substring, suffix tree!
@Hill Billy, in case of longest common substring suffix tree is a better choice, though I would implement it using suffix array as it is easier to code.
Tries should be the way to go about it . HashMap will not solve the problem as question says min 3 chars and not exactly 3 chars
please elaborate how to go with tries, here we are given stream of characters not array of characters.
The question says "Given a String"
Stream is not mentioned in the question.
Even if it is a stream, you can save last two characters and after the current character arrives, just check for the 3 character string in the trie. We just need to save all distinct 3 character strings that have already come through the stream.
As soon as there is a repetition, we can return the answer.
@Kusum, cypher's comment on lovecoding's answer answers the question of minimum 3 and not exactly 3 characters
Algo:
1. You have last 2 characters of the stream saved in a buffer.
2. Append the current char to form string of length 3
3. Insert this into the trie.
If this string already exists in the trie, then return a signal of repeated string.
It does not use any concept of suffixes. It is simply comparing given string with earlier strings but using a tree for it. This is what we do in tries.
But I am taking space of 3 chars to store last 2 chars and current char. I am not using space for just one char.
Maintain a Map<Character, Integer>
loop through the the characters in the string, keep adding them one by one to the map such that the key is the character and the value is it's position within the string.
for all characters beyond the third character in the string, check to see if the character exists in the map.
if so, get it's position and start comparing the characters after that position. In the above example 'a' occurs at 0 and 7. So basically, you have to check to see if the next two characters after that position match or not.
Was this a phone interview? Or on-site?
public class Amazon2 {
public static void main(String []args)
{
String input = "edabcxedrrxedabcrr";
for(int i = 0;i < input.length();i++)
{
if(input.charAt(i) != input.charAt(i+1) && input.charAt(i) != input.charAt(i+2))
{
for(int j = i + 3;j < input.length();j++)
{
if(input.charAt(j) == input.charAt(i) && input.charAt(j + 1) == input.charAt(i + 1) && input.charAt(j + 2) == input.charAt(i + 2) )
{
System.out.println(input.charAt(j) +""+ input.charAt(j+1)+""+ input.charAt(j+2));
System.exit(0);
}
}
}
}
System.out.println("None found");
}
}
public static String findRep(String str)
{
int len=str.length();//computes length of string
for(int j=len/2;j>=3;j--)//start comparing with len/2 character length
{
for( int i=0;(i+2*j)<=len;i++)//check whether substring can be found
{
String source=str.substring(i,i+j);
String target=str.substring(i+j,len);
if(target.contains(source))
return source;//repeated string is found
}
}
return null;
}
public static void First(string str)
{
string Org = str;
char[] cArr = str.ToCharArray();
int count = 0;
for (int i = 0; i < cArr.Length && count < 1; i++)
{
if ((i + 2) < cArr.Length)
{
for (int j = i + 3; j < cArr.Length; j++)
{
if ((j + 2) < cArr.Length)
{
if ((cArr[i].Equals(cArr[j])) && (cArr[i + 1].Equals(cArr[j + 1])) && (cArr[i + 2].Equals(cArr[j + 2])))
{
Console.WriteLine("{0}, {1}, {2}", cArr[i], cArr[i+1], cArr[i+2]);
count++;
break;
}
}
}
}
}
}
public class FindFirstRepeatedString {
public static void main(String args[]) {
String main="abcxrrxcdrrxabcxrr";
String min3String="";
TreeMap<Integer,String> loca = new TreeMap<Integer,String>();
for(int i=0;i<main.length()-2;i++){
min3String=main.substring(i,i+3);
if(main.substring(i+1).contains(min3String))
loca.put(i+1+main.substring(i+1).indexOf(min3String), min3String);
else
min3String="";
}
if(!loca.isEmpty()) {
System.out.println("the first repeated String : "+loca.firstEntry().getValue());
} else {
System.out.println("No String found");
}
}
}
public static void main(String[] args) {
String str = "cccc";
Set<String> set = new HashSet<String>();
for(int i = 0; i <= str.length() - 3; i++){
System.out.println(str.substring(i,i+3));
if(!set.add(str.substring(i,i+3))){
System.out.println("Answer is " + str.substring(i,i+3));
break;
}
}
}
"xababcabcxab" in this case what is the expected output...is it "abc"..since it is repeated first or is it xab since it appears first in the string
firstRepeatedStringWithMinimumNChar(int n) {
String s = "abcxrrxabcxr";
HashMap<String, Object> map = new HashMap<>();
int i = 0, j = 0;
while (i < s.length()) {
j = i + n;
while (j < s.length()) {
String s1 = s.substring(i, j);
if (map.containsKey(s1)) {
System.out.println(s1);
return;
}
map.put(s1, null);
j++;
}
i++;
}
}
The simplest way IMHO is comparing neighbor strings of suffix array using inverted suffix array for ordering search
#include<iostream>
#include<conio.h>
#include<string>
using namespace std;
void find(string str,int p);
int main()
{
int p;
string str,str3;
cin>>str;
p=str.length();
find(str,p);
getch();
return 0;
}
void find(string str,int p)
{
string str1,str2;
int i,j,k,flag,t;
if(p>3)
{
for(i=0; i<p-1; i++)
{
str1=str.substr(i,3);
j=i+1;
for(k=j; k<p-2; k++)
{
str2=str.substr(k,3);
if(str1==str2)
{
cout<<str1;
flag=1;
break;
}
}
if(flag==1) break;
}
if(flag==0) cout<<"no repeatition";
}
else cout<<"string length is too less to find";
}
public class Repeating3DigitCharacter {
/**
* @param args
*/
public static void main(String[] args) {
String str = "abcrxabcxrxabcbcraaaa";
int start = 0;
int repatingCharacteres = 3;
HashMap<String, Integer> hm = new HashMap<String, Integer>();
System.out.println("str==>" + str.length());
while (start + repatingCharacteres <= str.length()) {
// System.out.println(start);
String substr = str.substring(start, start + repatingCharacteres);
if (hm.containsKey(substr)) {
hm.put(substr, hm.get(substr) + 1);
System.out.println(hm.get(substr) + "=" + substr);
} else {
hm.put(substr, 1);
}
start++;
}
}
}
Scala :
def repeatStr(s: String): String = s.toList match {
case Nil => ""
case x :: y :: z :: Nil => ""
case x :: y :: z :: rest if(rest.length) > 3 => if(rest.mkString.indexOf(x.toString + y+z) > 0 ) x.toString + y+z else repeatStr((y :: z :: rest).mkString)
case x :: y :: z :: rest => ""
}
Why use hashes and all that. Straight forward I can doo
#include <iostream>
#include <string>
#include <vector>
using namespace std;
string pattern(string _in)
{
vector <string> pat;
char dummy[] = {'a','b','\0'};
for( int i = 0u; i < _in.length()-1; i++)
{
dummy[0] = _in[i];
dummy[1] = _in[i+1];
pat.push_back(dummy);
}
int location_b;
int location_f;
int t =0;
bool condition = true;
for(int k = 0 ; k <pat.size(); k++)
{
for(int j =t+1; j < pat.size(); j++)
{
if(pat[k] == pat[j])
{location_b = k;location_f = j;condition =false;break;}
}
if(condition == false)
break;
}
condition = true;
int counter = 3;
while(condition)
{
if(_in.substr(location_b,counter) != _in.substr(location_f,counter)) { condition = false ;}
else {counter ++ ;}
}
return _in.substr(location_b,counter-1);
}
I did minimum 2 letters but that can be modified.
No HashMap and no trie :
This one is a simple DP.
Lol that rhymed.
Its elegent, but still not elegant enouth in that i'm having to write one more for loop at the end, only for one stupid condition when the repeated word appears at the far end. If some one can push it inside, I guess it would be perfect.
private static void printFirstRepeatedSubstring(String string, int k) {
int arr[] = new int[string.length()];
for ( int i=0 ; i<string.length() ; i++ ){
arr[i]=0;
for ( int j=i-1 ; j>=0 ; j-- ){
if ( string.charAt(i) == string.charAt(j) ){
arr[j]=1+(j>0?arr[j-1]:0);
}else{
if ( j>0 && arr[j-1]>=3){
System.out.println(string.substring(i-arr[j-1],i));
return;}
arr[j]=0;}
}
}
for (int i=0 ; i<arr.length ; i++ ){
if ( arr[i]>=k){
System.out.println(string.substring(i-arr[i]+1,arr[i]));}
}
}
Two methods:
1. Regard the input string as a very big number, compare them by integer number with groups (XOR can be used for comparation).
This is for the sake of CPU consuming.
2. Pick the 3 chars in sequence, suppose it is "abc", use the remaining chars to match the group "abc", record the current matching index for "abc", if the current char could not match the next char indicated by matching index, then reset the matching index.
void rep_str(char *str)
{
char * temp1,*temp2;
temp1=str;
temp2=temp1+1;
while(*temp1)
{
while(*temp2)
{
if((*temp1)==(*temp2) &&
*(temp1+1)==*(temp2+1) &&
*(temp1+2)==*(temp2+2))
break;
temp2++;
}
if(*temp2)
break;
temp1++;
if(*temp1)
temp2=temp1+1;
}
cout << " Repeated string is " << endl;
while(*temp1==*temp2 && temp2!=NULL)
{
cout << *temp1;
temp1++;
temp2++;
}
}
I am assuming in case of "aaaa" . "aaa" is repeated string
- loveCoding January 11, 2013