Google Interview Question
Software EngineersCountry: United States
Interview Type: Phone Interview
Just detailing what you have done
May be we can validate based on the size of legal string
For example ::
size of given string is n.
Assume Legal string size is n -- only one possible
n-1 -- 2 possible
n-2 -- 3 ..
.
.
if we find the legal string we can stop and say it is the largest because we do not have to look for smaller strings.
Worst case would be when we do not find the string :O(n^2) but average case would be lot faster.
Probably using the built in comparison function for Strings may resolve the issue with "ahello" as our test String. So having the method as:
bool isWordLegal(std::string word)
{
if(word.compare("hello")) return true;
return false;
}
This would fail when you have a string like: "abchealloxyz", although this still contains "hello".
Unless we actually know the legal words and are only given a method to make a boolean check we cant do better than n^2. It seems like something was omitted from this question.
This is the DP solution using memorized Map with recursion. Assuming we given a dictionary of words.
public class SegmentString {
Map<String, String> memorized = new HashMap<String, String>();
public String segmentString(String input, Set<String> dictonary){
if(dictonary.contains(input))return input;
if(memorized.get(input) != null){
return memorized.get(input);
}
int length = input.length();
for(int i = 1; i < length; i++){
String prefix = input.substring(0,i);
if(dictonary.contains(prefix)){
String suffix = input.substring(i, length);
String segmentString = segmentString(suffix, dictonary);
if(segmentString!=null){
memorized.put(input, prefix + " " + segmentString);
return prefix+" "+segmentString;
}
}
}
memorized.put(input, null);
return null;
}
public static void main(String[] args){
SegmentString seg = new SegmentString();
Set<String> dictonary = new HashSet<String>();
dictonary.add("hello");
dictonary.add("ate");
dictonary.add("a");
dictonary.add("an");
dictonary.add("ant");
dictonary.add("i");
System.out.println(seg.segmentString("iateanant", dictonary));
}
}
Without a preprocessing knowledge and limited control over legal words data structures, we can optimize the iterations
1. Short Circuit when first match for specific word length found,
2. Terminate program when first legal word found.
3. Start with longest word to shortest word.
public static string LongestWord(string word){
string result= default(string);
if(string.IsNullOrEmpty(word)) { return word;}
for(int wl = word.Length; wl > 1; wl--){
for(int si = 0; si <= word.Length - wl ; si++){
var eword = word.Substring(si,wl);
if(IsLegalWord(eword)){
result = eword;
break;
}
}
if(result != default(string)) break;
}
return result;
}
I don't think we can do better than O(n^2), but we can do some optimizations as follows:
1: The first loop starts at zero, indicating the start of the string; but we do not need to go all the way to end; once we find a legal string with length K, then we know the start must be at least (K-1) indices before the end of the string
2: For the second loop, indicating the end of the string, start from the end and go backward, two notes here
2.1: once you find a legal word, you can break from the loop
2.2: you don't need to go all the way back to "i",
Here's the code
bool IsLegalWork( const char * s, int start, int end );
void FindLongestLegal( const char * s )
{
int n = strlen(s);
int i,j;
int LegalStart = 0;
int LegalLength = 0;
for ( i = 0 ; i < n - LegalLength ; i++ )
{
for ( j = n-1 ; j > i + LegalLength ; j-- )
{
if ( LegalWord( s, i , j ) )
{
if ( LegalLength < j - i + 1 )
{
LegalStart = i;
LegalLength = j - i + 1;
}
break;
}
}
}
}
I have used regular expression to match the words that are present in the dictionary with the string given. Here is my code and I believe that it takes less than O(n^2) time. Btw regex is very fast i believe.
import re
def getsegstring(S,word):
dictionary = ['hello','ate','a','an','ant','i']
legalwords = {}
finalwords= []
if len(S)>0:
for i in dictionary:
match =re.compile(i)
if len(re.findall(match,S))>0:
legalwords[re.findall(match,S)[0]] = len(re.findall(match,S)[0])
maxi = max(legalwords.values())
for k,v in legalwords.iteritems():
if v == maxi:
finalwords.append(k)
return boolval(finalwords,word)
else:
return False
def boolval(finalwords,word):
if word in finalwords:
return True
else:
return False
print getsegstring("iateanant","ant")
"""
Result : True
"""
Stlll O(n^2). Greedy fashion might make it a bit faster.
- cdhsiung November 07, 2015