Amazon Interview Question
SDE-2sCountry: India
Interview Type: In-Person
If you assume that the sentence is valid (i.e. has all valid words), then the difficulty is that you have to check whether trying a subword breaks the validity of the rest of the sentence. So what you can do is traverse through the string, whenever you find a word, push a pointer to a stack and restart the counter to track the current word. If you end up getting to the end of the "sentence" and your current word is not a word, then you know the subword was wrong, so pop the stack and try a longer word. See this implementation below:
public static String reconstructSentence(String s, HashMap<String, Boolean> dictionary) {
int ptr = 0;
Stack<Integer> spaces = new Stack<Integer>();
spaces.add(ptr);
String currWord = "";
while (!spaces.isEmpty()) {
currWord = s.substring(spaces.peek(), ptr);
System.out.println("Curr Word: " + currWord);
if (dictionary.containsKey(currWord)) {
System.out.println(">>>Found word: " + currWord);
spaces.push(ptr);
}
if (ptr >= s.length()) { //reached the end
if (dictionary.containsKey(currWord)) {
System.out.println("Sentence is done!");
spaces.push(ptr); //Done.
break;
}
System.out.println("Backtracking ...");
if (spaces.isEmpty()) {
System.out.println("Sentence could not be reconstructed");
break;
} else {
ptr = spaces.pop();
}
}
ptr++;
}
//Reconstruct Sentence
int to = s.length();
String sentence = "";
while (!spaces.isEmpty()) {
int from = spaces.pop();
String word = s.substring(from, to);
to = from;
sentence = word + " " + sentence;
}
return sentence;
}
//Try it with this unit test.
public void testValidSentence() {
HashMap<String, Boolean> dictionary = new HashMap<String, Boolean>();
dictionary.put("the", true);
dictionary.put("dog", true);
dictionary.put("went", true);
dictionary.put("to", true);
dictionary.put("park", true);
dictionary.put("in", true);
dictionary.put("sunshine", true);
dictionary.put("happy", true);
dictionary.put("do", true);
dictionary.put("sun", true);
dictionary.put("shine", true);
dictionary.put("sunny", true);
dictionary.put("on", true);
dictionary.put("a", true);
dictionary.put("day", true);
String sentence = "thehappydogwenttotheparkonasunnyday";
System.out.println(sentence + " Reconstructed: " + StringUtil.reconstructSentence(sentence, dictionary));
}
Output: thehappydogwenttotheparkonasunnyday Reconstructed: the happy dog went to the park on a sunny day
Here's a solution with complexity O(2^n)
function spliteSentence(A, start, words) {
var end = A.length;
if(start == end) {
console.log(words);
}
for(var i=start; i<=end;i++) {
var part = A.subString(A, start, i);
if(isWord(part)) {
words.push(words);
spliteSentence(A, i, words);
words.pop();
}
}
}
spliteSentence("iloveyou", 0, []);
To improve, we can introduce A* algorithm. with heuristic condition like:
word frequency.
correct some mistakes
function spliteSentence(A, start, words) {
var end = A.length;
if(start == end) {
console.log(words);
}
for(var i=start; i<=end;i++) {
var part = A.substring(start, i);
if(isWord(part)) {
words.push(part);
spliteSentence(A, i, words);
words.pop();
}
}
}
spliteSentence("iloveyou", 0, []);
private static Map<String, Integer> dictionary;
static {
dictionary = new HashMap<String, Integer>();
dictionary.put("a", 1);
dictionary.put("service", 1);
dictionary.put("system", 1);
dictionary.put("out", 1);
dictionary.put("put", 1);
dictionary.put("output", 1);
dictionary.put("outputs", 1);
dictionary.put("is", 1);
dictionary.put("invalid", 1);
}
/**
* The first parameter is the string to be divided into words in a dictionary.
* The second parameter is the index till which point we have checked if a valid
* word exists.
*
* We initially check if a string is null or empty, return empty string if it is.
* Then we check if a given input string is in the dictionary. If it is we return it
* immediately.
*
* If neither of these two, then we increment the index, substring on it to get the first
* and second part. If the first part is a valid word we return firstPart concatenated
* with the solution from the second part (with a space).
* If not we keep finding the solution.
*
* @param string
* @param i
* @return
*/
public static String getSentence(String string, int i) {
if (string == null || string.isEmpty())
return "";
if ( dictionary.get(string) != null ) {
return string;
}
i++;
String firstPart = string.substring(0, i);
String secondPart = string.substring(i);
if ( dictionary.get(firstPart) != null ) {
return firstPart + " " + getSentence(secondPart, 0);
} else {
return getSentence(string, i);
}
}
This builds in the idea that whenver we find a valid word we recurese from that point to rest of the word while continuing in original string. Any call that reach end with valid end word is printed
{
public class AddSpaces {
/**
* @param args
*/
public static void main(String[] args) {
String str = "outputisinvalid";
addSpaces(str, "", 0);
System.out.println("done");
}
private static void addSpaces(String str, String soFar, int index) {
int i = 0;
if (index == str.length())
return;
for (i = index; i < str.length(); i++) {
try {
String test = str.substring(index, i+1);
if (exists (test)){
if (soFar.length() > 0)
addSpaces(str, soFar + " "+test, i + 1);
else
addSpaces(str, test, i + 1);
}
}catch (Exception e) {
System.out.println("begin: "+index);
System.out.println("end: "+(i+1));
System.out.println("exception: " + e);
System.exit(0);
}
}
if (i == str.length()) {
if (index == str.length()) {
System.out.println(soFar);
}
else if (exists (str.substring(index))) {
System.out.println(soFar + " " + str.substring(index));
}
}
}
public static boolean exists (String str){
return str.equals("I") || str.equals("work") || str.equals("in")
|| str.equals("out") || str.equals("output") || str.equals("put") || str.equals("is") || str.equals("in") || str.equals("invalid") || str.equals("valid");
}
}
}
C++ version. Uses a trie with search() and search_prefix() for matching complete words and prefixes for best match.
#include <iostream>
#include <initializer_list>
#include <iterator>
#include <vector>
class trie {
public:
trie() : root_(new node(0)) { }
trie(const std::initializer_list<std::string>& words) : root_(new node(0)) {
for (const std::string& word : words)
add(word);
}
~trie() {
delete root_;
}
void add(const std::string& word) {
if (word.empty())
return;
node* curr = root_;
for (size_t i = 0; i < word.size(); i++) {
char ch = ::tolower(word[i]);
size_t idx = ch - 'a';
if (curr->children_[idx]) {
curr = curr->children_[idx];
} else {
curr->children_[idx] = new node(ch);
curr = curr->children_[idx];
}
}
curr->is_word_ = true;
}
bool search(const std::string& word) const {
node* curr = search_common(word);
return curr && curr->is_word_;
}
bool search_prefix(const std::string& word) const {
node* curr = search_common(word);
return curr;
}
private:
struct node {
node(char ch) : ch_(ch), is_word_(false), children_{nullptr} { }
~node() {
for (node* n : children_)
delete n;
}
char ch_;
bool is_word_;
node* children_[26];
};
node* search_common(const std::string& word) const {
if (word.empty())
return nullptr;
node* curr = root_;
for (size_t i = 0; i < word.size(); i++) {
char ch = ::tolower(word[i]);
size_t idx = ch - 'a';
curr = curr->children_[idx];
if (!curr)
break;
}
return curr;
}
node* root_;
};
std::vector<std::string> find_words(const trie& tr, const std::string& str)
{
std::vector<std::string> words;
if (str.empty())
return words;
std::string longest_match;
for (size_t beg = 0, end = 1; end <= str.size(); end++) {
std::string curr_match = str.substr(beg, end - beg);
if (tr.search(curr_match)) {
// We are looking for the best (or longest) match
if (curr_match.size() > longest_match.size()) {
longest_match = curr_match;
}
} else {
// If there's no prefix with this match, save the longest so far
if (!tr.search_prefix(curr_match)) {
// Is there a best match?
if (!longest_match.empty()) {
words.push_back(longest_match);
longest_match.clear();
beg = end - 1;
end--;
} else {
// No best match found.
// Increment the begin position for the next match.
beg++;
}
}
}
}
if (!longest_match.empty())
words.push_back(longest_match);
return words;
}
int main() {
trie tr{"i", "love", "beer", "and", "wine", "out", "put", "output"};
std::string str = "ilovebeerandwineoutput";
std::vector<std::string> words = find_words(tr, str);
std::copy(words.begin(), words.end(), std::ostream_iterator<std::string>(std::cout, " "));
std::cout << std::endl;
return 0;
}
Dynamic programming - Word break problem.
"The idea is simple, we consider each prefix and search it in dictionary. If the prefix is present in dictionary, we recur for rest of the string (or suffix). If the recursive call for suffix returns true, we return true, otherwise we try next prefix. If we have tried all prefixes and none of them resulted in a solution, we return false."
Building up on the example that Kallu gave, let's consider the sentence
'output is invalid'.
With spaces removed , it'd appear output is invalid.
Now the dictionary would have all the words {out, put,output, is, in, invalid}
How do you know whether out or output is a word in the given sentence.
Is it not ambiguity problem. Let's say the sentence has a word "output" then dictionary has two words "out" and "put". So, how do we interpret output word.
- Kallu October 23, 2013Let us there is no ambiguity then it is not simple as follows: Start with first char and check whether that is in dictionary or not (say I) . If it is then just add space after that. If not then check with two chars and if not check 3 chars and so on.