Given an infinite stream of ch

Google Interview Question for Software Engineers

5

of 5 votes

27
Answers
Given an infinite stream of characters and a list L of strings, create a function that calls an external API when a word in L is recognized during the processing of the stream.

Example:
L = ["ok","test","one","try","trying"]
stream = a,b,c,o,k,d,e,f,t,r,y,i,n,g.............

the call to external API (let's call it some function callAPI()) would be called when the 'k' is encountered, again when the 'y' is encountered, and again at 'g'.
- ragmo2223 November 05, 2015 in United States | Report Duplicate | Flag | PURGE
Google Software Engineer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 9 vote

Create trie out of the list and navigate the nodes as you read from stream in linear time.

- ashish.kaila November 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 votes

Assume L contains "abxyz" and "xyw"
Also assume your input is "abxyw".
You can't recognize "xyw" in O(n) with trie.
A more complicated cyclic graph is needed.
To recognize the above with trie need to keep set of active paths.
(Sorry for my English)

- boaznahum November 07, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@boaznahum: You don't need to do that as "abxyz" and "xyw" both will be stored in try as

|            
          a   x
        b       y 
        x        w
        y
        z

So let's take your input "abxyzw". We start the search and we find we have all the element till 'x'. Moment you get 'x' you have two options:
1. Check if the current element is equal to the head of trie and If it is equal to head of trie then call recursive search.
2. Continue till the end of current word. In this case for your given input it will return false but for the recursive search we started in point 1, it will return true.

So it is basically your search function that you need to write properly. However you had found a test case which is extremely crucial to implement the search function.

- aka November 08, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Code based on above idea.

#define SIZE 26
struct tri{
    int complete;
    struct tri *child[SIZE];
};

void insert(char *c, struct tri **t)
{
    struct tri *current = *t;
    while(*c != '\0')
    {
        int i;
        int letter = *c - 'a';
        if(current->child[letter] == NULL) {
            current->child[letter] = malloc(sizeof(*current));
            memset(current->child[letter], 0, sizeof(struct tri));
        }
        current = current->child[letter];
        c++;
    }
    current->complete = 1;
}

struct tri *t;
int flag = 0;
int found(char *c, struct tri *tt)
{
    struct tri *current = tt;

    if (current == NULL)
        return 0;
    while(*c != '\0')
    {
        int i;
        int letter = *c - 'a';
        /* if this is the first char then recurse from begining*/
        if (t->child[letter] != NULL)
            flag = found(c+1, t->child[letter]);
        if (flag == 1)
            return 1;
        if(!flag && current->child[letter] == NULL) {
            return 0;
        }
        current = current->child[letter];
        c++;
    }
    return current->complete;
}

int main()
{
    int i;
    t = malloc(sizeof(*t));
    t->complete = 0;
    memset(t, 0, sizeof(struct tri));

    insert("weathez", &t);
    insert("eather", &t);
    insert("weather", &t);
    (1 ==found("weather", t))?printf("found\n"):printf("not found\n");
    return 0;
}

- aka November 09, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 4 votes

@aka - After your fix - Is still linear ?

- boaznahum November 14, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

What if the input is this stream: "xyzdisagreementabc", and the dictionary has the words "agreement", "disagreement", "agree" and "disagree"

The above suggestions won't work here because when you reach 'd', you will start comparing it with your trie and returns the words "disagreement" and "disagree", not knowing that *within* this search, we also need to detect that the 'a' should start the search for "agree" and "agreement"

So my proposal would be to start a new search at every character, and also update every previous search at every character. But this means updating all your previous searches everytime you come across a new character. This will detect all words, but it won't be linear because the number of concurrent searches is now another variable.

- Ahmad November 25, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Time is not linear, because you need to maintain L pointers in your trie.

- Yola January 23, 2016 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

So here is my solution.

1. Create a Trie for the list of Strings given.
2. Store the maximum depth of the Trie or in other words, the length of the longest string in a variable called MaxLen.
3. When ever we receive a new character, we search it and the combination of all its previous MaxLen-1 characters. Because we know that any pattern more than MaxLen characters will not be found in the Trie anyway.
4. Every time we find a word, we print the index of the pattern match, however we do not stop the search. The only criteria to stop the search would be when the length of stream under consideration would reach MaxLen value. Then we will remove the first character from the StringBuilder and add the new character at the end and search for all the combinations for that StringBuilder. Which would be MaxLen combinations.

So for every new character you are doing MaxLen search. Assuming there are N characters in the stream, the complexity of this solution is O(MaxLen*n).

This is not linear time. I don't believe there could be any linear time solution possible for this. Since we at least need to match every single character of the stream with all the characters of the words in the list for the worst case. Which would look like this:

List : {a, aa, aaa, aaaa, aaaaa, aaaaaa};
Stream: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

I will post the code soon.

- Shivam Maharshi January 25, 2016 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

We can actually construct finite automata in memory for each word in the list. If the list total words are n then precomputing complexity is n * max_length_of_word*26(assuming only lower case letters). Now current states for each and every word can be maintained in an array
when a new letter comes update all the states depending on the structure of the automata
if any state reaches a final one then call the api at that moment

- Archit November 06, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Keep L in BST based on first character of each string. Each character of string should be checked in BST and call callAPI whenever element is found.

- Anonymous November 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Use BST to store L strings based on first character comparison. Match each character from the given string and call callAPI function on each finding.

- Hem November 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Keep L in a trie, and iterate searches in a parallel way, as such:
1. For each search position in our search list:
1.1. If the next character is still in the trie - advance the search position to the corresponding position in the trie. If it hit a complete word - call the external API.
1.2. Otherwise, remove the search from the list.
2. Add a new position to the list. That position is the node following from the root, given the current character.

- JonathanBarOr November 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Well, we can use tries as mentioned above. But I feel that regular expressions can also be used to solve this problem. If the dictionary L is fixed then I believe that this is useful.
Here is my code

import re

def inputs(stream,dicts):
    outputs = []
    str = ""
    for i in dicts:
        matchs = re.compile(i)
        outputs.extend(re.findall(matchs,stream))
        
    for i in outputs:
        str = str+ "  "+i
    return str


L = ['ok','testing','one','try','trying']
print inputs('oktestingdonehereandtherejustgiveitatryandiamtryingtogetsolution',L)
       
#Result :   ok  testing  one  try  try  trying

- revanthpobala November 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I believe the complexity is O(n* time taken to build a DFA i.e regex)

- revanthpobala November 05, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

AC auto machine (trie with fail node)

- zxd November 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

My C# implementation, Uses a Tuple to store the string and testing index, also consider the case that a word can be just a single char

// Test code
char[] stream = { 'a', 'b', 'c', 'o', 'k', 'h', 'd', 'e', 'f', 't', 'r', 'y', 'i', 'n', 'g' };
string[] words = { "ok", "test", "one", "try", "trying", "h"};
var list = new List<string>(words);
FindWords(stream, list);

public void FindWords(char[] stream, List<string> list)
{
	// Word to test and index
	List<Tuple<string, int>> words = new List<Tuple<string, int>>();

	foreach (var current in stream)
	{
		for (int i=0; i < words.Count; i++)
		{
			var item = words[i];
			var word = item.Item1;
			int index = item.Item2 + 1;

			if (word[index] == current)
			{
				if (word.Length == (index + 1))
				{
					CallAPI(word);
					words.RemoveAt(i--);
				}
				else
				{
					words[i] = Tuple.Create<string, int>(word, index);
				}
			}
			else
			{
				words.RemoveAt(i--);
			}
		}

		foreach (var word in list)
			if (word[0] == current)
			{
				if (word.Length == 1)
					CallAPI(word);
				else
					words.Add(Tuple.Create<string, int>(word, 0));
			}
	}
}

public void CallAPI(string word)
{
	Console.WriteLine(word);
}

- hnatsu November 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

using System;
using System.Collections.Generic;

delegate void CallAPIH(string w);
class WRec
{
    string word;
    int c = -1;
    public WRec(string w)
    {
        this.word = w;
        c = -1;
    }
    public string getW()
    {
        return word;
    }
    public int check(char ch)
    {
        c++;
        if (word[c] == ch)
            if (c == word.Length - 1)
            {
                c = -1;
                return 2;
            }
            else
                return 1;
        else
        {
            c = -1;
            return 0;
        }
    }
}
class Recognizer
{
    Dictionary<char, List<string>> ind = new Dictionary<char, List<string>>();
    List<WRec> r = new List<WRec>();
    CallAPIH f;
    public Recognizer(string[] words, CallAPIH f)
    {
        this.f = f;
        foreach (string w in words)
        {
            if (w.Length == 0)
                continue;
            List<string> l;
            if (!ind.TryGetValue(w[0], out l))
            {
                l = new List<string>();
                ind.Add(w[0], l);
            }
            l.Add(w);
        }
    }
    public void check(char ch)
    {
        List<string> l;
        if (ind.TryGetValue(ch, out l))
        {
            foreach (string w in l)
            {
                r.Add(new WRec(w));
            }
        }
        for (int i = r.Count - 1; i >= 0; i--)
        {
            switch (r[i].check(ch))
            {
                case 0:
                    r.RemoveAt(i);
                    break;
                case 2:
                    f(r[i].getW());
                    r.RemoveAt(i);
                    break;
            }
        }
    }
}


class Program
{
    static void Main(string[] args)
    {
        string[] w = { "ok", "test", "one", "try", "trying" };
        string stream = "abcokdeftrying";
        Recognizer r = new Recognizer(w, delegate(string s)
            {
                Console.WriteLine("Detected: " + s);
            }
            );
        foreach (char ch in stream)
        {
            Console.WriteLine(ch);
            r.check(ch);
        }
    }
}

- Tewelde November 06, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

c++

//iterate over stream
	for (auto currentChar : charStream) {
		printf("current: %c\n", currentChar);
		//always add root to potentially start a new word
		possibleWords.push(root);
		int size = possibleWords.size();

		//iterate over potential words
		for (int i = 0; i < size; ++i) {
			auto possibleWord = possibleWords.front();
			possibleWords.pop();
			auto next = possibleWord->getNext(currentChar);
			if (next.get() != nullptr) {
				//got a letter that can extend current word
				if (next->isWord) callApi();
				possibleWords.push(next);
			}
		}
	}

- Brian November 06, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1.Maintain a list of all words having same last character(create a list for each alphabet).
2.Construct a string with incoming stream of characters(let its name be superString). So the size of superString increases every time a new character enters the stream.
3.When a new character enters the stream, consider that as a last character of a word and retrieve the list of words(call it as wordList) that has the particular character as the last character.
4. For every word in wordList check whether superString ends with that particular word.
4.1 If the condition is true,Call the API

- Praveen November 07, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If the problem is only focus on checking the dictionaries existing, using indexOf should be quick solution.
public void findDic(String[] dic, String stream) {
for(String item : dic){
if(stream.indexOf(item) != -1) {
System.out.println(item);
}
}
}

- hivehive November 07, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public static void callApiOnStream(char[] stream,String[] words){
        int[] wordCharIndex = new int[words.length];

        for(int i=0;i<stream.length;i++){
            for(int j=0;j<words.length;j++){
                if(words[j].charAt(wordCharIndex[j]) == stream[i]){

                    wordCharIndex[j]++;

                    if(wordCharIndex[j] == words[j].length()){
                        //callApi(words[j]);
                        System.out.println("Calling api for word: "+words[j]);

                        wordCharIndex[j] = 0;
                    }

                }else{
                    wordCharIndex[j] = 0;
                }
            }
        }
    }

- JOPC November 10, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public static void callApiOnStream(char[] stream,String[] words){
        int[] wordCharIndex = new int[words.length];

        for(int i=0;i<stream.length;i++){
            for(int j=0;j<words.length;j++){
                if(words[j].charAt(wordCharIndex[j]) == stream[i]){

                    wordCharIndex[j]++;

                    if(wordCharIndex[j] == words[j].length()){
                        //callApi(words[j]);
                        System.out.println("Calling api for word: "+words[j]);

                        wordCharIndex[j] = 0;
                    }

                }else{
                    wordCharIndex[j] = 0;
                }
            }
        }
    }

- JOPC November 10, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Without the need of a BST we could solve it with an array keeping the index for each char to be compared.

public static void callApiOnStream(char[] stream,String[] words){
        int[] wordCharIndex = new int[words.length];

        for(int i=0;i<stream.length;i++){
            for(int j=0;j<words.length;j++){
                if(words[j].charAt(wordCharIndex[j]) == stream[i]){

                    wordCharIndex[j]++;

                    if(wordCharIndex[j] == words[j].length()){
                        //callApi(words[j]);
                        System.out.println("Calling api for word: "+words[j]);

                        wordCharIndex[j] = 0;
                    }

                }else{
                    wordCharIndex[j] = 0;
                }
            }
        }
    }

Sorry I was trying to post without an account and it posted more than once.

- zombienator November 10, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

How about something like this?

The idea is that with each character we get in we build a character chain and match if that chain exists as a word (or partial word) in our dictionary.
So with each character that goes in, we get all possible words of it, plus valid joined results from previous character chains. If a chain doesn't partially match to a single word anymore, we clean it up.

If I would deploy this kind of code, I would probably rewrite check_dict to create a tree with all words for faster access.

l = ["ok", "test", "one", "try", "trying"]
c = ["a", "b", "c", "o", "k", "d", "e", "f", "t", "r", "y", "i", "n", "g"]


def check_dict(word):
    for w in l:
        if w.startswith(word):
            if word == w:
                return 2
            return 1
    return 0

chains = []
for character in c:
    chains.append([])
    i = 0
    for chain in chains:
        if chain is None:
            continue

        chain.append(character)

        if check_dict(''.join(chain)) == 2:
            print("Found word: %s" % ''.join(chain))
        elif check_dict(''.join(chain)) == 1:
            # partial match
            pass
        else:
            # Resolve unneeded chain
            chains[i] = None

        i += 1

- Dave November 16, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The idea is to use Trie structure to build up the dictionary. Feed the live stream into the dictionary (assuming a new character is coming at a time). Go down deep into the tree (Trie data structure) one more level given the newly coming character. Keep a list of valid tree paths and go through the list when a new character is coming.
- Generate signal to call the API, if a new word is found with the new character.
Keep this tree path because following character(s) can form new words.
- Keep this tree path in the list, if does not find a new word but the tree path is valid.
Because it is one of candidates to form a new word with the following stream.
- Remove this tree path from the list, if the tree path is invalid.
- Search the new character in the root of dictionary
- Call the API and add the tree path into the list, if this single character is a word.
- Add the tree path into the list, if the tree path is valid.

TEST

void TestLiveStream()
{
    const std::vector<std::string> words =
        {"ok", "test", "one", "try", "trying"};
    LiveStreamWordsDetector lsd(words);

    const std::string input("abcokdeftrying");
    std::vector<std::string> output;

    lsd.AppendLiveStream('a', output);
    assert(output.empty());
    lsd.AppendLiveStream('b', output);
    assert(output.empty());
    lsd.AppendLiveStream('c', output);
    assert(output.empty());
    lsd.AppendLiveStream('o', output);
    assert(output.empty());
    lsd.AppendLiveStream('k', output);
    assert(output.size() == 1);
    assert(output[0] == "ok");

    lsd.AppendLiveStream('d', output);
    assert(output.empty());
    lsd.AppendLiveStream('e', output);
    assert(output.empty());
    lsd.AppendLiveStream('f', output);
    assert(output.empty());
    lsd.AppendLiveStream('t', output);
    assert(output.empty());
    lsd.AppendLiveStream('r', output);
    assert(output.empty());
    lsd.AppendLiveStream('y', output);
    assert(output.size() == 1);
    assert(output[0] == "try");
    lsd.AppendLiveStream('i', output);
    assert(output.empty());
    lsd.AppendLiveStream('n', output);
    assert(output.empty());
    lsd.AppendLiveStream('g', output);
    assert(output.size() == 1);
    assert(output[0] == "trying");
}

- peter tang November 16, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

- intainer7 November 08, 2015 | Flag

CareerCup

Google Interview Question for Software Engineers

Books

Videos

Resume Review

Mock Interviews