Google Interview Question for Software Engineers


Country: United States
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
3
of 5 vote

Construct a directed graph that starts with a space (" ") node. As you read through the training data, insert a new node for each new character (or update existing nodes). If you have the word "bat", then the graph would be " " -> "b" -> "a" -> "t" -> " ".

After you're done processing the training data, your graph will be a state machine where given any node/state, you have the probabilities to reach any other reachable states.
Ex. If you are at node corresponding to letter "c", and there are 2 links to "a", 3 links to "e", and 1 link to " ", then you know that the next letter will be P("a") = 2/6, P("e") = 3/6, P(" ") = 1/6. Calculate the one to choose, then move to it.

Note: If you have multiple links between two nodes, you wouldn't actually store the same links copied over and over. Instead store a count for each link.

This mechanism has the benefit of finding the next letter based on the current letter you're on rather than a straight up probability of overall letter occurrences, thereby better approximating the behaviour of the training data.

- JW March 26, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

How about part 2 ? Would you traverse the nodes calling a random() function and selecting the next char based on their probability match ?

- guilhebl April 17, 2015 | Flag
Comment hidden because of low score. Click to expand.
2
of 2 vote

Let's say we are dealing with ASCII characters. Keep matrix 258x258. 256 for each possible character and 2 special ones: beginning of the string and the end of the string.

As you walk each string from beginning to the end, note every adjacent pair of characters, including the imaginary character before the string - beginning - and after - end.
Increment corresponding cell in the matrix [int representation of first char][int representation of second char]

Generating:
Look at the row corresponding to beginning symbol and chose a cell randomly based on its weight, which is a cell value divided by total in the row. Its column is next char.

Repeat until encounter end symbol.

- CT March 25, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

String[] strings = { "Read a set of training strings", "For each character in any of the strings", "calculate the probability of seeing that character" };
int[] counts = new int[27];
int total = 0;
for (String s : strings) {
	total += s.length();
	for (int i=0; i<s.length(); i++) {
		int index = (int)Character.toLowerCase(s.charAt(i))-97;
		index = index < 0 ? 26 : index;
		counts[index]++;
	}
}
for (int i=1; i<counts.length; i++) counts[i] += counts[i-1];
int n = 25;
StringBuilder sb = new StringBuilder();
for (int i=0; i<n; i++) {
	int r = (new Random()).nextInt(total);
	for (int j=0; j<counts.length; j++) {
		if (r > counts[j]) continue;
		int value = j == 26 ? ' ' : j+97;
		sb.append((char)(value));
		break;
	}
}
		System.out.println(sb.toString());

- JP March 25, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

- Keep a sorted list with the chars in frequency order and the number of occurrences
- Generate a random number between 0 and total number of characters
- Go through the list by decrementing the number above with the number of occurrences
- When negative - the current characters is the randomly generated one

C++:

#include <iostream>
#include <cstdlib>

using namespace std;

struct Node {
    Node *next;
    int  data, count;
    Node(int d, int c = 1) : data(d), count(c) {}
};

Node* addData(Node *list, int data) {
    if (!list) {
        return new Node(data);
    }
    
    Node *cur = list, *last = 0, *l=0;
    while (cur) {
        if (cur->data == data) {
            cur->count++;
            if (last && last->count < cur->count) {
                int v = last->data; last->data = cur->data; cur->data = v;
                v = last->count; last->count = cur->count; cur->count = v;
            }
            return list;
        }
        if (!last || last->count != cur->count) {
            last = cur;
        }
        l = cur;
        cur = cur->next;
    }
    // Not found - append
    l->next = new Node(data);
    
    return list;
}

char getRandChar(Node *list, int total) {
    
    int r = rand()%total;
    while (r >= list->count) {
        r -= list->count;
        list = list->next;
    }
    return (char)list->data;
}

int main() {

    int n = 10, count = 0;
    Node *list = 0;
    string strings[] = {"akdjfsfdf", "sdfsdfsdf", "sdfsdfsdfdsf", "aaaasasasaa"};
    
    for (string s : strings) {
        for (char c : s) {
            list = addData(list, c);
            count++;
        }
    }
    
    srand (time(NULL));
    while (n--) {
        cout << getRandChar(list, count);
    }
}

- nikolay.dimitrov March 26, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I used an array of 256 strings to store the frequencies. When character C(cur) is followed by C(next), add C(next) to the string Array[C(cur)].

When creating the output for part 2, just choose a random character in the current string to find the next one.

void main(){
   while (1){
      std::string freqencies[256], input;
      int cnt;
      std::cout << "Enter a string\n";
      std::getline(std::cin, input); 
      input += ' ';
      std::cout << "Enter output count\n";
      std::cin >> cnt; std::cin.ignore();
      
      char cur = ' ';
      for (char c : input){
         freqencies[cur] += tolower(c);
         cur = tolower(c);
      }
      cur = ' ';
      for (int i = 0; i < cnt; i++) {
         cur = freqencies[cur].at(rand() % freqencies[cur].size());
         std::cout << cur;
      }   
      std::cout << '\n';
   }
}

output:

Enter a string
This is Great! Isn't it? I think it is a great solution! That guy tjcbs2 deserves an upvote for his amazing effort.
Enter output count
100
hisort. teamazis fonk in! hit gupvot ang in is sn is erthison't guy t t thazi upvortjcbs foresn

- tjcbs2 March 27, 2015 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

funny username btw.

- tjcbs2 March 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Hi, your solution is very elegant, but not scalable ... The total size of the 256 strings = the size of the input strings. And we may expect a lot of input strings, for the probabilities to be statistically significant. It makes more sense to store only the counters, or the probabilities as proposed by the other solutions.

- ranan.fraer March 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I don't understand your objection. I am basically making a copy of the input string, plus a bit of overhead. Hardly seems unreasonable, or unscalable.

- tjcbs2 March 27, 2015 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

class CharcaterProb {
	static CharcaterProb[] container = new CharcaterProb[26];
	static CharcaterProb getChar(char c) {
		return container[c - 'a'];
	}
	static int count = 0;
	char c;
	int frequency;
	CharcaterProb(char c) {
		this.c = c;
		container[c - 'a'] = this;
		touch();
	}

	static void touch(char c) {
		this.frequency ++;
		count++;
	}
}

void precalc(Stirng[] set) {
	for (String entry: set) {
		for (char c : entry.toCharArray()) {
			if ( CharcaterProb.getCar(c) == null ) {
				new CharacterProb(c);
			} else {
				CharcaterProb.getCar(c).touch();
			}
		}
	}
}

generateRandomString(int n) {
	char[] container = new char[n];
	int l = 0;
	for (char c = 'a'; c <= 'z'; c++)  {
		int freq = CharacterProb.getChar(c);
		while (freq--) {
			container[l++] = c;
		}
	}
	char[] ans = new char[n];

	for ( int i =0; i < n; i++ ) {
		int rnd = rand.nextInt(0, CharcaterProb.count - 1);
		ans[i] = container[rnd];
	}
}

- gime July 06, 2016 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More