Goldman Sachs Interview Question for Senior Software Development Engineers


Country: United States




Comment hidden because of low score. Click to expand.
1
of 3 vote

Have a hashmap for which the key is an array containing the count of all the letters occurring in the word, and as a value have a set in which you put any words which have those letters. For example, if the text contains POT and TOP then for the array (a[p]=1, a[t]=1, a[o]=1) the set will contain (POT, TOP).
When the user enters a query word, simply generate the array which counts all the letters and use it to retrieve all the anagrams.

- Anonymous April 11, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
6
of 6 votes

Hmm.
So your suggestion is to create hash map , that has arrays of size of character set (256 for ascii or 65536 for Unicode) as key which map to list of words
Bit of an overkill in terms of space.


Why not before inserting a word , sort it , and put it in HashMap with key being the sorted representation and values being list of anagrams for the key.

- krbchd April 11, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Sounds like a good improvement.

- Anonymous April 11, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Add run encoding and reduce the key size if words are very long.

- Karthik April 12, 2013 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

The problem with using sorting, is that sorting is expensive.
However, it depends on what is provided to you, if a lot of space is given, we can even do this. Have two hashmaps, one with a word and its sorted characters, the other with the sorted characters mapped to all the permutations

- Chintamani May 01, 2013 | Flag
Comment hidden because of low score. Click to expand.
1
of 1 vote

My Idea would be:

why don't we use product of ascii values of each character to calculate their hash key.
For Eg:
A=1, B=2, C=3, D=4, E=5, F=6, G=7, H=8, I=9, J=10, K=11, L=12, M=13,
N=14, O=15, P=16, Q=17, R=18, S=19, T=20, U=21, V=22, W=23, X=24, Y=25, Z=26

Then, Hash key for POT = P*O*T = 16*15*20 = 4800
And Hash key for any of POT's anagram will be same as well, because multiplication is associative property

Hash key for TOP = 20*15*16 = Hash key of OPT = 15*16*20 = Hash key for POT = P*O*T = 16*15*20.

Hence, we can maintain a chain of all anagrams in a hash table. and whenever we search for a word,
we calculate the key and return the entire list.

One more optimization can be done is that: We maintain different hash map for every word length.
That is for word length of 3 we have one hashmap, for word length of 4 we have different.
This will avoid any hash key collision for words of different length

Like: CAT and DO represents same hash key of 60. But, if we have independent hashmap for each and every length of words, both of these words should reside in separate hash maps and should be easy to search for.

- Nikhil June 26, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

can u xplain it in form code ?

- behinddwalls April 11, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

def getwords():
  file = open("sherlockholmes.txt")
  words={}
  for line in file:
    for word in line.split():
      key = ''.join(sorted(word)).lower()
      if words.has_key(key):
        words[key].add(word)
      else:
        words[key]=set([word])
  return words

if __name__ == '__main__':
  words = getwords()
  for word in words:
    if len(words[word])>3 :print words[word]

- Shaan April 11, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

It would be easier to build a Trie-Tree with the data which will map the different paths that can lead to a legitimate word in the dictionary.

Once we have the tree then searching will only require us to start from the top and take certain paths that match the individual letters in the input string. If the word is X letters long, then we are essentially going down to X level depth in each path. The comparison of strings is the dominant action in each decision point, which cannot exceed 26 characters in the english dictionary. So at the most we are talking about 26 * X comparisons in 1 path. This multiplied by X different possible paths will give us X^2 comparisons. I am assuming the input has all possible permutations in the dictionary for worst case scenario.

Now since X is the input string length and not dependent on the size of the dictionary this is a constant computation time. The Big O of building the Trie tree is of course another matter but has a one time cost to it.

- KrisS April 17, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

IN JAVA,

int hascode(){ // it will make sure world with length 3 will goes into same bucket
return world.length()
}

bollean equals(Object o){


if( StringUtils.countMatches( ((String)0),p) == 1 && StringUtils.countMatches( ((String)0),u) == 1 && StringUtils.countMatches( ((String)0),t) == 1){

return true;
else return false;

}

Use Either HasSet of HashMap

- Harish Arora May 02, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

If there is enough memory:
unordered_map<std::string, std::list<std::string> > anagrams; // or a std::map to save some space.
the key is sorted string (word), the list contains real words.

- Anonymous May 15, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

/* Function to print permutations of string
   This function takes three parameters:
   1. String
   2. Starting index of the string
   3. Ending index of the string. */
void permute(char *a, int i, int n) 
{
   int j; 
   if (i == n)
     printf("%s\n", a);
   else
   {
        for (j = i; j <= n; j++)
       {
          swap((a+i), (a+j));
          permute(a, i+1, n);
          swap((a+i), (a+j)); //backtrack
       }
   }
}

- Putta June 07, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

We can assume that n(the no of words in the file) is much bigger than m(the average no of letters in each word), hence we can use the following key for a hash map
1.for each word to be inserted, sort the word first
2.now use the following to find the hash key
Eg if TOP, POT, OPT is the word, after sorting they all give POT
now assume A = 0 and Z = 25
therefore TOP, POT, OPT all map to the hash key value P*10^0 +0*10^1 + T*10^2
now this will generate a unique number for each word where atleast one letter is different
Hence all words can be mapped into one hashmap
therefore, insertion will take O(n*m*logm) but all searching deletion and updates will take O(1) time

- Praveen July 22, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AnagramsInFile {

	/**
	 * @param args
	 */
	public static void main(String[] args) {
		// TODO Auto-generated method stub
		List<String> anagramsInFile = new ArrayList<String>();
		anagramsInFile.add("POT");
		anagramsInFile.add("OTp");
		anagramsInFile.add("ottp");
		anagramsInFile.add("Ppot");
		anagramsInFile.add("pot");
		anagramsInFile.add("top");
		anagramsInFile.add("tpo");
		anagramsInFile.add("pto");
		anagramsInFile.add("opt");

		System.out.println(findAnagrams(anagramsInFile, "pot"));

	}
/**
 * The overall time complexity is :
 * consider there are m elements in array , with average length of r
 * than time complexity will be O(m*r log r) 
 * @param anagramsInFile
 * @param findAnagramOfText
 * @return
 */
	private static Object findAnagrams(List<String> anagramsInFile,
			String findAnagramOfText) {
		// TODO Auto-generated method stub
		Map<Integer, List<String>> mappingAnagram = new HashMap<Integer, List<String>>();
		//O(m)
		for (String anagrString : anagramsInFile) {
			//O(r*log r)
			int key = getKey(anagrString);
			
			if (!mappingAnagram.containsKey(key)) {
				mappingAnagram.put(key, new ArrayList<String>());
			}
			mappingAnagram.get(key).add(anagrString);
		}
		//System.out.println(mappingAnagram);
		return mappingAnagram.get(getKey(findAnagramOfText));
	}

	/**
	 * Here this logic will effect the overall time -complexity of the algorithm.
	 * o(n*log n) - performance overall
	 * @param anagrString
	 * @return
	 */
	private static int getKey(String anagrString) {
		if (anagrString == null || anagrString.trim().length() == 0)
			return -1;
		//o(n)
		char[] sortedStringChar = anagrString.toLowerCase().toCharArray();
		//o(n*logn)
		Arrays.sort(sortedStringChar);
		
		//System.out.println(sortedStringChar);
		//System.out.println(sortedStringChar.toString());
		//o(n)
		return String.valueOf(sortedStringChar).hashCode();
	}

}

by sumit dey

- Anonymous November 14, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

public class Test {


public static void main(String[] args){

String[] arr = {"POT","VDSF","OPT","TOP","ME","LOP"};

String user = "POT";

char[] arr1 = user.toCharArray();

int sum = 1;

for(int i=0;i<arr1.length;i++){

sum *= (int)arr1[i];
}
System.out.println(sum);

for(int i=0;i<arr.length;i++){

char[] arr2 = arr[i].toCharArray();
int sum1 = 1;

for(int j=0;j<arr[i].length();j++){
sum1 *= (int) arr2[j];
}

if (sum1 == sum){
System.out.println(arr[i]);
}
}
}

}

- Ankit Jain June 18, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
-1
of 1 vote

Assumptions
1. If the file is really large
2. The sequence to be searched is very small as compared to the size of the file

We can create a hashMap of strings that has all the permutations of the anagram to be searched. The complexity of this would be O(m!) where m is the size of the anagram to be searched. Once we have such a hashMap, we can read every word from the file and check it in the map. The search complexity would be O(1).

Total complexity = O(n) + O(m!) where n is the size of the file(or the count of words in the file) and m is the size of the anagram to be searched
Space complexity = O(m!) --> size of hash map

This method should work well as long as n >> m

Example: If file has 1000 words and we need to search TOP whose length is 3. Then the number of entries in the hash map should be 6.

- CodeNameEagle April 11, 2013 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Dont construct the hashmap of all possible anagrams, instead sort the given word and the word to be matched. O(n)

- Karthik April 12, 2013 | Flag


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More