Amazon Interview Question
Software Engineer / Developer. Country: United States
Interview Type: In-Person
Create a hash where multiple keys point to a shared object {string, count},
so hash(3), hash(4), hash(10), hash(2) point to {str -> aa, count -> 0}
and hash(3), hash(9), hash(14), ... point to {str -> bb, count -> 0}
### here the key 3 points to both aa and bb
maxCount = 0
maxStr = ''
for each num in input
    for each obj in map.get(num)   ### map.get(3) will return aa, bb, ..., etc.
        obj.count++
        if (obj.count > maxCount)
            maxCount = obj.count
            maxStr = obj.str
        end if
    end for
end for
return {maxCount, maxStr}
1) Create a Map with the number as key and a HashSet of strings as value.
2) Given a number, look up its value in the map and iterate through its strings, building a second map keyed by string whose value counts how many times that string has been seen across the input numbers' sets.
3) Once step 2 has been repeated for all input numbers, this second map holds how many times each string appeared in the numbers' string sets. Run through it and find the string with the highest count.
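The steps above can be sketched in Java. The tags `aa`/`bb` and the sample numbers come from the question; the array-based method signature is just for illustration:

```java
import java.util.*;

public class InvertedIndexMatch {
    // tags[i] labels the i-th line; lineNums[i] holds that line's numbers.
    // Step 1 builds a number -> line-index map; steps 2-3 count query hits.
    public static String bestTag(String[] tags, int[][] lineNums, int[] query) {
        Map<Integer, List<Integer>> index = new HashMap<>();
        for (int i = 0; i < lineNums.length; i++)
            for (int num : lineNums[i])
                index.computeIfAbsent(num, k -> new ArrayList<>()).add(i);

        int[] counts = new int[tags.length];
        int best = 0;  // index of the line with the most hits so far
        for (int num : query)
            for (int i : index.getOrDefault(num, Collections.emptyList())) {
                counts[i]++;
                if (counts[i] > counts[best]) best = i;
            }
        return tags[best];
    }

    public static void main(String[] args) {
        String[] tags = {"aa", "bb"};
        int[][] lines = {{3, 4, 10, 2}, {3, 9, 14}};
        System.out.println(bestTag(tags, lines, new int[]{3, 4, 12}));  // aa
    }
}
```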
Sort of a brute-force approach, optimized a bit by sorting and leapfrogging. I'm sure someone will do better. I was thinking of using a bitmap but couldn't work out how to make it more efficient.
input = sort(input)
while (get(line)) {
    data[line[0]] = sort(line[1-end])
}
winner = list[]
win_matches = 0
foreach tag in data {
    comp = data[tag]
    matches = 0
    index = 0
    len = length(data[tag])
    foreach elem in input {
        while (index < len) {
            if (comp[index] > elem) {
                break
            } else if (comp[index] == elem) {
                matches++
                // break /* uncomment to count duplicate number matches */
            }
            index++
        }
    }
    if (matches > win_matches) {
        winner = list(tag)
        win_matches = matches
    } else if (matches == win_matches) {
        winner = append(winner, tag)
    }
}
print(rand(winner))
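The sorted "leapfrog" comparison above is a two-pointer walk over two sorted arrays. A minimal Java sketch of just that helper (method name and sample arrays are illustrative):

```java
import java.util.Arrays;

public class LeapfrogMatch {
    // Count common elements between two arrays by sorting both and
    // advancing whichever pointer holds the smaller value.
    public static int countMatches(int[] input, int[] line) {
        int[] a = input.clone(), b = line.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int i = 0, j = 0, matches = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j])      { matches++; i++; j++; }
            else if (a[i] < b[j])  i++;   // leapfrog the input pointer
            else                   j++;   // leapfrog the line pointer
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(countMatches(new int[]{3, 4, 12}, new int[]{3, 4, 10}));  // 2
    }
}
```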
1. For each line in the input file, create a hashset of integers, i.e. a vector of hashsets.
2. For each integer in the input string:
   i) Check each hash_set. If all input integers are found in a single hash_set, print it. Otherwise keep track of max_count and the corresponding hash_set index, and print that at the end.
m -> number of lines in the input file
n -> max number of integers in a single line
p -> number of integers in the input
Time complexity: O(m*n + p*m) (m*n to build the hash_sets, p*m for checking)
Space complexity: O(m*n)
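A minimal Java sketch of this vector-of-hashsets idea; the lines are passed in as arrays for illustration (real code would read them from the file):

```java
import java.util.*;

public class PerLineSets {
    // Build one HashSet per line, count query hits against each set,
    // and return the index of the best-matching line.
    public static int bestLine(int[][] lines, int[] query) {
        List<Set<Integer>> sets = new ArrayList<>();
        for (int[] line : lines) {
            Set<Integer> s = new HashSet<>();
            for (int n : line) s.add(n);
            sets.add(s);
        }
        int bestIdx = -1, bestCount = -1;
        for (int i = 0; i < sets.size(); i++) {
            int count = 0;
            for (int q : query)
                if (sets.get(i).contains(q)) count++;
            if (count > bestCount) { bestCount = count; bestIdx = i; }
        }
        return bestIdx;
    }

    public static void main(String[] args) {
        int[][] lines = {{3, 4, 10}, {3, 9, 14}};
        System.out.println(bestLine(lines, new int[]{3, 4, 12}));  // 0
    }
}
```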
Use a trie. The entire file can be preprocessed in O(n) time, where n is the number of integers in the file. Once this is done, you can find the best-matching line in O(m), where m is the number of integers in the input.
To be independent of the ordering of the input integers, when creating the trie, insert entries for every permutation of the line's integers.
For example for the first line 3,4,10 - create entries in trie for
3,4,10
4,3,10
10,4,3
10,3,4
4,10,3
3,10,4
The above problem becomes similar to finding the longest common subsequence between two arrays of integers.
The extension would be a longest common subsequence between a constant array (the input) and a list of integer arrays (from the file). The array from the list with the longest common subsequence is our answer.
This approach requires negligible memory and is almost an in-place search algorithm. Its complexity is O(m*n1 + m*n2 + m*n3 + ...), where m is the length of the input and n(i) is the length of the i-th line's array.
This approach is much easier to implement than the hashmap solution and has almost O(1) space complexity, whereas the hashmap solution's worst-case space complexity, when all the numbers in the list are distinct, is O(n1 + n2 + n3 + ...).
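For reference, the standard LCS-length DP this comment refers to, sketched in Java with two rolling rows. (Note that if the match is order-independent, as in the original question, a plain set intersection suffices; LCS matters only when order must be preserved.)

```java
public class LcsMatch {
    // Classic longest-common-subsequence length DP, keeping only two rows.
    public static int lcsLength(int[] a, int[] b) {
        int[] prev = new int[b.length + 1];  // row for a[0..i-1]
        int[] cur  = new int[b.length + 1];  // row being filled
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                cur[j] = (a[i - 1] == b[j - 1])
                       ? prev[j - 1] + 1
                       : Math.max(prev[j], cur[j - 1]);
            }
            int[] t = prev; prev = cur; cur = t;  // roll the rows
        }
        return prev[b.length];
    }

    public static void main(String[] args) {
        System.out.println(lcsLength(new int[]{3, 4, 12}, new int[]{3, 4, 10}));  // 2
    }
}
```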
Does anyone have a better solution than O(n)? As far as I can see, it can easily be done in that time.
Here is the algorithm:
1) Read the file line by line.
2) Extract the numbers from each line and match them against the given numbers, keeping a max that holds the best string and its number of matches.
Each time you count a line's matches, if they exceed the max, set max = current string.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.Map.Entry;

public class Program2 {
    static Map<String, Set<String>> dic = new HashMap<>();

    public static void main(String[] args) throws IOException {
        String fileName = "input.txt";
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            for (String line; (line = reader.readLine()) != null;) {
                TreeSet<String> set = new TreeSet<>();
                Collections.addAll(set, line.split(" "));
                // Letters sort after digits lexicographically, so pollLast()
                // pulls the tag (e.g. "aa") out of a line like "aa 3 4 10".
                dic.put(set.pollLast(), set);
            }
        }
        System.out.println(getKey(Arrays.asList("3", "4", "10")));
        System.out.println(getKey(Arrays.asList("12", "3", "4")));
        System.out.println(getKey(Arrays.asList("3", "9")));
        System.out.println(getKey(Arrays.asList("3", "4", "12")));
    }

    // Returns the tag of the first line containing ALL query numbers,
    // or "Nil" if none does (note: this is an exact-superset lookup,
    // not a best-partial-match search).
    public static String getKey(Collection<String> nums) {
        for (Entry<String, Set<String>> entry : dic.entrySet()) {
            if (entry.getValue().containsAll(nums)) return entry.getKey();
        }
        return "Nil";
    }
}
Since the file contains several thousand lines/numbers, it would be more efficient to preprocess the input in O(n) time, storing the input numbers in a HashSet. E.g. 3, 4, 10 would be stored in the HashSet.
- Anonymous April 05, 2012

Now, as you read the file sequentially, check whether each number in the current line is contained in the HashSet; if yes, increment the count for that string.
At the end of each line, compare this count with the max count; if it is greater, store it as the new max count along with the corresponding string.
Thus only one pass through the file will be required. Complexity will be O(m + n).
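A one-pass Java sketch of this approach. The line format "tag n1 n2 ..." is an assumption, since the thread never pins it down:

```java
import java.util.*;

public class OnePassScan {
    // Store the query numbers in a HashSet, then scan each line once,
    // counting how many of its numbers are in the set.
    public static String bestTag(String[] fileLines, int[] query) {
        Set<Integer> wanted = new HashSet<>();
        for (int q : query) wanted.add(q);

        String bestTag = null;
        int bestCount = -1;
        for (String line : fileLines) {
            String[] parts = line.trim().split("\\s+");  // "tag n1 n2 ..." assumed
            int count = 0;
            for (int i = 1; i < parts.length; i++)
                if (wanted.contains(Integer.parseInt(parts[i]))) count++;
            if (count > bestCount) { bestCount = count; bestTag = parts[0]; }
        }
        return bestTag;
    }

    public static void main(String[] args) {
        String[] lines = {"aa 3 4 10", "bb 3 9 14"};
        System.out.println(bestTag(lines, new int[]{3, 4, 12}));  // aa
    }
}
```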