zyfo2
BAN USER- 0of 0 votes
Answersc = ‘a’
- zyfo2 in United States
w = “apple”
c covers w, iff w contains c.
c_set = {‘a’, ‘b’, ‘c’, ‘g’}
w_set = {‘apple’, ‘ibm’, ‘cisco’, ‘google’}
c_set covers w_set, iff every w in w_set is covered by some c in c_set.
Given c_set and w_set, Whether w_set is covered by c_set?
Follow up: if w_set is fixed say a book, how to determine whether a c_set covers this w_set?| Report Duplicate | Flag | PURGE
Google Software Engineer / Developer Data Structures
Thanks for your suggestions. Arrays are better in memory, but String also gets better equality comparison. I think your code already handles cases like "hello","*" if you just put the i<=s1.length. Besides you can also deal with it in the for loop of "*". I've modified my code to make it better.
- zyfo2 December 29, 2012Just do it recursively to check for all substrings.
boolean check(String s1, String s2){
if(s2.equals("*")) return true;
if(s1.equals(s2)) return true;
if(s2.charAt(0)=='?'){
return check(s1.substring(1),s2.substring(1))||check(s1,s2.substring(1));}
else if(s2.charAt(0)=='*'){
for(int i=0;i<=s1.length();i++){
if(check(s1.substring(i),s2.substring(1)) retun true;}
return false; }
else if(s1.charAt(0)!=s2.charAt(0)) return false;
return check(s1.substring(1),s2.substring(1));
}
Mapreduce maybe one option. Just set key as query and frequency as value. Since Mapreduce is for distributed data and it can sort, it will be efficient to get the result. But if you try to get the result repeatedly, hashtable + heap is better since you can make use of previous results.
- zyfo2 December 28, 2012Yes, that's more efficient than a hashmap. If the w_set is an array of words, the position is easy to get. The only problem is still we need to store a lot of duplicate mappings, like for apple in w_set {‘apple’, ‘ibm’, ‘cisco’, ‘google’} you'll need to store a->0, p->0, l->0, e->0 at the same time. Not sure whether it's the best way. But it does work.
- zyfo2 December 27, 2012A word bitmap? Do you mean a hashtable? That should work but seems the time complexity is worse than just traversing the w_set. Although for a fixed w_set, it can be treated as preprocessing. And it will also take a lot of space. Maybe there's better solution since the interviewer asked me to improve the time complexity.
- zyfo2 December 27, 2012Sorry for the confusion. Actually I mean, given a fixed w_set, and check for any given c_set whether it covers w_set. Just a reminder that there can be multiple distinct minimum c_set for a fixed w_set so just generating a minimum c_set is not the solution. Maybe you can generate all possible minimum c_set but I don't even know how to generate the minimum c_set.
- zyfo2 December 27, 2012For the first problem I tried to use a bitmap for these characters instead of a hashtable and check iteratively for every word.
For the second problem, I only came up with a solution that get a minimum c_set for the w_set first and also store all the words in a hashtable with the the first character as the key so that we can check the not covered words iteratively. But seems the interviewer's not satisfied with my answer.
My solution:
Mapper:
output key: mgr_id
value:id,salary //first output
output key: id
value: id,salary //second output
Reducer:
Output key:id,salary
The key is to get the manager's salary. So in the reducer's iterating loop, set the manager's salary when key's id== value's id and output values only higher than this salary after getting manager's salary. And don't forget to store all the values before getting the manager's salary which can be dealt with outside the loop.
just one small bug for second solution, for sorted array -1 3 4 5,
- zyfo2 January 04, 2013the missing min positive int is 1. so you have to add an if before your code: if(array[i]>0 && array[i-1]<=0 && array[i]!=1) return 1;