Microsoft Interview Question
Software Engineer / Developer
Team: Bing
Country: United States
Interview Type: Phone Interview
Step 1: Fetch one word at a time and store it in a trie, with a counter per word that tracks its number of occurrences. Each time a word is inserted into the trie, increment its counter by one, so at the end it holds that word's total number of occurrences in the whole text.
Step 2: Iterate over the trie and return the words that occur exactly n times.
Total complexity: O(size of the text)
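The two steps above can be sketched as follows. This is a minimal sketch, assuming lowercase a-z words; `TrieNode`, `insert`, and `collect` are illustrative names, not from the original post.

```cpp
#include <array>
#include <memory>
#include <string>
#include <vector>

// Minimal lowercase-word trie; `count` holds how many times a word was inserted.
struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, 26> child;
    int count = 0;
};

// Step 1: walk/extend the trie along the word and bump its counter.
void insert(TrieNode* root, const std::string& word) {
    TrieNode* cur = root;
    for (char c : word) {
        int i = c - 'a';
        if (!cur->child[i]) cur->child[i] = std::make_unique<TrieNode>();
        cur = cur->child[i].get();
    }
    ++cur->count;  // one more occurrence of this word
}

// Step 2: depth-first walk collecting every word whose count is exactly n.
void collect(const TrieNode* node, std::string& prefix, int n,
             std::vector<std::string>& out) {
    if (node->count == n) out.push_back(prefix);
    for (int i = 0; i < 26; ++i) {
        if (node->child[i]) {
            prefix.push_back('a' + i);
            collect(node->child[i].get(), prefix, n, out);
            prefix.pop_back();
        }
    }
}
```

Insertion is O(length of the word), so the total work is linear in the size of the text, as the post claims.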
@raja
You are right, but a trie can require a lot of space.
Is there a solution that avoids a trie?
You can use a hash table instead of a trie, which can save a considerable amount of space. The key would be the word and the value would be the frequency.
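A minimal sketch of the hash table idea, using `std::unordered_map` (the function name `wordsWithFrequency` is mine, not from the post):

```cpp
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Count word frequencies with a hash table, then return the words that
// occur exactly n times. One pass over the text plus one pass over the
// distinct words: O(total words) expected time.
std::vector<std::string> wordsWithFrequency(const std::string& text, int n) {
    std::unordered_map<std::string, int> freq;
    std::istringstream in(text);
    std::string w;
    while (in >> w) ++freq[w];       // whitespace-delimited tokenization

    std::vector<std::string> out;
    for (const auto& kv : freq)
        if (kv.second == n) out.push_back(kv.first);
    return out;
}
```

Compared to a trie, the hash table stores each distinct word once as a contiguous string, which is usually more compact than one node per character.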
Parse the input file and store words in an STL map<word, frequency>. Note that you cannot simply delete an entry as soon as its frequency reaches n+1: if the word appears again later, it would be re-inserted with a fresh count of 1 and could be reported incorrectly. Instead, keep the entry but cap its count at n+1. At the end of the file, the map entries with frequency exactly n are the words with exactly n occurrences.
let F be the original file
read F and write a new file, say N, such that each word of F is on its own line in N - takes O(m) for m words
sort N using external merge sort - takes O(m log m)
scan N and list the words whose duplicate counts equal n - takes O(m)
So the total running time is O(m) + O(m log m) + O(m) = O(m log m)
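An in-memory stand-in for the sort-then-scan steps (the real version would use an external merge sort on file N; `viaSorting` is an illustrative name):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sort the words, then one linear scan measures each run of equal
// words; a run of length exactly n is a hit.
std::vector<std::string> viaSorting(std::vector<std::string> words, int n) {
    std::sort(words.begin(), words.end());   // O(m log m)
    std::vector<std::string> out;
    for (size_t i = 0; i < words.size();) {
        size_t j = i;
        while (j < words.size() && words[j] == words[i]) ++j;  // run of equal words
        if (j - i == static_cast<size_t>(n)) out.push_back(words[i]);
        i = j;
    }
    return out;
}
```

After sorting, all copies of a word are adjacent, so counting duplicates needs only one pass and O(1) extra state.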
One way that I think is faster in practice is to:
1: Copy the content of the file into memory (or another file). We will call this 'data'.
2: Take the first word and search for it in the data.
3: Count how many times the word occurs, and overwrite each occurrence with whitespace.
4: Do the same for the next unprocessed word.
Overwriting the memory ensures that we never scan the same word twice. Also, if we use Boyer-Moore for the string search, the overwritten whitespace chunks will be skipped in long jumps, so you get a nice performance boost after the first few words. (Note that each distinct word still triggers a full scan, so the worst case is quadratic in the number of words.)
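A simplified sketch of the blank-out idea, working on a tokenized copy of the text rather than raw bytes with Boyer-Moore (the name `viaBlanking` is mine):

```cpp
#include <sstream>
#include <string>
#include <vector>

// For each word not yet processed, count every occurrence in the copy
// and blank those slots so no word is ever counted twice. Worst case
// O(m^2) comparisons for m words.
std::vector<std::string> viaBlanking(const std::string& text, int n) {
    std::vector<std::string> words;
    std::istringstream in(text);
    for (std::string w; in >> w;) words.push_back(w);

    std::vector<std::string> out;
    for (size_t i = 0; i < words.size(); ++i) {
        if (words[i].empty()) continue;        // already blanked out
        std::string w = words[i];              // copy before blanking it
        int count = 0;
        for (size_t j = i; j < words.size(); ++j) {
            if (words[j] == w) { ++count; words[j].clear(); }  // "whitespace" it
        }
        if (count == n) out.push_back(w);
    }
    return out;
}
```

The inner loop shrinks as more slots are blanked, which mirrors the claimed speed-up, but the asymptotic worst case is still quadratic.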
Create a linked list in which each node holds a word and a frequency.
For each new word you read from the file, do the following:
Compare the word with stored words in the list.
If the word is already there, increase the frequency.
If the word is not there, add the new element to the list with the new word and frequency = 1.
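The linked-list bookkeeping above can be sketched like this (node cleanup is omitted for brevity; `addWord` and `frequencyOf` are illustrative names):

```cpp
#include <string>

// Singly linked list of (word, frequency) nodes, per the steps above.
// Each lookup walks the list, so m words over d distinct words cost O(m*d).
struct Node {
    std::string word;
    int freq;
    Node* next;
};

// Increment the word's frequency if present, otherwise prepend a new node.
void addWord(Node*& head, const std::string& w) {
    for (Node* p = head; p; p = p->next)
        if (p->word == w) { ++p->freq; return; }   // word already stored
    head = new Node{w, 1, head};                   // new word, frequency = 1
}

// Linear scan; returns 0 for a word that never appeared.
int frequencyOf(const Node* head, const std::string& w) {
    for (const Node* p = head; p; p = p->next)
        if (p->word == w) return p->freq;
    return 0;
}
```

This works, but every insertion pays a full list traversal, which is exactly the cost the follow-up reply objects to.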
Searching the linked list for every word costs O(number of distinct words) per lookup, so a hash table is better in this case.
- vijay January 14, 2012