You have a String which has th

Interview Question

1

of 1 vote

8
Answers
You have a String which has the value of the text of a magazin or a newspaper.

Find the 10 most used words.
- maryanna November 07, 2013 in United States | Report Duplicate | Flag | PURGE
Android

Email me when people comment.

An error occurred in subscribing you.

Country: United States

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 1 vote

If the file's size is enough to fit in memory than the solution will be straight forward ...
Use one Minheap<Integer> of size 10 and a HashMap <String, count>

heap = new minheap();
for each item
    // if you don't already have k items on the heap, add this one
    if (heap.count < k)
        heap.Add(item)
    else if (item.frequency > heap.Peek().frequency)
    {
        // The new item's frequency is greater than the lowest frequency
        // already on the heap. Remove the item from the heap
        // and add the new item.
        heap.RemoveRoot();
        heap.Add(item);
    }

But if it a big file, a book etc. Than it is tricky to find top 10 words. I will use SPACE SAVING algorithm . 

SPACESAVING(k)
n ← 0;
T ← ∅;
foreach i do
	n ← n + 1;
	if i ∈ T then //If word alreadyb exists increase its count
		ci ← ci + 1;
	else if |T| < k then //if T is less than F add new word
		T ← T ∪ {i};
		ci ← 1; //New words count will be 1
	else
		j ← arg minj∈T cj; //If T is full remove least frequent one, and add new one
		ci ← cj + 1;  // and add last removed words count in new word
		T ← T ∪ {i}\{j};

Here is link of original research paper:
dimacs.rutgers.edu/~graham/pubs/papers/freqcacm.pdf

- Ajeet November 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If the file's size is enough to fit in memory than the solution will be straight forward ... Use one Minheap<Integer> of size 10 and a HashMap <String, count> {{{ heap = new minheap(); for each item // if you don't already have k items on the heap, add this one if (heap.count < k) heap.Add(item) else if (item.frequency > heap.Peek().frequency) { // The new item's frequency is greater than the lowest frequency // already on the heap. Remove the item from the heap // and add the new item. heap.RemoveRoot(); heap.Add(item); } }} But if it a big file, a book etc. Than it is tricky to find top 10 words. I will use SPACE SAVING algorithm . We have k = 10; {{{ SPACESAVING(k) n ← 0; T ← ∅; foreach i do n ← n + 1; if i ∈ T then //If word alreadyb exists increase its count ci ← ci + 1; else if |T| < k then //if T is less than F add new word T ← T ∪ {i}; ci ← 1; //New words count will be 1 else j ← arg minj∈T cj; //If T is full remove least frequent one, and add new one ci ← cj + 1; // and add last removed words count in new word T ← T ∪ {i}\{j}; }}} Here is link of original research paper: dimacs.rutgers.edu/~graham/pubs/papers/freqcacm.pdf - Ajeet November 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1) Create a Trie of all the strings in the magazine.
2) While creating the trie, when a word ends at a node, maintain a count on that node determining how many words ended at that node.
3) After creating the trie, Do a BFS or DFS for each node. When you see a node with count insert into min heap.
4) After 10 insertions, when you insert next node, perform remove min.
5) The remove min will remove based on the count in the node.

- Anonymous November 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Hey even I was thinking of the same solution...but stopped by thinking of the Time complexity for searching the most repeated words node. Every node in the Trie needs to be visited.

- Stupid Developer November 07, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Assuming usual things.

Count based hash. Hash strings defined as delimitted by ur preferred delimitters.
Arrayify table inside hash with fields: string, count

Median of medians (or ur favourite "find top ten" alg) to find top ten using count field.

- urik on bb November 07, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Because 10 is small and fix we can easy go through magazi
10 times to get the top ten.
-Guest DS

- algos November 08, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public static void mostUsedWords(String text) {
		String words[] = text.toLowerCase().split(" ");
		Map<String, Integer> wordsMap = new HashMap<String, Integer>();
		CustomComp bvc = new CustomComp(wordsMap);
		Map<String, Integer> wordsSortedmap = new TreeMap<String, Integer>(bvc);
		for (int i = 0; i < words.length; i++) {
			if (wordsMap.containsKey(words[i])) {
				int value = wordsMap.get(words[i]);
				wordsMap.put(words[i], value + 1);
			} else {
				wordsMap.put(words[i], 1);
			}
		}
		wordsSortedmap.putAll(wordsMap);
	    int length = wordsSortedmap.size();
		if(length>10)
	    {
	      	length = 10;
	    }
		for (Map.Entry entry : wordsSortedmap.entrySet()) 
		{ 
			if(length==0)break;
			else
			System.out.println("Word====>" + entry.getKey() + "\t\t Times===>" + entry.getValue());
			length--;
		}
	}
}

class CustomComp implements Comparator<String> {

	Map<String, Integer> base;

	public CustomComp(Map<String, Integer> base) {
		this.base = base;
	}

	public int compare(String a, String b) {
		if (base.get(a) >= base.get(b)) {
			return -1;
		} else {
			return 1;
		}
	}
}

- Anonymous November 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

public static void mostUsedWords(String text) {
		String words[] = text.toLowerCase().split(" ");
		Map<String, Integer> wordsMap = new HashMap<String, Integer>();
		CustomComp bvc = new CustomComp(wordsMap);
		Map<String, Integer> wordsSortedmap = new TreeMap<String, Integer>(bvc);
		for (int i = 0; i < words.length; i++) {
			if (wordsMap.containsKey(words[i])) {
				int value = wordsMap.get(words[i]);
				wordsMap.put(words[i], value + 1);
			} else {
				wordsMap.put(words[i], 1);
			}
		}
		wordsSortedmap.putAll(wordsMap);
	    int length = wordsSortedmap.size();
		if(length>10)
	    {
	      	length = 10;
	    }
		for (Map.Entry entry : wordsSortedmap.entrySet()) 
		{ 
			if(length==0)break;
			else
			System.out.println("Word====>" + entry.getKey() + "\t\t Times===>" + entry.getValue());
			length--;
		}
	}
}

class CustomComp implements Comparator<String> {

	Map<String, Integer> base;

	public CustomComp(Map<String, Integer> base) {
		this.base = base;
	}

	public int compare(String a, String b) {
		if (base.get(a) >= base.get(b)) {
			return -1;
		} else {
			return 1;
		}
	}
}

- Anonymous November 19, 2013 | Flag Reply

CareerCup

Interview Question

Books

Videos

Resume Review

Mock Interviews