Intuit Interview Question
Software Engineer / Developer

Read the file. For each word in the file, call addWord():
Map<String, List<Integer>> positionsMap = new HashMap<>();

void addWord(String word, int fileOffset) {
    List<Integer> positions = positionsMap.get(word);
    if (positions == null) {
        positions = new ArrayList<>();
        positionsMap.put(word, positions); // bug fix: the new list must be stored back in the map
    }
    positions.add(fileOffset); // autoboxing makes Integer.valueOf() unnecessary
}
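The null-check-then-put pattern above can be compacted with Map.computeIfAbsent, which creates and stores the list for a new word in one step. A minimal runnable sketch (class and test words are illustrative):

```java
import java.util.*;

public class WordPositions {
    static final Map<String, List<Integer>> positionsMap = new HashMap<>();

    // Same behavior as addWord above: computeIfAbsent inserts a fresh
    // ArrayList for unseen words, then we append the offset.
    static void addWord(String word, int fileOffset) {
        positionsMap.computeIfAbsent(word, k -> new ArrayList<>()).add(fileOffset);
    }

    public static void main(String[] args) {
        String[] words = {"a", "b", "a"};
        for (int i = 0; i < words.length; i++) addWord(words[i], i);
        System.out.println(positionsMap.get("a")); // [0, 2]
    }
}
```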
This could be done much faster with multithreading.
Say a book has n pages: divide it into 10 parts of n/10 pages each (say) and run a thread on each part.
As you scan each page, divide the page into 5 sections and start a thread on each section to search for the word; if any of these threads finds the word, record the page number and cancel the remaining threads in the pool.
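The partitioned scan described above can be sketched with an ExecutorService: each worker indexes one slice of the word stream into a shared ConcurrentHashMap. This is a sketch under assumed inputs (a pre-tokenized word array; the class name and part count are illustrative):

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelIndex {
    // Build word -> positions concurrently. ConcurrentHashMap.compute runs
    // atomically per key, so appends to a word's list never race.
    static Map<String, List<Integer>> index(String[] words, int parts)
            throws InterruptedException {
        ConcurrentMap<String, List<Integer>> map = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        int chunk = (words.length + parts - 1) / parts; // ceiling division
        for (int p = 0; p < parts; p++) {
            final int start = p * chunk;
            final int end = Math.min(words.length, start + chunk);
            pool.submit(() -> {
                for (int i = start; i < end; i++) {
                    final int pos = i;
                    map.compute(words[pos], (w, list) -> {
                        if (list == null) list = new ArrayList<>();
                        list.add(pos);
                        return list;
                    });
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return map;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] words = {"to", "be", "or", "not", "to", "be"};
        Map<String, List<Integer>> idx = index(words, 2);
        System.out.println(idx.get("to").size()); // "to" occurs twice
    }
}
```

Note that list ordering across parts is not guaranteed; sort each list afterwards if offsets must be ascending.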
Assume the input arrives as one file per page, each containing words.
Mapper code:
Each mapper reads one file. Use a separator, say "$".
output: word, fileName$1
The reducer receives the input as:
word, list of values
Each value has the form fileName$1, for example:
file1$1, file1$1, file2$1, file3$1, ...
Now we can process this list and build a structure Map<String, Map<String, Integer>>,
which we can write to a text file in the format:
word -> file -> count
example:
abc -> file1 -> 2
abc -> file2 -> 1
abc -> file3 -> 1 ...
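The reducer-side aggregation can be sketched as follows: for one word, collapse values like "file1$1" into per-file counts. The "$" separator matches the mapper output above (the class and method names are illustrative, not Hadoop API):

```java
import java.util.*;

public class ReduceCounts {
    // For a single word, turn the reducer's value list into file -> count.
    static Map<String, Integer> reduce(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            String[] parts = v.split("\\$"); // "$" must be escaped in a regex
            String file = parts[0];
            int n = Integer.parseInt(parts[1]);
            counts.merge(file, n, Integer::sum); // accumulate per-file counts
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> values = List.of("file1$1", "file1$1", "file2$1", "file3$1");
        Map<String, Integer> counts = reduce(values);
        System.out.println("abc -> file1 -> " + counts.get("file1")); // count is 2
    }
}
```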
This is a classic MapReduce problem, but on a single machine it can be solved with a hash map. Maintain a HashMap<Word, Set<PageNumber>>: for each word encountered, if we have already seen the word, add the page number to its value set; otherwise add the word and page number as a new entry.
- Anonymous August 02, 2012
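The single-machine approach above can be sketched in a few lines (the whitespace tokenization and class name are assumptions for the example):

```java
import java.util.*;

public class PageIndex {
    // word -> sorted set of page numbers it appears on.
    // pages[i] holds the text of page i+1.
    static Map<String, Set<Integer>> index(String[] pages) {
        Map<String, Set<Integer>> map = new HashMap<>();
        for (int p = 0; p < pages.length; p++) {
            for (String word : pages[p].toLowerCase().split("\\s+")) {
                if (word.isEmpty()) continue;
                // computeIfAbsent covers the "word not yet seen" branch;
                // the Set deduplicates repeats within the same page.
                map.computeIfAbsent(word, k -> new TreeSet<>()).add(p + 1);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        String[] pages = {"the quick fox", "the lazy dog", "fox and dog"};
        Map<String, Set<Integer>> idx = index(pages);
        System.out.println(idx.get("fox")); // appears on pages 1 and 3
    }
}
```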