Yahoo Interview Question
Write a MapReduce job that takes in two text files and outputs the probability that those two files are identical (with 0% -> completely different, 100% -> identical).
- tazo June 06, 2013 in United States
Clarification: Matching should be based on content, not on a per-line diff. The same article could be wrapped at 80 characters per line in one version and at 100 characters per line in another. In that case it should be a 100% match, even though a line-by-line comparison would report the two files as totally different.
Interview Type: Phone Interview
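One possible way to approach this, sketched in plain Python rather than on a real Hadoop cluster: the mapper splits each file into words (so line wrapping no longer matters) and emits (word, file id) pairs, the reducer counts how often each word occurs in each file, and a final step combines the per-word counts into a percentage. The score used here is a weighted Jaccard ratio over word counts, which is an assumption on my part and not something the question specifies. The names map_phase, shuffle, reduce_phase and similarity_percent, and the file ids "A" and "B", are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict


def map_phase(file_id, text):
    """Mapper: emit (word, file_id) per word, ignoring case and line breaks."""
    for word in re.findall(r"\w+", text.lower()):
        yield word, file_id


def shuffle(pairs):
    """Stand-in for the framework's shuffle step: group values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, file_ids):
    """Reducer: for one word, return (shared count, larger count)."""
    count_a = file_ids.count("A")
    count_b = file_ids.count("B")
    return min(count_a, count_b), max(count_a, count_b)


def similarity_percent(text_a, text_b):
    """Weighted Jaccard similarity of the two word multisets, as a percentage."""
    pairs = list(map_phase("A", text_a)) + list(map_phase("B", text_b))
    shared = total = 0
    for word, file_ids in shuffle(pairs).items():
        lo, hi = reduce_phase(word, file_ids)
        shared += lo
        total += hi
    # Two empty files are treated as identical (an assumption).
    return 100.0 if total == 0 else 100.0 * shared / total


if __name__ == "__main__":
    wrapped_80 = "the quick brown fox\njumps over the lazy dog"
    wrapped_100 = "the quick brown fox jumps over\nthe lazy dog"
    print(similarity_percent(wrapped_80, wrapped_100))
```

Because this compares word counts only, two files containing the same words in a different order would also score 100%. Emitting overlapping word n-grams from the mapper instead of single words might be one way to make the score sensitive to order, at the cost of more intermediate data.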