Facebook Interview Question for Software Engineers


Country: United States




Comment hidden because of low score. Click to expand.
3
of 3 vote

I think y'all are overthinking this; it says the similarity function is a given. This question is about how you manage building the similarity dictionary (whoops I gave it away).

- Honcho March 29, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

//Levenshtein Algo
	public static void main(String [] args) {
	    String a [] = {"tasneem@gmail.com", "tas@hotmail.com", "abc@gmail.com", "abc@hotmail.com", "apc@yahoo.com"};
	    HashMap<Integer, ArrayList<String>> similarElementsMap = new HashMap<>();
	    for (int i = 0 ; i < a.length - 1; i ++) {
	      int d = distance(a[i], a[i+1]);
	      if (similarElementsMap.containsKey(d)) {
	    	  ArrayList<String> list = similarElementsMap.get(d);
	    	  list.add(a[i]);
	    	  list.add(a[i+1]);
	      } else {
	    	  ArrayList<String> list = new ArrayList<>();
	    	  list.add(a[i]);
	    	  list.add(a[i+1]);
	    	  similarElementsMap.put(d, list);
	      }
	    }
	    
	    for (Entry<Integer, ArrayList<String>> map : similarElementsMap.entrySet()) {
			System.out.println("For distance : "+map.getKey());
			for (int i = 0 ; i < map.getValue().size(); i ++) {
				System.out.println(map.getValue().get(i));
			}
		}
	    
	  }
	  
	  
	  private static int distance(String a, String b) {
	   
	    a = a.substring(0, a.indexOf('@')).toLowerCase();
	    b = b.substring(0, b.indexOf('@')).toLowerCase();
	    // base case 
	    if (a.length() == 0) return b.length();
	    if (b.length() == 0)return a.length();
	    
	    int cost[] = new int[b.length() + 1];
	    // Initialise cost with each indexes initialy 
	    for (int i = 0 ; i < b.length(); i ++) {
	      cost[i] = i;
	    }
	    
	    // Iterate through a and b string and update cost array with each a iteration
	    for (int i = 1; i <= a.length(); i ++) {
	      cost[0] = i;
	      int k = i - 1;
	      for (int j =1; j <= b.length(); j ++) {
	        int c = Math.min(1 + Math.min(cost[j], cost[j-1]), a.charAt(i-1) == b.charAt(j-1) ? k : k + 1);
	        // Swap the min value with cost and k
	        k = cost[j];
	        cost[j] = c;
	      }
	    }
	    return cost[b.length()];

}

- Tasneem March 30, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
1
of 1 vote

Python simple solution:

def groupEmails(emails):
	buckets = []
	for email in emails:
		for bucket in buckets:
			if isSimilar(email, bucket[0]):
				bucket.append(email)
				break
		else:
			buckets.append([email])
	return buckets

- Galileo April 22, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Weighted Quick Union + Path Compression
Complexity: N + M lg* N. M unions (or similarities) on N Emails.

- naren March 28, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Even if we assume transitivity of the similarity function, how do you reduce the number of unions to be performed (M in this case) to below O(N^2)?

- axg April 13, 2017 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

We can use the Levenshtein distance or edit distance algorithm. More the levenshtein distance, more dissimilar the email IDs are. We have to make sure, we do not compare the TLDs i.e. ignore @gmail.com and @hotmail.com

For instance, "hello" and "jello" have LD as 1. So, quite similar. However, "good" and "goodbye" will have LD as 3, so slight dissimilar.

- puneet March 28, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

We can use Levenshtein distance algorithm. More the LD distance, more dissimilar the words.

- puneet March 28, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

//Levenshtein Algo
	public static void main(String [] args) {
	    String a [] = {"tasneem@gmail.com", "tas@hotmail.com", "abc@gmail.com", "abc@hotmail.com", "apc@yahoo.com"};
	    HashMap<Integer, ArrayList<String>> similarElementsMap = new HashMap<>();
	    for (int i = 0 ; i < a.length - 1; i ++) {
	      int d = distance(a[i], a[i+1]);
	      if (similarElementsMap.containsKey(d)) {
	    	  ArrayList<String> list = similarElementsMap.get(d);
	    	  list.add(a[i]);
	    	  list.add(a[i+1]);
	      } else {
	    	  ArrayList<String> list = new ArrayList<>();
	    	  list.add(a[i]);
	    	  list.add(a[i+1]);
	    	  similarElementsMap.put(d, list);
	      }
	    }
	    
	    for (Entry<Integer, ArrayList<String>> map : similarElementsMap.entrySet()) {
			System.out.println("For distance : "+map.getKey());
			for (int i = 0 ; i < map.getValue().size(); i ++) {
				System.out.println(map.getValue().get(i));
			}
		}
	    
	  }
	  
	  
	  private static int distance(String a, String b) {
	   
	    a = a.substring(0, a.indexOf('@')).toLowerCase();
	    b = b.substring(0, b.indexOf('@')).toLowerCase();
	    // base case 
	    if (a.length() == 0) return b.length();
	    if (b.length() == 0)return a.length();
	    
	    int cost[] = new int[b.length() + 1];
	    // Initialise cost with each indexes initialy 
	    for (int i = 0 ; i < b.length(); i ++) {
	      cost[i] = i;
	    }
	    
	    // Iterate through a and b string and update cost array with each a iteration
	    for (int i = 1; i <= a.length(); i ++) {
	      cost[0] = i;
	      int k = i - 1;
	      for (int j =1; j <= b.length(); j ++) {
	        int c = Math.min(1 + Math.min(cost[j], cost[j-1]), a.charAt(i-1) == b.charAt(j-1) ? k : k + 1);
	        // Swap the min value with cost and k
	        k = cost[j];
	        cost[j] = c;
	      }
	    }
	    return cost[b.length()];

}

- tasneem March 30, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Given people already are massively overthought this - and they did not think what should be thought about : en.wikipedia.org/wiki/Similarity_measure.
Take a pause and think this.
Is it transitive? That is :

A similar_to B  and B similar_to C ==>(implies) A similar_to C

does it hold true? 99.999% of the real practical cases, it won't be.
Take name for example.
ram : ok, male
rama : ok, male again perhaps
ramaa : ambiguous - perhaps it is a female name ? Now thinking backward,
ramaa --> rama :: hmm. See the problem there?
what about roma? ramaa and roma are close but is ram and roma close? Well.

That... pose a problem.
How did we get it? By working in system - where close name matching may point to same individual - facebook is another classic example. Same problem persists in the Indian addresses.

Thus, we need to ask, is the similarity function transitive? That would have clear impact on any algorithm we try to apply.

- NoOne March 31, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I assumed that traistive rule doesnt work in this case (As NoOne mentioned it earlier). So that means I have to compare each emailaddress for similarity. so the brute force takes the complexity of O(n^2)

Here is a sample code, Is there any better solution

List<HashSet<string>> allSets = new List<HashSet<string>>();
            for(int i = 0; i < emailIds.Length; ++i)
            {
                HashSet<string> currentSet = new HashSet<string>();
                currentSet.Add(emailIds[i]);
                for (int j = 0; j < emailIds.Length; ++j)
                {
                    if(i == j)
                    {
                        //same check
                        continue;
                    }
                    if(true == IsSimilar(emailIds[i], emailIds[j]))
                    {
                        currentSet.Add(emailIds[j]);
                    }
                }

                if(currentSet.Count == 1)
                {
                    // there is no similar email id to this, so let just remove it
                    currentSet.Remove(emailIds[i]);
                }
                allSets.Add(currentSet);
            }

- The-Learner March 31, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Let's suppose the interviewer says it /is/ transitive, i.e. A=B, B=C --> A=C. And let's say they only want sets with multiple emails, i.e. no singleton sets need be kept.

LinkedList<String> input = ...;
List<HashSet<String>> output = new LinkedList<>();
while (!input.isEmpty()) {
	// pop one item
	String current = input.remove(0);// (if array list, take last instead)
	HashSet<String> equal = new HashSet<>();// (or ArrayList)
	// transfer any similar emails
	Iterator<String> it = input.iterator();
	while (it.hasNext()) {
		String s = it.next();
		if (isSimilar(current,s)) {
			equal.add(s);
			it.remove();// fast for LinkedList (otherwise careful)
		}
	}
	// only output non-singleton sets
	if (!equal.isEmpty()) {
		equal.add(current);//sic: add only if not empty
		output.add(equal);
	}
}

- gforman44 April 04, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Let's suppose the interviewer says it /is/ transitive, i.e. A=B, B=C --> A=C. And let's say they only want sets with multiple emails, i.e. no singleton sets need be kept.

LinkedList<String> input = ...;
List<HashSet<String>> output = new LinkedList<>();
while (!input.isEmpty()) {
	// pop one item
	String current = input.remove(0);// (if array list, take last instead)
    HashSet<String> equal = new HashSet<>();// (or ArrayList)
	// transfer any similar emails
	Iterator<String> it = input.iterator();
	while (it.hasNext()) {
		String s = it.next();
		if (isSimilar(current,s)) {
			equal.add(s);
			it.remove();// fast for LinkedList
		}
	}
	// only output non-singleton sets
	if (!equal.isEmpty()) {
		equal.add(current);//sic: add only if not empty
		output.add(equal);
	}
}

- gforman44 April 04, 2017 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

object Solution {
  
  import scala.collection._

  type Email = String
  
  def readInput(): List[Email] = {
    val sc = new java.util.Scanner (System.in)
    var n = sc.nextInt()
    var emails = new Array[String](n)
    for(i <- 0 until n) {
      emails(i) = sc.next()
    }

    emails.toList
  }

  def isSimilar(email1: Email, email2: Email): Boolean = {
    val maxDistance = 2
    
    def countify(str: String): Map[Char, Int] = {
      str.groupBy(identity).mapValues(_.length)
    }

    def distance(str1: String, str2: String): Int = {
      val (c1, c2) = (countify(str1), countify(str2))
      val (s1, s2) = (c1.keySet, c2.keySet)
      val all = (s1 | s2).toList

      all.map(x => math.abs(c1.getOrElse(x, 0) - c2.getOrElse(x, 0))).sum / 2
    }
    
    distance(email1, email2) <= maxDistance
  }

  def groupBySimilarity(emails: List[Email], similarFunc: (Email, Email) => Boolean): List[Set[Email]] = {
    
    val res = emails.foldLeft(Map.empty[Email, mutable.Set[Email]])( (map, email) => {
      val key = map.keys.filter(k => similarFunc(k, email)).headOption
      if (key.isDefined) {
        map(key.get) += email
        map
      } else {
        val set = new mutable.HashSet[Email]()
        set += email
        map + (email -> set)
      }
    })

    res.mapValues(_.toSet).values.toList
  }
  
  def main(args: Array[String]) {
    val emails = readInput()
    val grouped = groupBySimilarity(emails, isSimilar)
    grouped.foreach(g => println(g.mkString(" ")))
  }

}

- Aleks.Sidorenko April 19, 2017 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More