Facebook Interview Question
SDE1sCountry: United States
@HarryT:
suppose n = 3, m = 30
the following log would be correct according to the problem statement
time | bot id |
30 | 1 |
30 | 2 |
31 | 1 |
32 | 1 |
50 | 2 |
60 | 2 |
but if you only go back 10 seconds, you miss bot 1, since the question didn't say anything about the distribution. I assumed it's a more or less uniform distribution
so:
time | bot id |
30 | 1 |
30 | 2 |
40 | 1 |
40 | 2 |
50 | 1 |
50 | 2 |
I would look at 50=2 and 50=1 and 40=2 and 40=1 and 30=2 and stop here, because 50-30 > 10.
but as soon you have a little jitter due to scheduling or what ever it might look like:
time | bot id |
29 | 1 |
32 | 2 |
42 | 1 |
43 | 2 |
49 | 1 |
49 | 2 |
61 | 2 |
so, you look at 61 = 2 and 49 = 2 and stop because 61-49 > 10 ... ooops
so a safety factor would extend the time frame you look at so you catch reasonable outliers on the "arrival time" ...
the question said nothing about the distribution within this n seconds. Assuming a perfect uniform distribution, knowing the context, is a bit naive, I thought ...
My code assumes that the id is a character,instead of a string. This can be extended to strings by using another map.
import java.io.*;
import java.util.*;
public class Log
{
public char id;
public int time;
public Log(char a, int b)
{
this.id = a;
this.time = b;
}
public static List<Character> findutil(Log[] logs,int m, int n)
{
int[] diff=new int[10];
int i;
int[] count=new int[26];
for(i=0;i<10;i++)
{
count[i]=1;
diff[i]=0;
}
for(int j=i;j<26;j++)
count[j]=1;
HashMap<Character,Log> hm= new HashMap<Character,Log>();
for(i=0;i<logs.length;i++)
{
if(hm.containsKey(logs[i].id))
{
Log temp=hm.get(logs[i].id);
diff[i]+=logs[i].time- temp.time;
if(diff[i]<=n)
count[logs[i].id-'a']++;
hm.put(logs[i].id,logs[i]);
}
else
{
hm.put(logs[i].id,logs[i]);
}
}
List<Character> x= new ArrayList<Character>();
for(i=0;i<10;i++)
if(count[i]==m)
{ //System.out.println("count[i]");
System.out.println(count[i]);
x.add(logs[i].id);
}
return x;
}
public static void main(String[] args)
{
int m=3,n=20;
Log[] x=new Log[10];
Scanner s=new Scanner(System.in);
x[0]=new Log('a',10);
x[1]=new Log('b',12);
x[2]=new Log('a',13);
x[3]=new Log('c',16);
x[4]=new Log('a',19);
x[5]=new Log('b',26);
x[6]=new Log('d',37);
x[7]=new Log('e',46);
x[8]=new Log('f',47);
x[9]=new Log('g',54);
List<Character> p = findutil(x,m,n);
System.out.println(p);
}
}
public class MainTest {
public static void main(String[] args) {
int m=3,n=20;
Log[] x=new Log[10];
Scanner s=new Scanner(System.in);
x[0]=new Log("a",10);
x[1]=new Log("b",12);
x[2]=new Log("a",13);
x[3]=new Log("c",16);
x[4]=new Log("a",19);
x[5]=new Log("b",26);
x[6]=new Log("d",37);
x[7]=new Log("e",46);
x[8]=new Log("f",47);
x[9]=new Log("g",54);
Set<String> p = getBots(x,m,n);
System.out.println(p);
}
static class Log {
String id;
int time;
public Log(String id, int time) {
this.id = id;
this.time = time;
}
}
public static HashSet<String> getBots(Log[] logs, int m, int n) {
Queue<Log> logQueue = new LinkedList<Log>();
HashSet<String> violatingIds = new HashSet<>();
int lastTimeOnQueue = logs[0].time;
Map<String, Integer> frequency = new HashedMap<>();
for (Log log : logs) {
if (log.time - lastTimeOnQueue > n) {
for (String id : frequency.keySet()) {
if (frequency.get(id) >= m) {
violatingIds.add(id);
}
}
}
while (log.time - lastTimeOnQueue > n) {
Log lastLog = logQueue.poll();
lastTimeOnQueue = logQueue.peek().time;
frequency.put(lastLog.id, frequency.get(lastLog.id) - 1);
}
logQueue.add(log);
if (frequency.containsKey(log.id)) {
frequency.put(log.id, frequency.get(log.id) + 1);
} else {
frequency.put(log.id, 1);
}
}
return violatingIds;
}
}
scan the logs to figure out how many robots there are. I assume
the question is how to narrow down the time frame of logs scanned
(e.g. if they are on disc).
If the m/n frequency is without jitter, scan seiling(m/n) records is
sufficient. that's unrealistic due to scheduling, network latency etc.
I assume a safety factor of two would account for various
sources of errors and would scan too many entries by a factor
of two.
An other approach could be to scan the logfile until each robot-id
has been seen at least twice. Again this will fail eventually in extreme
cases with only two robots.
It's basically a probability experiment to maximize probability every
bot wrote a log in the time frame we are looking at (possion process)
something like this:
- Chris May 31, 2017