## Facebook Interview Question for Software Engineer / Developers

Country: United States
Interview Type: In-Person

Comment hidden because of low score. Click to expand.
4
of 4 vote

Considering an unsigned 32 bit integer
There are 2^32 possible values ( about 4 Billion )
Allocate a single bit for each value ( This would required 512 MB of memory )
Read the file, for each integer I, set ith bit to 1
find the first bit which is not 1
print the number corresponding to this bit.

Comment hidden because of low score. Click to expand.
0

that should work no regardless the size of the integers. 1 million number = 2^30 numbers, each number representing to 32 = 2^5 numbers, it needs at most 2^30/2^5 = 2^25 integers, in 64-bit system is 2^25* 8 byte = 256M. Or I am talking about a different method?

Comment hidden because of low score. Click to expand.
0

@nanny: I doubt that you and DarkKnight are talking about the same thing. Do elaborate on your idea though.

Comment hidden because of low score. Click to expand.
0

Use bit map, each 32-integer could represent the existences of 32 numbers, we need 2^25 integers to present from 1 to 1 billion , which may need up to 256M memory. In 64 bit, each integer cost more space but could represent more numbers (64) so it doesn't matter the system is using.

Comment hidden because of low score. Click to expand.
0

@nanny: I see what you're saying. You're saying that by pigeonhole principle, we will conclude that if there's a billion numbers, at least one number between 1 and 1 billion + 1, inclusive, is unused. We will then make a bitmap with 1 billion + 1 bits and mark off occurrences of only numbers from 1 to 1 billion + 1. We will be able to do this with 128 MB. If the inputs were 64-bit integers, this would take no more memory, because we still only care about integers in the 1 to 1 billion + 1 range.

That's a decent solution. It's a different solution from what DarkKnight was referring to, though.

Comment hidden because of low score. Click to expand.
0

@eugene.yarovoi
The question is asking for any integer not in file. We can set the range to 1 to 1 billion. Any number out of the range, we just ignore it.
For example, if we are given 64 numbers, and we want to find an integer not there, we only need to check the existences of 1 to 64. We only need 2 integers to check the existences of 1 to 64: num_1 store the existence of 1 to 32 and num_2 stores the existences of 33 to 64. (One integer is actually works as 32 bool values).
That's my only understanding of DarkKnight's algorithm. Maybe you have a different interpreting?

Comment hidden because of low score. Click to expand.
0

@eugene.yarovoi
And that's independent to the 32-bit and 64-bits since if the bits in one integer doubled, it takes double memory but can present double amount of numbers as well. So doesn't matter which system is using

Comment hidden because of low score. Click to expand.
0

@Nanny: As you were writing your response, I realized what you were talking about and edited my response above before I saw your reply. See my changes.

Your algo makes sense. DarkKnight was almost certainly proposing a different solution, as evidenced by:

"There are 2^32 possible values ( about 4 Billion )
Allocate a single bit for each value ( This would required 512 MB of memory )"

He is proposing allocating a bit for every possible value of a number and doesn't incorporate the pigeonhole principle.

Comment hidden because of low score. Click to expand.
0

@eugene.yarovoi
Yes, I just read your response. I don't quite understand "he is proposing allocating a bit for every possible value of a number". He need some way to keep tack of the orders of numbers so no sorting is need. Bit map is the only way I can think of. Could you please explain more on this, take the example: given 64 numbers and find one not existing.

Comment hidden because of low score. Click to expand.
0

If the numbers are 32-bit numbers, since there are 2^32 possible 32-bit numbers, he would allocate 2^32 bits, regardless of the input size.

Comment hidden because of low score. Click to expand.
0

@eugene.yarovoi
How did you use 1-bit to store the existence of 2^32 instead of using a bit map?
Could you show me an example? I think DarkKnight and I was talking about the same method, "set the ith bit to be true". You have to use some way to keep the orders as I said, or maybe you have some other fancy way?

Comment hidden because of low score. Click to expand.
0

No, no, it's a bitmap. It's just that you propose creating a bitmap only for the first 1 billion values. It looks like DarkKnight was proposing creating a bitmap for all possible values (2^32 for 32-bit ints). Then his technique would not be extensible to 64-bit ints.

Comment hidden because of low score. Click to expand.
3
of 3 vote

External sort then binary search on the file

Comment hidden because of low score. Click to expand.
0

As far as I've read, binary search and quick sort both don't exploit the locality of reference hence if used on file sizes which can't fit in main memory, they cause excessive paging. Please correct me if I'm wrong.

Comment hidden because of low score. Click to expand.
0

@Algos: No one said anything about quicksort, first of all. External sort is usually done via a mergesort phase followed by an N-way merge algorithm (which is either done by more mergesorts or by something that looks like a modified heapsort).

Second of all, there's no reason why quicksort couldn't exploit locality of reference, as it does operate on largely contiguous data. It arranges the array from the front and the back at the same time, but that's only 2 locations and writes are contiguous in both those places, so with a proper buffering strategy, things would be fine.

Third, you're right that binary search is not always a great algorithm to use on files, but we only need to use it once, so the cost of it relative to the cost of sorting is minimal.

Comment hidden because of low score. Click to expand.
0
of 2 vote

Read each number in the file. Keep track of the largest number you encounter. When you are at the end. Add 1 to it.

Comment hidden because of low score. Click to expand.
0

If you find the MAX_INT value as the largest number, adding 1 to it will not work

Comment hidden because of low score. Click to expand.
0

Since the original question didn't mention bounds, there's nothing wrong with this solution as long as you preface it by saying "If the integers have no upper or lower limit, ...". But you'd probably be expected to come up with a decent solution for bounded integers too. :-)

Comment hidden because of low score. Click to expand.
0
of 2 vote

Pick a random 64 bit integer.

Comment hidden because of low score. Click to expand.
0

That would probably be my first thought too. To give some analysis: the person who posted this answer probably (and apologies if I'm mistaken) meant that we should choose an integer uniformly at random from the space of all possible integers we're considering (let's say this question is about 32-bit ints). Then, because there's > 4*10^9 possibilities (in the case of 32-bit ints), if we scan our array of size 10^9 for the number we just chose, the chances we'll find it is < 1/4. If we don't find it, the algo terminates and returns the number. If we find it, we just run the whole algorithm again. So, we have that E[running time] = O(n) + P(we need to run again)*E[running time]. This implies E[running time] = O(n) / (1-P). Since we know P < 1/4, we know (1-P) > 3/4. This means E[running time] < 4/3 O(n), which means E[running time] is still O(n).

Comment hidden because of low score. Click to expand.
0

I think starting at MIN_INT and methodically working up would work better than a randomly chosen number. Since the list of numbers is random, whatever number you choose has the exact same odds of being in the list as a number chosen randomly and also doesn't add the risk of randomly choosing the same target number again. It still seems like too much of a brute force method though.

The only thing my tired brain can come up with right now is doing a quicksort so that finding it becomes trivial. But again, it feels like there's a simpler solution I haven't put my finger on yet.

Comment hidden because of low score. Click to expand.
0

@RS: Why do you think the list of numbers is random? This is not stated in the problem. What if they're adversarially chosen?

Comment hidden because of low score. Click to expand.
0
of 0 vote

It is not clear from the question, If the integers are between 1 to 1B only. We can solve it by divide and conquer. We can divide all the integers based on their 1st bit i.e. make two buckets, 1st bucket with all integers whose first bit is 0 and second bucket with integers whose first bit is 1. then count the integers in both buckets, The bucket which has less integers pick that and again divide it into 2 buckets with 2nd bit and so on, do this until you end with a bucket no element and keep track of the bit for the each chosen bucket. You can form the missing integer.

Comment hidden because of low score. Click to expand.
0

It's like the radix sort, and it will work, but there is a memory issue, to store a billion 4B ints will take about 4GB. If memory isn't an issue better solution would be counting sort(O(n)). To solve this in O(n) without additional memory some sort of dynamic search pattern should be used.

Comment hidden because of low score. Click to expand.
0
of 2 vote

I think using quicksort approach would definitely hit it. Suppose we are given a billion 64-bit integers we try to partition them around 2^63 and then we choose the part that has less (or equal) numbers. Suppose that left part has less numbers, then by comparing it to 2^62 we partition it (in place) again. until we reach a point that after partitioning there's one part that doesn't contain any number in it. Then you could choose any number in that range for an answer. That works for 32-bit integers, too.

Comment hidden because of low score. Click to expand.
0

Yup that works dude!

Comment hidden because of low score. Click to expand.
0

1,000,000,000 integers need 1,000,000,000 * 4 bytes = 4 GB.
How on hell are you going to quick sort this... external sort seems to be the only way if you want to sort and search...

Comment hidden because of low score. Click to expand.
0
of 0 vote

Scan the file first and store the values in a BitMap. Using 1 bit per integer, we'd need 125MB.

Next scan the bitmap for a 0 bit.

Comment hidden because of low score. Click to expand.
0

This is a good solution, however it only works for small integers (i.e. 32 bits). You would need a 512MB bitmap.

For larger integers (e.g. 64 bits), you would need a 2 Exabyte (2^61 bytes) bitmap which makes the solution not viable.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Two we need to know to answer this question:
1. Are the numbers in the file unique?
2. What is the range of the numbers in the file?

If the numbers in the file are unique then we are dealing with a billion numbers in the range [n to n+1billion+1]. The plus one is for the missing number. If the numbers are not unique then the numbers will range from [n to n+c+1] where c is unknown, and there will be repetitions in the file (unless the file is 1billion instances of the same number).

If assume we are dealing with a file containing 1billion unique numbers in the range [0 to 1billion+1] then an way to determine the missing number would be to:
1. calculate the sum of [0 to 1billion+1] (sum of arithmetic series = n(a1+an)/2)
2. then subtract each number contained in the file.

The sum is big (approx. 1x10^18) This could be represented in 60bits, so a long will be fine.

``````final long billion = 1000000000;
long sum = billion * (billion+1)/2;

while (file.hasNext()) {
sum -= file.next();
}

return sum;``````

Comment hidden because of low score. Click to expand.
0

Ah ok. So the numbers can be any integer, can repeat, and there is a whole set of integers that satisfy this problem.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Here's my Java solution for limited memory. Very minimal error checking.

``````int getIntNotInFile(String filename, int memoryInBits) {

if (memoryInBits<=0)
throw new IllegalArgumentException("We need some memory");

int low = Integer.MIN_VALUE, high = Integer.MIN_VALUE + memoryInBits;

while (low<=Integer.MAX_VALUE) {

BitSet bs = new BitSet(memoryInBits);

Scanner sc = new Scanner(new File(filename));
while (sc.hasNextInt()) {
int num = sc.nextInt();
bs.set(num - low);
}

int clearBit = bs.nextClearBit(0);
if (clearBit<memoryInBits)
return low + clearBit;

low = high + 1;
high = low + memoryInBits;
}

// We've reached the end of the file and we've seen every integer
throw new IllegalArgumentException("File does not have a missing integer");

}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

``````public class Solution{
public int findFirstNotIn(int[] arr){
byte[] ret = new byte[(1<<30)+1];
for(int a in arr){
if(a>0)
ret[a>>2] = 1;
}
for(int i = 0 ; i < ret.length ; i++){
if(ret[i] == 0) return i;
}
return 0;
}
}``````

Comment hidden because of low score. Click to expand.
0
of 0 vote

Use a bloom filter - en.wikipedia.org/wiki/Bloom_filter
A Bloom filter with 1% error and an optimal value of k, requires only about 9.6 bits per element. So the algorithm would be as follows:
1. Add all the numbers to the Bloom filter
2. Look up numbers from 1 to 1B in the Bloom filter. Return the first one that is not found.

Since bloom filter addition and look up are constant time, this is an O(n) solution

Comment hidden because of low score. Click to expand.
0

We don't win anything with the bloom filter. We only want to store "presence"; whether a number was present in those billion numbers or not. We can do this by using a BitSet, for instance, which requires 1 bit per element. Again, we don't win anything with a bloom filter if it requires ~10 bits per element. Not to mention the error rate that comes with it.

Comment hidden because of low score. Click to expand.
0
of 0 vote

Perhaps xor for all of them gives the result.

Comment hidden because of low score. Click to expand.
-1
of 1 vote

``````int input[] = { 0, 1, 2, 3, 4, 5, 6, 7, 12, 22, 23, 32, 54, 66, 88, 102, 12312, 1000000,  1012739, Integer.MAX_VALUE };
long sum = 0;
int max = 0;
int res = 0;
int i;

// Sum all the ints into a long variable, in this example ints come from int input[]
for (i = 0; i < input.length; i++)
{
sum += input[i];
if (max < input[i])
{
max = input[i];
}
}

// simple check
if (max != Integer.MAX_VALUE)
{
res = max + 1;
}
else
{
// if i cannot be subtracted from the sum it is the result, otherwise subtract
for (i = Integer.MAX_VALUE; i >= 1 ; i--)
{
if (sum >= i)
{
while (sum >= i)
{
sum -= i;
}
}
else
{
res = i;
break;
}
}
}

System.out.println("Here is your new int: " + String.valueOf(res));``````

This works only for positive ints, to handle negative, another sum variable and for circle must be added.

Comment hidden because of low score. Click to expand.
-1
of 1 vote

Cantor's diagnoalization

Comment hidden because of low score. Click to expand.
-1
of 1 vote

Find the max number and max+1 is the number you are looking for.

Comment hidden because of low score. Click to expand.
-1
of 1 vote

Unbelievable the amount of bullshit we can read here. The best solution is, as someone mentioned, cantor's diagonalisation. You read the nth digit of the nth word, you add 1 to it modulo 10 (or even modulo 2 actually) (if the number doesn't have a nth digit, you just use 0+1 = 1) and this makes the nth digit of your new number. This new number is different from all other numbers from at least one digit (by construction). The complexity of this algorithm is O(number of lines). No other algorithm I have seen on this page runs as fast as this one.

Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

### Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

### Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.