Google Interview Question
Software Engineer / Developer
Can be done in O(n), where n is the number of integers; in this case it's 10K.
It's O(n) because for each integer we can determine the number of bits set to 1 in constant time, since they are 16 bits in length (using the '&' bit operation: int count = 0; while (k != 0) { count++; k = k & (k - 1); }).
Counting the number of bits set in a number can be done in O(1)... magic... :)
unsigned int v; // count bits set in this (32-bit value)
unsigned int c; // store the total here
static const int S[] = {1, 2, 4, 8, 16}; // shift amounts
static const unsigned int B[] = {0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF}; // magic masks
c = v - ((v >> 1) & B[0]);              // sums of adjacent 2-bit groups
c = ((c >> S[1]) & B[1]) + (c & B[1]);  // sums of 4-bit groups
c = ((c >> S[2]) + c) & B[2];           // sums of 8-bit groups
c = ((c >> S[3]) + c) & B[3];           // sums of 16-bit groups
c = ((c >> S[4]) + c) & B[4];           // final 32-bit total
The number of bits in an int is constant, so certainly it can be done in O(1), even with a dumb algorithm that just loops through the bits. The "magic" of the code you pasted is not that it's O(1), it's that it is very fast for something that does not rely on pre-computation. That said, I'm willing to bet that a look-up table will still be much faster. Just try counting the number of arithmetic operations in that snippet of code.
Creating a lookup table for all 16-bit values (65,536 entries) is slower than just going through the 10K numbers. I think a lookup table of 8-bit numbers is a good optimization.
Some thoughts:
1) Since the question mentions unlimited memory:
We can read each input and check whether it is present in a lookup table. If yes, we read the count of set bits directly from there. Otherwise, we calculate the number of bits set in that input, add it to the result, and also populate the lookup table for future reference.
2) If the numbers are random over the full range (0 to 2^16 - 1), we can use the fact that if all numbers from 0 to 2^n - 1 are present, the total number of set bits can be calculated directly.
It satisfies the recurrence
f(n) = 2*f(n-1) + 2^(n-1) // with f(1) = 1;
This gives
f(2) = 2*1 + 2 = 4 (i.e., if all numbers from 0 - 3 are present, the total number of bits set will be 4)
f(3) = 2*4 + 4 = 12 (i.e., if all numbers from 0 - 7 are present, the total number of bits set will be 12)
We can continue this way and find f(16).
Then we can initialize an array of size 2^16, read the inputs, and count how many times each number appears.
Then, for each number that is missing, subtract its bit count from f(16), and for each extra occurrence of a number, add its bit count to f(16).
This gives the result.
Correction above:
f(3) = 2*4 + 4 = 12 ( i.e if all numbers from 0 - 7 are present , then total number of bits set will be 12 )
1) A lookup table of 65536 numbers is the best approach, since memory is not a constraint.
2) A lookup table of 256 numbers is the second best; the number of set bits of a 16-bit number would then be table[x>>8] + table[x & 0xff].
3) Solutions of the type posted by "S on July 28, 2010" should also work. I haven't tested that code though.
Disagree: 65536 >> 10K, so just going through each element will be way faster than creating a lookup table of 65K entries.
The 256-element table wins in any case.
Regarding the above, I think (2) is the best solution. The problem with (1) is that the total number of operations is 65536 + 10000 (creating the table + counting), whereas for (2) it is 256 + 10000.
If the number of integers were significantly greater than 10K, say 1 million, then having a larger table would make sense.
(3) does not look very efficient for 10K numbers. A simple lookup (base address + index*size) is all that's needed to get a value, compared with several shift and AND operations per number. It is the solution to a different problem...
Actually, we can populate many entries in the lookup table at a time.
For example, for the number 1235, while counting the number of 1's in the traditional way (bit counts in parentheses), we get:
1235 = 617 * 2 + 1 (6)
617 = 308 * 2 + 1 (5)
308 = 154 * 2 + 0 (4)
154 = 77 * 2 + 0 (4)
77 = 38 * 2 + 1 (4)
38 = 19 * 2 + 0 (3)
19 = 9 * 2 + 1 (3)
9 = 4 * 2 + 1 (2)
4 = 2 * 2 + 0 (1)
2 = 1 * 2 + 0 (1)
1 = 0 * 2 + 1 (1)
So from this one number we can populate 11 entries in the lookup table. Since there are 10K numbers and only 65536 possible 16-bit values, there is a good chance that the extra entries we populate will be hit later.
When I saw "unlimited memory", I too was tempted to use a 16-bit LUT. But I worry about the nature of this unlimited memory. Is it just main memory or is the fast cache memory also included? If cache is still limited, I would be very careful about putting out such a large LUT. I think that splitting up the 16-bit integer into two 8-bit LUT lookups might be a sane compromise.
Use a lookup table, for example one indexed by each 16-bit integer, or larger.
- fiddler.g July 27, 2010