Google Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: Phone Interview
Wow this is great!
Just to help somebody count the probability that for example the root node will be selected = 1* 1/2 (it was not replaced by second) * 2/3 (not replaced by third) * 3/4 (not replaced by 4th) = 1/4.
What I suggest is this: Let define two variables lcount and rcount whith each node. lcount is the number of nodes on left subtree and rcount is the number of nodes in right sub tree. Now we do an preorder traversal. At each node we select that node with probability 1/(1+lc+rc). We go to the left sub tree with probability lc/(1+lc+rc) and go to the right sub tree with probability rc(1+rc+lcc). If we reached a leaf select that node. You can easily check that the probability of selecting each node is 1/n (n number of nodes).
Node* preOrder(Node * node)
{
int lc= node->lc, rc=node->rc;
if (node->left ==NULL && node->right==NULL)
return node;
int r= rnd()*(lc+rc+1)+1;
if (r==1)
return node;
if (r>1 && r<=lc+1)
return preOrder( node->left)
if (r>lc+1)
return preOrder( node->right);
}
Why bother?
if you have a lcount and rcount.
Select a random number (say R) in 1,2,..., lcount+rcount+1.
Now traverse the tree picking the element whose inorder position is R.
@jobseeker - can you explain what this statement does?
int r= rnd()*(lc+rc+1)+1;
Also i cant understand "At each node we select that node with probability 1/(1+lc+rc). We go to the left sub tree with probability lc/(1+lc+rc) and go to the right sub tree with probability rc(1+rc+lcc)."
Can you please elaborate?
Also i thought of the same approach as Anonymous has mentioned. But i guess your algo is O(Log n)
Pre Order for what? Did you mean by taking a counter and do a preorder and reduce the counter. And return the node when counter is zero?
This question can be solved with 2 traversals of the tree. Count the number of nodes n in the first traversal. Generate a random number between 1 and n, and return the nth node in the 2nd traversal. Now the question becomes how to solve it with just one traversal.
This question is similar to the one "randomly return a number from a incoming stream of infinitely many numbers". We keep a variable x that stores the value we want to return and a count n of total numbers we have seen so far. We set x to the first number we see. When the 2nd number comes in, we keep x the same with probability 1/2 and set x to the 2nd number with probability 1/2. When the third number comes, we set x to the new number with probability 1/3, and keep x unchanged with probability 2/3. When we see nth number, we set x to this number with probability 1/n.
We apply the same algorithm to a binary tree with unknown nodes. Here is my code, tested.
struct Node
{
int d;
Node * left;
Node * right;
};
const Node * getRandomNodeHelper(const Node * root, const Node * result, int & count)
{
if (!root)
{
count--;
return result;
}
if (0 == rand() % count)
result = root;
const Node * lresult = getRandomNodeHelper(root->left, result, ++count);
return getRandomNodeHelper(root->right, lresult, ++count);
}
const Node * getRandomNode(const Node * root)
{
const Node * r;
int count = 1;
return getRandomNodeHelper(root, r, count);
}
The nodes should keep the number of children. This way random access can be done in log(n).
Let's say that the tree has n nodes, each node is labeled with the number of children + 1 and you want to find k-th node of the inorder traversal.
You start from root which is labeled 'n'. Look at left child's label: if the label is less than k then go left looking for k-th node. Otherwise go right and look for (n-k)th element. Either way you go down one level hence the number of such steps is at most the height of the tree, log(n).
Here's some pseudocode:
c - current node
k - the k-th node to find (k-th of inorder traversal), from 1 to n
repeat steps 1..7:
1. if c.left.label = k then return c.left //found it
2. if c.left.label >k then c= c.left //go left
3. else //c.left.label < k
{
4. k = k-c.left.label // we just eliminated c.left.label nodes
5. if k=1 then return c // found it, (the current node is on c.left.label + 1 position)
6. k--;
7. c=c.right //go right
}
struct treeNode
{
int data;
int count ; // this node childrens count
struct node *left, *right;
};
step1: asume , tree size is n.
spep2: use 0 to n-1 random genrator. random genrator everytime returns one of the value in between 0 to n-1 .
step3: now we know the node index to return, we can return this node in O(logn ) time by using tnode->count field.
e.g
int main()
{
index = NrandomGen() ;
randomNodeGenerator(root, index);
}
struct treeNode * randomNodeGenerator( struct treeNode *root, int nodeIndex)
{
if( (root) || (root->left == NULL && root->right == NULL))
{
return root;
}
else if ( root->left == NULL )
{
if(nodeIndex == 0)
{
return root;
}
else
{
nodeIndex = nodeIndex -1;
return randomNodeGenerator(root->right, nodeIndex );
}
}
else
{
if(nodeIndex == 0)
{
return root;
}
else if( root->left->count + 1 == index )
{
return root;
}
else if ( (root->left->count + 1 ) < index )
{
nodeIndex = nodeIndex - root->left->count -2 ;
return randomNodeGenerator(root->right, nodeIndex );
}
else
{
return randomNodeGenerator(root->left, nodeIndex );
}
}
}
Time complexity O(logn );
Not good. You can not use node count as a substitute for a proper node counter - there will be duplicates and gaps. I ran through examples with this logic that did not generate a result.
how about generate a number k between 1 to n, and do a BFS until totally k element has been traversed?
time O(logn), space O(1)
use numc in Node struct to denote how many children the node has:
struct Node
{
int data;
Node* left;
Node* right;
int numc;
};
Pseudo code:
Node* func(Node* curr)
{
return curr if its numc is 0
otherwise:
1/(numc+1) possibility return the curr;
//implement this by: int r=random(1, numc+1);
//if(r==numc+1) return curr;
numc/(numc+1) possibility: //else
{
return func(curr->left) if curr->right is NULL;
return func(curr->right) if curr->left is NULL;
otherwise:
left/(left+right) possibility return func(curr->left);
right/(left+right) possibility return func(curr->right);
// left=left subtree size = curr->left->numc+1; right is similar as left
}
}
//implement left/(left+right) possibility by: int r=random(1,left+right);
if(r>=1&&r<=left) //means left/(left+right) possibility.
{ //do something}
a very simple solution:
store the BIN TREE in an array with non existing childs marked as null.
Generate the random number and return the value corresponding to that
node.
This is applying reservoir sampling during tree traversal
void randomBTreeNode (Node * current, Node * & retVal, int & numNodesSoFar)
{
if (0 == numNodesSoFar && NULL == current)
{
retVal = NULL;
return;
}
srand (time(NULL));
double r = static_cast<double> (rand ()) / static_cast<double> (RAND_MAX);
++ numNodesSoFar;
if (r < 1.0 / numNodesSoFar)
{
retVal = current;
}
if (NULL != current -> m_left)
{
randomBTreeNode (current -> m_left, retVal, numNodesSoFar);
}
if (NULL != current -> m_right)
{
randomBTreeNode (current -> m_right, retVal, numNodesSoFar);
}
}
When testing the above code replace
srand (time(NULL));
double r = static_cast<double> (rand ()) / static_cast<double> (RAND_MAX);
with
/* in main */
int randomNums [100];
srand (time(NULL));
for (int i = 0; i < 100; ++i)
{
randomNums [i] = rand ();
}
/* in randomBTreeNode */
double r = static_cast<double> (randomNums [numNodesSoFar]) / static_cast<double> (RAND_MAX);
to get different random numbers for each step ... otherwise due to the fast cpus these days all random numbers generated will be the same for all steps; alternatively sleep for 5 seconds before generating each random number.
This question can be solved by using just one traverse of the binary tree.
The idea is, do a whatever order of travel, if it is the kth node we are traveling, then replace the previously selected node with probability 1/k.
This uses the idea of selecting a random node from a linkedlist where you don't know how long it is.
- Nova2358 July 12, 2012