Oracle Interview Question
Software Engineer / DevelopersSimply search for the root node of the smaller tree in the larger one. Once that is found, do a simultaneous traversals as the follows :
eq(t1, t2) = t1.data=t2.data && eq(t1.left, t2.left) && eq(t1.right, t2.right)
By doing this you can break once equality fails.
Complexity in the worst case is still O(m+n). However, the average case is much better.
Are you guys sure the complexity is the worse case is still O (m +n ) Think about the case almost every node in T1 match T2.root, so you have to look for T2 in T1 many many times. Imagine that It only differ on one Node ( on the T1 subtree )
then you have to look the whole T1 ( O (n ) ) times T2 ( O (m - x ) where x is the number of nodes that did not match. then It ends up like O (n*(m-x)) ~ O(n*m) which is much more thatn O (n +m )
What do you think? If you don't agree elaborate more no the complexity please.
Complexity O (m*n) worse case scenario
public static boolean isSubTree(TNode t1, TNode t2)
{
if( t1 == null && t2 == null) return true;
if( t1 == null) return false;
if(t2 == null || (t1.value == t2.value && containsTree(t1,t2))){
return true;
}else{
if(t1.value > t2.value){
return isSubTree(t1.left,t2);
}else{
return isSubTree(t1.right,t2);
}
}
}
public static boolean containsTree(TNode t1, TNode t2)
{
if( t2 == null ) return true;
if( t1 == null ) return false; // t2 not found
if( t1.value == t2.value ){
return containsTree(t1.right,t2.right) && containsTree(t1.left,t2.left);
}else{
return false;
}
}
First we find t2 in t1 , it costs O(n), then we compare the two trees costs O(m), the overall should be O(n+m), correct me if I am wrong
You are doing recursively. So for millions of node very soon you will get stack overflow error.
Here is an optimization. Since T1 is too big, we don't want to do it recursively right? So what about a BFS like this:
public static boolean isSubtree2(TNode t1, TNode t2){
Queue<TNode> queue = new java.util.LinkedList<TNode>();
queue.add(t2);
while(!queue.isEmpty())
{
TNode current = queue.poll();
if(equalTree(current, t2)){
return true;
}else{
if(t2.value < t1.value){
queue.add(t1.left);
}else{
queue.add(t1.right);
}
}
}
return false;
}
equalTree could or not be recursivly since we are taking about hundred of nodes, which is not that big. But we can for sure make this none recursive and make it better.
public static boolean equalTree(TNode t1, TNode t2){
if(t2== null) return true;
if(t1== null) return false;
if(t1.value == t2.value){
return equalTree(t1.left, t2.left) && equalTree(t1.right,t2.right);
}else{
return false;
}
}
If it was not a Binary Search Tree then we can just change the folowing lines
if(t1.value > t2.value){
return isSubTree(t1.left,t2);
}else{
return isSubTree(t1.right,t2);
}
TO:
return isSubTree(t1.left,t2) || isSubTree(t1.right,t2);
I assume it's not a binary search tree. BSTs usually don't permit multiple nodes with the same value, so the algorithm would be trivial for a BST: search for the root of the small tree in the large tree, call the node found X, and check whether the subtree at X is identical to the small tree.
This is too much of a question:
1.With two millions of nodes coming into memory can I have another two million nodes memory available.
2. The lesser the memory the more the complicated the code is going to get.
3. Say there is not enough memory to bring 1/4 of any tree, the solution would be different then
4. If the entire tree cannot come into memory, is the tree in a Database, a file, ...In such a case it is an art to parse the tree itself [one needs to use memory mapped files]
5. If my Memory is small and hard disc is small and/or the Tree data is itself distributed in some 100 computers, then the solution lies in a distributed architecture.
6. If we go on like this, one can see that the actual solution to compare trees looks abysmal.
7. I wonder why they ask such difficult question.
8. In the Mathematics world this question would be rejected for not being precise.
/* A utility function to check whether trees with roots as root1 and
root2 are identical or not */
bool areIdentical(struct node * root1, struct node *root2)
{
/* base cases */
if(root1 == NULL && root2 == NULL)
return true;
if(root1 == NULL || root2 == NULL)
return false;
/* Check if the data of both roots is same and data of left and right
subtrees are also same */
return (root1->data == root2->data &&
areIdentical(root1->left, root2->left) &&
areIdentical(root1->right, root2->right) );
}
/* This function returns true if S is a subtree of T, otherwise false */
bool isSubtree(struct node *T, struct node *S)
{
/* base cases */
if (S == NULL)
return true;
if (T == NULL)
return false;
/* Check the tree with root as current node */
if (areIdentical(T, S))
return true;
/* If the tree with root as current node doesn't match then
try left and right subtrees one by one */
return isSubtree(T->left, S) ||
isSubtree(T->right, S);
}
You could always come up with a sort of hash function that was dependent on the current node, and the hash values of the left/right children.
In O(n) time you are able to calculate the hash value of the root node of the smaller tree.
And then in O(m) time you are able to calculate all node hash values in the larger tree. As you're cycling through the nodes of the larger tree, you're comparing the hash keys.
This solution would be O(m+n) time. With the only complications coming from possible key collisions.
Here is solution with O(m) complexity. m - Number of nodes in the larger tree.
public static boolean isSubTree(TreeNode p, TreeNode s) {
if (p == s & p == null) {
return true;
} else if (p == null || s == null) {
return false;
}
if (p.getData() == s.getData()) {
return isSubTree(p.getLeft(), s.getLeft())
&& isSubTree(p.getRight(), s.getRight());
} else {
return isSubTree(p.getLeft(), s) || isSubTree(p.getRight(), s);
}
}
This is something that came to my mind. Correct me if I am wrong.
- Coder January 24, 2014You can do the preorder traversal of the first tree. Store the result in a string. Do preorder traversal of second tree. Store result in second string. Find if second string is substring of first.
Next. Do the same for inorder traversal. If both the result come out to be true then T2 is subtree of T1.
Complexity will be : O(m+n).