Google Interview Question
Software Engineer / Developer
Country: United States
Interview Type: Phone Interview
That's an ideal answer to come up with, but I would like to know whether Google was happy with this answer, assuming this approach was presented.
Would the time complexity not actually be O(m+n), since it takes O(1) time to put each of the first bag's m = 1000 elements into a hash table?
The question leaves one condition unspecified: whether a duplicated element should be reported once or as many times as it occurs.
- If the expected result is without duplicates:
1. Use a HashSet, not a HashTable. They are similar but not identical structures.
2. Remove each element from the HashSet once it has been found - otherwise the same element can be reported several times.
- If the expected result counts duplicates:
1. Use a HashTable that stores each element's duplicate count as its value, and decrement the count on each match.
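Both variants above can be sketched in a few lines of Java. This is a minimal illustration, not the poster's code; the method names and the use of `HashMap` for the counted variant are my own choices.

```java
import java.util.*;

public class BagIntersection {
    // Variant 1: each common element is reported at most once.
    static List<Integer> distinctCommon(List<Integer> small, List<Integer> large) {
        Set<Integer> seen = new HashSet<>(small);
        List<Integer> result = new ArrayList<>();
        for (int x : large) {
            if (seen.remove(x)) {   // remove on first hit so x is reported only once
                result.add(x);
            }
        }
        return result;
    }

    // Variant 2: duplicates are counted -- each match consumes one occurrence
    // from the small bag's tally.
    static List<Integer> countedCommon(List<Integer> small, List<Integer> large) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int x : small) counts.merge(x, 1, Integer::sum);
        List<Integer> result = new ArrayList<>();
        for (int x : large) {
            Integer c = counts.get(x);
            if (c != null && c > 0) {
                counts.put(x, c - 1);
                result.add(x);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> small = Arrays.asList(1, 2, 2, 3);
        List<Integer> large = Arrays.asList(2, 2, 2, 3, 4);
        System.out.println(distinctCommon(small, large)); // [2, 3]
        System.out.println(countedCommon(small, large));  // [2, 2, 3]
    }
}
```

Note how the counted variant reports 2 twice, because it occurs twice in the small bag, but not three times.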
Doesn't it depend on what kind of data structure is used to store the million elements, and accordingly how the search would be carried out?
Sedgewick says:
Bags. A bag is a collection where removing items is not supported—its purpose is to
provide clients with the ability to collect items and then to iterate through the collected
items (the client can also test if a bag is empty and find its number of items). The order
of iteration is unspecified and should be immaterial to the client.
So in this case, we can use retainAll(java.util.Collection c), which removes every element not contained in c and so retains only the elements that are also present in the first bag. Iterating over what remains of the second bag then gives the common elements. Please let me know if this approach would be appropriate...
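A quick sketch of the retainAll idea. One caveat: as the Sedgewick quote above notes, a Bag does not support removal, so the elements have to be copied into a standard java.util collection first; the helper name `common` and the use of ArrayList here are my own.

```java
import java.util.*;

public class RetainAllDemo {
    static List<Integer> common(Collection<Integer> smallBag, Collection<Integer> largeBag) {
        // Copy the large bag so the original is untouched, then keep only
        // the elements that also appear in the small bag.
        List<Integer> result = new ArrayList<>(largeBag);
        result.retainAll(new HashSet<>(smallBag)); // HashSet makes each contains() check O(1) expected
        return result;
    }

    public static void main(String[] args) {
        List<Integer> small = Arrays.asList(1, 2, 3);
        List<Integer> large = Arrays.asList(2, 3, 3, 4, 5);
        System.out.println(common(small, large)); // [2, 3, 3]
    }
}
```

Note that retainAll keeps duplicates from the large bag: 3 appears twice in the result because it appears twice in the large bag.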
1. We could use a Bloom filter for the first bag of elements (1000 elements). Since the number of elements is small, it is easy to size a Bloom filter in this case. Then pass every element of Bag2 through the Bloom filter; if all of its bits are already set, report that element as a probable match. If a number was never added to the Bloom filter, at least one of its bit positions will be unset, so true members are never missed. Note, though, that a Bloom filter can produce false positives, so candidates should be verified against the actual first bag before being reported.
2. If the range of numbers in Bag1 is known and is very small, we could just as well use a BitSet and set the corresponding bit for each of the 1000 elements. In a second pass we check, for each element of Bag2, whether its bit is already set in the BitSet; if yes, we report that element as common. The advantage of the BitSet is that it is much more memory-efficient, and unlike the Bloom filter it is exact.
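Both filter-based ideas above can be sketched together. This is a rough illustration, not a production Bloom filter: the sizing (2^14 bits, 3 hash functions) and the second hash function are arbitrary choices of mine, and all class and method names are made up for the example.

```java
import java.util.*;

public class FilterIntersect {
    // --- Approach 1: Bloom filter over the small bag (false positives possible). ---
    static class BloomFilter {
        private final BitSet bits;
        private final int size, numHashes;

        BloomFilter(int size, int numHashes) {
            this.bits = new BitSet(size);
            this.size = size;
            this.numHashes = numHashes;
        }

        // Derive k bit indices from two base hashes (double-hashing trick).
        private int index(int item, int i) {
            int h1 = Integer.hashCode(item);
            int h2 = Integer.reverse(item) ^ 0x9e3779b9; // second, loosely independent hash
            return Math.floorMod(h1 + i * h2, size);
        }

        void add(int item) {
            for (int i = 0; i < numHashes; i++) bits.set(index(item, i));
        }

        // Never false for an added item; may be true for an item never added.
        boolean mightContain(int item) {
            for (int i = 0; i < numHashes; i++)
                if (!bits.get(index(item, i))) return false;
            return true;
        }
    }

    // --- Approach 2: plain BitSet, usable when values are small non-negative ints. ---
    static List<Integer> bitSetCommon(int[] smallBag, int[] largeBag, int maxValue) {
        BitSet present = new BitSet(maxValue + 1);
        for (int x : smallBag) present.set(x);
        List<Integer> result = new ArrayList<>();
        for (int x : largeBag)
            if (present.get(x)) result.add(x); // exact: no false positives
        return result;
    }

    public static void main(String[] args) {
        BloomFilter f = new BloomFilter(1 << 14, 3);
        for (int x = 0; x < 1000; x++) f.add(x);
        System.out.println(f.mightContain(500)); // true: no false negatives

        System.out.println(bitSetCommon(new int[]{3, 7, 9},
                                        new int[]{1, 3, 3, 9}, 10)); // [3, 3, 9]
    }
}
```

The BitSet version assumes the values fit in a small non-negative range; the Bloom filter drops that assumption at the cost of occasional false positives.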
If hashing is not universal (i.e. collisions are possible), then sort m (= 1000) and n (= 1M)
and compare them the way the merge step of merge sort does.
Complexity: O(m log m + n log n + m + n) = O(m log m + n log n)
Since m < n, this simplifies to O(n log n).
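A minimal sketch of the sort-and-merge idea, assuming int elements (the class and method names are mine):

```java
import java.util.*;

public class SortMergeIntersect {
    // Sort both bags, then walk them in lockstep, exactly like the merge
    // step of merge sort: advance whichever side holds the smaller value.
    static List<Integer> common(int[] a, int[] b) {
        int[] x = a.clone(), y = b.clone();
        Arrays.sort(x); // O(m log m)
        Arrays.sort(y); // O(n log n)
        List<Integer> result = new ArrayList<>();
        int i = 0, j = 0;
        while (i < x.length && j < y.length) {          // O(m + n) merge walk
            if (x[i] < y[j]) i++;
            else if (x[i] > y[j]) j++;
            else { result.add(x[i]); i++; j++; }        // match: advance both
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(common(new int[]{5, 1, 3}, new int[]{3, 9, 5, 5}));
        // [3, 5]
    }
}
```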
But note, the question just said elements in a bag. We can't assume the elements are comparable (totally ordered).
If we can, here is another way via sorting:
1) Sort the smaller bag: O(m log m)
2) Binary-search each element of the larger bag in the result of 1): O(n log m)
Total complexity: O(m log m + n log m) = O(n log m)
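The two steps above can be sketched directly with `Arrays.sort` and `Arrays.binarySearch` (the class and method names here are my own):

```java
import java.util.*;

public class BinarySearchIntersect {
    // Sort the smaller bag once, then binary-search each element of the larger bag.
    static List<Integer> common(int[] smallBag, int[] largeBag) {
        int[] sorted = smallBag.clone();
        Arrays.sort(sorted);                       // O(m log m)
        List<Integer> result = new ArrayList<>();
        for (int x : largeBag)                     // n searches, O(log m) each
            if (Arrays.binarySearch(sorted, x) >= 0) result.add(x);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(common(new int[]{4, 2, 8}, new int[]{8, 1, 2, 2, 5}));
        // [8, 2, 2]
    }
}
```

Since log m is about 10 for m = 1000, this is noticeably cheaper than sorting the million-element bag.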
First, put all 1000 elements into a hash table. Then take each of the one million elements and look it up in the hash table. If a match is found, report it as part of the solution.
- Hingle McCringleBerry September 27, 2013
This also works if you don't have enough memory to hold all one million elements: just read a chunk from disk, check it, and then read the next chunk.
Space complexity: O(m)
Time complexity: O(n)
where m is the size of the smaller bag and n is the size of the larger bag.
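Because only the small bag needs to live in memory, the large bag can be consumed as a stream, which matches the chunk-by-chunk idea above. A minimal sketch, assuming the large bag is exposed as an Iterator (in practice this would wrap a file reader); all names are mine:

```java
import java.util.*;

public class StreamingHashIntersect {
    // The small bag fits in memory as a HashSet; the large bag is consumed
    // as a stream (e.g. read chunk by chunk from disk), so only O(m) extra
    // space is ever used.
    static List<Integer> common(Set<Integer> smallBag, Iterator<Integer> largeBagStream) {
        List<Integer> result = new ArrayList<>();
        while (largeBagStream.hasNext()) {
            int x = largeBagStream.next();
            if (smallBag.contains(x)) result.add(x); // O(1) expected lookup
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> small = new HashSet<>(Arrays.asList(10, 20, 30));
        Iterator<Integer> large = Arrays.asList(5, 10, 10, 30, 40).iterator();
        System.out.println(common(small, large)); // [10, 10, 30]
    }
}
```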