Google Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
Andrei, if you use an array, the insertion are not always O(1). I suppose that you are using a dynamical array (because of the push_back). Every time you run out of space, you have to increase its size and copy every element to a new array with a bigger size.
Since I read this problem, I had an idea very similar to yours on how to implement this DS. But instead of using a Hash Table + Array, used 2 Hash Tables. I have posted the implementation below. Check it out.
@arturo You have a good point +1. I think Andrei is using a STL vector, because he mentioned push_back(). Insertion or removal of elements of std::vector are linear in distance to the end of the vector, which is O(n), if you're not inserting/removing from/to the end of the vector, which is what you're doing in your remove function (see en.cppreference.com/w/cpp/container/vector for complexity of operations).
I think a better choice would be a simple array with dynamic resizing and shrinking. You can use a load factor (3/4) to determine when to grow or resize the array and the complexity will be _amortized_ O(1).
@oOZz Even if you use an array with dynamic resizing and shrinking, and control the load factor, the total work of managing the array is O(n), (unless you know in advance the maximum number of elements you're going to have) because every time you have to resize the array (in some insertions or deletions), you'll have an operation of O(n) complexity of moving elements from one array to another.
I have posted below a solution with expected constant complexity O(1) in every of the mentioned 4 operations, using 2 Hash Tables. Check it out and let me know what you think about it.
@arturo, @Andrei's solution is fine, it just needs a little tweak on the delete operation. The inserts are to the end of the vector, which is O(1). However for deletes, instead of deleting from the ith position (which is O(n)), he needs to copy the last element over to the ith position and delete the last element, because deleting from the end from a vector is O(1).
Please note, that these operations are "amortized O(1)", which means when you consider entire sequence of operations of the program, in the worst case growing and shrinking the array can cannot occur again and again for a long time, thus "amortizing" its cost. (see en.wikipedia.org/wiki/Amortized_analysis)
@oOZz You're right. I didn't consider the Amortized analysis. In that case, the best solution, as you said, is using a Hash Table and a dynamic resizing array. Thanks for the information.
@arturo, @oOZz isn't deleting the node will require the node to looked up in array which is unsorted hence O(n) so delete isn't O(1) ??
This is an implementation using two Hash Tables that has an approximate complexity of O(1) for every operation (Add, Remove, Search and Random) if the Hash Tables are well implemented (good hash function and load factor).
The Hash Tables are:
map1: Key(string)->(Value(int), Index of map2 (int))
map2: Key(int)-> Value(string)
The Key of map2 represents an index (from 0 to size of data-1) and its value maps with Key of Map1.
Below you can see the C++ implementation (I used STL <unordered_map> for Hash Table), and assumed the Key is a "string" and the value is an "int".
#include <unordered_map>
#include <string>
#include <array>
using namespace std;
class MyStruct
{
public:
MyStruct() {size_of_data_=0;}
virtual ~MyStruct(){}
void Add(string str, int v){
array<int,2> arr;
arr[0] = v;
arr[1]= size_of_data_;
map1_[str] = arr;
map2_[size_of_data_++]=str;
}
int Search(string str)
{
unordered_map<string,array<int,2> >::iterator it =
map1_.find(str);
if(map1_.find(str)!=map1_.end()) //Element found
return it->second[0];
else //Element not found
return -1;
}
void Remove(string str)
{
unordered_map<string,array<int,2> >::iterator it =
map1_.find(str);
//If element to remove doesn't exist in our Data
if(it==map1_.end())
return;
//Get the index element in the map2
int map2_idx = it->second[1];
//Get key (string) of last element in the map2
string last_str = map2_[size_of_data_-1];
//Replace with element to remove with the last element
map2_[map2_idx] = last_str;
//Update the map2_idx of last key of map2
map1_[last_str][1] = map2_idx;
//Erase element from map1
map1_.erase(str);
//Erase last element from map2
map2_.erase(size_of_data_-1);
//Decrement size_of_data
size_of_data_--;
}
int Random()
{
//Generate big rand number - use your favorite randgen
unsigned int rand_number = GetRandNumber();
//If you want to return the value of the random element
return map1_[map2_[rand_number%size_of_data_]][0];
//If you want to return the key of the random element
//uncomment next line
//return map2_[rand_number%size_of_data];
}
private:
unordered_map<string, array<int,2> > map1_;
unordered_map<int,string> map2_;
int size_of_data_;
};
A hash table + an array that contains all the keys. The order of the keys stored in the array is not important
- Andrei June 15, 2013add
- we insert the key of the element at the end of the array (push_back).
- the element is then added in the hash table.
- we also keep in the hash table the position where the key was stored in the array
remove
- from the hash table we determined the position where the key of the element is stored in the array. We need then to removed it (this will create a hole). We move then the last key from the array to the position of the removed key. Of course, we need to update the position which was stored in the hash table (for this last key).
- then we do the "normal" removal from the hash;
getrandom
- since all keys are stored in the array, it's really easy to determine a random key ( array[rand() % array.size()]. we then return the random element from the hash table