Facebook Interview Question
Software Engineer / Developers
Country: United States
Interview Type: In-Person
Create an array of pointers indexed by frequency, up to a maximum frequency; let's say 10 million.
Create a trie for all the substrings (or a suffix tree).
Each leaf of the trie will contain two variables and two pointers:
one variable contains the timestamp,
the other variable contains the frequency.
One pointer contains the address of the frequency array pointer for the leaf's frequency k.
The other contains the pointer to the next leaf of the same frequency (you can even use a doubly linked list).
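The layout described above could look roughly like this. It is only a sketch: `Leaf`, `buckets`, `MAX_FREQ` and the `substring` field are names made up for illustration, and the `frequency` field doubles as the index into the array instead of storing a real address.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_FREQ = 16  # the post suggests about 10 million; kept tiny for illustration


@dataclass(eq=False)
class Leaf:
    substring: str                  # hypothetical field, only for readable output
    timestamp: float = 0.0          # when the substring was last seen
    frequency: int = 0              # how often it has been seen; also the bucket index
    prev: Optional["Leaf"] = None   # previous leaf with the same frequency
    next: Optional["Leaf"] = None   # next leaf with the same frequency


# buckets[k] is the head of the doubly linked list of leaves with frequency k.
buckets: List[Optional[Leaf]] = [None] * (MAX_FREQ + 1)

# A leaf seen once at time 1.0 becomes the head of bucket 1.
leaf = Leaf("ab", timestamp=1.0, frequency=1)
buckets[leaf.frequency] = leaf
```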
When a substring is found again, you increase its frequency and change its next-leaf pointer
to the leaf pointed to by the next frequency array pointer.
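A rough sketch of that update step, assuming a plain dict stands in for the trie lookup and a doubly linked list per frequency. `FrequencyIndex`, `seen`, `_unlink` and `_push_front` are hypothetical names, and the sketch does not handle a frequency growing past `max_freq`.

```python
class Leaf:
    def __init__(self, substring):
        self.substring = substring
        self.timestamp = 0.0
        self.frequency = 0
        self.prev = None
        self.next = None


class FrequencyIndex:
    def __init__(self, max_freq):
        self.buckets = [None] * (max_freq + 1)  # buckets[k] = head of list for frequency k
        self.leaves = {}                        # assumption: dict replaces the trie search

    def _unlink(self, leaf):
        # Remove the leaf from the list of its current frequency in O(1).
        if leaf.prev is not None:
            leaf.prev.next = leaf.next
        else:
            self.buckets[leaf.frequency] = leaf.next
        if leaf.next is not None:
            leaf.next.prev = leaf.prev
        leaf.prev = leaf.next = None

    def _push_front(self, leaf):
        # Make the leaf the new head of the list for its frequency in O(1).
        head = self.buckets[leaf.frequency]
        leaf.next = head
        if head is not None:
            head.prev = leaf
        self.buckets[leaf.frequency] = leaf

    def seen(self, substring, now):
        leaf = self.leaves.get(substring)
        if leaf is None:
            leaf = self.leaves[substring] = Leaf(substring)
        else:
            self._unlink(leaf)
        leaf.frequency += 1
        leaf.timestamp = now
        self._push_front(leaf)
        return leaf


idx = FrequencyIndex(max_freq=8)
idx.seen("ab", now=1.0)
idx.seen("cd", now=2.0)
idx.seen("ab", now=3.0)  # "ab" moves from bucket 1 to bucket 2
```

Pushing at the head means each list stays ordered from most recently to least recently updated, as long as every update carries a newer timestamp.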
When you want the N most common substrings for a time frame T,
go through the linked list pointed to by the max frequency array pointer until you pass time frame T,
then repeat the procedure for the next-highest frequency.
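The query could be sketched as follows. This reads "time frame T" as a cutoff timestamp and assumes each list is ordered newest first, so the walk over a list can stop at the first leaf older than the cutoff. `top_n`, `push_front` and `cutoff` are hypothetical names, and only the `next` pointer is needed for reading.

```python
class Leaf:
    def __init__(self, substring, frequency, timestamp):
        self.substring = substring
        self.frequency = frequency
        self.timestamp = timestamp
        self.next = None


def push_front(buckets, leaf):
    # New or updated leaves go to the head, so each list is newest first.
    leaf.next = buckets[leaf.frequency]
    buckets[leaf.frequency] = leaf


def top_n(buckets, n, cutoff):
    """Return up to n (substring, frequency) pairs, highest frequency first,
    keeping only leaves whose timestamp is at or after the cutoff."""
    result = []
    if n <= 0:
        return result
    for k in range(len(buckets) - 1, 0, -1):   # start at the max frequency
        leaf = buckets[k]
        while leaf is not None and leaf.timestamp >= cutoff:
            result.append((leaf.substring, k))
            if len(result) == n:
                return result
            leaf = leaf.next
    return result


buckets = [None] * 6
for sub, freq, ts in [("ab", 3, 10.0), ("cd", 3, 20.0), ("ef", 1, 30.0), ("gh", 5, 5.0)]:
    push_front(buckets, Leaf(sub, freq, ts))

recent_two = top_n(buckets, 2, cutoff=8.0)      # "gh" is too old and is skipped
recent_three = top_n(buckets, 3, cutoff=15.0)   # only two leaves are recent enough
```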
update substring = O(1)
adding new substring = O(1)
N most frequent = O(N)
decrease substring = O(1) (don't update the timestamp while decreasing the frequency).
Of course, you also have the time taken to search for the string in the trie or suffix tree,
which would cost O(P + Sigma), P = length of pattern + max P. This can be obtained
by having a weighted balanced search tree at every node in the trie.
If a frequency exceeds the max, then create a new max frequency array pointer. :)
Or, if you are worried about the size of the frequency pointer array, then create
a frequency BST/heap/WBST etc. instead.
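One way the heap variant might look, as a sketch only: buckets live in a dict keyed by frequency, and a max-heap of frequencies (negated values in `heapq`) replaces the fixed-size array. `SparseFrequencyIndex`, `bump` and `max_frequency` are hypothetical names, timestamps are left out for brevity, and finding the max costs O(log F) amortized for F distinct frequencies rather than O(1).

```python
import heapq
from collections import defaultdict


class SparseFrequencyIndex:
    def __init__(self):
        self.freq = {}                    # substring -> current frequency
        self.buckets = defaultdict(set)   # frequency -> substrings with that frequency
        self.heap = []                    # negated frequencies; may hold stale entries

    def bump(self, substring):
        k = self.freq.get(substring, 0)
        if k:
            self.buckets[k].discard(substring)
        self.freq[substring] = k + 1
        if not self.buckets[k + 1]:
            # First substring at this frequency: record the frequency in the heap.
            heapq.heappush(self.heap, -(k + 1))
        self.buckets[k + 1].add(substring)

    def max_frequency(self):
        # Lazily drop frequencies whose bucket has become empty.
        while self.heap and not self.buckets[-self.heap[0]]:
            heapq.heappop(self.heap)
        return -self.heap[0] if self.heap else 0


index = SparseFrequencyIndex()
for s in ["ab", "ab", "ab", "cd"]:
    index.bump(s)
```

Memory then grows with the number of distinct frequencies actually in use instead of with the largest possible frequency.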
- Anonymous November 04, 2012