Google Interview Question
Applications DevelopersCountry: India
Interview Type: In-Person
you put a lot of effort into writing that up in plain text ... i wrote a similar solution shorter in code below : )
@emalaj23: because he is here to learn and not brag about it that he solved the problem.Please explain your algorithm rather than just duming some lines of code.
didn't mean to brag - just suggesting that sometimes its easier to try to code it rather than explaining it in text
my solution is very similar to his in that we keep a map where the key and value represent the range of consecutive numbers, except that i combine the ranges of consecutives before and after, so when inserting 4 with existing entries (2,3) (3,2) (5,7) (7,5), all of that is simplified to (2,7) (7,2)
It is very hard to implement the iteration over hash map. It is simply not good to modify the hash map while erasing existing entries and inserting new entries. Of course I don't mean the solution is incorrect, I am just expressing the practicality of writing a source code.
Sorting seems too obvious. If this is a legit Google interview question, I'd suspect there must be a better solution than O(n log n).
I can think of a linear solution, but it has a lot of assumptions; if the numbers are small positive numbers, let's say [0,31), then you can use a 32 bit integer and set the bits with the corresponding number in the int array. e.g., S = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3} becomes binary digit:
00000000 00010000 00001111 00111110
Then you can check the longest contiguous set bit sequence. This is linear time, but the worst case is the length of the bit vector, not the array length. Similarly, you can use 64 bit number for [0,64), for bigger numbers, you'll have to use a bit vector or a bit set.
Note that this solution will only make sense, if the numbers are positive, small, and/or there are a lot of repetitions.
On similar lines -- instead of trying to store them in bit array (or a binary number), we can simply iterate over [min, max] range and compute the max range that has all numbers present in given array.
This has fewer constraints. Only constraint :- [min, max] range should be reasonable. Numbers can be negative/big/repeated etc.
@oOZz: what if there are duplicates? I am not sure how to interpret the word "set of numbers" here - does it mean they are unique? But if not then for instance [1,1,1,4,5] your answer would return [4,5]. Basically what you did is countring sort and we can extend it a bit for negative/bigger/not unique numbers using a different data structure.
//Time Complexity: O(n)
public static int[] longestConsecutiveSequence(int[] data){
int first = Integer.MAX_VALUE; // the first number of the maximum consecutive sequence
int length = 0; // the length of the maximum consecutive sequence
Map<Integer, Integer> table = new Hashtable<Integer, Integer>();
for(int value: data) {
if(!table.containsKey(value)) {
int start = value;
int end = value;
if(table.containsKey(value + 1) && table.get(value + 1) >= value + 1) {
end = table.get(value + 1);
table.remove(value + 1);
table.remove(end);
}
if(table.containsKey(value - 1) && table.get(value - 1) <= value - 1) {
start = table.get(value - 1);
table.remove(value - 1);
table.remove(start);
}
table.put(start, end);
table.put(end, start);
if(end - start + 1 > length) {
first = start;
length = end - start + 1;
}
}
}
int[] seq = new int[length];
for(int i = 0; i < length; i++){
seq[i] = first + i;
}
return seq;
}
What is the expected time complexity? Sorting can get the job done easily in O(nlgn). Also, are there any pecularities about data that they all are +ve numbers and that they fall within a certain range, duplicates allowed etc etc?
sorting is definitely one solution with O(nlogn). Can we do it better?
there is no data pattern and numbers are not unique....
@anonymous
I think we can try do better. The notion of 'consecutive numbers' divides the set into equivalence classes. We can construct an undirected graph with numbers as node values and edges between two nodes if they differ in value by 1. Run DFS to find connected components(which are equivalent classes) finding their size along. The connected component with maximum size is the maximum contiguous subset. Complexity would be O(|V| + |E|). In this case, since we have |E|<|V|, we can say that we have got a O(|V|) solution.
The disjoint-set data structure also enables to paritition the set into equivalent classes. I suppose we can use that as well to find a solution to this problem.
@Dumbo you can get the connected components in liner time. However, how will you create a graph with edges that differ in value by 1? Doesn't that require you to search the graph every time you try to insert a node/vertex and add the edge? That'll be a quadratic operation.
Build the adjacency list using a hash table.Keys would be the numbers and values would be (number, groupid) pairs.
Traverse the array. Add number k as a key to the hash table if not already present. If numbers k+1 and k-1 are present in the hash table, add k to their adjacency lists and vice-versa. (To add more to the implementation details, we need to add k to it's own adjacency list as well). We will then have constant time per insertion.
Initially the groupid is set to null for all of the elements.Iterate over the elements of the hash table and run DFS marking the groupid of the connected components if not already set. Alongside, maintain the size of the largest connected component and its groupid. In the end print all elements of the groupid with the largest size
@Dumbo Is groupid just a bit set/not-set? If the elements come in the order 1,4,2,3....3 will go both in 2 and 4, how do we print that?
@oOZz
Thought of giving a shape to my ideas using connected components of a graph. Finally, a flawless solution with O(n) time complexity that works putting an end to speculations.
Provide input as:
19 16 5 1 9 3 18 17 8 20 4 10 2 11 3 6 13 15 12 14
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;
public class MaxConsecSubset {
int ccSize = 0, ccMin = 0, ccMax = 0;
int tmpSize = 0, tmpMin = 0, tmpMax = 0;
static HashMap<Integer, HashMap<Integer, Integer>> graph = new HashMap<Integer, HashMap<Integer, Integer>>();
public void DFS(int nodeId, int groupId) {
if (graph.get(nodeId).get(nodeId) == null) {
graph.get(nodeId).put(nodeId, groupId);
tmpSize++;
if (nodeId > tmpMax) {
tmpMax = nodeId;
}
if (nodeId < tmpMin) {
tmpMin = nodeId;
}
for (Map.Entry<Integer, Integer> entry: graph.get(nodeId).entrySet()) {
if (entry.getKey() != nodeId) {
DFS(entry.getKey(), groupId);
}
}
}
}
public void findEquivalenceClasses() {
for (Map.Entry<Integer, HashMap<Integer, Integer>> entry: graph.entrySet()) {
if (entry.getValue().get(entry.getKey()) == null) {
tmpSize = 0; tmpMax = -999999; tmpMin = 999999;
DFS(entry.getKey(), entry.getKey());
if (tmpSize > ccSize) {
ccSize = tmpSize;
ccMin = tmpMin;
ccMax = tmpMax;
}
}
}
}
public void readInput() {
Scanner scanner = new Scanner(System.in);
scanner.useDelimiter(System.getProperty("line.separator"));
Pattern delimiters = Pattern.compile(System.getProperty("line.separator")+"|\\s");
scanner.useDelimiter(delimiters);
int i;
while(scanner.hasNextInt()) {
i = scanner.nextInt();
if (!graph.containsKey(i)) {
HashMap<Integer, Integer> adjElements = new HashMap<Integer, Integer>();
adjElements.put(i, null);
graph.put(i, adjElements);
}
if (graph.containsKey(i + 1)) {
graph.get(i).put(i+1, null);
graph.get(i+1).put(i, null);
}
if (graph.containsKey(i - 1)) {
graph.get(i).put(i-1, null);
graph.get(i-1).put(i, null);
}
}
}
public static void main(String[] args) {
MaxConsecSubset maxCS = new MaxConsecSubset();
maxCS.readInput();
maxCS.findEquivalenceClasses();
System.out.println("The maximum consecutive subset is:");
for (int j = maxCS.ccMin; j <= maxCS.ccMax; j++) {
System.out.print(j + " ");
}
System.out.println();
}
}
@Dumbo..
I see that you call 2 functions - readInput() which is O(n) and then findEquivalenceClasses().
In second function, you call DFS for each node. DFS is O(n) and so findEquivalenceClasses() becomes O(n^2) and thus overall complexity as O(n^2).
Not sure, I am missing out something in ur algo...
@Anonymous:
Regardless of the number of times DFS is called, the algorithm to find connected components runs in O(|V| + |E|). The reason for this is upon the first visit, DFS() marks a node as visited and explores its adjacency list. Once a node is marked as visited, its adjacency list is never explored again. Thus the adjacency list of each node in a graph is explored only once, hence the complexity O(|V| + |E|).
In this specific case, |E| < |V|, so the complexity is O(|V|) = O(n).
@Dumbo: I agree with you that since we mark nodes as visited so essentially we visit each node only once. However, how are marking nodes as visited in algo..
or shouldn't hashmap entries be removed to prevent them from being visited again.
@anonymous
Here, I use the notion of a groupid, which is the id of a connected component. Initially the groupid is null. When a node is visited, groupid is set to a non-null value. This serves as a 'visited' flag in my algorithm. The groupid though is not much significance and is exclusively serving the purpose of a 'visited' flag.
int max_seq = 0;
int start = 0;
int end = 0;
for(int i = 0; i < l -1 ; i++) {
int current = a[i];
int next = a[i + 1];
// first sequence match
if (current + 1 == next) {
int tmp = max_seq;
max_seq = 2;
// check for further sequence
int k;
for(k = i + 1; i < l -1 ; i ++) {
current = a[k];
next = a[k + 1];
if (current + 1 != next) {
// sequence break
break;
} else {
max_seq++;
}
// maximum info poulation
if (tmp > max_seq) {
max_seq = tmp;
} else {
start = i;
end = k;
}
}
}
}
May be some variables are unused and could be optimized.
Variable name(s) are self explanatory
I think solution is so simple, does Google really look for it?
Guys if you have any better solution ....
I made a mistake..... So in that case we need to sort it first.
This may require typical nlog(n) time then some more operations.
So alternate solution is to maintain a HashTable (provided if numbers may not be unique)
NB: If the numbers are not unique then above code fragment should be changed ( a[k] == a[k + 1] + 1 || a[k] == a[k + 1]
Set<Integer> max_set = ...
HashSet<Integer> numbers = ...
numbers.add(number);
Iterator<Integer> iterator = numbers.iterator();
// iterates numbers
while(iterator.hasNext()) {
int number = iterator.next();
// checking for this number sequence
int c = 0;
int s = numbers.size();
Set<Integer> tmp_set = ...
while(c < s - 1) {
int nxt = number + ++c;
if(!numbers.contains(nxt)) {
break;
} else {
tmp_set.add(nxt);
}
}
// max sequence determination
if(tmp_set.size() > max_set.size()) {
max_set = tmp_set;
}
}
To find something hashing is better
Can we use a hash table? we'll take the smallest element from the array and add 1 to it, and will check whether hash table has new element or not. if it has we'll again add one to the new element and check. (here we got a set of consecutive numbers).
When new element is not present, we'll take the next smallest element in the hash table and go on with the same process.
Declare a struct
struct consecseq{
int first;
int last;
}
Declare an array of this structure where elements will be stored in sorted order of first/last.
For every element in array;
1) Do binary search to find the position where to place the value
2) if currval == (structarray[i].start - 1), add this element and change the first. Check if this can be merged with structarray[i-1] if exists
3) if currval == (structarray[j].last + 1), add this element and change the last. Check if this can be merged with structarray[i+1] if exists
4) If none of above, add a new structarray element in the current position and move the later elements.
5) At the same maximum and index of maximum length should be maintained.
O(n) solution using hash-table
Need to traverse the array twice
1. First, fill the array elements in the hash-table Now, start with the first element (say k),
2. Check if k is present in the hash-table. If not, go to the next element.
3. If yes, find the max n such that elements between k & k+n are all present in hash-table, and then remove them from the hash-table.
4. Similarly, find the max m such that elements between k-m and k are present in the hash-table and remove them from the hash-table.
5. Now we know a sequence which is present in the array.
6. Save k-m and the count (m-n) (this is the sequence found so far). Update the min_element and the count if m-n is larger than so far count.
7. Remove k from the hash-table.
8. Move to the next element in the array and repeat step 2
9. The largest sequence is : min_element, min_element+1, ..., min_element+count
After sorting the whole list
We need two pointers start and end
->n1, n2, n3 .................................. N<-
For each start pointer move pointer from end once end number == start number + (start index + difference between and start and end index) terminate program
Here if the probability is too high of getting a maximum sequence this provides best solution.
If there is too low probability then it does not provide then HashTable provides best solution
O(n) solution traversing the list only *once*
public static Set<Integer> compute(Set<Integer> ints){
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
// produce map of ranges of consecutive numbers
for(Integer key : ints){
if(map.containsKey(key)) continue;
map.put(key, key);
if(map.containsKey(key+1)){
map.put(map.get(key), map.get(key+1));
map.put(map.get(key+1), map.get(key));
if(map.get(key) != key+1) map.remove(key+1);
}
if(map.containsKey(key-1)){
map.put(map.get(key), map.get(key-1));
map.put(map.get(key-1), map.get(key));
if(map.get(key) != key-1) map.remove(key-1);
}
}
// find longest range
int longest = Integer.MIN_VALUE, start = 0;
for(Entry<Integer, Integer> entry : map.entrySet()){
if(Math.abs(entry.getKey() - entry.getValue()) > longest)
start = Math.min(entry.getKey(), entry.getValue());
}
// create the list to return
Set<Integer> answer = new HashSet<Integer>();
for(int i = start; i <= start+longest; i++)
answer.add(i);
return answer;
}
The map STL itself is implemented as a BST. Even if you use it to store (key,value) pairs, it'll use logn operations every time you check whether a key exists or not. I don't really see this as an efficient solution.
I have implemented an algorithm with complexity O(n) using a Hash Table (I used STL "unordered_map") that only has to iterate over the array once, and has onl a max of 6 accesses to the Hash Table per iteration.
I have implemented other working versions of this functions with complexity O(n) (one of them using a bitset), but this is the one I like the most. Here is the C++ implementation:
vector<int> findLongestSubset(int* arr, int size)
{
unordered_map<int,int> mp;
unordered_map<int,int>::iterator it;
mp[arr[0]] = 1;
int max_seq = 1;
int value_of_sequence = 0;
int max_seq_first_elem = 0;
int first_elem = 0;
for(int i=1;i<size;i++)
{
first_elem = arr[i];
//Check if element is already in the map
it = mp.find(arr[i]);
if(it!=mp.end())
continue;
value_of_sequence = 0;
//Check previous element in order by value if exists in map
it = mp.find(arr[i]-1);
if(it!=mp.end())
{
value_of_sequence = it->second;
first_elem = arr[i]-value_of_sequence;
//Update beginning of sequence
mp[first_elem] = value_of_sequence+1;
}
value_of_sequence++;
mp[arr[i]]=value_of_sequence;
//Check next element in order by value if exists in map
it = mp.find(arr[i]+1);
if(it!=mp.end())
{
value_of_sequence = value_of_sequence+it->second;
mp[first_elem] = value_of_sequence;
mp[arr[i]+it->second] = value_of_sequence;
}
//Update max sequence and max sequence's first element
if(value_of_sequence > max_seq)
{
max_seq = value_of_sequence;
max_seq_first_elem = first_elem;
}
}
vector<int> ret(max_seq);
for(int i=0; i<max_seq;i++)
ret[i] = max_seq_first_elem+i;
return ret;
}
I thought of this solution:
1. Create all the subsets and store the integer of each line in a set (disregarding order of integer values). Store each set to a list. Lets say call the list as
List<Set<Integer>> numberSubsetList = new ArrayList<Set<Integer>>();
2. set int longestSetIdx = -1, currSetLen = -1, longestSet = -1
2. For each set in the list
a. set currSetLen = # of item in current set
a. get max and min while computing sum of integers (call the sum as totalVal)
b. based on max and min compute sum of consecutive numbers between min and max
consecSum ((#of items in set)/2)*(max+min)
c. if totalVal = consecSum and longestSetLen < currSetLen then
longestSetIdx = index of set
longestSetLen = currSetLen
3. Return numberSubsetList.get(longestSetIdx)
Technically, this is O(N) disregarding of course the cost of populating numberSubsetList.
Everyone does really complicated stuff....
Here is an order n solution that is very easy to read (in Python).
def longest_common_subset( input ):
hash_map = {}
# O(n)
for elem in input:
hash_map[elem] = True
lcs = []
for elem in input:
#Try higher first
higher_lcs = []
counter = elem
while counter in hash_map:
higher_lcs.append(counter)
counter = counter + 1
#Try lower now
lower_lcs = []
counter = elem
while counter in hash_map:
lower_lcs.append(counter)
counter = counter - 1
longest_option_lcs = higher_lcs if len(higher_lcs) > len(lower_lcs) else lower_lcs
lcs = longest_option_lcs if len(longest_option_lcs) > len(lcs) else lcs
return lcs
@Farid:
1. i think longest_option_lcs should be
longest_option_lcs = lower_lcs + higher_lcs
2. It will blow up to O(n^2) when whole set is answer. O(n) for traversing the set and then for each element, you go through all elements again to construct longest set
the worst time complexity for this algorithm will be O(n^2), but i think this can easily be fixed. After using element to construct lcs, you can set hashmap[ele] = false. in the future traversal, if hashmap[ele] == false, then visit the following node. In this way, we make sure that each node is only visited once.
1. find the sequences in a hashmap with H[x] = x-1 if x, x-1 both are in the list, otherwise H[X] = 0, through one iteration of the set. O(N)
for example: { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3}
maps to : { 1->-1, 2->1, 3->2, 4->3, 5->4, 8->-1, 9->8, 10->9, 11->10, 20->-1}
2. reverse the map and also get the headers of the sequences, through one iteration of the original list.
new hashmap: { 1->2, 2->3, 3->4,4->5, 8->9, 9->10, 10->11 }, with headers {1, 8, 20 }
(3) from each header find the sequences and return the longest one, iterate throght all numbers once. O(N)
vector<int> findLongestSubset(int* arr, int size)
{
unordered_map<int,int> mp;
// iteration 1 to get connected components
for ( int i = 0; i < size; ++i )
{
int n = arr[i];
if ( mp.find(n-1) != mp.end() )
mp[n] = n-1;
else
mp[n] = -1;
if ( mp.find(n+1) != mp.end() )
mp[n+1] = n;
}
// iteration 2 to reverse the sequences
unordered_map<int,int> mp2;
vector<int> headers;
for ( int i = 0; i < size; ++i )
{
int curr = arr[i];
int next = mp[curr];
if ( next == -1 )
headers.push_back(curr);
else
mp2[next] = curr;
}
// iteration 3 to comparing the sequences
// we use vector here to store all numbers in a sequences, but start and end number are actually sufficient
vector<int> maxSequences;
for (int i = 0; i < headers.size(); ++i )
{
int curr = headers[i];
vector<int> sequence;
sequence.push_back(curr);
while ( mp2.find(curr ) != mp2.end() )
{
curr = mp2[curr];
sequence.push_back(curr);
}
if ( sequence.size() > maxSequence.size() )
maxSequence = sequence;
}
return maxSequences;
}
O(n) solution in Python.
def longest_sequential_subset(S):
A = {}
longest = set()
for x in S:
low = x - 1
high = x + 1
if low in A and high in A and A[low] is not A[high]:
A[low].update(A[high])
A[high] = A[low]
s = A[low] if low in A else A[high] if high in A else set()
s.add(x)
if len(s) > len(longest):
longest = s
A[x] = s
return longest
I also prefer using the hash table to solve the problem, and the expected time complexity is O(n), and here is my code in C++:
#include <iostream>
#include <vector>
#include <list>
struct node
{
int key;
int beg, end;
node( int key ) : key(key), beg(key), end(key){}
int size() const{ return end + 1 - beg; }
};
struct hash_table
{
const int divisor;
std::vector< std::list<node> > buckets;
hash_table( int divisor ) : divisor(divisor), buckets( divisor, std::list<node>() ) {}
typedef std::list<node>::iterator nd_iter;
std::pair< bool, nd_iter > insert( const node& nd )
{
int i = nd.key % divisor;
for( nd_iter it = buckets[i].begin(); it != buckets[i].end(); ++it )
if( it->key == nd.key ) return std::make_pair( false, it );
buckets[i].push_back(nd);
return std::make_pair( true, std::prev( buckets[i].end() ) );
}
std::pair< bool, nd_iter > search( int key )
{
int i = key % divisor;
for( nd_iter it = buckets[i].begin(); it != buckets[i].end(); ++it )
if( it->key == key ) return std::make_pair( true, it );
return std::make_pair( false, buckets[i].end() );
}
void find_longest()
{
int max_size = 0;
std::list<node>::iterator kt;
for( std::vector< std::list<node> >::iterator it = buckets.begin(); it != buckets.end(); ++it )
{
for( std::list<node>::iterator jt = it->begin(); jt != it->end(); ++jt )
{
if( jt->size() > max_size )
{
max_size = jt->size();
kt = jt;
}
}
}
for( int i = kt->beg; i <= kt->end; ++i ) std::cout << i << " ";
std::cout << std::endl;
}
};
void longest_increasing_sequence( int a[], int n )
{
hash_table ht(11);
std::pair< bool, hash_table::nd_iter > ait, lait, rait;
for( int i = 0; i < n; ++i )
{
ait = ht.insert( node(a[i]) );
if( ait.first )
{
lait = ht.search(a[i]-1);
rait = ht.search(a[i]+1);
if( lait.first ) ait.second->beg = lait.second->beg;
if( rait.first )
{
ait.second->end = rait.second->end;
rait.second->beg = ait.second->beg;
}
if( lait.first ) lait.second->end = ait.second->end;
}
}
ht.find_longest();
}
template< std::size_t n >
std::size_t size( int (&a)[n] )
{
return n;
}
int main( int argc, char* argv[] )
{
int a[] = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3 };
longest_increasing_sequence( a, size(a) );
return 0;
}
I first figured out the O(nlogn) solution (sort -> go through it once to find the longest sequence), and then immediately realized there must be an O(n) solution. Others have found it, but just to throw my C++, O(n) solution up here (I think others have come up with similar, but most seem to be more complex than they need to be):
1 #include <stdio.h>
2 #include <iostream>
3 #include <vector>
4 #include <unordered_set>
5
6 using namespace std;
7
8 int getCount(unordered_set<int> *nums, int ndx, bool goingUp) {
9 if(nums->find(ndx) != nums->end()) {
10 nums->erase(ndx);
11 int next = goingUp ? ndx+1 : ndx-1;
12 return 1 + getCount(nums, next, goingUp);
13 }
14
15 return 0;
16 }
17
18 void getLongestSequence(vector<int> nums) {
19 unordered_set<int> allNums;
20
21 for(int i = 0; i < nums.size(); i++) { // O(n)
22 allNums.insert(nums[i]);
23 }
24
25 int largest = 0;
26 int start = 0;
27
28 unordered_set<int>::iterator itr = allNums.begin();
29
30 // O(n) because only touching each node 1x (either here, or in getCount)
31 while(itr != allNums.end()) {
32 int total = 1;
33 int lowest = getCount(&allNums, *itr-1, false);
34 int highest = getCount(&allNums, *itr+1, true);
35
36 total += lowest + highest;
37
38 if(total > largest) {
39 largest = total;
40 start = *itr - lowest;
41 }
42
43 itr = allNums.erase(itr);
44 }
45
46 printf("Largest sequence is: { ");
47 for(int i = start; i < start+largest; i++) {
48 printf("%d ", i);
49 }
50 printf("}\n");
51
52 }
53
54 int main(int argc, char *argv[]) {
55 vector<int> nums;
56 int val;
57 while(cin >> val)
58 nums.push_back(val);
59
60
61 printf("Numbers are: { ");
62 for(int i = 0; i < nums.size(); i++)
63 printf("%d ", nums[i]);
64
65 printf("}\n");
66 getLongestSequence(nums);
67
68 return 0;
69 }
A solution in Python:
def f(A):
hash = {}
# Make a 'hash-table' with each number
for a in A:
hash[a] = a
ret = []
while hash:
# Get first item in the hash-list
cont = []
a = hash.pop(hash.items()[0][0])
# Find all < numbers
c = a - 1
while hash.has_key(c):
cont.append(c)
del hash[c]
c -= 1
cont += [a]
# Find all > numbers
c = a + 1
while hash.has_key(c):
cont.append(c)
del hash[c]
c += 1
if len(cont) > len(ret): ret = cont
return ret
A = [5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3]
print f(A)
A solution in Python:
def f(A):
hash = {}
# Make a 'hash-table' with each number
for a in A:
hash[a] = a
ret = []
while hash:
# Get first item in the hash-list
cont = []
a = hash.pop(hash.items()[0][0])
# Find all < numbers
c = a - 1
while hash.has_key(c):
cont.append(c)
del hash[c]
c -= 1
cont += [a]
# Find all > numbers
c = a + 1
while hash.has_key(c):
cont.append(c)
del hash[c]
c += 1
if len(cont) > len(ret): ret = cont
return ret
A = [5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3]
print f(A)
Amortized O(N) solution using HashSet and HashMap
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
public class MaxConseqSubseequence {
public static int[] getLongestConseqSeq(int [] arr){
HashSet<Integer> seen = new HashSet<>();
HashMap<Integer, int[] > numToRange = new HashMap<>();
for( int cur : arr){
if(!seen.add(cur)) continue;
int[] left = numToRange.get(cur-1);
int[] right = numToRange.get(cur+1);
if(left != null && right !=null){
if(left[0] != left[1]) numToRange.remove(left[1]);
numToRange.remove(right[0]);
left[1] = right[1];
numToRange.put(right[1], left);
} else {
if(left != null) {
if(left[0] != left[1]) numToRange.remove(left[1]);
left[1] = cur;
numToRange.put(cur, left);
} else if(right != null){
if(right[0] != right[1]) numToRange.remove(right[0]);
right[0] = cur;
numToRange.put(cur, right);
} else { // both null
int [] newRange = { cur, cur };
numToRange.put(cur, newRange);
}
}
}
int [] maxRange = null;
int maxRangeSize = -1;
for( int [] range: numToRange.values() ){
int rSize = range[1] - range[0];
if(rSize > maxRangeSize){
maxRangeSize = rSize;
maxRange = range;
}
}
return maxRange;
}
public static void main(String[] args) {
int [] arr = { 1,3, 2, 5, 1, 9, 3, 8, 6, 4, 7, 2, 11, 3} ;
System.out.println("The longest sequence in " + Arrays.toString(arr)+ " is "+ Arrays.toString(getLongestConseqSeq(arr)));
}
}
int main()
{
int S[] = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11, 3};
int N = sizeof(S)/sizeof(int);
unordered_map<int,bool> map;
int longest = 0, start;
for (int i = 0; i < N; i++)
map.insert(make_pair<int,bool>(S[i],true));
for (int i = 0; i < N; i++)
if (map[S[i]])
{
map[S[i]] = false;
int p = S[i];
while (map.find(p-1) != map.end())
map[--p] = false;
int q = S[i];
while (map.find(q+1) != map.end())
map[++q] = false;
if (q-p+1 > longest)
{
longest = q-p+1;
start = p;
}
}
for (int i = 0; i < longest; i++)
cout << start+i << " ";
return 0;
}
here is my answer to this question
- anonymous June 14, 20131. Create a new hashmap. Its key will contain starting index of contiguous set and value be ending index of contiguous set.
2. Iterator through array.
3. Let each value be x. look for x-1 and x+1 in hashmap.
case1. there exists an entry for x-1, so update its value to x if x is greater than existing value
case2. there exists an entry for x+1, get its value, say vx and insert an entry with key as x and value as vx.
case 3. there exists an entry for x; simply ignore
case 4: insert into hashmap, key x and value x
Ex: S = { 5, 1, 9, 3, 8, 20, 4, 10, 2, 11}
5 => { (5,5) }
1 => { (5,5), (1,1) }
9 => { (5,5), (1,1), (9,9) }
3 => { (5,5), (1,1), (9,9), (3,3) }
8 => { (5,5), (1,1), (8,9), (3,3) }
20 => { (5,5), (1,1), (8,9), (3,3), (20,20) }
4 => { (5,5), (1,1), (8,9), (3,4), (20,20) }
10 => { (5,5), (1,1), (8,9), (3,4), (20,20), (10,10) }
2 => { (5,5), (1,2), (8,9), (3,4), (20,20), (10,10) }
11 => { (5,5), (1,1), (8,9), (2,4), (20,20), (10,11) }
3 => { (5,5), (1,2), (8,9), (3,4), (20,20), (10,11) } -> (ignored)
after array iteration is done, repeat step 3 with minor modifications by iterating over map again and longest set will be with max diff b/w key and value.
Let key be k and value be x
case1. there exists an entry for x-1, so update its value to x if x is greater than existing value. delete entry with key = k
case2. there exists an entry for x+1, get its value, say vx and update k's value to vx.
delete entry for with key = x+1. repeat this step with x = vx
Now iterate over { (5,5), (1,2), (8,9), (3,4), (20,20), (10,11) }
5 => no action,
1 => update 1's value to 4 and delete (2,4) resulting in
{ (5,5), (1,4), (8,9), (20,20), (10,11) }
repeat with x = 4
update 1's key to 5 and delete (4,5) resulting in
{ (1,5), (8,9), (20,20), (10,11) }
8 -> update 8's value to 11 and delete (10, 11) resulting in
{ (1,5), (8,11), (20,20) }
repeat with x = 11 -> no action
20 => no action
{ (1,5), (8,11), (20,20) }
Haven't mentioned above, at each step we will maintain key-value pair with max diff. So at the end of map iteration, we have result.
Complexit = O(n)...though I'm finding this complex..Is there a simplified way of doing this.