Interview Question
Country: United States
Like @Loler said: record the frequency of each number present in those lists in a hash table. then iterate ( in order ) over the possible keys and populate a sorted list with number whose keys are present in the hashtable. here is my implementation:
public ArrayList<Integer> sort(List<ArrayList<Integer>> list, int n)
{
if(( list == null) || ( n < 1)){
return null;
}
/* hash table to count the frequency of each number that occurs in those lists */
HashMap<Integer, Integer> table = new HashMap<Integer,Integer>();
for(ArrayList<Integer> a_list : list ){
/* for each list */
for(Integer m : a_list ){
/* count the frequency of that number and update the hash table accordingly */
if( table.containsKey(m)){
Integer val = table.remove(m);
table.put(m, val + 1);
}
else{
table.put(m, 1);
}
}
}
/* single past, populate a sorted list based on
the occurrence of each number from 1 - n */
ArrayList<Integer> sorted_list = new ArrayList<Integer>();
for( int i = 1; i <= n ; i++ ){
if(table.containsKey(i)){
int val = table.remove(i);
while ( val-- > 0 ){
sorted_list.add(i);
}
}
}
return sorted_list;
}
@kx. You don't really need a hashtable, an array will do, but yes, something like that. +1 to you comment.
actually, but I think the interviewer expected to sort all the lists separately, not merging them into one sorted list
hence the answer is that add a prefix to every number for example for first list add prefix 1 to every number in first list in second list add prefix 2 to every number of that list ,finally take all these numbers and use radix sort
@rdx: Eugene has convinced me that your interpretation is not as artificial as it seemed to me! Good solution, seems like it will work. +1 to your comment.
@Anonymous: What I meant to say was you don't really need L to be exactly n. It can be n+100, or 1000n. Counting sort requires bounded integers, why must that bound be same as the number of such integers? You can sort a million bits, for instance. We don't restrict to just two :-)
@Loler: regarding your comment to rdx, I actually don't think it's artificial at all to ask to sort the lists separately. That's how I interpreted the question too, and it's a little more difficult that way. The question is how you will ensure strictly O(n) complexity even when the n elements are distributed across r different lists (let's assume r <= n). A direct attempt to use counting sort directly on each list will use up O(n) time *per list*, resulting in a total complexity of O(r*n), which may not be O(n). You will need a modified and trickier form of counting sort to get the O(n + r) time bound, which can be taken to be O(n) under the r<= n assumption.
@Eugene: Ah! Good point. Thanks.
I will edit my answer to mention the interpretation it follows.
As to the other interpretation, I suppose rdx's solution works.
Treat each number as a 2-digit number in base n and use radix sort, one of the digits telling you which list the number belongs to, and the other the actual number. We can use a stable version of counting sort for the single digit sorting (which is what I suppose you were referring to?).
@eugene.yarovoi I have edited the answer to include a stable version of count sort, which does solve both interpretations. I am curious to know your solution. Care to elaborate? Thanks.
@Loler: I was just going to have a list of occurrence locations for each of the n slots instead of just having an occurrence count. When input list number x encounters a value y, just do arr[y].add(x). All the lists can be processed together in this way, and then the values can be restored in sorted order.
@eugene: Yeah, I realized that you were probably thinking of something like that, after posting this. Of course, the one I posted trades time for space (space is O(n) if all we need is a stable sort), but makes an extra pass. If we store the list instead of a count, that is O(L+n) space.
@Anonymous: The question does not say the lists are already sorted.
Even if they were sorted, I am curious: what O(n) time algo do you have which can merge r lists (where r can be as large as n itself) which does not take into account the 1 to n bounds?
Sorry, have to give this answer a -1.
@Anonymous: I wish I could give your comment a -1, but that changes the order, so I will leave it at that.
A stable version of counting sort should cater to the two different interpretations of the problem we have seen so far:
1) Sort all into one big list
2) Sort each independently.
We solve 1), but include enough information in the output to solve 2)!
Maintain an array counts of size n for the counts. Increment counts[k] when you encounter k and then generate the sorted list at the end, by walking the input again(in order we got it), so that we can tag the output number with the list it came from. This way, the numbers are sorted, but we also know which list we got them from, which can be used to solve 2).
If the sum of the lengths of the lists is L, then this algorithm is O(L+n) time and O(L+n) space.
We don't exactly need n = L.
Sample python code. count_sort takes a list of lists, and returns a sorted list of tuples (number, list_number), sorting by number. sort_all just extracts the number and creates a big list. sort_single create a list of lists, and puts in number in the appropriate list using list_number.
Output
--------------------------------------------------------------
- Loler March 06, 2013