Bloomberg LP Interview Question
Financial Software DevelopersCountry: United States
Interview Type: In-Person
First, to illustrate the worst case is O(N) for any algorithm regardless the array be sorted.
Without formal proof, one can see the string as an function F from index i to v[i] - 'a', for 0 <= i < N (The identity function is obtained from array {'a', 'b', 'c', ..., 'z'} that contains only unique characters.)
Imagine F draws the points in the (character x time) plane in a given velocity (unit character per index). This way both repetition and skipping characters, e.g. {'a', 'a', 'c',...} 'a' being repeated and 'b' skipped, will alter its curve diverging from the identity line.
I bring this geometric analogy to purposely abstract us from enumerating (i.e.: looping over characters of the string) since in the plane they are lines with infinite points, resting us work with a discrete subset.
That said, and without proof, I propose finding the longest repeated char is equivalent to partition F in three segments w.r.t to velocity v = 1, v = 0, v > 1, and obtain the longest segment [i, j], j > i, s.t v = 0.
The solution steps are:
1- Partition y = F(i) curve for 0 <= i < N, resulting in K segments,.
We denote the complexity for this step as p(F(i)) = O(g_F(N))
I.e.: for a given F there is a function g s.t. partitioning has complexity O(g(N)).
2- Classify each segment S to one of the three partitions classes v = 0, v = 1, or v > 1, using any arbitrary function w, we denote its complexity by O((w(S)).
Combining both steps, we have S = K and the resulting complexity is O(g(N)) + O((w(S)). It is important to note steps 1 and 2 may be coupled logically. As result, w may be functionally dependent of g.
Nevertheless, for worst case analysis, if F is the identity (i.e.: no repetitive elements), S = N, since v = 0 \in [i, i+1) for 0 <= i < N. So, even if partitioning complexity g is sublinear, we still need to traverse the v=0 bucket to label them as 1 (if you may argue there is no need to label since all elements belong to one bucket. however one can reconstruct a case the where v = 0 \ [i, i+c) for 0 <= i < N-c given 0 < 1 < N/2 and v[j] = j * c for 0 < j <= N/c, making the labeling required to identify the repetition length c). On the other hand, if labeling is sublinear, partitioning is likely lifting the O(N) weigh.
That said, at this point I am convinced that O(N) is the worst case complexity. Then, I spend a good amount of time on reducing the average case.
I had several ideas using divide and conquer, algebraic manipulations, iterative solutions (reducing error not over N), etc.
One idea's that worth to mention is to use pivot to partition repetitive chars of size as follow
At iteration i-th, select a pivot randomly in [0,N) range and move to v[p] (initialize p=0).
Traverse over p < j < N and bubbling up all elements with exact i repetitions before pivot p.
The loop invariant is at end of iteration i-th, elements to the left of v[p] have repetitions equal or less than i, and greater to the right.
However, j < N is too loose since repetitions must have length i+1 at least. Therefore, if j < N-1, there must be a remainder at least i+1 elements in the array. Furthermore, because any subsequent repetition is of length i+1, a remainder portion (N - p) \in (i+1, 2*(i+1)-1] is the one maximum repetition. This was an important observation because allow us to ignore a fraction of the array, reducing complexity.
To remember the original goal, I wanted to design an algorithm that was required to be linear worst case and sublinear in average. Divide and conquer algorithms, will be O(nlogn) worst case (see Skiena section 4.10.3) unless the the division multiplier is 1, i.e.: T(n) = T(n/2) + f(n), this implies that if f linear then average is O(logn).
So, to combine this requirement with the previous idea above, we should elaborate a recursion that discards one of the halves and any recursive call f(n) has complexity O(n).
The previous observation was important because my base idea for a recursive algorithm is to assume the longest repetition has size k, then verify if this hypothesis holds true, if not recurse into a smaller hypothesis k/2.
0- Start from K=N, which is trivial to verify and has complexity f(k=N) = 1, i.e.: v[0] == v[K-1]?
1- k = k/2 also partition the current array by 2.
Here is the trick part:
Because we assume the maximum repetition has length k, IF we have the array ordered by repetition, it would be a matter of testing v[N-1] == v[N-k] && v[N-k] != v[N-k-1] (in this case f(k) = 2).
Because we can't afford the initialization costs for this sorting, we have to use another solution where f(k) is still linear.
The solution is then to check all sub-segments (at i-th recursion, they are 2^i).
The complexity for this check is then #segments, linear.
As result, the average complexity is O(log n)
// ZoomBA
s = 'abbbbcdddefffg'
seed = { 'cur' : '', 'count' : 0, 'max' : '', 'max_count' : 0 }
seed = lfold( s.value, seed ) -> {
if ( $.o != $.p.cur ){
if ( $.p.count > $.p.max_count ){
$.p.max = $.p.cur
$.p.max_count = $.p.count
}
$.p.cur = $.o
$.p.count = 1
}else{
$.p.count +=1
}
$.p
}
printf('%s -> %d\n', seed.max , seed.max_count )
Written in C# and in ConsoleAplication
string kelime = Console.ReadLine();
char enCokTekrarEden = kelime[0];
int tekrarSayisi = 1;
string[,] dizi = new string[1, 2];
dizi[0, 0] = kelime[0].ToString();
dizi[0, 1] = "1";
for (int i = 1; i < kelime.Length; i++)
{
if (kelime[i]!=enCokTekrarEden)
{
tekrarSayisi = 1;
enCokTekrarEden = kelime[i];
}
else
{
tekrarSayisi++;
enCokTekrarEden = kelime[i];
}
if (tekrarSayisi>Convert.ToInt32(dizi[0,1]))
{
dizi[0, 0] = kelime[i].ToString();
dizi[0, 1] = tekrarSayisi.ToString();
}
}
Console.WriteLine(string.Format("char {0} repeated {1} times", dizi[0, 0], dizi[0, 1]));
Console.ReadLine();
Improved solution: performing less number of comparisons per iteration, also checks if the current Max Count can be covered:
(sLen - i > maxCount) will avoid additional comparisons.
Solution in Java:
public static Character findLongestRepeatingChar(String s) {
if (s == null || s.length() == 0 || s.equals("")) {
System.out.println("Error, invalid input String!");
return null;
}
char[] chars = s.toCharArray();
char maxChar = chars[0];
char curChar = chars[0];
int count = 1;
int maxCount = 0;
int sLen = s.length();
for (int i = 1; i < sLen && (sLen - i > maxCount); i++) {
if (chars[i] == curChar) {
count++;
} else {
if (count > maxCount) {
maxCount = count;
maxChar = curChar;
}
count = 1;
curChar = chars[i];
}
}
return new Character(maxChar);
}
public static void main(String[] args) {
String test = "aabbbcccdddeeefffgggghhhjjjiiklllmmnnopqrstuvxz";
System.out.println(findLongestRepeatingChar(test));
}
In this implementation, you only have to check whether the current count is bigger than the max when a character boundary is crossed (and once at the end).
static void RepeatingChar(string s)
{
if (s.Length == 0)
return;
if (s.Length == 1)
{
Console.WriteLine("{0} occurs 1 time", s[0]);
return;
}
char maxChar = '\0', currentChar = s[0];
int maxCount = 0, currentCount = 1;
for(int i = 1; i < s.Length; ++i)
{
if (s[i] != s[i - 1])
{
if (currentCount >= maxCount)
{
maxCount = currentCount;
maxChar = currentChar;
}
currentCount = 1;
currentChar = s[i];
}
else
currentCount++;
}
if(currentCount >= maxCount)
{
maxCount = currentCount;
maxChar = currentChar;
}
Console.WriteLine("{0} occurred {1} times", maxChar, maxCount);
}
public static string GetMaximumRepeatedCharacters(string text)
{
if (String.IsNullOrWhiteSpace(text))
return null;
int max = 1;
char c = text[0];
char maxChar = c;
int repeats = 1;
int startingIndex = 0;
for (int i = 1; i < text.Length; i++)
{
if (c == text[i])
{
repeats++;
}
else
{
repeats = 1;
c = text[i];
}
if (repeats > max)
{
max = repeats;
maxChar = c;
startingIndex = i - (max - 1);
}
}
return text.Substring(startingIndex, max);
}
public static string GetMaximumRepeatedCharacters(string text)
{
if (String.IsNullOrWhiteSpace(text))
return null;
int max = 1;
char c = text[0];
char maxChar = c;
int repeats = 1;
int startingIndex = 0;
for (int i = 1; i < text.Length; i++)
{
if (c == text[i])
{
repeats++;
}
else
{
repeats = 1;
c = text[i];
}
if (repeats > max)
{
max = repeats;
maxChar = c;
startingIndex = i - (max - 1);
}
}
return text.Substring(startingIndex, max);
}
This Java version should work.
public static char find(String s) {
if (s == null || s.length() <= 0) {
return (Character) null;
}
int maxLen = 1;
int res = 0;
for (int i = 0; i < s.length(); i++) {
int start = i;
while (i < s.length() - 1 && s.charAt(i) == s.charAt(i + 1)) {
i++;
}
if (i - start + 1 > maxLen) {
maxLen = i - start + 1;
res = start;
}
}
return s.charAt(res);
}
public class longestrepeatingstring {
public static void main(String args[]){
String a = args[0];
int max = 0;
char longest = ' ';
int count = 1;
if(a.length() >0){
for(int i = 1; i < a.length() - 1;i++){
if(a.charAt(i) == a.charAt(i+1)){
count++;
}
else {
if(count+1 > max){
max = count+1;
longest = a.charAt(i);
}
count = 0;
}
}
if(count+1 > max){
max = count+1;
longest = a.charAt(a.length()-1);
}
if(max == 0){
max = count+1;
longest = a.charAt(0);
}
}
System.out.println(max);
System.out.println(longest);
}
}
#include <iostream>
#include <string>
int main() {
const std::string mstr = "aabbbcccdddeeefffgggghhhjjjiiklllmmnnopqrstuvxz";
char c = mstr[0];
int mmax = -1, cmax = 1;
for (int i = 1; i < mstr.size(); ++i) {
if (mstr[i] == mstr[i - 1]) {
cmax++;
}
else {
cmax = 1;
}
if (cmax > mmax) {
mmax = cmax;
c = mstr[i];
}
}
std::cout << mmax << ", " << c << std::endl;
std::cout << "Press enter to continue ..." << std::endl;
std::cin.get();
return 0;
}
#include <iostream>
#include <map>
#include <string>
using namespace std;
int main()
{
// find the longest repeating character in a sorted string
string myStr = "aabbbcccdddeeefffgggghhhjjjiiklllmmnnopqrstuvxz";
map<string,int> myMap;
map<string,int>::iterator it;
int count=0;
for(int i=0;i<myStr.length();i++){
if(myMap.find(myStr.substr(i,1))==myMap.end()){
myMap[myStr.substr(i,1)] = 1;
}
else{
myMap[myStr.substr(i,1)]++;
}
}
multimap<int,string,greater<int>> myMultiMap;
for(it=myMap.begin();it!=myMap.end();it++){
myMultiMap.insert(make_pair(it->second,it->first));
}
multimap<int,string,greater<int>>::iterator mit;
mit=myMultiMap.begin();
cout<<"The longest repeating character is: "<<mit->second<<endl;
cout<<"The size is: "<<mit->first<<endl;
}
public static char findLongestRepeatingChar(string inputString)
{
char[] charArray = inputString.ToCharArray();
char _currentChar = charArray[0];
char _maxChar = charArray[0];
int _strLength = inputString.Length;
int _count = 1;
int _maxcount = 0;
for (int i = 1; i < _strLength && (_strLength - i +_count> _maxcount); i++)
{
if (charArray[i] == _currentChar)
{
_count++;
}
else
{
if (_count >= _maxcount)
{
_maxcount = _count;
_maxChar = _currentChar;
}
_count = 1;
_currentChar = charArray[i];
}
}
if (_count >= _maxcount)
{
_maxcount = _count;
_maxChar = _currentChar;
}
return _maxChar;
}
public static void main(String[] args){
Scanner in= new Scanner(System.in);
String s= in.next();
ArrayList<Character> currentLogest=new ArrayList<>();
ArrayList<Character> current=new ArrayList<>();
char prev = s.toCharArray()[0];
for(char c : s.toCharArray()){
if(prev==c)
{
current.add(c);
prev=c;
if(current.size()>currentLogest.size())
{
currentLogest=new ArrayList<>();
currentLogest=current;
}
}
else
{
current=new ArrayList<>();
current.add(c);
prev=c;
System.out.println(current);
if(current.size()>currentLogest.size()){
currentLogest=new ArrayList<>();
currentLogest=current;
}
}
}
System.out.println("Size: " + currentLogest.size() + " longest run: " + currentLogest );
}
class Ideone
{
public static void main (String[] args) throws java.lang.Exception
{
System.out.println(findLongest("acddsssefffffale"));
}
public static char findLongest(String s)
{
int max = 0, len = s.length(),counter=1;
char longest = s.charAt(0),current = s.charAt(0);
for(int i = 1; i<len;i++)
{
char ch = s.charAt(i);
if(ch==current)
counter++;
else
{
if(counter>max)
{
max = counter;
longest = current;
}
counter = 0;
current = ch;
}
if(i==len-1 && counter>max)
{
max = counter;
longest = ch;
}
}
return longest;
}
}
public static void maxstr(String str){
char[] c=str.toCharArray();
char maxchar=c[0];
int startindex,endindex;
startindex=endindex=0;
int maxsize=endindex-startindex+1;
while(endindex<c.length){
if(c[endindex]!=c[startindex]){
int size=endindex-startindex;
if(size>maxsize){
maxsize=size;
maxchar=c[startindex];
}
startindex=endindex;
}
endindex++;
}
for(int i=0;i<maxsize;i++){
System.out.print(maxchar);
}
}
static void Main(string[] args)
{
// This program is in C#
Program p = new Program();
string str = "abbcccddddeeeeezzz";
p.Solve(str);
Console.ReadLine();
}
public char Solve(string str)
{
char ret = char.MinValue;
char[] arr = str.ToCharArray();
int count = 0;
for (int i = 0; i < arr.Length; ++i)
{
char c1 = arr[i];
int subcount = 0;
for (int j = 0; j < arr.Length; ++j)
{
char c2 = arr[j];
if (c1 == c2)
++subcount;
if (c1 < c2)
break;
} // end of for(j)
if (subcount > count)
{
count = subcount;
ret = c1;
}
} // end of for(i)
return ret;
}
The string is already sorted, so we just have to compare each element to its previous ones.
O(N) time O(1) space solution
- mombasa February 16, 2014