Fernando
Given an array of integers, there is one that is repeated several times. How would you compute the length of the sequence of repeated elements?
- Fernando
Assuming the initial array is sorted, can you do better than O(n) for both spatial and temporal cost?
unknown Software Engineer Coding
@ChrisK you are right, you can't perform that search in less than O(n). You can perform the search in a binary fashion, but in the worst case you still have to go through all the elements of the array to find the repeated sequence. To be honest, the answer to the question comes before that part, so I did not think it through very thoroughly. The real answer is that you can't solve the problem better than O(n). The source code I gave is the proof: you are solving a relaxed version of the problem and it still takes O(n). 2^(log n) is still linear, as the base of the logarithm is also 2. You can get a formal demonstration using the master theorem. The recursion is T(n) = 2T(n/2) + O(1). The thing is that in the interview the interviewer assured me that you can do better. What a nice feeling when I finally got the proof that you can't. I am satisfied with that answer, so I will stop looking for an improvement over O(n).
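- Fernando May 30, 2017
If you know that the array is sorted you can reduce the memory cost to O(1). Here is an implementation in Python:
def length_of_repeated_elements(s):
    # Look for the first run of equal adjacent elements
    for x in xrange(len(s) - 1):
        y = x
        # The bounds check covers a run that ends at the last element
        while y + 1 < len(s) and s[y] == s[y + 1]:
            y += 1
        if x != y:
            # The run spans positions x..y inclusive
            return y - x + 1
    return 0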
To see if we can do better than linear cost, let's assume that we know the repeated value and a random position in the array at which it occurs. In that situation we can perform a binary search towards both sides to get the lower and upper limits of the repeated sequence.
For example, take the array [1,2,3,4,4,4,7], the repeated value 4, and the position 5 (counting from 0).
We could perform a binary search on each side to reach the limits, in this case 3 and 5.
Here is Python code that solves this relaxed problem.
def length_of_repeated_elements(s, value, position):
    if s[0] == s[-1]:
        return len(s)
    # Binary search to the left of position for the first occurrence
    high = position
    low = 0
    middle = position  # covers position == 0, where the loop never runs
    while (low < high):
        middle = (low + high) / 2
        print "Low", low, high, middle
        if s[middle] == value:
            if s[middle-1] != value:
                break
            else:
                high = middle
        else:
            if low == middle:
                middle = high
                break
            else:
                low = middle
    first_element = middle
    # Binary search to the right of position for the last occurrence
    high = len(s)
    low = position
    while (low < high):
        middle = (low + high) / 2
        print "High", low, high, middle
        if s[middle] == value:
            if middle == len(s) - 1 or s[middle+1] != value:
                break
            else:
                low = middle
        else:
            high = middle
    last_element = middle
    return first_element, last_element
The cost is 2 log(n).
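For comparison, here is a minimal Python 3 sketch of the same relaxed problem using the standard `bisect` module. It assumes the list is sorted and the repeated value is already known; `run_bounds` is a name made up for this example.

```python
from bisect import bisect_left, bisect_right

def run_bounds(s, value):
    """Return (first, last) indices of value in the sorted list s, or None."""
    first = bisect_left(s, value)        # leftmost index where value could go
    last = bisect_right(s, value) - 1    # rightmost index holding value
    if first > last:                     # value does not occur in s
        return None
    return first, last

print(run_bounds([1, 2, 3, 4, 4, 4, 7], 4))  # (3, 5)
```

The length of the run is then last - first + 1, and both lookups are logarithmic.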
Now, to answer the question about how to find the repeated value and the position, the answer is similar: we can perform the binary search until we hit a position that holds the repeated element. As we have shown with the previous code, it doesn't matter which position we find.
Here is a solution in python for the version of O(n)
from collections import Counter
def length_of_repeated_elements(s):
    c = Counter(s)
    max_length = v = -1
    for value, length in c.iteritems():
        if length > max_length:
            max_length = length
            v = value
    return v, max_length
Python version using a stack, keeping track of the last visited position to avoid infinite paths (some cases would require a set of visited positions or modifying the matrix).
def find_maze_exit(maze, start, end):
    frontier = [ (start, start) ]
    while len(frontier) > 0:
        current, last = frontier.pop()
        if current == end:
            return True
        for x, y in [(0, 1), (1, 0), (0, -1), (-1, 0)]:
            n_p = (current[0] + x, current[1] + y)
            # The row length must be read from the new row, not from the offset x
            if n_p[0] < 0 or n_p[0] >= len(maze)\
               or n_p[1] < 0 or n_p[1] >= len(maze[n_p[0]])\
               or n_p == last or maze[n_p[0]][n_p[1]] == 'X':
                continue
            frontier.append((n_p, current))
    return False
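The variant with a set of visited positions mentioned above could look roughly like this. It is a Python 3 sketch that assumes the maze is a list of rows, 'X' marks a wall, and start and end are (row, column) tuples.

```python
def find_maze_exit(maze, start, end):
    """Depth-first search; maze is a list of rows and 'X' marks a wall."""
    frontier = [start]
    visited = {start}
    while frontier:
        row, col = frontier.pop()
        if (row, col) == end:
            return True
        for d_row, d_col in [(0, 1), (1, 0), (0, -1), (-1, 0)]:
            n_row, n_col = row + d_row, col + d_col
            # Skip positions outside the maze
            if not (0 <= n_row < len(maze) and 0 <= n_col < len(maze[n_row])):
                continue
            # Skip walls and positions that were already queued
            if (n_row, n_col) in visited or maze[n_row][n_col] == 'X':
                continue
            visited.add((n_row, n_col))
            frontier.append((n_row, n_col))
    return False
```

Each position enters the stack at most once, so the search terminates even when the maze contains open loops.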
Hmm, in which way would a trie help to solve the problem? As the writer mentions in the question, what happens with the substrings starting in the middle of the string? The only answer I can think of is to build a trie for each letter of the string, but in that case it would be better to build all the substrings. Also, what kind of information can you exploit using a prefix? Using the string "cara", "c", "ca" and "cara" might not be in the dictionary, but "car" might be. You could try to do something smart, like finding common prefixes of the words in the dictionary to discard unsuccessful attempts as fast as possible, but that doesn't change the complexity.
Python code:
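def find_all_valid_substrings(s, d):
    solutions = set()
    for x in xrange(len(s)):
        # y is an exclusive end index, so it has to reach len(s)
        for y in xrange(x+1, len(s)+1):
            if s[x:y] in d:
                # solutions is a set, so use add instead of append
                solutions.add(s[x:y])
    return solutions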
You can use a stack to solve the problem with O(n) memory and time. If you are only tracking one type of parenthesis, you can replace the stack with a counter and improve the memory cost to O(1), as King@Work mentioned.
def is_balanced(a):
    parens = []
    for x in a:
        if x == '(':
            parens.append('(')
        if x == ')':
            if len(parens) == 0 or parens.pop() != '(':
                return False
    return len(parens) == 0
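The counter version for a single type of parenthesis could be sketched like this (Python 3, same function name as above, other characters are ignored):

```python
def is_balanced(text):
    """Check '(' / ')' balance with a counter instead of a stack."""
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:   # a ')' appeared before its matching '('
                return False
    return depth == 0
```

The counter plays the role of the stack height, which is all the stack was used for when there is only one kind of bracket.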
It can be solved with time O(n) and memory O(n)
from collections import Counter
def find_unique_char(a):
    for key, value in Counter(a).iteritems():
        if value == 1:
            return key
If it is a repeating string we can reduce the memory cost to O(1). If we take the first letter of the cycle and iterate from the end of the array, the first non-repeating letter must be the one that precedes the first letter of the cycle. This is not clear from the question; it is also not clear whether the cycle starts at the first letter or you have to find it.
def find_unique_char(a):
    for x in xrange(len(a)-1, 1, -1):
        if a[x] == a[0]:
            return a[x-1]
    return a[0]
Solution in python
def compute_interesting_times(s, t):
    def convert_number(x):
        pad = ''
        if x < 10:
            pad = '0'
        return pad + str(x)
    # Check times
    if s > t:
        raise ValueError("The init time is greater than the finishing time")
    # Parse times
    h_c, m_c, s_c = map(lambda x: int(x), s.split(':'))
    h_f, m_f, s_f = map(lambda x: int(x), t.split(':'))
    total = 0
    # Compare the current hour with the final hour (h_f), not with itself
    while not (h_c == h_f and m_c == m_f and s_c == s_f):
        # Transform the numbers into a proper string and
        # check if we have to increase the total
        a = '{}{}{}'.format(convert_number(h_c),
                            convert_number(m_c),
                            convert_number(s_c))
        if len(set(a)) <= 2:
            total += 1
        # Update the time counters
        s_c += 1
        if s_c == 60:
            s_c = 0
            m_c += 1
            if m_c == 60:
                m_c = 0
                h_c += 1
    # Check if the finishing time is an interesting time
    a = '{}{}{}'.format(convert_number(h_c),
                        convert_number(m_c),
                        convert_number(s_c))
    if len(set(a)) <= 2:
        total += 1
    return total
@sri The complexities are O(n^2). Temporally, this can be explained because we solve the problem by traversing the matrix, even if you can do it by traversing just half of it. Spatially, it is easy to see using the worst case, in which there is no preexisting connection between the cities. In that case you would need to generate (n-1) + (n-2) + ... + 1 connections, which has a quadratic cost. Basically, the question is asking you to generate a clique; you can research the topic to get a deeper understanding.
Solution in Python:
def RoadBuilder(nCities, builtRoads):
    solutions = set()
    m = [ [0] * nCities for _ in xrange(nCities) ]
    for x,y in builtRoads:
        m[x][y] = 1
        m[y][x] = 1
    for x in xrange(nCities):
        for y in xrange(x+1, nCities):
            if m[x][y] == 0:
                solutions.add((x, y))
    return solutions
The question is a little bit ambiguous... I thought the question was asking that, given an array of leaves, you had to construct a correct tree. Here is simple Python code that does that
def print_tree(leafs):
    current_level = leafs
    print current_level
    while len(current_level) > 1:
        next_level = map(lambda x: int(x[0] and x[1]),
                         [current_level[x:x+2] for x in xrange(0, len(current_level), 2)])
        print next_level
        current_level = next_level
@Yevgen I don't know why I can't reply below the source code... Anyway, the j index is not correctly updated. What happens when you are at N - 1? j has to be 0, not N, otherwise the code will raise an exception. I haven't checked the rest of the code; there might be more errors.
- Fernando May 23, 2017
As you can only go from one station to the next, you can't perform any kind of searching. The algorithm just needs to check that, starting from one station, you can always reach the next one with the accumulated gas.
def compute_gas(gas, cost):
    solutions = []
    for x in xrange(len(gas)):
        current_gas = gas[x]
        y = (x + 1) % len(gas)
        while x != y:
            if current_gas < cost[y - 1]:
                break
            current_gas = current_gas - cost[y - 1] + gas[y]
            y = (y+1) % len(gas)
        # Having exactly the gas needed for the last leg is enough,
        # consistent with the check inside the loop
        if x == y and current_gas >= cost[y - 1]:
            solutions.append(x)
    return solutions
This question is really a challenge. Are all the questions at this level so vague?
For all of those who didn't understand the question, it is asking to find all the sub-sequences of numbers of the array that can be mapped into letters.
For example, the array {1, 1, 1} has three sub-sequences: {1, 1, 1}, {1, {11}} and {{11}, 1}. That is, taking all three elements as different letters, grouping the last two elements of the array as one letter, or grouping the first two elements of the array as one letter. One important thing to note is that you can't take arbitrary elements of the array, for example the first element and the third one. We can see this on the next example, as some possibilities are discarded because the values are not adjacent. Another important thing to note is that you have to discard all sub-sequences that lead to invalid mappings, that is, all mappings that contain numbers not between 1 and 26. I hope this explanation helped. Here is the solving code in Python.
def compute_number(n):
    res = 0
    for exp, value in enumerate(reversed(n)):
        res += value * 10 ** exp
    return res
def compute_combinations(n):
    if len(n) == 1: return (n,)
    solutions = []
    for x in range(1, len(n)):
        c = compute_combinations(n[x:])
        solutions.extend(map(lambda r: n[:x] + r, c))
        number = compute_number(n[:x])
        if number > 0 and number <= 26:
            solutions.extend(map(lambda r: (number,) + r, c))
    number = compute_number(n)
    if number > 0 and number <= 26:
        solutions.append((number,))
    return set(filter(lambda x: 0 not in x, solutions))
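A more compact way to enumerate the same groupings is to decide at each step whether the next letter uses one digit or two adjacent digits. This is only a Python 3 sketch under my reading of the question (letters map to 1..26, no group starts with 0); `decodings` is a hypothetical name.

```python
def decodings(digits):
    """Return every split of digits into adjacent groups worth 1..26."""
    if not digits:
        return {()}
    results = set()
    # A letter uses either one digit or two adjacent digits.
    for size in (1, 2):
        if size > len(digits):
            break
        group = digits[:size]
        if group[0] == 0:          # no letter starts with 0
            continue
        number = int(''.join(map(str, group)))
        if 1 <= number <= 26:
            for rest in decodings(digits[size:]):
                results.add((number,) + rest)
    return results

print(sorted(decodings((1, 1, 1))))  # [(1, 1, 1), (1, 11), (11, 1)]
```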
Here is another way to obtain completely randomly a maximal element from an array of n elements using constant space
1) Find the maximal element of the array
2) Choose randomly a number x between 1 and n
3) Iterate the array spacing it x until you find a maximal element.
3.1) If you iterate over n increase x by one and start iterating from the beginning of the array.
3.2) If x gets over n choose randomly another spacing value.
3.3) For subsequent calls start iterating from the last found element
Python one liner for the n spatial solution
from random import choice
def get_max(s):
    return choice(filter(lambda x: x[1] == max(s), enumerate(s)))
Python version for the constant spatial solution
from random import randint
def get_max(s):
    maximum_value = max(s)
    spacing = randint(1, len(s) -1)
    next = spacing
    while True:
        if s[next] == maximum_value:
            yield next, maximum_value
        next = (next + spacing)
        if next >= len(s):
            next %= len(s)
            spacing += 1
            if spacing >= len(s):
                spacing = randint(1, len(s) -1)
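A different constant-space technique, not the spacing method above, is reservoir sampling: scan once, and every time the current maximum shows up again, replace the chosen index with probability 1/count. A Python 3 sketch follows; `random_max_index` and its `rng` parameter are names made up for this example.

```python
import random

def random_max_index(values, rng=random):
    """Pick the index of a maximal element uniformly at random, in one pass."""
    best = None       # largest value seen so far
    chosen = None     # index currently selected among the maxima seen
    count = 0         # how many times best has occurred so far
    for index, value in enumerate(values):
        if best is None or value > best:
            best, chosen, count = value, index, 1
        elif value == best:
            count += 1
            # Replace the choice with probability 1/count (reservoir sampling)
            if rng.randrange(count) == 0:
                chosen = index
    return chosen
```

It needs no prior pass to find the maximum, and an empty input simply yields None.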
To compute all possible pairs without brute force, which would mean testing every possible pair, I have used 10 ** (digits of the sum - 1) - 10 ** (digits of the sum - 2) as a lower bound and sum - (10 ** (digits of the sum - 2) - 10 ** (digits of the sum - 3)) as an upper bound. I am sure there are still better approximations
def get_num_digits(num):
    digits = []
    while num >= 10:
        digits.append(num % 10)
        num /= 10
    digits.append(num)
    return digits
def filter_pairs(x):
    n1 = get_num_digits(x[0])
    n2 = get_num_digits(x[1])
    for x in n2:
        if x not in n1:
            return False
        n1.remove(x)
    return True
def getNumbers(s):
    num_len = len(get_num_digits(s))
    pairs = []
    min_num = 10 ** (num_len - 1) - 10 ** (num_len - 2)
    max_num = s - (10 ** (num_len - 2) - 10 ** (num_len - 3))
    print min_num, max_num
    for x in xrange(min_num, max_num + 1):
        pairs.append((x, s - x))
    return filter(filter_pairs, pairs)
Yet another solution in python!!
# __future__ imports must come before any other statement in the file
from __future__ import print_function
from itertools import chain
def print_diamond(n):
    number_of_stars = list(chain(xrange(1, n+1, 2),
                                 xrange(n-2, 0, -2)))
    stars = []
    for e in number_of_stars:
        half = (n - e) / 2
        stars.append(' ' * half + '*' * e + ' ' * half)
    map(lambda x: print(x), stars)
Assuming a multi-core environment in which n is the number of cores/threads, using several threads wouldn't improve the big-O notation, as in a perfect setup it would only provide a speed-up of n, which is a fixed number. Also, this problem doesn't offer a perfect setup, as when k < n you can't use all the available threads/cores. Eventually, k will be less than n as you are merging arrays.
In practice, on a typical architecture, you can achieve important improvements, maybe of orders of magnitude, as you can usually manage the access to main memory more efficiently using several threads/cores.
Edited: The previous solution assumes a real scenario in which the number of threads/cores is less than the number of arrays. The maximum theoretical speed-up you can achieve sorting in parallel with n cores/threads is O(n log n) / n => O(log n). As previously stated, even in this ideal scenario you can't do that with merge sort, as each merge step halves the number of effective threads. For merge sort in the ideal case you can obtain a linear cost: O(n log n) / log n = O(n).
@Chris.k It depends on the architecture. But let's assume that reading from memory takes x cycles; this number has to be much higher than the cost of performing a comparison. Let's assume that cost is y. You can reserve one core to read from memory, reading (x/y) * (n - 1) blocks each time. In this way, for the other (n - 1) cores the access to memory is basically free, as when they finish the y comparisons they have a new chunk to compare. Perhaps I am missing something here. What do you think?
In general, for this kind of problem I prefer the tabling method, as it saves you all the pain of unrolling the recursion, which is especially useful in an interview. You have to check that DP is going to be effective, so you have to check that there are multiple repeated calls and that the solution is monotonic.
Here is a possible implementation in Python
# cost(n, color) is supposed to be the function with the
# cost for a given row and colors
from itertools import permutations
def compute_houses_cost(n, colors):
    tabling = dict()
    # Initialize the first column.
    # This saves the base case on the
    # recursive function
    for p in permutations(colors):
        tabling[(0, p)] = cost(0, p)
    def _aux(n, colors):
        if (n, colors) in tabling:
            return tabling[(n, colors)]
        res = []
        for p in permutations(colors):
            # Don't recurse using the same colors
            if p == colors:
                continue
            res.append(_aux(n-1, p))
        tabling[(n, colors)] = min(res) + cost(n, colors)
        return tabling[(n, colors)]
    return _aux(n, colors)
Classical variation of the binary search as other solutions have already specified.
Solution in python
def get_index(words):
    start = 0
    end = len(words)-1
    middle = (end - start) / 2
    while words[middle - 1] < words[middle]:
        if words[middle] < words[end]:
            end = middle
        elif words[middle] > words[start]:
            start = middle
        middle = ((end - start) / 2) + start
    return middle
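For reference, the usual form of this binary search, assuming the task is to find where a rotated sorted list of distinct words starts over (which is what the loop condition above suggests), can be sketched in Python 3 as follows; `rotation_point` is a name chosen for this example.

```python
def rotation_point(words):
    """Index of the smallest element in a rotated sorted list of distinct items."""
    low, high = 0, len(words) - 1
    while low < high:
        middle = (low + high) // 2
        if words[middle] > words[high]:
            low = middle + 1     # smallest item lies to the right of middle
        else:
            high = middle        # smallest item is at middle or to its left
    return low

print(rotation_point(['k', 'v', 'a', 'b', 'c', 'd']))  # 2
```

Comparing against the last element of the window also handles a list that is not rotated at all, in which case the result is 0.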
This one kept me thinking about why the condition would work for every case. The idea is to keep a window of at least k elements and two counters: to one you always add the elements entering on the right side of the window, and to the other you always add the elements on the left side of the window. If at some point the counter for the left side becomes negative, you can be sure that those elements don't contribute to the maximum sum and you can remove them from the window. As you don't know whether a removal produces the maximum sum, and this can happen several times, you need to keep track of the best sum seen so far too.
Here is a working version in Python
def max_subarray_with_k(v, k):
    running = 0
    current = sum(v[:k])
    max_sum = current
    for x in xrange(k, len(v)):
        current += v[x]
        running += v[x-k]
        if running < 0:
            current -= running
            running = 0
        if current > max_sum:
            max_sum = current
    return max_sum
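If you want to cross-check the results, an alternative formulation (prefix sums, not the two-counter window above) is easy to reason about: the best run ending at a given position is its prefix sum minus the smallest prefix sum at least k positions earlier. A Python 3 sketch, assuming 1 <= k <= len(values); `max_sum_at_least_k` is a hypothetical name.

```python
def max_sum_at_least_k(values, k):
    """Largest sum of a contiguous run of at least k elements (1 <= k <= len)."""
    prefix = [0]
    for v in values:
        prefix.append(prefix[-1] + v)
    best = None
    min_prefix = prefix[0]
    # A run covering values[i:end] needs i <= end - k
    for end in range(k, len(values) + 1):
        min_prefix = min(min_prefix, prefix[end - k])
        candidate = prefix[end] - min_prefix
        if best is None or candidate > best:
            best = candidate
    return best

print(max_sum_at_least_k([1, -2, 3, 4, -1, 2], 2))  # 8
```

This version uses O(n) extra memory for the prefix sums, so the window version is still the better one on memory.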
This is a solution for python with the set suggestion to remove duplicates
def anagrams(word):
    if len(word) == 1: return word
    solutions = []
    for x in xrange(len(word)):
        solutions.extend(map(lambda y: word[x] + y,
                             anagrams(word[:x] + word[x+1:])))
    return set(solutions)
The corner cases can be improved a little bit but this is a working version in python
def permutations(s):
    if len(s) == 1: return [s.lower(), s.upper()]
    solutions = []
    buf = s[0]
    x = 1
    while s[x].isdigit():
        buf += s[x]
        x += 1
    solutions.extend(map(lambda y: buf.lower() + y,
                         permutations(s[x:])))
    solutions.extend(map(lambda y: buf.upper() + y,
                         permutations(s[x:])))
    return solutions
The question states that the number at an odd index must be greater than the number at an even index, but it doesn't specify which index. That means that the number at the first odd position must be greater than any number at any even position. To do this we have to compute the median of the vector and assign the values accordingly. There are several ways to compute the median: the median of medians algorithm does it in linear time, while a heap is simpler but costs O(n log n). Here is the solving code in Python, using a heap
import heapq
def rearrange(input_vector):
    # Find the threshold with a heap (assumes distinct values)
    heap = input_vector[:] # This has cost n
    heapq.heapify(heap) # This has cost n
    # Discard the smaller half; each heappop costs log n
    for x in xrange((len(input_vector) + 1) / 2):
        heapq.heappop(heap)
    # Smallest value of the larger half
    median = heap[0] if heap else float('inf')
    # The rest of the code has cost n, so the total cost of the function
    # is dominated by the heap extraction, which is O(n log n)
    solution = [ 0 ] * len(input_vector)
    even = 0
    odd = 1
    for v in input_vector:
        if v < median:
            # Smaller values fill the even indices
            solution[even] = v
            even += 2
        else:
            # Larger values fill the odd indices
            solution[odd] = v
            odd += 2
    return solution
A possible optimization is to check whether the resolution matrix already holds a computed solution and add that value, instead of computing all the possible paths when adding a new element to the goals list
def visiting_matrix(matrix):
    def compute_successors(x, y):
        res = []
        for x1, y1 in [ (0, 1), (1, 0) ]:
            x2 = x + x1
            y2 = y + y1
            if x2 >= len(matrix): x2 -= len(matrix)
            if y2 >= len(matrix): y2 -= len(matrix)
            res.append((x2, y2))
        return res
    r = [ [ 0 ] * len(matrix) for _ in xrange(len(matrix)) ]
    positions = [ (x, y) for x in xrange(len(matrix))
                  for y in xrange(len(matrix)) ]
    for (x, y) in positions:
        total = 0
        goals = [ (x, y) ]
        for g in goals:
            total += 1
            successors = compute_successors(g[0], g[1])
            for s in successors:
                if matrix[g[0]][g[1]] > matrix[s[0]][s[1]]:
                    goals.append(s)
        r[x][y] = total
    return r
Here is a python answer for the question, no backtracking though
def invalid(s):
    for x in xrange(len(s)-1):
        if s[x] != s[x+1]: return False
    return True
def convert(a, b):
    if (a == 'A' and b == 'B') or (a == 'B' and b == 'A'):
        return 'C'
    elif (a == 'A' and b == 'C') or (a == 'C' and b == 'A'):
        return 'B'
    else:
        return 'A'
def pathlen(initial_string):
    frontier = [(initial_string, 0)]
    for (s, l) in frontier:
        if len(s) == 1: return l
        if len(s) == 0 or invalid(s): continue
        for x in xrange(len(s)-1):
            a = s[x]; b = s[x+1]
            new_string = s[:x] + convert(a, b) + s[x+2:]
            frontier.append((new_string, l+1))
    return -1
@ChrisK you are right about that
- Fernando May 30, 2017