What a wicked answer, Erasmus, thank you very much! I stand corrected,
plead guilty as charged and offer a reimbursement. You're right that
splice is not O(1), neither in Perl nor in Python. To preserve the
contiguous memory locations, I offer to run an equivalent chain of swaps
whenever the iterator i hits a number that must be moved to the left.
Here is the improved rearrange function:
def rearrange(a):
    good_i = -1              # last index that is already correctly placed
    i = 0
    while i < len(a):
        v = a[i]
        # is_odd/is_even are the helpers defined below
        if is_odd(good_i) and v < 0 or is_even(good_i) and v >= 0:
            if i != good_i + 1:
                # chain of swaps: shift a[good_i+1 .. i-1] one slot to
                # the right, then drop v into the freed slot
                for j in range(i, good_i + 1, -1):
                    a[j] = a[j - 1]
                a[good_i + 1] = v
            i = good_i = good_i + 1
        i += 1
    return a
The trick is to _move_ the next element to its place in the array, not to swap. Such an operation is also known as splice. I wrote it in Python. I'd argue it is O(N) even though the iterator i sometimes decreases a bit. The good news is that every splice we do fixes two elements simultaneously.
def is_odd(num):
    return num % 2 > 0

def is_even(num):
    return not is_odd(num)

def rearrange(a):
    good_i = -1              # last index that is already correctly placed
    i = 0
    while i < len(a):
        v = a[i]
        if is_odd(good_i) and v < 0 or is_even(good_i) and v >= 0:
            if i != good_i + 1:
                # splice: remove v at position i ...
                a[i:i + 1] = []
                # ... and re-insert it right after good_i
                a[good_i + 1:good_i + 1] = [v]
            i = good_i = good_i + 1
        i += 1
    return a
print(rearrange([1, 2, 3]))
print(rearrange([-1, -2, -3]))
print(rearrange([1, 2, 3, -1, -2, -3]))
print(rearrange([1, 2, -1, -2, -3, 3]))
print(rearrange([-1, -2, -3, 1, 2, 3]))
print(rearrange([-1, 1, -2, 2, -3, 3]))
print(rearrange([1, -1, 2, -2, 3, -3]))
print(rearrange([-1, 1, -2, 2, -3, 3, 4, 5, 6]))
print(rearrange([-1, 1, -2, 2, -3, 3, -4, -5, -6]))
Enjoy
- laperla October 15, 2013

Hi Izaaz, in your example the first pass finds the critical threshold T to be 2. The second pass would then take T=2 divided by S=3 and ask all servers for all URLs that have a score >= 2/3. In other words, it would merge the complete set of URLs and thus get the URL G and the sum of the accesses to it in the second pass.
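A minimal sketch of that second-pass arithmetic. The three servers and the per-URL counts below are invented for illustration; only T=2 and S=3 come from the example:

```python
# Invented data: three servers (S = 3); URL "G" scores low on every
# single server but has a respectable aggregate.
servers = [
    {"A": 5, "G": 1},
    {"B": 4, "G": 1},
    {"C": 3, "G": 1},
]
S = len(servers)
T = 2                     # critical threshold from the first pass
threshold = T / float(S)  # 2/3

# Second pass: every server reports all URLs with score >= T/S ...
candidates = set()
for srv in servers:
    candidates |= {url for url, score in srv.items() if score >= threshold}

# ... and the merge sums the per-server scores of each candidate.
totals = {url: sum(srv.get(url, 0) for srv in servers) for url in candidates}
print(totals["G"])  # G passes the 2/3 cut on every server; its aggregate is 3
```

Because G's score of 1 beats 2/3 on each server, every server reports it, and the merge recovers its true total.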
- laperla October 07, 2013

Thanks, vik, for your thoughts on scalability. I think this shows how open-ended the question actually is. Without more knowledge about the topology of the machines, datacenters, load balancers, etc. involved, it is not possible to prove the quality and scalability of the algorithm in real life. A few things I would suggest discussing:
- is it a typical weblog file? Typical weblogs have a steeply declining start and a long flat tail, so the number of accesses at the tenth rank is usually high: if it is such a distribution, this algorithm is not bad.
- how biased are the load balancers? If the load balancers have a high degree of randomness built in, then the differences between the machines are statistically already similar enough that the first pass is quite good and the second pass is not much work.
- can clusters of machines be pre-merged? If racks or rows of racks can have a logserver that collects the full logs for the rack or row, then the number of servers to contact can be drastically reduced.
- how often are these calculations repeated? Which similar questions need to be answered? How often? How exact must the numbers be? How up to date? Each answer would influence the program and tuning we want to set up to produce fast and good numbers.
As I understand the task, the result must be one or more of the given points on the grid, not an extra point somewhere in between. In my opinion, this is backed by the hint that one should imagine a map of streets.
Given this interpretation is correct, the task is just to sum up, for each point, all distances that would have to be traveled if all people were to come together at that point. The distance between two points on a grid is trivial to calculate: just the distance on the x-axis plus the distance on the y-axis, no graph theory required. The sum must be computed for every point, so the runtime complexity is O(N^2).
Here is my solution written in Go. It contains 6 examples that can be verified with a bit of common sense by drawing the points on a grid.
package main

import "fmt"

type Point struct {
	X, Y int
}

type Solution struct {
	Indexes []int
	Min_sum uint
	Points  []Point
}

func abs(a int) uint {
	if a < 0 {
		a = -a
	}
	return uint(a)
}

func grid_distance(p1, p2 Point) uint {
	distance := uint(0)
	distance += abs(p1.X - p2.X)
	distance += abs(p1.Y - p2.Y)
	return distance
}

func median_of_grid(points ...Point) {
	sol := new(Solution)
	sol.Min_sum = ^uint(0) // largest possible uint
	for i, ivalue := range points {
		sum := uint(0)
		for _, jvalue := range points {
			sum += grid_distance(ivalue, jvalue)
		}
		if sum < sol.Min_sum {
			sol.Min_sum = sum
			sol.Indexes = []int{i}
		} else if sum == sol.Min_sum {
			sol.Indexes = append(sol.Indexes, i)
		}
	}
	for _, value := range sol.Indexes {
		sol.Points = append(sol.Points, points[value])
	}
	fmt.Println(sol)
}

func main() {
	// one solution:
	median_of_grid(Point{0, 1})
	median_of_grid(Point{0, 0}, Point{0, 1}, Point{1, 0})
	median_of_grid(Point{0, 0}, Point{3, 4}, Point{4, 3}, Point{1, 1})
	// two solutions
	median_of_grid(Point{0, 1}, Point{1, 0})
	median_of_grid(Point{0, 0}, Point{3, 4}, Point{4, 3})
	// four solutions
	median_of_grid(Point{0, 1}, Point{1, 0}, Point{2, 1}, Point{1, 2})
}
Enjoy,
- laperla September 28, 2013

Presuming a protocol exists that can ask three questions to each server:
* the score of a single URL
* the top ten
* all URLs that satisfy score >= T, for a given threshold T
We program a two-pass solution like so:
We denote the number of servers as S.
[First pass]
(1) Ask every server for its own top ten.
(2) Merge the results. For all URLs in the merged set, calculate correct values by asking all servers for their scores for each URL. Calculate a top ten from our sample.
(3) Pick the score of the now-tenth URL as the threshold that we try to beat in the second round. We denote the threshold as T.
[Second pass]
(4) Ask every server for all its URLs that satisfy score >= T/S.
(5) Merge these bigger samples again as in step (2).
(6) We now have the correct top ten with correct scores.
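Here is a compact sketch that simulates the two passes with in-memory dicts standing in for the servers. All URLs and counts are invented, and the "top ten" is shrunk to a top two to keep the example small:

```python
import heapq

# Invented per-server access counts; in reality each dict lives on its
# own machine behind the three-question protocol above.
servers = [
    {"/a": 100, "/b": 90, "/g": 40},
    {"/c": 80, "/d": 70, "/g": 45},
    {"/e": 60, "/f": 50, "/g": 49},
]
S = len(servers)
K = 2  # stand-in for "ten"

def global_score(url):
    # Steps (2) and (5): ask every server for its score for this URL.
    return sum(srv.get(url, 0) for srv in servers)

# First pass, steps (1)-(3): merge the local top K, then take the
# K-th global score as the threshold T.
seed = set()
for srv in servers:
    seed |= set(heapq.nlargest(K, srv, key=srv.get))
T = sorted(map(global_score, seed), reverse=True)[K - 1]

# Second pass, steps (4)-(6): everything scoring >= T/S anywhere.
candidates = set()
for srv in servers:
    candidates |= {u for u, v in srv.items() if v >= T / float(S)}
top = heapq.nlargest(K, candidates, key=global_score)
print([(u, global_score(u)) for u in top])
# "/g" appears in nobody's local top two, yet wins globally with 134.
```

Note how "/g" demonstrates the point of the second pass: it misses every local top list (so the first pass alone would get the answer wrong), but it clears the T/S cut on each server and is recovered in the merge.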
- can you follow the thought: every URL that has less than T/S hits on
all servers cannot be among the top ten anymore? The rationale being
that S*T/S equals T, and if they have a bit less than T/S everywhere, then
all of them together have less than T.
- if so: how about a URL that has less than T/S hits on a single
server? Sure it can be among the top ten iff the same URL has more
than T/S hits on a different server. Right?
- if so: we will get hold of such a URL on the other server that has
more than T/S hits there. Either we have already got it in the first
pass (1) or in the second pass (4).
- the exactness of the number of hits is guaranteed in steps (2) and
(5), after we have determined the candidate URLs. In the first pass just
a minimal seed of URLs, in the second pass all URLs that might
qualify.
- laperla August 16, 2015