Programming Interview Questions

eugene.yarovoi

Comments (50)

Follows (3)

Followers (16)

Reputation 5415

Page:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Comment hidden because of low score. Click to expand.

of 0 vote

@gaurav101: How would you update the min / max efficiently with your proposal? The trouble is that we need to keep the min / max updated as elements drop off the left end of the sliding window.

- eugene.yarovoi May 05, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@kannan: That part of the expression tells you the minimum delta that occurs at any point in the combined chunk. This is the smaller of the minimum delta in the first chunk, and the overall delta caused by the first chunk (r1.braceDelta, not r1.minBraceDelta) plus the minimum delta anywhere in the second chunk (r2.minBraceDelta). So the expression should be Math.min(r1.minBraceDelta, r1.braceDelta + r2.minBraceDelta), as posted. Let me know if further clarification of this point is needed.

- eugene.yarovoi April 29, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Yes, it's always copies. Keep in mind, however, that nothing is stopping you from using some pointer type for A, in which case you'd be making a copy of the pointers and not the objects themselves. Having some type like vector<int*> is perfectly valid.

- eugene.yarovoi April 25, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Or a superset of the array_to_find. The problem asks you to find the sets whose union contains all the elements in array_to_find, such that the number of sets is minimized.

- eugene.yarovoi April 19, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Top-down dp can be a constant factor worse. You probably mean it's no worse asymptotically, though.

- eugene.yarovoi April 19, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Not just better space, but better time too, because we don't necessarily have to calculate the answer for every value in [0, N], but only answers to subproblems that actually occur.

- eugene.yarovoi April 18, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Since any argument to the recursive call will have the form N / (2^i * 3^j), the number of possible distinct arguments is less than log_2(N) * log_3(N). For N <= 1 000 000 000, this upper bound is about 600 only. Therefore using the top-down memorization strategy is a lot better in this case.

- eugene.yarovoi April 18, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

No, it's the set cover problem alright. It's just here we have a small variation: we should not consider elements that are not in the "array to find". In other words, if a set contains elements not in the "array to find", ignore them, since those elements have no effect on the answer.

Typical set cover is trivially poly-time reducible to this variation (pick an "array to find" that contains the whole universe), and this variation is poly-time reducible to set cover just by deleting the irrelevant elements.

- eugene.yarovoi April 18, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Since the difference is allowed to be <= 2, you could at any given time have values that are -2, -1, 0, 1, and 2 from the middle element. That's all I meant.

- eugene.yarovoi April 16, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 7 vote

The problem in the PDF you linked to is different from the problem you describe. You ask to print the ranges; the PDF asks only to count the number of such ranges. The PDF also asks you to make the difference <= K and not just <= 2.

The problem as you posed it requires O(n + r) time, where r is the size of the output. Since the output is up to O(n^2) in size, no algorithm is possible that will always be O(n) -- the best we can hope for is an output-sensitive algorithm that solves the problem in O(n) if the output size is O(n) or smaller. Consider that an array of all 0s will need to print O(n^2) pairs of numbers.

The problem posed in the PDF only asks you to count the number of subarrays, not enumerate them. This is, in fact, doable in O(n). The basic idea is that you can start with arr[0] and then see how many more elements you can include before you violate the max -min <= K constraint. When you reach some index j you can no longer include, you know the maximum subarray starting at index 0 is arr[0...j-1], and you can count j different subarrays there. You then already know arr[1...j-1] is valid, and you try to advance the right boundary again (try arr[1...j], then arr[1...j+1]) and so on until again you reach a point past which you can't advance. Then you'll try subarrays starting with index 2, and so on. This is what they mean when they talk about "crawling" over the array.

The key issue here is how you will check whether a particular subarray is valid efficiently. One possibility would be to use a data structure that allows you to do minimum / maximum range queries (see the Maximum Range Query problem). However, in this case, you can reach the solution more simply by recognizing that the values over which you want to find maxima / minima can be modeled by a queue. When you advance the right boundary, you append an element to the queue; when you advance the left boundary, you dequeue an element. Now all that remains is to show how to implement a queue that allows O(1) amortized append, dequeue, and max / min.

You may have heard of the classic interview question to implement a stack with O(1) push, pop, and max / min. If you know how to solve that and also know how to make a queue from 2 stacks with amortized O(1) stack operations per queue operation, you have a queue that supports O(1) amortized append, dequeue, and max / min (each of the 2 stacks will have a max / min, and the overall max / min is just the larger / smaller of the two). I find that to be the most straightforward solution.

The PDF goes with a different approach to the same problem of implementing a queue with O(1) min / max. They have 1 queue each for min and max (the min is treated analogous to the max case). Whenever a new value comes in the max queue, they remove all values less than the value, since those can never again be the max. For that reason the queue's values are in ascending order (you can prove that if you start with an ascending sequence, remove everything less than a given number, and add the given number to the front, you still have an ascending sequence). Since the queue is in ascending order, the max at any given time is the last element, and since the elements are always inserted in the front, the oldest elements are in the back, so you always dequeue from the back.

- eugene.yarovoi April 14, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Good approach. Small nitpick: technically you could have 5 distinct values in the tree.

If you know you'll only have 5 values you might even use an array and not bother with a tree.

The PDF document the OP linked to asks for maximum and minimum in the slice <= K (rather than <= 2). Then you can no longer assume the tree will be small. Can you solve this problem in O(n) regardless?

- eugene.yarovoi April 12, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

If you glued the lists, you'd have to do some kind of mergesort that detects existing runs and exploits them (e.g. Timsort). I assume the number of elements E is >> N and that therefore there is some difference between O(E log N) and O(E log E), the complexity of naive mergesort.

- eugene.yarovoi April 07, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

Neither the original code nor the proposed edit will work. The issue is that your object is of type A, and is not of type B. You therefore cannot assign it to a variable that holds an object of type B.

This would work in the other direction: A a = new B() works just fine and doesn't even need any casting. That's because, since B is a subclass of A, any object of type B is also of type A. Since the new instance of B is also of type A, we can assign it to a variable that holds an object of type A.

You might have an easier time understanding this if you replace A and B with Car and Chevy. If you make a new Car, it might not be a Chevy, so putting it in a place reserved for only Chevys is not always correct and won't be allowed. But if you make a new Chevy, it's always a Car, so putting it into a place for Cars is perfectly fine.

At no point has casting entered this discussion yet, so you might be wondering what that's used for. Well, it can't be used to transform any Car into a Chevy. You can't just do, as you did with A and B above, Chevy c = (Chevy) new Car();. Java isn't some kind of magical wizard that can somehow transform a Car into a Chevy if it isn't one already. A Chevy might have additional data associated with it -- where would that data come from when making the conversion? In this case, the operation would result in a ClassCastException, letting you know that the conversion failed because the object isn't of the type you're trying to cast it to.

A cast would only work in a situation where you know a priori that something is of a certain type, but yet the object is currently being stored in a variable whose type is some supertype. For example, let's say we wrote Car c = new Chevy(). We know c is a Chevy, even though we're holding it in a variable for Cars. We could now do Chevy ch = (Chevy) c. When we do this, Java will verify that the conversion we're attempting is valid (that c really is a Chevy and not any other kind of Car) before making the assignment, and a ClassCastException will be thrown if c is not a Chevy as expected.

- eugene.yarovoi April 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You can merge the lists in pairs. First merge lists 1 and 2, then 3 and 4, and so on. After you're done, you have half as many lists. Make another pass to merge those lists in pairs again and you'll have a quarter of the number of lists you started with. Do this again and again until you only have one list.

Over the course of the entire algorithm, the number of comparisons is O(log N * totalInputElements), since it's one comparison per element per pass. One pass halves the number of lists and therefore there will be O(log N) passes. This number of comparisons is the asymptotic minimum for merging N sorted lists. Only O(1) space is used (additional space would be used if you were doing this with arrays, but it's not needed for linked lists).

- eugene.yarovoi April 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@Bhargav: It doesn't ripple back to 0 every time. It uses cached results obtained previously to avoid that. That's what this snippet does:

else if(b[a[i]]!=-1)
          b[i]=b[a[i]]+1;

- eugene.yarovoi April 05, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 3 vote

Pick an index at random and check if the element there is non-null. Keep re-picking indexes at random until you get a non-null element. This approach makes no assumptions about the distribution of the data and has an expected running time of O(N/K) if there are K non-null elements in the array, which is the best you can hope for if you have no information about how elements are distributed.

The algorithm as stated above is unbounded in terms of worst-case (as opposed to expected) time, but that can be remedied by adding a condition that if we fail to find a non-null element after n steps, we will do a linear scan of the array to find all the non-null values, and pick randomly from that list. This change caps the algorithm at O(n) time. Since the linear-time scan is only done when O(n) time has already been used anyway, the bounded algorithm is slower than the unbounded one by at most a constant factor for any input. It is therefore O(N/K) expected time and O(N) worst-case time.

- eugene.yarovoi March 16, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You can't just run the algorithm on the undirected graph, at least not in the typical sense of finding an Euler path on an undirected graph (each edge must be traveled exactly once, regardless of direction). There is no constraint in this problem that the undirected graph must admit an Euler path.

However, since the question asks that we should travel each edge once *in each direction*, we know we can always make a tour because the in-degree equals the out-degree for every vertex in the equivalent directed graph.

It looks like I didn't see the requirement to find all paths when reading the problem, so I only found one.

Now that I think about it, there was no need to deal with Euler tours at all here. We could have simply maintained a linked list representing the path discovered so far, and for every edge (v1, v2) not yet included in the path such that v1 appears in the path already, make a new path v1 -> v2 -> v1 and merge it with the existing path at v1.

- eugene.yarovoi March 07, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

There is exactly 1 way to climb 0 stairs. If anything is confusing here, it might be our informal notion, when using human language, of what constitutes a "way" to do something. We can make this notion precise: the number of ways to climb N steps with step sizes of 1, 2, and 3 is the number of distinct sequences consisting of 1, 2, and 3 whose elements sum up to exactly N. There is precisely 1 such sequence when N=0: the empty sequence.

That said, this recursive implementation has exponential complexity because it does not memorize answers to previously-solved subproblems. It can be augmented with dynamic programming to give a linear-time algorithm (though the simplest solution is still to solve it iteratively).

- eugene.yarovoi March 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

I would add to that dependent resources. For example, maybe the application is supposed to connect to some web service, URL, or database that's not available today for some reason, and the application does not correctly handle the exception that results.

Maybe the user deleted some data or configuration file accidentally.

- eugene.yarovoi February 27, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@Park: walls would be taken into account when building the graph. Normally vertices that represent adjacent areas will have an edge between them, but not if there's a wall in between them.

- eugene.yarovoi February 10, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 5 vote

@GZ: Let's review what it means to run out of memory. It means that we need additional space to store some data we need to store, but we just don't have that extra memory. We're out of physical resources.

Your question is like asking "I have a warehouse and I need to put another box in it, but the warehouse is completely full already. How can I handle this case?" You're physically out of space. Your options when you run out of RAM are analogous to the options you have here:

1. Get a bigger warehouse (buy new RAM) or annex a new section to your warehouse (add a stick of RAM). Then you can store your box (data) and possibly many more boxes.

2. Compress the boxes (data) already in the warehouse (memory), if they are compressible. Just like physical goods, not all data is very compressible. Doing this means you might spend extra time compressing and decompressing boxes (data) when putting them in and when retrieving them. Depending on how you compress, it might also make your warehouse less organized, making it harder and/or slower to locate items.

3. Discard some of the old items. This only works if you have something that can be thrown away. If there's a requirement to keep all the old stuff around, there's nothing you can do here.

4. Put whatever you can't fit into your warehouse (RAM) into secondary storage (hard disk). The secondary location may be far away, so placing goods there might take much longer than it would to place them in your warehouse. Retrieving items from secondary storage might also take longer for the same reason. Disk is very slow compared to RAM.

- eugene.yarovoi February 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@riddimon: Heapsort would be an example of an in-place sort, but it's not very appropriate for files. The trick with manipulating files is that you want to read and write to them in a contiguous way because disk seeks are slow. Heapsort accesses data in a highly non-contiguous way as you move up and down the heap, so it's not a great choice.

Quicksort might be a good fit in this case. The main challenge of quicksort is the in-place partitioning of an array with respect to a pivot, which is something that can be done without much random access of the file. You write elements starting from the front and elements starting from the back in a contiguous way (buffer up values in memory until you have enough of them to write). Quicksort isn't really 100% in-place, requiring something like O(log N) memory, but that's a very small amount even for large N.

Quicksort does have shortcomings like degenerate cases that potentially take O(N^2) time. If you don't want to use random pivots to avoid degenerate cases, it's possible to choose the median as the pivot. There are linear-time median-selection algorithms like "median of medians", which can also be done with only logarithmic extra memory.

- eugene.yarovoi February 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

A reasonable approach would be to model the workplace as a graph, using the walls and the geography of the office to determine which cubicles, spaces, areas, etc. are adjacent to each other and creating edges between them in the graph. Once we've built the graph, we can do either breadth-first search or Dijkstra's algorithm, depending on whether it's reasonable to treat all distances between adjacent areas as equal (say in the case that we have everything on a uniform grid) or not. We'd want to start the search simultaneously from all coffee rooms; that way, as we reach each cubicle during the search, we will have found the shortest distance to *any one of* the coffee rooms. Once we've calculated this distance for all cubicles and excluded any already-occupied cubicles from consideration, we can place the free cubicles into a min-heap ordered by coffee room distance. We can now remove-min from the heap when assigning a new cubicle.

If a new empty cubicle is added to the office, hopefully it's placed in some previously existing space whose distance to coffee we calculated during the preprocessing, and we can just add the cubicle to the heap. Similarly, when an engineer leaves the company and their cubicle becomes available, we can also add it to the heap.

If we don't care about supporting the scenario of adding cubicles, we can use a sorted array instead of a min-heap. We can make a stack of the cubicles, with the ones closest to coffee on top. Getting the next cubicle is just an O(1) pop now.

- eugene.yarovoi February 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Implicit in your question is the premise that a hash table somehow handles running out of memory. A trie handles this the same way a hash table does -- it doesn't.

- eugene.yarovoi February 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Who said anything about a dynamically updated hash map? You'd insert tuples of the form <value, number of array it comes from> into the heap, which is sorted only by value. Then when you retrieve the tuple with the smallest value, you'd know which array it came from.

The worst-case complexity for heap insertion is O(log K) where K is the size of the heap. It is not O(1). I sure hope I can't turn to the wiki for more details, because that would mean I need to go edit it for correctness!

EDIT: I suppose that there are types of heaps other than binary heaps, and that those may have O(1) insert. But my comment was on binary heaps specifically, so I assume we're talking about binary heaps.

- eugene.yarovoi February 06, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

This problem is called 3-SUM and there is no currently known way to solve it faster than O(n^2). This is actually an important problem because a number of problems that are not known to admit sub-quadratic time solutions can be reduced to this one in sub-quadratic time, which means we'd have asymptotically faster algorithms for all of them if we could solve this one in less than O(n^2). You would likely win some kind of prize if you managed to give such a solution.

FFT can solve the problem in O(N + R log R) where R is the range of the numbers and N is the number of elements, but since R can be very large in comparison to N, this solution is not general.

- eugene.yarovoi February 05, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Go through the linked list and build a hash table mapping each node to the node that follows it. Then go through the array and for each position, look at the next position and see whether the node there matches what the hash table says the next node should be. Increment a counter each time there's not a match (counter starts with 1 since we always say there's at least 1 cluster).

- eugene.yarovoi February 04, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

This doesn't work because we don't need to modify every edge that initially has no path to node 1. It's possible to imagine a situation, for instance, where all nodes except node 1 form one giant cycle. If you now modify just any one node in the cycle to point to node 1, you will have a path for every node. The answer in that case would be 1 rather than N - 1.

- eugene.yarovoi January 31, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Consider the case where the graph looks like one big tree rooted at node 1, with an edge from each child to its parent. The answer is 0 because node 1 can be reached from all other nodes and there is therefore no need to change any of the edges. However, the entire tree is clearly not just one strongly connected component. You can see from this example that detecting SCCs doesn't give you the answer.

- eugene.yarovoi January 31, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I assume it's not a binary search tree. BSTs usually don't permit multiple nodes with the same value, so the algorithm would be trivial for a BST: search for the root of the small tree in the large tree, call the node found X, and check whether the subtree at X is identical to the small tree.

- eugene.yarovoi January 24, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

A path that visits every edge in a graph exactly once is called an Euler tour (see Wikipedia).

You might say your graph is undirected, but since you want each edge to be traversed once *in each direction*, direction becomes relevant and you can model the graph as a directed graph. Convert your undirected graph to a directed graph by replacing each undirected edge with two directed ones, one in each direction. We will now ask: can we produce a path that traverses each edge of this directed graph exactly once?

For there to exist a path that visits every edge, the original undirected graph must be a connected graph (it is clearly impossible to move between components that are not connected). In addition, the directed graph has an important property: every vertex in the directed graph has exactly as many inbound edges as outbound edges because the directed edges were placed in pairs (that is, A -> B always accompanied by B -> A).

Given these properties, here is how you can find a path. By this algorithm, you can also see that a path always exists:

1. Pick an arbitrary node in the graph.

2. From that node, take any previously untraveled outbound edge. You will now be at a new vertex.

3. Repeat step 2 until you return to the vertex you started with in step 1, and record the path you travel along the way. You are guaranteed to eventually reach the vertex you started with -- you will never run out of untraveled edges somewhere in the middle and get stuck -- because every intermediate vertex has an outbound edge for every inbound edge, so if you can get to that vertex, you can leave.
Now, consider the possibilities.
a. You've visited every vertex, and visited every edge. In that case, you're done.
b. You've visited every vertex, but not every edge. In that case, there are vertices in your path that have untraveled outbound edges.
c. You haven't visited every vertex. In that case, there are vertices in your path that have untraveled outbound edges. Otherwise, your graph wouldn't be connected.
So, either you're done, or there are vertices in your path that have untraveled outbound edges.

4. While there are untraveled outbound edges, pick an arbitrary vertex in your path that has such an edge. Call that vertex v0.
If v1 is your initial node picked in step 1, your current path looks like v1 -> ... -> v0 -> ... -> v1.
Repeat steps 2-3 starting from v0. The path generated by this operation will look like v0 -> ... -> v0. Insert this path into the previous path of v1 -> ... -> v0 -> ... -> v1, at the position of v0. Now your path looks like v1 -> ... -> v0 -> ... -> v0 ... -> v1.
Keep doing this and adding to the path until there are no untraveled outbound edges anywhere in your path.

The final resulting path visits every edge exactly once.

A fairly sloppy implementation of this algorithm might take O(n^2), but if implemented carefully, it is possible to achieve O(n).

- eugene.yarovoi January 24, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

Really all you're asking for is to find an index k such that arr[0] + ... + arr[k] = 1/2 * sum of entire array. You could sum up the list, set your target value as 1/2 of the sum, and then start summing elements from the beginning until the total reaches the target value.

using System;
using System.Linq;

public static int GetBreakingPoint (int[] arr)
{
    int totalSum = arr.Sum(x => x);
    //no breaking point if numbers can't be divided by 2
    if (totalSum % 2 == 1)
    {
        return -1;
     }	
    int totalSoFar = 0
    for (int i=0; i<arr.Length; i++)
    {
         totalSoFar += arr[i];
         if (totalSoFar == totalSum/2)
         {
             return i;
         }
     }
     return -1;
}

You can see this working at ideone dot com / Y9S9vO

- eugene.yarovoi January 24, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

We can do an in-order traversal while maintaining the current depth so that we know to which list to add the current node. If we've never seen the current depth before, add a new list.

private void getNodesByDepthHelper (TreeNode root, int currentDepth, ArrayList<LinkedList<TreeNode>> depthLists)
{
    if (root == null) { return; }

    if (depthLists.size() == currentDepth)
    {
        depthLists.add (new LinkedList<TreeNode> ());
    }
    getNodesByDepthHelper (root.left, currentDepth+1, depthLists);
    depthBuckets.get(currentDepth).add(root);
    getNodesByDepthHelper (root.right, currentDepth+1, depthLists);
}

public ArrayList<LinkedList<TreeNode>> getNodesByDepth(TreeNode root)
{
    ArrayList<LinkedList<TreeNode>> depthLists = new ArrayList<LinkedList<TreeNode>> ();
    getNodesByDepthHelper (root, 0, depthLists);
    return depthLists;
}

- eugene.yarovoi January 24, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I think the OP meant the sentence as (If counter goes negative) or (counter is non zero at the end of the file), braces are not balanced

and not

If ((counter goes negative) or (counter is non zero)) at the end of the file, braces are not balanced

Since the language is ambiguous, it probably makes sense to give the benefit of the doubt and assume the interpretation that would make the solution correct.

- eugene.yarovoi January 24, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 6 vote

The OP's approach for the initial problem was correct. Namely, something like

int counter = 0;
for (int i=0; i<string.Length; i++)
{
    char c = string[i];
    if (c == '{')
    {
        counter++;
    }
    else if (c == '}')
    {
        counter--;
        if (counter < 0)
        {
            return false;
        }
    }
}
return (counter == 0);

This is correct because we are counting how many unclosed { we have at any moment. When we see a new { we add 1, and when we see a }, we immediately pair it with the last unclosed {, therefore using up one of them and lowering the count by 1. If there are no unclosed { when we see a }, the } does not have a matching { and the counter becomes negative, at which point we return false. At the end, if there are any unclosed { remaining, those { do not have a matching } and we must return false. If the unclosed { count is 0, on the other hand, we return true.
--------------------------

The first instinct to solving the parallel problem might be to split the work up into equal-sized chunks for each processor. If there are N characters in the file and K processors, give each processor a chunk of N/K characters. When processing a chunk, we can no longer simply return false when the count drops below 0, because there may be more opening braces in an earlier chunk processed by a different processor. We also can't return false if the counter is not 0 at the end, because it may be that there are more closing braces in later chunks. What we should calculate for each chunk, instead, is:

1. If we were doing a global brace counter as in the single-threaded approach, by how much would this chunk change the counter? This can be a positive or negative number. Call this the "brace delta". This number is equal to the number of opening braces minus the number of closing braces in the chunk.

2. While we are calculating the "brace delta", what is the lowest point to which the brace delta drops anytime during the processing of the chunk? Call that the "brace delta minimum". This number is important because this is the point in the chunk where the global counter would be the smallest. We can check whether the global counter would have ever been negative by seeing whether (global counter before chunk) + (brace delta minimum in chunk) is < 0 for any chunk.

These two things would be calculated by:

ChunkResult BraceDeltaAlgorithm (int chunkStartPos, int chunkEndPos, String input)
{
    int braceDelta = 0;
    int minBraceDelta = braceDelta;
    for (int i=chunkStartPos; i<chunkEndPos; i++)
    {
        char c = input[i];
        if (c == '{') 
        {
            braceDelta++;
        }
        else if (c == '}')  
        {
            braceDelta--;
            if (braceDelta < minBraceDelta)
            {
                minBraceDelta = braceDelta;
            }
        }
    }
    return new ChunkResult (braceDelta, minBraceDelta);
}

The complexity of the above code is O(N/K) for each of K processors. However, we can consider it O(N/K) time because all the processors can work in parallel.

After we process each chunk in parallel, we combine the results in a single-threaded way. We use the brace deltas to calculate the value of the global counter at the beginning of every chunk, and we use the brace delta minima of the chunks to verify that the global counter never drops below 0. The global counter value should be 0 at the end of all the chunks as before.

int globalCounter = 0;
for (int i=0; i<K; i++)
{
    ChunkResult result = childThreads[i].getResult();
    if ( globalCounter + result.minBraceDelta < 0)
    {
        return false;
    }
    globalCounter += result.braceDelta;
}
return (globalCounter == 0);

The non-parallel part takes O(K) time, making the overall time for the solution O(N/K + K).
---------

The above approach is a very good practical solution, because in most cases the number of processors K will be fairly small, and the "+ K" portion of the time complexity would be negligible. Let's assume however that we have unlimited processors now.

We can see that K = order of sqrt(n) is the best choice for K if we want to minimize the
value of N/K + K. So even with unlimited processors, the time taken by the above approach is O(n / sqrt(n) + sqrt(n)) = O(sqrt(n)).

We can do better by giving a divide and conquer approach:

1. The current processor splits the input into 2 chunks, and hands each of 2 processors one chunk each. Pointers would be passed instead of copying data, so this is O(1).
2. Each of these 2 processors recursively solve this problem, again delegating to yet other processors. If the problem size has been reduced to a small constant, the processor uses the BraceDeltaAlgorithm from before as a base case.
3. The solutions obtained by the 2 processors in step 2 are combined into a single ChunkResult, and that result is returned.

ChunkResult solve (int start, int end, String input)
{
    if (end - start <= SOME_SMALL_CONSTANT)
    {
        return BraceDeltaAlgorithm (start, end, input);
    }
    int mid = start + (end-start)/2
    start new thread t1 to run "solve (start, mid, input)"
    start new thread t2 to run "solve (mid, end, input)"
    ChunkResult r1 = WaitForThreadToFinishAndGetResult(t1); 
    ChunkResult r2 = WaitForThreadToFinishAndGetResult(t2); 
    return new ChunkResult (r1.braceDelta + r2.braceDelta, Math.min(r1.minBraceDelta, r1.braceDelta + r2.minBraceDelta))
}

public bool hasCorrectBraces (String input)
{
    ChunkResult result = solve (0, input.Length, input);
    return (result.braceDelta == 0 && result.minBraceDelta >= 0);
}

Everything here is O(1) except the time spent waiting for the recursive solve() invocations. Since both calls to solve are processed in parallel, we only bill for one of them. So our recurrence relation is T(n) = T(n/2) + O(1) => T(n) = O(log n). This improves over the previous bound of O(sqrt(n)).

- eugene.yarovoi January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The problem asks for an Eulerian path, not a Hamiltonian path (each edge needs to be used once, not each vertex). The problem of finding an Eulerian path is in fact solvable in linear time.

- eugene.yarovoi January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The OP's method is correct -- see my comment under gns's answer for an explanation. The OP's method would be able to detect the first string is invalid because, to quote the OP, "If counter goes negative or counter is non zero at the end of the file, braces are not balanced". The counter goes negative in your first example. It first goes up to 1, then down to 0, then to -1. At that point it is negative and the algorithm would return false.

The main challenge of this problem lies in figuring out a way to parallelize this efficiently.

- eugene.yarovoi January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Simply counting braces as the OP described does work for the non-parallel case. Each } closes off the most recent {, so all you need to know is whether you have any { to close off whenever you hit a }, and whether all { have been closed when the string ends. Think about it this way -- if you had a stack, every element of the stack would always be {, simply because as soon as you get } you immediately pair it with one of the { on the stack, so no } or any other symbols ever go on the stack. Since your stack will never contain anything other than {, you can just maintain a count of how many of them you have rather than maintaining the stack in explicit form. From these realizations we can prove the OP's approach correct.

- eugene.yarovoi January 23, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Wouldn't the tree, as you've described it, be exponential in size with the size of the input? I might misunderstand your approach, but isn't this tree approach doing the same thing as regular backtracking, with the exception that you first enumerate all the possibilities, store them, and then evaluate them (whereas backtracking evaluates possibilities as they are enumerated)?

- eugene.yarovoi January 16, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

In some other languages, you can modify strings, so the problem statement is a little more restrictive than that. Think of it as having an array that contains all the characters of the string. You need to perform the specified operation on the array without taking any extra arrays or other extra storage beyond a constant amount (for example a few primitive variables).

- eugene.yarovoi December 11, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I used (2*26 + 10)^10 = ~10^18.

- eugene.yarovoi December 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

The answer is correct. s1 = s2 would make the pointer s1 point to the same thing s2 points to. This is different from copying the object (invoking the copy constructor), which is achieved by *s1 = *s2. Often people are confused about this topic if they don't understand the difference between objects themselves and pointers to objects -- there is a very big difference, though! I recommend reading a tutorial on pointers if you don't understand this distinction.

- eugene.yarovoi December 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You'd need a lot of space to eliminate hash table collisions entirely. To be completely guaranteed to not have any, you'd need a bucket for every possible value an element can take on, which is potentially a huge number. For example, you'd need 2^32 buckets for 32-bit integers, 2^64 buckets for 64-bit integers, and ~10^18 buckets for alphanumeric strings of length 10.

Even if your keys are hashed fairly randomly by the hash function and you're willing to settle for only a high probability of no collisions, by the Birthday Paradox, you need at least k*n^2 buckets, for some constant k depending on the desired probability, where n is the number of keys.

- eugene.yarovoi December 05, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I'm guessing that by -(107) <= x <= (107), you meant -(10^7) <= x <= (10^7), because otherwise the examples don't make sense. One very easy way to solve would be to just sort the array, and then look only at pairs of adjacent numbers. This has O(n log n) time complexity if implemented using an O(n log n) sort.

- eugene.yarovoi December 01, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

LinkedHashSet does not support querying for the n-th element. It only supports lookup by value and traversal in the order of insertion.

- eugene.yarovoi November 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Linked lists do not support random access, so the answer is no. You could have some data structure augmenting the linked list, but then it wouldn't be just a linked list anymore.

- eugene.yarovoi November 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I'm fairly confident it's not really him, but rather someone impersonating him. I've spoken to Varun before and I doubt he'd make such comments.

- eugene.yarovoi November 18, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

You can write the code yourself if you know the approach. Developing the approach is the more interesting and more difficult aspect. Keep in mind also that my solution was the first posted on this page to reach O(n) time -- at the time I posted, every other solution involved a sort, and I came up with my solution by thinking about how I could avoid sorting. Turns out there's an easier way to avoid sorting as seen in Jason's answer.

- eugene.yarovoi November 18, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 4 vote

You can solve this in O(n) time by finding the median of the array in O(n) using quickselect or the median-of-medians algorithm, and then placing numbers less than the median (in any order) in even positions (for a 0-indexed array), and numbers greater than the median (in any order) in odd positions. For elements equal to the median, distribute them between both even and odd slots that remain after placing the other elements. The correctness of this approach follows from the observations that every constraint on the final output is of the form a[some_odd_index] >= a[some_even_index], and that this construction ensures all odd indexes are >= the median and all even indexes are <= the median, therefore guaranteeing that all odd indexes are >= all even indexes.

- eugene.yarovoi November 17, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

They probably meant it's log(number), which is true. It's logarithmic in the value of the number, which is the same as saying it's linear in the number of digits of the number.

- eugene.yarovoi November 08, 2013 | Flag Reply

Page:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

CareerCup

eugene.yarovoi

Books

Videos

Resume Review

Mock Interviews