Amazon Interview Question for Developer Program Engineers


Country: India
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
1
of 1 vote

1) Create HashMap <String, Index> which maps a word to its latest position in the text.
2) Create priority Queue which contains indexes of the words which have been already found.
(Priority queue stores sorted elements so first element in the queue is the beginning of the segment.
3) Every time a matching word is found on the new position update queue:
- remove previous position and add new one
4) If all words have been found check if segment is smaller then the minimum.
Java:

public static void findSegment(String[] text, String[] dict) {
        Map<String, Integer> wordToIndexMap = new HashMap<String, Integer>();
        Queue<Integer> positions = new PriorityQueue<Integer>();
        int missingWords = dict.length;
        int minDist = Integer.MAX_VALUE;
        int startIndex = -1;
        int endIndex = -1;
        for (int i = 0; i < dict.length; i++) {
            wordToIndexMap.put(dict[i], null);
        }
        for (int i = 0; i < text.length; i++) {
            if (!wordToIndexMap.containsKey(text[i])) {
                continue;
            }
            if (missingWords > 0 && wordToIndexMap.get(text[i]) == null) {
                missingWords--;
            }
            if (wordToIndexMap.get(text[i]) != null) {
                positions.remove(wordToIndexMap.get(text[i]));
            }
            wordToIndexMap.put(text[i], i);
            positions.add(i);
            if (missingWords == 0 && (i - positions.peek() + 1 < minDist)) {
                minDist = i - positions.peek() + 1;
                startIndex = positions.peek();
                endIndex = i;
            }
        }
        if (missingWords == 0) {
            System.out.println("Segment starts: " + startIndex + ", ends: " + endIndex);
        } else {
            System.out.println("Segment not found");
        }
    }

- hulkthedestroyer April 02, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

"First" and "Shortest" are two different things.

Let N = length of paragraph and W = length of word list
The following is an O(N) time and O(W) space solution

The idea is that you keep two pointers to the beginning and end of the window. Also, keep a list of the number of occurrences of each of the given words in the window.

A) Increase the end-pointer and each time, if the newly seen word belongs to our list, add its counter
B) If you have seen all the required pairs, report the window (add it to "pairs")
C) Increase the begin pointer under either of the following conditions:
C-1) The word begin-pointer refers to does not belong to the list
C-2) The word belongs to the list but it has occurred more than once in the window (makes the window shortest possible)

Continue until you reach last.

Complexity analysis: O(W) space to remember the count and O(N) time for the while loop.
To see the O(N) time note that each word in paragraph is visited "at most" twice, once by end-pointer and once by begin-pointer.

words = ['list', 'of', 'words'];
string = 'this part of text that includes words and lists but not list of words and if you need a list which contains many more words you should try a longer list of terms or list of words'
para = string.split(' ');

count_words = dict()
for w in words:
    count_words[w] = 0

pos_begin = 0
pos_end = 0
not_seen = len(words)

pairs = []
while pos_end < len(para):
    if para[pos_end] in words:
        if count_words[para[pos_end]] == 0:
            not_seen -= 1
        count_words[para[pos_end]] += 1
    
    
    while pos_begin < pos_end:
        if para[pos_begin] in words:
            if count_words[para[pos_begin]] == 1:
                break
            else:
                count_words[para[pos_begin]] -= 1
        pos_begin += 1
    
    # We might have found a shorted window
    if not_seen == 0:
        pairs.append([pos_begin, pos_end])
    
    pos_end += 1

print pairs

- Ehsan April 01, 2014 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

this seems it will return first always i need smallest subpara in the whole. hope i am able to clarify.

- sameer.careercup April 01, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

It returns all pairs. Add this to get the min:

min_val = len(para)
 	window = [0, min_val - 1];
	for k in range(0, len(pairs)):
        	if pairs[k][1] - pairs[k][0] < min_val:
			min_val = pairs[k][1] - pairs[k][0]
			window = pairs[k]

- Ehsan April 01, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

How is this solution O(W)? when you split the string, you will get an array of all the words in the paragraph. This is (n- (spaces)). which is O(N). right?

- Anonymous April 01, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

I should say O(W) extra memory. The paragraph is the input anyway. I mean, I assumed it is the input to the function.
In case it is being streamed in, we need a buffer and O(N) space.

- Ehsan April 02, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Question about this statement you are making for a condition to advance the begin pointer :

"The word belongs to the list but it has occurred more than once in the window (makes the window shortest possible) ".

That logic seems flawed in that you might inadvertently skip over another "found" word in the list. So in a sentence like

"Is this the first time I have seen the word first or is it the second"

where the target words are "first | seen | second", I think your logic would improperly advance the begin pointer when you run into the second occurrence of the word "first"... skipping right past "seen" and returning an false shortest window.

- johny418 April 02, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 votes

Thanks for the comment.
You only skip over a word if
A) it does not belong to the list
B) it belongs to the list but it has been recorded more than once.

In your example if "first" has been recorded multiple times, then we skip over it. But when we reach "seen", we will check if another word "seen" has been spotted before (by the end-pointer at some previous iteration). If so, the count has to be > 1 and we can skip over it. Otherwise, the loop breaks and "begin-pointer" will not increase.
Note that I only count a word when the end-pointer reaches one.

- Ehsan April 02, 2014 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

leetcode.com/2010/11/finding-minimum-window-in-s-which.html...modify the solution for words instead of chars

- Nitin April 02, 2014 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More