CareerCup

Linkedin Interview Question for Software Engineer / Developers

0

of 0 votes

24
Answers
Given a large document and a short pattern consisting of a few words (eg. W1 W2 W3), find the shortest string that has all the words in any order (for eg. W2 foo bar dog W1 cat W3 -- is a valid pattern)
- shekhar2010us December 18, 2011 in United States for Data Scientist | Report Duplicate | Flag | PURGE
Linkedin Software Engineer / Developer Algorithm

Email me when people comment.

Email me when people comment.

Loading...

An error occurred in subscribing you.

Team: Data Scientist
Country: United States
Interview Type: Phone Interview

Email me when people comment.

Email me when people comment.

Loading...

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

5

of 7 vote

This can be done by modifying the maximum sum subsequence.
Loop through once keeping track of
Most recent position of w1 w2 and w3, minlengthsofar and min.

- kill January 08, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

I think this solution is correct, pseudo code here:

min_len = infinity;
most_recent_pos[n] = -1;
foreach word in document 
   int sequenceStart = min (the most recent position of all words in pattern except the current word)
int curr_len = pos of curr word - sequenceStart
if curr_len < min_len min_len = curr_len
most_recent_pos[current word] = position of current word

- Sandra Dai January 19, 2012 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

Could you please explain this code in detail??

- Anonymous April 20, 2012 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

Explained in details here:
stackoverflow.com/a/16369063

- PT November 11, 2013 | Flag

Comment hidden because of low score. Click to expand.

1

of 1 vote

string GetMinimumString ( string original, string word1, string word2, string word3 )
{
    const char SpaceChar = ' ';
    
    if(original == null)
        return null;
    
    List<string> words = original.Split(SpaceChar);
    
    Hashtable<string, int> wordList = new Hashtable<string, int>();
    wordList.Add(word1, -1);
    wordList.Add(word2, -1);
    wordList.Add(word3, -1);
    
    int minDistance = Integer.MaxValue;
    int minIndex = 0, maxIndex = 0;
    
    for(int i=0; i< words.Length; i++)
    {
        string w = words[i];
        if(w == word1 || w == word2 || w ==word3)
        {
            wordList[w] = i;
            if ( (w == word1 & (wordList[word2] == -1 || wordList[word3] == -1))
            ||   (w == word2 & (wordList[word1] == -1 || wordList[word3] == -1))
            ||   (w == word3 & (wordList[word1] == -1 || wordList[word2] == -1)) )
                continue;
            
            int distance = GetMax(wordList) - GetMin(wordList);
            if ( distance < minDistance )
            {
                minDistance = distance;
                minIndex = GetMin(wordList);
                maxIndex = i;
            }
            
            if(minDistance == 2)
                break;
        }
    }
    if(maxIndex == 0)
        return null;
    
    string s = "";
    
    for(int i = minIndex; i<= maxIndex; i++)
        s += words[i] + SpaceChar;
    
    return s;
}
    
    int GetMax(Hashtable<string, int> tab)
    {
        int max = 0;
        foreach(Key k in tab.Keys())
            if( tab[k] > max )
                max = tab[k];
        
        return max;
    }
    
    int GetMin(Hashtable<string, int> tab)
    {
        int min = 0;
        foreach(Key k in tab.Keys())
            if( tab[k] < min )
                min = tab[k];
        
        return min;
    }

- PrateekS. July 27, 2014 | Flag

Comment hidden because of low score. Click to expand.

2

of 2 vote

We need a Trie and a HashTable, and do one time scan using sliding window (head + tail pointers) technique

1) Advance tail pointer thru the document and match any word with the help of Trie. Add a count of 1 to HashTable keyed by word.
2) When we matched all n words (HashTable.Size == n), we know document from head to tail contains all words, but not necessarily the shortest.
3) Now advance head pointer and decrease the value in HashTable when we encounter a word. Stop when the value in HashTable is 0. Now we have a shortest match between head and tail at current (tail) location.
4) Advance tail pointer again and stop when any word is matched. Try to advance tail pointer the same way as far as we can. Update the global tracker if new match is shorter.
5) Repeat until tail hit the end of document.

There're optimizations we can do when we try to advance the head (because tail already visited these words).

- Jerry October 24, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

Nice and neat.
Time complexity is O(n^2)

- goalchaser November 02, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 2 vote

We can solve this problem in only one iteration, O(n) Complexity

let there be a variable int result= -1

1) Create a hash table with keys (W1,W2...Wm) and initialize their value by -1
2) Start checking the document word by word if the word exist in hash map then check its value, if its value is -1 then put the index value , else check the hash value with index if index is greater than hash value then put index value in hash. then check if all the hash box have +ve values(means all words are present) then look for max and min of the hasp values if its diff is less than result or result == -1 then store it in result.

- Wayne December 19, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

looking for max and min in the hash values will add overhead O(n * pattern.length)
i'm thinking how to keep track of the two index

- Danielle February 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

looking for max and min in the hash values will add overhead O(n * pattern.length)
i'm thinking how to keep track of the two index

- Danielle February 27, 2012 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 votes

yes this will work... we can use TreeMap.. and get last and first entry for min and max

- coder February 06, 2014 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 vote

I think that the hash-table based solutions (suppose that they are not very trick) could work only if W1, W2, .. Wn are all different. If they are not something else should be considered or the hash items should contain a list of positions and should keep as many of these positions as many time Wk is repeated.

- Selmeczy, Péter December 19, 2011 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

I made this attempt to solving this problem using regex, and testing on big.txt (from here norvig.com big.txt).

This runs just under 3sec for 1mil words. My main concern: Is this fast enough?

run with:

java test.ShortestString big.txt

package test;


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileNotFoundException;
import java.lang.NullPointerException;
import java.lang.Exception;

import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.util.Scanner;

import java.util.List;
import java.util.ArrayList;



class ShortestString {
  
  public static void main (String args[]) throws FileNotFoundException {
    
    // You can also find out about how to make
    String w1 = "find";
    String w2 = "how";
    String w3 = "make";

    String minM = null; // min matched string 
    int minL = Integer.MAX_VALUE; // min matched string length

    ArrayList<String> patterns = new ArrayList<String>();
    patterns.add("(" + w1 + ".*?\\s.*?" + w2 + ".*?\\s.*?" + w3 + ")");
    patterns.add("(" + w1 + ".*?\\s.*?" + w3 + ".*?\\s.*?" + w2 + ")");
    patterns.add("(" + w2 + ".*?\\s.*?" + w1 + ".*?\\s.*?" + w3 + ")");
    patterns.add("(" + w2 + ".*?\\s.*?" + w3 + ".*?\\s.*?" + w1 + ")");
    patterns.add("(" + w3 + ".*?\\s.*?" + w1 + ".*?\\s.*?" + w2 + ")");
    patterns.add("(" + w3 + ".*?\\s.*?" + w2 + ".*?\\s.*?" + w1 + ")");
  
    // compile all patterns
    ArrayList<Pattern> patternsC = new ArrayList<Pattern>();
    for (String pattern : patterns) {
      patternsC.add(Pattern.compile(pattern));
    }
    
    BufferedReader r = new BufferedReader(new FileReader(args[0]));
    
    try {
      String line = null;
      while ((line = r.readLine()) != null) {
        for (Pattern pattern : patternsC) {
          Matcher m = pattern.matcher(line);
          while(m.find()) {
            String matched = m.group();
            int l = matched.length();
            if (matched != null) {
              if (minL > l) {
                minL = l;
                minM = matched;
              }
//              System.out.println(l+ "\n" + matched); 
            }
          }
//          System.out.println("Pattern done: " + pattern);
//          System.out.println("---------------------------------------------------");
        }
      }
    } catch(NullPointerException e) {
      e.printStackTrace();
    } catch(Exception e) {
      e.printStackTrace();
    }
    
    System.out.println("Min string: " + minM);
    System.out.println("Min len: " + minL);
  }
}

- redronin February 18, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

Please read the question properly. The question says few words, which means it could be a string array not fixed 3 words. Your above solution assumes this (which beats the original question).

- Udayakumar July 07, 2012 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 vote

Can we do this?

minLength = infinity;
minLine = ""
For each line L_i:
{
indx1=substring(L_i,w1)
indx2=substring(L_i,w2)
indx3=substring(L_i,w3)

// indx1,indx2 and indx3 are the indexes where w1, w2 and w3 appear in the line L.

Length = max(indx1,indx2,indx3) - min(indx1,indx2,indx3)
if ( Length < minLength)
minLength = Length;
minLine = i;
}

- dee February 19, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

Create a standard trie from the given text. Since we have well defined words we can get away with a trie instead of suffix tree. Find the words if they occur in the trie. If yes then we just have to sort the indexes to get our answer. For eg, W1 occurs at 1, 10 ,30. And W2 occurs at 5, 15, W3 at 35 40. Sort the indexes and pick 1, 5, 35.

- xml May 20, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 votes

except the answer would be 30, 15, 35....it wants the shortest string that has all 3 words.

- Anonymous May 23, 2012 | Flag

Comment hidden because of low score. Click to expand.

0

of 0 vote

got this question asked.

- me too October 06, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

O (n* n)

define pattern = "w1, w2, w3";
string smallestStr = null;

for word in Document
{
   if word is w1
    {
          string str = FindPatternString(Document, currentPosiiton);
          if(smallestStr is null or str.Length < smallestStr.Length)
          {
                smallestStr  = str;
          }
    }
}


string FindPatternString(doc, pos)
{
               string left = FindString(doc, 0, pos);
               string right = FindString(doc. pos, doc.length -1);

               if(left == null && right == null)
                        return null;
                else if (left == null)
                         return right;
                else if (right == null)
                          return left;
                else
                        return   left.length > right.length ? right : left;
}

string FindString(doc, from, to)
{
            HashtableFromPattern<Word, Count> table =   {<w1, 1>,  <w2, 1>, <w3, 1>};

            bool isFromSmaller = (to > from);

             while(from != to) {
                     if(table.Count ==0){break;}

                     if(table.contains(doc[from])
                     {
                               table[from]--;

                               if(table[from] ==0) 
                                      table.remove(doc[from]);
                     }

                     if(isFromSmaller )
                              from++;
                      else
                              from--;
             } 

             if(table.count ==0)
                     return doc.substring(from, to);
             else
                     return null;
}

- shrimpy September 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

public void printShortestDis(String w1, String w2, String w3, String[] doc) {
    
    int pos1 = -1;
    int pos2 = -1;
    int pos3 = -1;
    
    int minLen = Integer.MAX_VALUE;
    int minStart = -1;
    int minEnd = -1;
    
    for (int i = 0; i < doc.length; i++) {
        if (doc[i].equals(w1)) {
            pos1 = i;
            if (pos2 != -1 && pos3 != -1) {
                int tmpLen = pos1 - Math.min(pos2, pos3);
                if (tmpLen < minLen) {
                    minLen = tmpLen;
                    minStart = Math.min(pos2, pos3);
                    minEnd = pos1;
                }
            }
        }
        else if (doc[i].equals(w2)) {
            pos2 = i;
            if (pos1 != -1 && pos3 != -1) {
                int tmpLen = pos2 - Math.min(pos1, pos3);
                if (tmpLen < minLen) {
                    minLen = tmpLen;
                    minStart = Math.min(pos1, pos3);
                    minEnd = pos2;
                }
            }
        }
        else if (doc[i].equals(w3)) {
            pos3 = i;
            if (pos1 != -1 && pos2 != -1) {
                int tmpLen = pos3 - Math.min(pos1, pos2);
                if (tmpLen < minLen) {
                    minLen = tmpLen;
                    minStart = Math.min(pos1, pos2);
                    minEnd = pos2;
                }
            }
        }
    }
    
    if (minStart != -1 && minEnd != -1)
        for (int i = minStart; i <= minEnd; i++)
            System.out.print(doc[i] + " ");
}

- Anonymous October 03, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

Worst case is O(n^2).

Basic idea:

When we encounter a word in the sequence. We search to find others until we have a valid string. We repeat and keep track of the minimum string as we find more substrings containing the words.

public class ShortestSubStringExpression {

    Hashtable<String, Boolean> words = new Hashtable<String, Boolean>();


    public static void main(String [] args) {
        ShortestSubStringExpression shortestSubStringExpression = new ShortestSubStringExpression();
        shortestSubStringExpression.solution();
    }

    public void solution() {
        words.put("Hello", true);
        words.put("World", true);
        words.put("Ciao", true);

        ArrayList<String> document = new ArrayList<String>();
        document.add("Hello");
        document.add("was");
        document.add("Hello");
        document.add("maybe");
        document.add("World");
        document.add("Ciao");
        document.add("this");
        document.add("was");
        document.add("Hello");

        printShortestStringWithPattern(document);
    }


    public void printShortestStringWithPattern(ArrayList<String> document) {

       int minStringLength = -1 ; // Use -1 as "infinite"
       String minString = null ;

       for(int pos = 0 ; pos < document.size() ; pos++) {
           String currWord = document.get(pos);

           if(words.get(currWord) == null) {
              continue ;
           } else {
              minString = getMinimumStringSoFar(document, pos, minString);
              if(minString != null)
                 minStringLength = minString.length();
           }
       }

       System.out.println(String.format("Min String: %s. Length: %d" , minString, minStringLength));
    }

    private String getMinimumStringSoFar(ArrayList<String> document, int pos, String currentMinString) {

        int wordsEncountered = 0 ;

        StringBuffer newBuffer = new StringBuffer(2048);
        HashMap<String,Boolean> foundSoFar = new HashMap<String, Boolean>();

        for(int i = pos ; i < document.size() ; i++) {

            newBuffer.append(document.get(i));

            String currWord = document.get(i);

            if(words.containsKey(currWord) && !foundSoFar.containsKey(currWord)) {
                wordsEncountered ++;
                foundSoFar.put(currWord, true);                      }

            if(wordsEncountered == words.size()) {
                break ;
            }
        }

        if(wordsEncountered == words.size()){

            String newString = newBuffer.toString();

            if(currentMinString ==  null || currentMinString.length() > newString.length()) return newString;

            return currentMinString;

        } else {
            return currentMinString;
        }
    }

- Anonymous October 18, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

We have dynamic programming based solution on following link
stackoverflow.com/a/16369063

- PvT November 11, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

import sys

def getMinString( str, search_L ):
    min_dis = sys.maxint # assign max int
    L = [] # used for result output
    w_pos = {}

    for i in range(len(str)):
        if str[i] in search_L:
            w_pos[str[i]] = i

        if len(w_pos)==3:
            v_list = w_pos.values()

            if v_list[2]-v_list[0] < min_dis:
                min_dis = v_list[2]-v_list[0]
                L = v_list
                w_pos = {}

    return "".join( [str[i] for i in range(L[0],L[2]+1)] )

print getMinString( 'Hellowrold', ['l','r','d'] )

- Akkking October 28, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

It can be done in O(N)

Keep indices track int last_W1, last_W2, last_W3 - where as we go through the doc from left to right, the last seen index of W1 is set to last_W1, last seen index of W2 is set to last_W2 so on.

Anytime W3 is seen, update last_W3 (similarly for last_W2, last_W1 cases). Then check
last_W3 minus min(last_W2,last_W1) and if it is lesser than global_min, update global_min and save the three indices last_W3, last_W2, last_W1.

Once you reach the end of the doc, the last_W1, last_W2 and last_W3 should give the needed substring from the doc. We can go to those locations and print the substring.

- venkat325 November 23, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

0

of 0 vote

{{
static int ShortestStringWithWordSequences(string[] text, string[] wordSequences)
{
if (text == null || text.Count() == 0 || wordSequences == null || wordSequences.Count() == 0)
return 0;

var minLen = int.MaxValue;
var dict = new Dictionary<string, int>();
var index = 0;

for(int i=0; i<text.Length; i++)
{
if (wordSequences.Contains(text[i]))
{
if (!dict.ContainsKey(text[i]))
dict.Add(text[i], index);
else
dict[text[i]] = index;

if (wordSequences.Count() == dict.Count())
{
var startIndex = dict.Values.Min();
var len = index + text[i].Length - startIndex + 1;
if(minLen > len)
{
Console.WriteLine(len);
}

minLen = minLen > len ? len : minLen;
}
}

index += text[i].Length;
}

return minLen == int.MaxValue ? 0 : minLen;
}
}}

- SK August 24, 2016 | Flag Reply

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.