Amazon Interview Question
Software Engineer / Developers

class Node {
    int clientId;
    String url;
}
HashMap<Node, Integer> h = new HashMap<Node, Integer>();
BufferedReader br = new BufferedReader(new FileReader("textfile.txt"));
String strLine;
// Read the file line by line
while ((strLine = br.readLine()) != null) {
    // Parse the line and store the clientId and url in a Node.
    // If the (clientId, url) key — or just the url, depending on what the
    // interviewer wants — is not yet in the HashMap, add it with count 1;
    // if it is already present, get the value and increment it.
}
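A complete sketch of this approach (the log-line format "clientId url" is an assumption, since the question does not specify one). Note that Node must override equals and hashCode, otherwise two equal (clientId, url) pairs land in different HashMap entries:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class Node {
    final int clientId;
    final String url;

    Node(int clientId, String url) {
        this.clientId = clientId;
        this.url = url;
    }

    // Required so equal (clientId, url) pairs map to the same HashMap entry
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Node)) return false;
        Node n = (Node) o;
        return clientId == n.clientId && url.equals(n.url);
    }

    @Override
    public int hashCode() {
        return Objects.hash(clientId, url);
    }
}

public class VisitCounter {
    public static Map<Node, Integer> count(String file) throws IOException {
        Map<Node, Integer> counts = new HashMap<>();
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split("\\s+"); // assumed format: "clientId url"
                Node key = new Node(Integer.parseInt(parts[0]), parts[1]);
                counts.merge(key, 1, Integer::sum);  // insert with 1, or increment
            }
        }
        return counts;
    }
}
```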
We can also implement the same thing by grouping records into buckets, say one bucket/node per 1 KB (assuming a 1 KB block size).
class Bucket {
    static final int N = 64;       // entries per bucket; derivable from the fixed id size
                                   // and the 1 KB bucket size (e.g. 1024 / 16 bytes per entry)
    long[] clientId = new long[N]; // client ids in this bucket
    long[] offset = new long[N];   // offsets of the corresponding records in the original file
}
Now we can use extendible (external) hashing to place ids into buckets with the hash function key mod 2^g, where g is the global depth of the hash table/directory; g is incremented each time a bucket split forces the directory to double.
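A minimal in-memory sketch of that directory logic (the bucket capacity and names here are illustrative, not from the question): the directory has 2^g slots, a key maps to slot key mod 2^g (its low g bits), and a full bucket either splits on one more bit or first forces the directory to double:

```java
import java.util.ArrayList;
import java.util.List;

class Bucket {
    int localDepth;                 // how many low-order key bits this bucket distinguishes
    final List<Long> ids = new ArrayList<>();
    Bucket(int localDepth) { this.localDepth = localDepth; }
}

public class ExtendibleHash {
    static final int CAPACITY = 4;  // illustrative; a real bucket would hold ~1 KB of records
    int g = 0;                      // global depth
    Bucket[] dir = { new Bucket(0) };

    int hash(long key) { return (int) (key % (1L << g)); } // key mod 2^g

    public void insert(long key) {
        Bucket b = dir[hash(key)];
        if (b.ids.size() < CAPACITY) { b.ids.add(key); return; }
        if (b.localDepth == g) {    // bucket already uses all g bits: double the directory
            g++;
            Bucket[] bigger = new Bucket[1 << g];
            for (int i = 0; i < bigger.length; i++) bigger[i] = dir[i % dir.length];
            dir = bigger;
        }
        // Split the full bucket: redistribute its keys on one more bit
        Bucket b0 = new Bucket(b.localDepth + 1), b1 = new Bucket(b.localDepth + 1);
        for (long id : b.ids) {
            if (((id >> b.localDepth) & 1) == 0) b0.ids.add(id); else b1.ids.add(id);
        }
        for (int i = 0; i < dir.length; i++) {
            if (dir[i] == b) dir[i] = ((i >> b.localDepth) & 1) == 0 ? b0 : b1;
        }
        insert(key);                // retry; may trigger a further split
    }
}
```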
Now we can write these buckets to an external hash file. Since each bucket is 1 KB and we know how many buckets there are, bucket i lives at a fixed position found with seek(pos), where pos = i * 1024, i.e. pos takes the values 0, 1024, 2048, and so on.
At retrieval time we hash the client id with the same hash function, seek to its bucket in the hash file, and from the stored offset reach the actual record in the input file, which we can print. We can also increment the count in the input file so the next lookup returns correct information.
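The fixed-offset read/write step can be sketched with RandomAccessFile (the class and method names here are illustrative; the 1 KB bucket size is the assumption stated above):

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class BucketFile {
    static final int BUCKET_SIZE = 1024; // 1 KB per bucket, as assumed above

    // Write a bucket's raw bytes at its fixed slot in the hash file
    static void writeBucket(RandomAccessFile f, int bucketIndex, byte[] data) throws IOException {
        if (data.length > BUCKET_SIZE) throw new IllegalArgumentException("bucket overflow");
        f.seek((long) bucketIndex * BUCKET_SIZE); // pos = 0, 1024, 2048, ...
        f.write(data);
    }

    // Read one bucket back by index
    static byte[] readBucket(RandomAccessFile f, int bucketIndex) throws IOException {
        byte[] data = new byte[BUCKET_SIZE];
        f.seek((long) bucketIndex * BUCKET_SIZE);
        f.readFully(data);
        return data;
    }
}
```

Because every bucket occupies exactly BUCKET_SIZE bytes, no index beyond the bucket number itself is needed to locate it.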
My initial response was to use a grep command in Unix to filter the entries and produce a count, but that turned out not to be ideal. I then responded with another idea: a HashMap whose key is a node (clientId, url) and whose value is the visit count. That was right, and she just asked me for pseudocode.
- GBC February 24, 2011