shanthikp
BAN USERdata stream (probabilistic) algorithm like Misra-Gries' Frequent algorithm
- hash table with only 10 items or less at all times
- if elem in hash table increment count by 1
- if elem not in hash table
-- if num elem in hash table < 10, insert elem with count 1
--- else decrement count of every elem in hash table by 1
------- remove elem from hash table if count = 0 or outside time window t
so you need 3 hash tables (for 1 hr, 1 day and 1 month)
why does this work probabilistically ? decrementing the count by 1 for every elem in stream will knock out the less frequently items while the most frequent items will survive..
for vm in vmList:
takeSnapshotAndSchedule(vm)
def takeSnapshotAndSchedulet(vm):
takeSnapshot(vm.vmId)
next_run = datetime.now() + vm.freq_in_min
heappush(job_heap, (next_run, vm))
ever so often check if a vm snapshot is due
next_to_run, x = job_heap[0]
if (datetime.now() > next_to_run ) {
next, vm = heappop(job_heap)
takeSnapshotAndSchedule(vm)
}
O(n^2) using single hash table
- shanthikp September 15, 2018#1: insert all elems into hash table
#2: for each elem in hash table
---- fix elem i as one of the triplets
---- for all other elements j != i :
----- look for sum - (i + j) in hash table
---- if found, output triplet