jtogari223
Part 1 - Overall design
The basic idea is to track session start and end times. You also have an 'interval generation' class which generates pairs of start and end times. Each pair represents a day. For example, {2017-05-12T00:00:00, 2017-05-13T00:00:00} represents the day May 12, 2017. Have a 'worker' class which computes averages for a given collection of pairs/intervals. This 'worker' class would implement the Java Callable interface and return the averages for the days it is assigned. Also have a 'controller' class which uses the 'interval generation' class to retrieve all intervals, creates the 'worker's, assigns a batch of intervals to each worker, and then submits each 'worker' to a Java ExecutorService, which returns a Future for it. Once all the 'worker's finish their assigned tasks, the 'controller' has the results in those Futures.
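A rough sketch of how the controller, interval generation and worker could fit together. All names here (ControllerSketch, DayInterval, generateIntervals, Worker, run) are made up for illustration, the worker returns a stand-in value instead of a real average, and records assume Java 16 or later.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ControllerSketch {

    /** One day as a {start, end} pair, e.g. {2017-05-12T00:00, 2017-05-13T00:00}. */
    record DayInterval(LocalDateTime start, LocalDateTime end) {}

    /** Hypothetical 'interval generation' helper: one interval per day from first to last inclusive. */
    static List<DayInterval> generateIntervals(LocalDate first, LocalDate last) {
        List<DayInterval> intervals = new ArrayList<>();
        for (LocalDate d = first; !d.isAfter(last); d = d.plusDays(1)) {
            intervals.add(new DayInterval(d.atStartOfDay(), d.plusDays(1).atStartOfDay()));
        }
        return intervals;
    }

    /** Placeholder worker: a real one would compute the average call duration for each day. */
    static class Worker implements Callable<Map<DayInterval, Double>> {
        private final List<DayInterval> batch;

        Worker(List<DayInterval> batch) {
            this.batch = batch;
        }

        @Override
        public Map<DayInterval, Double> call() {
            Map<DayInterval, Double> averages = new LinkedHashMap<>();
            for (DayInterval day : batch) {
                averages.put(day, 0.0); // stand-in value, not a real average
            }
            return averages;
        }
    }

    /** Controller: split the intervals into batches, submit one worker per batch, merge the results. */
    static Map<DayInterval, Double> run(List<DayInterval> intervals, int batchSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<DayInterval, Double>>> futures = new ArrayList<>();
            for (int i = 0; i < intervals.size(); i += batchSize) {
                int end = Math.min(i + batchSize, intervals.size());
                futures.add(pool.submit(new Worker(intervals.subList(i, end))));
            }
            Map<DayInterval, Double> merged = new LinkedHashMap<>();
            for (Future<Map<DayInterval, Double>> future : futures) {
                merged.putAll(future.get()); // blocks until that worker has finished
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, ControllerSketch.run(ControllerSketch.generateIntervals(LocalDate.of(2017, 5, 12), LocalDate.of(2017, 5, 16)), 2, 3) would split five days into three batches and return one map entry per day.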
Part 2 - Calculation of Skype call duration for a single day
This is the part done by the 'worker'. The worker is assigned a batch of days.
For each day, the worker collects all calls which started or ended that day. For each such call, it adds the portion of the call's time that falls on that day to a time counter and increments the number of calls (call counter). Once done, the average for that day is the time counter divided by the call counter. It stores this average in a Map with the day interval as key and the average as value. This Map is what the worker returns, and the controller reads it through the Future.
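The per-day calculation above could look something like the following. This is only a sketch: the Call record and the method names are hypothetical, durations are counted in seconds, and a day with no calls is assumed to average 0 (records assume Java 16 or later).

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.List;

class DailyAverageSketch {

    /** Hypothetical call record with a start and an end time. */
    record Call(LocalDateTime start, LocalDateTime end) {}

    /** Seconds of the call that fall between dayStart and dayEnd; 0 if there is no overlap. */
    static long secondsWithinDay(Call call, LocalDateTime dayStart, LocalDateTime dayEnd) {
        LocalDateTime from = call.start().isAfter(dayStart) ? call.start() : dayStart;
        LocalDateTime to = call.end().isBefore(dayEnd) ? call.end() : dayEnd;
        return to.isAfter(from) ? Duration.between(from, to).getSeconds() : 0L;
    }

    /** Average seconds per call for one day, counting only calls that started or ended that day. */
    static double averageSeconds(List<Call> calls, LocalDateTime dayStart, LocalDateTime dayEnd) {
        long timeCounter = 0;
        int callCounter = 0;
        for (Call call : calls) {
            boolean startedToday = !call.start().isBefore(dayStart) && call.start().isBefore(dayEnd);
            boolean endedToday = call.end().isAfter(dayStart) && !call.end().isAfter(dayEnd);
            if (startedToday || endedToday) {
                timeCounter += secondsWithinDay(call, dayStart, dayEnd);
                callCounter++;
            }
        }
        // Assumption: a day without calls reports 0 rather than failing on division by zero.
        return callCounter == 0 ? 0.0 : (double) timeCounter / callCounter;
    }
}
```

A call that runs from 23:50 the previous day to 00:20 contributes only its 20 minutes after midnight to the day's time counter, but still counts as one call for that day.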
Your assumption that average call duration is per user makes sense. If the total number of users is not too large, a single-threaded / single-process design is fine. Calculating the average once per day at 11:55 PM also makes sense, provided you only ask for the average the next day and not during the same day. If the audience is a user who is looking at their own average call duration, the above assumptions make sense.
- jtogari223 February 05, 2018
If the audience is an administrator who looks at average call duration frequently and tracks averages on a per-user or per-day basis, you would need something more accurate (i.e., calculated more frequently) and fast. Also, as the number of users increases (maybe into the millions), we would need a multi-threaded design so that the averages for a single day are calculated faster. To calculate today's averages, the daily 'cron job' or trigger would have to make use of worker threads, each handling a set of users. So each 'worker' instance I mentioned in my earlier post would be for a single day for a set of users. To make the averages available throughout the day and with somewhat more accuracy, the cron job would have to run more frequently, say every three hours.
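One way this could be wired up inside a Java process, instead of an external cron job, is with a ScheduledExecutorService. This is a sketch only: PeriodicAverageSketch, partitionUsers and scheduleRecalculation are hypothetical names, user ids are assumed to be longs, and the recalculation itself is left to the caller as a Runnable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class PeriodicAverageSketch {

    /** Split user ids into batches so that each worker handles one set of users for the day. */
    static List<List<Long>> partitionUsers(List<Long> userIds, int batchSize) {
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < userIds.size(); i += batchSize) {
            int end = Math.min(i + batchSize, userIds.size());
            batches.add(new ArrayList<>(userIds.subList(i, end)));
        }
        return batches;
    }

    /** Run the recalculation immediately and then again every periodHours hours. */
    static ScheduledFuture<?> scheduleRecalculation(
            ScheduledExecutorService scheduler, Runnable recalculation, long periodHours) {
        return scheduler.scheduleAtFixedRate(recalculation, 0, periodHours, TimeUnit.HOURS);
    }
}
```

With Executors.newSingleThreadScheduledExecutor() as the scheduler and a period of 3, the Runnable would run every three hours; inside it, each batch from partitionUsers would be handed to its own worker.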
Boundary scenarios: I agree, the system could monitor on a per-call basis and keep recording the duration. I just assumed other parts of the system would do this or something similar (like tracking the session to check for a timeout) and then record the end time even if the user failed to hit the end button. You could also skip calls whose end time was not recorded so that inaccurate information is not used. Calls made from the same user account: I assumed other parts of the system would take care of this and create a unique session for each device.
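Skipping calls without a recorded end time could be as simple as a filter applied before the averages are computed. In this sketch the Call record and its sessionId field are hypothetical, and a missing end time is assumed to be stored as null (records assume Java 16 or later).

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

class IncompleteCallFilterSketch {

    /** Hypothetical call record; end is null when no end time was recorded. */
    record Call(String sessionId, LocalDateTime start, LocalDateTime end) {}

    /** Keep only the calls that have a recorded end time. */
    static List<Call> completedCalls(List<Call> calls) {
        List<Call> completed = new ArrayList<>();
        for (Call call : calls) {
            if (call.end() != null) {
                completed.add(call);
            }
        }
        return completed;
    }
}
```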