In Learning Science and Educational Data Mining, we often need student time-on-task (ToT) to better understand content usage patterns, to make better assessments and predictions of student performance, for reporting, and for other uses. In the domain of online educational software, ToT is an elusive metric. Many things may happen between the moment the software assigns an activity to a student and the moment it registers that the student has finished the activity: students may be working in class or at home, and they may be practicing leisurely or working under strict time constraints. Students may take breaks or get distracted. Other factors may come into play, such as an imperfect network connection or differences in the technology that students use. The list continues. Very often, a learning scientist has to work with incomplete and imperfect ToT data.
Nonetheless, we often need estimates of ToT, and that is where your challenge begins. You have received a data file with begin and end times for different students working on the same problem, along with how they were scored. Using this dataset, develop an algorithm for estimating student ToT for similar problems and use cases. Explain your methodology, include supporting evidence and arguments, and apply your algorithm to estimate ToT for the problem in this dataset.
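To make the task concrete, here is a minimal sketch of one possible baseline, not the expected solution. It assumes each record is a hypothetical `(begin, end)` pair of `datetime` values; the actual layout of the data file is not described here, and the function names and the cutoff parameter `k` are illustrative choices. The idea is to treat very long gaps (breaks, distractions) as outliers and take the median of what remains.

```python
from datetime import datetime, timedelta
from statistics import median


def durations_seconds(records):
    """Elapsed seconds for each (begin, end) pair; non-positive spans are dropped."""
    spans = [(end - begin).total_seconds() for begin, end in records]
    return [s for s in spans if s > 0]


def robust_tot(spans, k=3.0):
    """Median of the spans after discarding unusually long ones.

    A span counts as unusually long if it exceeds the median by more than
    k scaled MADs (median absolute deviations). k=3.0 is an assumed default.
    """
    med = median(spans)
    mad = median([abs(s - med) for s in spans])
    if mad == 0:
        return med  # no spread to measure, so nothing is trimmed
    cutoff = med + k * 1.4826 * mad  # 1.4826 scales MAD to a normal std. dev.
    kept = [s for s in spans if s <= cutoff]
    return median(kept)


# Made-up example: four plausible attempts and one with a long break.
start = datetime(2024, 1, 1, 9, 0, 0)
records = [(start, start + timedelta(seconds=s)) for s in (60, 75, 90, 80, 3600)]
estimate = robust_tot(durations_seconds(records))
```

A real analysis would also need to justify the cutoff, and could use the scores, for example by estimating ToT separately for correct and incorrect attempts.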
Attached file: tot_data.tgz
Good luck!