pascalkieslich / mousetrap

Process and Analyze Mouse-Tracking Data

Home Page: http://pascalkieslich.github.io/mousetrap

License: GNU General Public License v3.0

R 84.46% C++ 15.54%

analysis clustering mouse-tracking r visualization

mousetrap's Issues

compare measures for reset_timestamps = FALSE with and without extra timestamp 0

This is just a small consistency check I did to compare the computed mouse-measures when the timestamps are not reset to zero. In the first case I just left the data as it is with the first timestamp != 0. In the second case, I added the data for timestamp 0 by hand by repeating the first observation. As you can see in the example code almost all measures are equal. The only measures with discrepancies are AD and vel_min_time.
The "time at which minimum velocity occurred first" is always the first timestamp. For the AD, the differences are quite small.

library(mousetrap)
library(abind) ## function abind()

### read in data, set reset_timestamps = FALSE
mt_data <- mt_import_mousetrap(mt_example_raw, reset_timestamps = FALSE)
mt_data$trajectories[1:2, 1, 1:5]

mt_data <- mt_derivatives(mt_data)
mt_data <- mt_measures(mt_data)


## now add the timepoint 0 and set it to the first observation
time0 <- mt_data$trajectories[ , , 1, drop = FALSE]
time0[ , "timestamps", ] <- 0 

mt_data0 <- mt_data
mt_data0$trajectories <- abind(time0, mt_data0$trajectories)
mt_data0$trajectories[1:2, 1, 1:6]

mt_data0 <- mt_derivatives(mt_data0)
mt_data0 <- mt_measures(mt_data0)


###### compare the computed mouse-measures
all(mt_data$measures == mt_data0$measures)
which( colMeans(mt_data$measures == mt_data0$measures) != 1 )

## the measures AD and vel_min_time are different!

## AD: Average deviation from direct path
## only slightly different 
mt_data$measures$AD
mt_data0$measures$AD

## Time at which minimum velocity occurred first
## is always the first timestamp
mt_data$measures$vel_min_time
mt_data0$measures$vel_min_time
all(mt_data$trajectories[, 1, 1] == mt_data$measures$vel_min_time) 


###### compare the computed derivatives
mt_data$trajectories[1, , 1:6]
mt_data0$trajectories[1, , 1:6]

Sample Entropy R Calculation

I have been playing around with the mousetrap package to validate my mouse feature calculations in Python and I noticed that my sample_entropy function produces different results for calculating the sample entropy of x-position difference values for 6 trials than the mt_sample_entropy function. This seemed weird to me because I actually translated your sample_entropy(x, m, r) function into Python and was able to produce equal results for tests with different example vectors as inputs.

Looking more closely at the mt_sample_entropy function, I found that the difference between our approaches is that you calculate r using the standard deviation of the x-position difference data across all trials, r <- 0.2 * stats::sd(diff(t(trajectories[, , dimension])), na.rm = TRUE), whereas I calculated a separate r-value for each individual trial, using the x-position difference data of the target trial only (which I find more intuitive). What is the rationale behind your approach (or is it a potential bug), and do you see an obvious shortcoming of my approach? I also read through the references you mentioned, but didn't find them very explicit about the sample entropy calculations. Hehman et al. (2015) write in their supplementary files: "Recommended tolerance r is the standard deviation of x-shifts (Δx) across conditions." Was this the reasoning for your approach? What would you recommend if each trial represents a "different" condition or the trials are independent of each other?
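The two tolerance choices can be compared on a toy matrix. This is only a sketch in base R: xpos is an invented trials-by-time matrix, the factor 0.2 and the pooled expression are taken from the line quoted above, and the per-trial variant is the alternative described in this issue, not package behaviour.

```r
# Toy x-positions: 3 trials (rows) x 6 time steps (columns); values are invented
xpos <- rbind(
  c(0, 1, 3, 6, 10, 15),
  c(0, 0, 1, 1, 2, 2),
  c(0, 5, 5, 20, 20, 40)
)

# Pooled tolerance: a single r computed from the x-shifts of all trials together
# (same form as the expression quoted from mt_sample_entropy)
r_pooled <- 0.2 * stats::sd(diff(t(xpos)), na.rm = TRUE)

# Per-trial tolerance: one r per trial, computed from that trial's x-shifts only
r_per_trial <- apply(xpos, 1, function(x) 0.2 * stats::sd(diff(x), na.rm = TRUE))

print(r_pooled)
print(r_per_trial)
```

With the pooled r, a trial with small x-shifts is judged against a tolerance driven by the more variable trials; with per-trial r, every trial is judged relative to its own variability. Which of the two is appropriate is exactly the question raised here.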

idle time for mouse-trajectories with sampling rate depending on movement

For technical reasons it is possible that the mouse-coordinates are recorded whenever the mouse changes the position (in contrast to data that is logged at a constant rate). To get an idea of such data, I include a minimal example that closely mimics my data at hand.
I think for the computation of idle time (and hovers, but I would stick to idle time for the moment), it is important how such data is treated.

library(mousetrap)

## type in timestamps and coordinates, similar to really observed data  
timestamps <- c(30, 50, 341, 349, 357, 365, 373, 381, 389, 397) - 30 # in ms (get rid of initiation time)
xpos <- c(80, 80, 75, 65, 53, 43, 32, 24, 17, 11)
ypos <- c(46, 45, 44, 41, 38, 35, 32, 30, 29, 27)

## it seems that the maximal rate of recording mouse-movements is every 8ms 
diff(timestamps)

## create array of mouse-trajectories 
temp <- matrix(c(timestamps, xpos, ypos), ncol=3, nrow=10)
colnames(temp) <- c("timestamps", "xpos", "ypos")
temp1 <- temp 
tr <- array(NA, dim = c(2, 10, 3), 
            dimnames = list(1:2, 1:10, c("timestamps", "xpos", "ypos")))
tr[1,,] <- temp
tr[2,,] <- temp1
tr

## create mousetrap object
mt_data <- list(data = data.frame(x1 = 1:2, x2 = 4:5), 
                trajectories = tr)
class(mt_data) <- "mousetrap"

Focusing on the timestamps, we see that when the mouse is moving, the logging rate is every 8 ms:

> diff(timestamps)
[1]  20 291   8   8   8   8   8   8   8

My intuition would be that the idle time is 20 + 291 = 311 ms for this trajectory (whether a lower threshold > 8 ms should be defined for times to count as idle time is a different question that should be discussed - when not making this threshold explicit it is determined by the sampling rate).

When using the mousetrap package to compute the idle time on this data, we get 0 idle time by definition, as the coordinates are only logged in case of movement.

## compute measures 
mt_data <- mt_measures(mt_data, flip_threshold = 0, use = "trajectories", 
                       save_as = "measures", hover_threshold = 2000 )
## idle time is zero
mt_data$measures$idle_time

When doing a resampling of the observations to a regular grid using linear interpolation, again the idle time is 0 (or only non-zero in case of mistakes with the machine epsilon).

#### compute equidistant versions of the trajectories with timestamps every 10 milliseconds
mt_data <- mt_resample(mt_data, exact_last_timestamp = FALSE, 
                       save_as = "rs_10_trajectories",
                       step_size = 10L)  

## compute measures 
mt_data <- mt_measures(mt_data, flip_threshold = 0, use = "rs_10_trajectories", 
                       save_as = "measures_10", hover_threshold = 2000 )
## idle time is zero
mt_data$measures_10$idle_time

We already discussed that a possible solution would be to do constant interpolation (see #7). However, we agreed that linear interpolation is better than constant interpolation whenever the mouse is moving.
Maybe it would be a solution to make a mix of linear and constant interpolation, with constant interpolation for time-differences that are above a certain threshold (this threshold should be 0 per default for backwards-compatibility and in my current example it would be set to 8).
What do you think?
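The mixed scheme could look roughly like the following base R sketch. The function name resample_mixed and its arguments are hypothetical and not part of mousetrap; it uses stats::approx() for both interpolation types and picks the constant value wherever the surrounding gap between logged timestamps exceeds the threshold.

```r
# Hypothetical helper: linear interpolation while the mouse is moving,
# constant interpolation across gaps longer than gap_threshold
resample_mixed <- function(timestamps, pos, step_size, gap_threshold) {
  new_t <- seq(min(timestamps), max(timestamps), by = step_size)
  lin <- approx(timestamps, pos, xout = new_t, method = "linear")$y
  con <- approx(timestamps, pos, xout = new_t, method = "constant")$y
  # Index of the logged interval that each new time point falls into
  idx <- findInterval(new_t, timestamps, rightmost.closed = TRUE)
  gap <- diff(timestamps)[idx]
  data.frame(timestamps = new_t, pos = ifelse(gap > gap_threshold, con, lin))
}

# Data from the example above
timestamps <- c(30, 50, 341, 349, 357, 365, 373, 381, 389, 397) - 30
xpos <- c(80, 80, 75, 65, 53, 43, 32, 24, 17, 11)

rs <- resample_mixed(timestamps, xpos, step_size = 10, gap_threshold = 8)
print(rs)
```

With gap_threshold = 8, the position stays at 80 throughout the 291 ms gap instead of drifting linearly towards 75, so a measure that counts time without position change would see that period as idle.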

Otherwise the idle time could be computed as

difft <- diff(timestamps)
sum(difft[difft != min(difft)])

But in my opinion this second solution fits very poorly into the current package design.
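For comparison, the threshold can be made explicit instead of being derived from min(difft). The helper below is a hypothetical sketch, not a mousetrap function; it sums all gaps between consecutive timestamps that exceed a given threshold.

```r
# Hypothetical helper: idle time as the sum of all gaps above a threshold (in ms)
idle_time_from_gaps <- function(timestamps, threshold) {
  gaps <- diff(timestamps)
  sum(gaps[gaps > threshold])
}

timestamps <- c(30, 50, 341, 349, 357, 365, 373, 381, 389, 397) - 30
idle <- idle_time_from_gaps(timestamps, threshold = 8)
print(idle)  # 311, i.e. 20 + 291
```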

Weird "_ideal" values calculated by mt_deviations()

Hi Pascal,

I noticed that there can be trajectories for which mt_deviations() comes up with values for both xpos_ideal and ypos_ideal that seem off. Please consider the 10-point trajectory in the example below. For example, the values for ypos_ideal start at 0 and decrease for a number of steps before increasing towards 1. They should be monotonically rising, though.

I hope this isn't just based on a fault of mine.

Best,
Mathias

library(mousetrap)

xpos <- c(-0.3, 0.121649484536082, 0.342201834862385, 0.446153846153846, 
0.466942148760331, 0.414876033057851, 0.284615384615385, 0.0532110091743122, 
-0.332989690721649, -1)

ypos <- c(0, 0.241237113402062, 0.396330275229358, 0.507692307692308, 
0.59504132231405, 0.669421487603306, 0.738461538461539, 0.809174311926605, 
0.890721649484536, 1)

mt_x <- mt_add_trajectory(mt_example, xpos = xpos, ypos = ypos, id = "deviations")
mt_x <- mt_deviations(mt_x)
mt_x$trajectories["deviations",1:10,2:5]
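As a cross-check that does not depend on the package, the points can be projected orthogonally onto the straight line through the first and last point in base R. Whether mt_deviations() computes the "_ideal" values in exactly this way is an assumption here, not something the issue confirms.

```r
xpos <- c(-0.3, 0.121649484536082, 0.342201834862385, 0.446153846153846,
          0.466942148760331, 0.414876033057851, 0.284615384615385,
          0.0532110091743122, -0.332989690721649, -1)
ypos <- c(0, 0.241237113402062, 0.396330275229358, 0.507692307692308,
          0.59504132231405, 0.669421487603306, 0.738461538461539,
          0.809174311926605, 0.890721649484536, 1)

start <- c(xpos[1], ypos[1])
end <- c(xpos[10], ypos[10])
d <- end - start  # direction of the straight line

# Position of each projected point along the line (0 = start, 1 = end)
s <- ((xpos - start[1]) * d[1] + (ypos - start[2]) * d[2]) / sum(d^2)

xpos_proj <- start[1] + s * d[1]
ypos_proj <- start[2] + s * d[2]
print(cbind(s, xpos_proj, ypos_proj))
```

For this trajectory the first few values of s are negative, because the trajectory initially moves to the right while the line runs to the upper left. A projection onto the infinite line therefore falls before the start point, which would give y values below 0 at the beginning.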

mousetrap-measures computed on irregular and on equidistant versions of the trajectories

As a sensitivity analysis I computed the mouse-measures once on the original irregular data and once on an equidistant version of the trajectories.
For some measures, there is some difference, e.g. for idle_time, hovers;
for the velocity and acceleration measures it makes quite a big difference.

Consider the following MWE:

library(mousetrap)

mt_data <- mt_example

## make the timestamps irregular 
mt_data$trajectories[  , , "timestamps"] <- mt_data$trajectories[  , , "timestamps"] + runif(38*465, 0, 9)

#### compute equidistant versions of the trajectories with timestamps every 50 milliseconds
mt_data <- mt_resample(mt_data, exact_last_timestamp = FALSE, 
                       save_as = "rs_trajectories",
                       step_size = 50L) ## method = "constant" 

## compare regular and irregular version of trajectories
plot(mt_data$trajectories[1, , "timestamps"], mt_data$trajectories[1, , "ypos"])
points(mt_data$rs_trajectories[1, , "timestamps"], mt_data$rs_trajectories[1, , "ypos"], col=2, type="b")


## compute mousetrap measures using the equidistant version of the trajectories
mt_data <- mt_derivatives(mt_data, use = "rs_trajectories") 
mt_data <- mt_measures(mt_data, flip_threshold = 0, use = "rs_trajectories", 
                       save_as = "measures", hover_threshold = 200)


## compute the mouse-measures on irregular observations 
mt_data <- mt_derivatives(mt_data, use = "trajectories") 
mt_data <- mt_measures(mt_data, flip_threshold = 0, use = "trajectories", 
                       save_as = "irregular_measures", hover_threshold = 200)

## plot the measures computed on regular and irregular observations 
for(i in 2:ncol(mt_data$irregular_measures)){
  plot(mt_data$measures[,i], mt_data$irregular_measures[,i], 
       main = colnames(mt_data$measures)[i], xlab = "computed on equidistant trajectories", 
       ylab = "computed on irregular trajectories")
}

all(colnames(mt_data$measures) == colnames(mt_data$irregular_measures))

Do you have a recommendation whether irregular or regular timestamps should be used?
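Why the velocity measures react so strongly can be seen in a small base R example without the package. The data and the velocity() helper are invented for illustration; velocity is taken as distance divided by elapsed time between consecutive samples, and the resampling uses linear interpolation via stats::approx().

```r
# Invented irregular samples (time in ms), movement along x only
timestamps <- c(0, 8, 21, 30, 52, 60)
xpos <- c(0, 4, 4, 10, 12, 20)
ypos <- c(0, 0, 0, 0, 0, 0)

# Hypothetical helper: distance per time unit between consecutive samples
velocity <- function(time, x, y) sqrt(diff(x)^2 + diff(y)^2) / diff(time)

vel_irregular <- velocity(timestamps, xpos, ypos)

# Linear interpolation onto an equidistant grid with 20 ms steps
grid <- seq(0, 60, by = 20)
x_rs <- approx(timestamps, xpos, xout = grid)$y
y_rs <- approx(timestamps, ypos, xout = grid)$y
vel_resampled <- velocity(grid, x_rs, y_rs)

print(max(vel_irregular))
print(max(vel_resampled))
```

Resampling to a coarser grid averages over short bursts of movement, so the maximum velocity is lower on the equidistant version of this toy trajectory than on the irregular one.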

documentation

Just a small note on the documentation:

Line 64 of mousetrap/R/mousetrap.R (at commit 01e170b) is
#' \link{mt_sample_entropy} calculates the initial movement angle.
but I think it should be
#' \link{mt_movement_angle} calculates the initial movement angle.

mt_import_long imports the wrong trajectories when mt_id_label contains mixed case data

Description

If the function mt_import_long is called with an mt_id_label value that references columns containing mixed-case values, the data is imported in the wrong order, i.e. trajectories are assigned to the wrong mt_ids.

Reproducible example

library(dplyr)      # provides %>% and mutate()
library(mousetrap)

dt <- expand.grid(sbj=c(1,2,3),condition=c(3,2,1),timestamp=1:10) %>%
    mutate(condition=c("a","b","C")[condition]) %>%
    mutate(xpos=c(a=10,b=20,C=30)[condition]*timestamp) %>%
    mutate(ypos=c(a=1,b=2,C=3)[condition]+timestamp)

mt_import_long(dt,mt_id_label = c("sbj","condition"),timestamps_label = "timestamp")

Expected:

, , xpos

    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
1_C   30   60   90  120  150  180  210  240  270   300
2_C   30   60   90  120  150  180  210  240  270   300
3_C   30   60   90  120  150  180  210  240  270   300
1_b   20   40   60   80  100  120  140  160  180   200
2_b   20   40   60   80  100  120  140  160  180   200
3_b   20   40   60   80  100  120  140  160  180   200
1_a   10   20   30   40   50   60   70   80   90   100
2_a   10   20   30   40   50   60   70   80   90   100
3_a   10   20   30   40   50   60   70   80   90   100

Actual:

, , xpos

    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
1_C   10   20   30   40   50   60   70   80   90   100
2_C   10   20   30   40   50   60   70   80   90   100
3_C   10   20   30   40   50   60   70   80   90   100
1_b   30   60   90  120  150  180  210  240  270   300
2_b   30   60   90  120  150  180  210  240  270   300
3_b   30   60   90  120  150  180  210  240  270   300
1_a   20   40   60   80  100  120  140  160  180   200
2_a   20   40   60   80  100  120  140  160  180   200
3_a   20   40   60   80  100  120  140  160  180   200

That is, the data for condition a is imported in the position of condition C.

Analysis:

This is caused by a difference in how dplyr::count and order sort mixed-case data.

Example:

dt <- dt %>% mutate(mt_id = paste0(sbj,"_",condition))

dt[order(dt$mt_id),]    # Produces order a,b,C
dplyr::count(dt,mt_id) # Produces order C,a,b
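The same effect can be shown in base R without dplyr. This is a sketch under the assumption that the C,a,b order corresponds to C-locale sorting, which sort(..., method = "radix") always uses for character vectors, whereas the default order() follows the current locale. The counts vector is hypothetical.

```r
ids <- c("1_a", "1_b", "1_C")

# C-locale order: uppercase letters sort before lowercase letters
c_locale_order <- sort(ids, method = "radix")
print(c_locale_order)  # "1_C" "1_a" "1_b"

# Locale-dependent order (a, b, C in many locales)
default_order <- ids[order(ids)]
print(default_order)

# Hypothetical per-id counts that arrive in C-locale order
counts <- c("1_C" = 10, "1_a" = 10, "1_b" = 10)

# Aligning by name instead of by position is independent of either sort order
aligned <- counts[default_order]
print(aligned)
```

Matching by id, or sorting both sides with the same method, avoids assigning values to the wrong mt_id when the two orders disagree.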

mt_exclude_initiation not working

Hi,
Thanks for the awesome package! I have been trying to use the mt_exclude_initiation function, but it does not seem to work. That is, the returned trajectory array seems unchanged. As far as I can see, this applies also to the example in the documentation (with mt_example). I have also tried to apply it to the KH2017 data set. Same result - nothing seems to happen. The resulting trajectory data are identical to the non-modified trajectories. Am I doing something wrong?
Best,
Pierpaolo

vectorize the function point_to_line() for time speed up

To speed up the computation of the idealized line by the function points_on_ideal(), I propose to vectorize the function point_to_line() such that the argument P0 expects a matrix of points and not just a single point.

A small benchmark on computation times shows that the time is reduced drastically. On my machine the mean computation time for an example with 1000 observations can be reduced from about 1400 to 100 microseconds.

As the functions points_on_ideal() and point_to_line() are both internal, it should be easy to replace point_to_line() by the vectorized version and keep everything else as is.
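The idea can be sketched as follows. This is not the attached file and not the package's internal code: project_point and project_points are hypothetical names, and passing the points as a 2 x n matrix (one point per column) is an assumption about the layout.

```r
# Hypothetical single-point version: project P0 onto the line through P1 and P2
project_point <- function(P0, P1, P2) {
  d <- P2 - P1
  s <- sum((P0 - P1) * d) / sum(d^2)
  P1 + s * d
}

# Hypothetical vectorized version: P0 is a 2 x n matrix, one point per column
project_points <- function(P0, P1, P2) {
  d <- P2 - P1
  s <- colSums((P0 - P1) * d) / sum(d^2)
  P1 + outer(d, s)
}

set.seed(1)
P0 <- matrix(runif(2000), nrow = 2)  # 1000 random points
P1 <- c(0, 0)
P2 <- c(-1, 1)

looped <- sapply(seq_len(ncol(P0)), function(j) project_point(P0[, j], P1, P2))
vectorized <- project_points(P0, P1, P2)
print(all.equal(looped, vectorized))
```

Both versions return the same 2 x n matrix; the vectorized one replaces the per-point function calls with a single colSums() and outer() call.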

Attached you will find such a vectorized version of point_to_line(), with some code for benchmarking (R code as a .txt file, since .R files cannot be uploaded here).

Best,
Sarah

vectorize_point_to_line.txt

even though derivatives() is applied before measures() - the distance, velocity and acceleration are not calculated

make it available in python as well

mt_diffmap error "Determine joint bounds Error in x[, , dimensions[1]] : incorrect number of dimensions" and "Error in x[[use]][!condition, , ]: (subscript) logical subscript too long"

Hi,

I used your toy example in this message

#12

and added a grouping variable

grp = c('G1','G1','G1','G2','G2','G2')
dummy = data.frame(grp, x, y, unix_time, event_time, sequence_number)

When I run the following to create a difference heat map:

mt_diffmap(test, condition = "grp", filename = NULL)

it throws the error:

Determine joint bounds Error in x[, , dimensions[1]] : incorrect number of dimensions

because x is a 2D matrix

The complete line of code is

debug: range_x1 = range(x[, , dimensions[1]], na.rm = TRUE)

By the way, if I call mt_import_long without specifying timestamps_label, I get a different error when I call mt_diffmap. (My real data has millisecond resolution, and calling as.numeric on my POSIXct timestamps results in sampling points with the same timestamp, so I thought I'd let mt_import_long create the timestamps.) The error is:

Error in x[[use]][!condition, , ] : (subscript) logical subscript too long
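Regarding the timestamps: the issue does not say why the converted values collapsed. One possible cause, assumed here, is that the fractional seconds were truncated or rounded away. A base R sketch of converting POSIXct values to milliseconds relative to the first sample:

```r
# Three samples 8 ms apart (invented values)
t_posix <- as.POSIXct(1577836800 + c(0, 0.008, 0.016),
                      origin = "1970-01-01", tz = "UTC")

# Truncating to whole seconds makes all three timestamps identical
ts_sec <- as.integer(as.numeric(t_posix))
print(ts_sec)

# as.numeric() itself keeps the fractional seconds; convert to ms since the first sample
ts_ms <- round((as.numeric(t_posix) - as.numeric(t_posix[1])) * 1000)
print(ts_ms)  # 0 8 16
```

A numeric column like ts_ms could then be passed via timestamps_label.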

I'd appreciate any help -- thanks!
Gina

mt_align_start_end() introduces NaNs and -Infs

Hey Pascal,

mt_align_start_end() has been giving me some headaches similar to the ones in this forum thread:
https://forum.cogsci.nl/discussion/4415/question-about-mt-time-normalize-function-in-mousetrap-r-package?

My first intuition was to blame exceedingly short trajectories, too, and excluding them worked until it didn't. Following your second suggestion, I took care of the 'no-variation' trajectories, and that also seemed to work for a while/in some cases.

I believe neither the first nor the second of the conditions you identified as problematic holds true for this small set of values:

mt_data$trajectories[,,]
timestamps xpos ypos
[1,] 0 -20 30
[2,] 10 30 40
[3,] 20 -10 50
[4,] 30 40 60
[5,] 40 -20 80

However, calling mt_align_start_end() with argument defaults on these values results in the following:

mt_data$se_trajectories[,,]
timestamps xpos ypos
[1,] 0 NaN 0.0
[2,] 10 -Inf 0.2
[3,] 20 -Inf 0.4
[4,] 30 -Inf 0.6
[5,] 40 NaN 1.0

Sorry if this is not the conventional way of filing an "issue", this is my first time doing anything like this on GitHub or anywhere.
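One thing the values above have in common is that the first and last xpos are both -20. The following base R sketch shows that a plain linear rescaling reproduces the reported output in that case. The rescale() helper is hypothetical, and both the formula and the target coordinates (start at (0, 0), end at (-1, 1)) are assumptions about what mt_align_start_end() does by default.

```r
# Hypothetical helper: map the first value to new_start and the last to new_end
rescale <- function(v, new_start, new_end) {
  (v - v[1]) / (v[length(v)] - v[1]) * (new_end - new_start) + new_start
}

xpos <- c(-20, 30, -10, 40, -20)
ypos <- c(30, 40, 50, 60, 80)

# xpos starts and ends at -20, so the denominator is 0
x_aligned <- rescale(xpos, new_start = 0, new_end = -1)
y_aligned <- rescale(ypos, new_start = 0, new_end = 1)

print(x_aligned)  # NaN -Inf -Inf -Inf NaN
print(y_aligned)  # 0.0 0.2 0.4 0.6 1.0
```

Under these assumptions, 0/0 gives NaN at the two end points and a nonzero difference divided by 0 gives -Inf after multiplying by the negative target range, even though the trajectory is neither very short nor without variation.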

Angle calculation for only one row of data

I tried to use the mt_derivatives() and mt_angles() functions on an example dataset that only contained one trial (trajectories array shape was 1x100x3). The mt_derivatives() function worked as intended, but the mt_angles() function failed with the error message:

Error in getAnglesP(trajectories[, , dimensions[1]], trajectories[, , (MouseTrap_Feature_Validation.R#42): Not a matrix.

When adding more trials to the dataset (e.g. trajectory array shape 4x100x3), the mt_angles() function worked as intended. I tried to fix the bug myself and create a pull request, but unfortunately I'm an R beginner.
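A likely explanation, offered here as an assumption and not confirmed in the issue, is R's default dropping of dimensions: subsetting a 1 x n x 3 array with [, , k] returns a plain vector, not a 1 x n matrix. A base R sketch with an invented array:

```r
# One trial, five time points, three variables (values are invented)
tr <- array(seq_len(15), dim = c(1, 5, 3),
            dimnames = list("trial1", NULL, c("timestamps", "xpos", "ypos")))

# Default subsetting drops the trial dimension: the result is a plain vector
x_dropped <- tr[, , "xpos"]
print(is.matrix(x_dropped))  # FALSE

# Restoring the matrix shape explicitly keeps one row per trial
x_mat <- matrix(tr[, , "xpos"], nrow = dim(tr)[1],
                dimnames = list(dimnames(tr)[[1]], NULL))
print(dim(x_mat))  # 1 5
```

Code that expects a trials-by-time matrix would need the second form (or drop = FALSE followed by reshaping) to handle a single trial.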

reshape data in long format to data in mousetrap format as array

This is a general question on whether the mousetrap package already contains the following functionality. I convert a mousetrap object into long format, add one or several new variables to the long data, and then I would like to transform the data back to mousetrap format. Consider the following example code:

library(mousetrap)
## example from package 
mt_example <- mt_import_mousetrap(mt_example_raw)

## reshape data into long format 
tr <- mt_reshape(data = mt_example, use = "trajectories", use2 = "data", 
                 use2_variables = c("Condition"), 
                 aggregate = FALSE) 

## add a new fancy variable 
tr$new_var <- rnorm(nrow(tr))

## now reshape back to mousetrap object like mt_example?
?

Is there a possibility to reshape tr back to a mousetrap object like mt_example, where new_var is part of trajectories like "timestamps" "xpos" "ypos"?
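Independently of what the package offers, the reshaping itself can be sketched in base R. The long data frame below is invented; the column names mt_id and mt_seq (trajectory id and position within the trajectory) are assumptions about the long format, and the list structure with class "mousetrap" mirrors the manual construction used in the idle-time issue above.

```r
# Invented long-format data: two trajectories of different length
long <- data.frame(
  mt_id = c("id1", "id1", "id1", "id2", "id2"),
  mt_seq = c(1, 2, 3, 1, 2),
  timestamps = c(0, 10, 20, 0, 10),
  xpos = c(0, 1, 2, 0, -1),
  ypos = c(0, 5, 10, 0, 4),
  new_var = c(0.1, 0.2, 0.3, 0.4, 0.5),
  stringsAsFactors = FALSE
)

ids <- unique(long$mt_id)
vars <- c("timestamps", "xpos", "ypos", "new_var")

# Array of trials x positions x variables, padded with NA for shorter trajectories
arr <- array(NA_real_,
             dim = c(length(ids), max(long$mt_seq), length(vars)),
             dimnames = list(ids, NULL, vars))

for (v in vars) {
  # Matrix indexing: one (trial, position, variable) triple per row of long
  idx <- cbind(match(long$mt_id, ids), long$mt_seq, match(v, vars))
  arr[idx] <- long[[v]]
}

mt_new <- list(data = data.frame(mt_id = ids, row.names = ids),
               trajectories = arr)
class(mt_new) <- "mousetrap"
print(mt_new$trajectories["id1", , ])
```

Here new_var ends up as a fourth variable next to timestamps, xpos and ypos in the trajectories array.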

Error in dimnames(trajectories)[[3]] : subscript out of bounds

Each function called on the resulting mousetrap object from the long import is giving this same error.