Comments (5)
this has been very very helpful! As you might understand, I am not very experienced in the field. I've followed some tutorials and checked the book by Simon Wood, but clearly, I misinterpreted too many things. (especially how ti(timestamp,diff_type, bs = c("fs", "fs"), k = c(30, 30)) works)
I'm very grateful for your clarifications, it makes so much sense now. Thanks!
from gratia.
I don't understand your modelling choices here.
At the moment your model is
diff_values ~ group + diff_type + group:diff_type +
s(timestamp, k = 150, bs = "ad") +
s(timestamp, by=group, bs = "ad", k = 100) +
s(timestamp, by=diff_type, bs = "ad", k = 100) +
s(subject, bs = "re", k = 80) +
ti(timestamp, group, bs = c("fs", "fs"), k = c(30, 30)) +
ti(timestamp, diff_type, group, bs = c("cr", "fs", "fs"), k = c(30, 30, 30)) +
ti(timestamp,diff_type, bs = c("fs", "fs"), k = c(30, 30)) +
ti(timestamp, diff_type, by = group, bs = c("cr", "fs"), k = c(30, 30))
and there are a number of weird things here:
- You already have first order smooth interactions between
timestamp
and both ofgroup
anddiff_type
through the factor by smooths. So why do you re-add these terms as ´ti()` smooths 1 and 3? - You shouldn't use the
"fs"
smooth as a marginal like this. In the definitionti(x, g, bs = c("fs", "fs"))
what are you hoping to achieve? Normally, a"fs"
smooth is of two covariates, a factor and a continuous variable. Here though, you are defining a"fs"
smooth for each margin, where the margins are of single terms. I suspect you'll be gettingti(x, g, bs = c("cr", "re"))
I think. This would be very similar to as(x, by = g)
except thelevels(g)
smooths would be sharing a single smoothing parameter in theti()
version.
To set what I think is the desired model (as I don't understand what you are trying with most of those ti()
terms), I would do this:
diff_values ~
s(timestamp) + # 1
s(timestamp, group, bs = "sz") + # 2
s(timestamp, diff_type, bs = "sz") + # 3
s(timestamp, group, diff_type, bs = "sz") + # 4
s(subject, bs = "re") # 5
where the numbered smooths are
- The average (main) effect of
timestamp
- The smooth difference between 1. and the effects of
timestamp
withingroup
, plus thegroup
means - The smooth difference between 1. and the effects of
timestamp
within diff_type, plus thediff_type
means - The smooth differences of the effects of
timestamp
for combinations ofgroup
anddiff_type
, a three-way interaction between a smooth and two factors, but one whose effects are orthogonal to terms 1, 2, and 3, and subject
-specific means via a random intercept.
I am not surprised that difference_smooths()´ fails for your model; recently, I fixed a bug in gratia that affected
difference_smooths()handling of decomposed HGAM forms (like you are using, with
ti() terms), which might make your model work now - the error was the same as you are seeing from ´combn()
and was because I was picking up the ti(x, z)
term as well as the ´ti(x, z, by = g)` terms. Now I would only select the latter and the former is not a factor-by smooth.
I think the "sz"
basis version of your model is what you want, but unfortunately it doesn't work with difference_smooths()
which is only for factor-by models currently. This is because the difference between the levels of groups defined by group:diff_type
are actually spread over 4 smooth terms.
This doesn't mean it's not possible to do it, I just need to write a function to handle these special bases, so that we select all the required smooths. This however would be a different thing to what difference_Smooths()
was intended as it would have to include the group means and difference_smooths()
doesn't do that by default (although group_means = TRUE
now allows this).
Perhaps try the GitHub version of gratia (to see if your problem is fixed with the changes I did to fix a related issue) and see if that works for you, but perhaps you could clarify your model a bit as that might help as well. And if I have misunderstood your model, let me know and I can take another look. A reproducible example would be very helpful.
from gratia.
Just an FYI; I don't think that you can get adaptive smooths as a marginal in a tensor product so I doubt the "sz"
basis model I showed will get you to exactly where you might want as this basis is implemented internally as a tensor product.
from gratia.
Dear Gavin,
Thank you very much for your quick response and corrections!
It was not clear to me that s(x, by=y) and ti(x,y) serve the same purpose.
Since s( x,y) terms are supposed to be isotropic, I decided to add tensor interactions, and after checking several tutorials, I wanted to include reference and difference terms to capture differences arising from by=facs . (eg: [https://jacolienvanrij.com/Tutorials/GAMM.html#summary] )
the initial model was intended to model pupil size data from 2 groups where each group has 3 condition with ~45 subjects.
data is collected from -750 ms to 4000 and stimulus is introduced at timepoint 0. My fundamental goal is to show in which time intervals, differences between conditions differ between 2 groups.
thus, I used same model for differences, noted as diff_type and diff_value.
the model should be precise enough to be very flat at the beginning and change suddenly after timepoint 0. that's why I choose "ad" and "cr". I tried to achieve this by increasing k with your suggested model, and it improved.
Overall, the model you suggested works perfectly as intended, except before timepoint 0 (even after increasing k). for marking significant timeframes, I used plot_diff {itsadug}, maybe the way that function calculates the CIs is the main problem, not the model.
do you have any suggestions for finding time intervals where smooth differences are significant?
This is for my master's thesis and I definitely will acknowledge your help!
Best,
Ahsen
from gratia.
It was not clear to me that s(x, by=y) and ti(x,y) serve the same purpose.
They don't, and that's not what I said.
s(x, by = f)
produces a smooth ofx
for each level of factorf
.s(x, z)
is an isotropic thin plate spline (or a generalisation of that, Duchon spline)s(x, f, bs = "fs")
is a "random" smooth of x per level off
. This is similar tos(x, by = f)
in that you get a smooth per level off
with both, but in the factorby
version each of those smooths has it's own smoothing parameter so the different smooths can have different wigglinesses, while thebs = "fs"
version has a single smoothing parameter for all the smooth so they share the same wiggliness. The"fs"
version also differs in that it is fully penalized; the constant terms and the linear terms in the basis of each smooth are penalized. This is why this smooth includes the group means forf
while theby
version doesn't.ti(x, z)
is for a pure interaction tensor product smooth. This is a smooth where the main effects have been parameterised out of the basis. Note that for this smooth you have to define the marginal smooths and then mgcv sorts out the tensor product of those smooths.
So, when you write this in your model ti(timestamp,diff_type, bs = c("fs", "fs"), k = c(30, 30))
it doesn't mean what you think it means. Here you have two marginal smooths:
s(timestamp, bs = "fs", k = 30)
, ands(diff_type, bs = "fs", k = 30)
The first one doesn't have the factor part that is required so that will just be a univariate smooth s(timestamp)
and the second one only contains a factor, so that will result in a univariate random effect IIRC, s(diff_type, bs = "re")
. It's unfortunate that mgcv does this silently when you try to build "fs"
smooths that don't contain the required elements.
So your term ti(timestamp,diff_type, bs = c("fs", "fs"), k = c(30, 30))
is actually going to produce this ti(timestamp, diff_type, bs = c("cr", "re"), k = c(30, NA))
(where the NA
means auto-initialise so you get one basis function per level of the factor- if you only have 3 levels for diff_type
then setting this to 30
is meaningless and mgcv will set it back to the maximum, 3).
Now this term that you're actually creating ti(timestamp, diff_type, bs = c("cr", "re"), k = c(30, NA))
, as described above, is going to yield a random smooth of timestamp for the levels of diff_type
. The marginal basis for this will have had the main effect of timestamp
parameterised out of it as this is a pure interaction term. But this is only modelling the effects of timestamp
per diff_type
, which you already included in the model with this term s(timestamp, by=diff_type, bs = "ad", k = 100)
. So you are modelling the group-specific terms twice!
The same goes for this term ti(timestamp, group, bs = c("fs", "fs"), k = c(30, 30))
, which is duplicating this one s(timestamp, by=group, bs = "ad", k = 100)
, and is wrong from the same reason I mentioned above re the marginal smooths not being what you specified them to be; they're converted to "cr"
and "re"
.
The only different between the two sets of terms is that the s()
version is going to be able to estimate smooths that have different wigglinesses while the ti()
versions will share smoothing parameters.
Basically, you're including terms for the same features in your model multiple times and it is only because mgcv tries to stop you shooting yourself in the foot that your model is fitting; mgcv will be removing collinear terms from the model matrix, as much as it can.
You should decide if you want to allow for individual smoothing parameters (by
version) or shared smoothing parameter ("fs"
version). You shouldn't include both.
So this:
diff_values ~ group + diff_type + group:diff_type +
s(timestamp, k = 150, bs = "ad") +
s(timestamp, by = interaction(group, diff_type, drop = TRUE), bs = "ad", k = 100) +
s(subject, bs = "re")
or this
diff_values ~ s(timestamp, k = 150, bs = "ad") +
s(timestamp, interaction(group, diff_type, drop = TRUE), bs = "fs", k = 100,
xt = list(bs = "ad")) + # might not work, might need `list(bs = "cr")`
s(subject, bs = "re")
Or, if you really want different smoothing parameters smooths of diff_type
within group
and vice versa, then you could do
diff_values ~ s(timestamp, k = 150, bs = "ad") +
s(timestamp, group, by = diff_type, bs = "fs", k = 100,
xt = list(bs = "ad")) + # might not work, might need `list(bs = "cr")`
s(timestamp, diff_type, by = group, bs = "fs", k = 100,
xt = list(bs = "ad")) + # might not work, might need `list(bs = "cr")`
s(timestamp, interaction(group, diff_type, drop = TRUE), bs = "fs", k = 100,
xt = list(bs = "ad")) + # might not work, might need `list(bs = "cr")`
s(subject, bs = "re")
or you can use the sz
version I mentioned.
Since s( x,y) terms are supposed to be isotropic
They are isotropic if x
and y
are continuous and you leave the bs
argument at it's default "tp"
or use "ts"
or "ds"
. If you use s(x, f, bs = "fs")
that smooth is not isotropic because it is univariate.
... (eg: [https://jacolienvanrij.com/Tutorials/GAMM.html#summary] )
You should note that they used:
ti(Time, Condition)
, andti(Time, Condition, by=OFGroup)
, and smooths likes(Time, Subject, bs='fs', m=1)
but none of those terms matches what you were doing because Time
and Condition
are both continuous, whereas you only have a single continuous variable timestamp
.
do you have any suggestions for finding time intervals where smooth differences are significant?
You will need to define a correct model first. Without that, all else is pointless.
from gratia.
Related Issues (20)
- Support for models with `gfam()` family
- `data_sim()`
- gfam support in `appraise()`
- `draw()` works with gfam models HOT 1
- gfam support in `fitted_samples()`
- gfam support in `posterior_samples()`
- gfam support in `smooth_samples()`
- gfam support in `simulate.gam()`
- `compute_partial_residuals` is wrong if deviance residuals are not equivalent to pearson residuals? HOT 2
- Ensure continuity of gam from start of series to end of series HOT 1
- Tests fail with current version of `Matrix` (1.6-5): `function 'chm_factor_ldetL2' not provided by package 'Matrix'` HOT 3
- Improve plotting code for SOS (spline-on-the-sphere) smooths
- Work towards bayesplot and loo integration?
- Select mvn_method when running `fitted_samples()` HOT 1
- shift=TRUE not working for confint.gam() HOT 5
- Possible PR: making posterior draws compatible with MCMC diagnostic packages
- `draw()` fails with `parametric = TRUE` if there aren't any parametric effects to actually plot
- `draw` forgets order of ordered factor HOT 2
- Encountering errors with difference_smooths() - potential bug? HOT 1
- `appraise()` and `qq_plot()` need a `seed` argument and to run the RNG-using bits with that seed
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gratia.