ax3man / phylopath
Perform phylogenetic path analysis in R.
Home Page: https://ax3man.github.io/phylopath/
This package imports and reexports guide_train.edge_colourbar from ggraph. As guide_train is now a proper exported S3 generic in ggplot2, the specific methods are no longer exported and thus cannot be imported.
Can I get you to fix this and submit to CRAN?
Line 265 in b260035
Hi, I am running the example described here: https://cran.r-project.org/web/packages/phylopath/vignettes/intro_to_phylopath.html. When I run results <- phylo_path I get this error message:
Error in s[nrow(s), ncol(s)] : incorrect number of dimensions
Any thoughts?
(I first tried with my own data and had the same problem, so I decided to try the example. I restarted RStudio and the problem persists.)
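For readers unfamiliar with this message, the following base R sketch shows one general way it arises: two-index subsetting applied to an object that has no dimensions. The object s here is a hypothetical stand-in, and nothing in the report confirms that this is what happens inside phylo_path.

```r
# Hypothetical stand-in: a plain numeric vector where a matrix was expected.
s <- c(0.2, 0.5, 1)

# Two-index subsetting on a vector fails, because a vector has no dim attribute.
msg <- tryCatch(
  s[nrow(s), ncol(s)],
  error = function(e) conditionMessage(e)
)
print(msg)

# The same expression works once the object really is a matrix.
m <- matrix(s, nrow = 1)
last <- m[nrow(m), ncol(m)]
print(last)
```

A matrix often turns into a vector silently when a single row or column is extracted without drop = FALSE, so that is one place to look when this error appears.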
A future enhancement to phylopath, if it were technically possible, would be the integration of the functions in the sensiPhy package to provide information on the sensitivity of path models to sample, data, and phylogenetic uncertainty. Like phylopath, sensiPhy uses the phylolm package, which might make integration easier. A challenge would be providing summary information on sensitivity for a whole path model, though even a table that presents the sensitivity information for each individual regression in the model in one place would be helpful.
Hi,
I'm using the phylo_path function to compare models (using the logistic_IG10 method). I'm getting errors saying that it can't fit a model with two particular variables in it, but I've checked and checked again and there are no models in the set that contain both these variables together, even via indirect links. I think I must have misunderstood what phylo_path is doing - I thought it was running each model independently, but is each model also using the other models in the set?
Many thanks
Hello again,
I've been trying to automate the model generation procedure so that all possible causal models, except for those that are either cyclic or fully connected, can be generated for some given number of variables without having to enter each model manually. Using the rhino data set to illustrate the point (and with three variables only), the main code looks like this:
library(phylopath)
library(gtools)
Nvar <- 3 #number of variables
#detach("package:ggm") # Disable ggm package because it masks DAG() function in 'phylopath::DAG'
dagobj <- DAG(BM ~ LS, LS ~ DD) # generate a DAG with the three variables of interest
dagobj[1:Nvar,1:Nvar] <- matrix(0,Nvar,Nvar) # preset DAG matrix entries to 0
# --- Generate all possible binary strings, in order to allocate to DAG matrices
zerosones <- permutations(2,Nvar^2,v=c(0,1),repeats.allowed=TRUE)
# --- Remove models with reflexive edges (these are not DAGs)
cycles <- matrix(0,dim(zerosones)[1],1)
for (i in 1:Nvar){ cycles[which(zerosones[,Nvar*(i-1)+i] == 1)] <- 1 }
zerosones_2 <- zerosones[-which(cycles==1),]
# --- Allocate binary strings to copies of the DAG object
models_1 <- rep(list(dagobj),dim(zerosones_2)[1])
for (i in 1:dim(zerosones_2)[1]){ models_1[[i]][,] <- zerosones_2[i,] }
# --- Further remove models, this time ones that have bidirectional edges
cycles_2 <- matrix(0,length(models_1),1)
for (k in 1:length(models_1)){ cycles_2[k] <- length(intersect( which(models_1[[k]] == t(models_1[[k]])), which(models_1[[k]]==1))) >0}
models_2 <- models_1[which(cycles_2==0)]
# --- then do an additional check for cyclic models using ggm::isAcyclic()
#library(ggm)
cycles_3 <- matrix(0,length(models_2),1)
for (i in 1:length(models_2)){ cycles_3[i] <- ggm::isAcyclic(models_2[[i]])==0 }
models_3 <- models_2[which(cycles_3==0)]
# --- remove models for which d-separation cannot be evaluated due to full connectedness, using ggm::basiSet()
cycles_basisnull <- matrix(0,length(models_3),1)
for (i in 1:length(models_3)){ cycles_basisnull[i] <- is.null(ggm::basiSet(models_3[[i]])) } # ggm::basiSet
models_auto <- models_3[which(cycles_basisnull==0)]
After running the above code, the list models_auto should contain 19 models. And it should be basically identical to the following manually constructed list:
models_manual <- list(
m1=DAG(BM~BM,LS~LS,DD~DD), m2=DAG(BM~LS,DD~DD), m3=DAG(BM~DD,LS~LS), m4=DAG(BM~DD,BM~LS), m5=DAG(LS~BM,DD~DD), m6=DAG(LS~BM,BM~DD),
m7=DAG(LS~DD,BM~BM), m8=DAG(LS~DD,BM~LS), m9=DAG(LS~DD,BM~DD), m10=DAG(LS~DD,LS~BM), m11=DAG(DD~BM,LS~LS), m12=DAG(DD~BM,BM~LS),
m13=DAG(DD~BM,LS~BM), m14=DAG(DD~BM,LS~DD), m15=DAG(DD~LS,BM~BM), m16=DAG(DD~LS,BM~LS), m17=DAG(DD~LS,BM~DD), m18=DAG(DD~LS,LS~BM), m19=DAG(DD~LS,DD~BM)
)
However, whereas phylo_path() works for models_manual, it doesn't work for models_auto. Specifically, phylo_path(models_auto, rhino, rhino_tree) gives the following error:
Error in mutate_impl(.data, dots) : Evaluation error: comparison (6) is possible only for atomic and list types.
Do you have any idea why phylo_path() works for one but not the other? Is the apparent identity between models_manual and models_auto actually false? I need to automate model generation in this way because I'm going to scale up from 3 to 6 or 7 variables (a huge increase in the number of possible models), and also want to be able to approach the analysis in a theory-free manner.
Any advice would be appreciated.
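As a cross-check on the count of 19, the same enumeration can be sketched in base R without gtools or ggm. This is a minimal sketch, not the poster's method: the helper names is_acyclic and is_fully_connected are made up for illustration. The results are plain 0/1 matrices, so they would still need whatever class and attributes DAG() attaches before being passed to phylo_path(), which this sketch does not attempt.

```r
# Enumerate all DAGs on a small set of variables using base R only.
vars <- c("BM", "LS", "DD")
n <- length(vars)

# Off-diagonal cells are the only places an edge can go (no self-loops).
off_diag <- which(row(diag(n)) != col(diag(n)))

# TRUE if the graph has no directed cycle: a cycle of length k puts a
# positive entry on the diagonal of the k-th power of the adjacency matrix.
is_acyclic <- function(adj) {
  power <- adj
  for (k in seq_len(nrow(adj))) {
    if (any(diag(power) > 0)) return(FALSE)
    power <- power %*% adj
  }
  TRUE
}

# Every combination of 0/1 over the off-diagonal cells.
grid <- as.matrix(expand.grid(rep(list(0:1), length(off_diag))))

all_graphs <- lapply(seq_len(nrow(grid)), function(i) {
  adj <- matrix(0, n, n, dimnames = list(vars, vars))
  adj[off_diag] <- grid[i, ]
  adj
})

dags <- Filter(is_acyclic, all_graphs)

# In a DAG each pair of variables shares at most one edge, so the graph is
# fully connected exactly when it has choose(n, 2) edges.
is_fully_connected <- function(adj) sum(adj) == n * (n - 1) / 2
models <- Filter(Negate(is_fully_connected), dags)

length(dags)    # 25 DAGs on three labelled variables
length(models)  # 19 after dropping the 6 fully connected ones
```

The grid has 2^(n * (n - 1)) rows, so this brute-force approach is only practical for a handful of variables; at 6 or 7 variables a smarter generator would be needed.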
Also see #16.
I should probably just coerce all to data.frame
Hello Wouter,
I have no problem running the example analysis that uses the 'rhino' data set and rhino_tree. But when I try to run the function phylo_path() using instead a tree downloaded from 10kTrees, I get errors such as:
Error in mutate_impl(.data, dots) : Evaluation error: comparison (6) is possible only for atomic and list types.
and
Error in cor_fun(par, .x): object "phy" is not of class "phylo"
10kTrees is (as you might know) a resource used by many researchers, and the downloaded tree should be a properly formatted phylo object. Even when I use the rhino data set and the models as specified in your tutorial but together with a 10kTrees primate tree (after replacing the rhino species names with a subset of the primate ones), I still get the former error listed above, so I assume the problem is the tree itself, although I'm not completely sure. I do have the most recent versions of the dependent packages.
I have no idea what I should be trying out, so any suggestions would be appreciated! If there is any other information or code that I could provide in order to clarify, please let me know.
Thanks,
Ryu
My reverse dependency check shows some problems between phylopath and the next ggraph version:
Missing link or links in documentation object 'plot.DAG.Rd':
‘[ggraph:create_layout.igraph]{ggraph::create_layout.igraph()}’
Missing link or links in documentation object 'plot.fitted_DAG.Rd':
‘[ggraph:create_layout.igraph]{ggraph::create_layout.igraph()}’
See section 'Cross-references' in the 'Writing R Extensions' manual.
This is basically because the internal network representation has changed. I suggest you link directly to create_layout instead of to one of the methods. I plan on submitting ggraph to CRAN on September 1st and hope you'll be able to have a fix on CRAN by then.
best
Thomas
Hi Wouter,
Thanks for making this very useful package!
Is there a way to alter the value for the 'btol' argument when using the phylo_path function? Normally, I am able to make this adjustment when using phyloglm but I can't seem to make this adjustment when using phylopath.
Do you have any suggestions?
Thanks,
Louis
@Ax3man I need to update phylosem, and phylopath is a dependency. However, the code below fails with a warning about package graph not being available. graph appears to be a dependency of ggm. Any ideas on what's happening here?
models_pp <- phylopath::define_model_set(
one = c(RS ~ DD),
two = c(DD ~ NL, RS ~ LS + DD),
three = c(RS ~ NL),
four = c(RS ~ BM + NL),
five = c(RS ~ BM + NL + DD),
six = c(NL ~ RS, RS ~ BM),
seven = c(NL ~ RS, RS ~ LS + BM),
eight = c(NL ~ RS),
nine = c(NL ~ RS, RS ~ LS),
.common = c(LS ~ BM, NL ~ BM, DD ~ NL)
)
ggraph has changed the curvature argument to strength.
Hi @Ax3man!
Thanks for a really useful package! I was wondering if you are planning to provide support for the inclusion of non-binary categorical variables.
Thank you!
Joan
Hi Wouter,
I am getting an error when running even the example of phylo_path() code:
candidates <- list(A = DAG(LS ~ BM, NL ~ BM, DD ~ NL),
B = DAG(LS ~ BM, NL ~ LS, DD ~ NL))
p <- phylo_path(candidates, rhino, rhino_tree)
error:
Error: Fitting the following model:
DD ~ NL + BM
produced this error:
Error in nlme::gls(..., correlation = cor_fun(par, .x)):
model must be a formula of the form "resp ~ pred"
I am running R 3.4, phylopath 0.3.0 and nlme 3.1-131 on MAC OSX
Any ideas why that might be?
Thank you for your help and the great package!
Cheers,
Christoph
Hi there,
great package, thanks for sharing and maintaining it!
I was able to define boot values in an older version of phylo_path(). When I try and define boot values in the latest version (v.1.1.3 from CRAN) of phylopath I get the following error: "formal argument "boot" matched by multiple actual arguments." R and all my other packages are up to date.
Is there a change in how the boot argument is being passed on to phylolm in the updated version of phylopath? Or am I just doing something silly here? The outcome variable is continuous if that helps, and a sample of code is below. Happy to pass on any additional information if that would be useful.
Much appreciated,
Joseph
best_result <- phylo_path(best_models_DAG,
data = data_use,
tree = pruned_tree,
model = 'lambda',
method = "logistic_MPLE",
boot = 100)
Pruned tree to drop species not included in dat.
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s
Error: Fitting the following model:
Unaffiliated ~ Education_Index + Life_expectancy_Index
produced this error:
formal argument "boot" matched by multiple actual arguments
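The message above is a standard R error. The base R sketch below shows one general way it arises: a wrapper sets an argument by name and also forwards the same argument through its dots. The functions fit_inner and fit_wrapper are hypothetical and are not taken from phylopath or phylolm, so this only illustrates the mechanism and is not a diagnosis of the package.

```r
# Hypothetical inner fitting function with a 'boot' argument.
fit_inner <- function(formula, boot = 0) {
  boot
}

# Hypothetical wrapper that fixes 'boot' itself and also forwards '...'.
fit_wrapper <- function(formula, ...) {
  fit_inner(formula, boot = 0, ...)
}

# Without a user-supplied 'boot', the call works.
ok <- fit_wrapper(y ~ x)

# With a user-supplied 'boot', the inner call receives 'boot' twice.
msg <- tryCatch(
  fit_wrapper(y ~ x, boot = 100),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If something like this is the cause, the argument would have to be supplied through whatever route the wrapper provides for it rather than through the dots.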
d <- DAG(LS ~ BM, NL ~ BM, DD ~ NL + LS)
d_fitted <- est_DAG(d, rhino, ape::corBrownian, rhino_tree)
plot(d_fitted)
plot(d_fitted, labels = c(LS = 'ls', BM = 'bm', NL = 'nl', DD = 'dd'))
Should be the same ordering.
Hi Ax3man,
I would like to be able to convert the odds ratios obtained from phylopath models to probabilities to make results easier to interpret. For this I would need the baseline coefficients, which aren't provided. I presume it doesn't make sense to have a baseline coefficient when each phylopath model is comprised of many regression models, but thought I would ask just in case!
Many thanks
However, whenever I start a cluster e.g.
result <- phylo_path(all_models,
data = data_use,
tree = tree,
model = 'lambda',
method = "logistic_MPLE",
parallel = "FORK"
)
... the analysis fails with "Error: $ operator is invalid for atomic vectors".
It's easy to replicate: just set the parallel flag to either 'FORK' or 'SOCK' and it happens.