hmorlon / panda Goto Github PK

Phylogenetic ANalyses of DiversificAtion

R 98.15% C 1.85%

panda's Introduction

RPANDA

R: Phylogenetic ANalyses of DiversificAtion

Implements fits of diversification and phenotypic evolution models to phylogenetic data. See Morlon et al. PLoSB (2010), Morlon et al. PNAS (2011), Condamine et al. Eco Lett (2013), Morlon et al. Eco Lett (2014), Manceau et al. Eco Lett (2015), Lewitus & Morlon Syst Biol (2016), Drury et al. Syst Biol (2016), Manceau et al. Syst Biol (2016), Clavel & Morlon PNAS (2017), Drury et al. Syst Biol (2017), Lewitus & Morlon Syst Biol (2017), Drury et al. PLoSB (2018), Clavel et al. Syst Biol (2019), Maliet et al. Nature Ecol Evol (2019), Condamine et al. Eco Lett (2019), Aristide & Morlon Eco Lett (2019), Billaud et al. Syst Biol (2019), Lewitus & Morlon Syst Biol (2019), Maliet et al. Eco Lett (2020), Drury et al. PloS B (2021), Perez-Lamarque & Morlon Mol Ecol (2022), Perez-Lamarque et al. Peer Comm J (2022), Mazet et al. Methods Ecol Evol (2023), Drury et al. Current Bio (2024).

More information on the RPANDA package and worked examples can be found in Morlon et al. (2016)

The current stable version of the RPANDA package (2.2) is available on the CRAN repository. https://cran.r-project.org/package=RPANDA

Package Installation

From gitHub You can install RPANDA directly from gitHub with devtools:

library(devtools)

install_github("hmorlon/PANDA", dependencies = TRUE)

From the binaries

You can download the pre-released binaries for Windows and Mac OS X from the release page

From the source

Otherwise, you can install it directly from the source. Download the RPANDA folder and then, from the terminal console (linux, windows or mac):

R CMD build RPANDA

This will produce the RPANDA tarball.

Then, for compiling the binary:

R CMD INSTALL --build RPANDA_2.X.tar.gz

Report an issue

Any bugs encountered when using the package can be reported here

panda's People

Contributors

Stargazers

Watchers

Forkers

mmanceau florianhartig simongreenhill elewitus theopannetier ccbio cnyuanh jluiselli vsenderov alexsla phylobiology hherhold rnaimehaom hide-fun hansonmenghan

panda's Issues

update RPANDA---Error : object 'GIC' is not exported by 'namespace:mvMORPH'

installing the source package ‘RPANDA’

trying URL 'https://cran.rstudio.com/src/contrib/RPANDA_1.4.tar.gz'
Content type 'application/x-gzip' length 299199 bytes (292 KB)
downloaded 292 KB

installing source package 'RPANDA' ...
** package 'RPANDA' successfully unpacked and MD5 sums checked
** R
** data
** inst
** byte-compile and prepare package for lazy loading
Error : object 'GIC' is not exported by 'namespace:mvMORPH'
ERROR: lazy loading failed for package 'RPANDA'
removing 'C:/Users/xxx/R/win-library/3.5/RPANDA'
In R CMD INSTALL
Warning in install.packages :
installation of package ‘RPANDA’ had non-zero exit status

shift.estimates()

Hello - I am trying to use shift.estimates() with the dolphin data and my own data. I am running it on my own Mac book - but it seems the analysis never finishes. How long does it usually take?

[WARNING] daspk-- warning.. At T(=R1) and stepsize H (=R2) the nonlinear solver failed to converge repeatedly of with abs (H) = HMIN &g, 0

Hi,
I'm running

library(Rcpp)
library(ape)
library(terra)
library(raster)
library(RPANDA)

phy <- read.tree('.../treeDim.tre')

# fit ClaDS with a proportion of 153/350 as sampling fraction (153 spp. in the tree vs ~350 total species)
setwd('...')
sample_fraction <- 153/350

### Option 1: run for the full 300k iterations at once
sampler <- fit_ClaDS(phy,sample_fraction,iterations=300000,thin=250,file_name='sampler300k',model_id="ClaDS2",nCPU = 3)

and getting this warning at two points in the process

Done. in progress 1000 of 1000 please wait!
Done. in progress 2000 of 2000 please wait!
Done. in progress 2000 of 2000 please wait!
Done. in progress 3000 of 3000 please wait!
Done. in progress 3000 of 3000 please wait!
Done. in progress 4000 of 4000 please wait!
Done. in progress 4000 of 4000 please wait!
daspk--  warning.. At T(=R1) and stepsize H (=R2) the
      nonlinear solver failed to converge
      repeatedly of with abs (H) = HMIN &g, 0
Done. in progress 5000 of 5000 please wait!
daspk--  warning.. At T(=R1) and stepsize H (=R2) the
      nonlinear solver failed to converge
      repeatedly of with abs (H) = HMIN &g, 0
Done. in progress 5000 of 5000 please wait!
Done. in progress 6000 of 6000 please wait!
Done. in progress 6000 of 6000 please wait!
Done. in progress 7000 of 7000 please wait!
Done. in progress 7000 of 7000 please wait!
Done. in progress 8000 of 8000 please wait!
Done. in progress 8000 of 8000 please wait!
Done. in progress 9000 of 9000 please wait!
Done. in progress 9000 of 9000 please wait!
Done. in progress 10000 of 10000 please wait!
Done. in progress 10000 of 10000 please wait!
Done. in progress 11000 of 11000 please wait!
Done. in progress 11000 of 11000 please wait!
Done. in progress 12000 of 12000 please wait!
Done. in progress 12000 of 12000 please wait!
Done. in progress 13000 of 13000 please wait!
Done. in progress 13000 of 13000 please wait!
Done. in progress 14000 of 14000 please wait!
Done. in progress 14000 of 14000 please wait!
 MCMC in progress 14500 of 15000 please wait!



 MCMC in progress 18250 of 19000 please wait!





Done. in progress 25000 of 25000 please wait!
Done. in progress 26000 of 26000 please wait!
Done. in progress 27000 of 27000 please wait!
Done. in progress 28000 of 28000 please wait!
Done. in progress 29000 of 29000 please wait!
Done. in progress 30000 of 30000 please wait!
Done. in progress 31000 of 31000 please wait!
Done. in progress 32000 of 32000 please wait!
Done.

Is this warning something I should worry about?
Thank you,
-madza

fit_t_standard vs. createModel + fitTipData

Hi RPANDAs team,

First off, I just want to say how amazing this package and all the work you do is. But I'm having an issue understanding some of the parameters and differences in the functions.

Specifically, what is the difference between the fit_t_standard function vs. the combination of createModel and fitTipData functions?

Using the same phylogeny, trait data, and error vector I get different inferred parameters with these two functions.
When I use the fitTipData function with a BM model, the inferred parameters are m0, v0, d,sigma . When I use the fit_t_standard function the inferred parameters are sigma2 and nuisance and there is an additional theta parameter. It seems as though sigma = sigma2, v0 = nuisance, and m0 = theta, but what is d?

Similarly, for an OU model, fitTipData gives m0, v0, psi, theta, sigma while fit_t_standard gives sigma2, alpha, nuisance, theta. It seems like now theta = theta, but now I don't know what m0 or psi mean and where is alpha in the first function? None of the values in the first function seem to be anywhere near the estimated alpha from the second function.

Am I missing something? Is there a document that explains these parameters in more detail?

Thank you,

Rhett

Fitting environmental model

Dear all,
I am trying to fit an environmental-dependent model using InfTemp data of RPANDA package. My tree goes back to 42.08 Myr at the basal node. And I understand that the InfTemp data goes to 67 Myr. I am sorry about this question (I am a R-user, I do not know anything about programming): Does the fit_env function take this differnce into account or should I indicate something inside the function? I was expecting some relationship between diversification and this variable but I got a result that says that the best model is the one with speciation and extinction as constants (I tried linear and exponential models). So I want to know if I am doing the things right.
Thank you in advance,
Alicia

JSDtree fails on some other trees

Thanks for fixing #23 and #25! Here is another one (with the latest version of master, of course):

When I run JSDtree, it usually does exactly what it should: produce a matrix with zeroes on the diagonal and Jensen-Shannon distances in the other cells. Well done 👍

However, for some trees, the function fails:

tree_1 <- ape::read.tree(text = "(((t3:0.1411719966,(t5:0.03526457828,t4:0.03526457828):0.1059074183):0.06732406607,t6:0.2084960626):1.756416178,((t7:0.2838467566,t1:0.2838467566):0.4396700074,t2:0.723516764):1.241395476);")
tree_2 <- ape::read.tree(text = "((t1:0.01716155337,t4:0.01716155337):0.003048532642,(t3:0.01927721092,t2:0.01927721092):0.0009328750893);")
trees <- c(tree_1, tree_2)

# Should pass
testthat::expect_silent(
  RPANDA::JSDtree(trees)
)

Running this, gives the following error:

Error in density.default(m[[i]]) : 
  need at least 2 points to select a bandwidth automatically

Just for convenience, these are the trees:

I would expect either a clear error message or a (valid) result. I hope you will add this reprex to your test suite and fix it.

Just for completness, I add the complete brute-force script I used to find these trees:

library(RPANDA)

n <- 100
distances <- rep(NA, n)

for (i in seq_len(n)) {
  set.seed(i)
  trees <- c(
    ape::rcoal(n = sample(x = seq(3, 10), size = 1)),
    ape::rcoal(n = sample(x = seq(3, 10), size = 1))
  )
  distances[i] <- RPANDA::JSDtree(phylo = trees)[1, 2]
}

Good luck fixing and keep up the good work 👍

"relToAbs" not available for .Call() for package "RPANDA"

Running the example code for CLADS analysis fails:

set.seed(1)

obj= sim_ClaDS( lambda_0=0.1,    
                mu_0=0.5,      
                sigma_lamb=0.7,         
                alpha_lamb=0.90,     
                condition="taxa",    
                taxa_stop = 20,    
                prune_extinct = TRUE)  

tree = obj$tree
speciation_rates = obj$lamb[obj$rates]
extinction_rates = obj$mu[obj$rates]

plot_ClaDS_phylo(tree,speciation_rates)

sampler = fit_ClaDS0(tree=tree,        
              name="ClaDS0_example.Rdata",      
              nCPU=1,               
              pamhLocalName = "local",
              iteration=500000,
              thin=2000,
              update=1000, adaptation=5)

Generates this error:

Error in .Call("relToAbs", lambda = as.numeric(lambda2), parents = as.integer(parents),  : 
  "relToAbs" not available for .Call() for package "RPANDA"

Error using spectR function

Hi,

I'm trying to run RPANDA using my own tree as well as the data in the package, however, when I run:

res<-spectR(Phyllostomidae)

I receive the following error message:

Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

I'm running a Mac OSX 10.15.7 (Catalina) with RStudio 1.3.1093 and R 4.0.3

I hope you can help out with this!

Cheers
Pepijn

issue using fit_ClaDS function

Hello, I am trying to use fit_ClaDS() to get branch-specific speciation rates for a bird phylogeny, but I am getting warnings and not sure how to troubleshoot the issue.

The phylogeny, in .tre format, is in a zipped folder here:
mytree.zip

The R code I am using is:

tree <- read.tree('mytree.tre')
clads <- fit_ClaDS(tree = tree, sample.fraction = 1, 
                   iterations = 10000)
#  Here I get a warning: 1: In likelihood(param2, sample_fraction, former) :   restarting interrupted promise evaluation

plot_ClaDS_chains(clads)
#this generates the following error:  Error in plot.new() : figure margins too large

Session Info:

Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RPANDA_2.2      picante_1.8.2   nlme_3.1-162    vegan_2.6-4    
[5] lattice_0.20-45 permute_0.9-7   ape_5.7-1 

loaded via a namespace (and not attached):
  [1] colorspace_2.1-0        class_7.3-21            modeltools_0.2-23      
  [4] glassoFast_1.0          mclust_6.0.0            corpcor_1.6.10         
  [7] rmutil_1.1.10           clue_0.3-64             statip_0.2.3           
 [10] ggrepel_0.9.3           flexmix_2.3-19          optimParallel_1.0-2    
 [13] fansi_1.0.4             mvtnorm_1.2-2           codetools_0.2-19       
 [16] splines_4.2.3           R.methodsS3_1.8.2       mnormt_2.1.1           
 [19] doParallel_1.0.17       robustbase_0.95-1       spam_2.9-1             
 [22] cluster_2.1.4           Rmpfr_0.9-2             kernlab_0.9-32         
 [25] R.oo_1.25.0             stabledist_0.7-1        compiler_4.2.3         
 [28] Matrix_1.5-3            cli_3.6.1               tools_4.2.3            
 [31] gmp_0.7-1               modeest_2.4.0           igraph_1.4.3           
 [34] dotCall64_1.0-2         coda_0.19-4             gtable_0.3.3           
 [37] glue_1.6.2              clusterGeneration_1.3.7 maps_3.4.1             
 [40] fastmatch_1.1-3         Rcpp_1.0.10             statnet.common_4.9.0   
 [43] raster_3.6-20           vctrs_0.6.2             iterators_1.0.14       
 [46] fpc_2.2-10              timeDate_4022.108       network_1.18.1         
 [49] spatial_7.3-16          lifecycle_1.0.3         phangorn_2.11.1        
 [52] GUniFrac_1.7            statmod_1.5.0           pvclust_2.2-0          
 [55] terra_1.7-29            DEoptimR_1.0-14         TESS_2.1.2             
 [58] MASS_7.3-58.2           scales_1.2.1            timeSeries_4030.106    
 [61] subplex_1.8             parallel_4.2.3          expm_0.999-7           
 [64] fields_14.1             gridExtra_2.3           ggplot2_3.4.2          
 [67] fBasics_4022.94         rpart_4.1.19            bipartite_2.18         
 [70] foreach_1.5.2           plotrix_3.8-2           phytools_1.5-1         
 [73] stable_1.1.6            rlang_1.1.1             pkgconfig_2.0.3        
 [76] prabclus_2.3-2          matrixStats_1.0.0       mvMORPH_1.1.7          
 [79] pracma_2.4.2            deSolve_1.35            magrittr_2.0.3         
 [82] R6_2.5.1                generics_0.1.3          combinat_0.0-8         
 [85] sna_2.7-1               pillar_1.9.0            mgcv_1.8-42            
 [88] scatterplot3d_0.3-44    sp_1.6-1                nnet_7.3-18            
 [91] tibble_3.2.1            pspline_1.0-19          utf8_1.2.3             
 [94] viridis_0.6.3           grid_4.2.3              digest_0.6.31          
 [97] diptest_0.76-0          pbmcapply_1.5.1         numDeriv_2016.8-1.1    
[100] R.utils_2.12.2          stats4_4.2.3            munsell_0.5.0          
[103] viridisLite_0.4.2       quadprog_1.5-8

If you have any suggestions for what I might try to either fix, or troubleshoot, the issue, I would very much appreciate it. Many thanks for your consideration!

error when running fit_t_comp on seemingly random trees

I've been attempting to fit MC models on a set of simulated phylogenies, all simulated using the same parameters. On seemingly random trees, the function fails and brings up the following error:

Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent

By following the traceback, it seems the error occurs somewhere during the optimisation process when running fitTipData internally, but I can't figure out what is causing the problem, and why other trees simulated under the same parameters don't break the function.

I'm attaching a zip file with two RDS files: the simulated tree (tree.RDS) and the named vector of trait values (traits.RDS) I used to try to run the function.

data.zip

Thank you,
Alex

error when loading library

Hi,

I just installed RPANDA on our linux computer, but when I load in the library I receive the following error:

Error: package or namespace load failed for 'RPANDA':
objects 'neworder_phylo', 'neworder_pruningwise' are not exported by 'namespace:ape'

Then when I try to run plot(my_tree) I receive the following error:

Error in reorder.phylo(x, order = "postorder") : 
object 'neworder_phylo' not found

I made sure I installed all the different packages necessary, but still get this error. Do you think this might have to do with the version of 'ape'?

Cheers
Pepijn

Error in env_tabulated[index] : only 0's may be mixed with negative subscripts

Hello,

I am trying to fit an environmental bd model, but am getting the error "error in env_tabulated[index] : only 0's may be mixed with negative subscripts". The error seems to be related to my environmental dataset, because I fit time-dependent models on my tree with no issue. I have had this error message pop up with several different environmental datasets, but I am not sure of the cause. My env data file is attached here (I was unable to attach my tree file). Is there anything wrong with it that you can see? Thank you!

sediment_initiation.txt

Error in optim(init, optimLH, method = meth) : function cannot be evaluated at initial parameters

Hi,

I have the code from the RPANDA example, when I ran "fit_env" function with example intial values, it failed with error mentioned above. What's the resolution for this issue?

I also see sombody asked here : https://www.phylobabble.org/t/troubleshooting-rpanda-code/595

I paste my code and data here: https://github.com/Cactusolo/Rpanda_fail_case.
Thanks!

Error in while (sum(abs(OV)) > epsnormv & end < 1000) { : missing value where TRUE/FALSE needed

Hi,
I'm running

library(Rcpp)
library(ape)
library(terra)
library(raster)
library(RPANDA)

phy <- read.tree('.../treeDim.tre')

# fit ClaDS with a proportion of 153/350 as sampling fraction (153 spp. in the tree vs ~350 total species)
setwd('...')
sample_fraction <- 153/350

# Option 2: run by 25k iterations each time
# first run
sampler <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler25k',model_id="ClaDS2",nCPU = 3)

# start second run using the result from first run
sampler2 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler50k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler25k)

# start third run
sampler3 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler75k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler50k)

# next run
sampler4 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler100k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler75k)

# next run
sampler5 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler125k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler100k)

# next run
sampler6 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler150k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler125k)

# next run
sampler7 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler175k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler150k)

# next run
sampler8 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler200k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler175k)

# next run
sampler9 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler225k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler200k)

# next run
sampler10 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler250k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler225k)

# next run
sampler11 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler275k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler250k)

# next run
sampler12 <- fit_ClaDS(phy,sample_fraction,iterations=25000,thin=250,file_name='sampler300k',model_id="ClaDS2",nCPU = 3,mcmcSampler = sampler275k)

And got a series of there errors and warnings along the way:

Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In daspk(y, times, func, parms, ...) :
  repeated convergence test failures on a step - inaccurate Jacobian or preconditioner?
2: In daspk(y, times, func, parms, ...) :
  Returning early. Results are accurate, as far as they go
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed
Error in while (sum(abs(OV)) > epsnormv & end < 1000) { :
  missing value where TRUE/FALSE needed

Is this warning something I should worry about?
Thank you,
-madza

time * environmental interaction model?

Hello, this is not an issue but a question of fit_env application. I'm wondering if it is possible to include a model with lambda and/or mu varying in time in addition to an environmental effect using fit_env. That would be a case where lambda depends on both time and environment. I don't think this has been done, or at least couldn't find any example in the literature or manual (but maybe I'm missing it?).
I got this model and just want to check if RPANDA is doing what "I want" or if I'm missing/misunderstanding something:

f.lamb <-function(t,x,y){
y[1] * exp(y[2] * t * x) ### exponential example of time * environmental interaction?
}
f.mu<-function(t,x,y){
y[1] + y[2] * t * x ### linear example of time environmental interaction?
}
lamb_par<- c(0.1, 0.01)
mu_par <- c(0.1, 0.01)
result <- fit_env(phylo = tree, env_data = env_data, tot_time = tot_time,
f.lamb=f.lamb, f.mu = f.mu, lamb_par = lamb_par, mu_par = mu_par,
expo.lamb = T,
dt=1e-3)

Thanks a lot in advance!!

InfTemp data Age duplicates?

Hello PANDA dev team,

First of all, thank you for continuing to develop this great package!

I am currently fitting multiple fit_env models with different paleoclimate (temperature) reconstructions. The reconstructed climate data sets I'm working with only have 271 rows (i.e. 271 time steps) compared to the much higher resolution data(InfTemp) available as part of the RPANDA package (which has 17632 rows).

Upon closer inspection, I notice that the some of the InfTemp$Age are duplicates (same time) but the temperature values are different.

> head(InfTemp)
    Age Temperature
1 0.000    3.902176
2 0.000    2.900296
3 0.002    4.309984
4 0.002    5.172534
5 0.004    3.733446
6 0.004    4.309984

To make sure this is not a matter of rounding, I verified this with:

> InfTemp$Age[1] == InfTemp$Age[2]
[1] TRUE
> InfTemp$Age[3] == InfTemp$Age[4]
[1] TRUE

plot(InfTemp$Age[1:4], InfTemp$Temperature[1:4])

The only reference I saw about this temperature estimation is Condamine, F.L., Rolland, J., Morlon, H. (2013) Macroevolutionary perspectives to environmental change Eco Lett 16: 72-85 but I wasn't able to figure out why the Age column has duplicates. Could you explain why there are two Temperature values assigned to duplicated Age values?

Many thanks,
Putter

master broken, nobody can install RPANDA from GitHub

Due to #23 the code on the master branch has been changed.

Sadly, this broke master:

It means that nobody can install RPANDA:

> devtools::install_github("hmorlon/PANDA")
Downloading GitHub repo hmorlon/PANDA@master
✓  checking for file ‘/tmp/RtmpFEd2N0/remotes254a6b7dd5d3/hmorlon-PANDA-fc535ef/DESCRIPTION’ (492ms)
─  preparing ‘RPANDA’:
✓  checking DESCRIPTION meta-information ...
─  cleaning src
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  looking to see if a ‘data/datalist’ file should be added
─  building ‘RPANDA_1.7.tar.gz’ (7s)
   
Installing package into ‘/home/richel/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
* installing *source* package ‘RPANDA’ ...
** using staged installation
** libs
gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG     -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-k1TtL4/r-base-3.6.1=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c RPANDA_init.c -o RPANDA_init.o
gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG     -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-k1TtL4/r-base-3.6.1=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c diversif_lognormal.c -o diversif_lognormal.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o RPANDA.so RPANDA_init.o diversif_lognormal.o -L/usr/lib/R/lib -lR
installing to /home/richel/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-RPANDA/00new/RPANDA/libs
** R
** data
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘RPANDA’ in namespaceExport(ns, exports):
 undefined exports: JSDtree
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/home/richel/R/x86_64-pc-linux-gnu-library/3.6/RPANDA’
* restoring previous ‘/home/richel/R/x86_64-pc-linux-gnu-library/3.6/RPANDA’
Error: Failed to install 'RPANDA' from GitHub:
  (converted from warning) installation of package ‘/tmp/RtmpFEd2N0/file254a4e18e85b/RPANDA_1.7.tar.gz’ had non-zero exit status
>

Good luck fixing this 👍

Problematic lambda values "RPANDA"

Dear all,

I am trying to fit an environmental-dependent model using paleoelevation data of the tropical Andes. I am also testing different sampling strategies since we know that there are some cryptic species within my group.
The clade is composed of 90 species, so I am trying with f=90/90 (100%), 81/90 (90%), and 72/90 (80%).
I run 10 models per sampling strategy (total=30 analyses) in RPANDA in which rates were modified by both an exponential and linear function of paleo-environment: (i) speciation rate varies with the environmental variable and no extinction; (ii) speciation rate varies with the environmental variable and extinction rate is constant; (iii) speciation rate is constant and extinction rate varies with the environmental variable; (iv) speciation and extinction rates both vary with the environmental variable. These models were compared to simpler diversification models, (v) namely a pure speciation model (Yule) and a (vi) constant rate birth-death models.

Everything was ok until I saw the results for the linear function of the model where speciation and extinction rates both vary with the environmental variable (which is also the best model according to delta AIC), specifically for the f=81/90 sampling strategy (90%). All the lambda values in the other 29 runs vary between 0.14343217 and 1.432733, but for some reason that I can't understand, the "problematic" model is giving me a lambda value of 10.830502.
I would like to know what am I doing wrong.
Thank you in advance,
Majo

######linLambdadeponaltandlinmudeponalt######
f.lamb<-function(t,x,y){y[1] + (y[2] * x)}
f.mu<-function(t,x,y){y[1] + (y[2] * x)}
lamb_par<-c(0.05,0.01)
mu_par<-c(0.005, 0.009)
tot_time<-max(node.age(prist)$ages)
result_exp20<- fit_env(prist, elev2, tot_time, f.lamb, f.mu, lamb_par, mu_par, df= dof, f = 81/90,
meth = "Nelder-Mead", cst.lamb = FALSE, cst.mu = FALSE,
expo.lamb = FALSE, expo.mu = FALSE, fix.mu = FALSE,
dt=1e-3, cond = "crown")

I also have tried with 79/90, 80/90, 82/90, and 83/90. These are all of the outputs that I got:

####72/90 model (80%): "environmental birth death"
LH: -228.9846
aicc: 466.4397
lamb_par: 1.432733 -0.321531
mu_par: 0.6416556 -0.1663267

####79/90 model : "environmental birth death"
LH: -223.9706
aicc: 456.4119
lamb_par: 12.837443 -3.275568
mu_par: 10.750152 -2.734816

#####80/90 model: "environmental birth death"
LH: -230.036
aicc: 468.5427
lamb_par: 1.1869399 -0.2533093
mu_par: 2.5183912 -0.6622045

####81/90 (90%) model: "environmental birth death"
LH: -222.3847
aicc: 453.24
lamb_par: 10.830502 -2.768822
mu_par: -8.996084 2.307149

####82/90 model: "environmental birth death"
LH: -223.7253
aicc: 455.9212
lamb_par: 4.0083097 -0.9913247
mu_par: -4.765867 1.224885

####83/90 model: "environmental birth death"
LH: -232.0119
aicc: 472.4944
lamb_par: 0.42181768 -0.05442131
mu_par: -1.1430388 0.3022878

####90/90 model (100%): "environmental birth death"
LH: -228.9846
aicc: 466.4397
lamb_par: 1.432733 -0.321531
mu_par: 0.6416556 -0.1663267

Error in svd(m) : infinite or missing values in ‘x’

Error in svd(m) : infinite or missing values in ‘x’

I keep getting this error when trying to fitTipData. It mostly happens after I simulate data under my empirically-fitted models, but it also seems to happen sometimes with my empirical data. Has anyone ever experienced this before?

Documentation suggestion

So just a suggestion, but given that the docs linked from cran are mostly just a list of functions (albeit with good descriptions of each function), and thus not super helpful as far as figuring out where to get started (i.e. which functions to run), it might be nice to link this paper from the github readme (and maybe also from the cran docs), since it seems to be a more useful starting place for noobs like me. If there's a better starting place than that paper that I haven't found, that would be great, as well.

For context, I'm coming from this paper, and am super excited to try out the MGL stuff, thanks for writing it!

how to interpret negative mu values from RPANDA model

Hi Helene,

I run a few models with RPANDA, and I got a bunch negative values for mu (extinction rate). would you please shed some lights on how should I interpret negative mu values? is negative value correct?

here are the basic code I used:
lambda.cst <- function(t,y){y[1]} #t=time, y=lambda (λ) mu.l <- function(t,y){y[1] + y[2]*t} par <- c(0.05, 0.005, 0.001) bcst.dvar.l <- fit_bd(tree, tot_time, f.lamb=lambda.cst, f.mu=mu.l, lamb_par=par[1], mu_par=par[c(2,3)], cst.lamb=TRUE, cond="crown", f=fraction, dt=1e-3)
and got results like this:
$bcst.dvar.l`
model :
[1] "birth death"

LH :
[1] -58309.65

aicc :
[1] 116625.3

lamb_par :
[1] 1.406383

mu_par :
[1] -1.3424997816 -0.0003351743
`

I have read your RPANDA tutorial on Morlon et al. (2014) in MEE journal, but the mu_par[1] and mu[2] are not explained, except res$lamb_par[1] and res$lamb_par[2]. So would you please explain what does it mean if I got a negative value for mu_par[1]? for mu_par[2], does it mean increased the speciation rate?

Thanks!

Miao

Issues with installation

Hello:

I am trying to use Clads(), but I am having some difficulties installing RPANDA (any version) on a HPCC and my own computer.

The issue appears to be with "Rmpfr".
Warning message in install.packages("Rmpfr"):"installation of package 'Rmpfr' had non-zero exit status"
configure: error: Header file mpfr.h not found; maybe use --with-mpfr-include=INCLUDE_PATH

Does anyone know a way around this?

Many thanks!
-Dylan

Comparison with likelihood values and Aicc in Table S1 from PNAS 2011

Although in principle the Cetacean phylogeny should have been fitted with cond="crown" (stem age is not known), the likelihood values in the PNAS paper correspond to the condition "stem". In addition, Aicc values were computed using different values for the number of observations (all inter-node branches in PNAS 2011, branches from 0 to nodes in PANDA).

JSDtree fails on some trees

When I run JSDtree, it usually does exactly what it should: produce a matrix with zeroes on the diagonal and Jensen-Shannon distances in the other cells. Well done 👍

However, for some trees, the function fails:

library(RPANDA)

tree_1 <- ape::read.tree(text = "(t4:0.2969950169,((t1:0.06457926296,t2:0.06457926296):0.0824593208,(t5:0.1346457881,t3:0.1346457881):0.01239279566):0.1499564331);")
tree_2 <- ape::read.tree(text = "((t2:0.07486817339,t5:0.07486817339):0.8541436693,(t1:0.08605484639,(t4:0.06379297152,t3:0.06379297152):0.02226187487):0.8429569963);")
trees <- c(tree_1, tree_2)
jsd <- RPANDA::JSDtree(phylo = trees)[1, 2]

# Should give a number
testthat::expect_false(is.nan(jsd))

Running this, gives a warning:

Warning messages:
1: In log(x/y) : NaNs produced
2: In log(x/y) : NaNs produced

Just for convenience, these are the trees:

I would expect either a clear error message or a (valid) result. I hope you will add this reprex to your test suite and fix it.

Just for completness, I add the complete brute-force script I used to find these trees:

library(RPANDA)

# reprex
tree_1 <- ape::read.tree(text = "(t4:0.2969950169,((t1:0.06457926296,t2:0.06457926296):0.0824593208,(t5:0.1346457881,t3:0.1346457881):0.01239279566):0.1499564331);")
tree_2 <- ape::read.tree(text = "((t2:0.07486817339,t5:0.07486817339):0.8541436693,(t1:0.08605484639,(t4:0.06379297152,t3:0.06379297152):0.02226187487):0.8429569963);")
trees <- c(tree_1, tree_2)
jsd <- RPANDA::JSDtree(phylo = trees)[1, 2]
testthat::expect_false(is.nan(jsd))

print(ape::write.tree(trees[1]))
print(ape::write.tree(trees[2]))

# Plot
png("~/plot.png")
par(mfrow = c(1,2))
ape::plot.phylo(trees[[1]])
ape::add.scale.bar()
ape::plot.phylo(trees[[2]])
ape::add.scale.bar()
dev.off()

# Brute force search
for (i in seq(1, 100000)) {
  set.seed(i)
  trees <- c(
    ape::rcoal(n = 5),
    ape::rcoal(n = 5)
  )
  jsd <- RPANDA::JSDtree(phylo = trees)[1, 2]
  if (is.nan(jsd)) {
    print(i)
    print(ape::write.tree(trees[1]))
    print(ape::write.tree(trees[2]))
    stop("ERROR")
  }
}

Good luck fixing and keep up the good work 👍

plot_fit_env(res,InfTemp,tot_time)

Error in match.fun(FUN) : argument "FUN" is missing, with no default

BICompare giving variable results

While rerunning BICompare a couple of times, I noticed I was receiving different results. In order to be able to get the best result based on the BSS/TSS ratio, with help of @thijsjanzen, I created a little script that runs BICompare x times, writes the results of each run in a list, searches the best BSS/TSS ratio among the results, and writes the best result in a separate output.

best_result <- BICompare(phylo,t)
num_repl <- 1000
total_result <- vector("list",length = num_repl)
for(r in 1:num_repl) {
  new_result <- BICompare(phylo,t)
  total_result[[r]] <- new_result 
  if (new_result$`BSS/TSS` > best_result$`BSS/TSS`) {
    best_result <- new_result
  }
  if (r %% 10 == 0) {
    cat(r, best_result$`BSS/TSS`, "\n")
  }
}

Hope this is useful for others as well!

Function fit_t_pl: same codes, very different results on 32 or 64-bit R

Hi, thank you vey much for this great package.

I have an issue regarding the function fit_t_pl which appears to give very different results if used on a 32-bit or 64-bit R platform, given the same codes and the same data.
I will give you an example from my data (I am using 3D landmark data):

on 32-bit:
###read tree
tree <- read.nexus("tree.nex")
###read data
xx<-as.matrix(read.csv("dataset.csv",row.names=1))

xxBM <-fit_t_pl(xx, tree, model="BM", method="RidgeAlt", SE=TRUE)

xxBM

-- Summary results for the BM model --

Penalization: RidgeAlt

LOOCV (negative):        -14929.86 

Model parameter: 
______________________ 
0 

Regularization parameter (gamma): 
______________________ 
6.271081e-37 

Evolutionary Covariance of size: 30 by 30 
for 40 species 
______________________

on 64-bit:

###read tree
tree <- read.nexus("tree.nex")

###read data
xx<-as.matrix(read.csv("dataset.csv",row.names=1))

xxBM <-fit_t_pl(xx, tree, model="BM", method="RidgeAlt", SE=TRUE)

xxBM

-- Summary results for the BM model --

Penalization: RidgeAlt

LOOCV (negative):        -13593.94 

Model parameter: 
______________________ 
0 

Regularization parameter (gamma): 
______________________ 
1.087406e-33 

Evolutionary Covariance of size: 30 by 30 
for 40 species 
______________________

As you can see, the parameters of the BM model are all quite different. Also, when comparing the GIC of different models with the package mvMorph, I obtain this:

on the 32-bit

GIC(xxBM); GIC(xxEB); GIC(xxOU);

-- Generalized Information Criterion --

GIC: -30971.89 | Log-likelihood 15428.84 


-- Generalized Information Criterion --

GIC: -Inf | Log-likelihood Inf 


-- Generalized Information Criterion --

GIC: -28133.11 | Log-likelihood 14172.98

on the 64-bit

GIC(xxBM); GIC(xxEB); GIC(xxOU);

-- Generalized Information Criterion --

GIC: -27390.41 | Log-likelihood 13783.3 


-- Generalized Information Criterion --

GIC: -28605.16 | Log-likelihood 14399.63 


-- Generalized Information Criterion --

GIC: -31424.25 | Log-likelihood 15676.97

The results are completely different. I have tested also with other datasets and the GIC are always different and sometimes, as in the case above, this affects also the choosing of the best fitting model.
I would really appreciate your help on this issue. Do you have any idea of which one I should consider as the correct one?

My R version is the 3.6.2 on Windows 10, while the Rpanda version is the 1.6.
Thank you very much for any help!

Regards,
Silvia

Can you do fit_env models not spanning the entire timespan of the tree?

My environmental data goes to only ~120Mya, but my tree is older than this. Can I fit these models to a younger age, or do I need to select younger clades?

issue with running sim_MCBD

Hi,

There appears to be an issue with running sim_MCBD where simulations do not finish because of a 'subscript out of bounds' error. The error occurs on lines 67-70 of the function, where there is a test to examine whether the incipient lineage's sister species has a current trait value of NA. As far as I can tell, the error occurs when the focal incipient lineage's sister is either an internal branch or an extinct lineage, both cases in which it's not an active lineage and therefore the trait vector is too short, which causes the 'subscript out of bounds' error.

I haven't been able to figure out exactly why this occurs, but as far as I can tell it's something to do with re-assigning sister lineages during an extinction event.

Cheers,
Alex

How to run environmental-dependent models with a sample of trees

Hello,

I want to fit the environmental-dependent models in RPANDA, but with a sample of trees, but I don't know-how. All the examples I found are with a single tree. There is some script to do this?. A lot of thanks!

Cheers,

Kateryn

HOME: Error in modelTest

Dear PANDA team,

Thank you very much for providing such amazing package.

I came across an issue in testing the HOME module incorporated in the package.
When I ran Simulation and Empirical application using the same codes as introduced in HOME Tutorial and showed in Examples of this function, the function sim_microbiota ran normally, but the function HOME_model always ended with an Error:

Error in modelTest(sequences_model, model = c("K80", "F81", "HKY"), G = F, :
Labels in tree and data differ!

I checked the lables in tree file and alignments, they are the same. How could I to address this issue ?

Regards,
Biao