mwpennell / geiger-v2 Goto Github PK

View Code? Open in Web Editor NEW

22.0 22.0 17.0 1.63 MB

A suite of methods and models for studying evolutionary radiations

Makefile 0.05% R 84.57% C 6.87% C++ 8.51%

geiger-v2's People

Contributors

Stargazers

Watchers

Forkers

bioinformaticsarchive maurotcs bomeara trgao10 ianmadlenya dwbapst hpepper21 lukejharmon lunasare gejielin yva-eline enatasha clinte14 jtvanleuven whitneythiel

geiger-v2's Issues

problem with analysis output with aov.phylo (geiger)

I'm using the aov.phylo function (R package geiger) to do phylogenetic manova. I used the same phylogeny to run the manova using two different data sets. I'm wondering about why Pr(phy) is the same in both analyses... here is the output of these two analyses

Thanks!

Marina

Structure 1

Multivariate Analysis of Variance Table

Response: dat
Df Wilks approx-F num-Df den-Df Pr(>F) Pr(phy)
group 1 0.29354 13.237 2 11 0.0011807 0.01961 *

Residuals 12

Signif. codes: 0 ‘**’ 0.001 ‘__’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1

Structure 2

Multivariate Analysis of Variance Table

Response: dat
Df Wilks approx-F num-Df den-Df Pr(>F) Pr(phy)
group 1 0.27139 8.949 3 10 0.0035049 0.01961 *

Residuals 12

Signif. codes: 0 ‘**’ 0.001 ‘__’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1

fitDiscrete optimization code is confusing

when it spits out all that stuff people feel worried and sad.

MEDUSA results over a distribution of trees?

Hello!

I've just run medusa on 100 trees, but I can't seem to find anything akin to the options for plotting results over a distribution of trees à la the TurboMedusa or Medusa packages (which I can't install as they're not available for my version (3.6.0) of R. I'd like something like the example output here: https://github.com/josephwb/turboMEDUSA. Am I missing something really obvious?

My temporary workaround is summarising the frequency of a shift happening at a given node using simple bar charts, but I liked the elegance of the old plotting functionalities of the turboMedusa/Medusa packages!

Thanks and sorry for the potential derpiness of this question...
Laura

Error in .check.tree(tree,...) for fitDiscrete

I'm trying to use geiger to carry out an analysis of phylogenetic signal for discrete traits. My input files are here: data and tree).

When I run the below code

library(readr)

l2_geog <- read_tsv('l2_subclade_country', col_names = FALSE)
l2_geog_fac <- as.factor(l2_geog$X2)
names(l2_geog_fac)<- l2_geog$X1

l2_tree <- read.tree('2018.10.23.all_l2_in_asia.fasta.sub_clade.tree')

fitDiscrete(l2_tree, l2_geog_fac)

I get the below error

Error in .check.tree(tree, ...) : 
  'tree must be bifurcating (no polytomies or unbranched nodes)'
In addition: Warning messages:
1: In treedata(phy, dat) :
  The following tips were not found in 'data' and were dropped from 'phy':
...

I've tried to get to the bottom of it. Since the example from the docs was working fine e.g.

library(geiger)
data(geospiza)
attach(geospiza)

## defaults
gb <- geospiza.data[,1]>4.2
#gb['fusca'] <- 'THIRD'
gb<-as.factor(gb)
names(gb)<-rownames(geospiza.data)
fitDiscrete(geospiza.tree, gb)

I was confused when

( bifurcating && (!is.binary.tree(geospiza.tree) || any(tabulate(geospiza.tree$edge[, 1]) == 1)) )

and

( bifurcating && (!is.binary.tree(l2_tree) || any(tabulate(l2_tree$edge[, 1]) == 1)) )

Both returned FALSE, as it is this statement which triggers the error message I received.

Any help would be greatly appreciated.

I'm running R v3.5.2, session info below.

> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] readr_1.3.1    geiger_2.0.6.1 ape_5.3       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       subplex_1.5-4    lattice_0.20-38 
 [4] mvtnorm_1.0-8    crayon_1.3.4     digest_0.6.18   
 [7] MASS_7.3-51.1    grid_3.5.2       R6_2.4.0        
[10] nlme_3.1-137     coda_0.19-2      pillar_1.3.1    
[13] rlang_0.3.1      rstudioapi_0.9.0 deSolve_1.21    
[16] tools_3.5.2      hms_0.4.2        yaml_2.2.0      
[19] parallel_3.5.2   compiler_3.5.2   pkgconfig_2.0.2 
[22] tibble_2.0.1  ```

MEDUSA parameter estimation (?) error

I'm attempting to run a MEDUSA analysis in geiger - the analysis initially runs along fine, and estimates the number of rate shifts. But then when it comes to the end (following no more significant increases in the aicc score), it produces this error:

Error in uniroot(thresholdR, lower = low.bound, upper = par1) :
f.lower = f(lower) is NA

After looking at the package code, it seems that there's something strange going on with arriving at a confidence interval for parameter "r", but I have no idea what. Also, the issue goes away when I reduce the species number in the richness file, although there are still several NA's in the parameter estimation output. Any idea what might be causing this issue?

TCS

Hi,
Can you please install me TCS (Transitive Consistency Score for MSA evaluation) on Jekyl?
here is the link to the installation:
http://www.tcoffee.org/Projects/tcs/index.html

Thanks,
Nomi

output of aov.phylo is confusing to users

The current output of aov.phylo returns two P-values, one for the "standard" analysis and one for the "phylogenetic" analysis. The column labels for these are Pr(>F) and Pr(phy). I agree with the user who emailed me that this is confusing - both are Pr(>F), after all, but just with different assumptions, and Pr(phy) reads like Probability of the tree, which is weird. I suggest Pr(Standard) and Pr(Phylogenetic) or something.

geiger considered orphaned by CRAN

Hi,
It looks like CRAN is currently considering geiger as an orphaned package,

"X-CRAN-Comment Orphaned and corrected on 2019-01-16 as no response to
request for corrections"

I was trying to submit a package to CRAN and received a Warning because of that. I hope it is not true.

Best,

fitDiscrete fails when traits have character state 0

I appreciate that the error message here is now sensible, but this is really not ok I don't think. Coding two-state characters as 0,1 goes back to the beginning of comparative methods, and I don't think it's ok that we can't handle data coded like that.

fitContinuous - input argument names and documentation names are very confusing

When people discuss evolutionary trends in the literature, they usually mean changes in the mean trait value through time. But in fitContinuous this is called "drift" - which conjures genetic drift and, probably, is confusing to everyone. Then what we call the trend model is a trend in the rate, and is very similar to EB - but there is really no way to know that from the documentation.

This is a bit of a sad state. I suggest changing this but in a way that maintains backwards compatibility with people's code - that is, add two new arguments for these two models that are more intuitive, like meanTrend and rateTrend, and (silently) accept the old arguments as well.

install.packages("geiger") error

make: *** [/usr/lib64/R/etc/Makeconf:172: foreign-graphml.o] Error 1
ERROR: compilation failed for package ‘igraph’

removing ‘/usr/lib64/R/library/igraph’
ERROR: dependency ‘igraph’ is not available for package ‘phangorn’
removing ‘/usr/lib64/R/library/phangorn’
ERROR: dependency ‘phangorn’ is not available for package ‘phytools’
removing ‘/usr/lib64/R/library/phytools’
ERROR: dependency ‘phytools’ is not available for package ‘geiger’
removing ‘/usr/lib64/R/library/geiger’

The downloaded source packages are in
‘/tmp/RtmpqU6Eiy/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Warning messages:
1: In install.packages("geiger") :
installation of package ‘igraph’ had non-zero exit status
2: In install.packages("geiger") :
installation of package ‘phangorn’ had non-zero exit status
3: In install.packages("geiger") :
installation of package ‘phytools’ had non-zero exit status
4: In install.packages("geiger") :
installation of package ‘geiger’ had non-zero exit status

How can I solve this problem ？
Thanks

Build is broken

The build has been broken since 79e79d9:
https://travis-ci.org/mwpennell/geiger-v2

Looks like the examples have fallen behind the code. Y'all should be getting emails when it fails.

fitContinuousMCMC Early Burst model fails

Error message is

Error in .slater.getvcv(C, model, params) : object 'V' not found

Steps to reproduce:

library(geiger)
# all from example section of help('fitContinuousMCMC')
data(caniformia)
phy <- caniformia$phy
d <- caniformia$dat
node.priors <- caniformia$node.priors
root.prior <- caniformia$root.prior
fitContinuousMCMC(phy, d, model = "EB", Ngens = 1000, sampleFreq=100,
               printFreq = 100, node.priors = node.priors, root.prior = root.prior,
               outputName ="Trend_caniforms")

Version = geiger_2.0.6

early burst is given incorrect bounds in fitContinuous for non-ultrametric trees due to use of branching.times

The parameter 'a' for the EB model in fitContinuous is given bounds thusly, according to the documentation:

"The maximum bound is set to -0.000001, representing a decelerating rate of evolution. The minimum bound is set to log(10^-5)/depth of the tree."

But this isn't true for non-ultrametric trees, where I (and apparently Slater, based on conversations both privately and on R-Sig-Phylo) would define tree depth as the distance from the youngest tip to the root. This is because fitContinuous uses branching.times, which explicitly is inapplicable to non-ultrametric trees, according to Emmanuel's documentation:

"This function computes the branching times of a phylogenetic tree, that is the distance from each node to the tips, under the assumption that the tree is ultrametric. Note that the function does not check that the tree is effectively ultrametric, so if it is not, the returned result may not be meaningful."

And here's fitContinuous use of it, to get the bound:
https://github.com/mwpennell/geiger-v2/blob/master/R/traits.R#L67

I suggest replacing uses of branching.times to obtain tree depth with node.depth.edgelength. Here's a short example to show you that use of branching.times without checking whether the tree is ultrametric can give meaningless results, unlike node.depth.edgelength:

library(geiger)

set.seed(1)
phy<-rtree(10)

plot(phy)
axisPhylo()

branching.times(phy)
max(branching.times(phy))

node.depth.edgelength(phy)
max(node.depth.edgelength(phy))

(log(10^(-5))/max(branching.times(phy)))
(log(10^(-5))/max(node.depth.edgelength(phy)))

As we can see, branching.times can even return negative values for non-ultrametric trees. The value from max(node.depth.edgelength(phy)) matches the visual expectation for tree depth from the plot. The two approaches also produce very different bounds for the early burst parameter.

Other uses of branching.times in geiger may need to be investigated so that silent misuse of branching.times on non-ultrametric trees can be avoided:

https://github.com/mwpennell/geiger-v2/search?utf8=%E2%9C%93&q=%22branching.times%22

Cheers,
-Dave

bounds on alpha in fitContinuous(model = "OU")

I seem to have found a mismatch between fitContinuous's behavior and its documentation (version 2.0.3, from CRAN).

The documentation for fitContinuous with model == "OU" says "Default bounds are alpha = c(min = 0, max = 150)".

But when I fit a model with the default options and look at its bounds, I get alpha bounded between 0 and exp(1) (about 2.7).

fit =  fitContinuous(phylogeny, x, model = "OU")
fit$bnd
                 mn           mx
alpha 7.124576e-218 2.718282e+00
sigsq 7.124576e-218 2.688117e+43

I noticed this because I keep bumping up against the upper bound.

Should the behavior be changed to match the documentation (or vice versa)? Sorry if I'm missing something obvious.

MEDUSA issue with 'old' trees

Hello,
I'm attempting to use MEDUSA on a large tree that is ~850MY old at the root (cumulative branch length from any tip to root >800). When I attempt to run Medusa, I get the following error:

Appropriate  aicc-threshold for a tree of 3973 tips is: 11.90177.

Error in optim(fn = foo, par = log(sp), method = "N") : 
  function cannot be evaluated at initial parameters

However, if I set all the edges in $edge.length to 1 and run Medusa, it begins taking steps as expected.

Any help or insight would be greatly appreciated!

Update changelog to reflect changes from geiger 1.3

The ChangeLog doesn't really do justice to all the improvements and additions between geiger 1.3.1 and 1.99.2. It would be nice to have a concise summary of these somewhere (maybe already exists and can just be linked from the ChangeLog?)

I'd also consider adding version numbers next to the dates in the ChangeLog.

Just suggestions, feel free to close if not helpful.

rescale.phylo returns NaN under "delta" model for some trees

Hello all,

This only crops up with a small number of simulated trees I've encountered (maybe 5% of the time) so I'll include a reproducible example.

library(geiger)

tree<-"(((((((t18:1.463679077,t12:0.1729684733):0.1556820862,(((t24:1.845880473,t4:0.3034377797):1.650824079,t20:0.4856926412):0.6032191589,t30:0.4777796259):0.2100940662):0.05625117488,t5:1.296163165):0.5768372864,t9:0.659113761):1.781101809,((t22:0.1188352299,(t25:0.736267727,t14:3.616042752):0.8135201037):0.105873215,(t21:0.198981436,((t8:0.2432500795,t28:0.0247130706):0.684336435,t27:0.4200404161):0.8257523197):1.184559852):0.4631210882):3.936306288,((t1:1.781101809,t6:0.5768372864):3.936306288,(t7:0.1556820862,t11:1.463679077):0.05625117488):3.765426552):3.765426552,(t3:0.2100940662,(((t2:0.3034377797,t13:0.4856926412):1.845880473,(t17:1.296163165,(t26:0.4631210882,t15:0.105873215):0.659113761):0.4777796259):1.650824079,(((t16:3.616042752,(t19:0.198981436,t10:0.8257523197):1.184559852):0.736267727,t23:0.684336435):0.8135201037,t29:0.2432500795):0.1188352299):0.6032191589):0.1729684733);"

tree<-read.tree(text=tree)
delta<-0.5 # delta value doesn't actually matter
treeDelta<-rescale(tree,"delta",0.5)

What's the problem you might ask? Look at the edge lengths.

> treeDelta$edge.length
 [1]        NaN 3.17042529 1.15547898 0.35048350 0.03363380 0.09260120
 [7] 0.83842487 0.10206409 0.12480027 0.35143321 0.91496868 0.95451899
[13] 0.16150088 0.27597020 0.27916735 0.75257833 0.39964675 0.31228861
[19] 0.07013792 0.07819226 0.52460540 0.45487244 2.08798767 0.76061844
[25] 0.12293341 0.50201096 0.40078955 0.13940497 0.01423367 0.24754161
[31] 3.05283603 2.43871389 0.96281821 0.31952592 0.03885464 0.10679030
[37] 0.96466208 1.57963064 0.77112191 1.76659662 2.57086049 1.93405803
[43] 0.27399210 0.43420656 0.55626904 1.31141818 0.69690710 0.45165461
[49] 0.10572436 0.24703684 1.37134463 0.97413647 3.41184374 1.29707438
[55] 0.19570647 0.78099644 0.91072418 0.45897703

Whoops! Where's that come from? I did a little tracking of the NaN value and it is produced early in the .delta.rescale code (https://github.com/mwpennell/geiger-v2/blob/master/R/utilities-phylo.R#L1595-L1601). Let's take a closer look...

> phy<-tree
> ht=geiger:::heights.phylo(phy)
> N=Ntip(phy)
> Tmax=ht$start[N+1]
> mm=match(1:nrow(ht), phy$edge[,2])
> ht$t=Tmax-ht$end
> ht$e=ht$start-ht$end
> ht$a=ht$t-ht$e
> ht$a
 [1]  1.027161e+01  1.027161e+01  1.258006e+01  1.258006e+01  1.092924e+01
 [6]  1.032602e+01  1.005967e+01  9.482835e+00  8.270727e+00  9.084247e+00
[11]  9.084247e+00  9.349414e+00  1.085950e+01  1.085950e+01  1.017517e+01
[16]  1.146716e+01  1.146716e+01  7.587104e+00  7.587104e+00  1.729685e-01
[21]  4.272892e+00  4.272892e+00  2.904791e+00  3.563905e+00  3.563905e+00
[26]  2.444811e+00  3.629371e+00  3.629371e+00  1.708543e+00  8.950229e-01
[31]  0.000000e+00 -1.776357e-15  3.765427e+00  7.701733e+00  9.482835e+00
[36]  1.005967e+01  1.011592e+01  1.011592e+01  1.032602e+01  1.092924e+01
[41]  7.701733e+00  8.164854e+00  8.270727e+00  8.164854e+00  9.349414e+00
[46]  1.017517e+01  3.765427e+00  7.530853e+00  7.530853e+00  0.000000e+00
[51]  1.729685e-01  7.761876e-01  2.427012e+00  2.427012e+00  2.904791e+00
[56]  7.761876e-01  8.950229e-01  1.708543e+00  2.444811e+00

See that negative -1.776357e-15 at ht$a[32]? There's the problem. Essentially its a floating point error: sometimes subtracting ht$t-ht$e produces a small negative value very close to 0 (that isn't zero), and when you take that to a power...

> bl=(ht$a+ht$e)^delta-ht$a^delta
> bl
 [1] 0.220745611 0.026872057 0.251311579 0.042520936 0.072659116 0.073500882
 [7] 0.198143411 0.105221433 0.020586935 0.119761588 0.549738122 0.032366658
[13] 0.036703391 0.003747528 0.065174265 0.253496648 0.084126732 0.028116400
[19] 0.253982114 0.203025677 0.072138310 0.114320550 0.345278174 0.118914381
[25] 0.027835754 0.898291021 0.051526793 0.205625505 0.239781014 0.120842270
[31] 0.000000000         NaN 0.834728900 0.304221551 0.092277433 0.008855312
[37] 0.024380607 0.032858177 0.092527477 0.240898533 0.082221249 0.018466340
[43] 0.138121309 0.200260259 0.132172506 0.105522315 0.803769283 0.642079460
[49] 0.010229886 0.415894786 0.465120328 0.676871823 0.509210589 0.146457903
[55] 0.183485951 0.065041365 0.361055974 0.256476589 0.341501650

You get NaNs.

One kludge to fix this would be to ask for the abs(), as apparently R doesn't mind taking the exponential of very tiny positive values so much.

> ht$a=abs(ht$t-ht$e)
> bl=(ht$a+ht$e)^delta-ht$a^delta
> bl
 [1] 0.220745611 0.026872057 0.251311579 0.042520936 0.072659116 0.073500882
 [7] 0.198143411 0.105221433 0.020586935 0.119761588 0.549738122 0.032366658
[13] 0.036703391 0.003747528 0.065174265 0.253496648 0.084126732 0.028116400
[19] 0.253982114 0.203025677 0.072138310 0.114320550 0.345278174 0.118914381
[25] 0.027835754 0.898291021 0.051526793 0.205625505 0.239781014 0.120842270
[31] 0.000000000 1.940470662 0.834728900 0.304221551 0.092277433 0.008855312
[37] 0.024380607 0.032858177 0.092527477 0.240898533 0.082221249 0.018466340
[43] 0.138121309 0.200260259 0.132172506 0.105522315 0.803769283 0.642079460
[49] 0.010229886 0.415894786 0.465120328 0.676871823 0.509210589 0.146457903
[55] 0.183485951 0.065041365 0.361055974 0.256476589 0.341501650

But that's a crazy kludge. There's probably better ways to deal with this floating value issue, but I'm not a good enough programmer to know them.

I haven't investigated very extensively, but its possible this issue affects other rescale.phylo models too, if they take the power of a different between ht$t-ht$e.

I hope this is helpful.

Medusa breaks when too many shifts are added

Example:

require(geiger)
dat <- get(data(chelonia))
phy <- dat$phy
phy <- reorder(phy, "cladewise")

Put a low threshold so lots of shifts get added

res <- medusa(phy, cut="node", threshold = 1, ncores = 1)

Loading required package: parallel
Using user-specified aicc-threshold of 1 (rather than the default value for a tree of 226 tips: 6.621099).

Step 1: lnLik=-944.0201; aicc=1892.067; model=bd
Step 2: lnLik=-912.3048; aicc=1830.663; shift at node 252; model=yule; cut=node; # shifts=1
Step 3: lnLik=-884.209; aicc=1780.607; shift at node 393; model=bd; cut=node; # shifts=2
Step 4: lnLik=-871.356; aicc=1759.038; shift at node 280; model=yule; cut=node; # shifts=3
Step 5: lnLik=-867.2836; aicc=1755.067; shift at node 343; model=yule; cut=node; # shifts=4
Step 6: lnLik=-863.2939; aicc=1751.3; shift at node 417; model=yule; cut=node; # shifts=5
Step 7: lnLik=-859.7131; aicc=1748.389; shift at node 374; model=yule; cut=node; # shifts=6
Error in descendants[[node]] : subscript out of bounds

2-fold improvement in efficiency for OU VCV

Hi,

Thank you for creating geiger. I've been using it to fit OU trait evolution models to somewhat large timetrees (>2000 tips) using fitContinuous (VCV method, since my timetrees are not ultrametric). I noticed that there is an easy way to substantially reduce the computing time of the likelihood using a simple modification. This "issue" serves to point out to you the optimization approach, in the hope that it will be implemented in future versions of geiger. I have successfully tested the proposed approach on my own local installation of geiger. On trees with ~2500 tips it reduces computing time by a factor of about 2.

Specifically, in file "traits.R", function "ll.ou.vcv", you are computing the inverse of the V matrix using solve() and then the log-abs-determinant of V using determinant(). These two operations contribute the vast bulk of the computation time. The present implementation is sub-optimal because the two operations (solve and determinant) do not benefit from each other's internal computations. It is actually possible to "simultaneously" compute both the inverse of V and its log-abs-det using Cholesky decomposition:

# Compute the Cholesky decomposition, using pivoting to account for degenerate cases
cholV = chol(V, pivot=TRUE)

# Compute the inverse of the pivoting permutation.
# This is needed for computing the correct invV from the Cholesky decomposition.
pivot = attr(cholV,"pivot")
invpivot = integer(length(pivot))
invpivot[pivot] = c(1:length(pivot))

# Compute the inverse of V based on its Cholesky decomposition
invV = chol2inv(cholV)[invpivot,invpivot]

# Compute the logarithm of the modulus of the determinant of V using the Cholesky decomposition
logabsdet = 2*sum(log(abs(diag(cholV))))

The invV and logabsdet can then be used in the rest of your code as before.

fitDiscrete - fn does not return complete Q matrix

Function does not return the complete Q matrix as a 'matrix' object. The function print the matrix to the screen only.
The Q matrix in a 'matrix' format, rather than a vector of parameters ( see 'coef( fitDistrete(...) )' ), is important to apply in further analyses.

Add support for vector of root values in sim.char()

Currently the function 'sim.char' only accepts a single root value even when 'par' is a matrix and model is 'BM' or 'spetiational'.
I am adding this feature to 'sim.char'.

gbresolve

Hello,

when I use the gbresolve function there are genera that are not found
although they appear in the NCBI taxonomy browser.

for example:
when I run:

genus = "Christia"
gbresolve(genus, rank= "genus", within = "Fabaceae")

I get:

In gbresolve.default(genus, rank = "genus", within = "Fabaceae") :
The following taxa were not encountered in the NCBI taxonomy:
Christia

and so for genus = "Pycnospora" / "Solori" / "Thailentadopsis" and more examples.
the common to all these examples is that those genus are relatively new (but not very new).

thanks!
Nomi

dtt with discrete characters error

Hey there,

functions 'dtt' and 'disparity' seem to both throw the same error when presented with discrete characters using the 'num.states' index.
version: geiger 2.0.6
error:
Error in apply(data, 2, f) : dim(X) must have a positive length

steps to reproduce:
tmp=get(data(geospiza))
td=treedata(tmp$phy, tmp$dat)
geo=list(phy=td$phy, dat=td$data)
gb=round(geo$dat[,5]) ## create discrete data
names(gb)=rownames(geo$dat)
dtt(tmp$phy, gb, index="num.states")

appreciate any help you can lend here.
Cheers.

congruify.phylo help file has strange characters

Examples

sal=get(data(caudata))
res=congruify.phylo(sal$fam, sal$phy, sal$tax, tol=0, scale=NA, ncores=2)
print(res$calibrations)
plot(ladderize(sal$phy,right=FALSE), cex=0.35, type="fan", label.offset=2.5)
plot(ladderize(sal$fam,right=FALSE), cex=0.5, type="fan", label.offset=2.5, no.margin=FALSE)

cat("\n\n\n*** If 'PATHd8' is installed ***, execute the following:
\n\tres=congruify.phylo(sal$fam, sal$phy, sal$tax, tol=0, scale="PATHd8")
\tprint(res)
\n\tplot(res$phy, cex=0.35, type="fan", label.offset=2.5)
\n\tprint(max(branching.times(res$phy)))
\tprint(max(branching.times(sal$fam)))
\tprint(max(branching.times(sal$phy)))\n\n\n")

ratematrix returns singular matrix without warning

Using 'ratematrix' when number of traits >= number of tips return matrix which is singular without any warning to user.
Function should give a warning message when matrix output is singular.

name.check is not case sensitive!

Hi,
I just found out that name.check() will give an "OK" even if a species name is erroneously capitalised like in Callithrix_Jacchus instead of Callithrix_jacchus in the dataframe but not in the tree. This is a problem for various other packages which are instead case sensitive for the species names.

Medusa requires cladewise trees

works:

require(geiger)
dat <- get(data(chelonia))
phy <- dat$phy
phy <- reorder(phy, "cladewise")
resMedusa <- medusa(phy, cut="node", threshold = 3, ncores = 1)

Doesn't work:

phy <- reorder(phy, "postorder")
resMedusa <- medusa(phy, cut="node", threshold = 3, ncores = 1)

Using user-specified aicc-threshold of 3 (rather than the default value for a tree of 226 tips: 6.621099).

Step 1: lnLik=-944.0201; aicc=1892.067; model=bd
Step 2: lnLik=-912.3048; aicc=1830.663; shift at node 252; model=yule; cut=node; # shifts=1
Error in op[aff[1], ] : subscript out of bounds

fitContinuousMCMC: Error in solve.default(C): Lapack routine dgesv:

Hi,

I'm try to run fitContinuousMCMC function with several traits (around 600 datasets). I'm attaching 6 of them as example here:

In some cases I get the follow message:
Error in solve.default(C) : Lapack routine dgesv: system is exactly singular: U[967,967] = 0

I have the same error on several of my datasets.

I appreciate any help.

Regards,

Add `simulate.gfit()` and `update.gfit()` S3 methods

@mwpennell says he's doing this already, which is awesome. Just thought I'd put an issue down here to keep track of it.

fitDiscrete fails when traits are, um, actually discrete?

It looks like fitDiscrete takes numeric (doubles) but not integer-valued traits? From the example,

## match data and tree
tmp=get(data(geospiza))
td=treedata(tmp$phy, tmp$dat)
geo=list(phy=td$phy, dat=td$data)

And now let's use as.integer to make sure we're actually using discrete values instead of continuous values:

gb= as.integer(round(geo$dat[,5])) ## create discrete data
names(gb)=rownames(geo$dat)
tmp=fitDiscrete(geo$phy, gb, model="ER", control=list(niter=5), ncores=2)

I get this error:

.check.states(tree, states, strict.vals = 1:k) : 
  The states vector must contain names

Of course dropping the as.integer things work fine. Maybe I'm missing something here.

sim.bdtree does not check correct parameter use

Hi Matthew,

I noticed it is easy to misuse sim.bdtree, as this code may convince you of:

library(geiger)

# Should give error: 'if stopping criterium is set to 'time', please do not set the number of taxa 'n'
sim.bdtree(b = 0.4, d = 0.1, stop = "time", n = 10, t = 10)
# Should give error: 'if stopping criterium is set to 'taxa', please do not set the time of the simulation 't'
sim.bdtree(b = 0.4, d = 0.1, stop = "taxa", n = 10, t = 10)
# Should give error: 'if stopping criterium is set to 'time', please provide a value for 'n''
sim.bdtree(b = 0.4, d = 0.1, stop = "time", n = 10)
# Should give error: 'if stopping criterium is set to 'taxa', please provide a value for 'n''
sim.bdtree(b = 0.4, d = 0.1, stop = "taxa", t = 10)

It would have saved me some debugging if geiger did check its arguments.

Could you fix it? If you assign me, I will provide a pull request. Note that I will change the default parameters for n and t to NA or NULL, but I do not expect this to break code (yet I may be wrong).

mwpennell / geiger-v2 Goto Github PK

geiger-v2's People

Contributors

Stargazers

Watchers

Forkers

geiger-v2's Issues

Residuals 12

Residuals 12

Put a low threshold so lots of shifts get added

works:

Doesn't work:

Recommend Projects

Recommend Topics

Recommend Org