
tidytree's Introduction

tidytree: A Tidy Tool for Phylogenetic Tree Data Manipulation


A phylogenetic tree generally contains multiple components, including nodes, edges, branches, and associated data. ‘tidytree’ provides an approach to convert tree objects to tidy data frames, as well as tidy interfaces for manipulating tree data.

Visit https://yulab-smu.top/treedata-book/ for details.

✍️ Author

Guangchuang YU

School of Basic Medical Sciences, Southern Medical University

https://guangchuangyu.github.io


⏬ Installation

Get the released version from CRAN:

install.packages("tidytree")

Or the development version from GitHub:

remotes::install_github("GuangchuangYu/tidytree")

💖 Contributing

We welcome any contributions! By participating in this project you agree to abide by the terms outlined in the Contributor Code of Conduct.

tidytree's People

Contributors

arendsee, clearmind777, davisvaughan, guangchuangyu, heavywatal, romainfrancois, xiangpin


tidytree's Issues

Tidytree update has impacted ggtree functions

I have code for making a phylogenetic tree that relies on ggtree, ggtreeExtra, and tidytree. It was working perfectly for many months, up until several days ago, when geom_hilight and geom_cladelab from ggtree stopped working. From my GitHub commits, I identified that the only change in my environment was an update to tidytree 0.4.4; when I reverted back to tidytree 0.4.2, the code ran perfectly again.

At least one other person has reported having this issue (https://support.bioconductor.org/p/9153297/#9153321), and it's related to #40 (I received a similar error message when using geom_cladelab).

unnest error

Hi, sorry to disturb you. Recently, something went wrong when I was using ggtreeExtra to plot a tree, as you can see:

trda %>% unnest(RareAbundanceBySample)
Error in UseMethod("unnest") :
no applicable method for 'unnest' applied to an object of class "c('MPSE', 'SummarizedExperiment', 'RectangularData', 'Vector', 'Annotated', 'vector_OR_Vector')

My tidytree version is 0.4.5. Could you please help me solve this?

accesor-tidytree docs

While https://yulab-smu.top/treedata-book/chapter2.html#accesor-tidytree is very helpful, one is still left wondering after reading this section:

  • How does one list all node labels, node numbers, or branch lengths?
    • Must one resort to converting to a tibble just to get the values from one column of the table?
  • How does one list all internal/leaf nodes?
    • This is especially confusing/challenging since conversion of a tbl_tree object to a tibble drops the isTip column
    • This is important for functions such as treeio::drop.tip
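One possible way to answer the tip/internal-node question without relying on an isTip column is to derive it from the parent and node columns: a node is a tip when it never appears as anyone's parent. The sketch below uses a hand-written toy table in plain base R (the values and the names tbl, is_tip, tip_labels, internal_nodes and root are illustrative assumptions, not tidytree API), so it only shows the idea and not how the package itself does it.

```r
# Toy parent/node table for a 4-tip tree; a plain data.frame, no packages needed.
tbl <- data.frame(
  parent = c(5L, 7L, 7L, 6L, 5L, 5L, 6L),
  node   = 1:7,
  branch.length = c(0.435, 0.674, 0.00202, 0.0251, NA, 0.472, 0.274),
  label  = c("t4", "t1", "t3", "t2", NA, NA, NA)
)

# A tip is a node that is never listed as a parent.
is_tip <- !tbl$node %in% tbl$parent

tip_labels     <- tbl$label[is_tip]             # labels of leaf nodes
internal_nodes <- tbl$node[!is_tip]             # node numbers of internal nodes
root           <- tbl$node[tbl$parent == tbl$node]  # the root is its own parent here

# Single columns can be pulled out directly with `$`.
branch_lengths <- tbl$branch.length
```

The same expressions should apply to any table with parent and node columns, but whether a given tbl_tree keeps those column names is something to check on your own object.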

Found more than one class "phylo" in cache

Hello.
As described in previous issues #10 and #47, the problem remains:
dozens of lines full of:

Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
Also defined by 'tidytree'

It's impossible to ignore, as the results are nested within these lines, and setting message=FALSE is not a solution because it suppresses all other messages as well.

Do you have a workaround?

Thanks!
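A possible stopgap, assuming the text is emitted as an ordinary R message condition (which the message=FALSE behaviour above suggests), is to muffle only the messages that match a pattern and let everything else through. The helper name muffle_matching and the stand-in function noisy are hypothetical and written for this sketch; nothing here is part of tidytree or phyloseq.

```r
# Hypothetical helper: evaluate `expr`, silencing only messages that match `pattern`.
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    message = function(m) {
      if (grepl(pattern, conditionMessage(m))) {
        invokeRestart("muffleMessage")  # drop this message, keep evaluating
      }
    }
  )
}

# Stand-in for a call that emits the unwanted message plus a useful one.
noisy <- function() {
  message('Found more than one class "phylo" in cache; using the first')
  message("an unrelated message")
  42
}

result <- muffle_matching(noisy(), "Found more than one class")
```

In a real script one would wrap the offending call (for example the tree-reading call) instead of noisy(). This hides the symptom only; the duplicate class definition is still there.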

R version

tidytree now depends on R version 3.5, but I think the latest version of R is 3.4.3

Error in MRCA function

Hi,

I'm trying to view a subclade using the viewClade() function following your example. But MRCA throws an error. I installed the dev version of ggtree.

library(ggtree)
nwk <- system.file("extdata", "sample.nwk", package="treeio")
tree <- read.tree(nwk)
p <- ggtree(tree) + geom_tiplab()
viewClade(p, MRCA(p, tip=c("I", "L")))

Error in MRCA(.data$data, .node1, .node2 = .node2, ...) : 
  argument ".node1" is missing, with no default
In addition: Warning messages:
1: In get_clade_position_(treeview$data, node) :
  restarting interrupted promise evaluation
2: In get_clade_position_(treeview$data, node) :
  restarting interrupted promise evaluation
3: In get_clade_position_(treeview$data, node) :
  restarting interrupted promise evaluation

session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=nb_NO.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=nb_NO.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=nb_NO.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggtree_3.3.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8         pillar_1.7.0       compiler_4.1.2    
 [4] yulab.utils_0.0.4  tools_4.1.2        digest_0.6.29     
 [7] aplot_0.1.2        evaluate_0.14      jsonlite_1.7.3    
[10] tidytree_0.3.7     lifecycle_1.0.1    tibble_3.1.6      
[13] nlme_3.1-155       gtable_0.3.0       lattice_0.20-45   
[16] pkgconfig_2.0.3    rlang_1.0.1        cli_3.1.1         
[19] DBI_1.1.2          ggplotify_0.1.0    rstudioapi_0.13   
[22] patchwork_1.1.1    yaml_2.2.2         parallel_4.1.2    
[25] xfun_0.29          treeio_1.19.1      fastmap_1.1.0     
[28] knitr_1.37         dplyr_1.0.8        generics_0.1.2    
[31] vctrs_0.3.8        gridGraphics_0.5-1 grid_4.1.2        
[34] tidyselect_1.1.1   glue_1.6.1         R6_2.5.1          
[37] fansi_1.0.2        rmarkdown_2.11     pacman_0.5.1      
[40] farver_2.1.0       ggplot2_3.3.5      purrr_0.3.4       
[43] tidyr_1.2.0        magrittr_2.0.2     htmltools_0.5.2   
[46] scales_1.1.1       ellipsis_0.3.2     assertthat_0.2.1  
[49] ape_5.6-1          colorspace_2.0-2   labeling_0.4.2    
[52] utf8_1.2.2         lazyeval_0.2.2     munsell_0.5.0     
[55] ggfun_0.0.5        crayon_1.4.2 

Something not quite right with most recent "offspring" function

Hi, first, thanks for providing such a great package! I recently updated tidytree from 0.2.4 to 0.2.8 and noticed that some of my previous code that calls "ggtree::flip" shows inconsistent results. Further investigation showed that when calling offspring (specifically the "offspring.tbl_tree" function) with the argument self_include=T, the version in 0.2.4 includes the node itself, but the version in 0.2.8 returns a tbl_tree without any rows. Can you check if there's any change you recently made to the function that might cause this?

Thanks,
Qin

converting from subsetted tbl_tree to phylo drops labels

Hi,

I am trying to use tidytree to subset a very large tree object so that only the closest relatives of a given tip are shown. I am able to do the subsetting successfully with a combination of the ancestor and offspring functions. However, when I try to convert the tbl_tree back to a phylo object, it doesn't convert the labels correctly and instead uses the node numbers as the tip labels.

Here is a MRE:

library(tidytree)
#> Warning: package 'tidytree' was built under R version 3.4.4
#> 
#> Attaching package: 'tidytree'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(ggtree)
#> ggtree v1.13.0.002  For help: https://guangchuangyu.github.io/software/ggtree
#> 
#> If you use ggtree in published research, please cite:
#> Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution 2017, 8(1):28-36, doi:10.1111/2041-210X.12628
#> 
#> Attaching package: 'ggtree'
#> The following object is masked from 'package:tidytree':
#> 
#>     MRCA
library(treeio)

nwk <- system.file("extdata", "sample.nwk", package="treeio")
tree <- read.tree(nwk)

tree_tbl <- tree %>% 
  as_data_frame() 
  
tree_tbl %>% 
  ancestor("A") %>% 
  {offspring(tree_tbl, .$node[[4]])} %>% 
  as.phylo() %>% 
  ggtree() + geom_tiplab()
#> Warning: package 'bindrcpp' was built under R version 3.4.4

Created on 2018-05-10 by the reprex package (v0.2.0).

Is there any way around this, or a way to force the labels after the conversion?

ToDataFrameTable does not extract internal nodes

Hi!
I'm using the Node class for most tree manipulations. I propagate annotations up the tree from the leaves using the vignette's methods, e.g.

t <- Traverse(divtree, traversal = "post-order")
Do(t, function(x) x$n <- Aggregate(node = x, attribute = "n", aggFun = sum))

and

divtree$Do(function(node) node$samples <- unique(unlist(node$samples)))

I want to merge nodes of a binary tree created this way by pruning nodes that don't meet a threshold for several criteria, then keeping nodes that contain unique ids (currently held as a list per node). Unfortunately, pruning pushes many ids into internal nodes, and I do not want to agglomerate them with children from the second branch (which achieves a higher resolution).
I would like to obtain these nodes in a ToDataFrameTable conversion, but ToDataFrameTable currently leaves them out, and I have to remove them from my downstream analysis. Is there any way to keep internal nodes? Especially those that contain unique attributes?

Thank you!!
Kyle Kimler

groupOTU creates 0s as marker for non-group-associated value

Hi,

the following chunk introduces 0 in the created group.

> set.seed(2017)
> tr <- rtree(4)
> x <- as_tibble(tr)
> groupOTU(x, list(one= c('t3', 't1'),"zero"=c('t2')), group_name = "fake_group")
# A tibble: 7 x 5
  parent  node branch.length label fake_group
   <int> <int>         <dbl> <chr> <fct>     
1      5     1       0.435   t4    0         
2      7     2       0.674   t1    one       
3      7     3       0.00202 t3    one       
4      6     4       0.0251  t2    zero      
5      5     5      NA       NA    0         
6      5     6       0.472   NA    zero      
7      6     7       0.274   NA    one      

This is problematic, if a label with the value 0 exists.

> set.seed(2017)
> tr <- rtree(4)
> x <- as_tibble(tr)
> groupOTU(x, list(one= c('t3', 't1'),"0"=c('t2')), group_name = "fake_group")
# A tibble: 7 x 5
  parent  node branch.length label fake_group
   <int> <int>         <dbl> <chr> <fct>     
1      5     1       0.435   t4    0         
2      7     2       0.674   t1    one       
3      7     3       0.00202 t3    one       
4      6     4       0.0251  t2    0         
5      5     5      NA       NA    0         
6      5     6       0.472   NA    0         
7      6     7       0.274   NA    one

I think the problem is created in the following line:

foc <- rep(0, n)

and I would suggest replacing the 0 with NA. Would this be possible?
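To make the collision concrete, here is a deliberately simplified base-R sketch. The helper assign_groups is hypothetical and is not tidytree's groupOTU: it only labels tips and does not propagate groups to internal nodes. It does show why a 0 placeholder becomes indistinguishable from a group named "0", while an NA placeholder stays separate.

```r
# Hypothetical helper, not tidytree's implementation: assign each label to a
# group and fill the unassigned entries with `placeholder`.
assign_groups <- function(labels, groups, placeholder) {
  foc <- rep(placeholder, length(labels))
  for (g in names(groups)) {
    foc[labels %in% groups[[g]]] <- g  # numeric 0 is coerced to "0" here
  }
  factor(foc)
}

labels <- c("t4", "t1", "t3", "t2")
groups <- list(one = c("t3", "t1"), "0" = "t2")

with_zero <- assign_groups(labels, groups, 0)              # t4 and t2 both end up as "0"
with_na   <- assign_groups(labels, groups, NA_character_)  # t4 stays NA, t2 is "0"
```

With the NA placeholder, is.na(with_na) identifies the unassigned tips, so a real group called "0" no longer clashes with "no group".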

Felix

issue with tidytree 0.3.9 -> 0.4.0 under latest R 4.2.1-2

Dear,

I just tried to install other packages depending on tidytree and they failed.
Installing tidytree itself also fails after looking for a missing class.

I recently had other packages failing in that way, it seems that some general functions of classes have vanished with some recent R upgrade.

Does anyone know what to do here?

Thanks in advance

Bioconductor version 3.15 (BiocManager 1.30.18), R 4.2.1 (2022-06-23) ubuntu 20 (up-to-date)

> BiocManager::install("tidytree")
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
    CRAN: https://cloud.r-project.org

Bioconductor version 3.15 (BiocManager 1.30.18), R 4.2.1 (2022-06-23)
Installing package(s) 'tidytree'
trying URL 'https://cloud.r-project.org/src/contrib/tidytree_0.4.0.tar.gz'
Content type 'application/x-gzip' length 63761 bytes (62 KB)
==================================================
downloaded 62 KB

* installing *source* package ‘tidytree’ ...
** package ‘tidytree’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error in setClassUnion("DNAbin_Or_AAbin", c("DNAbin", "AAbin", "NULL")) : 
  could not find function "setClassUnion"
Error: unable to load R code in package ‘tidytree’
Execution halted
ERROR: lazy loading failed for package ‘tidytree’
* removing ‘/opt/R_LIBS/tidytree’
* restoring previous ‘/opt/R_LIBS/tidytree’

The downloaded source packages are in
	‘/opt/R_LIBS/tmp/RtmpapGmC3/downloaded_packages’
Old packages: 'tidytree'
Update all/some/none? [a/s/n]: 
a
trying URL 'https://cloud.r-project.org/src/contrib/tidytree_0.4.0.tar.gz'
Content type 'application/x-gzip' length 63761 bytes (62 KB)
==================================================
downloaded 62 KB

* installing *source* package ‘tidytree’ ...
** package ‘tidytree’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error in setClassUnion("DNAbin_Or_AAbin", c("DNAbin", "AAbin", "NULL")) : 
  could not find function "setClassUnion"
Error: unable to load R code in package ‘tidytree’
Execution halted
ERROR: lazy loading failed for package ‘tidytree’
* removing ‘/opt/R_LIBS/tidytree’
* restoring previous ‘/opt/R_LIBS/tidytree’

The downloaded source packages are in
	‘/opt/R_LIBS/tmp/RtmpapGmC3/downloaded_packages’
Warning messages:
1: In install.packages(...) :
  installation of package ‘tidytree’ had non-zero exit status
2: In install.packages(update[instlib == l, "Package"], l, contriburl = contriburl,  :
  installation of package ‘tidytree’ had non-zero exit status

Not a BioConductor issue, I get the same with the R installer

> install.packages("tidytree")
trying URL 'https://cloud.r-project.org/src/contrib/tidytree_0.4.0.tar.gz'
Content type 'application/x-gzip' length 63761 bytes (62 KB)
==================================================
downloaded 62 KB

* installing *source* package ‘tidytree’ ...
** package ‘tidytree’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error in setClassUnion("DNAbin_Or_AAbin", c("DNAbin", "AAbin", "NULL")) : 
  could not find function "setClassUnion"
Error: unable to load R code in package ‘tidytree’
Execution halted
ERROR: lazy loading failed for package ‘tidytree’
* removing ‘/opt/R_LIBS/tidytree’
* restoring previous ‘/opt/R_LIBS/tidytree’
Warning in install.packages :
  installation of package ‘tidytree’ had non-zero exit status

The downloaded source packages are in
	‘/opt/R_LIBS/tmp/RtmpRfy7nt/downloaded_packages’

error in getSubsLabel

I have a bizarre issue. I tried the following:

library(treeio)
library(ggtree)
library(phangorn)
tre <- read.tree("NA.nwk")
tipseq <- read.phyDat("NA.fas",type="AA",format="fasta")
fit <- pml(tre, tipseq, k=4)
fit <- optim.pml(fit, optNni=FALSE, optBf=T, optQ=T,
optInv=T, optGamma=T, optEdge=TRUE,
optRooted=FALSE, model = "WAG",
control = pml.control(trace =0))
pmltree <- as.treedata(fit)

and I get the error:

error in getSubsLabel(seqs, label[pp[i]], label[i], translate, removeGap) :
seqA should have equal length to seqB

I use R 4.0.3. Using a file with only one-character labels instead does not help. Is it that amino acid sequences are not supported?

NA_nwk.txt
NA_withnodelabels_nwk.txt
NA_fas.txt

using the dplyr::mutate on tbl_tree convert it to tbl_df

Hi, after applying mutate to a tbl_tree, the result is no longer a tbl_tree object and therefore cannot be converted back to a tree using as.phylo(). See the example below.

tree <- rtree(4)
tree
Phylogenetic tree with 4 tips and 3 internal nodes.

Tip labels:
[1] "t1" "t4" "t3" "t2"

Rooted; includes branch lengths.

as_data_frame(tree) %>% class()
[1] "tbl_tree"   "tbl_df"     "tbl"        "data.frame"

as_data_frame(tree)  %>% dplyr::mutate(row_number()) %>% class()
[1] "tbl_df"     "tbl"        "data.frame"

tree_data <- as_data_frame(tree)  %>% dplyr::mutate(row_number())

as.phylo(tree_data)
Error in UseMethod("as.phylo") : 
  no applicable method for 'as.phylo' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"

Allow protein sequences in `treedata`

The treedata objects currently require the tip_seq and anc_seq slots to be of type DNAbin. Would it be possible to allow them to be AAbin as well? Currently protein sequences are not supported.

One option would be to assign the slots the character type.
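Another option, sketched here under assumptions, is an S4 class union, so the slot accepts either sequence class or NULL. The union name DNAbin_Or_AAbin matches the setClassUnion call that appears in an installation log elsewhere on this page; the class SeqHolder and the fake AAbin object are made up for illustration and are not the real treedata definition.

```r
library(methods)

# Register the S3 classes so they can be used in S4 slot definitions.
setOldClass("DNAbin")
setOldClass("AAbin")

# A slot typed with this union accepts DNAbin, AAbin, or NULL.
setClassUnion("DNAbin_Or_AAbin", c("DNAbin", "AAbin", "NULL"))

# Illustrative container class, not the real treedata class.
setClass(
  "SeqHolder",
  slots = c(tip_seq = "DNAbin_Or_AAbin"),
  prototype = list(tip_seq = NULL)
)

# Fake AAbin object built by hand so the sketch needs no extra packages.
aa <- structure(list(as.raw(c(77, 75))), class = "AAbin")

empty  <- new("SeqHolder")                # tip_seq defaults to NULL
holder <- new("SeqHolder", tip_seq = aa)  # AAbin is accepted
```

Compared with a plain character slot, the union keeps type checking: passing an ordinary string to tip_seq is rejected when the object is created.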

could not find function "offspring.tbl_tree_item"

Hello
I'm trying to use the treeplot function of the clusterProfiler package. I got an error with my own data, so I tried the clusterProfiler example.
I get the following error:

treeplot(ego2, showCategory = 30)
! # Invaild edge matrix for <phylo>. A <tbl_df> is returned.
 # Invaild edge matrix for <phylo>. A <tbl_df> is returned.
! # Invaild edge matrix for <phylo>. A <tbl_df> is returned.
Error in offspring.tbl_tree_item(.data = .data, .node = .node, tiponly = tiponly,  : 
  could not find function "offspring.tbl_tree_item"

Here are my code and session info. I updated all my packages. Thanks in advance.
Regards
Carine

library(clusterProfiler)
> library(org.Hs.eg.db)
> library(enrichplot)
> library(GOSemSim)
> library(ggplot2)
> library(DOSE)


> data(geneList)
> gene <- names(geneList)[abs(geneList) > 2]
> ego <- enrichGO(gene  = gene,
+                 universe      = names(geneList),
+                 OrgDb         = org.Hs.eg.db,
+                 ont           = "BP",
+                 pAdjustMethod = "BH",
+                 pvalueCutoff  = 0.01,
+                 qvalueCutoff  = 0.05,
+                 readable      = TRUE)
> d <- godata('org.Hs.eg.db', ont="BP")
preparing gene to GO mapping data...
preparing IC data...
> ego2 <- pairwise_termsim(ego, method = "Wang", semData = d)
> treeplot(ego2, showCategory = 30)
! # Invaild edge matrix for <phylo>. A <tbl_df> is returned.
[the same message is repeated 43 more times]
Error in offspring.tbl_tree_item(.data = .data, .node = .node, tiponly = tiponly,  : 
  could not find function "offspring.tbl_tree_item"
> sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=French_France.utf8  LC_CTYPE=French_France.utf8   
[3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C                  
[5] LC_TIME=French_France.utf8    

time zone: Europe/Paris
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] DOSE_3.26.1           pathview_1.40.0       BiocManager_1.30.21  
 [4] org.Hs.eg.db_3.17.0   AnnotationDbi_1.62.2  IRanges_2.34.1       
 [7] S4Vectors_0.38.1      Biobase_2.60.0        BiocGenerics_0.46.0  
[10] dbplyr_2.3.3          gprofiler2_0.2.2      cowplot_1.1.1        
[13] enrichplot_1.20.0     lubridate_1.9.2       forcats_1.0.0        
[16] stringr_1.5.0         dplyr_1.1.2           purrr_1.0.1          
[19] readr_2.1.4           tidyr_1.3.0           tibble_3.2.1         
[22] ggplot2_3.4.2         tidyverse_2.0.0       clusterProfiler_4.8.2
[25] GOSemSim_2.26.1      

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3      rstudioapi_0.15.0      
  [3] jsonlite_1.8.7          magrittr_2.0.3         
  [5] farver_2.1.1            rmarkdown_2.23         
  [7] zlibbioc_1.46.0         vctrs_0.6.3            
  [9] memoise_2.0.1           RCurl_1.98-1.12        
 [11] ggtree_3.8.0            htmltools_0.5.5        
 [13] gridGraphics_0.5-1      htmlwidgets_1.6.2      
 [15] plyr_1.8.8              plotly_4.10.2          
 [17] cachem_1.0.8            igraph_1.5.0           
 [19] lifecycle_1.0.3         pkgconfig_2.0.3        
 [21] Matrix_1.6-0            R6_2.5.1               
 [23] fastmap_1.1.1           gson_0.1.0             
 [25] GenomeInfoDbData_1.2.10 digest_0.6.33          
 [27] aplot_0.1.10            ggnewscale_0.4.9       
 [29] colorspace_2.1-0        patchwork_1.1.2        
 [31] RSQLite_2.3.1           labeling_0.4.2         
 [33] fansi_1.0.4             timechange_0.2.0       
 [35] httr_1.4.6              polyclip_1.10-4        
 [37] compiler_4.3.1          bit64_4.0.5            
 [39] withr_2.5.0             downloader_0.4         
 [41] BiocParallel_1.34.2     viridis_0.6.3          
 [43] DBI_1.1.3               ggupset_0.3.0          
 [45] ggforce_0.4.1           MASS_7.3-60            
 [47] HDO.db_0.99.1           tools_4.3.1            
 [49] ape_5.7-1               scatterpie_0.2.1       
 [51] glue_1.6.2              nlme_3.1-162           
 [53] grid_4.3.1              shadowtext_0.1.2       
 [55] reshape2_1.4.4          fgsea_1.26.0           
 [57] generics_0.1.3          gtable_0.3.3           
 [59] tzdb_0.4.0              data.table_1.14.8      
 [61] hms_1.1.3               tidygraph_1.2.3        
 [63] utf8_1.2.3              XVector_0.40.0         
 [65] ggrepel_0.9.3           pillar_1.9.0           
 [67] yulab.utils_0.0.6       splines_4.3.1          
 [69] tweenr_2.0.2            treeio_1.24.2          
 [71] lattice_0.21-8          bit_4.0.5              
 [73] tidyselect_1.2.0        GO.db_3.17.0           
 [75] Biostrings_2.68.1       knitr_1.43             
 [77] gridExtra_2.3           xfun_0.39              
 [79] graphlayouts_1.0.0      KEGGgraph_1.60.0       
 [81] stringi_1.7.12          lazyeval_0.2.2         
 [83] ggfun_0.1.1             yaml_2.3.7             
 [85] evaluate_0.21           codetools_0.2-19       
 [87] ggraph_2.1.0            qvalue_2.32.0          
 [89] Rgraphviz_2.44.0        graph_1.78.0           
 [91] ggplotify_0.1.1         cli_3.6.1              
 [93] munsell_0.5.0           Rcpp_1.0.11            
 [95] GenomeInfoDb_1.36.1     png_0.1-8              
 [97] XML_3.99-0.14           parallel_4.3.1         
 [99] blob_1.2.4              bitops_1.0-7           
[101] viridisLite_0.4.2       tidytree_0.4.4         
[103] scales_1.2.1            crayon_1.5.2           
[105] rlang_1.1.1             fastmatch_1.1-3        
[107] KEGGREST_1.40.0        

Found more than one class "phylo" in cache

Hello,

As described in #10, we get many persistent warnings caused by a conflict with phyloseq.
We can't just ignore them, because they spam users during our package development.

library(phyloseq)
library(ggtree)

t <- read_tree(system.file("extdata", "esophagus.tree.gz", package="phyloseq"))
#> Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
#> Also defined by ‘tidytree’

What can we do?
Thanks in advance

dplyr 1.0.0

In addition to #12, we see these issues when testing this package against the soon to be released dplyr 1.0.0:

[master*] 219.2 MiB ❯ revdepcheck::revdep_details(revdep = "tidytree")
══ Reverse dependency check ══════════════════════════════════ tidytree 0.3.1 ══

Status: BROKEN

── Newly failing

✖ checking examples ... ERROR
✖ checking tests ...

── Before ──────────────────────────────────────────────────────────────────────
0 errors ✔ | 0 warnings ✔ | 0 notes ✔

── After ───────────────────────────────────────────────────────────────────────
❯ checking examples ... ERROR
  Running examples in ‘tidytree-Ex.R’ failed
  The error most likely occurred in:

  > ### Name: as.treedata
  > ### Title: as.treedata
  > ### Aliases: as.treedata as.treedata.tbl_tree
  >
  > ### ** Examples
  >
  > library(ape)
  > set.seed(2017)
  > tree <- rtree(4)
  > d <- tibble(label = paste0('t', 1:4),
  +            trait = rnorm(4))
  > x <- as_tibble(tree)
  > full_join(x, d, by = 'label') %>% as.treedata
  Warning: `mutate_()` is deprecated as of dplyr 0.7.0.
  Please use `mutate()` instead.
  See vignette('programming') for more help
  This warning is displayed once every 8 hours.
  Call `lifecycle::last_warnings()` to see where this warning was generated.
  Error in get(x, envir = ns, inherits = FALSE) :
    object 'mutate.tbl_df' not found
  Calls: %>% ... mutate_.tbl_df -> mutate -> mutate.tbl_tree -> <Anonymous> -> get
  Execution halted

❯ checking tests ...
  See below...

── Test failures ───────────────────────────────────────────────── testthat ────

> library(testthat)
> library(tidytree)

Attaching package: 'tidytree'

The following object is masked from 'package:stats':

    filter

>
> test_check("tidytree")
── 1. Error: conversion to table is reversible (@test-access-related-nodes.R#56)
object 'mutate.tbl_df' not found
Backtrace:
  1. testthat::expect_equal(as.phylo(as_tibble(bi_tree)), bi_tree)
  5. tidytree:::as.phylo.tbl_tree(as_tibble(bi_tree))
  6. [ `%<>%`(...) ] with 7 more calls
 15. dplyr:::mutate_.tbl_df(., isTip = ~(!node %in% parent))
 17. tidytree:::mutate.tbl_tree(.data, !!!dots)
 18. utils::getFromNamespace("mutate.tbl_df", "dplyr")
 19. base::get(x, envir = ns, inherits = FALSE)

── 2. Failure: child works for bifurcating trees (@test-access-related-nodes.R#6
child(as_tibble(bi_tree), 1) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 3. Failure: child works for non-bifurcating trees (@test-access-related-nodes
child(as_tibble(multi_tree), 1) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 4. Failure: offspring works on bifurcating trees (@test-access-related-nodes.
offspring(as_tibble(bi_tree), 1) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 5. Failure: offspring works on non-bifurcating trees (@test-access-related-no
offspring(as_tibble(multi_tree), 1) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 6. Failure: parent works for bifurcating trees (@test-access-related-nodes.R#
parent(as_tibble(bi_tree), 11) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 7. Failure: parent works for non-bifurcating trees (@test-access-related-node
parent(as_tibble(multi_tree), 11) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 8. Failure: ancestor works for bifurcating trees (@test-access-related-nodes.
ancestor(as_tibble(bi_tree), 11) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 9. Failure: ancestor works for non-bifurcating trees (@test-access-related-no
ancestor(as_tibble(multi_tree), 11) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 10. Failure: MRCA works for bifurcating trees (@test-access-related-nodes.R#1
MRCA(as_tibble(multi_tree), 11, 5) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 11. Failure: MRCA works for non-bifurcating trees (@test-access-related-nodes
MRCA(as_tibble(multi_tree), 11, 5) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 12. Failure: sibling works for bifurcating trees (@test-access-related-nodes.
sibling(as_tibble(bi_tree), 11) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

── 13. Failure: sibling works for non-bifurcating trees (@test-access-related-no
sibling(as_tibble(multi_tree), 11) not equal to `empty_tbl`.
Attributes: < Component "class": Lengths (4, 3) differ (string compare on first 3) >
Attributes: < Component "class": 3 string mismatches >

══ testthat results  ═══════════════════════════════════════════════════════════
[ OK: 31 | SKIPPED: 0 | WARNINGS: 2 | FAILED: 13 ]
1. Error: conversion to table is reversible (@test-access-related-nodes.R#56)
2. Failure: child works for bifurcating trees (@test-access-related-nodes.R#64)
3. Failure: child works for non-bifurcating trees (@test-access-related-nodes.R#73)
4. Failure: offspring works on bifurcating trees (@test-access-related-nodes.R#81)
5. Failure: offspring works on non-bifurcating trees (@test-access-related-nodes.R#87)
6. Failure: parent works for bifurcating trees (@test-access-related-nodes.R#93)
7. Failure: parent works for non-bifurcating trees (@test-access-related-nodes.R#99)
8. Failure: ancestor works for bifurcating trees (@test-access-related-nodes.R#105)
9. Failure: ancestor works for non-bifurcating trees (@test-access-related-nodes.R#111)
1. ...

Error: testthat unit tests failed
Execution halted

2 errors ✖ | 0 warnings ✔ | 0 notes ✔

as.treedata fails to parse an appropriately-structured tibble

I'm plotting a phylo object with ggtree to take advantage of its ability to integrate lots of node metadata. I do this by converting the phylo object to a tibble, applying various dplyr and purrr functions, then converting back with as.treedata. When the tibble is modified directly, no issues are encountered and the tree structure is retained. However, if the tibble is converted to a list and back, as in the purrr::modify_at example below, as.treedata fails to parse it correctly even though the tibble still has all the necessary columns: the branch lengths are lost and the node and tip labels are misread.

library(ape)    # rtree()
library(dplyr)  # left_join(), mutate(), group_split(), bind_rows()
library(purrr)
library(tidytree)

phy <- rtree(20, br = runif)

nodedata <- tibble(node = 21:39, nodeinfo = letters[1:19])
tipdata <- tibble(node = 1:20,  tipinfo = LETTERS[1:20])

phytbl <- as_tibble(phy) %>% 
  left_join(nodedata, by = "node") %>%
  left_join(tipdata, by = "node") %>%
  mutate(label = paste(label, tipinfo))

as.treedata(phytbl)@phylo # Exactly as expected

phytbl <- 
  phytbl %>%
  mutate(branchinfo = as.character(NA)) %>%
  group_split(branch.length > 0.6) %>%
  modify_at(2, ~ mutate(., branchinfo = "info")) %>%
  bind_rows()

as.treedata(phytbl)@phylo # Branch lengths, node and tip labels all lost

This appears to be because the phytbl object loses the tbl_tree class during the modification; if the class is added back, as.treedata works as expected. However, IMO this isn't readily apparent, and it took me some time and poking into the source code to figure out.
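A minimal sketch of the workaround described above, assuming the only thing missing after bind_rows() is the tbl_tree class attribute. The helper name restore_tbl_tree is hypothetical and is not part of tidytree:

```r
# Hypothetical helper (not a tidytree function): re-attach the "tbl_tree"
# class when an operation such as group_split() + bind_rows() has dropped it.
restore_tbl_tree <- function(x) {
  if (!inherits(x, "tbl_tree")) {
    class(x) <- c("tbl_tree", class(x))
  }
  x
}

# Stand-in for a tibble that lost its class; the values are made up
plain <- data.frame(parent = c(3, 3, 3), node = 1:3)
restored <- restore_tbl_tree(plain)
inherits(restored, "tbl_tree")
```

In the example from this issue, the call would then be as.treedata(restore_tbl_tree(phytbl))@phylo.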

Conflict with phyloseq?

Hi,
When I load tidytree (v0.2.6) and phyloseq (v1.28.0) together and use the phyloseq function read_tree, as in this example:

library(phyloseq)
library(tidytree)

treefilename <- read_tree(system.file("extdata", "esophagus.tree.gz", package="phyloseq"))

I get these warnings:

Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
Also defined by ‘tidytree’
Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
Also defined by ‘tidytree’
Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
Also defined by ‘tidytree’
Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
Also defined by ‘tidytree’
Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
Also defined by ‘tidytree’
Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
Also defined by ‘tidytree’

Any ideas?

drop.tip functionality for tbl_tree class?

According to the docs, one must use ape::drop.tip() to remove tips. So, given a tbl_tree object, the user has to run tree %>% as.phylo %>% drop.tip %>% as_tibble, but all extra columns are removed in the process. It would be helpful if tidytree had a function to remove tips (and the internal nodes that become unnecessary) directly from tbl_tree objects.

As far as I can tell tree_subset() cannot reproduce the functionality of ape::drop.tip.
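Until such a function exists, one possible stopgap is to prune with ape and then re-attach the extra columns by label. The sketch below assumes that tip labels are unique and that internal nodes are unlabelled, so data attached to internal nodes is not recovered. The trait column and the lookup table are made up for illustration:

```r
library(ape)
library(dplyr)
library(tidytree)

set.seed(42)
phy <- rtree(6)                      # tips are labelled t1 ... t6
tbl <- as_tibble(phy)
tbl$trait <- seq_len(nrow(tbl))      # made-up extra column

# Keep the extra columns for labelled rows in a plain lookup table
lookup <- as.data.frame(tbl)[, c("label", "trait")]
lookup <- lookup[!is.na(lookup$label), ]

# Prune with ape, then re-attach the extra columns by label
pruned <- drop.tip(as.phylo(tbl), "t1")
pruned_tbl <- left_join(as_tibble(pruned), lookup, by = "label")
```

Joining on label rather than node matters here because drop.tip() renumbers the nodes of the pruned tree.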

speedup ancestor and offspring method

require(treeio)
require(tidytree)
require(microbenchmark)
file <- system.file("extdata/BEAST", "beast_mcc.tree", package="treeio")
beast <- read.beast(file)
x <- as_tibble(beast)  # as_data_frame() is deprecated in tibble
ancestor2 = ggtree:::getAncestor.df

microbenchmark(tidytree = ancestor(x, 23), ggtree = ancestor2(x, 23), times=100)

tidytree::ancestor loops by repeatedly calling tidytree::parent, which in turn uses dplyr::filter.

Such looping is quite slow; the equivalent ggtree function is considerably faster (see the timings below). The tidytree version needs to be optimized.

> microbenchmark(tidytree = ancestor(x, 23), ggtree = ancestor2(x, 23), times=100)
Unit: microseconds
     expr       min         lq       mean     median        uq      max neval
 tidytree 23906.222 25256.0290 26731.8273 25817.7510 27029.123 72953.83   100
   ggtree   488.796   532.2265   592.7944   561.8835   603.894  2084.83   100
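One way to avoid a dplyr::filter call per step is to build a parent lookup vector once and walk it with plain indexing. The sketch below is not tidytree's or ggtree's implementation. The function name ancestor_fast is hypothetical, and it assumes a table with integer-like parent and node columns in which the root is the row whose parent equals its own node:

```r
# Hypothetical helper: collect the ancestors of `node` by walking a
# precomputed parent lookup vector instead of filtering the table each step.
ancestor_fast <- function(tbl, node) {
  parent_of <- integer(max(tbl$node))
  parent_of[tbl$node] <- tbl$parent   # parent_of[i] is the parent of node i
  out <- integer(0)
  current <- node
  repeat {
    p <- parent_of[current]
    if (p == current) break           # assumed root convention: parent == node
    out <- c(out, p)
    current <- p
  }
  out
}

# Toy tree: tips 1-3, root 4, internal node 5 (made-up data)
toy <- data.frame(parent = c(5, 5, 4, 4, 4), node = 1:5)
ancestor_fast(toy, 1)  # returns 5 4
```

The function returns node numbers ordered from the immediate parent up to the root; subsetting the original table by those numbers would give the corresponding rows.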

treedata object: cannot join on data column other than "node"

Works:

library(ape)
library(dplyr)     # tibble(), full_join()
library(tidytree)
library(treeio)
data(woodmouse)
d <- dist.dna(woodmouse)
tr <- nj(d)
bp <- boot.phylo(tr, woodmouse, function(x) nj(dist.dna(x)))
bp2 <- tibble(X=1:Nnode(tr) + Ntip(tr), bootstrap = bp)
full_join(tr, bp2, by=c('node'='X'))

Does not work:

library(ape)
library(dplyr)     # tibble(), full_join()
library(tidytree)
library(treeio)
data(woodmouse)
d <- dist.dna(woodmouse)
tr <- nj(d)
bp <- boot.phylo(tr, woodmouse, function(x) nj(dist.dna(x)))
bp2 <- tibble(X=1:Nnode(tr) + Ntip(tr), bootstrap = bp)
full_join(as.treedata(tr), bp2, by=c('node'='X'))

Error:

Error in match.arg(by, c("node", "label")) : 
  'arg' should be one of “node”, “label”

It appears that full_join() on a treedata object does not support the standard dplyr interface of by=c('columnX'='columnY'); instead, as the error message shows, by must be either 'node' or 'label'.
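A possible workaround, sketched under the assumption that renaming the key column is acceptable, is to rename X to node before joining so that by = "node" can be used. The data frame below is a made-up stand-in for bp2:

```r
# Stand-in for the bootstrap table from the issue; the values are made up
bp2 <- data.frame(X = 16:28, bootstrap = seq(10, 130, by = 10))

# Rename the key column so it matches the name that full_join() accepts
names(bp2)[names(bp2) == "X"] <- "node"
names(bp2)
```

With the column renamed, the failing call from above would become full_join(as.treedata(tr), bp2, by = "node").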

sessionInfo

R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] heatmaply_1.4.0       viridis_0.6.2         viridisLite_0.4.1     tidygraph_1.2.2       ggrepel_0.9.2        
 [6] ggtree_3.6.2          tidytree_0.4.1        treeio_1.22.0         ape_5.6-2             data.tree_1.0.0      
[11] lubridate_1.9.0       timechange_0.1.1      memoise_2.0.1         tidytable_0.9.1       data.table_1.14.6    
[16] plotly_4.10.1         ggplot2_3.4.0         visNetwork_2.1.2      reactable_0.3.0       collapsibleTree_0.1.7
[21] shinyTree_0.2.7       shinyWidgets_0.7.5    shinycssloaders_1.0.0 shinyBS_0.61.1        jsonlite_1.8.3       
[26] paws_0.1.12           styler_1.8.1          lintr_3.0.2           languageserver_0.3.14 dbplyr_2.2.1         
[31] dplyr_1.0.10          RPostgres_1.4.4       DBI_1.1.3             shiny_1.7.3           aws.signature_0.6.0  
[36] aws.s3_0.3.21         httr_1.4.4           

loaded via a namespace (and not attached):
 [1] colorspace_2.0-3              ellipsis_0.3.2                rsconnect_0.8.28             
 [4] rprojroot_2.0.3               base64enc_0.1-3               aplot_0.1.8                  
 [7] paws.security.identity_0.1.12 rstudioapi_0.14               farver_2.1.1                 
[10] remotes_2.4.2                 bit64_4.0.5                   fansi_1.0.3                  
[13] xml2_1.3.3                    codetools_0.2-18              R.methodsS3_1.8.2            
[16] cachem_1.0.6                  R.oo_1.25.0                   BiocManager_1.30.19          
[19] compiler_4.2.1                assertthat_0.2.1              fastmap_1.1.0                
[22] lazyeval_0.2.2                cli_3.4.1                     later_1.3.0                  
[25] htmltools_0.5.3               tools_4.2.1                   igraph_1.3.5                 
[28] gtable_0.3.1                  glue_1.6.2                    reshape2_1.4.4               
[31] Rcpp_1.0.9                    jquerylib_0.1.4               vctrs_0.5.1                  
[34] nlme_3.1-157                  iterators_1.0.14              crosstalk_1.2.0              
[37] paws.common_0.5.1             stringr_1.4.1                 ps_1.7.2                     
[40] mime_0.12                     lifecycle_1.0.3               renv_0.15.5                  
[43] dendextend_1.16.0             ca_0.71.1                     scales_1.2.1                 
[46] TSP_1.2-1                     hms_1.1.2                     promises_1.2.0.1             
[49] rex_1.2.1                     parallel_4.2.1                RColorBrewer_1.1-3           
[52] yaml_2.3.6                    curl_4.3.3                    gridExtra_2.3                
[55] ggfun_0.0.9                   yulab.utils_0.0.5             sass_0.4.2                   
[58] stringi_1.7.8                 desc_1.4.2                    foreach_1.5.2                
[61] seriation_1.4.0               cyclocomp_1.1.0               rlang_1.0.6                  
[64] pkgconfig_2.0.3               lattice_0.20-45               fontawesome_0.4.0            
[67] purrr_0.3.5                   patchwork_1.1.2               htmlwidgets_1.5.4            
[70] labeling_0.4.2                bit_4.0.5                     processx_3.8.0               
[73] tidyselect_1.2.0              plyr_1.8.8                    magrittr_2.0.3               
[76] R6_2.5.1                      generics_0.1.3                pillar_1.8.1                 
[79] withr_2.5.0                   tibble_3.1.8                  crayon_1.5.2                 
[82] utf8_1.2.2                    grid_4.2.1                    reactR_0.4.4                 
[85] blob_1.2.3                    callr_3.7.3                   webshot_0.5.4                
[88] digest_0.6.30                 xtable_1.8-4                  R.cache_0.16.0               
[91] tidyr_1.2.1                   httpuv_1.6.6                  gridGraphics_0.5-1           
[94] R.utils_2.12.2                munsell_0.5.0                 registry_0.5-1               
[97] ggplotify_0.1.0               bslib_0.4.1 
