The designspace_culttax_article_2021 from yesdavid

RevBayes for Bayesian inference of phylogenies using continuous traits

There's so much fascinating work in this repo, really amazing stuff! It got me thinking about options for Bayesian inference with these data, and here's my attempt at a workflow using the RevBayes program (it's very similar to R!). I've added some comments throughout so hopefully you can make it work on your machine. Most of it is directly from Parins-Fukuchi 2018 (I know you mentioned this paper already, perhaps you've already tried it also) and the RevBayes code in her repository.

# Starting from the end of your 3_code_late_neolithic_early_bronze_age.R,
# having run all the code in that script...

# write a nex file for input into RevBayes
write.nexus.data(as.matrix(nicolas_2016_without_outliers_PCA_as_df_subset_typochron_FR),
                 file.path(output_folder, "nicolas_2016_without_outliers_PCA_as_df_subset_typochron_FR.nex"),
                 format = "continuous")

#----------------------------------------------------------------
# This is RevBayes code, not R, and needs to be run in the terminal

# This code file was downloaded and modified from: 
# Parins-Fukuchi, Caroline (2017), Data from: Use of continuous 
# traits can improve morphological phylogenetics, Dryad, 
# Dataset, https://doi.org/10.5061/dryad.40b70

# The code is cited in this paper: 
# Caroline Parins-Fukuchi, Use of Continuous Traits Can Improve Morphological 
# Phylogenetics, Systematic Biology, Volume 67, Issue 2, March 2018, 
# Pages 328–339, https://doi.org/10.1093/sysbio/syx072

## This is intended to be run on RevBayes v1.0.0
# v1.0.1 has changed several function names. see RevBayes documentation for more details. 
# This procedure was developed with the gratuitous aid of RevBayes example documents
# authored by Nicolas Lartillot, Michael Landis, and April Wright.

# BM: I downloaded RevBayes onto my computer and put in in my Applications folder (OSX)

# Assuming we are in an open RStudio Project, we can 
# start RevBayes the RStudio terminal with the commmand:

# /Applications/rb

# this should start RevBayes in the project directory

# run these lines below in the terminal after starting RevBayes (I copy-paste them
# into the terminal in RStudio)

getwd() # confirm that we are in the project directory where are input/output files are

fl <- "nicolas_2016_without_outliers_PCA_as_df_subset_typochron_FR.nex"  # continuous character set used for analysis in nexus format
outtr <- "OUTTREEFILE"   # file to write sampled trees
outlf <- "OUTLOGFILE"    # MCMC log
outsm <- "OUTSUMFILE"    # MAP summary tree file

# import the data into RevBayes, watch terminal for message indicating success
contData <- readContinuousCharacterData(fl)
# the most relevant official tutorial is probably 
# https://revbayes.github.io/tutorials/cont_traits/simple_bm.html

numTips = contData.ntaxa()
numNodes = numTips * 2 - 1
names = contData.names()
diversification ~ dnLognormal(0,1) # aka 'birth rate' for new varients 
turnover = 0 # we are going to parameterise the BD prior, 
# 'The turnover rate is the rate at which one species is replaced by another 
# species due to a birth plus death event.... the turnover rate represent the 
# longevity of a species' details: https://revbayes.github.io/tutorials/divrate/simple.html
speciation := diversification + turnover
extinction := turnover 
sampling_fraction <- 1

# instantiate a Birth-Death tree with the parameters set above
psi ~ dnBDP(lambda=speciation,                # The speciation rate
            mu=extinction,                    # The extinction rate.
            rho=sampling_fraction,            # The taxon sampling fraction
            rootAge=1, 
            samplingStrategy = "uniform",     # The sampling strategy of including taxa at the present.
            condition = "nTaxa",
            taxa=names) 
mvi = 0 # we are going to set tree rearrangement and node height scaling moves

# create our node height and topological rearrangement MCMC moves. These help us to explore
# the parameter space in each MCMC step. Good notes on these here: 
# http://www.peterbeerli.com/classes/images/5/5f/RB_CTMC_Tutorial_oct2015.pdf
moves[++mvi] = mvSubtreeScale(psi, weight=5.0) # change the ages of the internal nodes
moves[++mvi] = mvNodeTimeSlideUniform(psi, weight=10.0) # change the ages of the internal nodes
moves[++mvi] = mvNNI(psi, weight=5.0)  # nearest-neighbor interchange move
moves[++mvi] = mvFNPR(psi, weight=5.0) # a fixed-node height subtree-prune and regrafting move

logSigma ~ dnNormal(0,1) # place a prior on BM sigma parameter.
sigma := 10^logSigma
moves[++mvi] = mvSlide(logSigma, delta=1.0, tune=true, weight=2.0)

# For our MCMC analysis, we need to set up a vector of monitors to record 
# the states of our Markov chain.

monitors[1] = mnScreen(printgen=50000, sigma)
monitors[2] = mnFile(filename=outlf, printgen=400, separator = TAB, sigma)
monitors[3] = mnFile(filename=outtr, printgen=100, separator = TAB, psi)

# specify that we are going calculate BM likelihood using the REML PIC algorithm (see Felsenstein 1973)
# several chapters on fitting BM models here: https://lukejharmon.github.io/pcm/chapters/
traits ~ dnPhyloBrownianREML(psi, branchRates=1.0, siteRates=sigma, nSites=contData.nchar())

# When this clamp function is called, RevBayes sets each of the stochastic 
# nodes representing the tips of the tree to the corresponding 
# nucleotide sequence in the alignment. This essentially 
# tells the program that we have observed data for the sequences at the tips.
traits.clamp(contData) # match traits to tips

# we wrap the entire model to provide convenient access to the DAG. 
# To do this, we only need to give the model() function a 
# single node. With this node, the model() function can find all of the other
# nodes by following the arrows in the graphical model (see DOI:10.1093/sysbio/syw021
# for details of the typical graphical model)
bmv = model(sigma)     # link sigma param w/ BM model

# With a fully specified model, a set of monitors, and a set of moves, 
# we can now set up the MCMC algorithm that will sample parameter values in 
# proportion to their posterior probability. The mcmc() function will
# create our MCMC object:
chain = mcmc(bmv, monitors, moves)
chain.burnin(generations=500, tuningInterval=50) # originally 50000 and 500

chain.run(5000) # originally 500000
# !STOP! We have to rename OUTTREEFILE to OUTTREEFILE.trees before running the next line
# R code starts here to rename the file ----------------------------------------
file.rename(file.path(output_folder, "OUTTREEFILE"), 
            file.path(output_folder, "OUTTREEFILE.trees"))
# R code ends here -------------------------------------------------------------

# To summarize the trees sampled from the posterior distribution, 
# RevBayes can summarize the sampled trees by reading in the tree-trace file:
treetrace = readTreeTrace(file = outtr)
treefl <- outsm

# The mapTree() function will summarize the tree samples and write the 
# maximum a posteriori tree to file, we can also summarise trees as MCC 
# (maximum clade credibility) representation 
map = mapTree( file=treefl, treetrace )
mccTree( file="mcc.tre", treetrace )
q() # quits RevBayes

# End of RevBayes code
# resources beyond the official tutorials: 
# - https://ib.berkeley.edu/courses/ib200/labs/12/lab12.pdf
#- https://github.com/revbayes/revbayes_tutorial/blob/master/RB_PhyloComparative_BM_Tutorial/scripts/continuous_burst.Rev
# - https://github.com/search?p=3&q=dnPhyloBrownianREML&type=Code
#----------------------------------------------------------------

# ...back to R again 

outsumfile <- ape::read.nexus( file.path(output_folder, "OUTSUMFILE"))

ggtree(outsumfile) %<+% names_artefacts_ID_and_period + 
  theme_tree() + 
  geom_tiplab(size=2, 
              aes(label = ID_country,
                  color = ID_country)) +
  geom_treescale() 

# explore some plotting methods

# https://cran.r-project.org/web/packages/MCMCtreeR/vignettes/MCMCtree_plot.html
library(MCMCtreeR, quietly = TRUE, warn.conflicts = FALSE)

outsumfile.rb <- file.path(output_folder, "OUTSUMFILE")

MCMC.tree.plot(analysis.type='revbayes',
               directory.files=outsumfile.rb, 
               cex.tips=0.33,
               plot.type='phylogram', 
               lwd.bar=2,
               add.time.scale=FALSE,
               node.method='bar', 
               col.age='navy')

MCMC.tree.plot(
  directory.files=outsumfile.rb,
  analysis.type = "revbayes",
  cex.tips = 0.2,
  plot.type = "phylogram",
  lwd.bar = 2,
  node.method = "node.length",
  cex.labels = 0.5,
  add.time.scale=FALSE)


phy <- ape::read.tree(file.path(output_folder, "OUTTREEFILE.trees"))

# write it out to a format that crsl4/PhyloNetworks.jl can read
# cf. https://crsl4.github.io/PhyloNetworks.jl/latest/man/inputdata/
ape::write.tree(phy, file.path(output_folder, "phy.trees"))

ape::write.nexus(phy, file.path(output_folder, "phy.nexus"))

# I can't see any tree branches when the 
phangorn::densiTree(phy, type = "phylogram", cex = 0.5)

plot(phangorn::maxCladeCred(phy),  cex = 0.5)
plot(consensus(phy))

# Looks promising:
# https://revbayes.github.io/tutorials/intro/revgadgets.html
library(RevGadgets)

outtree.rb <- readTrees(outsumfile.rb)
ape::write.tree(outtree.rb[[1]][[1]]@phylo, file.path(output_folder, "out.trees"))

outtree.rb1 <- readTrees( file.path(output_folder, "out.trees"))

plot(outtree.rb1[[1]][[1]]@phylo)

# create the plot of the rooted tree
plot <- plotTree(tree = outtree.rb,
                 # label nodes the with posterior probabilities
                 # node_labels = "posterior", 
                 # offset the node labels from the nodes
                 node_labels_offset = 0.005,
                 # make tree lines more narrow
                 line_width = 0.5,
                 # italicize tip labels 
                 tip_labels_italics = TRUE)

# add scale bar to the tree and plot with ggtree
library(ggtree)
plot + geom_treescale(x = -0.35, y = -1)

#----------------------------------------------------------------------------
# PhyloNetworks

# prepare RevBayes output for input into PhyloNetworks 

phy <- ape::read.tree(file.path(output_folder, "OUTTREEFILE.trees"))

# write it out to a format that crsl4/PhyloNetworks.jl can read
# cf. https://crsl4.github.io/PhyloNetworks.jl/latest/man/inputdata/
# these are our 1000s of trees
ape::write.tree(phy, file.path(output_folder, "phy.trees"))

# this is our single MAP tree
library(RevGadgets)
outtree.rb <- readTrees(file.path(output_folder, "OUTSUMFILE"))
ape::write.tree(outtree.rb[[1]][[1]]@phylo, file.path(output_folder, "out.trees"))
This is the output from densiTree:
yesdavid / designspace_culttax_article_2021 Goto Github PK

designspace_culttax_article_2021's People

Stargazers

Watchers

designspace_culttax_article_2021's Issues

RevBayes for Bayesian inference of phylogenies using continuous traits

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent