Git Product home page Git Product logo

designspace_culttax_article_2021's People

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar

designspace_culttax_article_2021's Issues

RevBayes for Bayesian inference of phylogenies using continuous traits

There's so much fascinating work in this repo, really amazing stuff! It got me thinking about options for Bayesian inference with these data, and here's my attempt at a workflow using the RevBayes program (it's very similar to R!). I've added some comments throughout so hopefully you can make it work on your machine. Most of it is directly from Parins-Fukuchi 2018 (I know you mentioned this paper already, perhaps you've already tried it also) and the RevBayes code in her repository.


# Starting from the end of your 3_code_late_neolithic_early_bronze_age.R,
# having run all the code in that script...

# write a nex file for input into RevBayes
write.nexus.data(as.matrix(nicolas_2016_without_outliers_PCA_as_df_subset_typochron_FR),
                 file.path(output_folder, "nicolas_2016_without_outliers_PCA_as_df_subset_typochron_FR.nex"),
                 format = "continuous")

#----------------------------------------------------------------
# This is RevBayes code, not R, and needs to be run in the terminal

# This code file was downloaded and modified from: 
# Parins-Fukuchi, Caroline (2017), Data from: Use of continuous 
# traits can improve morphological phylogenetics, Dryad, 
# Dataset, https://doi.org/10.5061/dryad.40b70

# The code is cited in this paper: 
# Caroline Parins-Fukuchi, Use of Continuous Traits Can Improve Morphological 
# Phylogenetics, Systematic Biology, Volume 67, Issue 2, March 2018, 
# Pages 328–339, https://doi.org/10.1093/sysbio/syx072

## This is intended to be run on RevBayes v1.0.0
# v1.0.1 has changed several function names. see RevBayes documentation for more details. 
# This procedure was developed with the gratuitous aid of RevBayes example documents
# authored by Nicolas Lartillot, Michael Landis, and April Wright.

# BM: I downloaded RevBayes onto my computer and put in in my Applications folder (OSX)

# Assuming we are in an open RStudio Project, we can 
# start RevBayes the RStudio terminal with the commmand:

# /Applications/rb

# this should start RevBayes in the project directory

# run these lines below in the terminal after starting RevBayes (I copy-paste them
# into the terminal in RStudio)

getwd() # confirm that we are in the project directory where are input/output files are

fl <- "nicolas_2016_without_outliers_PCA_as_df_subset_typochron_FR.nex"  # continuous character set used for analysis in nexus format
outtr <- "OUTTREEFILE"   # file to write sampled trees
outlf <- "OUTLOGFILE"    # MCMC log
outsm <- "OUTSUMFILE"    # MAP summary tree file

# import the data into RevBayes, watch terminal for message indicating success
contData <- readContinuousCharacterData(fl)
# the most relevant official tutorial is probably 
# https://revbayes.github.io/tutorials/cont_traits/simple_bm.html

numTips = contData.ntaxa()
numNodes = numTips * 2 - 1
names = contData.names()
diversification ~ dnLognormal(0,1) # aka 'birth rate' for new varients 
turnover = 0 # we are going to parameterise the BD prior, 
# 'The turnover rate is the rate at which one species is replaced by another 
# species due to a birth plus death event.... the turnover rate represent the 
# longevity of a species' details: https://revbayes.github.io/tutorials/divrate/simple.html
speciation := diversification + turnover
extinction := turnover 
sampling_fraction <- 1

# instantiate a Birth-Death tree with the parameters set above
psi ~ dnBDP(lambda=speciation,                # The speciation rate
            mu=extinction,                    # The extinction rate.
            rho=sampling_fraction,            # The taxon sampling fraction
            rootAge=1, 
            samplingStrategy = "uniform",     # The sampling strategy of including taxa at the present.
            condition = "nTaxa",
            taxa=names) 
mvi = 0 # we are going to set tree rearrangement and node height scaling moves

# create our node height and topological rearrangement MCMC moves. These help us to explore
# the parameter space in each MCMC step. Good notes on these here: 
# http://www.peterbeerli.com/classes/images/5/5f/RB_CTMC_Tutorial_oct2015.pdf
moves[++mvi] = mvSubtreeScale(psi, weight=5.0) # change the ages of the internal nodes
moves[++mvi] = mvNodeTimeSlideUniform(psi, weight=10.0) # change the ages of the internal nodes
moves[++mvi] = mvNNI(psi, weight=5.0)  # nearest-neighbor interchange move
moves[++mvi] = mvFNPR(psi, weight=5.0) # a fixed-node height subtree-prune and regrafting move

logSigma ~ dnNormal(0,1) # place a prior on BM sigma parameter.
sigma := 10^logSigma
moves[++mvi] = mvSlide(logSigma, delta=1.0, tune=true, weight=2.0)

# For our MCMC analysis, we need to set up a vector of monitors to record 
# the states of our Markov chain.

monitors[1] = mnScreen(printgen=50000, sigma)
monitors[2] = mnFile(filename=outlf, printgen=400, separator = TAB, sigma)
monitors[3] = mnFile(filename=outtr, printgen=100, separator = TAB, psi)

# specify that we are going calculate BM likelihood using the REML PIC algorithm (see Felsenstein 1973)
# several chapters on fitting BM models here: https://lukejharmon.github.io/pcm/chapters/
traits ~ dnPhyloBrownianREML(psi, branchRates=1.0, siteRates=sigma, nSites=contData.nchar())

# When this clamp function is called, RevBayes sets each of the stochastic 
# nodes representing the tips of the tree to the corresponding 
# nucleotide sequence in the alignment. This essentially 
# tells the program that we have observed data for the sequences at the tips.
traits.clamp(contData) # match traits to tips

# we wrap the entire model to provide convenient access to the DAG. 
# To do this, we only need to give the model() function a 
# single node. With this node, the model() function can find all of the other
# nodes by following the arrows in the graphical model (see DOI:10.1093/sysbio/syw021
# for details of the typical graphical model)
bmv = model(sigma)     # link sigma param w/ BM model

# With a fully specified model, a set of monitors, and a set of moves, 
# we can now set up the MCMC algorithm that will sample parameter values in 
# proportion to their posterior probability. The mcmc() function will
# create our MCMC object:
chain = mcmc(bmv, monitors, moves)
chain.burnin(generations=500, tuningInterval=50) # originally 50000 and 500

chain.run(5000) # originally 500000
# !STOP! We have to rename OUTTREEFILE to OUTTREEFILE.trees before running the next line
# R code starts here to rename the file ----------------------------------------
file.rename(file.path(output_folder, "OUTTREEFILE"), 
            file.path(output_folder, "OUTTREEFILE.trees"))
# R code ends here -------------------------------------------------------------

# To summarize the trees sampled from the posterior distribution, 
# RevBayes can summarize the sampled trees by reading in the tree-trace file:
treetrace = readTreeTrace(file = outtr)
treefl <- outsm

# The mapTree() function will summarize the tree samples and write the 
# maximum a posteriori tree to file, we can also summarise trees as MCC 
# (maximum clade credibility) representation 
map = mapTree( file=treefl, treetrace )
mccTree( file="mcc.tre", treetrace )
q() # quits RevBayes

# End of RevBayes code
# resources beyond the official tutorials: 
# - https://ib.berkeley.edu/courses/ib200/labs/12/lab12.pdf
#- https://github.com/revbayes/revbayes_tutorial/blob/master/RB_PhyloComparative_BM_Tutorial/scripts/continuous_burst.Rev
# - https://github.com/search?p=3&q=dnPhyloBrownianREML&type=Code
#----------------------------------------------------------------

# ...back to R again 

outsumfile <- ape::read.nexus( file.path(output_folder, "OUTSUMFILE"))

ggtree(outsumfile) %<+% names_artefacts_ID_and_period + 
  theme_tree() + 
  geom_tiplab(size=2, 
              aes(label = ID_country,
                  color = ID_country)) +
  geom_treescale() 

# explore some plotting methods

# https://cran.r-project.org/web/packages/MCMCtreeR/vignettes/MCMCtree_plot.html
library(MCMCtreeR, quietly = TRUE, warn.conflicts = FALSE)

outsumfile.rb <- file.path(output_folder, "OUTSUMFILE")

MCMC.tree.plot(analysis.type='revbayes',
               directory.files=outsumfile.rb, 
               cex.tips=0.33,
               plot.type='phylogram', 
               lwd.bar=2,
               add.time.scale=FALSE,
               node.method='bar', 
               col.age='navy')

MCMC.tree.plot(
  directory.files=outsumfile.rb,
  analysis.type = "revbayes",
  cex.tips = 0.2,
  plot.type = "phylogram",
  lwd.bar = 2,
  node.method = "node.length",
  cex.labels = 0.5,
  add.time.scale=FALSE)


phy <- ape::read.tree(file.path(output_folder, "OUTTREEFILE.trees"))

# write it out to a format that crsl4/PhyloNetworks.jl can read
# cf. https://crsl4.github.io/PhyloNetworks.jl/latest/man/inputdata/
ape::write.tree(phy, file.path(output_folder, "phy.trees"))

ape::write.nexus(phy, file.path(output_folder, "phy.nexus"))

# I can't see any tree branches when the 
phangorn::densiTree(phy, type = "phylogram", cex = 0.5)

plot(phangorn::maxCladeCred(phy),  cex = 0.5)
plot(consensus(phy))

# Looks promising:
# https://revbayes.github.io/tutorials/intro/revgadgets.html
library(RevGadgets)

outtree.rb <- readTrees(outsumfile.rb)
ape::write.tree(outtree.rb[[1]][[1]]@phylo, file.path(output_folder, "out.trees"))

outtree.rb1 <- readTrees( file.path(output_folder, "out.trees"))

plot(outtree.rb1[[1]][[1]]@phylo)

# create the plot of the rooted tree
plot <- plotTree(tree = outtree.rb,
                 # label nodes the with posterior probabilities
                 # node_labels = "posterior", 
                 # offset the node labels from the nodes
                 node_labels_offset = 0.005,
                 # make tree lines more narrow
                 line_width = 0.5,
                 # italicize tip labels 
                 tip_labels_italics = TRUE)

# add scale bar to the tree and plot with ggtree
library(ggtree)
plot + geom_treescale(x = -0.35, y = -1)

#----------------------------------------------------------------------------
# PhyloNetworks

# prepare RevBayes output for input into PhyloNetworks 

phy <- ape::read.tree(file.path(output_folder, "OUTTREEFILE.trees"))

# write it out to a format that crsl4/PhyloNetworks.jl can read
# cf. https://crsl4.github.io/PhyloNetworks.jl/latest/man/inputdata/
# these are our 1000s of trees
ape::write.tree(phy, file.path(output_folder, "phy.trees"))

# this is our single MAP tree
library(RevGadgets)
outtree.rb <- readTrees(file.path(output_folder, "OUTSUMFILE"))
ape::write.tree(outtree.rb[[1]][[1]]@phylo, file.path(output_folder, "out.trees"))

This is the output from densiTree:

image

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.