phyx's Introduction

Note: phyx recently underwent an overhaul, so a simple git pull && make will fail. Instead, follow the instructions under "Problems after updating (git pull)" below.


phyx performs phylogenetic analyses on trees and sequences. See the installation instructions for Linux and Mac, including any dependencies, on the wiki or below.

Authors: Joseph W. Brown*, Joseph F. Walker*, and Stephen A. Smith (* equal contribution)

Citation: Brown, J. W., J. F. Walker, and S. A. Smith; Phyx: phylogenetic tools for unix. Bioinformatics 2017; 33 (12): 1886-1888. doi: 10.1093/bioinformatics/btx063

License: GPL https://www.gnu.org/licenses/gpl-3.0.html

Some of the sequence comparison operations use the very nice edlib library, which is described in this publication: Martin Šošić, Mile Šikić; Edlib: a C/C++ library for fast, exact sequence alignment using edit distance. Bioinformatics 2017 btw753. doi: 10.1093/bioinformatics/btw753.

Documentation

Documentation resides in several locations (all slightly out of date, alas). A pdf manual is available in the doc/ directory. A slightly-less-out-of-date list of the current programs with examples can be found on the wiki. Help for individual programs can be obtained with either PROGRAM -h or (if installed, see below) man PROGRAM. See a brief overview here.

Still, there are a whack of programs, so it can be difficult to remember the name of which program does what. Here is a quick reference:

Program      Short description
pxaa2cdn     produce a codon alignment from an AA alignment and unaligned nucleotides
pxbdfit      diversification model inference
pxbdsim      a birth-death simulator
pxboot       sequence alignment resampling (bootstrap or jackknife)
pxbp         prints out bipartitions that make up the tree
pxcat        an alignment concatenator
pxclsq       clean sites based on missing or ambiguous data
pxcltr       general tree cleaner
pxcolt       collapse poorly-supported edges
pxcomp       a compositional homogeneity test
pxconsq      a consensus sequence constructor for an alignment
pxcontrates  a Brownian and OU estimator
pxfqfilt     a fastq filter given a mean quality
pxlog        an MCMC log manipulator/concatenator
pxlssq       information about seqs in a file (like ls but for an alignment file)
pxlstr       information about trees in a file (like ls but for a tree file)
pxmono       monophyly tester
pxmrca       information about an MRCA
pxmrcacut    an MRCA cutter
pxmrcaname   an MRCA label maker
pxnj         neighbour-joining tree inference
pxnni        an NNI changer
pxnw         Needleman-Wunsch alignment
pxpoly       a polytomy sampler that generates a binary tree
pxrecode     a sequence alignment recoder
pxrevcomp    a reverse complementor
pxrls        taxon relabelling for sequences
pxrlt        taxon relabelling for trees
pxrmk        remove two-degree nodes from a tree
pxrms        pruning seqs (like rm but for seqs)
pxrmt        pruning trees (like rm but for trees)
pxrr         rerooting and unrooting trees
pxs2fa       convert an alignment to fasta format
pxs2nex      convert an alignment to nexus format
pxs2phy      convert an alignment to phylip format
pxseqgen     sequence simulation program
pxssort      sequence sorter
pxssplit     split alignment into N individual sequence files
pxsstat      multinomial alignment test statistics
pxstrec      a state reconstructor
pxsw         Smith-Waterman alignment
pxt2new      convert a tree to newick format
pxt2nex      convert a tree to vanilla Nexus format
pxtcol       annotate tree to colour edges
pxtcomb      tree combiner
pxtgen       exhaustive tree topology generator
pxtlate      translate nucleotide sequences into amino acids
pxtrt        extract an induced subtree from a larger tree
pxtscale     tree rescaling
pxupgma      UPGMA tree inference
pxvcf2fa     convert vcf file to fasta alignment

Problems after updating (git pull)

If you have been using phyx and things are not working after a recent pull, this is because of a change in configuration. Please do the following in the src directory to remedy the situation:

make distclean
autoreconf -fi
./configure
make
make check
sudo make install

Installation instructions

phyx requires a few dependencies. Since installation of these dependencies differs on Linux vs. Mac OSX, we've separated the instructions below.

Mac install

Mac has become increasingly difficult to support at the command line: the location of, and standards for, compilation tools change with every OS version. Distributing compiled programs is therefore very difficult. Furthermore, Mac now defaults to clang as its C/C++ compiler, which does not support OpenMP. For Mac OSX 10.12, we have found that you can install either with clang, using Homebrew (the simple instructions), or with a fresh installation of gcc from http://hpc.sourceforge.net/ (the advanced instructions). Instructions for both are below; choose one, not both (probably the simple one).

Binary install with Homebrew

  1. Install the Homebrew package manager:

     /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    
  2. Install the Brewsci phyx package:

     brew install brewsci/bio/phyx
    

Build from source with Homebrew

  1. Install the Homebrew package manager:

     /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    
  2. Install dependencies from homebrew:

     brew install git cmake nlopt armadillo
    
  3. On to phyx. First, clone the repository (if you haven't already):

     git clone https://github.com/FePhyFoFum/phyx.git
    
  4. Install phyx

     cd phyx/src
     autoconf
     ./configure
     make
     make check
    

If you want to install it so it is available anywhere in your system, do:

    sudo make install

Install with HPC GCC (advanced instructions)

  1. Install gcc and gfortran. Download gcc-6.2-bin.tar.gz or more recent from http://hpc.sourceforge.net/. Install with:

     sudo tar -xvf gcc-6.2-bin.tar.gz -C /
    
  2. Install autoconf from http://ftp.gnu.org/gnu/autoconf/. Get autoconf-latest.tar.gz, then:

     tar -xzf autoconf-latest.tar.gz
     cd autoconf-2.69
     ./configure --prefix=/usr/local/autoconf-2.69
     make
     sudo make install
     sudo ln -s autoconf-2.69 /usr/local/autoconf
    
  3. On to phyx. First, clone the repository (if you haven't already):

     git clone https://github.com/FePhyFoFum/phyx.git
    
  4. Install cmake and Armadillo. Get cmake from https://cmake.org/download/ (we used https://cmake.org/files/v3.6/cmake-3.6.2-Darwin-x86_64.tar.gz). Get the stable release of Armadillo from the deps directory or from http://arma.sourceforge.net/download.html, and untar it. Double-click CMake.app. Click "Browse source..." and choose the armadillo folder that was created by untarring. Click "Browse build..." and choose the same folder. Click "Configure" and then click "Generate". Then, in the terminal, browse to that armadillo folder and type:

     make
     sudo make install
    
  5. Install nlopt. Get nlopt from the deps directory or go to http://ab-initio.mit.edu/wiki/index.php/NLopt#Download_and_installation and download the latest (probably nlopt-2.4.2.tar.gz). Untar and browse in the terminal to that directory:

     ./configure --without-octave --without-matlab
     make
     sudo make install
    
  6. Compile phyx. Now you can go to the src directory of phyx and type:

     autoconf
     ./configure
     make
     make check
     sudo make install
    

and all the programs should compile without issue.

Linux install

These instructions work for most Ubuntu versions as well as Debian.

  1. Install general dependencies:

     sudo apt-get install git autotools-dev autoconf automake cmake libtool liblapack-dev libatlas-cpp-0.6-dev libnlopt-cxx-dev
    
  2. Clone the phyx repo (if you haven't already):

     git clone https://github.com/FePhyFoFum/phyx.git
    
  3. Install armadillo dependency

Note: Armadillo can be installed with apt-get, but version >= 5.2 is required:

    sudo apt-get install libarmadillo-dev

On Debian it was necessary to use backports:

    sudo apt-get -t jessie-backports install libarmadillo-dev

If that is not possible, compile the provided code:

    cd phyx/deps
    tar -xvzf armadillo-7.400.2.tgz
    cd armadillo-7.400.2
    ./configure
    make
    sudo make install

  4. Finally, install phyx:

     cd phyx/src
     autoconf
     ./configure
     make
     make check
    

If you want to install it so it is available anywhere in your system, do:

    sudo make install

phyx's People

Contributors

blackrim, chinchliff, jfwalker, jonchang, josephwb, smoe

phyx's Issues

Line endings

The perennial problem: different flavours of carriage returns. Gah.

I've noticed this problem in using pxlssq on a file with mac line endings.

Not aware of a general purpose way to deal with this, but it must exist.

Low priority.

Programs hang when no input given

So I see a number of the programs that can process stdin (the majority) hang waiting for input when none is given. We should fix this.

When an argument must be given (e.g. mrca file for pxmrcaname), we can use:

if (argc == 1) {
    cout << "No arguments given." << endl; // and possibly give usage
    exit(0);
}

Easy. The trouble is with programs that can get by with only stdin. If nothing is provided, it waits for keyboard input (which is something that will not be useful).

I am looking into finding the most portable solution.
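
One candidate, sketched below, is to ask whether stdin is attached to a terminal before trying to read from it. This relies on isatty() from <unistd.h>, which is POSIX, so it would cover Linux and Mac only. Both function names are hypothetical and not taken from the phyx source.

```cpp
#include <unistd.h>  // POSIX: isatty(), STDIN_FILENO

// Hypothetical helper: true when stdin is an interactive terminal,
// i.e. nothing has been piped or redirected into the program.
bool stdin_is_terminal() {
    return isatty(STDIN_FILENO) != 0;
}

// Hypothetical decision rule: give up (and print usage) only when no
// input file was named AND stdin is a terminal, because waiting for
// keyboard input is never what the user wants here.
bool should_print_usage_and_exit(bool have_input_file, bool stdin_is_tty) {
    return !have_input_file && stdin_is_tty;
}
```

A program would then test should_print_usage_and_exit(fileset, stdin_is_terminal()) right after parsing its options, where fileset stands for whatever flag records that an input file was given. Piped input (cat foo.tre | pxlstr) is unaffected, since stdin is then not a terminal.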

"Argument list too long"

Involves programs that can take wildcards e.g. pxcat:

./pxcat -s *.phy -o foo.fa
-bash: ./pxcat: Argument list too long

The solution seems to be to pass a file which itself lists the input files. Unfortunately, this has the same problem (for some OSs e.g. Mac):

find *.phy > flist.txt
-bash: /usr/bin/find: Argument list too long

However, the following seems to work for both linux/mac:

for x in *.phy; do echo "$x" >> flist.txt; done

So flist.txt can be passed instead of the wildcard, and parsed when it needs to be.

make install

Should have both make install and make uninstall targets.

should pxbdsim produce trees with just a single terminal

Under some parameter combinations, "trees" such as:

taxon_1;

are produced.

In one sense I think this is actually fine: start with 1 lineage, end with 1 lineage (with potentially others arising and going extinct). The problem is that 1) tree readers cannot deal with such a "tree", and 2) the branch length information is lost (i.e. how old is the tree?).

So, what is the desired behaviour here? Bump the minimum tip complement to 2 (or 3)? Or keep things as they are and just process the trees when doing further analysis? For example, what if the most probable tree size for some parameter combination is 1 tip? Implementing a minimum > 1 would give misleading results.

Unittests

Noted in review:

Tests

  • Unfortunately, neither the publication nor the manual make mention of any unit tests or indeed tests of any kind. Furthermore, looking at the repository, there is also no evidence of tests of any kind.
  • I am afraid that I cannot recommend this software for publication without demonstration and documentation of tests. There is no reason for academic software to be held to lower standards than any other kinds of software. This software will be used by people who do not know about the internals of the code, and the results of THOSE people's work, in turn, will be cited/used by other people who may not know about the programs in the first place. At each stage there is trust in that certain critical internals have been given due diligence and care by the previous stage. This trust is essential to our community. When it comes to software development, the trust is in the form that the programs are doing what the authors claim that the programs are doing. Validating that this is indeed the case (and demonstrating that validation) is the authors' responsibility to the community, especially if they are aiming to gain citable academic publication out of this.
  • Tests should, at the very least, demonstrate that the programs are doing what they are supposed to do given some minimal canonical input. Testing C++ programs dealing with complex data like this is difficult, but certainly not only possible, but dedicatedly pursued by responsible authors (e.g., NCL, phycas).
  • I recommend testing input/output (e.g., by round-tripping data files and ensuring content remains as expected) explicitly and separately from manipulations and operations.
  • With the latter, simple examples from the manual/documentation/paper for each one will do to start.
  • In all cases, if semantic-checking of output is too challenging, a simple pattern matching will do (the latter is fragile, in that small tweaks to the programs' writers in terms of spacing, etc., will result in tests failing, but this can easily if tediously be fixed).
  • More advanced testing would be nice, e.g., for incorrect input and so on. But I recognize that this would be a lot of work and that authors may want to develop these later.

Handle internal quotes in node labels

Reading quoted labels works whether labels start/stop with single or double quotes. However, the Newick convention of using two consecutive single quotes to represent one internal single quote is not yet supported, e.g.

'Swainson''s Hawk' == "Swainson's Hawk"

Currently, the first single quote after Swainson would be interpreted as the end of the label, and so the tree is likely to die a terrible death.

This is a simple tweak, but putting here so we do not forget about it. OpenTree trees, for example, may contain such labels.
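
A minimal sketch of the tweak is below. It assumes the reader holds the Newick string in a std::string and tracks a position index; the function name read_quoted_label is hypothetical and does not correspond to anything in the phyx tree reader.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a single-quoted Newick label that starts
// at s[pos]. Two consecutive single quotes inside the label stand for
// one literal single quote. On return, pos is just past the closing quote.
std::string read_quoted_label(const std::string& s, std::size_t& pos) {
    if (pos >= s.size() || s[pos] != '\'') {
        throw std::invalid_argument("label must start with a single quote");
    }
    std::string label;
    std::size_t i = pos + 1;
    while (i < s.size()) {
        if (s[i] == '\'') {
            if (i + 1 < s.size() && s[i + 1] == '\'') {
                label += '\'';  // escaped quote: keep one, skip both
                i += 2;
                continue;
            }
            pos = i + 1;        // genuine closing quote
            return label;
        }
        label += s[i];
        ++i;
    }
    throw std::invalid_argument("unterminated quoted label");
}
```

The key point is the one-character lookahead: a quote only ends the label when it is not immediately followed by another quote.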

Install broken with gcc 6?

Updated gcc on my mac to v6.1.0 and now sudo make install does not work:

Frinkatron-TNG:src josephwb$ sudo make install
install -m 0755 px* /usr/bin
install: /usr/bin/pxaatocdn: Operation not permitted
make: *** [install] Error 71

A simple work around is to simply change:

install -m 0755 px* /usr/bin

to:

cp px* /usr/bin

but that isn't very satisfying. Perhaps @jfwalker should try doing this on his system to rule out some weird setup of my own.

It doesn't seem that debian yet supports gccv6.1.0 (I am actually using gccv4.9.2?). We should try some other flavour of linux (ubuntu) to see if it is indeed gcc to blame. I will reinstall gccv5 on my mac to check things there.

No example file given for how to supply a file-list of tips to pxrmt

Just a small documentation suggestion.

It appears from the Phyxed_manual (2016-12-02) that the file formatting of the list of tips one can supply to pxrmt is undocumented. No example tips file is given in phyx/example_files/pxrmt_example.

Now obviously you & I might think it obvious that taxon names should be supplied in a file one per line c.f. grep -f <PATTERN FILE> and other unix classics:

s2
s4
s5

(and I infer from empirical testing that this is what it should be)

But other users might reasonably try and supply it a taxon file as a csv or tsv list

s2,s4,s5

or say

s2|s4|s5

or other such schemes. So I think you should formally provide an example tips-to-be-deleted.txt file for pxrmt

Need (portable) more precise random number seed

Currently using:

srand((unsigned)time(NULL));

The precision of time() is 1 second. For programs that execute quickly (all of them!), the same seed can end up being used multiple times. Alternatives are available, but seem to be platform-specific.
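
One portable possibility, assuming a C++11 compiler is acceptable, is to seed from the standard <random> and <chrono> facilities rather than time(). The sketch below mixes std::random_device with a high-resolution clock reading, since random_device is not guaranteed to be non-deterministic on every platform. The function name make_seeded_rng is hypothetical.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical helper: build a Mersenne Twister seeded from two sources,
// the platform entropy source and a sub-second clock reading.
std::mt19937 make_seeded_rng() {
    std::random_device rd;
    auto ticks = std::chrono::high_resolution_clock::now()
                     .time_since_epoch()
                     .count();
    std::uint64_t t = static_cast<std::uint64_t>(ticks);
    std::seed_seq seq{
        static_cast<std::uint32_t>(rd()),
        static_cast<std::uint32_t>(t & 0xffffffffu),  // low clock bits
        static_cast<std::uint32_t>(t >> 32)           // high clock bits
    };
    return std::mt19937(seq);
}
```

Programs would keep accepting a user-supplied seed for reproducibility and fall back to something like this only when none is given.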

pxcat seems slow

Potential ways to make faster:

  1. eliminate unnecessary object creation/destruction
  2. two-passes over files?

No tree returned from pxrr if root inferred already the same

Identified by @ningwang83. For the following tree:

(Caloperdix_oculeus:0.01930412,(((Ithaginis_cruentus:0.00205884,Haematortyx_sanguiniceps:0.00627251):0.00202574,(Lerwa_lerwa:0.00764893,(Syrmaticus_soemmerringii:0.00192670,(Pucrasia_macrolopha:0.00604448,(Tetrastes_bonasia:0.00582967,Lagopus_muta:0.00392594):0.00800837):0.00183335,((Rhizothera_longirostris_b:0.00000001,Rhizothera_longirostris_m:0.00000001):0.00273287,Tragopan_temminckii:0.00095258):0.00096312,Lophura_edwardsi:0.00218318,Perdix_dauurica:0.00582756,Meleagris_ocellata:0.00984718,Tetraophasis_obscurus:0.01196948):0.00219832):0.00188930,(Tetraogallus_himalayensis:0.01051263,Gallus_gallus:0.01080512):0.00280650):0.00393994,(Pavo_muticus:0.00568134,Rhynchortyx_cinctus:0.01438741):0.00228128):0.00380033,Argusianus_argus:0.00000001);

and the specified outgroups "-g Rhynchortyx_cinctus,Caloperdix_oculeus", pxrr reports:

./pxrr -t ning.tre -g Rhynchortyx_cinctus,Caloperdix_oculeus
you asked to root at the current root
the outgroup taxa don't exist in this tree

(The second error, "the outgroup taxa don't exist in this tree", is a different problem in that the first condition causes a false to be passed).
A problem here is that the original tree is not rooted at all:

./pxlstr -t ning.tre
tree #: 0
rooted: false
binary: false
nterminal: 20
ninternal: 12
branch lengths: true
rttipvar: NA
treelength: 0.157657
ultrametric: false
rootheight: NA

so the error is occurring earlier on.

pxnni gets Floating point exception

Running pxnni multiple times sometimes generates:

./pxnni -t ultra_100.tre 
Floating point exception

This is unrelated to the upgrade in random numbers I am doing. I'll look at this when I get done that.

Not sure if this is a problem involving root, etc. For reference, "ultra_100.tre" is:

((((s1:0.3603553431,s2:0.3603553431):0.8968782862,s3:1.257233629)foo:0.4207546592,(((s4:0.1190332191,s5:0.1190332191):0.02544171746,s6:0.1444749366):0.7647640019,s7:0.9092389385):0.76874935):0.06349158892,((s8:0.3106889265,s9:0.3106889265):0.1031508249,s10:0.4138397514):1.327640126);

Problem with -march=native on mac

Apparently only works for linux:

"-march=native causes the compiler to auto-detect the architecture of the build computer. At present, this feature is only supported on Linux, and not all architectures are recognized. If the auto-detect is unsuccessful the option has no effect."

Except, compilation on mac dies horribly with "no such instruction" errors, e.g.:

/var/folders/qt/c5jl7pyd5rs91869hs20skl40000gn/T//ccqKwTM3.s:208:no such instruction: `vzeroupper'

Removing -march=native from the Makefile works, but it seems that configure should handle this.

Better handling of #includes

There are some tools that can do this (e.g. for clang). Better than mucking with it manually. I cannot find an automated way to do this with netbeans yet.

Should probably decide on guidelines for #includes as well. Seems to be two (vocal) camps: 1) #includes are only in cpp files (not headers), or 2) headers should contain enough #includes that they stand alone (minimizes unnecessary includes in code files).

Not terribly urgent; more of an aesthetic issue, really.

pxbd time

There is an error in the time calculation for the birth-death calculator that produces negative branch lengths.

Master help?

@ningwang83 has mentioned that remembering what each program does is difficult (there are so many!). I wonder if we could use a "master help" program, like:

pxh

which prints out a very short description of each program, and maybe an example command.

pxstrec doesn't build on ubuntu 13.10

the linker isn't happy.

cody@chromium:~/phylo/phyx/src$ make pxstrec
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"utils.d" -MT"utils.d" -o  "utils.o" "utils.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"node.d" -MT"node.d" -o  "node.o" "node.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"tree.d" -MT"tree.d" -o  "tree.o" "tree.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"tree_reader.d" -MT"tree_reader.d" -o  "tree_reader.o" "tree_reader.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"tree_utils.d" -MT"tree_utils.d" -o  "tree_utils.o" "tree_utils.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"sequence.d" -MT"sequence.d" -o  "sequence.o" "sequence.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"seq_reader.d" -MT"seq_reader.d" -o  "seq_reader.o" "seq_reader.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"seq_utils.d" -MT"seq_utils.d" -o  "seq_utils.o" "seq_utils.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"seq_models.d" -MT"seq_models.d" -o  "seq_models.o" "seq_models.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"pairwise_alignment.d" -MT"pairwise_alignment.d" -o  "pairwise_alignment.o" "pairwise_alignment.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"rate_model.d" -MT"rate_model.d" -o  "rate_model.o" "rate_model.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"state_reconstructor.d" -MT"state_reconstructor.d" -o  "state_reconstructor.o" "state_reconstructor.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"optimize_state_reconstructor_nlopt.d" -MT"optimize_state_reconstructor_nlopt.d" -o  "optimize_state_reconstructor_nlopt.o" "optimize_state_reconstructor_nlopt.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"optimize_cont_models_nlopt.d" -MT"optimize_cont_models_nlopt.d" -o  "optimize_cont_models_nlopt.o" "optimize_cont_models_nlopt.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"bd_sim.d" -MT"bd_sim.d" -o  "bd_sim.o" "bd_sim.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"superdouble.d" -MT"superdouble.d" -o  "superdouble.o" "superdouble.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"cont_models.d" -MT"cont_models.d" -o  "cont_models.o" "cont_models.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"boot.d" -MT"boot.d" -o  "boot.o" "boot.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"recode.d" -MT"recode.d" -o  "recode.o" "recode.cpp"
g++ -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x -c -fmessage-length=0 -MMD -MP -MF"main_strec.d" -MT"main_strec.d" -o  "main_strec.o" "main_strec.cpp"
building pxstrec
g++ -o "pxstrec" -O3 -ffast-math -ftree-vectorize -fopenmp -g -std=c++0x main_strec.o ./utils.o ./node.o ./tree.o ./tree_reader.o ./tree_utils.o ./sequence.o ./seq_reader.o ./seq_utils.o ./seq_models.o ./pairwise_alignment.o ./rate_model.o ./state_reconstructor.o ./optimize_state_reconstructor_nlopt.o ./optimize_cont_models_nlopt.o ./bd_sim.o ./superdouble.o ./cont_models.o ./boot.o ./recode.o -llapack -lblas -lpthread -lm -lnlopt_cxx -larmadillo 
main_strec.o: In function `main':
/home/cody/phylo/phyx/src/main_strec.cpp:577: undefined reference to `optimize_sr_periods_nlopt(std::vector<RateModel, std::allocator<RateModel> >*, StateReconstructor*, std::vector<arma::Mat<double>, std::allocator<arma::Mat<double> > >*, int)'
./rate_model.o: In function `RateModel::setup_fortran_P(arma::Mat<double>&, double, bool)':
/home/cody/phylo/phyx/src/rate_model.cpp:280: undefined reference to `wrapdgpadm_'
./state_reconstructor.o: In function `StateReconstructor::set_periods_model()':
/home/cody/phylo/phyx/src/state_reconstructor.cpp:102: undefined reference to `BranchSegment::getPeriod()'
/home/cody/phylo/phyx/src/state_reconstructor.cpp:102: undefined reference to `BranchSegment::setModel(RateModel*)'
./state_reconstructor.o: In function `StateReconstructor::set_tree(Tree*)':
/home/cody/phylo/phyx/src/state_reconstructor.cpp:77: undefined reference to `BranchSegment::BranchSegment(double, int)'
/home/cody/phylo/phyx/src/state_reconstructor.cpp:87: undefined reference to `BranchSegment::BranchSegment(double, int)'
./state_reconstructor.o: In function `StateReconstructor::conditionals_periods(Node&)':
/home/cody/phylo/phyx/src/state_reconstructor.cpp:189: undefined reference to `BranchSegment::getModel()'
/home/cody/phylo/phyx/src/state_reconstructor.cpp:195: undefined reference to `BranchSegment::getDuration()'
/home/cody/phylo/phyx/src/state_reconstructor.cpp:198: undefined reference to `BranchSegment::getDuration()'
collect2: error: ld returned 1 exit status
make: *** [pxstrec] Error 1

Not finding gfortran on mac

Yep, another mac issue. Make assumes gfortran lives in:

/usr/bin/gfortran

I installed it with homebrew, so it is actually here:

/usr/local/bin/gfortran

Need consistent naming of programs

Most sequence-targeted programs begin with s, while most tree-targeted begin with t, but this is by no means standard. Program names should be (I think):

  1. Intuitive
  2. Short(ish)

Obviously 2 should not impede 1.

A couple suggestions:

  1. pxclsq -> pxscln
  2. pxconcat -> pxscat (or just pxcat)
  3. pxseqgen -> pxsgen

Generate all NNI neighbours

Extend pxnni to have the ability to generate all possible NNI rearranged trees (of which there are 2(n-3) possible for an unrooted tree).
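
The count follows from an unrooted, fully binary tree with n tips having n - 3 internal edges, each of which allows 2 alternative rearrangements. As a quick sanity check for any future implementation, the expected number of output trees can be computed as in this sketch (the function name is hypothetical, and trees with polytomies are not covered):

```cpp
#include <cstddef>

// Number of NNI neighbours of an unrooted, fully binary tree with
// n_tips tips: (n_tips - 3) internal edges times 2 rearrangements each.
// Trees with fewer than 4 tips have no internal edge, hence no neighbours.
std::size_t nni_neighbour_count(std::size_t n_tips) {
    return n_tips < 4 ? 0 : 2 * (n_tips - 3);
}
```

For example, a 5-tip tree has 4 neighbours and a 100-tip tree has 194.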

pxbdsim produces tips with zero branch lengths with time stop criterion

For example:

$ ./pxbdsim -t 10 -b 0.1 -d 0
(taxon_1:0,taxon_2:0);
$ ./pxbdsim -t 10 -b 0.1 -d 0
(taxon_1:7.5340461594881471541,(taxon_2:0,taxon_3:0):7.5340461594881471541);

Indeed, it always seems to happen, even with large trees:

(taxon_1:31.352878359893182392,(((taxon_2:0.26678816672157523726,taxon_3:0.26678816672157523726):19.075515338417730504,taxon_4:19.342303505139305742):11.552387747133771256,((((taxon_5:12.756308022150605552,(taxon_6:9.8056366825922083308,((taxon_7:0.19768636666934469304,taxon_8:0.19768636666934469304):6.2816107891398402785,taxon_9:6.4792971558091849715):3.3263395267830233593):2.9506713395583972215):1.6837827348388003657,(taxon_10:7.3300096055135597339,taxon_11:7.3300096055135597339):7.110081151475846184):13.034317227225852065,(((((taxon_12:9.9862279006124694547,taxon_13:9.9862279006124694547):0.44103860994158594622,taxon_14:10.427266510554055401):2.0491803876133403151,taxon_15:12.476446898167395716):5.1322497972562572954,((((taxon_16:0.70305991070630824424,(taxon_17:0.60800683772441743713,taxon_18:0.60800683772441743713):0.095053072981890807114):2.1349376313264158966,taxon_19:2.8379975420327241409):0.80962871661254354194,(taxon_20:2.92575464881264935,((taxon_21:0.76247717018086547114,taxon_22:0.76247717018086547114):1.6037378620409157293,(taxon_23:2.1658956212991995471,taxon_24:2.1658956212991995471):0.20031941092258165327):0.55953961659086814961):0.7218716098326183328):3.9596570031349642704,taxon_25:7.6072832617802319533):10.001413433643421058):6.7745994161289537772,((taxon_26:0,taxon_27:0):4.9434850035543576041,taxon_28:4.9434850035543576041):19.439811107998249184):3.0911118726626511943):1.9636326781083788262,((((((taxon_29:0.30018634379202779883,taxon_30:0.30018634379202779883):1.1567849919954369398,taxon_31:1.4569713357874647386):11.618320780502720879,(taxon_32:12.019808062740032994,(taxon_33:7.1116984576603456958,(taxon_34:2.584240851696911534,taxon_35:2.584240851696911534):4.5274576059634341618):4.9081096050796872987):1.0554840535501526233):1.6820447357970422786,((taxon_36:4.1824073471853608908,taxon_37:4.1824073471853608908):9.0883312703756899964,((((taxon_38:5.6500574957537210707,(taxon_39:1.5594588167775427223,taxon_40:1.5594588167775427223):4.0905986789761783484):1.1895276598668047541,taxon_41:6.8395851556205258248):5.2503201658691125431,(((taxon_42:0.5442605826532300739,taxon_43:0.5442605826532300739):6.7736634034175438046,(taxon_44:1.9184219058324032403,taxon_45:1.9184219058324032403):5.3995020802383706382):2.8597185074104132241,taxon_46:10.177642493481187103):1.9122628280084512653):0.57054482721289190295,(taxon_47:3.1324879657662307864,taxon_48:3.1324879657662307864):9.5279621829362994845):0.61028846885852061632):1.4865982345261770092):2.4417939097061349685,((taxon_49:1.7204286277044857911,taxon_50:1.7204286277044857911):10.462666196567731447,taxon_51:12.183094824272217238):5.0160359375211456268):6.6258686883240116572,(((taxon_52:4.2448349791995525493,taxon_53:4.2448349791995525493):5.4937043627528296952,((taxon_54:2.1935762882509166616,taxon_55:2.1935762882509166616):0.44929135262226793657,taxon_56:2.6428676408731845981):7.0956717010791976463):1.2053869393888518857,(((taxon_57:1.0770593093759188719,taxon_58:1.0770593093759188719):5.0225742154341688206,(((taxon_59:0.13514207092465824189,taxon_60:0.13514207092465824189):0.84545055801174129329,taxon_61:0.98059262893639953518):4.9291067739562635097,(taxon_62:5.658946120286074688,taxon_63:5.658946120286074688):0.25075328260658835688):0.18993412191742464756):0.71322589912116285404,taxon_64:6.8128594239312505465):4.1310668574099835837):12.881073168776140392):5.6130412122062622871):1.4566505899494401888):0.45818710762010539383);

Since I am in the guts of this now I will see what is going on.

Manual overhaul

Noted in review. I admit I do not like working with a Word document; markdown or latex would be easier. The example setup is a bit clunky and hard to manage, too.

#############################

Manual:

The manual is off to a great start, but needs to be brought in sync with the current state of the repo -- e.g. folder names don't match and some programs appear to be missing. Perhaps it's an out-of-date LaTeX pdf?

The current repo stores examples in ./example_files/ not ./Example/

Manual example folders missing from repo:
pxaatocdn_example missing
pxconsq_example missing
pxs2fa_example missing
pxs2nex_example missing
pxs2phy_example missing
pxvcf2fa_example missing
pxt2new_example missing

Manual example names that mismatch repo folder names:
pxnw to pxnw_example
pxsw to psxw_example
pxtlate to pxtlate_example
pxbp to pxbp_example
pxmrca to pxmrca_example
pxnni to pxnni_example

Missing programs
pxnni
pxtscale

pg9
The program options are out of date when compared to pxrms --help.

pg10
-r List.txt to -f List.txt

pg13
I couldn't figure out what the output from pxmrca meant (although I did finally figure out who KIM, LEE, and THURSTON were). Maybe add more description to the manual and --help output.

pg14
Not sure what pxmrcacut does, but I could not view the output Newick string in FigTree or plot it in R using ape.

pg22
The pxcontrates example gave me nan values when using the provided files

pxbp requires rooted trees?

I tried analyzing all possible trees for 5 taxa:

((A,B),E,(C,D));
((A,B),D,(C,E));
((A,B),C,(D,E));
((A,C),E,(B,D));
((A,C),B,(D,E));
((A,C),D,(B,E));
((A,D),E,(B,C));
((A,D),B,(C,E));
((A,D),C,(B,E));
((A,E),B,(C,D));
((A,E),C,(B,D));
((A,E),D,(B,C));
((B,C),A,(D,E));
((B,D),A,(C,E));
((B,E),A,(C,D));

and got weird results:

pxbp -t all_possible.trees 
15 trees 
6 unique clades found
C D 	FREQ:	0.230769	ICA:	0.0113967
E C 	FREQ:	0.230769	ICA:	0.0113967
E D 	FREQ:	0.230769	ICA:	0.0113967
B D 	FREQ:	0.166667	ICA:	-0.0126821
B E 	FREQ:	0.166667	ICA:	-0.0126821
B C 	FREQ:	0.166667	ICA:	-0.0126821
TSCA: 0.0722364

e.g. the clade (A,B) is missing. I imagine because these are all unrooted? It would be nice if it also worked for unrooted trees.

General check if files exist

Need explicit check if file(s) exist. Currently prints:

ERROR: end of file too soon

if file does not exist, which is misleading.
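
A check along the following lines, run before any parsing starts, would let the programs report the real cause. This is only a sketch: both function names and the wording of the message are hypothetical, and "readable" here simply means that the file could be opened.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: true if the file can be opened for reading.
bool file_is_readable(const std::string& path) {
    std::ifstream in(path.c_str());
    return in.good();
}

// Hypothetical message builder, so that a missing file is reported as
// such instead of as a premature end of file.
std::string missing_file_message(const std::string& path) {
    return "ERROR: cannot open file '" + path + "' (does it exist?)";
}
```

Each main function could call file_is_readable on every input file name right after option parsing and exit with the message if the check fails.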

Some programs segfault when run without arguments

Noted during review:

More informative error messages would help, in addition to suggesting the use of the --help flag. --help arguments from the repo and the manual don't match perfectly (see below).

Provide mac binaries?

Oi.

This would obviously be nice, but a huge headache given, well, mac. The install instructions have worked across versions for us, but obviously not for a reviewer.

I will look into (somehow) producing binaries that are supported across multiple versions, but since I have never been able to do so in the past, I am not optimistic.

pxrr needs better name checker

From @jfwalker: pxrr segfaults if an outgroup is not in the tree.

Specifying a single bad outgroup seems to work fine (real outgroup is "Beta"):

./pxrr -t cluster5535.tre -g Beta123
the outgroup taxa don't exist in this tree

But barfs with 1 good and 1 bad outgroup:

./pxrr -t cluster5535.tre -g Beta123,Spol
Segmentation fault

Here is the example tree ("cluster5535.tre"):

(((((Dimu:34.875000,MJM1652:40.125000):14.750000,DrobinSFB:60.250000):17.500000,(Beta:45.500000,Spol:45.500000):450.750000):10.750000,DrolusSFB:64.250000):30.875000,NepSFB:30.875000);
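
One way to guard against this, sketched below, is to validate every requested name against the tip labels before any rerooting is attempted, and to report exactly which names are missing. The function name missing_names is hypothetical; how the tip labels are collected from the tree is left out, since that depends on the phyx Tree class.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: return the requested outgroup names that are NOT
// among the tree's tip labels. An empty result means all names are valid.
std::vector<std::string> missing_names(
        const std::vector<std::string>& requested,
        const std::set<std::string>& tip_labels) {
    std::vector<std::string> missing;
    for (const std::string& name : requested) {
        if (tip_labels.count(name) == 0) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

With the example above, requesting Beta123,Spol would return just Beta123, so the program could either stop with a precise message or carry on with the valid names only.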
