Git Product home page Git Product logo

pyseer's Introduction

pyseer

SEER, reimplemented in python by Marco Galardini and John Lees

pyseer --phenotypes phenotypes.tsv --kmers kmers.gz --distances structure.tsv --min-af 0.01 --max-af 0.99 --cpu 15 --filter-pvalue 1E-8

Run tests Documentation Status Anaconda package

Motivation

Kmers-based GWAS analysis is particularly well suited for bacterial samples, given their high genetic variability. This approach has been implemented by Lees, Vehkala et al., in the form of the SEER software.

The reimplementation presented here should be consistent with the current version of the C++ seer (though we do not guarantee this for all possible cases).

In this version, as well as all the original features, many new features (input types, association models and output parsing) have been implemented. See the documentation and paper for full details.

Citations

Unitigs and elastic net preprint: Lees, John A., Tien Mai, T., et al. Improved inference and prediction of bacterial genotype-phenotype associations using pangenome-spanning regressions. bioRxiv 852426 (2019) doi: 10.1101/852426

pyseer and LMM implementation paper: Lees, John A., Galardini, M., et al. pyseer: a comprehensive tool for microbial pangenome-wide association studies. Bioinformatics 34:4310โ€“4312 (2018). doi: 10.1093/bioinformatics/bty539

Original SEER implementation paper: Lees, John A., et al. Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes. Nature communications 7:12797 (2016). doi: 10.1038/ncomms12797

Documentation

Full documentation is available at readthedocs.

You can also build the docs locally (requires sphinx) by typing:

cd docs/
make html

Prerequisites

Between parenthesis the versions the script was tested against:

  • python 3+ (3.6.6)
  • numpy (1.15.2)
  • scipy (1.1.0)
  • pandas (0.23.4)
  • scikit-learn (0.20.0)
  • statsmodels (0.9.0)
  • pysam (0.15.1)
  • glmnet_python (commit 946b65c)
  • DendroPy (4.4.0)

If you would like to use the scree_plot_pyseer script you will also need to have matplotlib installed. If you would like to use the scripts to map and annotate kmers, you will also need bwa, bedtools, bedops and pybedtools installed.

Installation

The easiest way to install pyseer and its dependencies is through conda::

conda install pyseer

If you need conda, download miniconda and add the necessary channels::

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

pyseer can also be installed through pip; download this repository (or one of the releases), cd into it, followed by:

python -m pip install .

If you want multithreading make sure that you are using a version 3 python interpreter:

python3 -m pip install .

If you want the next pre-release, just clone/download this repository and run:

python pyseer-runner.py

Copyright

Copyright 2017 EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.