Git Product home page Git Product logo

dammet's Introduction

install with bioconda

Build Status

DamMet, a full probabilistic model for mapping ancient methylomes


DamMet is probabilistic model for mapping ancient methylomes using sequencing data underlying an ancient specimen. The model is implemented as a two step procedure. The first step recovers a maximum likelihood estimate (MLE) of the position specific deamination rates for methylated and unmethylated cytosine residues. The second step, making use of these deamination rates, returns a MLE of the methylation level in a user-defined genomic window. The software is multithreaded

Installation

DamMet is dependent on htslib and nlopt. Both software will be downloaded and installed with DamMet.

  git clone https://github.com/KHanghoj/DamMet.git
  cd DamMet && make && cd ..

zlib and cmake should be globally installed.

How to run DamMet

DamMet requires a single positional argument (estDEAM or estF). estDEAM will estimate deamination profiles and estF estimates methylation levels.

In addition DamMet also requires three keyword arguments:

  • BAM file (-b)
  • Reference genome (-r)
  • a comma separated list of chromosomes/contigs (-c) or path to file with chromosomes/contigs (-cf).

The two step procedure could look like this:

    ./DamMet estDEAM -b ${BAM} -r ${FASTA} -c ${CHROM} -M ${M} -O ${OUT_PREFIX}
    ./DamMet estF -b ${BAM} -r ${FASTA} -c ${CHROM} -M ${M} -O ${OUT_PREFIX} -N ${NCPG}

Running example

To demonstrate that DamMet works and produces the expected output, we have made a small running example.

git clone https://github.com/KHanghoj/DamMet.git
make -C DamMet 
git clone https://github.com/KHanghoj/DamMet-tutorial.git
cd DamMet-tutorial
bash run.sh

The output plot (result.pdf) of this running example should be identical to result.expected.pdf. The main script in this example ('run.sh') can easily be used as a template to suit any analyses of interest.

Options, Input and Output formats

All available options followed by a description can displayed by running DamMet without any arguments (./DamMet/DamMet).

Special Input formats

  1. -R allows the user to provide a list of read groups (ID) that should be considered individually for estimating deamination rates. Readgroups not present in the input file will be ignored and their sequencing data is not considered for any analyses. The input file should contain a single read group name (ID) per line. If the user does not provide a file to -R all, all sequencing data with be merged into a single read group. The latter is the default in DamMet.
  2. -E allows the user to provide genomic sites that should be excluded. The file show contain a site per line (e.g. chr20 100001). The genomic position should be 1-based.
  3. -e allows the user to provide genomic regions that should be excluded. The file should take the form of a standard BED file.

Output format

Two types of files are produced by DamMet, namely a CHR.READGROUP.deamrates file and a CHR.READGROUP.[BED].F file.

CHR.READGROUP.deamrates

This file contains the MLE of the deamination rates.

  1. Methylation status (Methylated == 0; UnMethylated == 1)
  2. DNA position along a DNA molecule (0-based)
  3. Prime (5-prime == 0; 3-prime == 1)
  4. Deamination rate
CHR.READGROUP.CHROM.BED.F or CHR.READGROUP.CHROM.F

Every row is a BED region or a genomic CpG. Every column output comes with a description.

Simulating sequence data with methylation specific deamination patterns using gargammel.

Along with the publication of DamMet, we also developed a new feature to gargammel that enables the user to simulate ancient DNA sequences with methylation specific deamination patterns. With this new feature, it has become possible to answer questions like, what is the accuracy with the current sequencing effort and what is the optimal trade off between genomic resolution (e.g. genomic window) and sequencing effort.

Citation

https://academic.oup.com/gigascience/article/8/4/giz025/5475519

Troubleshooting

htslib installation issues

For one user installing the following software fixed some issues installing htslib: libbz2-1.0 libbz2-dev libbz2-ocaml libbz2-ocaml-dev liblzma-dev.

dammet's People

Contributors

khanghoj avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.