Git Product home page Git Product logo

mdelta's Introduction

mDELTA: an algorithm for multifuricating Developmental cEll Lineage Tree Alignment.

PyPI - Python Version PyPI license

  • mDELTA is an algorithm for multifuricating Developmental cEll Lineage Tree Alignment. In essence, it compares two rooted, unordered, tip-labeled trees, and finds the best global / local correspondence between the nodes. The mDELTA program is designed for analyzing developmental cell lineage trees reconstructed through single-cell DNA barcoding (such as done by scGESTALT or SMALT, while greater cellular coverage is expected to yield more meaningful mDELTA alignments).

  • Except for dealing with cell lineage trees instead of biological sequences, mDELTA is conceptually similar to sequence alignment. It helps quantify similarity among different lineage trees, disentangle the consensus and variation, find recurrent motifs, and facilitate comparative/evolutionary analyses.

  • Also included in this repository are Python/R scripts for statistical analyses and visualization of mDELTA results, which facilitates their biological interpretation.

  • mDELTA was developed by Jingyu Chen under the supervision of Professor Jian-Rong Yang at the Zhongshan School of Medicine of Sun Yat-Sen University in China.

For details, please visit https://github.com/Chenjy0212/mdelta/blob/main/README2.md/

Readme Card

Quick start

Installation

pip install mdelta

This will install mDELTA and its prerequisites

Running the program

mDELTA.py -h
usage: mDELTA [-h] [-nt NAME2TYPEFILE] [-nt2 NAME2TYPEFILE2] [-sd SCOREDICTFILE] [-t TOP] [-ma MAV] [-mi MIV] [-p PV] [-T TQDM]
              [-n NOTEBOOK] [-P PERM] [-a ALG] [-c CPUS] [-o OUTPUT] [-x DIFF] [-mg MERGE]
              TreeSeqFile TreeSeqFile2

Multifuricating Developmental cEll Lineage Tree Alignment(mDELTA)

positional arguments:
  TreeSeqFile           [path/filename] A text file storing cell lineage tree #1 in newick format. Tips can be labeled by name or
                        cell type. Branch lengths should be removed.
  TreeSeqFile2          [path/filename] A text file storing cell lineage tree #2 in newick format. Tips can be labeled by name or
                        cell type. Branch lengths should be removed.

optional arguments:
  -h, --help            show this help message and exit
  -nt NAME2TYPEFILE, --Name2TypeFile NAME2TYPEFILE
                        [path/filename] List of correspondance between tip name and cell type for cell lineage tree #1.
  -nt2 NAME2TYPEFILE2, --Name2TypeFile2 NAME2TYPEFILE2
                        [path/filename] List of correspondance between tip name and cell type for cell lineage tree #2.
  -sd SCOREDICTFILE, --ScoreDictFile SCOREDICTFILE
                        [path/filename] A comma-delimited text file used to determine similarity scores between cells. If there
                        are exactly three columns, they will be interpreted as (1) the cell (name or type) in Tree #1, (2) the
                        cell in Tree #2, and (3) the similarity score. If otherwise, the first column will be interpreted as the
                        cell (name or type) and the remaining columns as features of the cell (e.g. expression of a gene). The
                        similarity scores will be estimated between all pairs of cells based on the Euclidean distance calculated
                        using all the features. Overrides `-ma` and `-mi`.
  -t TOPN, --top TOPN   [int > 0] Performs local (instead of global) alignment, and output the top TOPNUM local alignments with the
                        highest score (e.g. `-t 10`). In the case of global alignment, this parameter should be omitted.
  -ma MAV, --mav MAV    [float]
  -mi MIV, --miv MIV    [float] Shorthand for a simple matching score scheme, where the matching score between a pair of the same
                        cell types is MAV and all other pairs are MIV. (e.g. `-ma 2 -mi -2`). Overridden by `-sd`.
  -p PV, --pv PV        [float] The score for pruning a tip of the tree (e.g. `-p -2`). Default to -1.
  -T TQDM, --Tqdm TQDM  [0(off) or 1(on)] Toggle for the jupyter notebook environment.
  -n NOTEBOOK, --notebook NOTEBOOK
                        [0(off) or 1(on)] Toggle for the jupyter notebook environment.
  -P PERM, --PERM PERM  [int > 0] Toggle for the statistical significance. For each observed alignment, the aligned trees will be
                        permuted PERM times to generate a null distribution of alignment scores, with which a P value can be
                        calculated for the observed alignment score.
  -a ALG, --Alg ALG     [KM / GA] Use Kuhn-Munkres or Greedy Algorithm to find the optimal alignment score.
  -c CPUS, --CPUs CPUS  [int > 0] Number of threads for multi-processing. Default to 50., it can reach the maximum number of local
                        CPU cores - 1.
  -o OUTPUT, --output OUTPUT
                        [path/filename] Output filename
  -x DIFF, --diff DIFF  [int > 0] Alignment must consist of a minimal 
                        of DIFF% aligned cell pairs that are different from previous(better) local
                        alignments in order to be considered as another new alignment (e.g. `-x 20` means 20%).
  -mg MERGE, --merge MERGE
                        [float] This is the scaling factor for calculating the score of merging an internal node (e.g. -mg -1),
                        which is multiplied by the number of tips of the internal node to be merged. Default to 0.

More details on https://github.com/Chenjy0212/mdelta

Aligning a tree with itself, output top three local alignments

mDELTA.py ExampleFile/tree.nwk ExampleFile/tree.nwk -t 3

โœ๏ธ Authors

    Jingyu Chen (EeWhile) ๐Ÿ‘จ

    Student of SYSU. ๐Ÿซ

    ZHONGSHAN SCHOOL OF MEDICINE,SYSU https://zssom.sysu.edu.cn/zh-hans ๐Ÿ’–

    Undergraduate majoring in computer science, master majoring in bioinformatics. ๐Ÿ‘จโ€๐Ÿ’ป

    Below is the contact information of the author. ๐Ÿ‘€

    Github Stars Bilibili Zhihu Weibo neteasy-mysic douyin instagram
    QQ wechat mail gmail
    sysu

    sysulogo

If you use this project in your research, please cite this project.

@misc{mdelta2022,
    author = {Jingyu Chen},
    title = {mDELTA: Multifuricating Developmental cEll Lineage Tree Alignment},
    year = {2022},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/Chenjy0212/mdelta}},
}

mdelta's People

Contributors

chenjy0212 avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.