GATAI

Overview

GATAI is a Python library built upon setga, designed for extracting genes that play a significant role in development. It uses transcriptomic data spanning multiple developmental stages, together with the age of each gene (for more information on how to obtain gene ages, see GenEra).

The project features an algorithm designed to identify genes contributing to the observed transcriptome age index (TAI) pattern in developmental gene expression data. GATAI aims to identify a subset of genes that, if removed from the dataset, would significantly reduce the presence of the pattern. By employing a multi-objective genetic algorithm, it maximizes the removal of the TAI pattern while minimizing the number of removed genes.
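To make the idea of "removing the pattern" concrete, here is a minimal sketch of the TAI computation using its standard definition: for each stage, the mean gene age weighted by expression. The function name `tai_profile` and the toy data are illustrative assumptions, not part of GATAI's API.

```python
def tai_profile(ages, expression):
    """Return the TAI of every stage.

    ages: one gene age (phylostratum) per gene.
    expression: one row per gene, each row holding the per-stage expression.
    """
    n_stages = len(expression[0])
    profile = []
    for s in range(n_stages):
        total = sum(row[s] for row in expression)
        weighted = sum(age * row[s] for age, row in zip(ages, expression))
        profile.append(weighted / total)
    return profile


# Toy data (hypothetical): four genes, three stages.
ages = [1, 1, 2, 3]
expression = [
    [10.0, 10.0, 10.0],
    [5.0, 5.0, 5.0],
    [2.0, 8.0, 2.0],  # the only gene whose expression changes over time
    [1.0, 1.0, 1.0],
]

full = tai_profile(ages, expression)  # peaks at the middle stage

# Dropping the single stage-specific gene leaves a flat profile.
keep = [0, 1, 3]
reduced = tai_profile([ages[i] for i in keep], [expression[i] for i in keep])
```

In this toy case one gene carries the whole pattern. On real data the subset is not obvious, which is why a search procedure is needed.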

To determine the significance of the TAI pattern during the optimization, GATAI uses the variance sampling introduced in myTAI's flat-line test by Hajk-Georg Drost.
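The idea behind variance sampling can be sketched as follows: shuffle the gene ages among the genes, recompute the TAI profile, and record its variance across stages. Repeating this gives a reference distribution for the observed variance. The function names, the toy data, and the use of a plain empirical fraction as a p-value are assumptions made for illustration. They are not necessarily how myTAI or GATAI implement the test.

```python
import random
import statistics


def tai_profile(ages, expression):
    """TAI per stage: expression-weighted mean gene age."""
    n_stages = len(expression[0])
    return [
        sum(age * row[s] for age, row in zip(ages, expression))
        / sum(row[s] for row in expression)
        for s in range(n_stages)
    ]


def sample_variances(ages, expression, n_permutations=200, seed=0):
    """Variances of the TAI profile under random reassignment of gene ages."""
    rng = random.Random(seed)
    shuffled = list(ages)
    variances = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        variances.append(statistics.variance(tai_profile(shuffled, expression)))
    return variances


# Toy data (hypothetical): six genes, three stages.
ages = [1, 1, 2, 3, 2, 1]
expression = [
    [10.0, 10.0, 10.0],
    [5.0, 5.0, 5.0],
    [2.0, 8.0, 2.0],
    [1.0, 1.0, 1.0],
    [3.0, 2.0, 4.0],
    [6.0, 7.0, 6.0],
]

observed = statistics.variance(tai_profile(ages, expression))
null_variances = sample_variances(ages, expression)

# Simplified empirical p-value: share of permuted variances at least as large.
empirical_p = sum(v >= observed for v in null_variances) / len(null_variances)
```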

The algorithm uses the DEAP (Distributed Evolutionary Algorithms in Python) library, which provides a flexible framework for implementing genetic algorithms. It offers various selection methods, mutation operators, and other genetic operators to evolve populations of candidate solutions.
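The trade-off between the two objectives (remaining pattern strength versus number of removed genes) is commonly handled through Pareto dominance. The sketch below is written in plain Python rather than with DEAP, and it does not claim to reproduce GATAI's selection scheme. The tuples and their values are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one.

    Each solution is a tuple (pattern_variance, n_removed_genes);
    both objectives are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions if other != s)
    ]


# Hypothetical candidates: (remaining pattern variance, genes removed).
candidates = [(0.90, 0), (0.40, 5), (0.45, 8), (0.10, 20), (0.10, 25)]
front = pareto_front(candidates)
# (0.45, 8) loses to (0.40, 5); (0.10, 25) loses to (0.10, 20).
```

A multi-objective genetic algorithm typically returns such a front, leaving the final choice between a weaker pattern and a smaller gene set to the user or to an additional criterion.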

Additionally, to strengthen the search and avoid being trapped in local optima, the algorithm employs an island model. It maintains multiple subpopulations, or "islands," that evolve independently. Periodic migration of individuals between islands shares genetic information and keeps the algorithm from converging prematurely to suboptimal solutions, which improves its exploration of the solution space.
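The island model described above can be sketched in a few lines of plain Python. This is a toy illustration only: it uses a single objective (the number of ones in a bit string), binary tournament selection, bit-flip mutation, and ring migration. None of the names or parameter values are taken from GATAI, which builds on DEAP and optimizes two objectives.

```python
import random


def evolve_island(population, fitness, rng, mutation_rate=0.05):
    """One generation: binary tournament selection, then bit-flip mutation."""
    next_generation = []
    for _ in range(len(population)):
        a, b = rng.sample(population, 2)
        parent = a if fitness(a) >= fitness(b) else b
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in parent]
        next_generation.append(child)
    return next_generation


def migrate_ring(islands, fitness, n_migrants=1):
    """Copy each island's best individuals over the worst of the next island."""
    emigrants = [
        sorted(island, key=fitness, reverse=True)[:n_migrants] for island in islands
    ]
    for i, migrants in enumerate(emigrants):
        target = islands[(i + 1) % len(islands)]
        target.sort(key=fitness)  # worst individuals first
        target[:n_migrants] = [list(m) for m in migrants]


def run_islands(n_islands=4, island_size=20, genome_length=30,
                generations=60, migrate_every=10, seed=0):
    rng = random.Random(seed)
    fitness = sum  # toy objective: count of ones in the bit string
    islands = [
        [[rng.randint(0, 1) for _ in range(genome_length)] for _ in range(island_size)]
        for _ in range(n_islands)
    ]
    for generation in range(1, generations + 1):
        # Islands evolve independently of each other ...
        islands = [evolve_island(island, fitness, rng) for island in islands]
        # ... and exchange individuals only every few generations.
        if generation % migrate_every == 0:
            migrate_ring(islands, fitness)
    return max((ind for island in islands for ind in island), key=fitness)


best = run_islands()
```

The migration interval controls the balance: frequent migration makes the islands behave like one large population, while rare migration lets each island explore a different region before sharing its findings.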

Features

  • Gene Extraction: Automatically identifies and extracts genes significant to development from transcriptomic data.
  • Built on setga: Leveraging the capabilities of SetMiG, a library for extracting a minimal subset which optimizes a given function.

Installation

pip3 install gatai

Citation

Please cite myTAI as well:

> Drost et al. __myTAI: evolutionary transcriptomics with R__. _Bioinformatics_ 2018, 34 (9), 1589-1590. [doi:10.1093](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btx835/4772684)

Usage

The main function of this tool is to identify a small set of genes driving the TAI pattern. It requires a TSV file in which the first column, named "Phylostratum", holds the gene ages, the second column, named "GeneID", holds the gene IDs, and the remaining columns hold the expression values for the respective developmental stages. The stage columns must be sorted chronologically, and replicates must be collapsed.
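As an illustration of the layout described above, the following sketch parses a small tab-separated table and checks the two required column names. The stage names, gene IDs, values, and the helper `read_expression_table` are hypothetical. Parsing gene ages as floats is also an assumption.

```python
import csv
import io

# Hypothetical input: three genes, three chronologically ordered stages.
EXAMPLE_TSV = (
    "Phylostratum\tGeneID\tstage_1\tstage_2\tstage_3\n"
    "1\tgene_a\t10.0\t10.0\t10.0\n"
    "2\tgene_b\t2.0\t8.0\t2.0\n"
    "3\tgene_c\t1.0\t1.0\t1.0\n"
)


def read_expression_table(handle):
    """Parse the table and verify the two mandatory leading columns."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[:2] != ["Phylostratum", "GeneID"]:
        raise ValueError("first two columns must be 'Phylostratum' and 'GeneID'")
    stages = header[2:]
    if not stages:
        raise ValueError("at least one expression column is required")
    ages, gene_ids, expression = [], [], []
    for row in reader:
        ages.append(float(row[0]))
        gene_ids.append(row[1])
        expression.append([float(value) for value in row[2:]])
    return stages, ages, gene_ids, expression


stages, ages, gene_ids, expression = read_expression_table(io.StringIO(EXAMPLE_TSV))
```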

To identify the genes driving the TAI pattern, run:

gatai run_minimizer input_data output_folder

A text file with the identified genes will be stored in the output folder.

To also store the run statistics, run:

gatai run_minimizer input_data output_folder --save_stats

This saves a summary of the run, including the elapsed time, the number of generations, and the number of extracted genes. The pickled logbook, the best solutions for every generation, and the final population are stored as well.

If you have precomputed sampled variances saved in a file, with one value per line, you can run:

gatai run_minimizer input_data output_folder --variances variances_file
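A file of newline-separated values, as expected by the `--variances` option above, can be written and read back with a few lines of Python. The helper names and the example values are hypothetical. Whether GATAI accepts a trailing newline or a particular number format is an assumption that should be checked against the tool itself.

```python
import tempfile
from pathlib import Path


def write_variances(path, variances):
    """Write one variance per line."""
    Path(path).write_text("\n".join(str(v) for v in variances) + "\n")


def read_variances(path):
    """Read the values back, skipping empty lines."""
    lines = Path(path).read_text().splitlines()
    return [float(line) for line in lines if line.strip()]


# Hypothetical precomputed values.
variances = [0.0123, 0.0087, 0.0151]

with tempfile.TemporaryDirectory() as folder:
    variances_file = Path(folder) / "variances_file"
    write_variances(variances_file, variances)
    restored = read_variances(variances_file)
```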

If your dataset is single-cell, you can run:

gatai run_minimizer input_data output_folder --single_cell

This runs the version that works with the expression matrix as a sparse matrix. However, because TAI has not been tested on single-cell data, we do not recommend drawing any conclusions from the identified genes.

Contributing

Contributions to this project are welcome. If you have any ideas for improvements, new features, or bug fixes, please submit a pull request. For major changes, please open an issue to discuss the proposed modifications.

License

This project is licensed under the MIT License. Feel free to use and modify the code according to the terms of this license.
