Git Product home page Git Product logo

go-off's Introduction

GO-off

This is a Python script for automation of steps for running GORILLA (http://cbl-gorilla.cs.technion.ac.il/) and REVIGO (http://revigo.irb.hr/index.jsp) online tools (with default settings) to perform GO enrichment on datasets of Arabidopsis thaliana gene IDs.

Expected input dataset is a text file (located in the folder with script) containing gene IDs separated by newlines.
    As an example of input data, I provided two datasets (dataset_1 and dataset_2). Reference dataset cannot be replaced.
Expected outputs (saved in foled "output") are:
    text (.txt) file of Arabidopsis gene IDs dataset used for GO analysis (parentheses of the file name shoe the number of IDs)
    two screenshot (.png) images from GORILLA and REVIGO webpages
    excel tables: 
        GORILLA table with calculated enrichment, etc.
        GORILLA gene IDs extracted from GORILLA table GO categories REVIGO table
        REVIGO reduced table (dispensability <= 0.7)
        GO RESULTS = contains REVIGO-reduced GO category IDs and counted numbers (enrichment, ...) 

Important notes:

The geckodriver operates Firefox browser, so you have to install this browser before running the script. Please run the script only with Windows 10. Let me know, in case it is not working on your machine. Keep in mind that the content in folder "output" is overwritten with each iteration.

Comments:

The aim of the last update is to give the user more freedom in selecting and merging user's input datasets. Now there is no limit for the number of merges of user's input files, so the user can create various merging workflow. This script can be run multiple times in a loop but each merging prompt takes always only two files.

This update includes warning messages whenever the "output" folder and its content is going to be overwritten. The user needs to confirm it each time by pressing Enter. This gives a space for quickly placing the content somewhere else to save it, if needed. In the future I am planning to modify the script to provide an extra file with history of the user's merging workflow steps, so the user will get more information about the merging actions that have been done on the output dataset.

Also, I apologize for my spaghetti code... Please ask me, if you do not understand any part of the script.


Eran Eden*, Roy Navon*, Israel Steinfeld, Doron Lipson and Zohar Yakhini. "GOrilla: A Tool For Discovery And Visualization of Enriched GO Terms in Ranked Gene Lists", BMC Bioinformatics 2009, 10:48.
Eran Eden, Doron Lipson, Sivan Yogev, Zohar Yakhini. "Discovering Motifs in Ranked Lists of DNA sequences", PLoS Computational Biology, 3(3):e39, 2007.
Supek F, Bošnjak M, Škunca N, Šmuc T. "REVIGO summarizes and visualizes long lists of Gene Ontology terms", PLoS ONE 2011. doi:10.1371/journal.pone.0021800

go-off's People

Contributors

konetom avatar

Forkers

mkoesters

go-off's Issues

Generalizing GO-off

Hi @konetom ,

Thanks for your work!
I found your script while looking for a way to automatize the Gorilla/REVIGO workflow for an mouse dataset.
I decided to restructure your project a little bit to make it linux compatible, to allow choosing different organisms and implemented a CLI.
I'm also planning to implement some more REVIGO options to set.

I wanted to know if you would accept a PR, in that case I'd also update the docs in the next few days and provide some example on how to use the CLI.
My fork is here https://github.com/MKoesters/GO-off

Best,
Manuel

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.