PyGenePlexus

A Python package of the GenePlexus analysis pipeline.

The GenePlexus paper
The repository for reproducing the experiments
The webserver
Documentation
Data

Quick start

Installation

Install the GenePlexus package via pip.

pip install geneplexus

Run the GenePlexus pipeline

Example script

See example/example_run.py for example usage of the API.

Command-line interface

geneplexus --input_file example/input_genes.txt --output_dir example_result
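The same invocation can also be assembled from Python, which may be convenient when scripting several runs. The sketch below only builds the argument list; `build_geneplexus_command` is a hypothetical helper, not part of the package, and it uses only flags that appear in the `geneplexus --help` output.

```python
import shlex


def build_geneplexus_command(input_file, output_dir, network="STRING",
                             feature="Embedding", gsc="GO", overwrite=False):
    """Assemble the argument list for one `geneplexus` CLI run.

    Hypothetical helper: the defaults mirror those shown by `geneplexus --help`.
    """
    cmd = [
        "geneplexus",
        "--input_file", str(input_file),
        "--output_dir", str(output_dir),
        "--network", network,
        "--feature", feature,
        "--gsc", gsc,
    ]
    if overwrite:
        # Only add the flag when requested; it takes no value.
        cmd.append("--overwrite")
    return cmd


cmd = build_geneplexus_command("example/input_genes.txt", "example_result")
# Print a copy-pastable shell command for inspection.
print(shlex.join(cmd))
```

To actually launch the run, the list could be passed to `subprocess.run(cmd, check=True)`, assuming the package has been installed with pip.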

Full CLI options (view them with geneplexus --help)

Run the GenePlexus pipeline on an input gene list.

optional arguments:
  -h, --help            show this help message and exit
  -i , --input_file     Input gene list (.txt) file (one gene per line). (default: None)
  -d , --gene_list_delimiter
                        Delimiter used in the gene list. Use 'newline' if the genes are separated
                        by new lines, and use 'tab' if the genes are separated by tabs. Other
                        generic separators are also supported, e.g. ', '. (default: newline)
  -n , --network        Network to use. {format_choices(config.ALL_NETWORKS)} (default: STRING)
  -f , --feature        Types of feature to use. The choices are: {Adjacency, Embedding,
                        Influence} (default: Embedding)
  -g , --gsc            Geneset collection used to generate negatives and the model similarities.
                        The choices are: {GO, DisGeNet} (default: GO)
  -s , --small_edgelist_num_nodes
                        Number of nodes in the small edgelist. (default: 50)
  -dd , --data_dir      Directory in which the data are stored, if set to None, then use the
                        default data directory ~/.data/geneplexus (default: None)
  -od , --output_dir    Output directory with respect to the repo root directory. (default:
                        result/)
  -l , --log_level      Logging level. The choices are: {CRITICAL, ERROR, WARNING, INFO, DEBUG}
                        (default: INFO)
  -q, --quiet           Suppress log messages (same as setting log_level to CRITICAL). (default:
                        False)
  -z, --zip-output      If set, then compress the output directory into a Zip file. (default:
                        False)
  --clear-data          Clear data directory and exit. (default: False)
  --overwrite           Overwrite existing result directory if set. (default: False)
  --skip-mdl-sim        Skip model similarity computation. This computation is not yet available
                        when using custom networks due to the lack of pretrained models for
                        comparison. (default: False)
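To illustrate the --gene_list_delimiter option described in the help text, here is a minimal sketch of how the three kinds of value ('newline', 'tab', or a literal separator) might be interpreted. `split_gene_list` is a hypothetical function written for this example; it is not the package's own parser.

```python
def split_gene_list(text, delimiter="newline"):
    """Split raw gene-list text into gene IDs, dropping blank entries.

    Hypothetical sketch: 'newline' and 'tab' are treated as keywords,
    and any other value is used as a literal separator string.
    """
    keywords = {"newline": "\n", "tab": "\t"}
    separator = keywords.get(delimiter, delimiter)
    # Strip surrounding whitespace so trailing newlines or spaces
    # do not produce empty or padded gene IDs.
    return [gene.strip() for gene in text.split(separator) if gene.strip()]


print(split_gene_list("BRCA1\nTP53\n"))          # newline-separated file
print(split_gene_list("BRCA1\tTP53", "tab"))     # tab-separated file
print(split_gene_list("BRCA1, TP53", ", "))      # generic separator
```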

Dev

Installation

Install the PyGenePlexus package in editable mode with dev dependencies:

pip install -e ."[dev]"

Testing

Run the default test suite

pytest test/

By default, test data is cached, so runs after the first do not exercise the data download. To force a fresh download, pass the --cache-clear option:

pytest test/ --cache-clear

Building Documentation

Install doc dependencies: pip install -r docs/requirements.txt
Build:

cd docs
make html

Open the docs: open build/html/index.html

CLI option for setting up custom network and GSC files

Currently, users who want to run GenePlexus with a custom network or GSC must first set up the required custom files using the geneplexus.custom module before they can run the GenePlexus pipeline through the CLI.

The goal is to add a CLI option that calls the necessary geneplexus.custom functions to set up these files, eliminating the need to prepare them manually.

Working notes

--custom option -> enables preprocessing custom network/gsc data
- The preprocessing run log is also saved to ${data_dir}/custom_logs/${net}_${feature}_${gsc}.log
  - Network stats: num_nodes, num_edges
  - GSC stats: num_genesets, med_size, avg_size, std_size, max_size, min_size
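The statistics listed for the log could be computed along the following lines. This is a sketch under stated assumptions, not the planned implementation: the edgelist is assumed to have two node IDs per line (optionally followed by a weight) and to be undirected, the GSC is assumed to be already loaded as a plain mapping from geneset ID to a list of genes, and the standard deviation is taken as the population value. Both function names are hypothetical.

```python
import statistics


def network_stats(edge_lines):
    """Count distinct nodes and undirected edges in edgelist lines."""
    nodes = set()
    edges = set()
    for line in edge_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        u, v = fields[0], fields[1]
        nodes.update((u, v))
        # frozenset makes (u, v) and (v, u) count as the same edge.
        edges.add(frozenset((u, v)))
    return {"num_nodes": len(nodes), "num_edges": len(edges)}


def gsc_stats(genesets):
    """Summarize geneset sizes; expects at least one geneset."""
    sizes = [len(set(genes)) for genes in genesets.values()]
    return {
        "num_genesets": len(sizes),
        "med_size": statistics.median(sizes),
        "avg_size": statistics.mean(sizes),
        "std_size": statistics.pstdev(sizes),  # population std (assumption)
        "max_size": max(sizes),
        "min_size": min(sizes),
    }


print(network_stats(["A B 0.5", "B C 0.9", "B A 0.5"]))
print(gsc_stats({"set1": ["A", "B"], "set2": ["A", "B", "C", "D"]}))
```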
Required files
- Edgelist_xxx.edg (custom network)
- GSCOriginal_xxx.json (custom gsc)
Set up custom network and gsc
- custom.edgelist_to_node -> NodeOrder_${net}.txt
- custom.edgelist_to_matrix -> Data_${feature}_${net}.npy
- custom.subset_gsc_to_network -> GSC_${gsc}_${net}_GoodSets.json, GSC_${gsc}_${net}_universe.txt
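The file naming scheme in these notes can be captured in a small helper, which a --custom option might use to check inputs before preprocessing. This is a sketch only: the function names are hypothetical, and it assumes that the `xxx` placeholders in the required file names stand for the network and GSC names respectively, and that the matrix file uses the same network name as the other outputs.

```python
from pathlib import Path


def required_input_names(net, gsc):
    """File names the user must supply (assumes xxx = net / gsc name)."""
    return [f"Edgelist_{net}.edg", f"GSCOriginal_{gsc}.json"]


def derived_output_names(net, feature, gsc):
    """File names the notes say the setup steps should produce."""
    return {
        "edgelist_to_node": [f"NodeOrder_{net}.txt"],
        "edgelist_to_matrix": [f"Data_{feature}_{net}.npy"],
        "subset_gsc_to_network": [
            f"GSC_{gsc}_{net}_GoodSets.json",
            f"GSC_{gsc}_{net}_universe.txt",
        ],
    }


def missing_required_files(data_dir, net, gsc):
    """Return the required input files not found in data_dir."""
    data_dir = Path(data_dir)
    return [
        name
        for name in required_input_names(net, gsc)
        if not (data_dir / name).is_file()
    ]


print(derived_output_names("MyNet", "Embedding", "MyGSC"))
```

Reporting the missing inputs up front would let the CLI fail early with a clear message instead of partway through preprocessing.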
