Automated Biological Evidence Generation In Drug-Discovery

AnyBURL

The directory java contains source files for the AnyBURL algorithm with the original license preserved. Some changes have been applied to the source to address bugs and other small issues.

The Java code must be compiled before running the example scripts here. To compile the code, ensure that you have Java installed (we recommend OpenJDK v14 and Apache Maven v3.9.4).

To compile the Java code:

$ make build-java

Python

We recommend installing the code in a Python virtualenv. Once you have a virtualenv set up, the code can be installed with

pip install -e '.[dev]'

and tests can be run with tox

tox

and linters can be run with the shortcut

make lint

Running the example: Parkinson Disease

Train AnyBURL using the graph provided in data/triples.txt

anyburl-train data

Results are written to the results directory.
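
Before training, it can be useful to sanity-check the input graph. The sketch below counts how often each edge type occurs. It assumes that data/triples.txt holds one tab-separated `head relation tail` triple per line, which is the usual AnyBURL input layout but is not confirmed by this README. The function name `count_relations` and the two sample triples are hypothetical, modelled on the node and edge labels that appear in the example chain later in this document.

```python
from collections import Counter


def count_relations(lines):
    """Count edge types in an iterable of 'head<TAB>relation<TAB>tail' lines.

    Assumption: three tab-separated fields per line (not confirmed by the README).
    """
    counts = Counter()
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {number}: expected 3 tab-separated fields, got {len(fields)}"
            )
        _head, relation, _tail = fields
        counts[relation] += 1
    return counts


# Illustrative triples only; the real file is data/triples.txt.
sample = [
    "methixene_COMPOUND\tCOMPOUND_inhibits_GENE\thtr2c_GENE\n",
    "parkinson_disease_DISEASE\tDISEASE_associates_GENE\thtr2c_GENE\n",
]
relation_counts = count_relations(sample)
for relation, count in relation_counts.most_common():
    print(relation, count)
```

To run this against the real graph, pass an open file handle instead of `sample`, for example `count_relations(open("data/triples.txt"))`.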

Generate rules from the trained AnyBURL artefacts in results

anyburl-predict data

Further results are written to the results directory.

Filter the rules and create Evidence

Run this command to create a list of evidence chains for "Parkinson disease" treatments (defined in data/parkinson_disease_predicted_treatments.txt). The output is written to results/evidence-chains.jsonl.

healx-chains \
    data \
    results \
    results/predict-1000 \
    results/predict-explanation \
    --predictions-filter-file data/parkinson-disease-filter/predictions.txt \
    --explanations-filter-file data/parkinson-disease-filter/prioritised-edge-types.txt

Results are written to the evidence chains file results/evidence-chains.jsonl in JSONL format. Each line contains one generated "chain" prediction with the following structure (pretty-printed here for readability):

{
  "prediction": "methixene_COMPOUND",
  "prediction_score": 0.08924107199497017,
  "start_node": "methixene_COMPOUND",
  "end_node": "parkinson_disease_DISEASE",
  "metapath": [
    {
      "label": "COMPOUND_inhibits_GENE",
      "reversed": false
    },
    {
      "label": "DISEASE_associates_GENE",
      "reversed": true
    }
  ],
  "path_score": 0.005949453701297636,
  "path": [
    "methixene_COMPOUND",
    "htr2c_GENE",
    "parkinson_disease_DISEASE"
  ]
}

We can expect to find thousands of chains in this file.
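
As a minimal sketch of how these records might be consumed, the snippet below parses JSONL lines and renders each chain as a readable string by interleaving the `path` nodes with the `metapath` edge labels. The helper names `load_chains` and `render_chain` are hypothetical and not part of this package. It also assumes that `"reversed": true` means the edge is traversed against its stored direction, which the example record suggests but the README does not state.

```python
import json


def load_chains(lines):
    """Parse an iterable of JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def render_chain(chain):
    """Interleave path nodes with metapath edge labels."""
    nodes = chain["path"]
    edges = chain["metapath"]
    if len(nodes) != len(edges) + 1:
        raise ValueError("expected exactly one more node than edges")
    parts = [nodes[0]]
    for edge, node in zip(edges, nodes[1:]):
        # Assumption: a reversed edge points from the next node back to the previous one.
        if edge["reversed"]:
            arrow = f"<-[{edge['label']}]-"
        else:
            arrow = f"-[{edge['label']}]->"
        parts.extend([arrow, node])
    return " ".join(parts)


# The example record from this README, collapsed onto a single JSONL line.
example_line = json.dumps({
    "prediction": "methixene_COMPOUND",
    "prediction_score": 0.08924107199497017,
    "start_node": "methixene_COMPOUND",
    "end_node": "parkinson_disease_DISEASE",
    "metapath": [
        {"label": "COMPOUND_inhibits_GENE", "reversed": False},
        {"label": "DISEASE_associates_GENE", "reversed": True},
    ],
    "path_score": 0.005949453701297636,
    "path": ["methixene_COMPOUND", "htr2c_GENE", "parkinson_disease_DISEASE"],
})

chains = load_chains([example_line])
rendered = render_chain(chains[0])
print(rendered)
```

For the real output, pass an open handle on results/evidence-chains.jsonl to `load_chains`. Reading line by line matters here because a JSONL file is a sequence of JSON documents, not one JSON array.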

Filter the chains for Parkinson Disease

Run this command (after creating the evidence chains file) to filter chains by gene and pathway for "Parkinson disease".

healx-filter \
    results/evidence-chains.jsonl \
    data/parkinson-disease-filter/genes.txt \
    data/parkinson-disease-filter/pathways.txt \
    data/parkinson-disease-filter/predictions-short-list.txt \
    data/parkinson-disease-filter/prioritised-edge-types.txt \
    --filtered-evidence-chains-file filtered-evidence-chains.txt

This produces a filtered evidence chains file in text format, filtered-evidence-chains.txt, with one text chain per line. An example line from this file is

AMANTADINE inhibits CHRNA3 participates NEUROACTIVE_LIGAND-RECEPTOR_INTERACTION involves CABERGOLINE in_trial_for PARKINSON_DISEASE
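
A text chain like the one above can be split back into its individual steps. The sketch below assumes that tokens are separated by whitespace and alternate strictly between node names and relation names, as they do in the example line. Whether that holds for every line in the file is not confirmed by this README. The function name `parse_text_chain` is hypothetical.

```python
def parse_text_chain(line):
    """Split a text chain into (source, relation, target) steps.

    Assumption: whitespace-separated tokens alternating node, relation, node, ...
    """
    tokens = line.split()
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        raise ValueError("expected an odd number of tokens, at least 3")
    nodes = tokens[0::2]      # tokens at even positions are nodes
    relations = tokens[1::2]  # tokens at odd positions are relations
    return list(zip(nodes[:-1], relations, nodes[1:]))


example = (
    "AMANTADINE inhibits CHRNA3 participates "
    "NEUROACTIVE_LIGAND-RECEPTOR_INTERACTION involves "
    "CABERGOLINE in_trial_for PARKINSON_DISEASE"
)
steps = parse_text_chain(example)
for source, relation, target in steps:
    print(f"{source} --{relation}--> {target}")
```

Working with steps rather than raw strings makes it straightforward to, for example, group chains by the gene or pathway they pass through.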
