
egem's People

Contributors

aaditis, fanela, marcdimeo

Forkers

dwgoblue aaditis

egem's Issues

Fix medium conditions

Describe the bug
@ScottCampit did not add some medium components because few cell lines were grown in those medium conditions and the amount of glucose in the medium was similar to that of another medium condition. It is now apparent that these assumptions do not hold, so the medium conditions need to be fixed.

Steps to take

  • Fix medium conditions for those that point to either RPMI or DMEM
  • Add new DMEM conditions that include:
    • (+) gln
    • (-) gln
    • (++) glc
    • (+) glc
    • (-) glc
    • (+) pyr
    • (-) pyr

Complete end of summer checklist

To do:

  • clean up the code into manageable snippets (4-5hrs)
  • Turn all code that is not already a function into functions. Clean up all inefficient parts of your code. Format your code to the PEP8 standard.
  • generalize the code so that it works with any dataset for gene expression, proteomics, whatever (1-2hrs)
  • Same as above, but in addition to making it a function, have it so that the script automates several processes, including mapping gene symbols, etc.
  • vectorize the code (1-2 hrs)
  • There are tons of instances where you're dynamically updating variables. This is computationally inefficient. Instead, we should use vectorized implementations via numpy and pandas.
  • package them into different scripts (1-2hrs)
  • Categorize them, and make different python scripts. We're gonna make this a library. For examples, look at the sriram-lab/MetOncoFit project. I want it to look like that.
  • improve the docstring documentation (3-4hrs)
  • All your functions should have well-written docstrings, so that a developer who sees your code can read the description and get what's going on immediately. Add comments in more appropriate places.
  • improve the wiki documentation (> 1hr)
  • Write the same description for your functions in the GitHub Wiki site. The point of this is to make it clear to users what each function does, how it does it, and the outputs.
  • re-organize the directory structure/files (>1hr)
  • Files are everywhere. Clean it up.
  • merge your branch to master (>1hr)
  • We need to merge the finished product to master
  • Finally, you should write up your results more formally in the manuscript, as if we were going to publish it soon. (1-3 days)
  • Read Shen et al., 2019 and format your section (Methods, Results, Discussion) as though it was going to be included in the paper.
  • Image files should be .svg file format, dpi = 300
  • I'll want to go over at least 3 revisions with you
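
To illustrate the vectorization item in the checklist above, here is a minimal sketch. It is not taken from the eGEM code. The row-wise z-score task, the function names, and the toy expression table are all assumptions chosen for the example. It contrasts a loop that updates one cell at a time with an equivalent numpy/pandas version.

```python
import numpy as np
import pandas as pd


def zscore_loop(df):
    """Row-wise z-score, updating one cell at a time (the slow pattern)."""
    out = df.copy().astype(float)
    for gene in df.index:
        row = df.loc[gene].astype(float)
        mean = row.mean()
        std = row.std(ddof=0)
        for col in df.columns:
            out.loc[gene, col] = (row[col] - mean) / std
    return out


def zscore_vectorized(df):
    """Same computation on the whole array at once."""
    values = df.to_numpy(dtype=float)
    mean = values.mean(axis=1, keepdims=True)  # one mean per row
    std = values.std(axis=1, keepdims=True)    # population std (ddof=0)
    return pd.DataFrame((values - mean) / std,
                        index=df.index, columns=df.columns)


# Hypothetical toy data: rows are genes, columns are cell lines.
expr = pd.DataFrame(
    [[1.0, 2.0, 3.0], [2.0, 4.0, 9.0]],
    index=["GENE_A", "GENE_B"],
    columns=["cell_1", "cell_2", "cell_3"],
)

looped = zscore_loop(expr)
vectorized = zscore_vectorized(expr)
```

Both functions should give the same table. The vectorized version avoids the per-cell `.loc` assignments, which is usually where the time goes on large datasets.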

metabolic_sensitivity.m dynamic range

Things to do:

  • Capture epsilons for each reaction of interest with the highest dynamic range
  • Set epsilons for each reaction to that value
  • Obtain values and evaluate results
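
The steps above could be sketched as follows. This is written in Python for illustration only, since metabolic_sensitivity.m itself is MATLAB. It assumes that the fluxes from an epsilon sweep are stored as an array of shape (epsilons, reactions, conditions), and that "dynamic range" means the maximum minus the minimum flux across conditions. Neither assumption is confirmed by the issue, and `best_epsilons` is a hypothetical name.

```python
import numpy as np


def best_epsilons(epsilons, flux):
    """Pick, per reaction, the epsilon whose fluxes span the widest range.

    epsilons: 1-D sequence of tested epsilon values, length n_eps.
    flux:     array of shape (n_eps, n_rxn, n_conditions).
    Dynamic range is assumed to be max - min across conditions.
    """
    epsilons = np.asarray(epsilons, dtype=float)
    flux = np.asarray(flux, dtype=float)
    dyn_range = flux.max(axis=2) - flux.min(axis=2)  # (n_eps, n_rxn)
    best_idx = dyn_range.argmax(axis=0)              # (n_rxn,)
    rxn_idx = np.arange(flux.shape[1])
    return epsilons[best_idx], dyn_range[best_idx, rxn_idx]


# Hypothetical sweep: 3 epsilons, 2 reactions, 2 medium conditions.
epsilons = [1e-3, 1e-2, 1e-1]
flux = [
    [[0.0, 1.0], [0.0, 0.5]],
    [[0.0, 3.0], [0.0, 0.2]],
    [[0.0, 2.0], [0.0, 4.0]],
]
best_eps, best_range = best_epsilons(epsilons, flux)
```

The returned `best_eps` array can then be used to set the epsilon for each reaction before rerunning the analysis.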

Issue with the SRA case in metabolic_sensitivity

Describe the bug
The reduced cost line in metabolic sensitivity returns an output error. The value that I want stored (which in some cases is 0) is returned as an empty array. This causes a size error.

To Reproduce
Run the code as written in the scampit branch

Expected behavior
I would expect that the empty value will be filled in as a 0 if the reduced cost were in fact 0.
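
A minimal sketch of that expected behavior is shown below. It is in Python with numpy rather than MATLAB, and the helper name `reduced_cost_or_zero` is hypothetical. It normalizes whatever the solver returns to a single float, so an empty result becomes 0 instead of causing a size mismatch downstream.

```python
import numpy as np


def reduced_cost_or_zero(value):
    """Return a scalar reduced cost, mapping an empty result to 0.0.

    Assumption: an empty array means the reduced cost was 0, which is
    what the issue expects. Check this before relying on it.
    """
    arr = np.atleast_1d(np.asarray(value, dtype=float)).ravel()
    if arr.size == 0:
        return 0.0
    return float(arr[0])


from_empty = reduced_cost_or_zero([])
from_scalar = reduced_cost_or_zero(2.5)
from_nested = reduced_cost_or_zero(np.array([[-1.0]]))
```

Because every value is reduced to one float, the results can be stored in a preallocated array without shape errors.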

FBA case in metabolic_sensitivity.m does not output correct metabolic fluxes

I wrote code that outputs the metabolic fluxes for various medium conditions in three separate cases. One case uses FBA as the optimization method for calculating the metabolic fluxes. However, the output does not match the results from Shen et al., 2019.

To test:

  • Use the acetylation model in the same analysis
    • If there is an error, then there is an issue with your code
  • Use the new eGEM in the Shen et al. analysis
    • If there is an error, then there is an issue with your model

Add PTM nodes into metabolic network

To add these metabolic nodes, I need to look for literature evidence about

  • the histone PTM, and
  • the readers and writers that influence the histone markers

Lauren:

  • H3K27

Scott:

  • H3K4
  • H3K9

Later...

  • H3K56
  • H3K79

Histone reader, writer, and eraser expression and individual histone marker expression

Now that you have extracted more genes that correspond to histone writers and erasers, there are several more tasks we can do for more in-depth analyses:

  • Extract synonyms for each writer and eraser. This will increase the number of hits in your analysis
    • Store as a json file and read into Python as a dictionary. This is the first step and will help you make this into a function
    • Write your code as a function that reads in either txt or json files and outputs the visualization as an svg or png file. Make it so that users can specify whether they want an svg or png output.
  • Get reader genes and see if there is a correlation. My guess is that there will be a (weak) correlation
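
The JSON and output-format steps above might look like the sketch below. It uses only the standard library and leaves the plotting itself out. The function names, the JSON layout ({canonical symbol: [synonyms]}), and the file names are assumptions made for illustration.

```python
import json
from pathlib import Path

ALLOWED_FORMATS = {"svg", "png"}


def load_synonyms(path):
    """Read {symbol: [synonym, ...]} from JSON and build a reverse lookup.

    The returned dict maps every upper-cased symbol or synonym to its
    canonical symbol, so hits on any alias count toward the same gene.
    """
    with open(path) as handle:
        synonyms = json.load(handle)
    lookup = {}
    for symbol, aliases in synonyms.items():
        lookup[symbol.upper()] = symbol
        for alias in aliases:
            lookup[alias.upper()] = symbol
    return lookup


def output_path(stem, fmt="svg"):
    """Validate the user-chosen image format and return the file path."""
    fmt = fmt.lower().lstrip(".")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    return Path(f"{stem}.{fmt}")
```

For example, a synonyms file containing `{"KDM1A": ["LSD1"]}` would let a dataset that lists LSD1 be matched to KDM1A, and `output_path("heatmap", "png")` would give the path to pass to the plotting call.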

Metabolic sensitivity heatmaps

ISSUE WITH SOME METABOLIC REACTIONS
Some metabolic reactions that are expected to output a flux value return 0 in the metabolic_sensitivity.m code.

To Reproduce
Steps to reproduce the behavior:

  1. Go to line 145 of run_all.m
  2. Run the metabolic_sensitivity function
  3. It will return errors for the excel sheets, but also the resulting heatmap does not contain metabolic fluxes that we expect.

Fix recon1 rxngenemat

The RxnGeneMat does not accurately map genes and reactions, probably because old identifiers are used. Need to fix this now.
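
One possible way to approach this is to rebuild the matrix from the gene-reaction rules and report any identifiers that no longer match the gene list. The sketch below is in Python for illustration and is not the fix used in the repository. It assumes the rules are available as strings like the COBRA `grRules` field and the gene list as in the `genes` field. `build_rxn_gene_mat` is a hypothetical helper, and the gene names are placeholders.

```python
import re
import numpy as np

# A token is any run of characters that is not whitespace or a parenthesis.
TOKEN = re.compile(r"[^\s()]+")


def build_rxn_gene_mat(gr_rules, genes):
    """Build a binary reaction-by-gene matrix from rule strings.

    Returns the matrix plus a sorted list of identifiers that appear in
    the rules but not in `genes` (candidates for outdated identifiers).
    """
    gene_index = {gene: j for j, gene in enumerate(genes)}
    mat = np.zeros((len(gr_rules), len(genes)), dtype=int)
    unmapped = set()
    for i, rule in enumerate(gr_rules):
        for token in TOKEN.findall(rule):
            if token.lower() in {"and", "or"}:
                continue  # skip boolean operators
            j = gene_index.get(token)
            if j is None:
                unmapped.add(token)
            else:
                mat[i, j] = 1
    return mat, sorted(unmapped)


# Placeholder rules and genes, not real Recon1 content.
rules = ["(g1 and g2) or g3", "", "g4"]
genes = ["g1", "g2", "g3"]
rxn_gene_mat, unmapped = build_rxn_gene_mat(rules, genes)
```

The `unmapped` list is the useful part for this issue: any identifier that shows up there is used in a rule but missing from the gene list, which is what would be expected if old identifiers are the cause.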

Make module for seashore-ludlow analysis

Steps for drug sensitivity analysis:

  • Make the code from Shen et al. into a MATLAB function
  • Put the AUC data in the right format in a MATLAB variable
  • Ensure that H3 relative values are formatted correctly
  • Run with bulk methylation genome scale metabolic model unconstrained (without inhibitor gene expression)
  • Analyze the results

Validation steps:

  • Use Le Roy et al data for proteomics instead of H3 relative values from the CCLE database
    • Create a MATLAB variable containing the methylation data

Next steps:

  • Add individual methylation reactions for specific metabolic markers onto the genome-scale model and save as a new metabolic model
  • Run same analysis to see if we can predict drug sensitivity for specific histone markers rather than bulk methylation.
