fricktobias / dbs-pro
DBSpro Analysis
License: MIT License
Currently the path to the construct apparently has to be given relative to the output directory rather than your working directory.
Currently Snakemake treats all inputs/outputs as one globbed list rather than looping over them.
Include options to pass to the Snakefile. Use config in the snakemake module?
Make a dict of the arguments in run.py and pass it to config.
We should have a config file containing all relevant parameters for running the pipeline that are prone to change. For example:
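A minimal sketch of the "arguments to dict" idea, using only the standard library. The option names (`--threads`, `--output-dir`) and the function `build_config` are hypothetical placeholders, not the actual DBS-Pro options; how the resulting dict is handed to Snakemake depends on the Snakemake version and is deliberately left out.

```python
import argparse


def build_config(argv=None):
    """Parse command-line arguments and return them as a plain dict."""
    parser = argparse.ArgumentParser(description="Hypothetical run.py front end")
    # Illustrative options only (assumptions, not the real DBS-Pro interface).
    parser.add_argument("--threads", type=int, default=1)
    parser.add_argument("--output-dir", default="results")
    args = parser.parse_args(argv)
    # vars() exposes the Namespace as a dict; skip options that were left unset.
    return {key: value for key, value in vars(args).items() if value is not None}


if __name__ == "__main__":
    config = build_config(["--threads", "4"])
    print(config)
```

Keeping the parsing in one function that returns a dict means the same values can be passed on as pipeline config or inspected in tests without touching the command line.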
Currently the entire DBS FASTQ is read into memory.
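One possible way around this is to stream the file record by record instead of loading it whole. The sketch below assumes plain four-line FASTQ records (no wrapped sequence lines); `read_fastq` is a hypothetical helper, not an existing DBS-Pro function.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) one record at a time from an open handle."""
    while True:
        # Pull exactly four lines; only one record is held in memory at a time.
        lines = list(islice(handle, 4))
        if not lines:
            return
        if len(lines) < 4:
            raise ValueError("Truncated FASTQ record")
        name, seq, _, qual = (line.rstrip("\n") for line in lines)
        # Drop the leading '@' from the header line.
        yield name[1:], seq, qual


if __name__ == "__main__":
    example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nJJJJ\n")
    for name, seq, qual in read_fastq(example):
        print(name, seq, qual)
```

Because the function is a generator, downstream steps can consume records lazily, so memory use stays roughly constant regardless of file size.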
Instead of individual rules for each case, a general case should be developed.
Use pytest, and run tests covering settings and running from remote directories.
Just an idea I had about how we might want to change the order in our pipeline.
I have found the following issue. For UMIs we cluster them for each ABC target but do not separate on DBS. This could mean that we are merging UMIs that should in fact be separate. My proposal would be to separate all UMIs by ABC and DBS before clustering. This would better represent the actual conditions in the experiment.
I am, however, unsure about the benefits in the end. Possibly this would only be a lot of work for nothing, but I wanted to raise the idea anyway to see what you think.
START. Input = Fastq file
1. Separate for DBS
1.1 Extract DBS
1.2 Cluster DBS
1.3 Correct DBS fastq
2. Separate for ABCs
2.1 Extract ABC-UMI
2.2 Split ABC-UMI by ABC
2.3 Cluster ABCs independently
2.4 Correct ABC fastqs
3. Analysis of corrected DBS and ABC files
END.
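The proposal to separate UMIs by both ABC and DBS before clustering could be sketched roughly as follows. The record layout `(abc, dbs, umi)` and the function `group_umis` are assumptions made for illustration; the clustering step itself is not shown.

```python
from collections import defaultdict


def group_umis(records):
    """Group UMI sequences by their (ABC, DBS) pair."""
    groups = defaultdict(list)
    for abc, dbs, umi in records:
        groups[(abc, dbs)].append(umi)
    return dict(groups)


if __name__ == "__main__":
    # Made-up example records: (ABC name, DBS sequence, UMI sequence).
    records = [
        ("ABC1", "DBS_A", "AACG"),
        ("ABC1", "DBS_B", "AACG"),
        ("ABC1", "DBS_A", "AACT"),
    ]
    for key, umis in group_umis(records).items():
        # Each group would then be clustered on its own.
        print(key, umis)
```

With this grouping, identical UMIs that belong to different DBS sequences end up in different groups and can no longer be merged by the clustering step.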
START. Fastq file
END.
It is too slow...
Provide an example of how to edit and run the pipeline in a linked, publicly available Google Colab Jupyter notebook.
Currently the script takes the names from an input TSV file. Instead we want the names to be taken from the file supplied in the config file.
The functionality of the wrapper script DBS-Pro_automation.sh should be moved to the installed package DBS-Pro.
These modules are currently not included in the environment file
The UMI sequences can be used to identify chimeric sequences by looking for UMIs linked to several different ABC or DBS sequences.
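As a rough illustration of this idea, the sketch below flags every UMI that appears with more than one (ABC, DBS) combination. The record layout and the name `find_chimeric_umis` are hypothetical, and a real implementation would likely also need to account for sequencing errors and for UMIs that collide by chance.

```python
from collections import defaultdict


def find_chimeric_umis(records):
    """Return UMIs that are linked to more than one (ABC, DBS) pair."""
    seen = defaultdict(set)
    for abc, dbs, umi in records:
        seen[umi].add((abc, dbs))
    # Keep only UMIs observed with several distinct ABC/DBS combinations.
    return {umi: sorted(pairs) for umi, pairs in seen.items() if len(pairs) > 1}


if __name__ == "__main__":
    # Made-up example records: (ABC name, DBS sequence, UMI sequence).
    records = [
        ("ABC1", "DBS_A", "AACG"),
        ("ABC2", "DBS_A", "AACG"),
        ("ABC1", "DBS_A", "TTGC"),
    ]
    print(find_chimeric_umis(records))
```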
The ABC file system is currently not that adaptable and it would be nice to be able to have several ABC-sequence files for different setups.
I'd suggest adding functionality for adding a new construct file using dbspro set,
and a separate command for changing which one is used, e.g. a dbspro config
command (or something like that).
Currently the script only runs from the DBS-Pro folder
Fastq is not needed.