Comments (5)
Hi Roshan, thank you for your interest in the tool. You're absolutely right: the germline-filter module could be configured in a more tolerant manner. I'll add this to the list of things to do.
In the meantime, I think all --metadata needs is a comma-separated values file with cell_id and patient_id headers. It could be as simple as this:
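A minimal sketch of such a file, assuming two cells (cell1, cell2) from a single patient (patient1), matching the names used in this thread. The rows and the helper name are illustrative only.

```python
import csv
import io

# Hypothetical rows: two cells that both come from the same patient.
rows = [
    {"cell_id": "cell1", "patient_id": "patient1"},
    {"cell_id": "cell2", "patient_id": "patient1"},
]

def metadata_csv(rows):
    """Return the metadata table as CSV text with cell_id,patient_id headers."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["cell_id", "patient_id"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

metadata_text = metadata_csv(rows)
print(metadata_text, end="")
# cell_id,patient_id
# cell1,patient1
# cell2,patient1
```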
We plan on adding additional documentation in the future, but the basic idea is this: if you have a set of samples and a set of patients those samples came from, you can filter out the variants common to all of the samples from a given patient. So for the above example, if cell1 and cell2 share 10 variants, those variants would be removed by germline-filter, as they are common across all samples of patient1.
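The idea can be sketched in a few lines of Python. This illustrates the concept only and is not cerebra's implementation: variants are represented as hypothetical (chrom, pos, ref, alt) tuples, and the function name is made up for the example.

```python
from collections import defaultdict

def filter_shared_variants(variants_by_cell, patient_by_cell):
    """Drop variants that are present in every cell of the same patient."""
    # Group cell IDs by the patient they came from.
    cells_by_patient = defaultdict(list)
    for cell, patient in patient_by_cell.items():
        cells_by_patient[patient].append(cell)

    filtered = {}
    for patient, cells in cells_by_patient.items():
        # Variants common to all of this patient's cells are treated as germline.
        shared = set.intersection(*(set(variants_by_cell[c]) for c in cells))
        for c in cells:
            filtered[c] = set(variants_by_cell[c]) - shared
    return filtered

patient_by_cell = {"cell1": "patient1", "cell2": "patient1"}
variants_by_cell = {
    "cell1": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "cell2": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")},
}
result = filter_shared_variants(variants_by_cell, patient_by_cell)
```

Here the chr1 variant is shared by both cells of patient1 and is removed, while the cell-specific variants are kept. Note that under this rule a patient with only one cell would have every variant counted as shared.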
Hi Lincoln,
Many thanks for the clarification regarding the metadata option.
I have a few further questions I hope you can clarify about the tool:
- Should the cells VCF file be a joint-called file of all cells, or should each cell be in a separate VCF, with multiple files then passed in via --cells?
- Can the filter for shared somatic variants between the cells be adjusted or disabled?
- The README mentions the framework was designed for scRNA-seq. Are there any specific filters the tool applies to adjust for the inherent technical noise of scRNA sequencing?
Best wishes,
Roshan
Hi Roshan,
- It's probably best to keep VCF files separate. For example, if you have a directory full of VCF files named vcf_dir, then for the "path to germline vcf files directory:" argument you would specify vcf_dir.
- If you don't want to use the filter, you can just run the other two modules (count-mutations and find-aa-mutations) on an unfiltered set of VCF files. In other words, the two subsequent modules do not require VCF files to be germline filtered.
- As of right now we're not doing any filtering for potential technical noise. It's a very interesting problem but is perhaps a bit outside the scope of this package. Right now cerebra just summarizes variants found in a set of VCF files; it doesn't make any judgments as to whether or not those variants are "real".
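For the one-file-per-cell layout, a quick check that every cell in the metadata has a VCF in the directory might look like the sketch below. It assumes files are named <cell_id>.vcf, which is a guess about naming rather than something cerebra is known to require, and the helper name is hypothetical.

```python
from pathlib import Path

def missing_vcfs(vcf_dir, cell_ids):
    """Return the cell IDs that have no <cell_id>.vcf file in vcf_dir."""
    # Assumption: one uncompressed VCF per cell, named after the cell ID.
    present = {p.stem for p in Path(vcf_dir).glob("*.vcf")}
    return sorted(set(cell_ids) - present)
```

Calling missing_vcfs("vcf_dir", ["cell1", "cell2"]) returns an empty list when both files exist.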
Hopefully this is addressed by b020ac8.
This should also be addressed by #74.