Comments (7)
Interesting... is that field new? I haven't seen it before. At a quick glance there are >80K values in the file, though I'm not sure how many unique variants are actually annotated:
$ gunzip -c ClinVarFullRelease_2016-08.xml.gz | grep ModeOfInheritance | wc -l
84328
Here are counts of the values in that field:
$ gunzip -c ClinVarFullRelease_2016-08.xml.gz | grep ModeOfInheritance | sed 's/.*\">//' | sed 's/<\/.*//' | sort | uniq -c
51208 Autosomal dominant inheritance
22387 Autosomal recessive inheritance
951 Autosomal unknown
49 Codominant
127 Mitochondrial inheritance
108 Other
14 Sex-limited autosomal dominant
1209 Somatic mutation
317 Sporadic
947 X-linked dominant inheritance
4542 X-linked inheritance
997 X-linked recessive inheritance
5 Xlinked NA
24 Xlinked recessive
48 Xlinked unknown
1 Y-linked inheritance
158 autosomal dominant
90 autosomal recessive
1146 autosomal unknown
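Several of the labels above appear to be the same category spelled differently (for example "Autosomal dominant inheritance" vs "autosomal dominant", or "X-linked recessive inheritance" vs "Xlinked recessive"). Here is a rough sketch of folding them together before counting. The helper names and the normalization rules are assumptions based only on the labels listed above, not something ClinVar defines.

```shell
# Hypothetical helper: fold spelling variants of ModeOfInheritance labels
# together. The rules are guesses derived from the labels in the counts above.
normalize_modes() {
  tr '[:upper:]' '[:lower:]' |
    sed -e 's/ inheritance$//' -e 's/^xlinked/x-linked/'
}

# Count normalized labels read from stdin, most frequent first.
# Output format: count<TAB>label
count_modes() {
  normalize_modes | sort | uniq -c | sort -rn |
    awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n "\t" $0 }'
}

# Usage, reusing the extraction from the command above:
# gunzip -c ClinVarFullRelease_2016-08.xml.gz | grep ModeOfInheritance \
#   | sed 's/.*\">//' | sed 's/<\/.*//' | count_modes
```

Whether labels such as "X-linked inheritance" and "Xlinked unknown" should be merged further is a judgment call that this sketch leaves alone.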
@bw2 would you by any chance have time to look into this more? If ClinVar is really going to start filling in this field for a large subset of variants then I imagine this is something we'd want to support.
from clinvar.
Yes, our group is also interested in getting these fields annotated with the final TSV results.
Thank you!
Thanks for pointing out ModeOfInheritance. I've added it to the flatfile and vcf as inheritance_modes. The source for it appears to be an optional field in the clinvar submission spreadsheet - currently 19% of variants have it populated.
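For anyone who wants to re-check the populated fraction on a newer release, something along these lines might work. It is only a sketch: it assumes the flatfile is tab-separated with a header row and that an unpopulated value is an empty cell, and the function name and file name are placeholders.

```shell
# Hypothetical helper: percentage of data rows with a non-empty value in the
# named column. Reads a TSV with a header row from stdin.
populated_fraction() {
  awk -F'\t' -v col="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
    c && $c != "" { filled++ }
    END { if (NR > 1 && c) printf "%.1f%%\n", 100 * filled / (NR - 1) }'
}

# Usage (file name is a placeholder):
# gunzip -c flatfile.tsv.gz | populated_fraction inheritance_modes
```

If the column name is not found in the header, nothing is printed.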
Also while looking at this, I noticed a couple other fields I hadn't seen before so I added them also:
- disease mechanism - this might also be coming from the clinvar submissions spreadsheet (though I couldn't find a field for it).
- prevalence and age of onset - clinvar takes these from orphanet.
- xrefs - misc. other cross-reference info and ids packed into this column as key:value;key:value;...
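Since xrefs packs several pairs into one cell, a consumer will probably need to unpack it. Here is a minimal sketch that assumes keys never contain a colon and that neither keys nor values contain a semicolon. The function name is hypothetical and the sample value in the usage line is made up.

```shell
# Hypothetical helper: split a packed "key:value;key:value;..." string into
# one "key<TAB>value" line per pair. Only the first ":" in each piece is
# treated as the separator, so values may themselves contain colons.
split_xrefs() {
  printf '%s\n' "$1" | tr ';' '\n' | awk '
    length($0) == 0 { next }                 # skip empty pieces, e.g. after a trailing ";"
    {
      i = index($0, ":")
      if (i == 0) { print $0 "\t"; next }    # key with no value
      print substr($0, 1, i - 1) "\t" substr($0, i + 1)
    }'
}

# Usage (made-up cell value):
# split_xrefs 'OMIM:123456;MedGen:C0000001'
```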
I've created a pull request with these changes: #12
Any feedback on the code or data files is appreciated.
Thank you for the additions. This is definitely helpful. Could you also add the origin info? Its values include germline, de novo and somatic.
Sure, since the table is changing anyway, I'll add this one as well.
I've added the 'origin' column to PR #12, and also put in a new 'gold_stars' column which just maps a variant's review_status to a number from 0 to 4. Any feedback is appreciated.
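For readers who have not seen the star scheme, the mapping can be sketched roughly as below. The status strings and star counts follow ClinVar's public review-status documentation as I understand it. Whether PR #12 uses exactly this table is an assumption, and the function name is hypothetical.

```shell
# Hypothetical helper: map a ClinVar review_status string to a star count (0-4).
# This table is an assumption based on ClinVar's documented star levels,
# not a copy of the code in the pull request.
gold_stars() {
  case "$1" in
    "practice guideline") echo 4 ;;
    "reviewed by expert panel") echo 3 ;;
    "criteria provided, multiple submitters, no conflicts") echo 2 ;;
    "criteria provided, single submitter" | "criteria provided, conflicting interpretations") echo 1 ;;
    *) echo 0 ;;  # "no assertion provided", "no assertion criteria provided", or unrecognized
  esac
}

# Usage:
# gold_stars "reviewed by expert panel"
```

Note that unrecognized strings fall through to 0 here, so a status wording change in a future release would silently score as zero stars.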
Thank you again! This is very useful.