Comments (2)
(Oops, didn't mean to close that).
OK so this is easily solved without any fixes: it's trying to use a leader sequence that's not present in the reference. Assuming we're talking human, the only TRDV2 leader is for *03, which you can specify via the -l flag. So for example the following works:
python3 stitchr.py -v TRDV2*01 -j TRDJ1*01 -cdr3 CACDTIRPKFSTDKLIF -l TRDV2*03
However that's definitely not the expected error state! Thanks for pointing this out.
Note for posterity.
The code used to assume that every gene region (leader/V/J/C) had at the very least a prototypical allele (*01), so when a gene was requested without an allele this was the safe default option. The problem here arose for a gene where this was not true: GENE-DB had a valid prototypical allele for the V gene (TRDV2*01), but no corresponding prototypical leader allele (the only one available for the gene being TRDV2*03). As such the defaulted sequence was not available, and it failed without a useful error message.
Now the code has been edited to follow this process when an allele is not provided:
- Look for a prototypical allele (*01)
- If it doesn't exist, pick an allele from those that do exist (flagging up a warning that it's done this)
- If it does, use the prototypical allele (but check whether other alleles exist, and if they do, flag that up too)
(Note this only applies for genes that do exist in the data; asking for a non-featured gene will still throw a ValueError).
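The process above can be sketched roughly as follows. This is a minimal illustration, not the actual stitchr code: the function name `pick_default_allele`, the dictionary layout, and the choice of the lowest-sorted allele as the fallback are all assumptions made for the example.

```python
import warnings

PROTOTYPICAL = "01"


def pick_default_allele(gene, alleles_by_gene):
    """Choose an allele for `gene` when none was specified.

    `alleles_by_gene` is a hypothetical mapping of gene name -> set of
    allele suffixes available for one region (e.g. leader or variable).
    """
    available = sorted(alleles_by_gene.get(gene, ()))
    if not available:
        # Genes absent from the data are still an error
        raise ValueError(f"{gene} is not present in the reference data")

    if PROTOTYPICAL in available:
        if len(available) > 1:
            # Prototypical allele used, but flag that others exist
            warnings.warn(f"Using {gene}*{PROTOTYPICAL}; other alleles exist: "
                          f"{[a for a in available if a != PROTOTYPICAL]}")
        return f"{gene}*{PROTOTYPICAL}"

    # No prototypical allele: fall back to one that does exist.
    # Assumption: take the lowest-sorted allele; the real choice may differ.
    chosen = available[0]
    warnings.warn(f"No {gene}*{PROTOTYPICAL} available; using {gene}*{chosen}")
    return f"{gene}*{chosen}"


# Toy data mirroring the human TRDV2 case described in this thread
variable_alleles = {"TRDV2": {"01", "03"}}
leader_alleles = {"TRDV2": {"03"}}

print(pick_default_allele("TRDV2", variable_alleles))  # TRDV2*01
print(pick_default_allele("TRDV2", leader_alleles))    # TRDV2*03
```

Both calls emit a warning: the first because a non-prototypical allele also exists, the second because the prototypical leader is missing and a substitute was picked.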
TRDV2 is actually a useful place to illustrate this behaviour, as it has *01 and *03 variable sequence alleles, but only an *03 leader sequence. So asking stitchr to use the following Vs will give the following outputs:
Requested V | Stitched V | Stitched L |
---|---|---|
TRDV2 | TRDV2*01 | TRDV2*03 |
TRDV2*01 | TRDV2*01 | TRDV2*03 |
TRDV2*02 | TRDV2*01 | TRDV2*03 |
TRDV2*03 | TRDV2*03 | TRDV2*03 |
As always, the best results are achieved by specifying exactly which alleles you want where possible, and keeping track of which alleles are used in your stitched sequences.