Comments (5)
Hi Zach,
Sorry for the late reply; I was away with very limited access to the internet.
About the gene anchor indices
They are obtained from the IMGT gapped alignments, which are aligned on a conserved cysteine/tryptophan/phenylalanine. If I remember correctly, the index is given in the FASTA header of each sequence (or an index that allows you to compute it). There is currently nothing in IGoR to extract it automatically, but it is a rather straightforward script to write.
Alternatively, one could multi-align all the genomic sequences to identify these conserved residues and extract the indices for species not present in IMGT. I drafted a piece of code to do this; it should be somewhere in the pygor codebase.
About the BCRs genomic templates
In general I have tried to include text files alongside the models giving the reference from which each model was taken. Such a text file should sit in the model's folder.
For BCRs the model comes from the IGoR paper [1]. As stated in the SI of the paper, I used custom genomic templates from [2]. Only templates found in the individual were kept, so as to characterize somatic hypermutations correctly.
I'm having the same problem with TCR. Since the V gene list shipped with IGoR does not have some of the genes in my data, I was hoping to use the latest TRBV FASTA from IMGT; however, I did not find the anchor position in the header. Could you please give more detailed instructions on obtaining the anchors? Thank you very much.
Hi Quentin,
How did you obtain the anchor indices for the J and V genes in your models? Is there an implementation or feature in IGoR that does this? If not, what method did you use? IMGT doesn't appear to contain the anchor index information.
Additionally, how did you decide which genes would be used and which would not? For example, consider the information in models/human/bcr_heavy/ref_genome. I am trying to see how you got these values from IMGT, specifically from the hyperlinks in the table two-thirds of the way down the page at http://www.imgt.org/vquest/refseqh.html. "F+ORF+all P" IGHV Human has 477 sequences (http://www.imgt.org/genedb/GENElect?query=7.2+IGHV&species=Homo+sapiens). Your genomicVs.fasta file in the igor_1-3-0/models/human/bcr_heavy/ref_genome directory has only 97. Why is it so small compared to the IMGT files? Were those the only ones available at the time?
Thanks,
Zach
Hi,
You can get the anchors by taking the number 309 and subtracting the number of gaps, which is the second term of the "length+gaps=total" field in the IMGT FASTA header. See here for a reference of the anchor indices.
For instance, take TRBV1*01.
>L36092|TRBV1*01|Homo sapiens|P|V-REGION|91723..92006|284 nt|1| | | | |284+42=326| | |
The number of gaps is 42, so 309 - 42 = 267.
Another example:
>M13550|TRBV7-3*05|Homo sapiens|(F)|V-REGION|1..231|231 nt|1| | | | |231+90=321|partial in 5'| |
The number of gaps is 90, so 309 - 90 = 219. That is the number given in the table in the SONIA link.
Hope that helps.
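The rule above can be sketched as a small script. This is a minimal sketch, not something shipped with IGoR: the function name `anchor_from_imgt_header` and the constant `GAPPED_ANCHOR` are made up for illustration, and it assumes, as the rule implicitly does, that every gap counted in the header falls before the anchor.

```python
import re

# Anchor position in the IMGT gapped alignment, as quoted in this thread.
GAPPED_ANCHOR = 309

# Matches the "<ungapped length>+<gaps>=<gapped length>" header field.
LENGTH_FIELD = re.compile(r"(\d+)\+(\d+)=(\d+)")


def anchor_from_imgt_header(header, gapped_anchor=GAPPED_ANCHOR):
    """Return (allele_name, anchor_index) for one IMGT FASTA header line."""
    fields = [field.strip() for field in header.lstrip(">").split("|")]
    allele = fields[1]  # second field holds the allele name, e.g. TRBV1*01
    for field in fields:
        match = LENGTH_FIELD.fullmatch(field)
        if match:
            length, gaps, total = (int(group) for group in match.groups())
            if length + gaps != total:
                raise ValueError(f"inconsistent length field in header: {field}")
            return allele, gapped_anchor - gaps
    raise ValueError(f"no 'length+gaps=total' field found for {allele}")


headers = [
    ">L36092|TRBV1*01|Homo sapiens|P|V-REGION|91723..92006|284 nt|1| | | | |284+42=326| | |",
    ">M13550|TRBV7-3*05|Homo sapiens|(F)|V-REGION|1..231|231 nt|1| | | | |231+90=321|partial in 5'| |",
]
for header in headers:
    print(anchor_from_imgt_header(header))
```

For the two headers quoted above this gives 267 and 219, matching the values worked out by hand. Scanning every field for the pattern, rather than relying on a fixed column, is only a precaution in case the header layout differs between downloads.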
Thanks for the reply. One last question: I noticed that the default IGoR V/J anchor files have a different format from the SONIA ones; the latter have only 3 fields, while the IGoR ones carry a lot of other information, such as species.
Can I use them interchangeably?
You can't use them interchangeably. I'm not sure what's going on in the T cell anchor CSVs, but the B cell one has two fields separated by a semicolon. It needs to be in this format; it will not work otherwise. I can't speak to using IGoR for T cells since I've only used it for B cells. I assume it needs to be in this format too, though, and can't imagine why it wouldn't.
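As a rough illustration of the two-field, semicolon-separated layout described above, here is a minimal sketch that formats (gene, anchor) pairs that way. The header row `gene;anchor_index` and the function name `format_anchor_csv` are assumptions for illustration; check them against the anchor file that ships with your model before relying on this.

```python
import csv
import io


def format_anchor_csv(anchors, header=("gene", "anchor_index")):
    """Format (gene, anchor) pairs as semicolon-separated text.

    The default header names are an assumption, not confirmed in this thread.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerow(header)
    for gene, anchor in anchors:
        writer.writerow([gene, anchor])
    return buffer.getvalue()


# Example values taken from the anchors computed earlier in the thread.
print(format_anchor_csv([("TRBV1*01", 267), ("TRBV7-3*05", 219)]))
```

Writing the returned string to a file is left out on purpose, since the expected file name and location depend on the model folder you are working with.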