The genomic sequences of the SIRV-Set4 spike-in controls, downloaded from https://www.lexogen.com/wp-content/uploads/2021/06/SIRV_Set4_Norm_Sequences_20210507.zip
This file is suggested for studies aimed at biological data analysis. It contains the transcript sequences of the ERCC+SIRV+longSIRV spike-in transcripts, each with its polyA tail. The header format is as follows: ">transcript_id|gene_id|strand|transcript_length(including polyA)|tag|polyA_length"
For SIRVs, this file includes only transcripts tagged as 'positive', which means every SIRV transcript in this file is present in the kit.
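The header format described above can be split into named fields. The following is a minimal Python sketch, not part of the dataset: the names `SpikeInHeader` and `parse_header` are hypothetical, the two length fields are assumed to be plain integers, and the example header in the code is made up for illustration rather than copied from the file.

```python
from typing import NamedTuple


class SpikeInHeader(NamedTuple):
    transcript_id: str
    gene_id: str
    strand: str
    transcript_length: int  # length including the polyA tail
    tag: str
    polya_length: int


def parse_header(line: str) -> SpikeInHeader:
    """Split one FASTA header line into its six '|'-separated fields."""
    fields = line.strip().lstrip(">").split("|")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    tid, gid, strand, length, tag, polya = fields
    # Assumption: both length fields are written as plain integers.
    return SpikeInHeader(tid, gid, strand, int(length), tag, int(polya))


# Made-up header, for illustration only (values are not taken from the file).
example = parse_header(">SIRV101|SIRV1A|-|1621|positive|30")
```

If the lengths follow the description above, `example.transcript_length - example.polya_length` would give the transcript length without the polyA tail.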
This file is suggested for studies aimed at biological data analysis. It contains the exon annotation of the ERCC+SIRV+longSIRV spike-in transcripts. We made a few modifications to the SIRV spike-in transcripts: we appended 'A', 'B', 'C', or 'D' to the gene IDs based on the structure of the transcripts from the same genomic region. Transcripts with the same gene ID are different isoforms from the same gene locus, while transcripts whose gene IDs carry different suffixes (e.g., SIRV3A, SIRV3B) should be treated as different genes because they are on different strands. This modification comes from the LRGASP project (https://lrgasp.github.io/lrgasp-submissions/docs/reference-genomes.html), but I made one small change concerning SIRV406, which overlaps with SIRV401/2/7 but does not overlap with SIRV403/4/5/8. In the LRGASP annotation, SIRV406 is assigned to SIRV4B (a single-isoform gene). Here, we assign SIRV406 to SIRV4A because it partially overlaps with SIRV401/2/7. As a result, our annotation files contain no SIRV4B, which distinguishes them from the LRGASP annotation.
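To check which transcripts are grouped under each gene ID, the annotation can be summarized as in the sketch below. It assumes the file follows the usual GTF layout, with nine tab-separated columns and `gene_id "..."; transcript_id "...";` attributes in the last column. The function name `transcripts_by_gene` is hypothetical, and the demo lines use placeholder transcript names and coordinates, not real records.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value"
ATTR = re.compile(r'(\S+) "([^"]*)"')


def transcripts_by_gene(gtf_lines):
    """Map each gene_id to the sorted list of its transcript_ids."""
    genes = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR.findall(cols[8]))
        if "gene_id" in attrs and "transcript_id" in attrs:
            genes[attrs["gene_id"]].add(attrs["transcript_id"])
    return {gene: sorted(tx) for gene, tx in genes.items()}


# Placeholder records for illustration only (not taken from the annotation).
demo_gtf = [
    'SIRV3\tdemo\texon\t1\t100\t.\t+\t.\tgene_id "SIRV3A"; transcript_id "tx1";',
    'SIRV3\tdemo\texon\t50\t200\t.\t+\t.\tgene_id "SIRV3A"; transcript_id "tx2";',
    'SIRV3\tdemo\texon\t300\t400\t.\t-\t.\tgene_id "SIRV3B"; transcript_id "tx3";',
]
summary = transcripts_by_gene(demo_gtf)
```

Run against the real annotation, a lookup such as `summary.get("SIRV4B")` would be expected to return `None`, since the description above states that SIRV4B is not used in these files.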
This file is suggested for studies aimed at tool development. The header format is the same as in 'ERCC.SIRV.longSIRV.transcripts.with.polyA.tail.fa'. For SIRVs, this file includes transcripts tagged as both 'positive' and 'negative'; the 'negative' transcripts can be used as negative controls when evaluating the performance of tools.
This file is suggested for studies aimed at tool development. The format is the same as in 'ERCC.SIRV.longSIRV.annotation.gtf'. For SIRVs, this file includes transcripts tagged as both 'positive' and 'negative'; the 'negative' transcripts can be used as negative controls when evaluating the performance of tools.
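When evaluating a tool, the records can be grouped by the tag field of the header (the fifth '|'-separated field) to separate positive from negative controls. The sketch below is one possible way to do this in plain Python. The helper names `iter_fasta` and `group_by_tag` are hypothetical, the demo records are invented, and because the tag value used for ERCC records is not stated here, the code simply groups by whatever tag it finds.

```python
def iter_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif line and header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def group_by_tag(lines):
    """Group FASTA records by the tag field (fifth field of the header)."""
    groups = {}
    for header, seq in iter_fasta(lines):
        tag = header.split("|")[4]
        groups.setdefault(tag, []).append((header, seq))
    return groups


# Invented records for illustration only (not taken from the file).
demo_fasta = [
    ">txA|geneA|+|12|positive|4",
    "ACGTACGT",
    "AAAA",
    ">txB|geneA|+|10|negative|4",
    "ACGTACAAAA",
]
groups = group_by_tag(demo_fasta)
negatives = [header.split("|")[0] for header, _ in groups.get("negative", [])]
```

With a real file, the same functions can be given an open file handle instead of a list, for example `group_by_tag(open(path))`, where `path` points to the FASTA file.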
This file includes the information extracted from SIRV-Set4 sequence design information files (https://www.lexogen.com/wp-content/uploads/2021/06/SIRV_Set4_Norm_sequence-design-overview_20210507a.xlsx).
Downloaded from https://www.lexogen.com/wp-content/uploads/2021/06/SIRV_Set4_Norm_sequence-design-overview_20210507a.xlsx.
Please feel free to contact me if you have any questions.