SIRV-Set4_spike_in_annotation

1. ERCC.SIRV.longSIRV.genome.fa

The genomic sequences of the SIRV-Set4 spike-in controls, downloaded from https://www.lexogen.com/wp-content/uploads/2021/06/SIRV_Set4_Norm_Sequences_20210507.zip

2. ERCC.SIRV.longSIRV.transcripts.with.polyA.tail.fa

This file is recommended for studies focused on biological data analysis. It contains the transcript sequences of the ERCC, SIRV, and long SIRV spike-ins, with the polyA tails included. Each header has the following format: ">transcript_id|gene_id|strand|transcript_length(include polyA)|tag|polyA_length"

For SIRVs, this file includes only transcripts tagged as 'positive', that is, transcripts that are actually present in the kit.
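The header format above can be split into named fields with a few lines of Python. This is a minimal sketch, not part of the repository: the names `SpikeInHeader` and `parse_header` are hypothetical, the example header is invented for illustration, and it assumes the two length fields are plain integers.

```python
from typing import NamedTuple


class SpikeInHeader(NamedTuple):
    transcript_id: str
    gene_id: str
    strand: str
    length_with_polya: int  # transcript length, polyA tail included
    tag: str
    polya_length: int


def parse_header(line: str) -> SpikeInHeader:
    """Split one '>'-prefixed header line into its six '|'-separated fields."""
    fields = line.strip().lstrip(">").split("|")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    tid, gid, strand, length, tag, polya = fields
    # Assumption: both length fields are integers.
    return SpikeInHeader(tid, gid, strand, int(length), tag, int(polya))


# Hypothetical header, invented for illustration only.
example = parse_header(">SIRV101|SIRV1A|-|1621|positive|30")

# Because the stated length includes the polyA tail, the length of the
# transcript body is the difference of the two length fields.
body_length = example.length_with_polya - example.polya_length
print(example.transcript_id, example.tag, body_length)
```

Subtracting `polya_length` from the stated length is useful when comparing against tools that report transcript length without the tail.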

3. ERCC.SIRV.longSIRV.annotation.gtf

This file is recommended for studies focused on biological data analysis. It contains the exon annotation of the ERCC, SIRV, and long SIRV spike-in transcripts. We made a few modifications to the SIRV spike-in transcripts: we appended 'A', 'B', 'C', or 'D' to the gene IDs, based on the structure of the transcripts from the same genomic region. Transcripts with the same gene ID are different isoforms of the same gene locus, while transcripts with different suffixes (e.g., SIRV3A, SIRV3B) should be treated as different genes because they are on different strands.

This modification comes from the LRGASP project (https://lrgasp.github.io/lrgasp-submissions/docs/reference-genomes.html), with one small change. SIRV406 overlaps with SIRV401/2/7 but does not overlap with SIRV403/4/5/8. In the LRGASP annotation, SIRV406 is assigned to SIRV4B (a single-isoform gene). Here, we assign SIRV406 to SIRV4A because it partially overlaps with SIRV401/2/7. As a result, our annotation files contain no SIRV4B, which distinguishes them from the LRGASP annotation.
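The gene ID convention described above (isoforms share a gene ID, and each gene sits on a single strand) can be checked directly against the GTF. The following is a rough sketch that assumes standard tab-separated GTF lines with `gene_id` and `transcript_id` attributes on `exon` records; the function name `genes_from_gtf` is hypothetical, and the sample lines, including their coordinates, source column, and strands, are invented for illustration.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def genes_from_gtf(lines):
    """Return {gene_id: {"strands": set, "transcripts": set}} from exon records."""
    genes = defaultdict(lambda: {"strands": set(), "transcripts": set()})
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        gene = genes[attrs["gene_id"]]
        gene["strands"].add(cols[6])
        gene["transcripts"].add(attrs["transcript_id"])
    return dict(genes)


# Invented records; real input would be the lines of the annotation GTF.
sample = [
    'SIRV3\texample\texon\t100\t200\t.\t+\t.\tgene_id "SIRV3A"; transcript_id "SIRV301";',
    'SIRV3\texample\texon\t100\t250\t.\t+\t.\tgene_id "SIRV3A"; transcript_id "SIRV302";',
    'SIRV3\texample\texon\t500\t900\t.\t-\t.\tgene_id "SIRV3B"; transcript_id "SIRV303";',
]

genes = genes_from_gtf(sample)

# With the suffixed gene IDs, no gene should span both strands.
mixed_strand = sorted(g for g, info in genes.items() if len(info["strands"]) > 1)
print(mixed_strand)
```

To run this on the real file, pass an open file handle (for example `open("ERCC.SIRV.longSIRV.annotation.gtf")`) instead of `sample`.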

4. ERCC.SIRV.longSIRV.transcripts.extended.with.polyA.tail.fa

This file is recommended for studies focused on tool development. The header format is the same as in 'ERCC.SIRV.longSIRV.transcripts.with.polyA.tail.fa'. For SIRVs, this file includes transcripts tagged as both 'positive' and 'negative'; the 'negative' transcripts can serve as negative controls when evaluating a tool's performance.
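One way to use the 'positive' and 'negative' tags when benchmarking a tool is sketched below. It is only an illustration under stated assumptions: the function names `split_by_tag` and `score_calls` are hypothetical, the headers and the list of called transcripts are invented, and a call counts as a false positive only when it matches a 'negative' transcript ID, which is a simplification.

```python
def split_by_tag(headers):
    """Group transcript IDs by the fifth header field (the tag)."""
    groups = {"positive": set(), "negative": set()}
    for header in headers:
        fields = header.strip().lstrip(">").split("|")
        transcript_id, tag = fields[0], fields[4]
        # Any tag other than 'positive'/'negative' gets its own group.
        groups.setdefault(tag, set()).add(transcript_id)
    return groups


def score_calls(called_ids, groups):
    """Count hits on positive and negative controls among called transcripts."""
    called = set(called_ids)
    tp = len(called & groups["positive"])   # present in the kit and detected
    fp = len(called & groups["negative"])   # absent from the kit but detected
    fn = len(groups["positive"] - called)   # present in the kit but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Invented headers and calls, for illustration only.
headers = [
    ">T1|G1|+|130|positive|30",
    ">T2|G1|+|150|positive|30",
    ">T3|G1|+|140|negative|30",
]
groups = split_by_tag(headers)
result = score_calls(["T1", "T3"], groups)
print(result)
```

In practice the header lines would be read from the extended FASTA file (every line starting with `>`), and `called_ids` would come from the output of the tool under evaluation.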

5. ERCC.SIRV.longSIRV.extended.annotation.gtf

This file is recommended for studies focused on tool development. The format is the same as in 'ERCC.SIRV.longSIRV.annotation.gtf'. For SIRVs, this file includes transcripts tagged as both 'positive' and 'negative'; the 'negative' transcripts can be used as negative controls when evaluating the performance of tools.

6. SIRV_Set4_transcript_annotation.txt/SIRV_Set4_transcript_annotation.xlsx

These files contain the information extracted from the SIRV-Set4 sequence design overview file (https://www.lexogen.com/wp-content/uploads/2021/06/SIRV_Set4_Norm_sequence-design-overview_20210507a.xlsx).

7. SIRV_Set4_Norm_sequence-design-overview_20210507a.xlsx

Downloaded from https://www.lexogen.com/wp-content/uploads/2021/06/SIRV_Set4_Norm_sequence-design-overview_20210507a.xlsx

Please feel free to contact me if you have any questions.

