
bio-genomeupdate's Introduction

solgenomics

The mason components needed to run the solgenomics.net website

bio-genomeupdate's People

Contributors

bellerbrock, jeremyde, mflores2021, phosmani, suryasaha


bio-genomeupdate's Issues

Generate trim files for TPF in NCBI GRC format

Columns for Trim curation file:

  • Taxid (value=4081 for Solanum lycopersicum)
  • AssmGrp (value=TGP)
  • AssmUnit (value=Primary)
  • Chr (values=1-12, Un)
  • TPFType (values=chromosome, contig)
  • Acc.ver
  • TrimPos (1-based; first or last base used in AGP)
  • FromEnd (values=L, H; L: trim bases at positions less than TrimPos; H: trim bases at positions greater than TrimPos)
  • Comment
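
The column list above can be sketched as a small writer. This is only an illustration, not the project's implementation: the tab delimiter, the header row, the function names, and the accession `EXAMPLE0001.1` are all assumptions; only the column names and allowed values come from the list.

```python
import csv
import io

# Column order taken from the list above; delimiter and header row are assumptions.
TRIM_COLUMNS = ["Taxid", "AssmGrp", "AssmUnit", "Chr", "TPFType",
                "Acc.ver", "TrimPos", "FromEnd", "Comment"]
VALID_CHR = {str(n) for n in range(1, 13)} | {"Un"}


def trim_row(chrom, tpf_type, acc_ver, trim_pos, from_end, comment=""):
    """Build one trim row, checking the values the column list constrains."""
    if chrom not in VALID_CHR:
        raise ValueError(f"Chr must be 1-12 or Un, got {chrom!r}")
    if tpf_type not in ("chromosome", "contig"):
        raise ValueError(f"unknown TPFType {tpf_type!r}")
    if from_end not in ("L", "H"):
        raise ValueError(f"FromEnd must be L or H, got {from_end!r}")
    if trim_pos < 1:
        raise ValueError("TrimPos is 1-based and must be >= 1")
    return {"Taxid": 4081, "AssmGrp": "TGP", "AssmUnit": "Primary",
            "Chr": chrom, "TPFType": tpf_type, "Acc.ver": acc_ver,
            "TrimPos": trim_pos, "FromEnd": from_end, "Comment": comment}


def trimmed_bases(trim_pos, from_end, length):
    """Return the 1-based inclusive range of bases removed, or None if empty."""
    if from_end == "L":  # bases below TrimPos are trimmed
        return (1, trim_pos - 1) if trim_pos > 1 else None
    # from_end == "H": bases above TrimPos are trimmed
    return (trim_pos + 1, length) if trim_pos < length else None


def write_trim_file(rows, handle):
    """Write rows as tab-separated text with a header line."""
    writer = csv.DictWriter(handle, fieldnames=TRIM_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


buf = io.StringIO()
write_trim_file([trim_row("3", "chromosome", "EXAMPLE0001.1", 500, "L")], buf)
```

`trimmed_bases` spells out the L/H semantics: with TrimPos 500, `L` removes bases 1 to 499 and `H` removes bases 501 to the end of the sequence.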

Create new classes

Check why mixed alignment cases are not flagged as error

Check AlignCoordGroup.pm

sample nucmer output

18361542 18377740 1 16206 16199 16206 99.94 70787664 203766 0.02 7.95 1 1 SL2.50ch03 Contig90
53614018 53614614 17944 17348 597 597 99.66 70787664 203766 0.00 0.29 1 -1 SL2.50ch03 Contig90

500bp.mixedoutoforder.agp.group_coords.stdout

Contig90 SL2.50ch03 18361542 18564072 202530 1 203766 203766 154944 1 1 0 Contains 0 0 48822 0 596 SL2.50ch03:597:53614018:53614614
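
One way to flag the mixed case in the sample nucmer output above is to group hits by reference/query pair and check whether the query aligns in more than one orientation. The sketch below assumes the 15 whitespace-separated columns follow the `show-coords` layout with coverage and length columns (start/end on reference, start/end on query, alignment lengths, identity, sequence lengths, coverages, the two frame columns, then the IDs). The names `Hit` and `mixed_orientation_pairs` are hypothetical and not part of AlignCoordGroup.pm.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    qry_frame: int  # 1 for forward, -1 for reverse
    ref_id: str
    qry_id: str


def parse_coords_line(line):
    """Parse one 15-column coords line (assumed column order, see lead-in)."""
    f = line.split()
    if len(f) != 15:
        raise ValueError(f"expected 15 columns, got {len(f)}")
    return Hit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
               int(f[12]), f[13], f[14])


def mixed_orientation_pairs(lines):
    """Return (ref_id, qry_id) pairs whose hits occur in both orientations."""
    frames = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue
        hit = parse_coords_line(line)
        frames[(hit.ref_id, hit.qry_id)].add(hit.qry_frame)
    return sorted(pair for pair, seen in frames.items() if len(seen) > 1)


SAMPLE = [
    "18361542 18377740 1 16206 16199 16206 99.94 70787664 203766 0.02 7.95 1 1 SL2.50ch03 Contig90",
    "53614018 53614614 17944 17348 597 597 99.66 70787664 203766 0.00 0.29 1 -1 SL2.50ch03 Contig90",
]
flagged = mixed_orientation_pairs(SAMPLE)
```

For the two sample lines, Contig90 aligns to SL2.50ch03 once forward and once reversed, so the pair is flagged.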

filter_delta script

Print filtered delta output file without the BACs that are aligned out of order.
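
A minimal sketch of such a filter is shown here. It assumes the standard delta layout in which each record starts with a `>ref_id query_id ref_len query_len` header, and it assumes the caller has already worked out which query IDs are out of order. The function name and the sample IDs (`chr1`, `bacA`, `bacB`) are made up for illustration.

```python
def filter_delta(lines, excluded_queries):
    """Yield delta lines, dropping every record whose query ID is excluded.

    Lines before the first '>' header (file paths and the NUCMER tag)
    are always passed through unchanged.
    """
    keep = True
    for line in lines:
        if line.startswith(">"):
            # Record header: ">ref_id query_id ref_len query_len"
            query_id = line[1:].split()[1]
            keep = query_id not in excluded_queries
        if keep:
            yield line


# Hypothetical two-record delta file, one line per list element.
delta = [
    "/path/ref.fa /path/qry.fa",
    "NUCMER",
    ">chr1 bacA 1000 200",
    "1 200 1 200 0 0 0",
    "0",
    ">chr1 bacB 1000 300",
    "301 600 1 300 0 0 0",
    "0",
]
filtered = list(filter_delta(delta, {"bacB"}))
```

Working on whole records keeps the output a valid delta file, since each alignment block stays attached to its header.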

Create test set for align_BACends_group_coords.pl

-r Fasta file of reference (required)
-q Fasta file of query (assembled and singleton BACs, required)
-c Contig or component AGP file for reference (includes scaffold gaps)
-s Chromosome AGP file for reference (with only scaffolds and gaps)

Handle cases where inserted BAC/BAC contig is flush with either end of WGS

Attribute (accession_prefix_last_base) does not pass the type constraint because: The string, -1, was not a positive coordinate at /usr/local/lib/x86_64-linux-gnu/perl/5.20.2/Moose/Object.pm line 24
Moose::Object::new('Bio::GenomeUpdate::SP::SPLine', 'chromosome', 10, 'accession_prefix', 'AEKE02007654', 'accession_suffix', 'AC239654', 'accession_prefix_orientation', '-', 'accession_suffix_orientation', '+', 'accession_prefix_last_base', -1, 'accession_suffix_first_base', 1, 'comment', 'BAC AC239654 is contained within WGS contig AEKE02007654 from previous version. Designates switch point from WGS contig to BAC.') called at /home/surya/work/Eclipse/Bio-GenomeUpdate/lib/Bio/GenomeUpdate/TPF.pm line 1628

Generate switch point curation files for TPF in NCBI GRC format

Columns for switch point curation file:

  • Taxid (value=4081 for Solanum lycopersicum)
  • AssmGrp (value=TGP)
  • AssmUnit (value=Primary)
  • Chr (values=1-12, Un)
  • TPFType (values=chromosome, contig (latter used for unlocalized or unplaced scaffolds, see TPF spec))
  • Acc.ver1
  • Acc.ver2
  • Orient1 (orientation of acc.ver1 in AGP)
  • Orient2 (orientation of acc.ver2 in AGP)
  • Point1 (1-based, last base of acc.ver1 to be used in AGP)
  • Point2 (1-based, first base of acc.ver2 to be used in AGP)
  • Comment (required, minimum 25 characters)
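
The constraints in the column list above can be checked before a line is written. The following is a sketch under assumptions: the tab-separated layout and the function name are hypothetical, and `+`/`-` are assumed to be the only orientation values. The positive-coordinate check mirrors the 1-based requirement on Point1 and Point2.

```python
SP_COLUMNS = ["Taxid", "AssmGrp", "AssmUnit", "Chr", "TPFType",
              "Acc.ver1", "Acc.ver2", "Orient1", "Orient2",
              "Point1", "Point2", "Comment"]


def switch_point_row(chrom, tpf_type, acc1, acc2,
                     orient1, orient2, point1, point2, comment):
    """Return one tab-separated switch point line, or raise ValueError."""
    if tpf_type not in ("chromosome", "contig"):
        raise ValueError(f"unknown TPFType {tpf_type!r}")
    for name, orient in (("Orient1", orient1), ("Orient2", orient2)):
        if orient not in ("+", "-"):
            raise ValueError(f"{name} must be + or -, got {orient!r}")
    for name, point in (("Point1", point1), ("Point2", point2)):
        if point < 1:  # coordinates are 1-based
            raise ValueError(f"{name} must be a positive coordinate, got {point}")
    if len(comment) < 25:
        raise ValueError("Comment is required and needs at least 25 characters")
    values = [4081, "TGP", "Primary", chrom, tpf_type,
              acc1, acc2, orient1, orient2, point1, point2, comment]
    return "\t".join(str(v) for v in values)


# Point1=100 is an invented value for illustration only.
row = switch_point_row("10", "chromosome", "AEKE02007654", "AC239654",
                       "-", "+", 100, 1,
                       "Designates switch point from WGS contig to BAC.")
```

With this check, a Point1 of -1 is rejected with a ValueError before any line is written.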

The order of BACs written to TPF for an assembled BAC contig is reversed

AEKE02023669 ? SL2.50sc05925 MINUS
AC244870 ? SL2.50sc05925 PLUS
AC244937 ? SL2.50sc05925 MINUS contig469
AC244803 ? SL2.50sc05925 PLUS contig469
AC244944 ? SL2.50sc05925 PLUS contig469
AC254768 ? SL2.50sc05925 MINUS contig469
AEKE02023661 ? SL2.50sc05925 MINUS

Contig469_right_1000 aligns to -ive AEKE02023661.1
Contig469_right_1000 aligns to -ive end of AC244937
Contig469_left_1000 aligns to -ive end of AC254768
Contig469_left_1000 aligns to middle of AC244870

Correct order
AEKE02023669 ? SL2.50sc05925 MINUS
AC244870 ? SL2.50sc05925 PLUS
AC254768 ? SL2.50sc05925 MINUS contig469
AC244944 ? SL2.50sc05925 PLUS contig469
AC244803 ? SL2.50sc05925 PLUS contig469
AC244937 ? SL2.50sc05925 MINUS contig469
AEKE02023661 ? SL2.50sc05925 MINUS

68kb 99% identical alignment between AC244870 and AC254768
9.2kb 99% identical alignment between AC244937 and AEKE02023661
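
The reordering shown above amounts to reversing the run of rows labelled `contig469` while leaving each BAC's orientation as listed. The sketch below reproduces only that reordering; the tuple layout and function name are assumptions, and it does not claim to be the fix applied in TPF.pm.

```python
def reverse_contig_block(rows, contig):
    """Return rows with the contiguous run labelled `contig` reversed.

    Each row is (accession, orientation, contig_label_or_None).
    Orientations are left untouched, matching the listing above.
    """
    rows = list(rows)
    idx = [i for i, row in enumerate(rows) if row[2] == contig]
    if not idx:
        return rows
    first, last = idx[0], idx[-1]
    if idx != list(range(first, last + 1)):
        raise ValueError(f"rows for {contig} are not contiguous")
    return rows[:first] + rows[first:last + 1][::-1] + rows[last + 1:]


written = [
    ("AEKE02023669", "MINUS", None),
    ("AC244870", "PLUS", None),
    ("AC244937", "MINUS", "contig469"),
    ("AC244803", "PLUS", "contig469"),
    ("AC244944", "PLUS", "contig469"),
    ("AC254768", "MINUS", "contig469"),
    ("AEKE02023661", "MINUS", None),
]
fixed = reverse_contig_block(written, "contig469")
```

After the call, AC254768 sits next to AC244870 and AC244937 sits next to AEKE02023661, which agrees with the two pairwise alignments listed above.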
