Comments (6)

hoytpr commented on May 25, 2024

@ErinBecker this looks like a great tool, but maybe the HPC lesson is a better spot for this. There are a few tools for faster data transfer, and it's a great topic, but for this lesson we intentionally are using a small dataset. IMHO.

from organization-genomics.

hoytpr commented on May 25, 2024

I believe we can close this, although it seems appropriate to re-write the lesson to make use of the new tools. @ErinBecker
According to NCBI (Ben Busby?) and on https://github.com/ncbi/sra-tools,
"With release 2.9.1 of sra-tools we have finally made available the tool fasterq-dump, a replacement for the much older fastq-dump tool. As its name implies, it runs faster, and is better suited for large-scale conversion of SRA objects into FASTQ files that are common on sites with enough disk space for temporary files. fasterq-dump is multi-threaded and performs bulk joins in a way that improves performance as compared to fastq-dump, which performs joins on a per-record basis (and is single-threaded).

fastq-dump is still supported as it handles more corner cases than fasterq-dump, but it is likely to be deprecated in the future.

You can get more information about fasterq-dump in our Wiki at https://github.com/ncbi/sra-tools/wiki/HowTo:-fasterq-dump."
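As a rough illustration of the multi-threaded, temp-file-based conversion the NCBI note describes, here is a minimal sketch that only assembles a `fasterq-dump` command line. The helper name `build_fasterq_dump_cmd`, the accession `SRR000001`, and the directory names are placeholders chosen for illustration and do not come from the lesson; check the flags against the `fasterq-dump --help` output of your installed sra-tools version before relying on them.

```python
import shlex


def build_fasterq_dump_cmd(accession, outdir="fastq", threads=4, tmpdir=None):
    """Assemble (but do not run) a fasterq-dump command line.

    Hypothetical helper for illustration only; all defaults are assumptions.
    """
    cmd = [
        "fasterq-dump", accession,
        "--outdir", outdir,         # where the FASTQ files are written
        "--threads", str(threads),  # fasterq-dump is multi-threaded
        "--split-files",            # write paired reads to separate files
    ]
    if tmpdir is not None:
        # fasterq-dump needs scratch space for its temporary files
        cmd += ["--temp", tmpdir]
    return cmd


if __name__ == "__main__":
    # "SRR000001" is a placeholder accession, not one used in the lesson.
    cmd = build_fasterq_dump_cmd("SRR000001", tmpdir="/tmp")
    print(shlex.join(cmd))
    # To actually run the conversion (requires sra-tools on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if it is later passed to `subprocess.run`.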

ErinBecker commented on May 25, 2024

Thanks for the feedback @hoytpr. I'm pinging @ACharbonneau to see if she wants to try to incorporate this into the Cloud lesson.

JasonJWilliamsNY commented on May 25, 2024

Arizona BugBBQ - We don't think any importing from SRA is needed for this workshop. Learners should be given skills that will make this easier to do on their own. There are tools that will pull from SRA without using NCBI tools, etc.

hoytpr commented on May 25, 2024

It's true that there are tons of ways to download, and life science students probably know a couple of ways. Providing the data with a link for interested learners is a great way to save time for more important items.

hoytpr commented on May 25, 2024

@JasonJWilliamsNY and @ErinBecker: because the curl and wget commands worked very well at the Arizona BugBBQ, and because the original/predecessor fastq-dump is already multi-threaded (even if not needed), I'm going to close this issue.
