Comments (7)

lczech commented on July 30, 2024

Hi @bensprung,

you are right: for FastQC, we currently only use the R1 fastq file as input. This is mostly because, at least in our tests so far, the results for R2 are much the same as for R1, so they do not add much information. I was also a bit lazy when coding this, as it is a bit tricky to change the current behavior and add R2 as well. The FastQC settings also allow using the trimmed reads instead of the raw samples as input, and the trimmed reads can in turn be merged into a single fastq file. Hence, there is a combination of settings where the data has R1 and R2, but we'd still only want to use one file as input, and I have just avoided coding that so far ;-)

If you think it's worth running FastQC on R2 when either the raw sample data or trimmed but unmerged reads are used as input, I can add that. Let me know; it's tricky, but doable :-)
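The case distinction described above could be sketched roughly as follows. This is only an illustration, not grenepipe's actual code: the function name `fastqc_inputs` and the setting values `"samples"` and `"trimmed"` are hypothetical and chosen just for this example.

```python
from typing import List


def fastqc_inputs(input_kind: str, paired: bool, merged: bool) -> List[str]:
    """Return labels of the fastq files FastQC would run on.

    Hypothetical helper: `input_kind` is "samples" (raw reads) or "trimmed",
    `paired` says whether the sample has R1 and R2, and `merged` says whether
    trimming merged the pairs into a single file.
    """
    if input_kind not in ("samples", "trimmed"):
        raise ValueError(f"unknown FastQC input kind: {input_kind!r}")

    # Trimmed and merged reads: the data had R1 and R2, but only one file is left.
    if input_kind == "trimmed" and paired and merged:
        return ["merged"]

    # Raw samples, or trimmed without merging: both reads are available.
    if paired:
        return ["R1", "R2"]

    # Single-end data only ever has R1.
    return ["R1"]


print(fastqc_inputs("samples", paired=True, merged=False))
print(fastqc_inputs("trimmed", paired=True, merged=True))
```

The merged case is the one that makes a plain "always use R1 and R2" rule insufficient.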

Cheers
Lucas

from grenepipe.

lczech commented on July 30, 2024

Okay okay, I couldn't resist and coded this now. Please download the latest grenepipe, which now supports FastQC on both paired-end read files, as well as on the trimmed (and potentially merged) files, depending on the FastQC settings in the config file.

Please let me know if that is what you needed and works for you.

So long
Lucas

bensprung commented on July 30, 2024

Ha, thank you! It looks great. I was going to say, I didn't want to create extra work. But I do have a good (post-hoc now) case for doing this. I had somehow added trailing tabs to some of my samples.tsv files, causing only one set of reads to be processed, which I didn't notice for a while. I might have noticed it faster if the typical MultiQC view had stats for both files, as it does now with this change. Very nice!

lczech commented on July 30, 2024

> I was going to say, I didn't want to create extra work.

No worries - I want to create a tool that is actually useful, and my use case might differ from yours, so thanks for your input on what kind of tools and things are actually needed in practice :-)

> I had somehow added trailing tabs to some of my samples.tsv files, causing only one set of reads to be processed, which I didn't notice for a while.

Oh, that is unexpected. Could you explain in a bit more detail what happened there? Do you have single-end or paired-end reads (one or two columns in the table)? How does a tab cause the trouble there? That should get fixed, as this seems like an easy mistake to make.

bensprung commented on July 30, 2024

Yep, I had paired-end reads. I (mistakenly) formatted my samples.tsv like this:

sample\tunit\tplatform\tfq1\tfq2
sample_name\t1\tILLUMINA\t/path/to/fq1\tpath/to/fq2\t

So there's an extra tab at the end of the first non-header row. It seemed to cause an off-by-one error in which field was interpreted as which, but the pipeline still ran, with fq2 playing the role of fq1 and no fq2. I think!

bensprung commented on July 30, 2024

P.S. As long as I have you on the line, here are a couple of things I have noticed about snpEff:

(1) The download-dir parameter in config.yaml must have a trailing slash to work, e.g. "/home/ben/my_download_dir/" works but "/home/ben/my_download_dir" does not. I'm not sure whether that is the intended behavior (maybe just a warning about this in the comments would suffice?).

(2) If feasible, it might be good to support custom databases for snpEff. I am new to using it, but as far as I can tell, the yeast databases it supplies are somewhat out of date: it has R64-1, while the latest is R64-3.
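Regarding point (1), here is a minimal sketch of how a directory setting could be normalized so that both spellings behave the same. It assumes the failure comes from the directory and a file name being joined by plain string concatenation, which this thread does not confirm. The helper name `with_trailing_slash` and the file name `genome.zip` are hypothetical.

```python
def with_trailing_slash(path: str) -> str:
    """Return `path` with exactly one "/" appended if it does not end in one."""
    return path if path.endswith("/") else path + "/"


# Assumed failure mode: naive concatenation glues the file name onto the
# last path component when the trailing slash is missing.
download_dir = "/home/ben/my_download_dir"
broken = download_dir + "genome.zip"  # hypothetical file name
fixed = with_trailing_slash(download_dir) + "genome.zip"

print(broken)
print(fixed)
```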

lczech commented on July 30, 2024

Hm, weird. I'm using pandas to read that table, and I have no idea what it does with those extra tabs... Apparently not the right thing, though. I might look into this at some point.
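For reference, here is a minimal sketch that reproduces the reported shift, assuming the table is read with a plain `pd.read_csv(..., sep="\t")` (the actual call in grenepipe may differ). When the data has one more field than the header, pandas uses the first column as the row index, which moves every value one column to the left. The helper `strip_extra_trailing_tabs` is hypothetical and shows just one possible workaround.

```python
import io

import pandas as pd

# Table from the report above: the data row ends with an extra tab.
table = (
    "sample\tunit\tplatform\tfq1\tfq2\n"
    "sample_name\t1\tILLUMINA\t/path/to/fq1\tpath/to/fq2\t\n"
)

# Default parsing: six data fields against five header names, so the first
# field becomes the index and the remaining values shift left by one column.
shifted = pd.read_csv(io.StringIO(table), sep="\t")


def strip_extra_trailing_tabs(text: str) -> str:
    """Drop empty trailing fields that exceed the number of header columns."""
    lines = text.splitlines()
    n_cols = len(lines[0].split("\t"))
    cleaned = []
    for line in lines:
        fields = line.split("\t")
        # Only remove surplus empty fields; a legitimately empty last
        # column (e.g. no fq2 for single-end data) is kept.
        while len(fields) > n_cols and fields[-1] == "":
            fields.pop()
        cleaned.append("\t".join(fields))
    return "\n".join(cleaned) + "\n"


fixed = pd.read_csv(io.StringIO(strip_extra_trailing_tabs(table)), sep="\t")

print(shifted)
print(fixed)
```

In `shifted`, the `fq1` column holds the fq2 path and `fq2` is empty, which matches the behavior described above. Passing `index_col=False` to `read_csv` is another documented way to stop pandas from using the first column as the index for files with trailing delimiters.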

As for your PS: Good points! Would you mind opening issues for those as well?

Closing this one here for now - feel free to re-open as needed, and see you in the other issues then (no worries, I'll be on the line here for a while).
