isugenomics / bioinformatics-workbook
Bioinformatics Workbook repository
Home Page: https://bioinformaticsworkbook.org
License: MIT License
The GATK Best Practices Workflow for DNA-Seq tutorial currently presents information that may require an update. It is important to note that Picard is included in GATK starting with GATK4.
Would it be possible to update this tutorial to reflect the change in the inclusion of Picard within GATK4? While it is understood that the tutorial may be intended to support GATK versions prior to 4.0, it is important to bring attention to this update for the sake of accuracy and clarity.
Optionally, it may be beneficial to link the tutorial to the relevant GATK Best Practices Workflows along with a note about any modifications necessary to adapt to non-human data (e.g. maize, plants).
Hi Jennifer,
Thanks for writing the awesome tutorial!
I found that you wrote "Optionally you could subset to only genes that are differentially expressed between groups." in this WGCNA tutorial. However, from what I have read here, the author actually doesn’t recommend this.
Taotao
Hi there,
This is a semi-automated message from a fellow bioinformatician. Through a GitHub search, I found that the following source files make use of BLAST's -max_target_seqs parameter:
Based on the recently published report, Misunderstood parameter of NCBI BLAST impacts the correctness of bioinformatics workflows, there is a strong chance that this parameter is misused in your repository.
If the use of this parameter was intentional, please feel free to ignore and close this issue, but I would highly recommend adding a comment to your source code to notify others about this use case. If this is a duplicate issue, please accept my apologies for the redundancy, as this simple automation is not smart enough to identify such issues.
Thank you!
-- Arman (armish/blast-patrol)
Thanks for this great resource. fastq-dump is fairly poorly documented, so this workbook was a great starting point for me.
I think there is a small error on: https://bioinformaticsworkbook.org/dataAcquisition/fileTransfer/sra.html#gsc.tab=0
parallel --jobs 3 "fastq-dump --split-files --origfmt --gzip {}" ::: SRR.numbers
I believe this will only work if you cat SRR.numbers:
parallel --jobs 3 "fastq-dump --split-files --origfmt --gzip {}" ::: $( cat SRR.numbers)
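The difference can be checked with a dry run. This is a minimal sketch, assuming GNU parallel is installed; `echo` stands in for the real `fastq-dump` call so nothing is downloaded, and the accession IDs are hypothetical placeholders. After `:::`, GNU parallel treats what follows as literal arguments, whereas `::::` reads the arguments from a file, which gives the same result as the `$(cat ...)` form.

```shell
# Work in a scratch directory so the placeholder file does not clobber anything.
cd "$(mktemp -d)"

# Hypothetical accession list, one ID per line (not real SRA runs).
printf 'SRR0000001\nSRR0000002\nSRR0000003\n' > SRR.numbers

# ':::' passes literal arguments, so this runs ONE job on the string "SRR.numbers".
literal=$(parallel --jobs 3 "echo fastq-dump --split-files --origfmt --gzip {}" ::: SRR.numbers)

# '::::' reads arguments from the file, one job per line.
# -k keeps the output in input order so the two results below are comparable.
from_file=$(parallel -k --jobs 3 "echo fastq-dump --split-files --origfmt --gzip {}" :::: SRR.numbers)

# Command substitution expands the file contents into literal arguments.
from_cat=$(parallel -k --jobs 3 "echo fastq-dump --split-files --origfmt --gzip {}" ::: $(cat SRR.numbers))

echo "literal:   $literal"
echo "from file: $from_file"
echo "from cat:  $from_cat"
```

Dropping the `echo` turns either of the last two forms into the real download command; the `::::` form avoids shell word-splitting of the file contents.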
Here is some material I think should be added to README.md. It was originally rejected in #10, and @aseetharam suggested that I post it in the issue section here, which disappears for at least two weeks.
To run the repo locally, go to the root directory of this repo
as suggested by @hsiaoyi0504
Hello!
Thank you for the great bioinformatics workbook. I've been using the Braker tutorial as inspiration to annotate my own de novo genome and it's been quite helpful thus far.
I am trying to generate the transcript/EST information as part of the (required?) input for Braker2. Is the Transcript/EST information a different datatype entirely than RNA-seq, and if I don't have it then this step must be skipped? Or can it be generated from the RNA-seq data?
I was thinking that if it can be generated then perhaps I'd generate a gtf file with stringtie/guided trinity, pull the fasta file from that with bedtools, and then re-align the fasta to the genome. This being said I may be way off-base and missing something obvious.
If you have any advice here then that would be very valuable!
Best,
Dustin