
Comments (13)

nghiavtr avatar nghiavtr commented on September 26, 2024

Hi @RenaeAtkinson,

Thank you for using Scasa in your research.

The error occurs at the alevin mapping step, most likely because it cannot find the input fastq files. Please check that the file names are in the right format. Note that the names of the fastq files should contain "R1" and "R2"; please see the details here: https://github.com/eudoraleer/scasa/wiki#6-input-fastq-files
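A quick way to check the names before running is sketched below. `check_fastq_name` is a hypothetical helper, not part of Scasa, and it assumes the tag appears as `_R1` or `_R2` followed by `.` or `_`. That is stricter than a plain substring match on purpose: an accession such as SRR10340946 already contains the letters "R1".

```shell
# Hypothetical helper (not part of Scasa): report which read tag a
# FASTQ file name carries. Assumes the tag looks like _R1. / _R1_ / _R2. / _R2_.
check_fastq_name() {
    case "$(basename "$1")" in
        *_R1[._]*) echo "R1" ;;
        *_R2[._]*) echo "R2" ;;
        *) echo "missing"; return 1 ;;
    esac
}
```

For example, `check_fastq_name SRR10340946_R1.fastq` prints `R1`, while a raw fastq-dump name such as `SRR10340946_1.fastq.gz` prints `missing`.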

Best,
Nghia

from scasa.

RenaeAtkinson avatar RenaeAtkinson commented on September 26, 2024


nghiavtr avatar nghiavtr commented on September 26, 2024

Hi @RenaeAtkinson,

I don't see a clear issue in your command, except that it should use "--project" (two hyphens) rather than "–project" (an en dash). Your command mostly uses the default parameter values, so can you try again with the shorter version below:

scasa --in /network/rit/lab/conklinlab/Renae/HNVC/HNVC02/SRR10340946/ \
--fastq SRR10340946_R1.fastq,SRR10340946_R2.fastq \
--out /network/rit/lab/conklinlab/Renae/SCASA/HNVC02/ \
--ref /network/rit/lab/conklinlab/Renae/SCASA/refMrna.fa \
--whitelist /network/rit/lab/conklinlab/Renae/HNVC/V2/737K-august-2016.txt \
--tech 10xv2 \
--nthreads 32

Best,
Nghia


RenaeAtkinson avatar RenaeAtkinson commented on September 26, 2024


nghiavtr avatar nghiavtr commented on September 26, 2024

Hi,
The error indicates that the Alevin alignment was not performed.
My first guess is that the input file name is incorrect, but that would be odd because it looks correct.

Can you test this by renaming SRR10340946_R1.fastq to Sample_01_S1_L001_R1_001.fastq and SRR10340946_R2.fastq to Sample_01_S1_L001_R2_001.fastq, as in the Scasa sample files?

Another possibility is that the R1 and R2 files do not contain the expected information (one holds the sequence content and the other the barcode+UMI); in that case, just swap the file names.
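One way to tell the two files apart is to look at the length of the first read in each. The sketch below assumes 10x v2 chemistry, where the barcode+UMI read is typically 26 bases (16 nt barcode plus 10 nt UMI) and the cDNA read is longer. `first_read_length` is a hypothetical helper name.

```shell
# Hypothetical helper: print the length of the first sequence line
# (line 2) of a FASTQ file, plain or gzip-compressed.
first_read_length() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk 'NR == 2 { print length($0); exit }'
}
```

Running `first_read_length` on both files should show which one is the short barcode+UMI read. If that turns out to be the file named R2, the names are swapped.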

Please try these and let me know if either of them works, thanks!

Nghia


RenaeAtkinson avatar RenaeAtkinson commented on September 26, 2024


nghiavtr avatar nghiavtr commented on September 26, 2024

Hi,

It is really strange. Can you post the first few lines of R1 and R2 here?
And if possible, can you send me the files or a subset of reads from them? I will try to reproduce the error by running Scasa on the files.
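Both requests can be covered with a small helper that prints the first N reads of a FASTQ file (four lines per read). This is only a sketch; `subset_fastq` is a hypothetical name, and taking reads from the top of both files keeps the R1/R2 pairs in step.

```shell
# Hypothetical helper: print the first N reads (4 lines each) of a
# FASTQ file, plain or gzip-compressed.
# Usage: subset_fastq FILE N
subset_fastq() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | head -n "$(( $2 * 4 ))"
}
```

For instance, `subset_fastq SRR10340946_R1.fastq 2` shows the first two reads, and `subset_fastq SRR10340946_R1.fastq 100000 | gzip > subset_R1.fastq.gz` writes a small file that is easy to share (the output file name is just an example).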

Nghia


RenaeAtkinson avatar RenaeAtkinson commented on September 26, 2024


nghiavtr avatar nghiavtr commented on September 26, 2024

Hi @RenaeAtkinson,

Well, I cannot reproduce your error; please see the code I tried below. So it is clear that the issue is not the input data format.

I guess you might have missed some steps, for example forgetting to add scasa or salmon alevin to your paths (export PATH and export LD_LIBRARY_PATH).
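A minimal sketch for checking whether both tools are visible in the current shell is given here. `check_tool` is a hypothetical helper; it only relies on the standard `command -v` builtin.

```shell
# Hypothetical helper: report where a command resolves on PATH,
# or fail if the shell cannot find it.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}
```

Calling `check_tool scasa` and `check_tool salmon` in the same session (or job script) that runs the pipeline shows whether the `export PATH` lines took effect there.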

Nghia


##################################################################
# 1. Download scasa:
##################################################################
wget https://github.com/eudoraleer/scasa/releases/download/scasa.v1.0.0/scasa_v1.0.0.tar.gz
tar -xzvf scasa_v1.0.0.tar.gz
export PATH=$PWD/scasa:$PATH

##################################################################
# 2. Download salmon alevin:
##################################################################
wget https://github.com/COMBINE-lab/salmon/releases/download/v1.4.0/salmon-1.4.0_linux_x86_64.tar.gz
tar -xzvf salmon-1.4.0_linux_x86_64.tar.gz
export PATH=$PWD/salmon-latest_linux_x86_64/bin:$PATH
export LD_LIBRARY_PATH=$PWD/salmon-latest_linux_x86_64/lib:$LD_LIBRARY_PATH

##################################################################
# 3. Download UCSC hg38 cDNA fasta reference:
##################################################################
mkdir Annotation
cd Annotation
wget https://www.dropbox.com/s/xoa6yl562a5lv35/refMrna.fa.gz
refPath=$PWD/refMrna.fa.gz

# use the raw file URL; the /blob/ URL returns an HTML page rather than the barcode list
wget https://github.com/10XGenomics/cellranger/raw/master/lib/python/cellranger/barcodes/737K-august-2016.txt
whitelistFile=$PWD/737K-august-2016.txt

cd ..

##################################################################
# 4. Download the CITE-seq RNA samples:
##################################################################

mkdir CiteSeqData
cd CiteSeqData

### use sratools to download the sample
# module load sratools/3.0.0
prefetch SRR10340946
cd SRR10340946
fastq-dump --gzip --split-3 SRR10340946.sra

#change the name
mv SRR10340946_1.fastq.gz SRR10340946_L001_R1_001.fastq.gz
mv SRR10340946_2.fastq.gz SRR10340946_L001_R2_001.fastq.gz

InputDir=$PWD
cd ..

#number of threads
threadNum=$(nproc)

#run scasa
scasa --in $InputDir --fastq SRR10340946_L001_R1_001.fastq.gz,SRR10340946_L001_R2_001.fastq.gz --ref $refPath  --tech 10xv2 --nthreads $threadNum --whitelist $whitelistFile --out ScasaOut_SRR10340946
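Since the whitelist is fetched from a web link, it can be worth confirming that the downloaded file really is a barcode list and not an HTML page. The sketch below assumes the 10x v2 whitelist holds one 16-nucleotide barcode per line; `looks_like_whitelist` is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the first line of the file is a
# 16-nucleotide barcode (assumed format of the 10x v2 whitelist).
looks_like_whitelist() {
    head -n 1 "$1" | grep -Eq '^[ACGT]{16}$'
}
```

For example, `looks_like_whitelist "$whitelistFile" || echo "whitelist does not look like a barcode list"` can be run before the scasa command.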


RenaeAtkinson avatar RenaeAtkinson commented on September 26, 2024


nghiavtr avatar nghiavtr commented on September 26, 2024

Hi @RenaeAtkinson,

If you see the message 'Error in file(con, "r") : cannot open the connection', it definitely means that the program cannot find the file, so it is not an issue with Scasa.
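To rule this out before a run, the input directory and the comma-separated file list can be checked with something like the sketch below. `check_inputs` is a hypothetical helper that mirrors how the values are passed to `--in` and `--fastq`; it is not part of Scasa.

```shell
# Hypothetical helper: check that every file in a comma-separated list
# exists and is readable inside the given directory.
# Usage: check_inputs DIR FILE1,FILE2
# The body runs in a subshell, so the IFS and globbing changes stay local.
check_inputs() (
    dir=$1
    status=0
    set -f      # do not expand wildcards in the file list
    IFS=,       # split the list on commas
    for f in $2; do
        if [ -r "$dir/$f" ]; then
            echo "ok: $dir/$f"
        else
            echo "missing: $dir/$f"
            status=1
        fi
    done
    exit "$status"
)
```

For example: `check_inputs /path/to/SRR10340946 SRR10340946_R1.fastq,SRR10340946_R2.fastq`, where the directory is a placeholder.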

I ran Scasa on your sample SRR10340946 on my Linux computer, and it worked without error. I provided the code previously (but I forgot to put it in code format, very sorry), so I am posting it again below. I use sratools to download the SRR10340946 data. You just need to copy and paste the command lines and it should work.

Nghia

##################################################################
# 1. Download scasa:
##################################################################
wget https://github.com/eudoraleer/scasa/releases/download/scasa.v1.0.0/scasa_v1.0.0.tar.gz
tar -xzvf scasa_v1.0.0.tar.gz
export PATH=$PWD/scasa:$PATH

##################################################################
# 2. Download salmon alevin:
##################################################################
wget https://github.com/COMBINE-lab/salmon/releases/download/v1.4.0/salmon-1.4.0_linux_x86_64.tar.gz
tar -xzvf salmon-1.4.0_linux_x86_64.tar.gz
export PATH=$PWD/salmon-latest_linux_x86_64/bin:$PATH
export LD_LIBRARY_PATH=$PWD/salmon-latest_linux_x86_64/lib:$LD_LIBRARY_PATH

##################################################################
# 3. Download UCSC hg38 cDNA fasta reference:
##################################################################
mkdir Annotation
cd Annotation
wget https://www.dropbox.com/s/xoa6yl562a5lv35/refMrna.fa.gz
refPath=$PWD/refMrna.fa.gz

# use the raw file URL; the /blob/ URL returns an HTML page rather than the barcode list
wget https://github.com/10XGenomics/cellranger/raw/master/lib/python/cellranger/barcodes/737K-august-2016.txt
whitelistFile=$PWD/737K-august-2016.txt

cd ..

##################################################################
# 4. Download the CITE-seq RNA samples:
##################################################################

mkdir CiteSeqData
cd CiteSeqData

### use sratools to download the sample
# module load sratools/3.0.0
prefetch SRR10340946
cd SRR10340946
fastq-dump --gzip --split-3 SRR10340946.sra

#change the name
mv SRR10340946_1.fastq.gz SRR10340946_L001_R1_001.fastq.gz
mv SRR10340946_2.fastq.gz SRR10340946_L001_R2_001.fastq.gz

InputDir=$PWD
cd ..

#number of threads
threadNum=$(nproc)

#run scasa
scasa --in $InputDir --fastq SRR10340946_L001_R1_001.fastq.gz,SRR10340946_L001_R2_001.fastq.gz --ref $refPath  --tech 10xv2 --nthreads $threadNum --whitelist $whitelistFile --out ScasaOut_SRR10340946


RenaeAtkinson avatar RenaeAtkinson commented on September 26, 2024


nghiavtr avatar nghiavtr commented on September 26, 2024

Hi @RenaeAtkinson,

The first error, from mkdir, can be ignored; it is harmless.
The second error indicates that salmon alevin did not run properly, because no bfh.txt file exists, so yes, this is the main issue.
I have no experience with running Salmon through conda, but usually we don't need conda to run Salmon. I also have not tried salmon version 1.10.1, so I am not sure whether any of its settings have changed. I suggest you use the same salmon version that I tested.
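Both points can be checked from the shell with the sketch below. The helper names are hypothetical. The exact location of bfh.txt inside the Scasa output is not stated here, so the first helper simply searches the whole output directory. The second assumes `salmon --version` prints a line like `salmon 1.4.0`.

```shell
# Hypothetical helper: succeed if a bfh.txt file exists anywhere under
# the given output directory.
has_bfh() {
    find "$1" -type f -name 'bfh.txt' | grep -q .
}

# Hypothetical helper: take the version number from a line such as
# "salmon 1.4.0" (assumed output format of `salmon --version`).
salmon_version_of() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}
```

For example, `has_bfh ScasaOut_SRR10340946 || echo "alevin did not produce bfh.txt"`, and `salmon_version_of "$(salmon --version)"` should print `1.4.0` when the tested version is the one on PATH.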

Nghia

