Comments (8)
I had this error as well. I tried to turn off all galaxy stuff, but eventually I just put sam_fa_indices.loc where bcbio was expecting it. It would be nice if this were unneeded and bcbio could use the .yaml to find the indices.
from bcbio-nextgen.
Pavel and Brent;
Thanks for the feedback. I added some documentation about how the code finds the galaxy directory to grab the .loc files:
https://bcbio-nextgen.readthedocs.org/en/latest/contents/configuration.html#reference-genome-files
The easiest way is to put your bcbio_system.yaml file in the Galaxy directory created by the installer, but it can also be tweaked by specifying the path inside that directory.
I'm open to ideas to make this easier, but I am also trying not to invent yet another way to specify these locations, which is why I co-opted the Galaxy location files. Brainstorming a bit, we could do something like:
indexes:
  directory: /path/to/base/directory
  hg19:
    seq: hg19/seq/hg19.fa
    bwa: hg19/bwa/hg19
if we dropped the loc files entirely. Brent, is that the approach you were envisioning?
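For illustration only, here is a minimal sketch of how paths might be resolved from a configuration shaped like the proposal above. The function name `resolve_index` and the dictionary standing in for the parsed YAML are hypothetical; nothing like this is confirmed to exist in bcbio.

```python
import os

def resolve_index(indexes, genome, kind):
    """Join the shared base directory with a genome's relative index path."""
    base = indexes["directory"]
    return os.path.join(base, indexes[genome][kind])

# Hypothetical in-memory form of the proposed `indexes` section.
indexes = {
    "directory": "/path/to/base/directory",
    "hg19": {"seq": "hg19/seq/hg19.fa", "bwa": "hg19/bwa/hg19"},
}

print(resolve_index(indexes, "hg19", "bwa"))
```

Keeping the per-genome entries relative to one `directory` key means a relocated install would need only a single line changed.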
from bcbio-nextgen.
I'm not sure how all the pieces fit together so take all this with the requisite grain of salt.
Normally, I have a dir like:
/path/to/hg19/
that contains everything for hg19, including the hg19.fa, all the bwa files, bowtie files, and the fai, etc...
I know that's not how your installer works, but maybe it could also use a base directory, which in your example above, would be:
hg19: /path/to/base/directory/hg19/
that could first look in hg19, then in hg19/seq or hg19/bwa as appropriate -- but only if the .loc files are not found?
Though maybe that's moving too much responsibility/complexity to the code.
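A rough sketch of that fallback search, assuming a single per-genome directory and a fixed list of subdirectories to try in order. The helper name `find_reference` and its default subdirectory list are made up for this example and are not part of bcbio.

```python
import os

def find_reference(genome_dir, filename, subdirs=("", "seq", "bwa")):
    """Return the first existing path for filename, trying each subdir in order."""
    for sub in subdirs:
        # An empty subdir means "look directly in genome_dir".
        candidate = os.path.join(genome_dir, sub, filename)
        if os.path.exists(candidate):
            return candidate
    return None  # caller decides how to report a missing reference
```

Returning `None` rather than raising leaves it to the caller to fall back to the .loc files or to print a clear start-up message.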
It's not too difficult to comply with using the galaxy .loc files, but it would be nice to have a message directly from bcbio_nextgen.py on start-up rather than from its sub-processes.
from bcbio-nextgen.
I've used the installer and it stores the .loc files in the directory .../bcbio/data/galaxy/tool-data/
These .loc files contain correct paths to the human references (.../bcbio/data/genomes/Hsapiens/hg19/seq/hg19.fa etc). All these files were created by bcbio_installer, so it looks like a misconfiguration when bcbio_nextgen looks for the .loc file in another directory. I don't think the loc files should be dropped entirely; probably just the path could be fixed.
from bcbio-nextgen.
Pavel;
Sorry, the fix for your problem may have gotten lost in all the discussion. The pipeline uses the location of bcbio_system.yaml to identify where the galaxy directory is. If you put your bcbio_system.yaml in /data/bcbio/data/galaxy, that should resolve the issue. Let me know if you still have problems.
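As a sketch of the lookup described above (not the pipeline's actual code), the galaxy directory can be taken as the directory holding bcbio_system.yaml, with .loc files expected under its tool-data subdirectory. Both function names are hypothetical.

```python
import os

def galaxy_dir_from_system_config(system_yaml):
    """Treat the directory holding bcbio_system.yaml as the galaxy directory."""
    return os.path.dirname(os.path.abspath(system_yaml))

def loc_file(system_yaml, name):
    """Path where a .loc file would be expected under tool-data."""
    galaxy_dir = galaxy_dir_from_system_config(system_yaml)
    return os.path.join(galaxy_dir, "tool-data", name)

print(loc_file("/data/bcbio/data/galaxy/bcbio_system.yaml", "bwa_index.loc"))
```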
from bcbio-nextgen.
Pavel -- feel free to reopen if you run into any other issues.
from bcbio-nextgen.
I ran the exome example again in the following manner:
~/bcbionextgen/anaconda/bin/python ~/bcbionextgen/tools/bin/bcbio_nextgen.py ~/bcbionextgen/galaxy/bcbio_system.yaml ../input ../config/NA12878-exome-methodcmp.yaml -n 8
and got a log with this error message:
[2013-08-07 12:38] Found YAML samplesheet, using ../config/NA12878-exome-methodcmp.yaml instead of Galaxy API
[2013-08-07 12:38] Checking sample YAML configuration: ../config/NA12878-exome-methodcmp.yaml
[2013-08-07 12:38] Preparing 2_2013-04-03_methodcmp
[2013-08-07 12:38] Preparing 4_2013-04-03_methodcmp
[2013-08-07 12:39] Timing: alignment
[2013-08-07 12:39] multiprocessing: align_prep_full
[2013-08-07 12:39] Aligning lane 2_2013-04-03_methodcmp with bwa aligner
[2013-08-07 12:39] bwa mem alignment from fastq: NA12878-2
[2013-08-07 12:39] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/fedotov/bcbionextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 22, in run
    _do_run(cmd, checks)
  File "/home/fedotov/bcbionextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 46, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command '/home/fedotov/bcbionextgen/tools/bin/bwa mem -M -t 8 -R '@rg\tID:2\tPL:illumina\tPU:2_2013-04-03_methodcmp\tSM:NA12878-2' -v 1 /home/fedotov/bcbionextgen/galaxy/tool-data/(GRCh37) /home/fedotov/bcbionextgen/experiments/work/../input/NA12878-NGv3-LAB1360-A_1.fastq.gz /home/fedotov/bcbionextgen/experiments/work/../input/NA12878-NGv3-LAB1360-A_2.fastq.gz | /home/fedotov/bcbionextgen/tools/bin/samtools view -b -S -u - | /home/fedotov/bcbionextgen/tools/bin/samtools sort -@ 8 -m 768M - /home/fedotov/bcbionextgen/experiments/work/align/NA12878-2/tx/tmpxZvXxO/2_2013-04-03_methodcmp-sort
' returned non-zero exit status 2
The reference path passed to bwa is incorrect (/home/fedotov/bcbionextgen/galaxy/tool-data/(GRCh37)).
If I run the full command with this argument replaced by the path listed in ~/bcbionextgen/galaxy/tool-data/bwa_index.loc
(/home/fedotov/bcbionextgen/genomes/Hsapiens/GRCh37/bwa/GRCh37.fa), the process runs fine.
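To show the kind of lookup involved, here is a minimal sketch that reads the reference path for a genome build out of .loc content. It assumes tab-separated lines with the path in the last column and `#` for comments. The function name `lookup_loc_path` and the sample line, including its display-name column, are illustrative only and are not taken from the real file.

```python
def lookup_loc_path(loc_lines, genome_build):
    """Return the last column of the first data line that names genome_build."""
    for line in loc_lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        # Match the build against the identifier columns, not the path itself.
        if genome_build in fields[:-1]:
            return fields[-1]
    return None

# Illustrative content; only the final path comes from the report above.
sample = [
    "# id\tdbkey\tdisplay name\tpath\n",
    "GRCh37\tGRCh37\tHuman (GRCh37)\t/home/fedotov/bcbionextgen/genomes/Hsapiens/GRCh37/bwa/GRCh37.fa\n",
]
print(lookup_loc_path(sample, "GRCh37"))
```

Splitting on tabs rather than on any whitespace keeps a display name that contains spaces in a single field.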
from bcbio-nextgen.
Pavel;
Apologies, this is a bug in the latest release. I'm planning a new release soon to address this, but in the meantime if you do:
bcbio_nextgen.py upgrade -u development
It will grab the latest development code which works cleanly. Sorry about the issue.
from bcbio-nextgen.