nf-core / mhcquant

Identify and quantify MHC eluted peptides from mass spectrometry raw data

Home Page: https://nf-co.re/mhcquant

License: MIT License

HTML 1.65% Python 11.13% Nextflow 87.21%

nf-core nextflow peptides mass-spectrometry mhc workflow pipeline openms immunopeptidomics dda

mhcquant's Introduction

Introduction

nf-core/mhcquant is a best-practice bioinformatics pipeline to process data-dependent acquisition (DDA) immunopeptidomics data. This involves mass spectrometry-based identification and quantification of immunopeptides presented on major histocompatibility complex (MHC) molecules, which mediate T cell immunosurveillance. Immunopeptidomics has central implications for clinical research in the context of T cell-centric immunotherapies.

The pipeline is based on the OpenMS C++ framework for computational mass spectrometry. Spectrum files (mzML/Thermo raw/Bruker tdf) serve as inputs, and a database search (Comet) is performed against a given input protein database. Peptide properties are predicted by MS²Rescore. FDR rescoring is applied using Percolator based on a competitive target-decoy approach. For label-free quantification, all input files undergo identification-based retention time alignment and targeted feature extraction, matching IDs between runs.

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the nf-core website.

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set-up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.tsv

ID	Sample	Condition	ReplicateFileName
1	tumor	treated	/path/to/msrun1.raw|mzML|d
2	tumor	treated	/path/to/msrun2.raw|mzML|d
3	tumor	untreated	/path/to/msrun3.raw|mzML|d
4	tumor	untreated	/path/to/msrun4.raw|mzML|d

Each row represents a mass spectrometry run in one of the formats: raw, RAW, mzML, mzML.gz, d, d.tar.gz, d.zip
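Before launching a run, it can help to check the samplesheet locally. The following is a minimal sketch, assuming the four-column header and the file formats listed above; `check_samplesheet` is a hypothetical helper and not part of the pipeline, which performs its own validation.

```python
import csv
import io

# File formats listed in the README for the ReplicateFileName column.
ALLOWED_SUFFIXES = (".raw", ".RAW", ".mzML", ".mzML.gz", ".d", ".d.tar.gz", ".d.zip")
EXPECTED_COLUMNS = ["ID", "Sample", "Condition", "ReplicateFileName"]


def check_samplesheet(text):
    """Return a list of human-readable problems found in a TSV samplesheet."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        # Typical cause: literal "\t" or spaces instead of real tab characters.
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):
        path = row["ReplicateFileName"] or ""
        if not path.endswith(ALLOWED_SUFFIXES):
            problems.append(f"line {line_no}: unsupported file type: {path}")
    return problems
```

An empty list means the header and file extensions look plausible; it does not confirm that the files exist or are readable.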

Now, you can run the pipeline using:

nextflow run nf-core/mhcquant \
    -profile <docker/singularity/.../institute> \
    --input 'samplesheet.tsv' \
    --fasta 'SWISSPROT_2020.fasta' \
    --outdir ./results

Warning

Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.

For more details and further functionality, please refer to the usage documentation and the parameter documentation.

Pipeline summary

Default Steps

By default the pipeline currently performs identification of MHC class I peptides with HCD settings:

Preparing spectra dependent on the input format (PrepareSpectra)
Creation of reversed decoy database (DecoyDatabase)
Identification of peptides in the MS/MS spectra (CometAdapter)
Refreshes the protein references for all peptide hits and adds target/decoy information (PeptideIndexer)
Merges identification files with the same Sample and Condition label (IDMerger)
Prediction of retention times and MS2 intensities (MS²Rescore)
Extract PSM features for Percolator (PSMFeatureExtractor)
Peptide-spectrum-match rescoring using Percolator (PercolatorAdapter)
Filters peptide identification result according to 1% FDR (IDFilter)
Converts identification result to tab-separated files (TextExporter)
Converts identification result to mzTab files (MzTabExporter)
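
The 1% FDR filter in the list above rests on the competitive target-decoy idea: decoy hits above a score threshold estimate how many target hits above it are false. The sketch below illustrates only that idea. It is a simplification, not what Percolator or IDFilter actually run, and the function names are hypothetical.

```python
def q_values(psms):
    """Compute q-values for (score, is_decoy) pairs; higher score is better.

    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score.
        fdrs.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at this threshold or any more permissive one.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()
    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]


def accepted_targets(psms, threshold=0.01):
    """Scores of target PSMs whose q-value is at or below the threshold."""
    return [s for s, d, qv in q_values(psms) if not d and qv <= threshold]
```

With the default threshold of 0.01 this mirrors the 1% cut-off named above; the pipeline exposes its own threshold through the fdr_threshold parameter.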

Additional Steps

Additional functionality contained by the pipeline currently includes:

Quantification

Corrects retention time distortions between runs (MapAlignerIdentification)
Applies retention time transformations to runs (MapRTTransformer)
Detects features in MS1 data based on peptide identifications (FeatureFinderIdentification)
Group corresponding features across label-free experiments (FeatureLinkerUnlabeledKD)
Resolves ambiguous annotations of features with peptide identifications (IDConflictResolver)
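
The first two quantification steps can be pictured as fitting a retention time mapping from peptides identified in both a run and a reference, then applying it to the run. The sketch below fits a plain least-squares line. It is an illustration under that assumption, not the algorithm implemented in MapAlignerIdentification, and all names are hypothetical.

```python
def fit_linear_rt_map(run_rts, reference_rts):
    """Fit reference_rt = slope * run_rt + intercept by least squares.

    Both arguments map peptide sequence -> retention time in seconds.
    Only peptides identified in both runs serve as anchor points.
    """
    shared = sorted(set(run_rts) & set(reference_rts))
    if len(shared) < 2:
        raise ValueError("need at least two shared peptides to fit a line")
    xs = [run_rts[p] for p in shared]
    ys = [reference_rts[p] for p in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("shared peptides all have the same retention time")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def transform_rt(rt, slope, intercept):
    """Apply the fitted mapping to a single retention time."""
    return slope * rt + intercept
```

A run with too few shared identifications cannot be aligned this way, which is why the sketch raises an error instead of guessing a mapping.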

Output

Annotates final list of peptides with their respective ions and charges (IonAnnotator)

Documentation

To see the results of a test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Nextflow installation
Pipeline configuration
- Pipeline installation
- Adding your own system config
Running the pipeline
- This includes tutorials, FAQs, and troubleshooting instructions
Output and how to interpret the results

Credits

nf-core/mhcquant was originally written by Leon Bichmann from the Kohlbacher Lab. The pipeline was re-written in Nextflow DSL2 by Marissa Dubbelaar and was significantly improved by Jonas Scheid and Steffen Lemke from Peptide-based Immunotherapy and Quantitative Biology Center in Tübingen.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #mhcquant channel (you can join with this invite).

Citations

If you use nf-core/mhcquant for your analysis, please cite it using the following doi: 10.5281/zenodo.1569909 and the corresponding manuscript:

MHCquant: Automated and Reproducible Data Analysis for Immunopeptidomics

Leon Bichmann, Annika Nelde, Michael Ghosh, Lukas Heumos, Christopher Mohr, Alexander Peltzer, Leon Kuchenbecker, Timo Sachsenberg, Juliane S. Walz, Stefan Stevanović, Hans-Georg Rammensee & Oliver Kohlbacher

Journal of Proteome Research 2019 18 (11), 3876-3884. doi: 10.1021/acs.jproteome.9b00313

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

In addition, references of tools and data used in this pipeline are as follows:

OpenMS framework

Pfeuffer J. et al, Nat Methods 2024 Mar;21(3):365-367. doi: 10.1038/s41592-024-02197-7.

Comet Search Engine

Eng J.K. et al, J Am Soc Mass Spectrom. 2015 Nov;26(11):1865-74. doi: 10.1007/s13361-015-1179-x.

Retention time prediction

Bouwmeester R. et al, Nature Methods 2021 Oct;18(11):1363-1369. doi: 10.1038/s41592-021-01301-5

MS² Peak intensity prediction

Declercq A. et al, Nucleic Acids Res. 2023 Jul 5;51(W1):W338-W342. doi: 10.1093/nar/gkad335

MS²Rescore framework

Buur L. M. et al, J Proteome Res. 2024 Mar 16. doi: 10.1021/acs.jproteome.3c00785

Percolator

Käll L. et al, Nat Methods 2007 Nov;4(11):923-5. doi: 10.1038/nmeth1113.

Identification based RT Alignment

Weisser H. et al, J Proteome Res. 2013 Apr 5;12(4):1628-44. doi: 10.1021/pr300992u

Targeted peptide quantification

Weisser H. et al, J Proteome Res. 2017 Aug 4;16(8):2964-2974. doi: 10.1021/acs.jproteome.7b00248

mhcquant's Issues

U / B / X in fasta

mhcflurry will fail if any of these residues is included in a FASTA file: U / B / X
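
One possible workaround is to separate out protein entries containing these residues before they reach the prediction step. The sketch below is an assumption about how that could be done, not the fix adopted in the pipeline; `read_fasta` and `split_by_residues` are hypothetical helpers.

```python
# Residue codes named in this issue.
NONSTANDARD = set("UBX")


def read_fasta(text):
    """Yield (header, sequence) tuples from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_residues(text):
    """Return (clean, flagged) lists of entries; flagged ones contain U, B or X."""
    clean, flagged = [], []
    for header, seq in read_fasta(text):
        bucket = flagged if NONSTANDARD & set(seq.upper()) else clean
        bucket.append((header, seq))
    return clean, flagged
```

Whether flagged entries should be dropped, or only the affected peptides skipped, is a design decision this sketch leaves open.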

Samplesheet format `Usage.md`

Description of feature

Change \t to real tabs in samplesheet.

Please create TEMPLATE branches

https://nf-co.re/adding_pipelines

Issue annotation of the HLA alleles in parameters

Description of the bug

The current annotation of A 03:01;A 68:01;B 27:05;B 35:03;C 02:02;C 04:01 gives a misleading impression of how the HLA alleles should be annotated.

Command used and terminal output

No response

Relevant files

No response

System information

No response

PASS Filter for Variants in vcf reader

The vcf reader function in MHCquant currently doesn't filter provided vcfs by PASS.

Hence, this has to be done manually by the user at the moment but should be easily fixed in a new release.
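
Until such a fix is released, the manual filtering can be done with a few lines of Python. This is a minimal sketch that relies only on the VCF convention that FILTER is the seventh tab-separated column; `keep_pass_records` is a hypothetical helper, not part of MHCquant.

```python
def keep_pass_records(lines):
    """Yield header lines unchanged and only records whose FILTER is PASS."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
        if len(fields) > 6 and fields[6] == "PASS":
            yield line
```

It can be applied to an open file handle, writing the yielded lines to a new VCF that is then listed in the vcf sheet.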

parsing of VEP annotated VCF files fails

Description of the bug

VEP-annotated VCF files cause an error during parsing in variants2fasta.py. In more detail: the field corresponding to CDS_position is filled with a string of the form {position/length}, but the program expects a parsable integer here.
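
A tolerant parser for this field could look like the sketch below. It assumes the value is either a plain integer or position/length (such as the 144/3999 reported in this issue), possibly with a start-end range before the slash. `parse_cds_position` is a hypothetical helper and not the fix applied in variants2fasta.py.

```python
def parse_cds_position(field):
    """Return the 0-based CDS start position from a VEP-style position field."""
    # Drop the "/length" part if present, e.g. "144/3999" -> "144".
    position = field.split("/", 1)[0]
    # Assumption: a range such as "144-146" may occur; use its start.
    start = position.split("-", 1)[0]
    if not start.isdigit():
        raise ValueError(f"cannot parse CDS position from {field!r}")
    # The original code subtracts 1 to obtain a 0-based position.
    return int(start) - 1
```

Values that cannot be interpreted still raise a ValueError, but with a message that names the offending field.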

Command used and terminal output

nextflow run nf-core/mhcquant -r 2.2.0 -profile cfc \
 --input '/sfs/7/workspace/ws/qeasc01-QLFGB-mhcquant-0/samplesheet_class1_mhcquant.tsv' \
 --fasta '/sfs/7/workspace/ws/qeasc01-QLFGB-mhcquant-0/UP000005640_9606.fasta' \
 --allele_sheet '/sfs/7/workspace/ws/qeasc01-QLFGB-mhcquant-0/allele_sheet_mhcquant.tsv' \
 --peptide_max_length 12 \
 --predict_class1 \
 --fdr_threshold 0.1 \
 --predict_RT \
 --include_proteins_from_vcf \
 --vcf_sheet '/sfs/7/workspace/ws/qeasc01-QLFGB-mhcquant-0/vcf_sheet_mhcquant.tsv' \
 --variant_annotation_style 'VEP' \
 --variant_indel_filter \
 --variant_reference 'GRCH37' \
 --max_time '240.h'


Command error:
  Using TensorFlow backend.
  Traceback (most recent call last):
    File "/home-link/qeasc01/.nextflow/assets/nf-core/mhcquant/bin/variants2fasta.py", line 228, in <module>
      sys.exit(main())
    File "/home-link/qeasc01/.nextflow/assets/nf-core/mhcquant/bin/variants2fasta.py", line 172, in main
      variants = read_variant_effect_predictor(args.vcf, gene_filter=protein_ids)
    File "/home-link/qeasc01/.nextflow/assets/nf-core/mhcquant/bin/variants2fasta.py", line 76, in read_variant_effect_predictor
      coding[transcript_id] = MutationSyntax(transcript_id, int(transcript_pos)-1,
  ValueError: invalid literal for int() with base 10: '144/3999'

Relevant files

No response

System information

No response

Add backslashes to run command

Description of feature

The Quick Start run command needs backslashes.

Problem with step 11, run Percolator

I would like to thank the team for the amazing work and the great software. I tried to run the pipeline and everything went well until step 11, where execution ends with the following error:
"""
Error executing process > 'run_percolator (1)'

Caused by:
Process run_percolator (1) terminated with an error exit status (8)

Command executed:

OMP_NUM_THREADS=6
PercolatorAdapter -in s1_all_ids_merged_psm.idXML
-out s1_all_ids_merged_psm_perc.idXML
-seed 4711
-trainFDR 0.05
-testFDR 0.05
-threads 6
-enzyme no_enzyme
-peptide-level-fdrs
-subset-max-train 0
-doc 0 \

Command exit status:
8

Command output:
Loading input file: s1_all_ids_merged_psm.idXML
Merging peptide ids.
Merging protein ids.
Error: Unexpected internal error (Prefix of string '6file=4959' successfully converted to an integer value. Additional characters found at position 2)

Command wrapper:
Loading input file: s1_all_ids_merged_psm.idXML
Merging peptide ids.
Merging protein ids.
Error: Unexpected internal error (Prefix of string '6file=4959' successfully converted to an integer value. Additional characters found at position 2)
"""
I have also tried to run the PercolatorAdapter from the OpenMS docker container with the s1_all_ids_merged_psm.idXML from the intermediate results, which ends with the same error.

I would truly appreciate your help, thanks a lot

Ion identification within a mass spectrum

Hello !

I am very new to MS/MS and MHCquant.
I would like to get the peak identification in my mass spectrum when I visualize them, to know which peaks were considered as significant for peptide identification and which ion corresponds to which peak (cf image).
I guess there is something within the pipeline that identifies such peaks: is there a way to retrieve this information somewhere ?

For now, I just retrieve the scan ID for each identified peptide, and I visualize the associated mass spectrum on XCalibur.

Best,
Paul

Default runtime limit for predicting possible neoepitopes is too short

The default runtime limit for the processes:
"predict_possible_neoepitopes" and "predict_possible_classII_neoepitopes"

is too low since querying the Biomart API for large vcf files can exceed this limit.

However, currently it is possible to raise the runtime by specifying a runtime profile with the -c parameter.

dev branch has not been bumped to dev

Include OpenMS FileInfo to check for valid mzML files

Description of feature

We keep having issues with malformed mzML files and retrieve non-descriptive errors by the CometAdapter. Therefore it would be of benefit to check the input (or converted) mzML files before processing them further. For this FileInfo in openms comes in handy.

boolean parameters default == false

All boolean parameters such as include_proteins_from_vcf should be false per default, since setting them to false is not possible at the moment.

protein fasta file with empty first line

ERROR ~ Error executing process > 'generate_decoy_database (1)'

Caused by:
Process generate_decoy_database (1) terminated with an error exit status (3)

Command executed:

DecoyDatabase -in UP000005640_9606_reviewed_added_vcf.fasta
-out UP000005640_9606_reviewed_added_vcf_decoy.fasta
-decoy_string DECOY_
-decoy_string_position prefix

Command exit status:
3

Command output:
Version 2.4.0-HEAD-2018-10-26 of DecoyDatabase is available at www.OpenMS.de
Warning: Only one FASTA input file was provided, which might not contain contaminants.You probably want to have them! Just add the contaminant file to the input file list 'in'.
Error: Unable to read file (Error while parsing FASTA file! The first entry could not be read! Please check the file! in: )

Command wrapper:
Version 2.4.0-HEAD-2018-10-26 of DecoyDatabase is available at www.OpenMS.de
Warning: Only one FASTA input file was provided, which might not contain contaminants.You probably want to have them! Just add the contaminant file to the input file list 'in'.
Error: Unable to read file (Error while parsing FASTA file! The first entry could not be read! Please check the file! in: )

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

-- Check '.nextflow.log' file for details
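
A blank first line like the one named in this issue's title can be removed before the database reaches DecoyDatabase. The sketch below is one way to do that under the assumption that only leading blank lines are the problem; both helpers are hypothetical.

```python
def strip_leading_blank_lines(text):
    """Remove empty or whitespace-only lines before the first FASTA entry."""
    lines = text.splitlines(keepends=True)
    first = 0
    while first < len(lines) and not lines[first].strip():
        first += 1
    return "".join(lines[first:])


def starts_with_entry(text):
    """True if the first non-blank line is a FASTA header."""
    return strip_leading_blank_lines(text).startswith(">")
```

Everything after the first entry is passed through unchanged, so other formatting problems would still surface in the OpenMS tools.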

Output documentation

Description of the bug

There are some typos in the output doc and other things I would suggest to change.

Command used and terminal output

No response

Relevant files

No response

System information

No response

Remove TOC `Usage.md`

Description of feature

The Table of contents can be removed from the Usage.md to adhere to the structure of other pipeline docs.

Changes to the params documentation

Description of feature

I looked at the parameter documentation and noted some things that I would propose to change. I will do a bunch of commits in the following pull request to implement these changes.

Remove second environment

Hi @Leon-Bichmann !

we need to get all tools in a single environment, thus need to recompile all tools required by this pipeline against the most current CXX environment in bioconda/conda-forge. Can you start listing up tools in here, then we could for example start coming up with a strategy on how to achieve this?

This second environment causes so many downstream issues, that we should get rid of this asap:

Nextflow Conda environment support is broken this way
Reproducibility for users is quite difficult, as they need two environments set up

....

We should get this rolling for 1.3.X already...

Add more documentation about individual mzTab columns

For example the score column contains q-values but it is not included in the documentation.

Issue in the MzTabExporter

Description of the bug

I experienced the following issue on the dev branch, and I would like to resolve this before release (of course), but I have no idea where to look for this. I noticed a similar issue request on OpenMS.
The major difference is that I tried to update OpenMS from 2.6.0 to 2.8.0 on the dev branch. I will go over the changes between these two versions, but some input would be highly appreciated.

The following error is returned:

Caused by:
  Process `NFCORE_MHCQUANT:MHCQUANT:PROCESS_FEATURE:OPENMS_MZTABEXPORTER_QUANT (QAMTL477AO_J314_Pre_T39L243_J314_Pre_T39L243)` terminated with an error exit status (8)

Command executed:

  MzTabExporter -in QAMTL477AO_J314_Pre_T39L243_J314_Pre_T39L243_resolved.consensusXML \
      -out QAMTL477AO_J314_Pre_T39L243_J314_Pre_T39L243.mzTab \
      -threads 2 \
  
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_MHCQUANT:MHCQUANT:PROCESS_FEATURE:OPENMS_MZTABEXPORTER_QUANT":
      openms: $(echo $(FileInfo --help 2>&1) | sed 's/^.*Version: //; s/-.*$//' | sed 's/ -*//; s/ .*$//')
  END_VERSIONS

Command exit status:
  8

Command output:
  No fractions annotated in consensusXML. Assuming unfractionated.
  Error: Unexpected internal error (PSM controllerType=0 controllerNumber=1 scan=9440 does not map to an MS file registered in the quantitative metadata. Check your merging and filtering steps and/or report the issue, please.)
  <No fractions annotated in consensusXML. Assuming unfractionated.> occurred 4 times

Command wrapper:
  nxf-scratch-dir node003:/scratch/117012/nxf.8kITDHhTPh
  No fractions annotated in consensusXML. Assuming unfractionated.
  Error: Unexpected internal error (PSM controllerType=0 controllerNumber=1 scan=9440 does not map to an MS file registered in the quantitative metadata. Check your merging and filtering steps and/or report the issue, please.)
  <No fractions annotated in consensusXML. Assuming unfractionated.> occurred 4 times

I also looked into the file of interest and could find the spectrum_reference annotated:

<consensusElement id="e_11275396928539264273" quality="3.574097" charge="5">
			<centroid rt="2886.454795973814726" mz="527.123682987430925" it="3.764833e06"/>
			<groupedElementList>
				<element map="0" id="28923268865551816" rt="2897.00668984044114" mz="527.123682987430925" it="4.085203e06" charge="5"/>
				<element map="1" id="14924012554702823270" rt="2882.901894744373749" mz="527.123682987430925" it="3.478242e06" charge="5"/>
				<element map="2" id="13437987073385272035" rt="2880.484798142305408" mz="527.123682987430925" it="3.642787e06" charge="5"/>
				<element map="3" id="1164487179193034968" rt="2885.425801168138605" mz="527.123682987430925" it="3.8531e06" charge="5"/>
			</groupedElementList>
			<PeptideIdentification identification_run_ref="PI_0" score_type="q-value" higher_score_better="false" significance_threshold="0" MZ="527.1254486319" RT="2874.61804275669" spectrum_reference="controllerType=0 controllerNumber=1 scan=9440" >
				<PeptideHit score="0.0126048" sequence="ANANSRQQIRKLIKDGLIIRKPV" charge="5" aa_before="I I I I I I I" aa_after="T T T T T T T" start="32 32 32 29 52 52 32" end="54 54 54 51 74 74 54" protein_refs="PH_219 PH_1709 PH_1272 PH_1265 PH_1253 PH_937 PH_927">
					<UserParam type="string" name="target_decoy" value="target"/>
					<UserParam type="string" name="MS:1002258" value="11"/>
					<UserParam type="string" name="MS:1002259" value="176"/>
					<UserParam type="string" name="num_matched_peptides" value="4252"/>
					<UserParam type="int" name="isotope_error" value="0"/>
					<UserParam type="float" name="MS:1002252" value="2.234"/>
					<UserParam type="float" name="MS:1002253" value="1.0"/>
					<UserParam type="float" name="MS:1002254" value="0.0"/>
					<UserParam type="float" name="MS:1002255" value="25.699999999999999"/>
					<UserParam type="float" name="MS:1002256" value="1.0"/>
					<UserParam type="float" name="MS:1002257" value="0.0494"/>
					<UserParam type="string" name="protein_references" value="non-unique"/>
					<UserParam type="float" name="COMET:deltCn" value="1.0"/>
					<UserParam type="float" name="COMET:deltLCn" value="0.0"/>
					<UserParam type="float" name="COMET:lnExpect" value="-3.00780485478826"/>
					<UserParam type="float" name="COMET:lnNumSP" value="8.355144739461839"/>
					<UserParam type="float" name="COMET:lnRankSP" value="0.0"/>
					<UserParam type="float" name="COMET:IonFrac" value="0.0625"/>
					<UserParam type="float" name="MS:1001492" value="0.412529"/>
					<UserParam type="float" name="MS:1001491" value="0.0126048"/>
					<UserParam type="float" name="MS:1001493" value="0.245321"/>
				</PeptideHit>
				<UserParam type="int" name="id_merge_index" value="4"/>
				<UserParam type="string" name="FFId_category" value="internal"/>
				<UserParam type="int" name="map_index" value="0"/>
				<UserParam type="string" name="feature_id" value="11275396928539264273"/>
			</PeptideIdentification>
			<UserParam type="string" name="feature_id" value="11275396928539264273"/>
		</consensusElement>

Command used and terminal output

nextflow run nf-core/mhcquant \
-r dev \
-profile cfc \
--input input.tsv \
--outdir ./results \
--fasta *.fasta \
--digest_mass_range 800:2500 \
--activation_method CID \
--prec_charge 2:3 \
--fdr_threshold 0.05 \
--number_mods 3 \
--precursor_mass_tolerance 5 \
--fragment_mass_tolerance 0.02 \
--num_hits 1 \
--peptide_min_length 8 \
--peptide_max_length 12 \
--max_rt_alignment_shift 300 \
--max_time '240.h' \
--email [email protected]

Relevant files

nextflow.log

System information

Nextflow version: 22.04.4
Hardware: HPC
Container engine: Singularity
Version of nf-core/mhcquant: dev

Variable modifications

Specifying multiple variable modifications results in an error since the OpenMS CometAdapter takes space separated lists as input parameter whereas nextflow provides a single string.

Missing bioconda recipes

http://comet-ms.sourceforge.net/release/

Inconsistent annotation fdr_level parameters

Description of feature

It might be nothing major, but I noticed that the three parameters are inconsistent with the dividers used.

peptide_level_fdrs
psm-level-fdrs
protein_level_fdrs

The suggestion would be to change psm-level-fdrs > psm_level_fdrs and test this in the pipeline

Make ion annotation optional

Description of feature

Since the ion annotation module outputs potentially big tsv files (especially *_all_peaks.tsv), it would be better to make this output optional and false by default.

Use centralized configs

https://github.com/nf-core/configs !!!

Process labeled data

Description of feature

To process labeled data, we could let Comet search also for labeled peptides by specifying them as variable modifications using the unimod nomenclature. A large set of modifications is supported by the CometAdapter, however they are still fixed to distinct accessions and not customizable afaik.

Suggestions welcome

Update Pipeline summary in `Readme.md`

Description of feature

The Pipeline summary needs to be updated.

Fixed modifications

Specifying a fixed modification such as 'Carbamidomethyl (C)' has failed since quotes are missing surrounding the parameter:

     CometAdapter  -in ${mzml_file} \\

	                   (...)

	                   -variable_modifications '${params.variable_mods}' \\
	                   -fixed_modifications ${params.fixed_mods} \\

	                   (...)

--> .command.sh: line 17: syntax error near unexpected token `('
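
Both this failure and the one described under "Variable modifications" come down to how modification names reach the shell: each name must be quoted on its own, and several names must be passed as separate arguments. The sketch below shows the idea with Python's shlex.quote. The comma-separated input format and the helper name are assumptions made for illustration, not the pipeline's actual parameter handling.

```python
import shlex


def modification_args(flag, mods):
    """Build a shell-safe argument string for a list-valued option.

    mods is assumed to be a comma-separated string such as
    "Oxidation (M),Acetyl (N-term)"; an empty string yields no argument.
    """
    names = [m.strip() for m in mods.split(",") if m.strip()]
    if not names:
        return ""
    # Quote each name separately so the shell sees one argument per modification.
    return flag + " " + " ".join(shlex.quote(n) for n in names)
```

For example, modification_args("-fixed_modifications", "Carbamidomethyl (C)") yields -fixed_modifications 'Carbamidomethyl (C)', which avoids the syntax error near the unexpected token shown above.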

[CI] Travis build doesn't find Java

ERROR: Cannot find Java or it's a wrong version -- please make sure that Java 8 is installed

NOTE: Nextflow is trying to use the Java VM defined by the following environment variables:

 JAVA_CMD: /usr/local/lib/jvm/openjdk11/bin/java

 JAVA_HOME: /usr/local/lib/jvm/openjdk11

The command "wget -qO- get.nextflow.io | bash" failed and exited with 1 during .

I assume that we need to bump the minimum nextflow version to at least 19.04.0 .

Has been suggested by @ewels on nf-core slack on October 11th.

Warn if alleles which are not supported by MHCFlurry are specified

Hi,

I think that it would be a good idea to warn users if a specific allele or multiple alleles are not supported by MHCFlurry.
I propose that we enable "echo true" for processes: predict_psms and predict peptides. I could then adapt the python scripts to print a warning if an unsupported allele was detected.

Alternative 1: Instead of enabling echo we could redirect stdout, but imo this is overkill here.
Alternative 2: We validate the alleles inline in the nextflow script. I think this unnecessarily clutters the nextflow script.

What do you think @Leon-Bichmann ?
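
The proposed warning could be as small as the sketch below. It takes the set of supported alleles as an argument; how that set is obtained from MHCFlurry is deliberately left out. The function names and the allele strings in the usage note are placeholders.

```python
def split_alleles(requested, supported):
    """Split requested alleles into (usable, unsupported), keeping input order."""
    supported = set(supported)
    usable = [a for a in requested if a in supported]
    unsupported = [a for a in requested if a not in supported]
    return usable, unsupported


def format_warnings(unsupported):
    """One warning line per allele the predictor cannot handle."""
    return [
        f"WARNING: allele {allele} is not supported and will be skipped"
        for allele in unsupported
    ]
```

A script would then print each line from format_warnings(...) to stderr and continue with the usable alleles, for instance split_alleles(["A*02:01", "A*99:99"], {"A*02:01"}) returns (["A*02:01"], ["A*99:99"]).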

Pipeline crashes when setting the --skip_quantification flag

Check Documentation

I have checked the following places for your error:

Description of the bug

Pipeline does not run any process when setting the --skip_quantification flag.

Steps to reproduce

Steps to reproduce the behaviour:

Command line:

nextflow run nf-core/mhcquant -r 2.0.0 \
--input 'samples.tsv' \
--fasta 'uniprot-proteome_UP000005640.fasta' \
--allele_sheet 'alleles_new.tsv'  \
--predict_class_1  \
--skip_quantification \
--max_time '240.h' \
-profile cfc \
-resume \
-c config.conf

See error:

------------------------------------------------------
WARN: There's no process matching config selector: get_software_versions
[-        ] process > NFCORE_MHCQUANT:MHCQUANT:INPUT_CHECK:SAMPLESHEET_CHECK -
[-        ] process > NFCORE_MHCQUANT:MHCQUANT:OPENMS_DECOYDATABASE          -
[-        ] process > NFCORE_MHCQUANT:MHCQUANT:OPENMS_THERMORAWFILEPARSER    -
[-        ] process > NFCORE_MHCQUANT:MHCQUANT:OPENMS_COMETADAPTER           -
[-        ] process > NFCORE_MHCQUANT:MHCQUANT:OPENMS_PEPTIDEINDEXER         -
WARN: Access to undefined parameter `singularity_pull_docker_container` -- Initialise it to a default value eg. `params.singularity_pull_docker_container = some_value`
No such variable: Exception evaluating property 'idXML' for nextflow.script.ChannelOut, Reason: groovy.lang.MissingPropertyException: No such property: idXML for class: groovyx.gpars.dataflow.DataflowBroadcast

 -- Check script '/home-link/iizle01/.nextflow/assets/nf-core/mhcquant/./workflows/mhcquant.nf' at line: 287 or see '.nextflow.log' file for more details

Log files

Have you provided the following extra information/files:

The command used to run the pipeline
The .nextflow.log file

System

Hardware: HPC
Executor: slurm
OS: CentOS Linux
Version: CentOS Linux release 7.9.2009

Nextflow Installation

Version: 21.04.3

Container engine

Engine: singularity version 3.7.4-1.el7

Problem with Mhcquant

Hi, I'm trying to use mhcquant with the docker profile, but when I try to start the pipeline this error message appears:
WARN: unknown format for entry /Users/presta/null in provided sample sheet. ignoring line.
Could someone help me?
Thank you

Missing output documentation

e.g. there is an output_docs file opened, but no documentation is created in general.
Also make sure to use a channel for the output_docs feature if possible to allow staging of the documentation template to different storage providers by nextflow.

MapAlignerIdentification step error.

-[nf-core/mhcquant] Pipeline completed with errors-
Error executing process > 'align_ids'

Caused by:
  Process `align_ids` terminated with an error exit status (8)

Command executed:

  MapAlignerIdentification -in train_sample_66_ms_run_4.mzML_idx_fdr_filtered.idXML \
                           -trafo_out train_sample_66_ms_run_4.mzML_idx_fdr_filtered.trafoXML \
                           -model:type linear \
                           -algorithm:max_rt_shift 300

Command exit status:
  8

Command output:
  Progress of 'loading input files':

  -- done [took 0.00 s (CPU), 0.01 s (Wall)] --
  Warning: Value of parameter 'min_run_occur' (here: 2) is higher than the number of runs incl. reference (here: 1). Using 1 instead.
  Progress of 'aligning maps':
  Error: Unexpected internal error (No reference RT information left after filtering)

Command wrapper:
  Progress of 'loading input files':

  -- done [took 0.00 s (CPU), 0.01 s (Wall)] --
  Warning: Value of parameter 'min_run_occur' (here: 2) is higher than the number of runs incl. reference (here: 1). Using 1 instead.
  Progress of 'aligning maps':
  Error: Unexpected internal error (No reference RT information left after filtering)

Work dir:
  /projectsp/f_jx76_1/xiaolong/temp/MSV000082648/20200326comet/train_sample_66_ms_run_4.mod1/work/5e/dce3d87a55eac31852aad2c4ad3398

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

RT alignment mixup

If the preceding database search processes do not finish in the same order as they were started, the order of the ID files can be switched, so that the wrong retention time transformation is applied to a file.
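
The usual remedy is to pair files by a shared key instead of by position. The sketch below shows the principle in Python, assuming that an ID file and its transformation share a base name and differ only in the .idXML and .trafoXML suffixes; in a Nextflow script the same effect would come from joining channels on a sample key. All names are hypothetical.

```python
import os


def run_key(path, suffix):
    """Base name of a file with the given suffix removed."""
    name = os.path.basename(path)
    if not name.endswith(suffix):
        raise ValueError(f"{path} does not end with {suffix}")
    return name[: -len(suffix)]


def pair_by_run(id_files, trafo_files):
    """Pair every idXML file with the trafoXML of the same run, whatever the order."""
    trafos = {run_key(t, ".trafoXML"): t for t in trafo_files}
    pairs = []
    for id_file in id_files:
        key = run_key(id_file, ".idXML")
        if key not in trafos:
            raise KeyError(f"no transformation found for run {key}")
        pairs.append((id_file, trafos[key]))
    return pairs
```

Because the lookup is by key, the completion order of upstream processes no longer affects which transformation a file receives.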

Error in db_search_comet: Profile data provided but centroided MS2 spectra expected

Hi,
I am trying to run MHCquant but I am getting the following error:

Caused by: Process db_search_comet (1) terminated with an error exit status (8)
Progress of 'loading chromatogram list':
-- done [took 0.01 s (CPU), 0.00 s (Wall)] --
Error: Unexpected internal error (Error: Profile data provided but centroided MS2 spectra expected. To enforce processing of the data set the -force flag.)
Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap

The full error log can be found here: error_log.txt

Any idea?
Thank you very much!

vcf support

Next Release should integrate vcf support for mutated neoantigen search based on the (Fred2 bioconda package).

problem --fixed_modification

Description of the bug

Something went wrong with the conversion, so the --fixed_modification parameter is set to Carbamidomethyl (C) by default.

Command used and terminal output

Relevant files

System information

Add in Citation to main readme

https://pubs.acs.org/doi/abs/10.1021/acs.jproteome.9b00313

Add Zenodo DOI for release to main README on master

Would be good to add the Zenodo DOI for the release to the main README of the pipeline in order to make it citable. You will have to do this via a branch pushed to the repo in order to directly update master. See PR below for example and file changes:
nf-core/atacseq#38

See https://zenodo.org/record/3359618#.XVZ0bOhKhPY

Web-hooks are already set-up for this repo to have a unique Zenodo DOI generated everytime a new version of the pipeline is released. Would be good to add this in after every release 👍

Experimental design for batches of replicates

In the long run, it would be nice to have an experimental design file that specifies which samples are replicates, so that the whole batch can be processed at once.

Profile test and test_full cannot access ref database

Description of the bug

When running the test or test_full profile mhcquant terminates with

Error: File not found (the file 'test.fasta' could not be found)

The accession of the reference database in the test data repository is erroneous.

Furthermore, we need to increase the number of tests, since the current ones do not cover all of the features implemented (e.g. ion annotation).

Command used and terminal output

nextflow run nf-core/mhcquant -profile test,cfc --outdir test

bioconda dependency conflict

When trying to install openms and percolator in the same conda environment, I get a number of conflicts:

If openms2.3 and percolator3.1 is selected:
libxerces-c-3.1 will be installed but openms requires libxerces-c-3.2
the build works but all tools fail

if openms2.4 and percolator3.1 is selected:
the build fails because of a dependency conflict
conflict:
openms=2.4 -> boost[version='>=1.64.0,<1.64.1.0a0']
percolator -> boost==1.62
percolator -> xerces-c==3.1.2

Some discussions or attempts to fix this can be found here:
bioconda/bioconda-recipes#11871
bioconda/bioconda-recipes#12060

Percolator needs fixed seed

Use -S flag for PercolatorAdapter to make results reproducible

Add links to references

Description of feature

In the Readme.md the additional references are lacking links to the papers.

Raise memory requirements of FeatureFinderIdentification step

Issue ion annotator feature

Description of the bug

Normal run of the pipeline leads to the following error
Process NFCORE_MHCQUANT:MHCQUANT:PYOPENMS_IONANNOTATOR input file name collision -- There are multiple input files for each of the following file names

Command used and terminal output

nextflow run nf-core/mhcquant -r dev -profile cfc --input  --outdir ./results --fasta --digest_mass_range 800:2500 --activation_method CID --prec_charge 2:3 --fdr_threshold 0.05 --number_mods 3 --precursor_mass_tolerance 5 --fragment_mass_tolerance 0.02 --num_hits 1 --peptide_min_length 8 --peptide_max_length 12 --max_rt_alignment_shift 300 --max_time '240.h' --email [email protected]

Relevant files

No response

System information

Nextflow version: 22.04.4
Hardware: HPC
Container engine: Singularity
Version of nf-core/mhcquant: dev

Retrieve ID for output of unquantified data

Description of feature

Currently, there is no annotation of the different samples of the data that is generated with the --skip_quantification parameter

Comet: theoretical_fragment_ion parameter missing

Description of feature

The “theoretical_fragment_ion” parameter instructs Comet whether or not to include signal from the flanking bins in the cross-correlation calculation. See Comet paper

They also state that High-Resolution Runs should have theoretical_fragment_ion = 0. However in the nextflow.config this parameter is not specified. This leads to running Comet with theoretical_fragment_ion = 1 since that is the default. See Comet doc.

Missing mzTab files

Description of the bug

.mztab files were missing for me in the Intermediate_Results directory
