
msfragger's Introduction

MSFragger is an ultrafast database search tool for peptide identification in mass spectrometry-based proteomics. It has demonstrated excellent performance across a wide range of datasets and applications. MSFragger is suitable for standard shotgun proteomics analyses as well as large datasets (including timsTOF PASEF data), enzyme unconstrained searches (e.g., peptidome), open database searches (e.g., precursor mass tolerance set to hundreds of Daltons) for identification of modified peptides, and glycopeptide identification (N-linked and O-linked).

MSFragger is implemented in the cross-platform Java programming language and can be used in three ways:

  1. With the FragPipe user interface
  2. As a standalone Java executable
  3. Through Proteome Discoverer

MSFragger writes peptide-spectrum matches in either tabular or pepXML formats, making it fully compatible with downstream data analysis pipelines such as Trans-Proteomic Pipeline, Percolator, and Philosopher. See the complete documentation, including a list of Frequently Asked Questions. Example parameter files can be found here.

Supported file formats

The following spectral file formats can be searched directly with MSFragger. See the FragPipe homepage for compatibility with workflow components downstream from MSFragger.

  • mzML/mzXML - data from any instrument in mzML/mzXML format can be used

  • Thermo RAW - Thermo raw files (.raw) can be read directly; conversion to mzML is not required. On Linux, Mono needs to be installed.

  • Bruker timsTOF PASEF - MSFragger can read Bruker timsTOF PASEF (DDA) raw files (.d) directly, as well as MGF files converted by the Bruker DataAnalysis program. Please note: on Windows, timsTOF data requires the Visual C++ Redistributable for Visual Studio 2017. If you see an error saying that the Bruker native library cannot be found, try installing the Visual C++ Redistributable.

License

The entire MSFragger suite of tools (MSFragger-Core, MSFragger-LOS, MSFragger-Glyco, MSFragger-DIA, MSFragger-Labile), collectively known as "MSFragger", is distributed as a single JAR file. It is freely available for academic research, non-commercial, or educational purposes under an academic license.

Other uses require a commercial license after the initial 60-day evaluation period. The license can be obtained by contacting Drew Bennett ([email protected]) at the University of Michigan Office of Tech Transfer. For questions, please contact Prof. Alexey Nesvizhskii ([email protected]).

Download MSFragger

Whether you use FragPipe, Proteome Discoverer (PD, Thermo Scientific), or the command line, you will need to download the latest MSFragger JAR file. See instructions for downloading or upgrading MSFragger.

Release Notes

Check here for the full list of MSFragger versions and changes.

Running MSFragger

FragPipe

On Windows or Linux, the easiest way to run MSFragger is through FragPipe, which has a variety of built-in workflows for complete data analysis.

Proteome Discoverer node

MSFragger and Philosopher (PeptideProphet) are also available as processing nodes in Proteome Discoverer (PD, Thermo Scientific). Currently, the MSFragger-PD node can be used in PD versions 2.2, 2.3 and 2.4.

Command line

See Launching MSFragger on the Wiki page.
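
As a rough illustration of a command-line launch, the sketch below assembles the `java` invocation in Python. The argument layout (heap flag, JAR, parameter file, spectral files) follows the commands quoted in the issue logs further down this page. The file names and the helper name `build_msfragger_command` are placeholders, and the wiki remains the authoritative reference.

```python
import subprocess


def build_msfragger_command(jar_path, params_path, spectra_paths, heap_gb):
    """Assemble a java command line for a standalone MSFragger run.

    Hypothetical helper: the layout mirrors the invocations quoted in the
    issue logs (java -Xmx<N>G -jar <jar> <params> <spectral files>).
    """
    if heap_gb < 1:
        raise ValueError("heap_gb must be at least 1")
    return ["java", f"-Xmx{heap_gb}G", "-jar", jar_path, params_path, *spectra_paths]


if __name__ == "__main__":
    # Placeholder file names; replace with real paths before running.
    command = build_msfragger_command(
        "MSFragger.jar", "fragger.params", ["sample.mzML"], heap_gb=8
    )
    print(" ".join(command))
    # Uncomment to launch (needs Java and the MSFragger JAR on this machine):
    # subprocess.run(command, check=True)
```

Passing the command as a list rather than one string avoids quoting problems with paths that contain spaces.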

Documentation

For technical documentation on MSFragger (hardware requirements, search parameters, etc.), see the MSFragger wiki page.

Questions and Technical Support

See our Frequently Asked Questions (FAQ) page. Please post all questions and bug reports regarding MSFragger itself on the MSFragger GitHub issue page or, if more appropriate, on the FragPipe or Philosopher page.

Requests for Collaboration

If you would like to propose a new collaboration that can take advantage of MSFragger and related tools, please contact us directly.

Integration

MSFragger is currently integrated or supported by the following software projects:

How to Cite

  • Kong, A. T., Leprevost, F. V., Avtonomov, D. M., Mellacheruvu, D., & Nesvizhskii, A. I. (2017). MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics. Nature Methods, 14(5), 513-520.
  • Yu, F., Teo, G. C., Kong, A. T., Haynes, S. E., Avtonomov, D. M., Geiszler, D. J., & Nesvizhskii, A. I. (2020). Identification of modified peptides using localization-aware open search. Nature Communications, 11(1), 1-9.
  • Polasky, D. A., Yu, F., Teo, G. C., & Nesvizhskii, A. I. (2020). Fast and Comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nature Methods, 17(11), 1125-1132.
  • Yu, F., Haynes, S. E., Teo, G. C., Avtonomov, D. M., Polasky, D. A., & Nesvizhskii, A. I. (2020). Fast Quantitative Analysis of timsTOF PASEF Data with MSFragger and IonQuant. Molecular & Cellular Proteomics, 19(9), 1575-1585.
  • Polasky, D., Geiszler, D., Yu, F., Li, K., Teo, G. C., & Nesvizhskii, A. I. (2023). MSFragger-Labile: A flexible method to improve labile PTM analysis in proteomics. Molecular & Cellular Proteomics, 22(5), 100538.
  • Yu, F., Teo, G. C., Kong, A. T., Fröhlich, K., Li, G. X., Demichev, V., & Nesvizhskii, A. I. (2023). Analysis of DIA proteomics data using MSFragger-DIA and FragPipe computational platform. Nature Communications, 14, 4154.

For other tools developed by the Nesvizhskii lab, see our website www.nesvilab.org

msfragger's People

Contributors

anesvi, dpolasky, fcyu, guoci, sarah-haynes


msfragger's Issues

.pepXML not found using PDnode (PD2.2)

Many thanks to the Nesvizhskii lab for releasing these awesome tools!

I'd be very keen on running MSFragger through our PD2.2 server, and managed to get the nodes up and running. The only problem is when trying a search (using a .mzml input file converted from a Thermo .raw file), the processing workflow errors out as it cannot find the .pepxml file that should be located in the "temp" folder. The only files present in this folder at that point are "fragger.params", the actual FASTA file and the .pepindex for that FASTA file.

Any input as to what I can do to debug this issue would be really appreciated, many thanks.

PD2.2 running on Windows 7, 64bit

Open searches constantly out of memory

Describe the problem

  • I'm submitting a:

    • [x ] problem running the software
    • bug report
    • feature request
    • question
  • My MSFragger use case:

    • Closed search (standard small precursor mass tolerance)
    • [x ] Open search (large precursor mass tolerance)

** put your general problem description here **
The open search constantly runs out of memory.

System info

You can find that printed on the Config tab.

  • OS and version: ** Windows 7, 64-bit **
  • Java version: **java 8 **

Describe your experiment

label free with mod on K with 196.13683, and phosphorylation on STY

General proteomics experiment description

e.g. "TMT, Human, full cell lysate with Trypsin" , "AP-MS pulldowns, mouse,
liver tissue
"
Trypsin digest, but since the K is modified, we are looking for 3 mis-cleavages.
...

Input data files

** e.g. "fractionated HeLa, 5 samples, 3 bio replicates, 2 technical replicates" or "ten 3 hour LC gradients, full cell lysate, fruit fly"
or at least "5 mzml files 1.5Gb each" **
1 file 500 mb

Sequence database

uniref100 human
** your response here, preferably a link or at least name, organism etc. **
** Size of the database, either in proteins or in megabytes **


Attach fragger.params file

num_threads = 6
precursor_mass_tolerance = 500.00
precursor_mass_lower = -200
precursor_mass_upper = 500
precursor_mass_units = 0
precursor_true_tolerance = 20
precursor_true_units = 1
fragment_mass_tolerance = 20
fragment_mass_units = 1
isotope_error = 0
mass_offsets = 0
search_enzyme_name = Trypsin
search_enzyme_cutafter = KR
search_enzyme_butnotafter = P
num_enzyme_termini = 2
allowed_missed_cleavage = 3
clip_nTerm_M = 1

variable_mod_01 = 15.99490 M

variable_mod_02 = 42.01060 [^

variable_mod_03 = 79.96633 STY

variable_mod_04 = -17.02650 nQnC

variable_mod_05 = -18.01060 nE

allow_multiple_variable_mods_on_residue = 1
max_variable_mods_per_mod = 3
max_variable_mods_combinations = 5000
output_file_extension = tsv
output_format = tsv
output_report_topN = 1
output_max_expect = 50.0
precursor_charge = 0 0
override_charge = 0
digest_min_length = 7
digest_max_length = 50
digest_mass_range = 400.0 5000.0
max_fragment_charge = 2
track_zero_topN = 0
zero_bin_accept_expect = 0
zero_bin_mult_expect = 1
add_topN_complementary = 0
minimum_peaks = 15
use_topN_peaks = 100
min_fragments_modelling = 3
min_matched_fragments = 6
minimum_ratio = 0.01
clear_mz_range = 0.0 0.0
add_Cterm_peptide = 0.000000
add_Nterm_peptide = 0.000000
add_Cterm_protein = 0.000000
add_Nterm_protein = 0.000000
add_G_glycine = 0.000000
add_A_alanine = 0.000000
add_S_serine = 0.000000
add_P_proline = 0.000000
add_V_valine = 0.000000
add_T_threonine = 0.000000
add_C_cysteine = 57.021464
add_L_leucine = 0.000000
add_I_isoleucine = 0.000000
add_N_asparagine = 0.000000
add_D_aspartic_acid = 0.000000
add_Q_glutamine = 0.000000
add_K_lysine = 0.000000
add_E_glutamic_acid = 0.000000
add_M_methionine = 0.000000
add_H_histidine = 0.000000
add_F_phenylalanine = 0.000000
add_R_arginine = 0.000000
add_Y_tyrosine = 0.000000
add_W_tryptophan = 0.000000
add_B_user_amino_acid = 0.000000
add_J_user_amino_acid = 0.000000
add_O_user_amino_acid = 0.000000
add_U_user_amino_acid = 0.000000
add_X_user_amino_acid = 0.000000
add_Z_user_amino_acid = 0.000000
database_name = E:\Users\humanUniref100.fasta
variable_mod_06 = 196.13683 K
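
Parameter files like the one above are plain `key = value` lines. A minimal sketch for loading one into a dictionary, for example to check the precursor window before a run, is shown below. The function name is hypothetical, and skipping lines that start with `#` is an assumption rather than something stated in this thread.

```python
def parse_fragger_params(text):
    """Parse 'key = value' lines into a dict of strings."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines; treating a leading '#' as a comment is an assumption.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        params[key.strip()] = value.strip()
    return params


# A few lines copied from the parameter file above.
example = """
precursor_mass_lower = -200
precursor_mass_upper = 500
precursor_mass_units = 0
allowed_missed_cleavage = 3
variable_mod_03 = 79.96633 STY
"""

params = parse_fragger_params(example)
# Width of the precursor window, in the units selected by precursor_mass_units.
window = float(params["precursor_mass_upper"]) - float(params["precursor_mass_lower"])
print(window)
```

Values are kept as strings because some entries, such as the variable modifications, combine a mass with a residue list.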

Run log output

** copy text from the text console in the gui here **


Poor results on HLA-II file, possibly user error?

Hello!
I'm attempting to analyze a DDA HLA-II immunopeptide file using FragPipe (v9.1) with MSFragger (v20190222) and Philosopher (v20190405). It wasn't clear which executable to include for Python (or whether one was needed), so I left that blank.
I was able to make it through the run in record time, but the results weren't quite what I was expecting. Specifically, over half of the reported identifications in psm.tsv, peptide.tsv, and protein.tsv were reverse/decoy sequences with high scores. When I analyzed the same file with MetaMorpheus, I got ~6000 PSMs, ~4000 peptides, and ~700 proteins at a 1% FDR, so I don't think it's an issue with the MS file. Am I interpreting the output correctly, or is there anything wrong with my fragger.params? I had simply clicked the "Non-specific Search" option and then changed the precursor mass tolerances to +-5 ppm.
I've uploaded the relevant files here.
Thank you for your help!

FragPipe-equivalent workflow on Linux for Bruker .d files

Hi,

I would like to use MSFragger to analyse Bruker timsTOF data. It worked very well locally on Windows using FragPipe, but the real goal is to use it on our cluster on CentOS 7.7.

I tried adapting the provided script but so far I am getting:

Checking /cl_tmp/standards/FragTest1/HeLa_QC_120min_Slot1-1_1_261.d...
Failed in checking /cl_tmp/standards/FragTest1/HeLa_QC_120min_Slot1-1_1_261.d
Bruker native libraries not found

What should I add?

Also, I would definitely need to include IMQuant in the pipeline, but I'm not sure how to do it.

I'd be very grateful for some tips.

Cheers,

Natalia

No sequences with localized modification in report when using new released MSFragger

Hi, I am using MSFragger-20190530 and the newly released Philosopher to do an open-window search on my data. The header of the output psm.tsv file is as follows:
[image]
I didn't find the column of sequences with localized modifications, which is mentioned as a new feature of the new MSFragger version:
[image]
Is there anything I did wrong? Thank you so much.

PeptideProphet + msfragger opensearch issue

Hello dev team,
I am currently using msfragger v2.2 (non-specific digestion + open search) in combination with philosopher v2. I noticed that the Mixture model quality test is failing for most of my searches for all charges (1+ to 7+) when the msfragger option 'localize_delta_mass' is on.

I attached my msfragger config file below and I would like to mention again that it's not an experiment specific issue.

Fragger_Params.txt

Are you aware of any explanation for this behaviour?


software versions:

  • msfragger v2.2
  • philosopher v2
  • MSConvert (peak picking on) proteowizard version = 3.0.19304.503cb4044

Best.

combine open-search with non-specific search??

Describe the problem

three questions:
Question 1:
I used HeLa standards as a test file to compare closed and open searches, but I could not see a difference. No additional PTMs were found in the open search.
Question 2:
For open search, does it matter which variable modifications have been specified?
Question 3:
We have some peptidome samples from wheat, and I would like to run an OPEN SEARCH along with NON-SPECIFIC DIGESTION. The databases are either 76 MB or 700 MB. Is that possible?

  • I'm submitting a:

    • problem running the software
    • bug report
    • feature request
    • [* ] question
  • My MSFragger use case:

    • Closed search (standard small precursor mass tolerance)
    • [ *] Open search (large precursor mass tolerance)

** put your general problem description here **

System info

You can find that printed on the Config tab.

  • OS and version: ** Windows 7 **
  • Java version: ** Version 8 Update 191 **

Describe your experiment

General proteomics experiment description

e.g. "peptidome" , "from wheat"
"Hela standards" , "commercial available"

...

Input data files

** e.g.
" wheat peptidome, 1 sample, 3 fractions"
"HeLa, 1 samples, 3 replicates, 2_" or

Sequence database

** wheat database **
** 76 MB **


Attach fragger.params file

** You can find it in the output directory you specified for analysis. **

Run log output

** copy text from the text console in the gui here **

Update pep.xml schema

MSFragger is following the pep.XML schema 1.18 from 2015. The latest version, 1.22, is from 2017 and contains differences that can cause problems during parsing. Some of the corrections we see PeptideProphet applying at the beginning of the analysis are not necessary when using the newest schema.

The pep.xml schemas can be found here.

Implement mzID and mzTab outputs

Implementing the PSI standard formats mzID and mzTab will benefit the downstream processing, the consolidation of all results into a single file and third-party tools.

openSearch issue: MSFragger mass calibration + peptideProphet

Hi MSFragger team,

I am running an MSFragger open search using MSFragger-2.2 and Philosopher build 20190319. I am getting this message in the fragger run: "Not enough data to perform mass calibration, using the uncalibrated data". Subsequently, the mixture model quality test fails for all charges in the PeptideProphet step. Ultimately, ProteinProphet does not find any PeptideProphet results and fails completely. Would you know what is causing this problem?
My proteomics data were generated using LTQ Orbitrap XL or LTQ FT Ultra mass spectrometer.
I alternatively tried open search using data generated from Q Exactive and this problem was not there. I am attaching my fragger.params and log file.

fragger_params.txt

log-fragpipe-run-at_2019-11-28_13-40-32.log

Thanks in advance
Adithi

Container for MSFragger

Dear @anesvi, we are starting to build pipelines here based on MSFragger. All pipelines run in the cloud. We would be interested in creating a container in BioContainers for the tool. The license of the tool would be attached to the container. Do you think that is possible?

OutofMemoryError

Describe the problem

  • I'm submitting a:

    • [X ] problem running the software
  • My MSFragger use case:

    • Closed search (standard small precursor mass tolerance)
    • Open search (large precursor mass tolerance)

** put your general problem description here **
OutofMemoryError

System info

You can find that printed on the Config tab.

  • OS and version: Windows 10 enterprise
  • java version "1.8.0_202"
    Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
    Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)

Describe your experiment

General proteomics experiment description

Open search metaproteomic analysis of Wastewater treatment plant

...

Input data files

6 mzML file of 2Gb each. Converted from raw files using msconvert.

Sequence database

Metagenomic database of 5Gb size. I know it's too big but I want to double check


Attach fragger.params file and log output

fragger.txt
log_2019-02-12_04-11-26.txt

Issue:

Hello, I have an issue at the very beginning of the analysis, apparently linked to a lack of memory.
That is definitely possible, since I'm using a massive 2.5 GB database (5 GB with the decoys). On the other hand, I've given it 100 GB of RAM on 50 cores. I could give more, and the software is obviously not using it all. So could it be something different? I've copied part of the log and attached the file.

Thanks a lot for your help!
Ben

`Unknown parmameters:
fragpipe_ram = 150
mass_offsets = 0
shifted_ions = 0
shifted_ions_exclude_ranges = (-1.5,3.5)

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.OutOfMemoryError
at java.lang.AbstractStringBuilder.hugeCapacity(Unknown Source)
at java.lang.AbstractStringBuilder.newCapacity(Unknown Source)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuffer.append(Unknown Source)
at b.a(Unknown Source)
at b.<init>(Unknown Source)
at w.a(Unknown Source)
at e.a(Unknown Source)
at MSFragger.main(Unknown Source)
... 5 more

Process finished, exit value: 1
Previous process returned exit code [1], cancelling further processing.`
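
The trace above fails in `AbstractStringBuilder.hugeCapacity`, which in Java 8 raises `OutOfMemoryError` when a single buffer would need more than `Integer.MAX_VALUE` characters, independent of the `-Xmx` setting. Whether that is what happened in this run is an assumption. The sketch below only shows the arithmetic, using the database sizes mentioned in the post and assuming roughly one character per byte.

```python
JAVA_INT_MAX = 2**31 - 1  # Integer.MAX_VALUE, the upper bound on a Java array length


def fits_in_one_java_buffer(n_chars):
    """Return True if n_chars could fit in a single Java StringBuffer."""
    return n_chars <= JAVA_INT_MAX


# Sizes quoted in the post, read as decimal gigabytes and ~1 char per byte (assumption).
for label, size_bytes in [
    ("2.5 GB database", 2_500_000_000),
    ("5 GB with decoys", 5_000_000_000),
]:
    print(label, fits_in_one_java_buffer(size_bytes))
```

If this reading is right, adding more RAM would not help, because the limit is on the length of one buffer rather than on the heap.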

Updater website link

We should probably put the updater website download link in the readme here. Any objections?

Which score to use to filter out 'low quality' spectra

Hi there,

I was wondering on which score (hyperscore or nextscore) one should base the filtering to remove low-quality spectra.
Do you also have any advice on the minimum value above which a spectrum can be considered of decent quality?
I know this might be very subjective... just to have an idea from the creators :)
I have seen in literature people using 20-30 for Mascot, 40-60 for Andromeda, 3-5 for Sequest.

Thanks

Error: "Could not create the Java Virtual Machine"

Hi all,

I'm trying to test MSFragger on a single mzML file (1.5GB) and a reference human database from Uniprot (55 MB, including decoys) but MSFragger keeps failing giving the following error:
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Invalid maximum heap size: -Xmx10G
The specified size exceeds the maximum representable size.
Process 'MsFragger' finished, exit code: 1
Process returned non-zero exit code, stopping MsFragger

I'm using Java 1.8.0_201_b09 (X64). The PC has 512Gb RAM and 23 logical cores. I set MSFragger to perform a closed search with 10Gb RAM and 10 cores. I also set Java runtime parameters as -Xmx100G (so everything should be in large excess compared to the MSFragger settings).

Here is our configuration:
System info:
System OS: Windows 10, Architecture: AMD64
Java Info: 1.8.0_201, Java HotSpot(TM) Client VM, Oracle Corporation
Below I'm copying the log.

Any suggestion to solve this issue?

Thank you,

Paolo Cifani

System info:
System OS: Windows 10, Architecture: AMD64
Java Info: 1.8.0_201, Java HotSpot(TM) Client VM, Oracle Corporation

Version info:
FragPipe version 9.1
MSFragger version 20190222
Philosopher version 20190301 (build 201903011453)

LCMS files:
Experiment/Group:

  • C:\Users\KentsisLab\DATA\MSfragger\190220_PC_microR_1E6_a.mzML

9 commands to execute:
Workspace [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe workspace --clean
Workspace [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe workspace --init
MsFragger [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
java -jar -Dfile.encoding=UTF-8 -Xmx20G C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\MSFragger-20190222.jar C:\Users\KentsisLab\DATA\MSfragger\Results\fragger.params C:\Users\KentsisLab\DATA\MSfragger\190220_PC_microR_1E6_a.mzML
MsFragger
java -cp C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\FragPipe.exe umich.msfragger.util.FileMove C:\Users\KentsisLab\DATA\MSfragger\190220_PC_microR_1E6_a.pepXML C:\Users\KentsisLab\DATA\MSfragger\Results\190220_PC_microR_1E6_a.pepXML
ReportDbAnnotate [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe database --annotate C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\171023_UPR_homo_cRAP_withdecoy.fasta --prefix #DECOY#
PeptideProphet [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe peptideprophet --decoyprobs --ppm --accmass --nonparam --expectscore --decoy #DECOY# --database C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\171023_UPR_homo_cRAP_withdecoy.fasta 190220_PC_microR_1E6_a.pepXML
ProteinProphet [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe proteinprophet C:\Users\KentsisLab\DATA\MSfragger\Results\interact-190220_PC_microR_1E6_a.pep.xml
ReportFilter [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe filter --sequential --prot 0.01 --tag #DECOY# --pepxml C:\Users\KentsisLab\DATA\MSfragger\Results --protxml C:\Users\KentsisLab\DATA\MSfragger\Results\interact.prot.xml
ReportReport [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe report



Workspace [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe workspace --clean
INFO[14:45:15] Executing Workspace 20190301                 
INFO[14:45:16] Removing workspace                           
INFO[14:45:16] Done                                         
Process 'Workspace' finished, exit code: 0

Workspace [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\philosopher_windows_amd64.exe workspace --init
INFO[14:45:16] Executing Workspace 20190301                 
INFO[14:45:19] Creating workspace                           
INFO[14:45:19] Done                                         
Process 'Workspace' finished, exit code: 0

MsFragger [Work dir: C:\Users\KentsisLab\DATA\MSfragger\Results]
java -jar -Dfile.encoding=UTF-8 -Xmx20G C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\MSFragger-20190222.jar C:\Users\KentsisLab\DATA\MSfragger\Results\fragger.params C:\Users\KentsisLab\DATA\MSfragger\190220_PC_microR_1E6_a.mzML
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Invalid maximum heap size: -Xmx20G
The specified size exceeds the maximum representable size.
Process 'MsFragger' finished, exit code: 1

Process returned non-zero exit code, stopping
MsFragger
java -cp C:\Users\KentsisLab\DATA\MSfragger\MSFragger-GUI_v3.0\FragPipe.exe umich.msfragger.util.FileMove C:\Users\KentsisLab\DATA\MSfragger\190220_PC_microR_1E6_a.pepXML C:\Users\KentsisLab\DATA\MSfragger\Results\190220_PC_microR_1E6_a.pepXML
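
The log above reports `Java HotSpot(TM) Client VM`, whereas logs elsewhere on this page where the JVM started successfully report `64-Bit Server VM`. "Invalid maximum heap size" for a value such as `-Xmx20G` is the message a 32-bit JVM typically gives, since it cannot address a heap that large. A small sketch for checking this from the Java info string follows. The function name is hypothetical and the substring test is a heuristic, not an official API.

```python
def looks_like_64bit_jvm(java_info):
    """Heuristic: HotSpot 64-bit builds include '64-Bit' in their VM name."""
    return "64-bit" in java_info.lower()


# Strings copied from the logs on this page.
failing = "1.8.0_201, Java HotSpot(TM) Client VM, Oracle Corporation"
working = "1.8.0_231, Java HotSpot(TM) 64-Bit Server VM, Oracle Corporation"

print(looks_like_64bit_jvm(failing))  # False
print(looks_like_64bit_jvm(working))  # True
```

Note that the operating system architecture (AMD64 in the log) describes the machine, not the installed Java runtime, so a 32-bit JVM can still be the one picked up on a 64-bit system.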

Bruker native libraries not found

Hi there!

I'm trying to run MSFragger on a TIMS-TOF dataset. However, MSFragger cannot seem to find the Bruker native libraries. I have ProteoWizard with the vendor libraries, DataAnalysis 5.0 and pretty much the entire Bruker software tree in the machine. I updated MSFragger to the latest version (2.2), and also have the latest timsdata.dll with the other Bruker DLLs (in C:\Program Files\Common Files\Bruker Daltonik\DLLs), as well as in the local folder. Which "native libraries" are needed, and where does MSFragger expect them?

Many thanks in advance!

Magnus

The output from the command line:

H:<folder with .d directory and MSFragger>java -jar MSFragger-2.2.jar closed_fragger.params <.d directory>
MSFragger version MSFragger-2.2
Batmass-IO version 1.17.1
(c) University of Michigan
RawFileReader reading tool. Copyright (c) 2016 by Thermo Fisher Scientific, Inc. All rights reserved.
System OS: Windows 10, Architecture: AMD64
Java Info: 1.8.0_231, Java HotSpot(TM) 64-Bit Server VM, Oracle Corporation
JVM started with 14 GB memory

Checking <.d data directory>
Failed in checking <.d directory path>
Bruker native libraries not found

Map modified peptides to leading accession

Hi,

MSFragger is particularly effective at identifying modified peptides (either chemically or post-translationally modified). However, it would be really convenient to map peptides to protein IDs (from the report.tsv file).

As an example (which is just an illustration, not necessarily right or true), let's say this peptide is of particular interest because it is phosphorylated on Tyr-6. It would be pretty handy to map this peptide to its protein accession in the peptide/PSM tables. My suggestion would be to add a specific column with the protein accession next to the peptide sequence (and, if file size is not a concern, columns with the gene and protein description).

Spectrum	Peptide	Modified Peptide	Variable Modifications	Charge	Retention	Calculated M/Z	Observed M/Z	Original Delta Mass	Adjusted Delta Mass	Experimental Mass	Peptide Mass	PeptideProphet Probability	Expectation
file	GVTIPYRPKPSSSPVIFAGGQDR	GVTIPY[243]RPKPSSSPVIFAGGQDR	Y6:79.9663	4	0	628.0704	608.5795	-77.9634	-77.962	2430.2888	2508.2522	0.9699	0.6134

Changed to :

Spectrum	Peptide	Modified Peptide	Variable Modifications	Protein accession	Gene	Protein description	Charge	Retention	Calculated M/Z	Observed M/Z	Original Delta Mass	Adjusted Delta Mass	Experimental Mass	Peptide Mass	PeptideProphet Probability	Expectation
file	GVTIPYRPKPSSSPVIFAGGQDR	GVTIPY[243]RPKPSSSPVIFAGGQDR	Y6:79.9663	Q15366-2	PCBP2	Isoform of Q15366, Isoform 2 of Poly(rC)-binding protein 2 	4	0	628.0704	608.5795	-77.9634	-77.962	2430.2888	2508.2522	0.9699	0.6134
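
The mass columns in the example row are related by simple arithmetic, which is handy when sanity-checking a report. The sketch below assumes the usual convention m/z = (M + z * m_proton) / z, with neutral masses in the "Peptide Mass" and "Experimental Mass" columns. That reading of the columns is an assumption, not something stated in the issue, and the helper name is hypothetical.

```python
PROTON_MASS = 1.007276  # Da, rounded


def mz_from_neutral_mass(neutral_mass, charge):
    """Convert a neutral monoisotopic mass to m/z for a given positive charge."""
    if charge <= 0:
        raise ValueError("charge must be positive")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Values from the example row.
peptide_mass, experimental_mass, charge = 2508.2522, 2430.2888, 4

# Compare with the "Calculated M/Z" column (628.0704).
print(round(mz_from_neutral_mass(peptide_mass, charge), 3))
# Compare with the "Observed M/Z" column (608.5795).
print(round(mz_from_neutral_mass(experimental_mass, charge), 3))
# Compare with the "Original Delta Mass" column (-77.9634).
print(round(experimental_mass - peptide_mass, 4))
```

Small differences in the last digit can come from rounding in the report or from a slightly different proton mass constant.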

This would be a great enhancement when hunting for PTMs, which MSFragger handles exceptionally well.

Best regards,
Vivian

PD : an item with the same key has already been added

Hello,
I'm trying out MSFragger in Proteome Discoverer 2.3. A job was submitted but it quickly throws an exception:
"An item with the same key has already been added".
There's another error that says "unable to access jarfile".

I thought that perhaps there was an issue because a previous PD-Sequest analysis had used the same name for the pdresult and consensus outputs, so I changed those output names. The "an item with the same key..." error is still there, and there are more errors:
"The specified size exceeds the maximum representable size"
"Invalid maximum heap size: -Xmx23G"
That was the parameter that was determined if RAM was set to -1. I tried setting this manually to 16, and 8 GB, and the error then changes to be -Xmx14G and -Xmx6G.

I've tried to attach the two Magellan server logs and the processing workflow file.

I realized that if I double click the jar file it doesn't run anything. I updated the Java runtime to the version 8.231. Any ideas?

thanks
Philip

MagellanServer.log

MSFragger_Percolator_1Da_Precursor_0.5DaFragment.txt

MagellanServer.log

MsFragger produces no output

I run MsFragger and it runs fine but it produces no output.

PS E:\msfragger> java -jar -Xmx28G E:\msfragger\MSFragger-20190222.jar E:\msfragger\fragger.params E:\msgf\36979_2_1_293T_IMAC_HCD_F1.mzML

fragger_params.txt

MSFragger version MSFragger-20190222
MSFTBX version 1.8.6
(c) University of Michigan

System OS: Windows 10, Architecture: AMD64
Java Info: 1.8.0_211, Java HotSpot(TM) 64-Bit Server VM, Oracle Corporation
JVM started with 25486MB memory

Unknown parmameters:
fragpipe_ram = 28
Peptide index read in 7194ms
Selected fragment tolerance 0.00 Da and maximum fragment slice size of 19170.76MB
4916071084 fragments to be searched in 2 slices (36.63GB total)
Operating on slice 1 of 2: 130772ms
36979_2_1_293T_IMAC_HCD_F1.mzML 10384ms [progress: 48407/48407 (100.00%) - 33922.21 spectra/s] - completed 1552ms
Operating on slice 2 of 2: 103604ms
36979_2_1_293T_IMAC_HCD_F1.mzML 6731ms [progress: 48407/48407 (100.00%) - 130126.34 spectra/s] - completed 442ms
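
The slice count in the log is consistent with dividing the total fragment index size by the maximum slice size and rounding up. That rule is inferred from the numbers in this one log and is an assumption about MSFragger's behaviour. The sketch just reproduces the arithmetic, reading 1 GB as 1024 MB.

```python
import math


def estimate_slices(total_index_gb, max_slice_mb):
    """Estimate the number of slices as ceil(total size / max slice size)."""
    total_mb = total_index_gb * 1024  # assumes 1 GB = 1024 MB
    return math.ceil(total_mb / max_slice_mb)


# Numbers from the log: 36.63GB total, maximum slice size 19170.76MB.
print(estimate_slices(36.63, 19170.76))  # 2, matching "2 slices" in the log
```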

Enzymes that cut before a specific amino acid are not allowed

Describe the problem

  • I'm submitting a:

    • problem running the software
    • bug report
    • feature request
    • question
  • My MSFragger use case:

    • Closed search (standard small precursor mass tolerance)
    • Open search (large precursor mass tolerance)

** Enzymes that cut before a specific amino acid are not allowed **
Hello,
In my experiment I used LysargiNase to digest my proteins. It is an enzyme that cuts before K or R, but I have not found any option in MSFragger for enzymes that cut before a specific amino acid.
I think that adding this feature would be very useful for many researchers.

Regards,

Edu
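
To make the request concrete, here is a minimal in-silico digestion sketch that supports cutting either after or before the listed residues. It is not MSFragger code. The function name and the `cut_before` flag are hypothetical, and missed cleavages and exceptions such as "but not after P" are left out for brevity.

```python
def digest(sequence, residues, cut_before=False):
    """Split a protein sequence at the given residues.

    cut_before=False cleaves C-terminal to each residue (trypsin-like);
    cut_before=True cleaves N-terminal to each residue (LysargiNase-like).
    """
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa not in residues:
            continue
        cut = i if cut_before else i + 1
        # Ignore cut positions at the very start or end of the sequence.
        if start < cut < len(sequence):
            peptides.append(sequence[start:cut])
            start = cut
    peptides.append(sequence[start:])
    return peptides


print(digest("AAKGGRCC", "KR"))                   # ['AAK', 'GGR', 'CC']
print(digest("AAKGGRCC", "KR", cut_before=True))  # ['AA', 'KGG', 'RCC']
```

With `cut_before=True` the K or R ends up at the N-terminus of each peptide, which is the behaviour described for LysargiNase in the post.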

Non specific digestion - avoiding Out Of Memory errors (not enough RAM)

Hi,
I'm analyzing some non-digested peptide data, but it always reports the following issue no matter how much RAM I assign for the search:

Will execute 13 commands:
java -jar C:\Users\Weixian Deng\Downloads\MSFragger-20171106\MSFragger-20171106\MSFragger-20171106.jar C:\Research\MHC\fragger.params C:\Users\Weixian Deng\Downloads\2018-02-05-140min-200nl-wd-MHC-B.mzML 

java -cp C:\Users\Weixian Deng\Downloads\MSFragger-GUI_v4.3\MSFragger-GUI.jar umich.msfragger.util.FileMove C:\Users\Weixian Deng\Downloads\2018-02-05-140min-200nl-wd-MHC-B.pepXML C:\Research\MHC\2018-02-05-140min-200nl-wd-MHC-B.pepXML 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe workspace --init 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe peptideprophet --decoy rev --decoyprobs --ppm --accmass --nonparam --expectscore --database C:\Users\Weixian Deng\Downloads\2018-02-06-td-UP000005640.fas C:\Research\MHC\2018-02-05-140min-200nl-wd-MHC-B.pepXML 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe workspace --clean 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe workspace --init 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe proteinprophet --output interact interact-2018-02-05-140min-200nl-wd-MHC-B.pep.xml 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe workspace --clean 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe workspace --init 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe database --annotate C:\Users\Weixian Deng\Downloads\2018-02-06-td-UP000005640.fas 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe filter --sequential --pepxml C:\Research\MHC --protxml C:\Research\MHC\interact.prot.xml 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe report 

C:\Users\Weixian Deng\Downloads\philosopher_windows_amd64.exe workspace --clean 

~~~~~~~~~~~~~~~~~~~~~~


Executing command:
$> java -jar C:\Users\Weixian Deng\Downloads\MSFragger-20171106\MSFragger-20171106\MSFragger-20171106.jar C:\Research\MHC\fragger.params C:\Users\Weixian Deng\Downloads\2018-02-05-140min-200nl-wd-MHC-B.mzML 
Process started
MSFragger version MSFragger-20171106
(c) University of Michigan


Sequence database filtered and tagged in 65ms

Exception in thread "main" java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
	at java.util.concurrent.FutureTask.report(Unknown Source)
	at java.util.concurrent.FutureTask.get(Unknown Source)
	at q.a(Unknown Source)
	at w.a(Unknown Source)
	at e.a(Unknown Source)
	at MSFragger.main(Unknown Source)
	... 5 more
Caused by: java.lang.OutOfMemoryError: Java heap space
	at B.a(Unknown Source)
	at r.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)

Digestion completed in 979ms

Process finished, exit value: 1

...

Can you help me with this issue?
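As a rough illustration of why nonspecific digestion is so memory-hungry, the sketch below counts candidate peptides for a toy protein under nonspecific and fully tryptic digestion. The length limits (7 to 50 residues), the two missed cleavages, and the toy sequence are assumptions chosen for the example. The functions are hypothetical and ignore the proline rule, so this is not how MSFragger builds its index.

```python
def count_nonspecific(protein_length, min_len=7, max_len=50):
    """Count every substring whose length lies in [min_len, max_len]."""
    longest = min(max_len, protein_length)
    return sum(protein_length - n + 1 for n in range(min_len, longest + 1))


def count_enzymatic(sequence, cut_after="KR", missed_cleavages=2,
                    min_len=7, max_len=50):
    """Count fully enzymatic peptides (cut after K/R, proline rule ignored)."""
    cuts = [0] + [i + 1 for i in range(len(sequence) - 1)
                  if sequence[i] in cut_after] + [len(sequence)]
    count = 0
    for start in range(len(cuts) - 1):
        last = min(start + 1 + missed_cleavages, len(cuts) - 1)
        for end in range(start + 1, last + 1):
            if min_len <= cuts[end] - cuts[start] <= max_len:
                count += 1
    return count


# Toy 400-residue "protein" with a K or R every ten residues.
toy = "AAAAAAAAAK" * 20 + "GGGGGGGGGR" * 20
print("nonspecific:", count_nonspecific(len(toy)))
print("tryptic:    ", count_enzymatic(toy))
```

The nonspecific count grows with the protein length times the number of allowed peptide lengths, so narrowing the peptide length range or shrinking the database reduces the search space directly.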

Should MsFragger and X!Tandem produce essentially the same identifications for (simple) closed searches?

This may be a "dumb" question, but as I understand it, MsFragger uses the same scoring mechanisms as X!Tandem. However, in table 1 of the original paper (https://www.nature.com/articles/nmeth.4256), for a closed search, X!Tandem showed 638,052 PSMs vs MSFragger's 609,897. Even though the discrepancy is small (~5%), what accounts for this difference? (As I understand it, the main change between MsFragger and X!Tandem is the indexing of fragments, which was a truly novel way of substantively speeding up searches, especially for open searches, but it shouldn't change the final results, though, right?)

Put another way, if we're doing closed searches and "run time or RAM is not an issue", is there any reason why X!Tandem should identify more PSMs than MsFragger (assuming of course that we're using as similar as possible search settings, those search settings are the "simple basics ones" (ie nothing fancy) and we turn off X!Tandem's refinement mode)?

MSFragger conda

I was thinking about adding MSFragger to bioconda.

Any objections from your side?

I could not find any license information in this repo, is there any?

MSFragger issues

  • I'm submitting a:
    • problem running the software
    • bug report
    • feature request
    • question

Is there a separate issue tracker for MSFragger (CLI) or are all issues on-topic here? A related question: will MSFragger's source be released in the future and have its own repository with PRs etc?

Bioconda distribution

Hello,

I am planning to start using MSFragger in my research and I would like to know more about the type of licence associated with the software. Specifically, I am interested to know if it would be possible to distribute it via Bioconda. It would be very useful for me and I would be glad to execute the implementation.

Kind regards.

Could MSFragger support resuming an interrupted search?

Hello,
It would be very convenient and save users time if MSFragger could resume an interrupted search of the same job. At present, even if some files have already been processed, every restart begins from scratch, which wastes a lot of time. Search engines such as MaxQuant, Proteome Discoverer, and pFind can all continue a task when some files have already been searched. Could MSFragger do the same? I believe that would make it more convenient to use.
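Until such a feature exists, one possible workaround is to skip inputs that already have a result file. The sketch below assumes that each spectrum file gets a `.pepXML` result with the same base name in the same folder. The helper `pending_inputs` is hypothetical and not part of MSFragger or FragPipe.

```python
from pathlib import Path


def pending_inputs(spectra_files, output_ext=".pepXML"):
    """Return the spectra files that have no non-empty result file yet."""
    pending = []
    for spectra in map(Path, spectra_files):
        result = spectra.with_suffix(output_ext)
        # Treat missing or zero-byte results as "not searched yet".
        if not (result.exists() and result.stat().st_size > 0):
            pending.append(spectra)
    return pending


# Only the files returned here would be passed to the next search run.
todo = pending_inputs(["run_01.mzML", "run_02.mzML"])
print([str(p) for p in todo])
```

A result file left behind by a crashed run may be non-empty but incomplete, so a size check alone is not a reliable completeness test.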

MS Fragger PD node error

I'm trying to implement MS Fragger in PD 2.3, when I run a method with the node I get this error. I've also had trouble analyzing RAW files with the FragPipe GUI, but everything works great if I convert the files to mzXML, is there a configuration I'm missing somewhere?

Time Processing Node Level Message
11:28 PM Job Execution Info ----- Job execution until failure took: 19.5 s. -----
11:28 PM (1): MSFragger Error Could not process spectra due to following exception: An item with the same key has already been added.
11:28 PM (1): MSFragger Error Error: An item with the same key has already been added.
11:28 PM (1): MSFragger Info MSFragger Running Time --0:0:7
11:28 PM (1): MSFragger Info Picked up _JAVA_OPTIONS: -Xmx8G
11:28 PM (1): MSFragger Info ... 6 more
11:28 PM (1): MSFragger Info at edu.umich.andykong.msfragger.MSFragger.main(Unknown Source)
11:28 PM (1): MSFragger Info at edu.umich.andykong.msfragger.MSFragger.b(Unknown Source)
11:28 PM (1): MSFragger Info at edu.umich.andykong.msfragger.r.(Unknown Source)
11:28 PM (1): MSFragger Info at edu.umich.andykong.msfragger.t.a(Unknown Source)
11:28 PM (1): MSFragger Info at edu.umich.andykong.msfragger.b.a(Unknown Source)
11:28 PM (1): MSFragger Info at umich.ms.fileio.filetypes.thermo.ThermoRawFile.(ThermoRawFile.java:157)
11:28 PM (1): MSFragger Info at umich.ms.fileio.filetypes.thermo.ThermoRawFile.init(ThermoRawFile.java:180)
11:28 PM (1): MSFragger Info at com.dmtavt.batmass.io.thermo.ThermoGrpcServerProcess.(ThermoGrpcServerProcess.java:76)
11:28 PM (1): MSFragger Info Caused by: java.lang.UnsupportedOperationException: Batmass-IO binaries for Thermo support and/or Thermo native libraries not found found
11:28 PM (1): MSFragger Info at com.simontuffs.onejar.Boot.main(Boot.java:166)
11:28 PM (1): MSFragger Info at com.simontuffs.onejar.Boot.run(Boot.java:340)
11:28 PM (1): MSFragger Info at java.lang.reflect.Method.invoke(Unknown Source)
11:28 PM (1): MSFragger Info at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
11:28 PM (1): MSFragger Info at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
11:28 PM (1): MSFragger Info at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
11:28 PM (1): MSFragger Info Exception in thread "main" java.lang.reflect.InvocationTargetException
11:28 PM (1): MSFragger Info 001. Lewis_WTB_20191008154845.raw

MSFragger example files

Hello,
I want to know how to set up the MSGFPlus_Params.txt file when I use MSFragger for an open search; I could not find any example files with open search parameters. Could you give me some suggestions for trying an open search?

Exception when searching

I'm running searches with .mgf files. For one file it went successfully, but for another file I got the following exception,
ecoli-total-2-3_RC3_01_455_AnashkL5.mgf 971ms [progress: 256/56829 (0.45%) - 2560.00 spectra/s]Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NullPointerException
at n.b(Unknown Source)
at MSFragger.main(Unknown Source)
... 5 more
Could anyone help me with this problem? Thanks!

Searching ETD data

Hi, MSFragger works like magic in our research, is it possible to search ETD data using MSFragger as well?

Repo and organization permissions

I think it'd be helpful to:

  1. Have more than 1 account assigned owner permissions for the Nesvilab org - currently there's only 1 (Felipe). I guess Alexey's account should be.
  2. Make @anesvi at least a member of the org, not just a collaborator and set his membership visibility to public. That's also needed to assign him as an owner.
  3. Grant admin rights to members who maintain repositories in the org.

parameters for MSFragger narrow search

Hi,
I tried narrow search with MSFragger.
But the number of identified PSMs at FDR 1% is only 1/4 of the Comet ID results.

Would you please tell me the optimal parameter settings for narrow search?

Thank you very much.
H. Li


Parameters used for narrow search:

precursor_mass_tolerance = 20.00
precursor_mass_units = 1 # 0=Daltons, 1=ppm
precursor_true_tolerance = 0.00
precursor_true_units = 1 # 0=Daltons, 1=ppm
fragment_mass_tolerance = 20.00
fragment_mass_units = 1 # 0=Daltons, 1=ppm

isotope_error = 1 # 0=off, 0/1/2 (standard C13 error)

search_enzyme_name = Trypsin
search_enzyme_cutafter = KR
search_enzyme_butnotafter = P

num_enzyme_termini = 2 # 2 for enzymatic, 1 for semi-enzymatic, 0 for nonspecific digestion
allowed_missed_cleavage = 2 # maximum value is 5

clip_nTerm_M = 0

variable_mod_01 = 15.9949 M
variable_mod_02 = 42.0106 [^

add_C_cysteine = 57.021464 # added to C - avg. 103.1429, mono. 103.00918
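When comparing settings between search engines, it can help to read the parameter file programmatically and convert tolerances to a common unit. The sketch below parses `key = value # comment` lines like those above and converts a ppm tolerance to Daltons at a given mass. The helper names are hypothetical, and the unit codes (0 = Daltons, 1 = ppm) are taken from the comments in the file.

```python
def parse_params(text):
    """Parse 'key = value  # comment' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def tolerance_in_daltons(mass, tolerance, units):
    """units: 0 = Daltons, 1 = ppm (as in the parameter file comments)."""
    return tolerance if units == 0 else mass * tolerance / 1e6


example = """
precursor_mass_tolerance = 20.00
precursor_mass_units = 1 # 0=Daltons, 1=ppm
variable_mod_02 = 42.0106 [^
"""
p = parse_params(example)
tol = float(p["precursor_mass_tolerance"])
units = int(p["precursor_mass_units"])
print(tolerance_in_daltons(1000.0, tol, units))  # 20 ppm at 1000 Da
```

Values are kept as strings because some entries, such as the variable modifications, combine a mass with a site specification.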

Suitability for low resolution fragments

Is MSFragger (in closed search) at all suitable for data acquired with high resolution precursors and low resolution fragments (linear ion trap)?

I am having difficulty getting the PSM rate above 4% in samples where I can achieve 15% elsewhere.

Though it might not be the first-choice tool, it's the only one I can see that does database splitting for parallelisation, and it runs on Linux, so it is suitable for HPC.

Thanks,

Andrew

Everything looks fine but Philosopher reports "No PSM was found in data set."

Hello,

Thank you for your support and help with MSFragger. I have completed several searches, but one problem remains, described below:
“INFO[12:45:11] Executing Filter v1.5.1
INFO[12:45:11] Processing peptide identification files
FATA[12:45:47] No PSM was found in data set.
Process 'ReportFilter' finished, exit code: 1

Process returned non-zero exit code, stopping”
log_2019-10-12_16-05-43.txt

Attached is the log file for this run. The software versions I use are MSFragger-20190628.jar and FragPipe v9.4.

How to get higher precision on expectation scores?

Currently, I'm using the TSV format outputs, and the expectation scores have only 4 digits after the decimal point. To analyze the scores I need higher precision, for example scientific notation. How can I get such precision?

Thanks,
Yisu
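The effect described here can be reproduced with plain number formatting. The sketch below is not MSFragger's output code and the helper names are hypothetical. It only shows how fixed four-decimal formatting collapses small expectation values that scientific notation preserves.

```python
def fixed_decimals(value, digits=4):
    """Format with a fixed number of digits after the decimal point."""
    return f"{value:.{digits}f}"


def scientific(value, digits=3):
    """Format in scientific notation with `digits` significant decimals."""
    return f"{value:.{digits}e}"


for expect in (3.2e-7, 4.5e-5, 0.0123):
    print(fixed_decimals(expect), scientific(expect))
```

Any expectation value below 0.00005 prints as 0.0000 in the fixed format, so good matches become indistinguishable from each other.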

What is the difference between "precursor_mass_lower/upper" vs "precursor_true_tolerance"?

I'm a little confused as to the difference between "precursor_mass_lower/upper" vs "precursor_true_tolerance". Could you please clarify? (I read the documentation.) In particular, this is what I'm trying to do:

I'm trying to search a pseudo-spectra file that was generated from DIA data. (The pseudo spectra were generated using a tool similar in concept to DIA-Umpire.) Because I'm using pseudo spectra (and not "real" DDA spectra), I do NOT know the precursor mass precisely; in fact, let's assume that in the common case I only know that the precursor falls in a 4Da window. In X!Tandem, I was able to successfully search this pseudo-spectra file by setting the precursor mass of each spectrum to the midpoint of the DIA window (e.g., if the DIA window was 600Da to 604Da, I would set the presumed precursor mass to 602Da) and then setting the precursor tolerance to +- 2Da.
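The window arithmetic above can be written out as a small sketch: the midpoint of a 600 to 604 window is 602, with a half-width of 2. The helper names are hypothetical. The second function shows an additional assumption, namely that if the window limits are m/z values rather than masses, the half-width in neutral mass scales with the charge state.

```python
PROTON_MASS = 1.007276  # Da


def window_to_precursor(lower, upper):
    """Return (center, half_width) of an isolation window."""
    return (lower + upper) / 2.0, (upper - lower) / 2.0


def neutral_mass_range(lower_mz, upper_mz, charge):
    """Neutral-mass range covered by an m/z window at a given charge."""
    return ((lower_mz - PROTON_MASS) * charge,
            (upper_mz - PROTON_MASS) * charge)


center, half_width = window_to_precursor(600.0, 604.0)
print(center, half_width)

low, high = neutral_mass_range(600.0, 604.0, charge=2)
print(high - low)  # width in Da for a 2+ precursor
```

For a doubly charged precursor the same 4 m/z window spans 8 Da in neutral mass, so a symmetric tolerance specified in Daltons would need to account for the charge.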

Can something similar be done with MS Fragger, and if so, what should the values be for "precursor_mass_lower/upper" vs "precursor_true_tolerance"? (In X!Tandem, there is only 1 variable that I need to set, the "precursor tolerance", which I set to 2 Da in the above example; but in MSFragger, I'm confused as to which of the above 2 parameters I need to set or if they somehow need to "add up" to 2Da etc.)

All of the questions above are thus far for CLOSED SEARCHES. However, if I then wished to run an open search, would there be any reason why my pseudo spectra (with the very large +- 2Da tolerance for the precursor) would not work well in MSFragger due to the very large precursor imprecision? (I ask because I believe it was designed with a more "normal" precursor tolerance in mind, such as 20ppm etc.) Finally, even if OPEN SEARCHES should work for my above-described pseudo spectra, would "add_topN_complementary" be a BAD idea to enable, since the precursor tolerance is huge (i.e., 2 Da)?

Search for a custom modification on all amino acid residues

Hello, I would like to search for a custom modification (+537.11794) on all possible amino acid residues of a trypsin digested protein, and I am wondering how to do this. I have tried a closed search, adding all 20 amino acid residues to the variable modifications, with mass delta 537.11794. That ran into out of memory issues. I tried an open search, but the upper mass appears to be limited to 500. I am using FragPipe v9.1.
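A back-of-the-envelope count shows why allowing a variable modification on every residue tends to exhaust memory. The sketch below counts the modified forms of a single peptide when one modification may occupy up to `max_mods` of its `n_sites` eligible positions. The function is hypothetical and ignores any pruning a real search engine applies.

```python
from math import comb


def modified_forms(n_sites, max_mods):
    """Ways to place 0..max_mods copies of one variable mod on n_sites."""
    return sum(comb(n_sites, k) for k in range(min(max_mods, n_sites) + 1))


# A 20-residue peptide, up to 3 modifications per peptide:
print(modified_forms(1, 3))   # mod allowed on one residue only
print(modified_forms(20, 3))  # mod allowed on every residue
```

Because every residue becomes a candidate site, the number of forms per peptide grows combinatorially with peptide length, and the fragment index grows with it.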

Single amino acid polymorphisms

Hi,

AFAICT, MSFragger has no built-in way of addressing known single amino acid polymorphisms (cf. X!Tandem). Is such an option on the agenda, and/or is there a good workaround recommendation?

Thanks!

Follow-up to my question from Jun 14th; Recalibration with sliced database (#28)

Dear Fengchao or whoever it may concern,

First of all, thank you again for your fast help with my previous error. The fixed version worked quite well for me. However, on a few specific mzXML files MSFragger always stops with the "out of memory" error after finishing the actual search. That is, it completes the search for all spectra and already reports the time used for all spectra, but then does not finish writing the pep.xml file. Instead, there are files with the extension ".fragtmp" which are about 20 GB in size (about twice as large as the mzXML). The affected files are from one batch, together with other files that do not produce this error.

Could this error be somehow related to the previous one, or do you have any suggestion for what I could try with these files?

Best regards,
Juergen

FDR and results

I have a couple of questions related to the output results:
1- after successful processing, there are two sets of .tsv files (modifications, ions, peptide, psm and report), one in the main output folder and one in a folder called '.meta'. They look identical to me; are they? Is there any reason why they are duplicated?

2- to my understanding, the fragger results are filtered to 1% FDR (at protein, peptide/psm levels) via protein/peptideprophet. Can one adjust the FDR threshold in the GUI?
Or one would need to filter for protein/peptide probability >0.9 to get to 1% FDR?

3- I believe the decoy hits are removed from the results, right?

4- Under the 'Modified Peptide' column, what are the numbers in square brackets (i.e. M[147]PEDESTPEKR, or n[43]SRPLSDQDK). I thought they would correspond to the modification accession # in UNIMOD, but that does not match...

Thanks!
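On question 4, the bracketed numbers look like the TPP-style notation in which the value is the nominal mass of the modified residue (residue mass plus modification mass), not a UNIMOD accession. The sketch below checks this against the two examples. The monoisotopic masses are standard values, the helper name is hypothetical, and treating the "n" terminus as a single hydrogen is an assumption of that convention.

```python
# Monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {
    "M": 131.04049,
    "S": 87.03203,
    "C": 103.00919,
}
HYDROGEN = 1.007825  # mass carried by the unmodified peptide N-terminus


def bracket_value(site, delta_mass):
    """Nominal mass of a modified residue, or of the N-terminus for 'n'."""
    base = HYDROGEN if site == "n" else RESIDUE_MASS[site]
    return round(base + delta_mass)


print(bracket_value("M", 15.9949))  # oxidized methionine
print(bracket_value("n", 42.0106))  # N-terminal acetylation
```

Under this reading, M[147] is methionine (131.04) plus oxidation (15.99), and n[43] is the N-terminal hydrogen (1.01) plus acetylation (42.01).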

Open Search with a custom modification

Describe the problem

Setting up an open search with a custom modification in our samples labeled with our probe of 196.2 on Lysine (variable).

  • I'm submitting a:

    • problem running the software
    • bug report
    • feature request
    • question
  • My MSFragger use case:

    • Closed search (standard small precursor mass tolerance)
    • Open search (large precursor mass tolerance)

We'd like to search for any PTM, such as methox, acetylation, or phosphorylation variable modifications, that occurs when the peptide is labeled by our probe. How should I set up the search parameters?
Here are some settings we tried:

precursor_mass_tolerance = 200.00
precursor_mass_units = 0 # 0=Daltons, 1=ppm
...
variable_mod_01 = 15.9949 M
variable_mod_02 = 42.0106 [^
variable_mod_03 = 79.96633 STY
variable_mod_04 = 42.0106 K
variable_mod_05 = 196.2 K
...
precursor_charge = 2 5
override_charge = 1

System info

You can find that printed on the Config tab.

  • OS and version: Windows 7 64bit with 24 GB ram 8 cpu
  • Java version: 1.8.61

Describe your experiment

General proteomics experiment description

Human, full cell lysate label with our probe digested with Trypsin

pepXML

I use the command line to run MSFragger, but I do not understand the result files generated by MSFragger when I try to find the identified PSMs, peptides, and proteins. Another question is how to find the time used by this run.

Recalibration with sliced database

Dear All,

I'm conducting an open mass window search with semi-specific cleavage on a ~15 MB database. Using a VM with 500 GB of memory, this worked fine with the previous release of MSFragger. However, since we had some lock mass errors, I would really like to use the automatic recalibration of the latest version. Unfortunately, it seems that database slicing does not apply to this function. I chose almost all options to reduce the search space (incl. fully specific digest, less peptide length variability, no variable modifications, ...), but a memory error always occurs. When checking the console, it seems that the problem is related to the fact that MSFragger tries to do the "Firstsearch" in one slice. Once I deactivate the automatic recalibration, my normal parameter file works perfectly.
Thus I would like to know if there is any way to set the parameters for the first search separately. Or does somebody have an idea how to avoid this error?

Best
Juergen
