Comments (29)
Hi Weixian,
Can you attach or e-mail me your fragger.params? For non-specific digests, we need to impose further constraints on the search space to reduce memory usage. Otherwise, it can quickly get out of control.
Andy
from msfragger.
- You can find the fragger.params file in the output directory that you specified in the GUI.
- Alternatively you can set the same parameters that you used in the GUI and click the "Save" button at the top of MSFragger tab.
- Please update the GUI to a more recent version: http://github.com/chhh/msfragger-gui/releases/latest
This will help with updates in the future.
from msfragger.
Thank you for the reply. Here's the params file.
fragger.params.zip
from msfragger.
By the way, in this file I didn't delete anything in the enzyme selection tab. I also tried deleting the enzyme name and digestion site, and it reported the same issue.
from msfragger.
@weixiandeng The size of the mzML file is more or less irrelevant. It's the size of the database (FASTA file) that matters most for RAM usage. How large is your FASTA file?
from msfragger.
In case the problem is that you are running out of memory, I suggest you reduce the maximum peptide length to, say, 25:
digest_min_length = 7
digest_max_length = 25
This will reduce the size of the fragment index and the overall memory requirement (if that's an issue with your run).
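To get a feel for why the maximum peptide length matters, here is a minimal Python sketch that counts the candidate peptides a non-specific digest produces from a single protein. It only counts sequence windows; it is not how MSFragger builds its fragment index, and the function name and the protein length of 500 are assumptions chosen for illustration.

```python
def count_nonspecific_peptides(protein_length: int, min_len: int, max_len: int) -> int:
    """Count the windows of a protein whose length lies within [min_len, max_len]."""
    total = 0
    for length in range(min_len, max_len + 1):
        # A protein of length n contains n - length + 1 windows of this length.
        total += max(0, protein_length - length + 1)
    return total


if __name__ == "__main__":
    n = 500  # hypothetical protein length
    for lo, hi in [(7, 25), (7, 50)]:
        count = count_nonspecific_peptides(n, lo, hi)
        print(f"lengths {lo}-{hi}: {count} candidate peptides")
```

For this hypothetical 500-residue protein, lengths 7-25 give 9215 windows and lengths 7-50 give 20790, before any variable modifications are considered.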
Alexey
from msfragger.
@weixiandeng As I said before, it would help if you shared info about your fasta file. What database are you using? How large is the file? Can you share the file?
from msfragger.
Assuming I'm using the human Swiss-Prot DB and have access to a 128GB machine, would that be enough? Should I have more?
from msfragger.
That should be more than enough with reasonable parameters for non-specific digests. I tried a non-specific digest on the human UniProt database (with reversed decoys) and peptide lengths 7-25, and managed to fit it all within 32GB of memory (-Xmx32G). You can try reducing max_variable_mods_combinations to 1000 to reduce the number of modified peptides.
from msfragger.
Have you tried a default closed search with trypsin digestion? Just to make sure the problem is really related to non-specific digestion.
Alexey
from msfragger.
Can you give us the database you're trying to use?
from msfragger.
I'm trying to do something similar and could imagine this no enzyme searching thing being very useful in the MHC peptides world and the search for undigested peptide hormones. In an idealized setting where you might have MS data for such a thing, do you have an idealized params file that could be tried for no enzyme searches? I'm used to using Comet, where you can define the enzyme with a "0" in the params file for no enzyme searching.
from msfragger.
Great. Thanks for the quick response. That is a good summary of the necessary parameters. In my particular case, I am actually interested in longer undigested peptides. Would it reduce the computational burden if I extended the range of the digestion length to like 20-50 or does that matter at all?
from msfragger.
Ok. Great. Thanks for the help!
from msfragger.
Hi. Thanks for all of your help. I'm using a compute cluster that has really a lot of resources (up to 1TB per node) and I still can't seem to make this work. I have a single .mzXML file that has an enrichment of non-digested peptides. I tried many many times now to get this to work with this command:
java -Xmx512G -jar MSFragger-20171106.jar fragger_no_enzyme-2.params 20180124_QEp1_CPBA_EASY03_025_30_SA_plasma_endopeps_Joan_1mL_StageTip_1to1_01.mzXML
and I get the following error:
Peptide index read in 81ms
Selected fragment tolerance 0.02 Da and maximum fragment slice size of 404630.20MB
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at e.a(Unknown Source)
at MSFragger.main(Unknown Source)
... 5 more
TROUBLESHOOTING: I now went back and took the same .mzXML file in the same directory using the same command as above and just ran it exactly the same using the fragger.params file that came with the MSFragger initial download and it runs past where the error is occurring. I changed each of the parameters individually and found that the following parameter causes this error to occur:
search_enzyme_cutafter = ARNDCQEGHILKMFPSTWYV
If I use all of the parameters that you outlined in the post above for no enzyme searches with leaving the default 'KR' in the 'search_enzyme_cutafter' line, I can make it run through (I don't know if the results are right, but the algorithm doesn't create the error above). This is even leaving the 7-50 amino acid length designations. If I put back the "search_enzyme_cutafter = ARNDCQEGHILKMFPSTWYV" and reduce the amino acid length to 8-25 like you suggested, I also recreate the error above. I even tried changing the length to something arbitrarily defined (in this case 20-21 amino acids) and it also failed. Is there something that I am missing or some other strategy that I should try?
Thanks so much for your help.
Best,
chris
PS - I forgot to mention that I also did this against an extremely oversimplified uniprot database where I deleted all of the proteins except for 2 and then added back the reverse sequences with philosopher.
from msfragger.
I might have just solved my own problem with a parameter that I wasn't changing and probably should have been. How does this parameter play into the rest of the parameter set? I set it to 0 and can at least get this no-enzyme search past the previous error.
num_enzyme_termini = 0 # 2 for enzymatic, 1 for semi-enzymatic, 0 for nonspecific digestion
from msfragger.
Yes, that is correct. If you were to leave things as enzymatic cleavage while setting your enzyme to cut at every single point, you would not get any peptides due to the limits on the number of missed cleavages. Non-enzymatic searches should always be done with num_enzyme_termini = 0.
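The effect can be illustrated with a simplified Python model of a fully enzymatic digest. This is only a sketch of the reasoning above, not MSFragger's implementation: the `digest` function, its arguments, and the missed-cleavage limit of 2 are assumptions, and exclusion rules such as "but not after" are left out.

```python
ALL_RESIDUES = "ARNDCQEGHILKMFPSTWYV"


def digest(sequence, cut_after, missed_cleavages, min_len, max_len):
    """Fully enzymatic digest: both peptide ends sit at cleavage sites or protein termini."""
    # Cut positions: protein start, after each matching residue, protein end.
    sites = [0]
    sites += [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in cut_after]
    sites.append(len(sequence))
    peptides = []
    for i, start in enumerate(sites[:-1]):
        # A peptide may span at most `missed_cleavages` internal cut sites.
        for j in range(i + 1, min(i + missed_cleavages + 2, len(sites))):
            peptide = sequence[start:sites[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    protein = "MKWVTFISLLLLFSSAYSRGVFRRDTHK"  # illustrative sequence only
    # Cutting after every residue caps peptide length at missed_cleavages + 1,
    # which is below the minimum length, so nothing survives the filter.
    print(len(digest(protein, ALL_RESIDUES, 2, 7, 25)))
    # Tryptic-style cutting after K/R still yields peptides.
    print(len(digest(protein, "KR", 2, 7, 25)))
```

With every residue as a cut site and two allowed missed cleavages, the longest possible peptide in this model is three residues, so the length filter leaves an empty peptide list.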
Andy
from msfragger.
Update on the parameters for nonspecific search:
Using MSFragger-GUI, please specify:
Enzyme Name: nonspecific
That way PeptideProphet will automatically recognize that the enzyme was nonspecific, and you will not need to add --enzyme nonspecific in the PeptideProphet tab.
Please also specify:
Cut After: ARNDCQEGHILKMFPSTWYV
Not After: (empty)
Cleavages: NON_SPECIFIC
If you edit fragger.params directly, specify the following:
search_enzyme_name = nonspecific
search_enzyme_cutafter = ARNDCQEGHILKMFPSTWYV
search_enzyme_butnotafter =
num_enzyme_termini = 0
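If you prefer to script the change rather than edit the file by hand, something like the following Python sketch could apply the four settings above. It assumes the `key = value  # comment` line layout seen in this thread; `set_params` and `NONSPECIFIC` are hypothetical names, not part of MSFragger or its GUI.

```python
import re

# The four settings listed above for a nonspecific search.
NONSPECIFIC = {
    "search_enzyme_name": "nonspecific",
    "search_enzyme_cutafter": "ARNDCQEGHILKMFPSTWYV",
    "search_enzyme_butnotafter": "",
    "num_enzyme_termini": "0",
}


def set_params(text: str, updates: dict) -> str:
    """Return params text with the given keys set, keeping trailing '#' comments."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z_]\w*)\s*=", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            new_line = f"{key} = {remaining.pop(key)}".rstrip()
            # Preserve any trailing comment from the original line.
            _, hash_mark, comment = line.partition("#")
            if hash_mark:
                new_line += f"  # {comment.strip()}"
            out.append(new_line)
        else:
            out.append(line)
    # Append keys that were not present in the original text.
    out.extend(f"{key} = {value}".rstrip() for key, value in remaining.items())
    return "\n".join(out) + "\n"
```

A typical use would be to read the existing file into a string, call `set_params(text, NONSPECIFIC)`, and write the result to a new file so the original stays untouched.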
As mentioned before, reduce the peptide length to 8-25 (perhaps less) and do not use variable mods other than M+16 (unless you have enough memory). If you want to add more variable mods, e.g. extra variable modifications on Cys (for MHC peptides), and the program crashes, please reduce the peptide length to 8-15.
Adding STY+80 with a nonspecific search will certainly require a cluster with a lot of memory. Instead, you can perform searches without variable mods like STY+80 by using the mass_offsets option. Please specify the mass shifts of interest, e.g. for phosphorylation, as mass_offsets: 0/79.9663.
Hopefully we can put together a tutorial/website soon explaining these options better, and we will also provide sample parameter files for various scenarios.
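As a rough illustration of why a variable mod on S, T and Y is so much more expensive than M+16, this Python sketch counts how many modified forms a single peptide can take. It is a back-of-the-envelope model only: `max_mods` is a hypothetical per-peptide limit, and the sketch ignores any cap such as max_variable_mods_combinations.

```python
from math import comb


def count_sites(peptide: str, residues: str) -> int:
    """Count the residues in a peptide that can carry the variable modification."""
    return sum(peptide.count(residue) for residue in residues)


def modified_forms(n_sites: int, max_mods: int) -> int:
    """Ways to place 0..max_mods copies of one variable mod on n_sites residues."""
    return sum(comb(n_sites, k) for k in range(min(n_sites, max_mods) + 1))


if __name__ == "__main__":
    peptide = "MSSTYASPTLK"  # illustrative sequence only
    for label, residues in [("M+16", "M"), ("STY+80", "STY")]:
        sites = count_sites(peptide, residues)
        forms = modified_forms(sites, max_mods=3)
        print(f"{label}: {sites} sites, {forms} forms")
```

Every extra form is another entry to index, and in a nonspecific search this multiplier applies to every window of every protein. A mass offset avoids that enumeration because the peptide list itself does not grow.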
Alexey
from msfragger.