
everhartlab / sclerotinia-366

Analysis for "Population structure and phenotypic variation of *Sclerotinia sclerotiorum* from dry bean (*Phaseolus vulgaris*) in the United States"

Home Page: https://doi.org/10.7717/peerj.4152

License: Other

R 8.78% Makefile 43.10% Shell 48.12%
reproducible-paper fungal plant-pathology reproducible-research reproducible-science docker-container r rmarkdown sclerotinia sclerotinia-sclerotiorum

sclerotinia-366's People

Contributors

sporangia, zkamvar


sclerotinia-366's Issues

Referencing specific sections of Supplemental Information

There are a couple of sections in the manuscript that point the reader towards supplemental information, but reviewer 2 pointed out that it's unclear as to where exactly they should be going.

At the time of the review, the paper linked to the OSF repository https://osf.io/ejb5y/, which serves as an archive for this repo.

Here's the example text:

In contrast, when we compared the three cultivars, Beryl, Bunsi, and G122, we found no significant differentiation (Supplementary Information).

I've come up with two solutions (that have been expanded by both Pat Schloss and Ben Marwick, in this twitter thread):

  1. Adding text pointing the reader to a specific section with a citation to the data:

    In contrast, when we compared the three cultivars, Beryl, Bunsi, and G122, we found no significant differentiation (See the section on Host Differentiation in the wmn-differentiation.md file in the Supplemental files https://osf.io/ejb5y/)

  2. Adding a straight up link (as a footnote) pointing to the section:

    In contrast, when we compared the three cultivars, Beryl, Bunsi, and G122, we found no significant differentiation (Supplementary Information [1])

Currently, I've opted for 2, but I'm not quite settled yet. I would love to hear opinions.

Hao Ye suggested section numbering. This would be a good option if I were rendering my reports in pdf... though it may be good practice to go back and number the sections before submission ¯\_(ツ)_/¯

Footnotes

  1. https://github.com/everhartlab/sclerotinia-366/blob/v1.0/results/wmn-differentiation.md#host-differentiation

Simplify MLG/MCG graph figure

@sporangia has brought up a good point that the MLG/MCG graph figure (Fig. 3 so far) is way too complex to discern. She suggested a summary and liked the idea of showing a subset of the graph with the larger graph inset and grouping that with figure S1 (The MCG bar graph).

Save 16 locus data as csv file

The 16 locus data is currently stored as an excel file, but to future-proof it, the data should be stored as a csv. This can be done at the end of the data_comparison Rmd file.

Modify dockerfile to use dependency dockerfile

I realized that it's valuable to have an image with all the dependencies (without the data and code), since building the full Docker image takes a lot of resources beyond the analysis itself. I have created a separate repository (https://github.com/everhartlab/sclerotinia-366-dependencies) to hold the Dockerfile that builds them. This means that we can replace much of the current Dockerfile with:

FROM zkamvar/sclerotinia-366-dependencies
MAINTAINER Zhian Kamvar <[email protected]>

## Copy the current directory to /analysis
COPY . /analysis

## Run the analysis
RUN . /etc/environment \
&& cd /analysis \
&& make clean \
&& make -j 4

Assess Virulence by MLG and Region

Otto-Hanson et al. assessed virulence by Region, showing a significant difference. Since we now know that MCG is a bit of an unreliable measure, it would also be important to assess virulence by MLG and Region.

Investigate differentiation between white mold nursery populations

The white mold nursery populations are unique because they are not fungicide treated and have the same cultivars planted in them year after year.

The question becomes: are white mold nurseries differentiated from each other, or are they more or less homogeneous? We could use AMOVA to test this, with location and binary source (wmn or non-wmn) as the hierarchy.

Investigate effect of cultivar

Because the same three cultivars were planted in the white mold nurseries, we can test if there is any effect of cultivar on the population structure.

We have three cultivars:

  • G122: resistant
  • Bunsi: ?
  • Beryl: susceptible

My hypothesis is that cultivar has no effect on population structure simply because of the inoculum load in the soil. We can test for differentiation using DAPC.

Review Methods

Hi @sporangia,

I believe I'm finished with the methods for now and would appreciate a review.

I've created a branch called sydney-methods-review if you want to use that for your review. Just pull the repository and then select sydney-methods-review from the bottom of the git branch menu in RStudio.

Investigate statistical measures for mlg/mcg graph

The MLG-MCG graph is also known as a bipartite network. This is a special kind of network where there are two kinds of nodes and connections may only be between nodes of different types.

A simple method to test this would be to compare it with random graphs using igraph's sample_bipartite(), but there are several papers that talk about statistical inference of these graphs:

Strona and Veech, 2015 (Methods in Ecology and Evolution)
Saracco et al., 2015 (Scientific Reports)
Yildirim and Coscia, 2014 (PLoS One)

I have not yet read these papers deeply, but given the size of our data set, we have a good chance of more formally investigating this angle.
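The random-graph comparison mentioned above can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's R/igraph code: it draws random bipartite graphs with the same number of edges as the observed one and asks how often a simple statistic is at least as large as observed. The statistic (number of MLGs linked to more than one MCG), the function names, and the toy edge list are all assumptions made for illustration, not values from the study.

```python
import random


def random_bipartite(n_top, n_bottom, n_edges, rng):
    """Draw n_edges distinct (top, bottom) pairs uniformly at random."""
    all_pairs = [(t, b) for t in range(n_top) for b in range(n_bottom)]
    return set(rng.sample(all_pairs, n_edges))


def shared_top_nodes(edges):
    """Count top nodes (e.g. MLGs) linked to more than one bottom node (e.g. MCGs)."""
    degree = {}
    for top, _bottom in edges:
        degree[top] = degree.get(top, 0) + 1
    return sum(1 for d in degree.values() if d > 1)


def null_p_value(observed_edges, n_top, n_bottom, n_sim=999, seed=1):
    """Share of random graphs whose statistic is >= the observed statistic."""
    rng = random.Random(seed)
    observed = shared_top_nodes(observed_edges)
    hits = sum(
        shared_top_nodes(
            random_bipartite(n_top, n_bottom, len(observed_edges), rng)
        ) >= observed
        for _ in range(n_sim)
    )
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical toy graph: 4 MLGs (0-3) and 3 MCGs (0-2).
toy_edges = {(0, 0), (0, 1), (1, 1), (2, 2), (3, 2)}
print(shared_top_nodes(toy_edges))
print(null_p_value(toy_edges, n_top=4, n_bottom=3))
```

A fixed-edge-count null model like this ignores node degrees, which is one reason the papers listed above may offer more appropriate tests.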

Author Reviews

Sydney Everhart

Contribution: supervised the data analysis, analyzed the data, contributed analysis tools, wrote the paper, edited and reviewed drafts of the paper.

Jim Steadman

L.36: ...that can be a yield-limiting...
L.37: Comment --- Resistance has been identified but not in commercial beans
L.85: The early nursery
L.96: ddH2O for 3 min.
L.393: While the evidence may suggest host...
L.490: MCG does not necessarily represent...

Contribution: Conceived and designed experiments, organized network of white mold screening nurseries, provided S. sclerotiorum isolates, edited and reviewed drafts of the paper.

B. Sajeewa Amaradasa

L.67: isolates collected over 10 years between 2003 and 2012
L.72: ...effect of cultivar on genetic diversity of the pathogen by assessing...
L.116: Question: is this ~2.5cm? If you say >2.5cm, what is the upper limit? (ZNK: perhaps I worded this incorrectly; perhaps: "removing plant growth beyond 2.5 cm above the fourth node.")
L.253: Comment: better to give absolute numbers since P<0.05 can be 0.0001, too.
L. 261: Comment: Is this for Year, Host, and MCG? If so, better to mention that or remove region and just mention P ≤ 0.007
L.277: ...effect for MLHs (P = 7.44e-4) for MLHs, with means that...
L.341: Did Carbone and Kohn study temporal structure as well? (ZNK: no, they did not investigate that)
L.462: What is the substrate others have used?

Contribution: analyzed the data, contributed analysis tools, wrote the paper, edited and reviewed drafts of the paper.

Serena McCoy

L.122: Replace with more specific language --- Each mycelial mat was collected in a filtered Büchner funnel

Acknowledgements: Becky should also be acknowledged for the lab and greenhouse work she did.

Contribution: Carried out experiments (MCG assessment, aggressiveness ratings, genotyping), edited and reviewed drafts of the paper.

Reviews back: minor revisions

Editor's Decision

Your manuscript has been seen by three qualified reviewers. Based on their detailed assessments and my own, I feel this work is well suited for publication in PeerJ after a number of minor revisions.

Reviewer 1

Basic reporting

  • The article is well written.
  • Literature References are relevant and fits the study and within a broader field.
  • The structure of the article is of good standard. The format of PeerJ is the opportunity to provide greater detail and more in-depth discussions that may be restricted in other journals. However, there is a delicate balance of introducing as much detail as necessary while retaining the attention of the reader. This article is on the borderline of introducing too much detail. For instance, is the use of Figures 1, 2 and 4 all necessary? The underlying question is, Do the Figures and additional detail in the paper enhance the paper or create a distraction from the primary message of the study? And while there is great detail on different analyses there is lack of depth on the Mexican population? Is this an important population and why?
  • Complete study without indications of fracturing of the research to increase publication count.

Experimental design

  • Well defined gap in scientific knowledge, confirms the efficacy of white mold screening nurseries through population genetics analysis and associated aggressiveness of isolates.
  • Appropriate use of technologies for investigating genetic populations with that are predominantly clonal. Identified and evaluated an appropriate phenotype of interest (aggressiveness). Identified the limitations of the study, in that the pathogen is not host specific and may influence conclusions of the study.
  • Methods would allow for an investigator to replicate the same or similar study to confirm results or develop new studies.

Validity of the findings

  • Study provides novel results in the scientific field of S. sclerotiorum on dry beans an important agricultural commodity. Potentially validates the continued use of nurseries for resistance selection in breeding and introduces a potential area of concern/focus for future research with the Mexican population of pathogens.
  • Data is provided in repository: https://github.com/everhartlab/sclerotinia-366/. Review of data available (includes: year, location, host…) indicates the opportunity for evaluation/use of data to confirm results and/or in future studies. Statistical analysis of data was appropriate.
  • Conclusions are well stated and appropriate.

Comments for the author

Overall this paper did a wonderful synthesis of the topic while applying and interpreting the results with good scientific standards. I found only one important area of correction which was in the abstract requiring the clarification of "11 states in the United States of America,...". This is noted in the material and methods but is absent in the abstract.

From a scientific perspective, it should be noted that additional phenotypes for aggressiveness should be evaluated in future research. The straw test is only an indicator for one type of pathogen potential and limits interpretation of the populations. The conclusion given in this study is valid but greater resolution may have been possible with more phenotype data (and of course larger populations). I was also interested in more interpretation/analysis as to why the Mexican population did not have MLH shared with any other regions, the absence of clarity on this is done at a loss.

Reviewer 2

Basic reporting

Manuscript is written clearly, with appropriate references but limited introduction. Please see general comments regarding that. The authors have clearly stated research questions, hypothesis, detailed M&M, concise results, and extensive discussion.

Experimental design

Research questions were clear and addressed appropriately in subsequent analyses. Methodology and data analyses are described with sufficient details to allow reproducibility.

Validity of the findings

Data was robust, available for reproducibility and well explained.

Comments for the author

Population structure and phenotypic variation of Sclerotinia sclerotiorum from dry bean in the United States by Kamvar et al. utilized microsatellite loci to answer number of interesting questions including evaluating phenotypic and genetic diversity of S. sclerotiorum in the nurseries (here referred as natural populations since no control was used to limit disease spread) using regional differences and across different time intervals (span of nine years). In addition, the authors investigated correlation between mycelial compatibility groups and multilocus haplotypes among these populations. Introduction is a bit short and I would suggest expanding few sections (please see specific comments below). Materials and methods are precise and well written and I really appreciated data availability, including all sorts of analyses, which was refreshing. I also like the acknowledgment of shortcomings of the analyses (compound microsats example), which can be challenging to work with. Results were explained well with few exceptions that need some clarification. Overall, well written manuscript with interesting and relevant results for nursery producers/growers. As such, I recommend it for publication with minor revisions.

Reviewer 2 gave specific comments in PDF form:

reviewer-2-comments.pdf

Reviewer 3

Basic reporting

The paper focuses on addressing a question of the genetic diversity of Sclerotinia and addressing how this genetic diversity could be ligated to virulence and compatibility between strains. The context provided for the paper is sufficient to state the goal of the research, and the authors show knowledge of the system and the analyses required to achieve the stated goals. The article is sound and data/code for analyses have been made public and easily accessible. In general, the conclusions were supported by the data, there are some points that required some clarification, but overall the research was well developed and the data is nicely presented to follow up the document.

Experimental design

The paper is one of the most extensive population studies in plant pathogens addressing questions on the structure of the population and the correlation of genetic diversity with specific traits. The goals and research question stated by the authors were mostly addressed with the data and there are points that despite the difficulty of the question, there are good approximations to the answer. For instance, the limited information provided by SSRs could not potentially lead correlation with phenotypic traits, therefore the authors acknowledge this, and approach the question using different statistical methods.
The methods and the analyses conducted are well explained and the authors provided all the code and data for corroborating the results.

Validity of the findings

Overall the study provides an extensive view of the genetic diversity of S. sclerotiorum in screening nurseries and commercial fields of dry bean. Despite that the question of how genetic diversity links phenotypic traits like MCGs and virulence has been approached before, the authors analyzed a large number of isolates using multiple markers and different statistical approaches to answer this question. The result is still negative since there are limitations by the markers, and since this pathogen has reduced diversity. Sclerotinia is soilborne pathogen and it is expected that there should be constraint populations at the geographical level, however, there is a reduced diversity suggesting little differentiation among regions. This is addressed by the authors, where soil or contaminated plant material could have played an important role on the transmission of this pathogen. In addition, the goal of establishing a census of the genetic diversity and its relation to aggressiveness is major task but necessary to establish a baseline for breeders to target a representative pool of the pathogen’s population.
Nevertheless, the authors go through a good job of addressing the issues and limitations of the study. There are some points or comments that I would recommend to the authors to discuss and/or consider, those were included on the general comments.

Comments for the author

  • The availability of the data and the analyses posted on github was really helpful and it made very enjoyable to read the paper and understand some of the logic of the authors, it has been a great experience. It also provides a good view on the paper and it helps to assess paper and give recommendations on the paper. Kudos! I want to compliment the authors for making these resources available.

  • Since populations for certain areas were only collected within a single year, variation between years and region should also be looked at with caution. Are year and region still important if samples with more than one year are retained? How much variability is explained if so? It will a good way to corroborate if there is a continuum of genotypes or every year is bottleneck increasing diversity and to determine how much populations sampled once contribute to the analysis. However, the authors are aware of this on line 350-352.

  • One thing to be addressed is how well these microsatellites represent the whole genome, this was not addressed on Sirjusingh and Kohn (2001) maybe due to the lack of the genome sequence. However, this could explain the lack of power of the existing set to represent the haplotypes in the population. Despite, that most studies are using reduced genome approaches or more powerful techniques, it will be informative for other researchers still using this set of microsatellites to have this information. Njambere et al. (2010; 10.1139/G10-019) did an approximation for S. trifoliorum using linkage groups, but I am not aware of something similar for S. sclerotiorum to corroborate these SSRs.

  • The relation between MCGs, MLH and aggressiveness is quite interesting, however, it is hard to follow in the text. The graph in github (https://github.com/everhartlab/sclerotinia-366/blob/master/results/mlg-mcg.md), summarizes really well some on this information as well as table 3. Maybe you can consider including this graph either on the article or add a column to table 3 with the average aggressiveness. The graph might be more meaningful since you can see the variability of aggressiveness of the different strains by the different factors.

  • L384-404 The authors have a discussion on differentiation of the population based on region. Then, the discussion is centered on how regions like WA still had a considerable amount of variation within the US locations sampled. However, one of the points discussed is that one of the locations was inoculated in 2002 and then crop history differed between the two locations sampled. As the authors suggested there is little or minimal differentiation between isolates from different hosts. Nonetheless, the source of the sclerotia used to inoculated the fields is also different. This could be also part of the differences that the authors see in 2008. Despite that other studies have indicated a limited differentiation between hosts, it seems that there is some effect on the genetic. Is the virulence different on these isolates with respect to other isolates? Aldrich-Wolfe et al. (2015) presents information on the allele sizes for the markers used, are these haplotypes present in the current study? It will be interesting to determine if most haplotypes are share or not, and if those present across multiple hosts have a different virulence. However, the major point of the paper is dry bean but the history on crop rotation could explain some of the variability across years.

  • P6L262 varaibles change to variables

Missing endif in Makefile

You are missing an endif in the Makefile:

# Testing whether or not we are in a docker container
ifeq ("$(wildcard /proc/1/cgroup)","")
	RCMD="devtools::install()"
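# grep -c prints the match count; combining it with -q prints nothing, so the test would always be true
else ifneq ("$(shell grep -c docker /proc/1/cgroup)","0")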
	RCMD="Sys.Date()"
else
	RCMD="devtools::install()"
endif
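The decision this Makefile conditional makes can be sketched in Python to make the three branches explicit. This is only an illustration, not part of the repository: the function names are hypothetical, and the "docker appears in /proc/1/cgroup" heuristic is taken from the snippet above and may not hold on every container runtime.

```python
from pathlib import Path


def read_cgroup(path="/proc/1/cgroup"):
    """Return the cgroup file's text, or None if it is missing or unreadable."""
    try:
        p = Path(path)
        return p.read_text(errors="replace") if p.is_file() else None
    except OSError:
        return None


def choose_rcmd(cgroup_text):
    """Mirror the Makefile: skip the install step only inside a Docker container."""
    if cgroup_text is not None and "docker" in cgroup_text:
        return "Sys.Date()"           # inside Docker: use a no-op R command
    return "devtools::install()"      # no cgroup file, or no docker entry


print(choose_rcmd(read_cgroup()))
```

Separating the file read from the decision keeps the logic testable on machines that have no /proc filesystem.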

Change scaling of Figures 4 and S2

Figure 4 and Figure S2 in the paper currently have the nodes scaled by radius, not area 🤦‍♂️

Luckily, the numbers are presented on the figure, so the interpretation does not change even if I rescale the nodes by area.
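The difference between the two scalings can be shown with a small calculation. This is a minimal sketch in Python, not the plotting code used for the paper, and the function names and numbers are made up for illustration. If a node's value is mapped directly to its radius, its area grows with the square of the value; mapping the square root of the value to the radius keeps area proportional to the value.

```python
import math


def radius_linear(value, max_value, max_radius):
    """Radius proportional to value (area then exaggerates differences)."""
    return max_radius * value / max_value


def radius_for_area(value, max_value, max_radius):
    """Radius chosen so that the circle's area is proportional to value."""
    return max_radius * math.sqrt(value / max_value)


# Hypothetical node with a quarter of the largest node's value.
r_lin = radius_linear(25, 100, 10)
r_area = radius_for_area(25, 100, 10)
print(r_lin, (r_lin / 10) ** 2)    # radius and area relative to the largest node
print(r_area, (r_area / 10) ** 2)
```

With radius scaling, the smaller node is drawn with one sixteenth of the largest node's area instead of one quarter.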

For reference, figure 4 would look like this rescaled (I also modified the legend because the sizes there changed as well):

image

Figure S2 would look like this:

image

@sporangia what do you think? Should we contact PeerJ about this?

Fix barplot figure

The barplot figure is great, but it would be better if it didn't have the singletons. Removing the singletons will allow for the use of a more normal aspect ratio and make things slightly less confusing 😄
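Dropping singletons before plotting amounts to counting each group and keeping only those seen more than once. A minimal sketch in Python, with hypothetical group labels standing in for the real data and a hypothetical function name:

```python
from collections import Counter


def drop_singletons(labels):
    """Count each label and keep only those that occur more than once."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n > 1}


# Hypothetical group assignments, one per isolate.
toy_labels = ["A", "A", "A", "B", "B", "C", "D"]
print(drop_singletons(toy_labels))
```

The remaining counts are the bar heights; the number of dropped singletons could be reported in the caption so that no information is lost.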

PeerJ has some nits to pick

Figures 

  1. Please upload your figures in either EPS, PNG, or PDF (vector PDFs only), measuring at least 900 by 900 pixels and eliminating excess white space around the images, as primary files here https://peerj.com/manuscripts/20972/files.
  2. Please use numbers to name your files, example: Fig1.eps, Fig2.png.
  3. Figure 6 has multiple parts. Each part needs to be labeled alphabetically to use (A, B, C and D). Please provide a replacement figure measuring at least 900 by 900 pixels, saved as PNG, EPS, or PDF (vector images) file format without excess white space around the images. Each figure with multiple parts should label each part alphabetically (e.g. A, B, C and D) and all parts of each single figure should be submitted together in one file.
  4. If you have figures composed in the LaTeX source file we can use that at the time of production. Please leave a Note to Staff at https://peerj.com/manuscripts/20972/declarations/#other if you choose to provide the figures in the LaTex source file so that staff will know that's where it can be found. (example note to staff: Figure 1 and 2 are in the source file manuscript and can be found in the .tex file).
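The 900 by 900 pixel minimum for PNG files can be checked before upload by reading the width and height stored in the file's IHDR header. The following is a minimal sketch in Python using only the standard library; the function names are hypothetical and it handles PNG only, not EPS or PDF.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian unsigned 32-bit integers at bytes 16-23.
    return struct.unpack(">II", data[16:24])


def meets_minimum(data, minimum=900):
    """True if both dimensions are at least `minimum` pixels."""
    width, height = png_dimensions(data)
    return width >= minimum and height >= minimum


# Build just the first 24 bytes of a PNG to demonstrate; a real file would be
# read with open(path, "rb").read(24).
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1200, 900)
print(png_dimensions(header), meets_minimum(header))
```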

Tables 

  1. We ask for the tables in Word (composed in Word, not images pasted in Word docs), but if you have them composed in the LaTeX source file we can use that instead at the time of production.
  2. Please leave a Note to Staff at https://peerj.com/manuscripts/20972/declarations/#other if you choose to provide the tables in the LaTex source file so that staff will know that's where it can be found. (example note to staff: Tables 1, 2, 3 are in the source file manuscript and can be found in the .tex file).

Remove Supplemental Files from Manuscript Source File 

The supplemental files should only be supplied as separate files in the Supplemental Files section and should not appear in the manuscript source document. Please remove them from the manuscript document and re-upload the manuscript here: https://peerj.com/manuscripts/20972/files, and then upload the Supplemental Files separately in the appropriate section. Supplemental Files are not typeset, they are published as downloadable files with the titles and legends exactly as they are entered into the system, so please ensure the titles and legends are complete. Examples of such legends are here: https://peerj.com/articles/2344/#supplemental-information.

Data not Shown 

We noted your statement “Data not Shown” (in the Figure 1 legend "Edges extending from MLHs displayed to other MCGs are not shown"). We would like to draw your attention to our Data Sharing policy as detailed at https://peerj.com/about/policies-and-procedures/#data-materials-sharing. Of course, the inclusion of this statement does not necessarily mean that our policy is being violated, so please can I ask you to leave a note to staff at https://peerj.com/manuscripts/20972/declarations/#other or email me (at [email protected]) to let me know the reason(s) for not showing this data in this instance?

ZNK comment: WTH?! There is a data accessibility section that links to this repo!!
(╯°□°)╯︵ ┻━┻

Funding Statement 

  1. Please remove all financial and grant disclosure information from the source file manuscript. This information should only be provided in the Funding Statement here: https://peerj.com/manuscripts/20972/declarations/#question_18.
  2. Please use full names, instead of initials, for the author names in the Funding Statement. Edit here https://peerj.com/manuscripts/20972/declarations/#question_18.

Competing Interests 

Please remove all competing interests information from the source file manuscript and make sure it is included in your Competing Interest Statement instead here https://peerj.com/manuscripts/20972/declarations/#question_17.

Author Contributions 

Please remove all author contributions information from the source file manuscript and make sure it is included in your Authorship Statement instead here https://peerj.com/manuscripts/20972/declarations/#authorship.
