sortee-github-hackathon / manuscript

This repository implements an automated system to write our collaborative manuscript, while tracking changes and contributions.

Home Page: https://sortee-github-hackathon.github.io/manuscript/v/latest/index.html

License: Other

Shell 1.29% HTML 97.39% TeX 1.30% SCSS 0.01%
github ecology evolution reproducible-research reproducible-science open-science documentation open-data open-source

manuscript's Introduction

Not just for programmers: How GitHub can accelerate collaborative and reproducible research in ecology and evolution

If you use the code in this repository in a publication, please cite the published paper:

"Braga, P. H. P., Hébert, K., Hudgins, E. J., Scott, E. R., Edwards, B. P. M., Sánchez Reyes, L. L., Grainger, M. J., Foroughirad, V., Hillemann, F., Binley, A. D., Brookson, C. B., Gaynor, K. M., Shafiei Sabet, S., Güncan, A., Weierbach, H., Gomes, D. G. E., & Crystal-Ornelas, R. (2023). Not just for programmers: How GitHub can accelerate collaborative and reproducible research in ecology and evolution. Methods in Ecology and Evolution, 00, 1–17. https://doi.org/10.1111/2041-210X.14108"

BibTeX entry:

@article{bragaNotJustProgrammers2023,
  title = {Not Just for Programmers: {{How GitHub}} Can Accelerate Collaborative and Reproducible Research in Ecology and Evolution},
  author = {Braga, Pedro Henrique Pereira and H{\'e}bert, Katherine and Hudgins, Emma J. and Scott, Eric R. and Edwards, Brandon P. M. and S{\'a}nchez Reyes, Luna L. and Grainger, Matthew J. and Foroughirad, Vivienne and Hillemann, Friederike and Binley, Allison D. and Brookson, Cole B. and Gaynor, Kaitlyn M. and Shafiei Sabet, Saeed and G{\"u}ncan, Ali and Weierbach, Helen and Gomes, Dylan G. E. and {Crystal-Ornelas}, Robert},
  year = {2023},
  journal = {Methods in Ecology and Evolution},
  volume = {n/a},
  number = {n/a},
  pages = {1-17},
  doi = {10.1111/2041-210X.14108},
  copyright = {All rights reserved},
  langid = {english},
  keywords = {collaboration,data management,ecoinformatics,GitHub,open science,project management,reproducible research,version control},
}

Subject: The use of GitHub in Ecology and Evolution

Manuscript description

A friendly guide to GitHub and all the things you can currently do with it. Very few papers focus on GitHub as a tool for collaboration. We will also mention where GitHub falls short as a tool for collaboration.

Important links and dates

In this section you'll find a few important links to help us keep track of documents we use outside of the GitHub ecosystem. These include a Google Slides deck where we are working on figures, the original outline we made for the manuscript in HackMD, and meeting notes.

Links Figure brainstorming

Project deadlines and dates

  • March 16, 2022: figures and tables complete
  • March 30, 2022: everyone does read through for general edits
  • April 13, 2022: Several authors do final detailed read through
  • April 27, 2022: Everyone approved before submission
  • June 1, 2022: Submit! 🎉
  • August 11, 2022: Submitted to Nature Ecology and Evolution;
  • August 22, 2022: Declined;
  • October 27, 2022: Submitted to Methods in Ecology and Evolution;
  • November 17, 2022: Returned for Major Revisions;
  • March 5, 2023: Last date for coauthor approval of the revision, response letter, authorship order, and author contributions;
  • March 9, 2023: Deadline for the submission of the revision!
  • March 10, 2023: Revised version submitted to Methods in Ecology and Evolution!
  • March 10, 2023: Accepted for publication in Methods in Ecology and Evolution!!! 🎉

Contributing

A free, personal GitHub account is necessary to contribute to this project.

To contribute in writing, you must follow the guidelines described within the CONTRIBUTING.md file.

In a nutshell, suggestions about the literature require the creation of discussions, and written contributions require the modification of files within the content directory and pushing changes through pull requests.

Authorship Guidelines

Authorship contributions are categorized following the guidelines from the CRediT Taxonomy and the International Committee of Medical Journal Editors.

All prospective authors must follow the contributing guidelines within the CONTRIBUTING.md file. There you will find out that you are encouraged to write a few words about yourself in the Self-Introductions discussion section, and you will also see how to fill-in your author information once you contribute to this project.

Repository directories & files

The directories and main files are as follows:

  • / (main root): this directory contains this document, README.md, which gives users general information about this repository and our project.
  • CONTRIBUTING.md contains procedures and directions for prospective authors to contribute to this manuscript.
  • USAGE.md contains getting-started guidelines for Git, information on formatting text, citing references, adding figures and tables, and other manuscript editing.
  • content contains the manuscript source, which includes markdown files as well as inputs for citations and references and figures.
  • R contains R scripts and RMarkdown documents used to generate some of the figures and tables.
  • data contains .csv files with raw data used in generating some figures and tables.
  • output (and the output and gh-pages branches) contains the outputs (generated files) from Manubot including the resulting manuscript files (in HTML, PDF, and other formats). You should not edit these files manually, because they will be overwritten by the Manubot.
  • webpage is a directory meant to be rendered as a static webpage for viewing the HTML manuscript.
  • build contains commands and tools for building the manuscript.
  • ci contains files necessary for deployment via continuous integration.
  • LICENSE.md and LICENSE-CC0.md contain the licenses associated with Manubot and with the content we are developing in this project. Please see the "License" section below.

Continuous Integration

Whenever a pull request is opened, CI (continuous integration) will test whether the changes break the build process to generate a formatted manuscript. The build process aims to detect common errors, such as invalid citations. If your pull request build fails, see the CI logs for the cause of failure and revise your pull request accordingly.

When a commit to the main branch occurs (for example, when a pull request is merged), CI builds the manuscript and writes the results to the gh-pages and output branches. The gh-pages branch uses GitHub Pages to host the following URLs:

For continuous integration configuration details, see .github/workflows/manubot.yaml.

NOTE: Currently the CI build process does not run and render R Markdown documents. For full reproducibility, files in /R/ need to be 'knit' manually to generate some files needed to build the complete manuscript.

License

License: CC BY 4.0 License: CC0 1.0

Except when noted otherwise, the entirety of this repository is licensed under a CC BY 4.0 License (LICENSE.md), which allows reuse with attribution. Please attribute by linking to https://github.com/SORTEE-Github-Hackathon/manuscript.

Since CC BY is not ideal for code and data, certain repository components are also released under the CC0 1.0 public domain dedication (LICENSE-CC0.md). All files matched by the following glob patterns are dual licensed under CC BY 4.0 and CC0 1.0:

  • *.sh
  • *.py
  • *.yml / *.yaml
  • *.json
  • *.bib
  • *.tsv
  • .gitignore

All other files are only available under CC BY 4.0, including:

  • *.md
  • *.html
  • *.pdf
  • *.docx
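
The licensing rule above can be sketched in a few lines of Python. This is only an illustration, not a script that ships with this repository: the function name `licenses_for` and the example paths are hypothetical, and it assumes the glob patterns are matched against the file name alone, which the text does not spell out. It also ignores the "except when noted otherwise" caveat.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Glob patterns listed above as dual licensed (CC BY 4.0 and CC0 1.0).
DUAL_LICENSED_PATTERNS = [
    "*.sh", "*.py", "*.yml", "*.yaml", "*.json", "*.bib", "*.tsv", ".gitignore",
]


def licenses_for(path: str) -> list:
    """Return the licenses that apply to a repository path.

    Assumption: patterns are tested against the file name only,
    not against the full path.
    """
    name = PurePosixPath(path).name
    if any(fnmatchcase(name, pattern) for pattern in DUAL_LICENSED_PATTERNS):
        return ["CC BY 4.0", "CC0 1.0"]
    return ["CC BY 4.0"]


# Hypothetical paths, used only to exercise the function.
for example in ["build/build.sh", ".gitignore", "README.md"]:
    print(example, "->", licenses_for(example))
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every operating system.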

Please open an issue for any question related to licensing.

About Manubot

Manubot is a system for writing scholarly manuscripts via GitHub. Manubot automates citations and references, versions manuscripts using git, and enables collaborative writing via GitHub. An overview manuscript presents the benefits of collaborative writing with Manubot and its unique features. The rootstock repository is a general purpose template for creating new Manubot instances, as detailed in SETUP.md. See USAGE.md for documentation on how to write a manuscript.

Please open an issue for questions related to Manubot usage, bug reports, or general inquiries.

manuscript's People

Contributors

aariq, adbinley, aguncan, brandonedwards, colebrookson, drmattg, dylangomes, emmajhudgins, fhillemann, helenweierbach, kaitlyngaynor, katherinehebert, lunasare, pedrohbraga, robcrystalornelas, shafieisabets, vjf2

manuscript's Issues

What to do about URLs

I've noticed that the Manubot-generated citations for URLs don't look particularly good. Exhibit 1:
[Screenshot: Screen Shot 2022-05-10 at 5 02 06 PM]

We have a lot of URLs in our manuscript, even after going through and trying to track down proper citations. I wonder if URLs might be better formatted as footnotes rather than in-text citations?

Formatting of contributor roles table

The contributor roles table is looking great! I think that one way to make it clearer, and to integrate the roles and the features + descriptions, would be to indicate which features will be used by which actors/roles. Not sure exactly what this looks like, but maybe in the style of a feature comparison table where each row is the 'user' and the column is the GitHub 'feature'? Not sure how the use cases fit in, though.

from Google, just so you know what I mean:
[Image: example feature comparison table]

Incorrect format for DOI citations

There are a number of broken citations. For example, @doi.org/10.1371/journal.pbio.3000763 which should actually be @doi:10.1371/journal.pbio.3000763. Can fix with a simple find and replace when doing the big revisions.
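
A minimal sketch of that find and replace in Python, in case it helps. The function name `fix_doi_citations` and the regular expression are illustrative assumptions, not part of Manubot. The pattern only rewrites `@doi.org/` when a DOI (starting with `10.`) follows, so correctly formatted keys are left alone.

```python
import re

# Malformed prefix "@doi.org/" followed by a DOI, which always starts with "10.".
BROKEN_DOI = re.compile(r"@doi\.org/(?=10\.)")


def fix_doi_citations(text: str) -> str:
    """Rewrite '@doi.org/10.xxxx/...' citation keys as '@doi:10.xxxx/...'."""
    return BROKEN_DOI.sub("@doi:", text)


# Sample sentence made up for illustration; the first DOI is the one from this issue.
sample = (
    "See [@doi.org/10.1371/journal.pbio.3000763] "
    "and [@doi:10.1111/2041-210X.14108]."
)
print(fix_doi_citations(sample))
```

The same substitution could be applied to each markdown file in the content directory once the big revisions are in.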

cover letter submission - a couple of comments

@robcrystalornelas @pedrohbraga - I am not sure if this is the best place for discussing the cover letter - but I have some thoughts...(some of which are pedantic and none of which are "hills I will die on")

This line seems a little orphaned, and I think the following paragraph, which speaks to the potential audience for this paper, is more valuable coming straight after the opening "gambit".

The term "self-teach" jars my English sensibilities. I might extend this sentence to something like: "Albeit, many researchers lack exposure to adequate software development practices and thus are required to dedicate valuable time and effort to learning how to use research-facilitating tools. Formal academic courses on these types of tools in Ecology and Evolution are often unavailable and researchers are left to learn through informal blogs, videos and other online forums. This may present a plethora of practical barriers to applying adequate standards to maintaining scientific code"... or something like that!

This line might benefit from an extra clause - "Our manuscript is the most accessible and practical guide to using GitHub in E&E research to date. We build upon some existing resources, which are mainly targeted at other disciplines, and our own experiences (from across all the subdisciplines of E&E) in writing and sharing code openly, to highlight steps E&E researchers can take to improve their research practices in a reproducible, collaborative and transparent way." This line would need to be deleted.

I noticed this in the main MS too: we have used two acronyms, EEB and E&E, for the same thing (I assume it's the same!). For example:

"Our manuscript is important for the *Nature Ecology & Evolution* readership because EEB researchers will continue to migrate their workflows to GitHub and need resources to guide their journey toward more open and collaborative EEB science."

Should we state here that we are going to preprint this manuscript (if we are, that is)? NEE are supportive of preprints.

Matt

Updating table order

Hi @Aariq, hoping you might be able to help with re-ordering tables.

Essentially, we need the manuscript updated so that:

Table 1 is the comparison table
Table 2 is the Roles table

In PR #251 I managed to make sure the actual text in the manuscript is corrected so that Table 1 is the comparison table and Table 2 is the roles table. I think their links will work as well.

However, even once that PR is merged, I think the roles table will appear first when it should now appear 2nd AND the table captions might still be mis-labeled with their numbers reversed.

Hopefully once the PR is merged this will be a quick fix. Think you will be able to make adjustments? Thanks!

Clarifying definitions in Box 1

From @kaitlyngaynor and @fhillemann

for Branch
I think we need to zoom out even further. What is a branch, in simplest terms? Define this clearly before getting into development branches.

I agree; maybe just add one first sentence, saying something along those lines: Git workflow timelines or repositories are analogous to trees, with a main working project and diverging branches that are pointers to changes during the development process.

for repository

I think it would be helpful here to explain the difference between "local" and "remote" repositories, too.

Archive code with DOI

After revisions, remember to set up Zenodo integration and make a release before submission. Include Zenodo DOI in data availability statement.

R1.1: Request to shorten and reduce the repetition and complexity of certain parts

Reviewer 1's comment:

R1.1. This paper clearly communicates the enthusiasm of the authors for integrating GitHub in the EEB research process. Advantages in areas of collaboration, transparent and reproducible science are clear and nicely discussed. This is a well written paper. I would, however, recommend some restructuring and shortening. As it stands, this paper is too long and some of the use cases are too nerdy to spark enthusiasm in somebody who is not already using GitHub. Breaking it into 13 use cases leads to some repetition and those use cases may be combined more effectively for convincing a new user.

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).

GitLab

Just realized we have not mentioned GitLab, but I think it might be worth a phrase or a sentence in the intro. The major difference (I think) is that GitLab can be self-hosted (https://about.gitlab.com/install/). That means you could run it on a University or lab server, for example. I get the sense (from asking the US-RSE Slack) that some academics/govt labs prefer it because they have more control.

Mention Quarto in section on manuscript writing

In general I don't feel we should get too much into the weeds mentioning specific tools outside of GitHub, but I do think Quarto may be worth mentioning as a promising new technology. Quarto is the even more language agnostic successor to R Markdown. It moves the rendering of documents (via pandoc) to a command line tool so users of R, Python, Julia, etc. can all edit and render the format. And because it's still based on markdown, the files could be edited directly in the GitHub interface or just a text editor for collaborators that don't want to touch RStudio or VSCode or Jupyter.

It also consolidates a lot of features that were previously in different *down packages (bookdown, blogdown, etc.) into native features. This includes things like in-text citations, cross-references to figures and tables, footnotes, and author and title blocks which are really important for scientific authoring.

Has anyone else played around with Quarto, and do you think it's worth mentioning?

Formatting when converting to word

In this issue, I'm compiling a list of changes I needed to make to word document output from manubot before submitting to Nature E&E. These could potentially be incorporated into the word doc template that manubot uses.

  1. Modifying author list and affiliation so that authors appear together and affiliations below.
  2. Superscripts added to all authors and affiliations
  3. Asterisk in author list to indicate that multiple authors contributed equally
  4. Format all text in manuscript so that it has 1.5 line spacing
  5. Add line numbers
  6. Format all text in manuscript so that it is size 12 font, times new roman.

Update needed to figure 1

The thin dotted lines going from panels back to original are too small. I can barely see where they are going and I've got good eyes. I would either make them more noticeable, or remove completely.

Should the letter "B)" be located down at the bottom left corner of the plot? I am not understanding what it is supposed to be referencing. I would think it would be near the panel at the top that lists "issues, pull requests, discussions," etc.??

I like this figure! I might recommend re-ordering the contributors in panel D so they are in the same order as the files in the left column. Then the arrows won't cross and it will be a lot easier to follow them.

Where the caption says "CONTRIBUTING.md, LICENCE.md, & README.md files", all of these things are actually listed by "B)" in the figure, not "D)".

Minor comments from R1 and R2

Reviewer 1 comments:

  • R1.5. Line 346 delete second ‘can’: who can also can change through time
  • R1.6. Line 389 it should be ‘each other’s work’: contribute to each other work without necessarily
  • R1.7. Line 457 missing ‘collaborator’? especially when many may be
  • R1.8. Line 477: not sure what this sentence is saying: ‘requiring the complementation of other tools to fully integration project files and GitHub repositories’

Reviewer 2 comments

  • R2.6. The sentence from Line 66-67 is missing a close parenthesis.
  • R2.7. There appears to be a “fourth” barrier at Line 478. I suggest splitting the paragraph from Line 472 to 483 into two paragraphs and expanding on both “reluctance to share data” and “language-specific resources” as barriers to adoption.
  • R2.8. Line 570 “could also be e” should be “could also be a”.

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using #) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).

Change mention of "software" to "code"

@Aariq mentioned in Hypothesis that we might include a statement in the intro of our paper like: "software packages or data analysis code (hereafter, code)". This way we only need to refer to 'code' throughout the rest of the paper.

I agree with this change, and have created this issue since we'll need a PR to make this change throughout the manuscript

R.2.2: Clarify how technical difficulty in Figure 2 was assessed

Reviewer 2 comments

R2.2. Figure 2 was useful, especially as it summarized across multiple areas that a reader make in-roads into using Github. Please clarify how the “Technical Difficulty” was assessed. Was this based on the impression of the working group? Was it quantified using different types of required knowledge (e.g., programming, software design, working in the Terminal, etc.)?

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using # followed by the number of this issue) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).

Updating figure 1 legend

Looks like the figure 1 legend needs to be updated to account for the changes to the figure. E.g., there is no section D now.

I'm happy to try and work on this or if you have a chance, go for it @colebrookson

Concern: Missing application of automated workflows in GitHub for research in ecology and evolution

Hi, again!

In this issue, I would like to comment on the absence of the application of automated workflows in the use cases of our manuscript.

Concern

Although we had a section dedicated to this in the hackathon and although we use GitHub Actions to automate the production of our manuscript, we have not discussed automated workflows as a use case.

The usage of automated workflows within GitHub (including GitHub Actions) in ecological data‐model integration has been recommended by Fer et al. (2020, Global Change Biology; https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7756391/).

GitHub-integrated automated workflows to decrease the time and effort required by researchers have been implemented in a series of projects.

One example is the Portal Project, a long-term study of a Chihuahuan desert ecosystem, which has been described in Yenni et al. (2019, PLOS Biology; https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000125).

Another example is CAN-SAR, a database of Canadian species at risk information. Their automation workflow using GitHub Actions is described in Naujokaitis-Lewis et al. (2022, Scientific Data; https://www.nature.com/articles/s41597-022-01381-8).

I feel that automatic testing of data and code and continuous integration and deployment (CI and CD) workflows are becoming a very strong part of data synthesis in ecology and evolution and that the manuscript and the readership would benefit much from a dedicated section for this.

Proposed solution

The proposition that I have is to add a new use case related to automated workflows, where a short description and applications of automation using GitHub Actions or GitHub-integrated CI and CD are provided.

I can work on creating the first draft of the section and adding it to the text. However, if this section is considered a new use case, it would also require that our figure is adjusted to include this part.

Please let me know your comments about this concern and the proposed solution!

R1.3: Request to define the attributes of GitHub in terms of investment return on time and effort

Reviewer 1's comment:

R1.3: Although the abstract states ‘We outline features ranging from low to high technical difficulty’ the paper reads a bit like a laundry list of what GitHub can do (in fact, the word ‘can’ is used about 140 times, which makes for tedious read). Figure 2 helps sort through this laundry list and defines the technical difficulty. It might be better to clearly lay out where anybody can start using GitHub effectively in the text. And the emphasis is on ‘effectively’. Most people are not likely to learn a new piece of software if it does not promise to reduce effort and time. So, defining tasks where GitHub shines in terms of return on investment maybe the better approach to convincing new users and then only mention the advance use cases with some pointers to further reading, but not going into too much detail.

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using #) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).

Author contributions statement using CRediT

The latest email about author order reminded me that it might be a good idea to include an author contributions statement in the manuscript. Making this statement could also help us feel more confident about author order.

The Contributor Roles Taxonomy is one possible guide for structuring such a statement:
https://credit.niso.org/

Box 1 - order

@kaitlyngaynor makes a great comment about the order of terms in Box 1: "...it may be easier to read if it's organized conceptually. For example, start with repository, then commit, then push/pull, then pull request, then merge, etc."

As the terms refer to each other I struggled to find a sensible order (but perhaps I am over-thinking it). What does everyone think about re-ordering the Box - and any suggestions for a sensible order?

Formatting tables

I am wondering how we want to make tables for this manuscript written in GitHub. There are probably many options. I typically make tables in Word or with R markdown (e.g. kable, etc.). The HackMD table looks nice, but will that work the same way in our document? Do we want to standardize how we do this?

Thoughts?

Template for relevant papers?

Is it possible for us to set up a template so that every time we create a new discussion thread for relevant papers, the boilerplate language for describing the paper (pasted below) is automatically copied over?


Title: Include the manuscript's title

Study Link: Include the https:// link that brings us to the page of the manuscript
Citation: @doi:replace

Citation must follow the Manubot-style citation. Leave it blank if unsure.

Suggested keywords that help identify the relevance of this paper to Ecology and Evolution:

  • Keyword 1 (replace me, copy and paste more than three if needed)
  • Keyword 2 (replace me, copy and paste more than three if needed)
  • Keyword 3 (replace me, copy and paste more than three if needed)

Which areas of expertise are particularly relevant to the paper?

  • ecology and/or evolution;
  • biostatistics;
  • informatics and computational research;
  • open science and reproducibility;
  • other: replace_with_your_suggestion.

Summary

Suggested questions to answer about each paper:

  • What is the general main finding or takeaway of the paper?
  • What did they analyse and how did they do it?
  • What does this paper suggest to improve the issue with reproducibility in science?
  • Do you have any concerns about methodology or the interpretation of these results beyond this analysis?

Any comments or notes?

Concern: Perception of insufficient contextualized usage of GitHub specific to researchers in ecology and evolution

Hi, everyone

I hope you are all well!

I have a few concerns and comments related to our manuscript that I would like to raise as issues, so that we can discuss them and, if necessary, address them prior to the submission of our manuscript for consideration for publication in a peer-reviewed journal.

Concern

In this issue, I would like to comment on the apparent lack of domain-specific contextualized usage of GitHub in the manuscript.

While we frequently use the term “ecology and evolutionary biology” (EEB) to help cater our manuscript to our choice of public, most of the applications that we explicitly provide are non-EEB specific.

There are a few examples in the use cases [e.g., the R packages in the “Peer review” section, and some places where we discuss laboratories (but note that there are laboratories in non-EEB fields)].

It is great (and natural) that we have content that can be used by researchers in other domains. However, I feel that more domain-specific applications of the usage of GitHub can help our readers and researchers in ecology and evolution reflect better on how they can use GitHub in their hands-on research projects. Ultimately, the journals we are aiming at are specific to ecology and evolution, and I fear that reviewers and editors might express concerns about this in the initial parts of the review process.

I acknowledge our challenges in trying to contextualize this super broad tool to EEB researchers, as well as our limitations with the text length. Some of the applications we discuss would not require this type of specificity (e.g., manuscript writing).

Proposed solution

A potential solution that I see to adjust for this perception is to include more field-specific applications for GitHub across the text, such as examples of usage with long-term ecological studies, sequencing and assembly of genetic databases, simulations of communities or metacommunities and their applications in theoretical and empirical studies.

One example is the automated near-term iterative forecasting system that has been developed for the long-term ecological study in Portal (Arizona, US; https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.13104). Another example could be the reproducible pipeline for the study by Cazelles et al. (2019, Global Change Biology; https://onlinelibrary.wiley.com/doi/abs/10.1111/gcb.14829), whose GitHub repository (https://github.com/McCannLab/HomogenFishOntario) has an associated Binder and an automated pipeline that runs and reproduces the analyses and figures published in their study.

Please let me know what you think about this concern, if you have a proposition, and (if needed) if you could help with some of the additions to the manuscript.

Write access needed to merge PRs?

It looks like maybe I need write access to merge a PR. I got the impression from our last meeting that I should be able to merge reviewed PRs.

[Screenshot: Screen Shot 2022-01-27 at 11 28 13 AM]

Tidy up scatterblob.Rmd for supplemental material

The methods that went into making the scatterblob plot are really interesting and rather than trying to squeeze it all into a figure caption, it might be better to just edit what's already in the .Rmd file to include the notes from the Google spreadsheet and knit it to Markdown or Word to be included as a supplement to the manuscript.

Style for referencing terms in box 1 within text

How do we want to consistently format the key terms we define in box 1 when used throughout the manuscript (if at all?) Do we want to bold, use quotes, italics? Just might be helpful to highlight when these terms are used.

small changes to abstract

I suggest the following small changes to the first couple of sentences of the abstract, to highlight that we’re not only talking about code in the MS. It becomes more obvious later in both the abstract and the paper anyway, but I felt the first two sentences were too focused and code only.

revised:
Researchers in ecology and evolutionary biology are increasingly dependent on computational code to conduct research, and the use of efficient methods to share, reproduce, and collaborate on code as well as any research-related documentation has become fundamental. GitHub is an online, cloud-based service that can help researchers track, organize, discuss, share, and collaborate on software and other materials related to research production, including data, code for analyses, and protocols.

original:
Researchers in ecology and evolutionary biology are increasingly dependent on computational code to conduct research. With the growing role of data science in research, the use of efficient methods to share, reproduce, and collaborate on code has become fundamental. GitHub is an online, cloud-based service that can help researchers track, organize, discuss, share, and collaborate on software and code.

Figure 2 has a few artifacts and could be improved for resolution and colour contrast with the background

Hi,

Figure 2 looks very awesome (!!!). However, I have found a few issues with it, which I list below:

  1. When contrasted with darker colours, it shows x-axis and y-axis tick numbers (ranging from 0 to 5 and 0 to 15, respectively) in white colour (see below);
  2. When contrasted with darker colours, the extreme categories or labels (the muted red at the top and the purple and blue at the bottom) are difficult to read. When contrasted with bright colours, the categories in the middle become difficult to read;
  3. Image resolution is currently set to 96 dpi (however, I am not certain whether it is Manubot that is converting it to a lower resolution; I will confirm);
  4. A few minor typos (for the sake of consistency) in the x-axis tick labels and categories, so they match the rest of the figure: "Working diaries/blogs", "Upper-Intermediate", "Pre-Intermediate", "Low-Intermediate", and "Hosting and deploying academic websites";
  5. Finally, I think that we perhaps want to be more specific with respect to "the community" and "the public". We use both "user-community" and "EEB community" in the text, and it is not clear which one we are referring to. It also would be good to do the same for "the public", as it is not clear to the reader whether we are speaking of "non-academic GitHub users" or the "general public".

I think that a quick fix for issue (1) and the first half of issue (2) could be to force a white background on the figure, so that readers always see it against our desired background colour.

Nevertheless, the issue with the colours in the middle might persist after this adjustment. These colours could be muted to improve their contrast or, alternatively, we could use another colour gradient (e.g. Zissou1 from the wesanderson R package, or a sequential palette from the RColorBrewer R package) to describe the technical difficulty of the tasks using GitHub.
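As a rough way to check the contrast concerns above before settling on a palette, one could compute the WCAG 2.x contrast ratio between each label colour and the background it will be viewed against. The sketch below is illustrative only: the hex colours are hypothetical stand-ins (not the actual Figure 2 palette), and the 4.5:1 threshold is the WCAG AA guideline for normal-size text, not a journal requirement.

```python
def relative_luminance(hex_colour):
    """WCAG 2.x relative luminance of an sRGB colour given as '#RRGGBB'."""
    hex_colour = hex_colour.lstrip("#")
    channels = [int(hex_colour[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each gamma-encoded sRGB channel to linear light
    r, g, b = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(colour_a, colour_b):
    """Contrast ratio between two colours: 1 (identical) up to 21 (black on white)."""
    lum_a, lum_b = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical label colours, NOT the real Figure 2 palette
LABEL_COLOURS = {"muted red": "#B5534C", "mid yellow": "#E8D26A", "dark blue": "#2B3A8C"}
# Hypothetical backgrounds: a white page and a dark-mode page
BACKGROUNDS = {"white": "#FFFFFF", "dark": "#0D1117"}
AA_THRESHOLD = 4.5  # WCAG AA guideline for normal-size text

for label, colour in LABEL_COLOURS.items():
    for bg_name, bg in BACKGROUNDS.items():
        ratio = contrast_ratio(colour, bg)
        verdict = "ok" if ratio >= AA_THRESHOLD else "hard to read"
        print(f"{label} on {bg_name}: {ratio:.2f} ({verdict})")
```

Running the same check over a candidate replacement palette would show whether the middle categories remain legible once the background is forced to white.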

We may also be required to increase figure resolution to match journal requirements (>= 300 dpi).

Let me know if you would like to discuss any of these suggestions further!

Thank you again for having worked to create this figure! I find it came out great!

Co-author approval before submitting manuscript

Please add a check mark next to your name if you approve of us submitting our manuscript as a preprint to bioRxiv and to Nature Ecology and Evolution.

  • Saeed
  • Dylan
  • Luna
  • Rob
  • Emma
  • Katherine
  • Pedro
  • Kaitlyn
  • Vivienne
  • Cole
  • Ali
  • Freddy
  • Matt
  • Allison
  • Brandon
  • Eric
  • Helen

R2.1: Improvements to legibility and abstraction on Figure 1

Reviewer 2 comments:

The manuscript presents how ecologists and evolutionary biologists (EEB) can use collaborative software development project management tools. The authors present a high-level view of GitHub and the Git version control system that is at its core. As the authors detail, the tools that GitHub provides have broad potential to be leveraged within the EEB community. The paper is well written and helps to make GitHub tools and related concepts legible to EEB audiences. On the whole, I see this as potentially being an impactful paper for improving EEB research. I have the following Major and Minor comments for the authors:

Major Comments

R2.1. Figure 1 was more confusing to me than helpful. It presents a view of a GitHub interface in order to detail generalized features; however, it is edited/abstracted so much that it doesn’t map easily to the interface as it would be viewed by a reader of the paper. To improve this I would reduce the level of abstraction of the web interface.

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using # followed by the number of this issue) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).

R1.2: Request to focus on tailoring the language and use cases to entry-level researchers

Reviewer 1's comment:

R1.2. The title and abstract gave me the impression that the goal is to convince EEB researchers to start using GitHub. If that’s the case, it might be better to tailor the use cases to that entry level and use less GitHub-specific lingo. Advanced usage may be mentioned but not detailed as much as it is currently done. E.g. collaboratively writing a paper in GitHub is probably out of the question for most. Most GitHub options for communication, discussion, issue tracking are still somewhat esoteric for most non-programmers.

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using # followed by the number of this issue) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).

Figure out how to cross-reference arbitrary sections

In order to reference Box 1 (definitions of GitHub terms) and Box 2 (tips on getting started), we could try to figure out how to do that with cross-references (@box:definitions doesn't seem to point to an H3 with {#box:definitions} like I thought it might). Possibly not worth it, as there are only 2 boxes and their order is unlikely to change; manual references will work fine.

Minor Concerns: Citations needed across the Discussion and Text length must be shortened

Hi. Below, I list a few minor and specific concerns in relation to the manuscript:

  1. Certain sentences in the Discussion seem to need citations to support them. Examples are "First, there may be hesitation to independently adopting and learning a new tool. Institutional encouragement and instructional resources focused on researchers in ecology and evolution may be limited." and "We suspect a major additional barrier to EEB researchers is a distinct lack of GitHub help documents for non-English researchers in ecology and evolution, meaning that EEB researchers potentially miss the opportunity to fully understand the importance of version control, reproducibility, and other benefits of GitHub."

  2. Text length needs to be reduced. Journal guidelines establish that the manuscript must not exceed 4000 words. Right now, we are a little above 4600 words. A few examples and sections can be shortened to bring us closer to the journal's word limit.

  3. The use case "GitHub Organizations" appears disconnected from the other use-case titles. Would it be appropriate to use "Institutional organization" instead?
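To see where the roughly 600 surplus words mentioned in point 2 might be trimmed, a per-section word tally could help. This is a minimal sketch, assuming the manuscript text is Markdown with `## ` section headings; the function name, the sample text, and the crude whitespace-based counting are illustrative assumptions, and the journal's own counting rules may differ.

```python
import re

WORD_LIMIT = 4000  # limit quoted in the issue above


def words_per_section(markdown_text):
    """Count whitespace-separated words under each '## ' heading."""
    counts = {}
    current = "(preamble)"  # text that appears before the first heading
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            counts.setdefault(current, 0)
            continue
        # Drop single-line HTML comments so they do not inflate the tally
        line = re.sub(r"<!--.*?-->", "", line)
        counts[current] = counts.get(current, 0) + len(line.split())
    return counts


# Hypothetical two-section sample standing in for the real manuscript files
SAMPLE = (
    "## Abstract\n"
    "GitHub helps researchers collaborate.\n"
    "\n"
    "## Discussion\n"
    "There may be hesitation to adopt a new tool.\n"
)

counts = words_per_section(SAMPLE)
total = sum(counts.values())
for section, n_words in counts.items():
    print(f"{section}: {n_words} words")
print(f"Total: {total} words, {max(0, total - WORD_LIMIT)} over the limit")
```

Sorting the tally from longest to shortest section would give a quick list of candidates for shortening.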

R2.4: Add dependency testing to the automation use case

Reviewer 2 comment:

R2.4. I suggest adding a discussion on dependency testing within or following the paragraph on Automation (Lines 373-382). This is a project “ecosystem” phenomenon that comes from collaboration, where you build your project on the work of someone else. As projects change over time, they can alter other projects. Software engineers have been working on this challenge for a long time in distributed teams where different parts of software are being built by different programmers. Checks can be done automatically within the software engineering framework (see Pasquier et al. 2017 https://www.nature.com/articles/sdata2017114). Beyond detection, any major changes can be detected and presented without additional work from the user via features like badges (https://shields.io/) within the project page (e.g., README).

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using # followed by the number of this issue) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).
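One lightweight form of the dependency testing that Reviewer 2 describes is a check, run automatically on every change, that the packages a project was developed against are still the ones installed. The sketch below is only an illustration in Python, using the standard library's `importlib.metadata`; the `check_pins` helper and the pinned package names and versions are hypothetical, and an R-based project would need an equivalent written around its own tooling.

```python
from importlib import metadata


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that does not match.

    `found` is None when the package is not installed at all.
    `get_version` can be swapped out, e.g. to test without installing anything.
    """
    mismatches = {}
    for package, expected in pins.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


# Hypothetical pins, for illustration only
PINS = {"numpy": "1.26.4", "pandas": "2.2.2"}

problems = check_pins(PINS)
for package, (expected, found) in problems.items():
    print(f"{package}: expected {expected}, found {found or 'not installed'}")
# In an automated workflow, exiting with a non-zero status when `problems`
# is non-empty would mark the run as failed and surface the change.
```

The pass or fail outcome of such a check is the kind of status that a badge in the README could then display, as the reviewer suggests.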

R1.4: Suggestions on the ordering of importance of use cases

Reviewer 1's comments:

R1.4. In my experience, the project continuity is actually very high on the importance list for researchers, i.e., knowing that the code and data will be findable by the next student. This includes the discussion of organizing and managing teams, keeping lab information in one place etc. Followed by code versioning and the ability to go back to older versions. Interest in website development is picking up because it really is simple to do in GitHub, and the information can be maintained by several people (i.e., a lab group).

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using # followed by the number of this issue) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).

R2.3: Provide a few sentences on the history of Git

Reviewer 2 comment:

R2.3. I would find it helpful to give a brief (1-2 sentence) history of Git (L85-89). Namely, that it was developed as an aid for software development within distributed groups of software engineers and was developed within an open-source framework so that it could be improved by the community. This provides more context as to why GitHub (as it extends Git) is a useful tool for collaboration “by design”.

Recommendations during the revision:

  1. When performing changes addressing this comment, please recall this issue (using # followed by the number of this issue) in the pull request; and,
  2. Clearly justify changes in the final comment of the pull request (to allow us to revise the manuscript in time).
