BEpipeR: a user-friendly, flexible, scalable, and easily expanded pipeline for a streamlined processing of biotic and abiotic Biodiversity Exploratories data in R

Marcel Glück | Oliver Bossdorf | Henri A. Thomassen

Quick-start

Motivation

The wealth of biotic and abiotic environmental data generated in the Biodiversity Exploratories (BE) continues to grow steadily, and so does the effort of incorporating the newest data into our statistical frameworks. Unsurprisingly, many BE projects restrict their analyses to a handful of frequently used data sets, neglecting the wealth of information at their fingertips. This is often due to the stringent quality control and (pre-)processing that many environmental data sets still require. However, this restraint may prevent us from obtaining a more complete understanding of our complex study systems. To remedy this issue, this project provides a comprehensive, user-friendly, flexible, scalable, reproducible, and easy-to-expand R pipeline for the streamlined processing of (a)biotic EP-level data generated by the Exploratories. We are convinced that such a framework will benefit many scientists in the Exploratories, as the generated data can serve as input to many types of environmental association studies. Additionally, with modifications, this pipeline could be readily adapted to process other types of plot-based data.

This project is a registered Biodiversity Exploratories synthesis project.

Features and functionalities

✔️ Flexibility: One pipeline, three modes. Switch between forest, grassland, and combined (forest & grassland) mode effortlessly.

✔️ Ease of use: Simply pass aggregation information through CSV parameter files. No coding required.

✔️ Customizability: Easily adapt the pipeline to your own needs, for example by subsetting the template to your plots of interest.

✔️ Deployability: Effortlessly run this pipeline on both Linux and Windows operating systems thanks to reproducible environments.

✔️ Under active development: Shape the future of this project by suggesting features or by contributing code.

Processing performed

  1. Data preparation and wrangling: Template creation, harmonization of plot locations, correction of suspicious (NA) values, subsetting, fallbacks to more basal (taxonomic) levels, data reshaping, normalization by a variable (e.g. sampling effort)
  2. Quality control: Multi-mode outlier detection
  3. Data aggregation: Both within and across data sets (mean, median, SD, MAD); processing of yearly climate aggregates (incl. the removal of poorly supported data points)
  4. Diversity indices: Normalization by repeated rarefaction; calculation of species richness and the Simpson, Shannon-Wiener, Margalef, and Menhinick indices, among others
  5. Post-processing: Data joining, quality control, variable selection by variance inflation factor (VIF) analyses
  6. Data export and metadata compilation: Export of composite data sets and VIF-produced subsets; fetching metadata for the variables produced to assist in preparing the data for publication, submission to BExIS, etc.
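
To make step 4 more concrete, the following is a minimal base-R sketch of the named diversity indices and of repeated rarefaction. It is an illustration only, not BEpipeR's actual implementation: the function names `diversity_indices` and `rarefied_richness`, the example abundance vector, and the choice of the Gini-Simpson form (1 - sum of squared proportions) are assumptions made here, and the pipeline may rely on different packages or index variants.

```r
# Hypothetical species counts for a single plot (not real BE data).
abund <- c(12, 7, 3, 1, 1)

# Common textbook definitions of the indices named in step 4.
diversity_indices <- function(x) {
  x <- x[x > 0]          # ignore absent species
  n <- sum(x)            # total number of individuals
  s <- length(x)         # species richness
  p <- x / n             # relative abundances
  c(
    richness  = s,
    shannon   = -sum(p * log(p)),    # Shannon-Wiener, natural log
    simpson   = 1 - sum(p^2),        # Gini-Simpson form (assumed variant)
    margalef  = (s - 1) / log(n),
    menhinick = s / sqrt(n)
  )
}

# Repeated rarefaction: draw `depth` individuals without replacement
# `n_iter` times and average the richness observed in each draw.
rarefied_richness <- function(x, depth, n_iter = 100) {
  individuals <- rep(seq_along(x), times = x)  # one entry per individual
  stopifnot(depth <= length(individuals))
  mean(replicate(n_iter, length(unique(sample(individuals, depth)))))
}

idx <- diversity_indices(abund)
set.seed(1)  # make the random draws reproducible
rar <- rarefied_richness(abund, depth = 10)
```

Rarefying to a common depth before computing richness makes plots with different sampling effort comparable; averaging over many draws reduces the noise of any single subsample.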

FAQ

How do I attribute this pipeline?

Please cite this pipeline as (replace the X.X.X with the actual pipeline version used):

Glück M., O. Bossdorf and H. A. Thomassen (2024). BEpipeR: a user-friendly, flexible, scalable, and easily expanded pipeline for a streamlined processing of biotic and abiotic data in R. vX.X.X. Zenodo. https://zenodo.org/doi/10.5281/zenodo.10683384

Please do so if you use the pipeline or parts of it in your own work. If you use data produced through this pipeline, please cite both the data set and this pipeline.

Acknowledgements

People and/or institutions we are indebted to:
