Ecological forecasting using Dynamic Generalized Additive Models with the R packages {mvgam} and {brms}

Home Page: https://nicholasjclark.github.io/physalia-forecasting-course/

Ecological forecasting with mvgam and brms

Physalia-Courses

https://www.physalia-courses.org/

Nicholas J Clark

27–31 May, 2024

COURSE OVERVIEW

Time series analysis and forecasting are standard goals in applied ecology. But most time series courses focus only on traditional forecasting models such as ARIMA or Exponential Smoothing. These models cannot handle features that dominate ecological data, including overdispersion, clustering, missingness, discreteness and nonlinear effects. Using the flexible and powerful Bayesian modelling software Stan, we can now meet this complexity head on. Packages such as mvgam and brms can build Stan code to specify ecologically appropriate models that include nonlinear effects, random effects and dynamic processes, all with simple interfaces that are familiar to most R users. In this course you will learn how to wrangle, visualize and explore ecological time series. You will also learn to use the mvgam and brms packages to analyse a diversity of ecological time series to gain useful insights and produce accurate forecasts. All course materials (presentations, practical exercises, data files, and commented R scripts) will be provided electronically to participants.

TARGET AUDIENCE AND ASSUMED BACKGROUND

This course is aimed at higher degree research students and early career researchers working with time series data in the natural sciences (with particular emphasis on ecology) who want to extend their knowledge by learning how to add dynamic processes to model temporal autocorrelation. Participants should ideally have some knowledge of regression including linear models, generalized linear models and hierarchical (random) effects. But we'll briefly recap these as we connect them to time series modelling.

Participants should be familiar with RStudio and have some fluency in programming R code. This includes an ability to import, manipulate (e.g. modify variables) and visualise data. There will be a mix of lectures and hands-on practical exercises throughout the course.

LEARNING OUTCOMES

  1. Understand how dynamic GLMs and GAMs work to capture both nonlinear covariate effects and temporal dependence
  2. Be able to fit dynamic GLMs and GAMs in R using the {mvgam} and {brms} packages
  3. Understand how to critique, visualize and compare fitted dynamic models
  4. Know how to produce forecasts from dynamic models and evaluate their accuracies using probabilistic scoring rules

COURSE PREPARATION

Please be sure to have at least version 4.2 (and preferably version 4.3 or above) of R installed. Note that R and RStudio are two different things: it is not sufficient to just update RStudio, you also need to update R by installing new versions as they are released.

To download R, go to the CRAN Download page and follow the links to download R for your operating system.

To check what version of R you have installed, you can run

version

in R and look at the version.string entry (or the major and minor entries).
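
The same check can be scripted. The sketch below is only an illustration: it compares the running R version with the 4.2 minimum stated above, and the helper name `r_version_ok()` is hypothetical rather than part of the course materials.

```r
# Hypothetical helper: TRUE if `current` meets the course minimum.
r_version_ok <- function(minimum = "4.2.0", current = getRversion()) {
  # Version objects compare component by component, so 4.10 counts as newer than 4.2
  package_version(as.character(current)) >= package_version(minimum)
}

if (r_version_ok()) {
  message("R ", as.character(getRversion()), " meets the course minimum.")
} else {
  message("R ", as.character(getRversion()), " is older than 4.2; please update R.")
}
```

Comparing version objects rather than raw strings avoids mistakes when a minor version reaches two digits.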

We will make use of several R packages that you'll need to have installed. Prior to the start of the course, please run the following code to update your installed packages and then install the required packages:

# update any installed R packages
update.packages(ask = FALSE, checkBuilt = TRUE)

# packages to install for the course
pkgs <- c("brms", "dplyr", "gratia", "ggplot2", "marginaleffects",
          "tidybayes", "zoo", "viridis", "mvgam")

# install packages
install.packages(pkgs)
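
After installation it can be useful to confirm that every package actually loads. The following is a minimal sketch using base R only; the helper name `missing_packages()` is hypothetical, and the package vector simply repeats the list above.

```r
# Packages required for the course (same list as in the install step)
pkgs <- c("brms", "dplyr", "gratia", "ggplot2", "marginaleffects",
          "tidybayes", "zoo", "viridis", "mvgam")

# Hypothetical helper: return the packages whose namespace cannot be loaded.
# requireNamespace() loads each namespace without attaching it to the search path.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

still_missing <- missing_packages(pkgs)
if (length(still_missing) == 0) {
  message("All course packages are available.")
} else {
  message("Still missing: ", paste(still_missing, collapse = ", "))
}
```

Any names reported as missing can be passed back to install.packages().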

INSTALLING AND CHECKING STAN

When working in R, there are two primary interfaces we can use to fit models with Stan (rstan and CmdStan). Either interface will work; however, it is highly recommended that you use the CmdStan backend, with the cmdstanr interface, rather than rstan. More care, however, needs to be taken to ensure you have an up-to-date version of Stan. For all mvgam and brms functionalities to work properly, please ensure you have at least version 2.29 of Stan installed. The GitHub development versions of rstan and CmdStan are currently several versions ahead of this, and both of these development versions are stable. The exact version you have installed can be checked using either rstan::stan_version() or cmdstanr::cmdstan_version().
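
Once one of the interfaces is installed, the reported version can be compared against the 2.29 minimum. This is a sketch under assumptions: the helper name `stan_version_ok()` is hypothetical, and it assumes the version string is purely numeric (for example "2.33.1"); a release-candidate suffix would need to be stripped first.

```r
# Hypothetical helper: TRUE if the installed Stan version meets the minimum.
stan_version_ok <- function(installed, minimum = "2.29.0") {
  # Compare as version objects so that 2.9 is treated as older than 2.29
  package_version(installed) >= package_version(minimum)
}

stan_version_ok("2.33.1")  # TRUE
stan_version_ok("2.9.0")   # FALSE

# With an interface installed, pass in its reported version, e.g.:
# stan_version_ok(cmdstanr::cmdstan_version())
# stan_version_ok(rstan::stan_version())
```

Comparing version objects matters here because a plain string comparison could rank "2.9" above "2.29".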

Compiling a Stan program requires a modern C++ compiler and the GNU Make build utility (a.k.a. "gmake"). The correct versions of these tools to use will vary by operating system, but unfortunately most standard Windows and macOS machines do not come with them installed by default. The first step to installing Stan is to update your C++ toolchain so that you can compile models correctly. There are detailed instructions by the Stan team on how to ensure you have the correct C++ toolchain to compile models, so please refer to those and follow the steps that are relevant to your own machine. Once you have the correct C++ toolchain, you'll need to install CmdStan and the relevant R package interface. First install the R package by running the following command in a fresh R environment:

install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

cmdstanr requires a working installation of CmdStan, the shell interface to Stan. If you don't have CmdStan installed then cmdstanr can install it for you, assuming you have a suitable C++ toolchain. To double check that your toolchain is set up properly you can call the check_cmdstan_toolchain() function:

cmdstanr::check_cmdstan_toolchain()

If your toolchain is configured correctly then CmdStan can be installed by calling the install_cmdstan() function:

cmdstanr::install_cmdstan(cores = 2)

You should now be able to follow the remaining instructions on the Getting Started with CmdStanR page to ensure that Stan models can successfully compile on your machine. A quick way to check this would be to run this script:

library(mvgam)
simdat <- sim_mvgam()
mod <- mvgam(y ~ s(season, bs = 'cc', k = 5) +
               s(time, series, bs = 'fs', k = 8),
             data = simdat$data_train)

But issues can sometimes occur when:

  1. you don't have write access to the folders that CmdStan uses to create model executables
  2. you are using a university- or company-imposed syncing system such as OneDrive, leading to confusion about where your make file and compilers are located
  3. you are using a university- or company-imposed firewall that is aggressively deleting the temporary executable files that CmdStan creates when compiling

If you run into any of these issues, it is best to consult with your IT department for support.
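
Before contacting IT, a couple of quick checks may help narrow down which of the issues above applies. The sketch below uses base R only; both helper names are hypothetical, the "OneDrive" path match is a rough heuristic rather than a reliable test, and file.access() is documented as potentially unreliable on some Windows file systems.

```r
# Hypothetical helper: can the current user write to `dir`?
dir_is_writable <- function(dir) {
  # file.access() returns 0 when the requested access (mode 2 = write) is allowed
  dir.exists(dir) && unname(file.access(dir, mode = 2)) == 0
}

# Hypothetical helper: does the path look like it sits inside a OneDrive folder?
looks_synced <- function(path) {
  grepl("onedrive", path, ignore.case = TRUE)
}

dir_is_writable(tempdir())   # temporary files are written here during an R session
looks_synced(getwd())        # is the working directory inside a synced folder?

# With cmdstanr installed, the CmdStan folder can be checked the same way:
# dir_is_writable(cmdstanr::cmdstan_path())
# looks_synced(cmdstanr::cmdstan_path())
```

If a folder is not writable or sits inside a synced location, reporting that path to your IT department gives them something concrete to work from.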

PROGRAM

09:00 - 12:00 (Berlin time): live lectures and introduction to / review of the practicals

3 additional hours: self-guided practicals using annotated R scripts

Monday (day 1)

Lecture 1 (html | pdf)
Lecture 2 (html | pdf)
Live code examples (Random effects)
Tutorial 1 (html)

  • Introduction to time series and time series visualization
  • Some traditional time series models and their assumptions
  • GLMs and GAMs for ecological modelling
  • Temporal random effects and temporal residual correlation structures

Tuesday (day 2)

Lecture 3 (html | pdf)
Live code examples (Interactions | Time-varying effects)
Tutorial 2 (html)

  • Dynamic GLMs and Dynamic GAMs
  • Autoregressive dynamic processes
  • Gaussian Processes
  • Dynamic coefficient models

Wednesday (day 3)

Lecture 4 (html | pdf)
Live code examples (Distributed lags | Distributed MAs)
Tutorial 3 (html)

  • Bayesian posterior predictive checks
  • Forecasting from dynamic models
  • Point-based forecast evaluation
  • Probabilistic forecast evaluation

Thursday (day 4)

Lecture 5 (html | pdf)
Live code examples (Functionals | Time-varying seasonality)
Tutorial 4 (html)

  • Multivariate ecological time series
  • Vector autoregressive processes
  • Dynamic factor models
  • Multivariate forecast evaluation

Friday (day 5)

  • Group-based practical examples / case studies
  • Review, feedback and open discussion

physalia-forecasting-course's Issues

Updates for 2024 version

In all tutorials and relevant lecture slides, provide links to mvgam vignettes

Day 1

  • Link to the ESA talk rather than the Stat Rethinking Categories and Curves
  • Skip over ETS and ARMA slides in lecture
  • Skip through the Beta example as well
  • Introduce the portal example when first talking about random effects; then pause, switch to live coding and show how to do this in mgcv (provide syntax to do it in brms, mvgam and glm as well); show how to use marginaleffects to make hindcast predictions; emphasize how to create smooth plots; emphasize how to take realisations of fitted functions so they can be used for more custom plots or analyses
  • Create an extra exercise script that shows how to handle nested and crossed random effects in mgcv: https://stats.stackexchange.com/questions/618715/building-the-right-gam-model-struggling-with-the-jump-from-lmer/618760#618760
  • Switch to live coding for the yearly smooth, again showing in mgcv; use gratia to show the basis; extract residuals and plot against another predictor to start building in the idea of a workflow
  • Link to Nishan's paper in the tutorial and include the cheatsheet pdf in the supplied files

Day 2

  • Link to the Oceania EcoForecasting seminar
  • Skip over the tscount AR example slides
  • For the dynamic Beta GAM, pause and switch to live coding; show how to do something similar in mgcv with the RW MRF basis; talk about tensor products and introduce the idea of ti() decompositions to help build strategies (link to some of Gavin's Stack Overflow descriptions and the gam.models help page as well)
  • Skip over the enforcing stationarity Student T example
  • In the dynamic coefficient example, switch to live coding; show how to do something similar in mgcv with the RW MRF basis; mention how one could use the same principles for spatially varying coefficients
  • Use Hilbert GP in dynamic coefficient mvgam example
  • In tutorial fix typo: "except the length scale was changed to 3" should be "except the length scale was changed to 4"
  • Subtitles should use "GAM" rather than "GLM"; i.e. "A standard Poisson GLM" should be "A standard Poisson GAM"
  • Link to MRFtools
  • Include the cheatsheet pdf in the supplied files

Day 3

  • Link to the forecasting vignette and to Juniper's paper on forecast evaluation
  • Skip over expectations entirely and reduce down the conditional predictions part
  • At the types of predictions, switch to live coding and show something cool / interesting (distributed lags in mgcv)
  • Same at probabilistic forecast evaluation; perhaps show a strategy to estimate a decaying effect of some treatment (time since treatment example, or time-varying dispersion with a distributional model)
  • Remove stochastic trend extrapolation completely, and talk about loo_compare instead; relate back to the cheatsheet
  • Include the cheatsheet pdf in the supplied files

Day 4

  • At hierarchical dist lags, switch to live coding and show something cool / interesting (phylogenetically informed intercepts and nonlinear functions in mgcv; illustrate prediction by excluding certain terms to show how the trend is built of additive functions)
  • At multivariate forecast evaluation, switch to live coding and show time-varying seasonality in mvgam

Day 5

  • Pick a simpler and more useful dataset for groups to analyse

Be sure to create PDFs of all lectures again, once finalised
