We are training hidden Markov models (HMMs) and random forests (RFs) to infer the breeding success of a threatened shorebird, Limosa limosa (hereafter, godwits), from geospatial data. Challenges with geospatial data include a variable error rate in predicted location, limited covariate data to "teach" a model, and other common difficulties of analyzing large data sets, such as missing and inconsistent data. In addition, little research has been conducted on automating the evaluation of precocial nesting behavior. Precocial means that hatchlings are almost fully independent once hatched: they forage for food themselves under the protection of their parents.
The goal of this repository is to build an automated framework and methodology for evaluating shorebird nesting behavior. Such a framework would make it possible to evaluate the survival probability of chicks and the fitness of individuals. We hope this study leads to more focused conservation management and extends the reach of future shorebird studies.
Behavior | No Observation | incubating | foraging | dead bird | chicktending | migrating |
---|---|---|---|---|---|---|
Placeholder | -1 | 1 | 2 | 3 | 4 | 5 |
1. Step length
2. Turning angle
3. Distance to water
4. Revisit
5. Residence time
1. Days since arrival
2. Days before departure
3. Sliding window of variance
4. Standard deviation (to capture autocorrelation)
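
As an illustration of the movement features listed above, here is a minimal Python sketch of step length, turning angle, and a sliding-window variance. It is not the repository's R code. It assumes positions are already projected to metres, and the function names `step_lengths_and_turning_angles` and `rolling_variance` and the example track are hypothetical.

```python
import math
import statistics


def step_lengths_and_turning_angles(track):
    """track: list of (x, y) positions in metres, ordered in time (assumed projected)."""
    steps, headings = [], []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        steps.append(math.hypot(dx, dy))      # straight-line distance between fixes
        headings.append(math.atan2(dy, dx))   # direction of travel in radians
    turns = []
    for h0, h1 in zip(headings, headings[1:]):
        d = h1 - h0
        # Wrap the change in heading into the interval (-pi, pi]
        turns.append(math.atan2(math.sin(d), math.cos(d)))
    return steps, turns


def rolling_variance(values, window):
    """Sample variance over each run of `window` consecutive values."""
    return [statistics.variance(values[i:i + window])
            for i in range(len(values) - window + 1)]


# Made-up track, for illustration only
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0), (-3.0, 10.0)]
steps, turns = step_lengths_and_turning_angles(track)
step_variance = rolling_variance(steps, window=2)
```

A track of n fixes yields n - 1 step lengths and n - 2 turning angles, so the first and last fixes of each track have no turning angle.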
- Make Random Forest
- Calculate Accuracy
- Calculate Out of Bag Error
- F1 score: 93.8%
- Precision: 94.2%
- Recall: 93.4%
Behavior | incubating | foraging | chicktending | migrating |
---|---|---|---|---|
incubating | 669 | 14 | 13 | 0 |
foraging | 63 | 1133 | 65 | 0 |
chicktending | 26 | 51 | 768 | 0 |
migrating | 0 | 0 | 0 | 8 |
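
The precision, recall, and F1 values reported above can be recomputed from this confusion matrix. The minimal Python sketch below (not the repository's R code) assumes macro-averaging over the four classes and reads rows as predicted behaviors and columns as observed behaviors. Under that reading it reproduces the reported 94.2% precision, 93.4% recall, and 93.8% F1. The name `macro_scores` is hypothetical.

```python
LABELS = ["incubating", "foraging", "chicktending", "migrating"]

# Rows are assumed to be predicted classes, columns observed classes
CONFUSION = [
    [669, 14, 13, 0],
    [63, 1133, 65, 0],
    [26, 51, 768, 0],
    [0, 0, 0, 8],
]


def macro_scores(matrix):
    """Return macro-averaged precision and recall, and their harmonic mean."""
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    # Precision per class: correct predictions / all predictions of that class
    precision = sum(matrix[i][i] / row_totals[i] for i in range(n)) / n
    # Recall per class: correct predictions / all observations of that class
    recall = sum(matrix[i][i] / col_totals[i] for i in range(n)) / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = macro_scores(CONFUSION)
print(f"precision={precision:.1%} recall={recall:.1%} F1={f1:.1%}")
```

Because every class carries equal weight in a macro average, the eight migrating fixes count as much as the 1,261 fixes in the foraging row.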
Create Hidden Markov Model
- Make HMM
- Calculate Accuracy
- Calculate Out of Bag Error
This is to estimate probability of hatching and probability of fledging. Note: nest failure = chick tending failure by default.
- Build state and space matrices
- State: the number of hours that a bird showed incubating behavior (on inspection, these appear to be the number of instances in a day rather than hours, but this probably has little effect on the results)
- Space: the number of GPS fixes per day
- Requires birds with full nesting behavior or a depredated nest: 4 complete and 1 failed, for 5 total (GPS).
- Process matrices with number of days in incubation
- need a function
- Run model
- Assess Markov Chain Monte Carlo diagnostics
- coda package
- Plot survival and detection process
- probability of successful hatching
- Calculate performance metrics from reproductive outcome
- Accuracy, Recall (Sensitivity), Specificity, F1
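
The performance metrics in the last step can be sketched for a binary reproductive outcome (1 = success, 0 = failure). This minimal Python example is not the repository's R code. The function name `binary_metrics` is hypothetical, and the predicted outcomes are made up purely for illustration; only the 4 complete and 1 failed split comes from the list above.

```python
def binary_metrics(observed, predicted):
    """Compare predicted and observed outcomes coded as 1 (success) or 0 (failure)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)

    def ratio(num, den):
        # Undefined when a class never occurs, so return NaN rather than fail
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)   # also called recall
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


observed = [1, 1, 1, 1, 0]    # 4 complete, 1 failed
predicted = [1, 1, 1, 0, 0]   # hypothetical model output
metrics = binary_metrics(observed, predicted)
```

With only five nests, each metric moves in large steps, so these values are best read alongside the MCMC diagnostics rather than alone.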
This achieves the same end but should perform better, because it allows the MCMC sampling to estimate Pr(survival) at a specified elapsed time: 24 days for nesting and 28 days for chick tending.
- Identify the season range for each model
- use the matrices you built in the last section
- have the user specify the start and end dates (POSIXct) to be considered, and build this into the function
- trim the 365 day matrices down to what is specified
- Within the specified window, find first instance of consistent behavior (incubation or chick tending)
- need a persistence check: that is, where did a bird first exhibit 4 days in a row with some instances of a given behavior
- user needs to be able to specify the number of days
- Define the histories from the first entry from the first consistent behavior period until specified number of days
- user needs to be able to specify the number of days here as well: in BTGO it's 22-24 days of incubation and 28-34 days of chick tending.
- this way, each row (individual history) of the nest.beh.final should start with the first day a bird was observed incubating and proceed for 24 days
- Ensure that the observation matrix (i.e., nest.beh.final) samples the same columns and thus has EXACTLY the same dimensions as the state matrix (i.e., nest.beh.final)
- I envision the function looking like: `state_matrix <- function(full_matrix, start_season_date, end_season_date, num_days_incub){}`, where full_matrix is what we already have (nest.beh.final), the start_season_date and end_season_date terms filter the columns down so that only the breeding season is considered, and num_days_incub specifies how many columns wide the matrix must be, counting from the first.
- In the end, the matrix will have one row per individual and one column per day of an incubation attempt or chick-tending period; the first cell cannot be 0, and each subsequent cell in a row is the sum of predicted incubating states or GPS fixes on a given day (can be 0 or >=1)
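
The steps above (season window, persistence check, fixed-length histories, matching observation columns) can be sketched as follows. This is a minimal Python illustration, not the repository's R function. It assumes each bird's row is a list of daily counts and that the date of the first column is known. The `first_col_date` and `persistence_days` arguments, the helper `first_persistent_day`, and the example counts are all hypothetical. Birds with no persistent run, or with too few days left, are skipped here, which is also an assumption.

```python
from datetime import date


def first_persistent_day(counts, persistence_days):
    """Index of the first day that starts `persistence_days` consecutive non-zero days."""
    run = 0
    for i, count in enumerate(counts):
        run = run + 1 if count > 0 else 0
        if run == persistence_days:
            return i - persistence_days + 1
    return None


def state_matrix(full_matrix, first_col_date, start_season_date,
                 end_season_date, num_days, persistence_days=4):
    """Return per-bird histories of length `num_days` and their start columns."""
    lo = (start_season_date - first_col_date).days
    hi = (end_season_date - first_col_date).days + 1   # end date is inclusive
    histories, starts = {}, {}
    for bird, counts in full_matrix.items():
        offset = first_persistent_day(counts[lo:hi], persistence_days)
        if offset is None:
            continue                                   # no consistent behavior found
        begin = lo + offset
        history = counts[begin:begin + num_days]
        if len(history) == num_days:                   # keep only full-length histories
            histories[bird] = history
            starts[bird] = begin
    return histories, starts


# Made-up daily counts of predicted incubating states (10 days instead of 365)
behavior = {"A": [0, 2, 0, 1, 3, 2, 4, 0, 1, 1],
            "B": [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]}
# Made-up daily GPS fix counts for the same days
fixes = {"A": [5, 5, 4, 5, 5, 3, 5, 5, 2, 5],
         "B": [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]}

NUM_DAYS = 5
histories, starts = state_matrix(behavior, date(2020, 4, 1), date(2020, 4, 2),
                                 date(2020, 4, 10), NUM_DAYS, persistence_days=3)
# Cut the observation matrix at the same columns so the dimensions match exactly
observation = {bird: fixes[bird][s:s + NUM_DAYS] for bird, s in starts.items()}
```

Returning the start column for each bird is one way to guarantee that the observation matrix samples exactly the same days as the state matrix.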
- Label birds using Tableau
- Focus on birds with complete incubation
- Predict Behaviors
- Calculate LOOCV to estimate error
- If good:
- Proceed to estimate survival
- If bad:
- Subset GPS data to resemble Argos
- Train model on Argos data (all birds)
- Predict Survival (Refer to Bayesian model)
- Includes birds tagged by NRS in Spain
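
The LOOCV step in the list above could be organized as in the minimal Python sketch below. It is not the repository's R code. It assumes that cross-validation holds out one bird at a time, which the list does not state. The names `leave_one_bird_out` and `loocv_error`, the record keys, the toy majority-class model, and the example records are all hypothetical.

```python
from collections import Counter


def leave_one_bird_out(records):
    """Yield (held_out_id, train, test), holding out one bird's records at a time."""
    bird_ids = sorted({r["bird_id"] for r in records})
    for held_out in bird_ids:
        train = [r for r in records if r["bird_id"] != held_out]
        test = [r for r in records if r["bird_id"] == held_out]
        yield held_out, train, test


def loocv_error(records, fit, predict):
    """Mean misclassification rate across held-out birds."""
    errors = []
    for _, train, test in leave_one_bird_out(records):
        model = fit(train)
        wrong = sum(1 for r in test if predict(model, r) != r["behavior"])
        errors.append(wrong / len(test))
    return sum(errors) / len(errors)


# Toy stand-in for a real classifier: always predict the most common training label
def fit_majority(train):
    return Counter(r["behavior"] for r in train).most_common(1)[0][0]


def predict_majority(model, record):
    return model


# Made-up labeled fixes, for illustration only
records = (
    [{"bird_id": "a", "behavior": "incubating"}] * 3
    + [{"bird_id": "b", "behavior": "incubating"},
       {"bird_id": "b", "behavior": "foraging"}]
    + [{"bird_id": "c", "behavior": "foraging"}] * 2
)
error = loocv_error(records, fit_majority, predict_majority)
```

Splitting by bird rather than by fix keeps one individual's fixes from appearing in both the training and test sets.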
- Scope the range of species
- Scolopacids (godwits, whimbrel)
- Charadriiformes (all shorebirds)
- All precocial birds (geese, ducks, turkeys)
- Pick the number of species from each group or choose example species
- Minimum sample size per species?
- BIG QUESTION: How to measure accuracy?
- Feeler in Arctic Shorebird Demographic Network
- Movement Tracks
- Nest Fate
- Fledgling Fate
- Create an R folder and a DESCRIPTION file
- Break code into usable functions
- Add documentation to each function using roxygen2
- Construct a source package
- Submit to CRAN
- McClintock, B.T., Michelot, T. (2018). momentuHMM: R package for generalized hidden Markov models of animal movement. Methods in Ecology and Evolution, 9(6), 1518-1530.
- Michelot, T., Langrock, R., Patterson, T.A. (2016). moveHMM: An R package for analysing animal movement data using hidden Markov models. Methods in Ecology and Evolution, 7(11), 1308-1315.