rvanmazijk / cape-vs-swa

Open access repository for (some) data-sets, reproducible analyses, conference slides and manuscript for a publication based on my BSc Hons project (in press in Journal of Biogeography).

Home Page: https://www.researchgate.net/project/Plant-species-richness-turnover-environmental-heterogeneity-in-the-Cape-and-SW-Australia

License: Creative Commons Attribution 4.0 International

R 99.77% TeX 0.09% Makefile 0.14%

biogeography macroecology academic biology botany r manuscript open-science science-research science

cape-vs-swa's Introduction

Environmental heterogeneity explains contrasting plant species richness between the South African Cape and southwestern Australia

Ruan van Mazijk, Michael D. Cramer and G. Anthony Verboom

Department of Biological Sciences, University of Cape Town, Rondebosch, South Africa
Corresponding author: RvM, [email protected]

This is an open access repository for (some) data-sets, reproducible analyses, conference slides and manuscript drafts for a publication based on my BSc Hons project, published in Journal of Biogeography (https://onlinelibrary.wiley.com/doi/10.1111/jbi.14118) in 2021. See the ResearchGate page for more.

The folder for-Dryad contains the versions of scripts and data files lodged formally on Dryad, here: https://doi.org/10.5061/dryad.8w9ghx3m8.

Abstract

Aim: Given the importance of environmental heterogeneity as a driver of species richness through its effects on species diversification and coexistence, we aimed to account for the dramatic difference in species richness per unit area between two similar mediterranean-type biodiversity hotspots and whether this difference is explained by differences in environmental heterogeneity.

Location: The Greater Cape Floristic Region, South Africa (GCFR) and Southwest Australian Floristic Region (SWAFR).

Taxon: Vascular plants (tracheophytes).

Methods: Comparable, geospatially explicit environmental and species occurrence data were obtained for both regions and used to generate environmental heterogeneity and species richness raster layers. Heterogeneity in multiple environmental variables and species richness per unit area were compared between the two regions at a range of spatial scales. At each scale, richness was also regressed against these individual axes and against a major axis of heterogeneity, derived by principal component analysis (PCA).

Results: The GCFR is generally more environmentally heterogeneous and species-rich than the SWAFR. Species richness per unit area is significantly related to the major axis of heterogeneity across both regions, the latter describing ca. 38-50% of overall heterogeneity, the slope of this relationship differing between the two regions only at the finest spatial scale. Multivariate regressions, and regressions against the first axes of the PCAs (PC1), revealed variations in the dependence of species richness on environmental heterogeneity between the two regions.

Main conclusions: Notwithstanding some region-specific effects, we present evidence of a common positive relationship between floristic richness and environmental heterogeneity across the GCFR and SWAFR. This is dependent on spatial scale, being strongest at the coarsest level of sampling. The generally greater richness per unit area of the GCFR compared to the SWAFR is thus explained by the former’s generally greater environmental heterogeneity and is concordant with its greater levels of floristic turnover.

Keywords: biodiversity, environmental heterogeneity, fynbos, Greater Cape Floristic Region, kwongan, macroecology, species richness, species turnover, vascular plants, Southwest Australian Floristic Region

Acknowledgments

This work was funded by the South African Department of Science and Technology (DST) and the National Research Foundation (NRF) under the DST-NRF Freestanding Innovation Honours Scholarship (to RvM), and by the South African Association of Botanists (SAAB) Honours Scholarship (to RvM). Thanks go to the Department of Biological Sciences, University of Cape Town, for providing a 2TB external hard drive for local GIS data storage.

For results used in earlier drafts of this work and a conference presentation, many computations were performed using facilities provided by the University of Cape Town's ICTS High Performance Computing team (http://hpc.uct.ac.za).

cape-vs-swa's Issues

Mention conservation importance

A variety of habitats obviously NB to species richness conservation

Plot richness ~ enviro + roughness

(#15)

Setup mirror of repo on FigShare

A "final product mirror" as opposed to the development repo that this is.

Maybe DataDryad?

Think also about UCT FigShare system (ZivaHub)

Print abstract & keywords into YAML header from file

In manuscript/index.Rmd, I currently have the (multiline!) abstract and keywords inside the abstract: argument of the YAML header:

abstract: |
  | **Aim** Foo
  | **Location** Bar
  | **Taxon** Wizz
  |
  | _Keywords_
  | Foo, bar, wizz

This is pretty ugly.

I want to try and keep this stuff in, say, _abstract.Rmd, and dump it on render into the YAML header:

abstract: '`r readr::read_file("_abstract.Rmd")`'

Something like that would be great. I've tried it, but it creates an odd error wherein it is as if the whole YAML header is absent (thus giving no title, author, etc.), and throwing the error:

This document format requires a nonempty <title> element.
  Please specify either 'title' or 'pagetitle' in the metadata.
  Falling back to '_main.utf8'
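
One untested possibility is that the multi-line text spliced into the header leaves the YAML unparseable after knitting, so that the whole header is ignored. A minimal sketch of a workaround, assuming that is the cause: collapse the file to a single line and double any single quotes before the text lands inside the single-quoted YAML value. abstract_for_yaml() is a hypothetical helper, not part of readr, rmarkdown or bookdown.

```r
# Hypothetical helper: read a text file and return it as one line that is
# safe to place inside a single-quoted YAML scalar.
abstract_for_yaml <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]         # drop blank lines
  text <- paste(lines, collapse = " ")  # one line, no newlines
  gsub("'", "''", text, fixed = TRUE)   # YAML writes ' as '' inside single quotes
}
```

The header would then call abstract_for_yaml("_abstract.Rmd") in place of readr::read_file("_abstract.Rmd"). Collapsing to one line does lose the line-by-line layout of the current abstract block, so this trades formatting for a header that parses.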

Collate "core" code for analyses & maybe make this repo a "dev" repo?

Testing

Save bootstrap samples to disc for 0.05° roughness analysis

Holding every sample in memory has a huge overhead when there are so many pixels at this resolution, which causes a hang and crash (see 8aacb71)
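
A minimal sketch of the idea, assuming each bootstrap replicate can be written to its own .rds file and read back one at a time instead of all being held in memory together. The function name, file naming and output directory are illustrative, not what the repo uses.

```r
# Write each bootstrap replicate of `x` to disc instead of keeping all
# replicates in memory; returns the paths of the files written.
save_bootstrap_samples <- function(x, n, dir) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(dir, sprintf("bootstrap-%04d.rds", seq_len(n)))
  for (path in paths) {
    saveRDS(sample(x, size = length(x), replace = TRUE), path)
  }
  paths
}

# Example with made-up values standing in for roughness pixels
set.seed(1)
paths <- save_bootstrap_samples(rnorm(100), n = 5, dir = tempfile("bootstraps"))
first <- readRDS(paths[[1]])  # load one replicate at a time when needed
```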

Species turnover note

(From Thu, 25 Jul)

This metric of turnover has the same problems as T_QDS / S_HDS and Jaccard distance:

library(dplyr)
library(ggplot2)

HDS %>%
  mutate(euc_dist = sqrt(2*(add_turnover - mean_QDS_richness)^2)) %>%
  ggplot(aes(euc_dist, fill = region)) +
    geom_histogram(bins = 20, position = position_dodge())

Conclusion: scrap

Separate repos into analysis, manuscript, slides

Mention lack of need for autocorrelative investigations

Due to autocorrelated predictors

Manuscript note

Tony is still working on the intro, but has put it aside for a bit as he is busy this week.
When he gets back to it, he will chat to me about my methods section (and cutting it down)
Then we will write the results.

Repeat HDS richness ~ mean QDS richness and turnover w/ GWR

(#15)

BW friendly Cape (orange) and SWA (blue) colours

This is problematic. Compare colour

to black and white

for the roughness distribution panels.

I imagine this is also a problem for all of Figure 2 and 3 too.

Note that cellular roughness data for _that_ PCA incl. cells w/ < 4 sub-cells, but only those PC-values from cells w/ == 4 sub-cells analysed

Run BRTs at 3QDS scale

Diversity index as measure of heterogeneity in discrete soil class layer?

Search for SWA soil classification layer
Get ARC LandTypes layer for GCFR from Mike
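
One candidate index is Shannon diversity, H' = -sum(p_i * ln(p_i)), where p_i is the proportion of pixels in a cell that belong to soil class i. A minimal sketch, assuming the classes within a cell are available as a plain vector; the function name is illustrative.

```r
# Shannon diversity of the class labels within one cell
shannon_diversity <- function(classes) {
  classes <- classes[!is.na(classes)]
  p <- as.vector(table(classes)) / length(classes)
  p <- p[p > 0]  # unused factor levels would otherwise give 0 * log(0) = NaN
  -sum(p * log(p))
}

shannon_diversity(c("A", "A", "B", "B"))  # two equally common classes: log(2)
```

A cell with a single soil class scores 0, and the score rises as classes become more numerous and more evenly represented.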

Cont. writing SIs

Collinearity among environmental predictor variables
Selecting representative BRT-models from sets of replicates
Testing for concordance in ranked lists

Testing

"With & without the Kogelberg" analyses no longer needed

Running these models:

foo1 <- lm(HDS_richness ~ PC1, HDS)
foo2 <- lm(HDS_richness ~ PC1, HDS[HDS$HDS_richness < 2500, ])
plot(HDS_richness ~ PC1, HDS, col = factor(HDS$region))
abline(foo1)
abline(foo2, lty = "dashed")

foo1 <- lm(QDS_richness ~ PC1, QDS[QDS$QDS_richness < 2000, ])
foo2 <- lm(QDS_richness ~ PC1, QDS)
plot(QDS_richness ~ PC1, QDS, col = factor(QDS$region))
abline(foo1)
abline(foo2, lty = "dashed")

Yields slopes from foo1 and foo2 that are almost identical.

Conclusion: the "outliers" of the Kogelberg (or the bits of the QDS raster that kind of touch the Kogelberg, because a lot of the "real" Kogelberg is excluded from the final dataset due to the edge buffer) are not a big deal!!!

Math-ify "QDS richness" etc. consistently in figure axes/labels

Should be S_QDS etc. instead
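
A small sketch of how such labels could be built with R's plotmath syntax, assuming an italic S with the sampling scale as a subscript is the intended style. richness_label() is a hypothetical helper.

```r
# Build a plotmath label: italic S with the scale as a subscript
richness_label <- function(scale) {
  bquote(italic(S)[.(scale)])
}

S_QDS <- richness_label("QDS")
S_HDS <- richness_label("HDS")
```

The resulting objects can be passed wherever an axis label is expected, e.g. plot(..., xlab = S_QDS) or ggplot2::labs(x = S_QDS).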

Note, figure/table/equation in-text styling is also customisable

Using _bookdown.yml, you can specify how these things appear in text before a bookdown \@ref() call. See https://bookdown.org/yihui/bookdown/internationalization.html for more (YAML copied below).

language:
  label:
    fig: 'Figure '
    tab: 'Table '
    eq: 'Equation '

Or rather, if you wish:

language:
  label:
    fig: 'Fig. '
    tab: 'Table '
    eq: 'Eqn. '

Fix logo display in README.md on GitHub Pages site

![](logos/...) works well on GHP, as I see in my MSc's site, but then I lose size customisation (the { width=... } does not parse on GHP).

<img src="logos/..." width=...> and ![](logos/...) work for the README on the repo display, but <img ...> breaks for GHP.

Perhaps manually reduce size of huge logos and then simply use ![](...)?

Change figure panel labels to lower case

Re-install MS Word -> MacBook Air

Add maps from SAAB-AMA-SASSB-2019 talk to ms

Delete packrat/lib*/

This takes up Gbs of space!! (See c28b8f9)

Will make a branch just before 8ef150b to delete these directories.

Fix shapefile imports

Throws weird error/warning:

Warning message:
In readOGR(here::here("data/derived-data/borders/GCFR_QDS/")) :
  First layer lon read; multiple layers present in
/Users/ruanvanmazijk/projects/Cape-vs-SWA/data/derived-data/borders/GCFR_QDS, check layers with ogrListLayers()

Maybe is fine?

Test

Lozada-Gobilard S, Stang S, Pirhofer-Walzl K, Kalettka T, Heinken T, Schröder B, Eccard J, Joshi J (2018) Environmental filtering predicts plant-community trait distribution and diversity: Kettle holes as models of meta-community systems. Ecology and Evolution, in press.

Practice talk

Change figure panel labels to lower case

Wiley Author Guidelines specify that panels should be numbered "(a) (b) (c)".
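
A one-line sketch for generating labels in that style, so that they need not be typed by hand for each figure. panel_labels() is a hypothetical helper.

```r
# Lower-case, parenthesised panel labels for the first n panels
panel_labels <- function(n) paste0("(", letters[seq_len(n)], ")")

panel_labels(3)  # "(a)" "(b)" "(c)"
```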

Fix bookdown rendering

Getting this error when calling bookdown::render_book("index.Rmd"):

Error in x[i] <- sprintf("<a href=\"%s#%s\"", filenames[which.max(lines[lines <=  : 
  replacement has length zero

Neither editing the contents of index.Rmd to bare minimum, removing _output.yml, nor excluding all body .Rmds helps.

Make "advertisement" page for this project on my GHP blog, or as own repo

E.g. index, preliminary-analyses, slides, proposal

Basically a description, some results, figures, link to the RG project

Improve richness ~ enviro + roughness more

(#15, #43)

What I have done so far:

Richness @QDS ~ Enviro @QDS + Roughness BetwQDS

What I think would be better:

Richness  @QDS ~ Enviro  @QDS + Roughness  WithinQDS([email protected])
Richness  @HDS ~ Enviro  @HDS + Roughness  WithinHDS([email protected])
Richness @3QDS ~ Enviro @3QDS + Roughness Within3QDS([email protected])

And

Turnover  @HDS ~ Enviro  @HDS + Roughness  WithinHDS([email protected])
Turnover @3QDS ~ Enviro @3QDS + Roughness Within3QDS([email protected])

Add geographically weighted regressions

In #11 mentioned the importance of autocorrelative structure in space.

Want to apply GWR for HDS richness ~ avg QDS richness & turnover models, and also (maybe) for richness/turnover ~ environment/heterogeneity models (eventually!).

Hard reset to remove /old-images/ from repo

Not all are open source friendly!

Update GitHub Pages site for this project

Which "SoilGrids250m" data?

Which "SoilGrids250m" folder am I using for the Cape vs SWA publication: the 4 MB ones or the 300 KB ones?

See RUAN_UCT/GIS/SoilGrids250m/*/depths-averaged/?/ for the smaller files currently in this project repo under data/derived-data/soils/.

Plot forms of BRT-learnt relationships between environment/heterogeneity & richness/turnover

Make better table and figure presentation when rendering ms to PDF

Fix PNG figure sizes (currently over edge of page)
Make prettier tables
- bookdown has ugly default tables w/ knitr::kable(), so rather use kableExtra
Fix LaTeX equation display in tables
Fix ** use in data sources table---rather use kableExtra::group_rows()

Cont. manuscript in MS Word w/ Mike & Tony

Get up-to-date figures from ms, SI -> slides

Context map (#122)
Figure S4: PCA biplots (modified version from MEDECOS 2020) (#123)
Figure 1a–d: richness, PC1 maps (#123)
Figure 4: richness vs PC1 (#124)
Figure S5: PC1-based outliers (#124)

Update slides' title to match abstract submitted/final paper name?

Re: #120.

Or at least include the citation-proper on the title slide?

Update paths in README

Re: R/02_analyses/ etc.

Add mixed (random) effects models

This would be especially useful for the HDS richness ~ avg QDS richness & turnover models (and maybe even the richness-environment models, unless we use BRTs).

This is because traditional multiple linear regressions (which I have used for the richness-turnover stuff so far) assume independence between observations. Mixed effects models structure the error by category and so do not rely on this assumption, which seems ideal for comparing my two regions.
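
A minimal sketch of the contrast, assuming a random intercept per region is the structure meant here. The data are simulated stand-ins, not project data, and PC1 is used as a generic predictor for illustration. nlme is used because it is a recommended package that ships with R; lme4 would be an alternative.

```r
library(nlme)

# Simulated stand-in data: two regions with different baseline richness
set.seed(42)
sim <- data.frame(
  region = rep(c("GCFR", "SWAFR"), each = 30),
  PC1    = rnorm(60)
)
sim$HDS_richness <- 1000 + 150 * sim$PC1 +
  ifelse(sim$region == "GCFR", 400, 0) + rnorm(60, sd = 50)

# Ordinary regression: treats all 60 observations as independent
fit_lm <- lm(HDS_richness ~ PC1, data = sim)

# Mixed model: same fixed effect, plus a random intercept for each region
fit_lme <- lme(HDS_richness ~ PC1, random = ~ 1 | region, data = sim)
fixef(fit_lme)
```

One caveat: with only two regions, the variance of the random intercept is estimated from just two groups, so it will be poorly constrained.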

Remake figures to be more talk-friendly, as needed

Re: #115.

Figure 1a–d: richness, PC1 maps (#123)
- Vertical colour ramps
- Coloured region headings
& all figures w/ new colour scheme (#127)? Namely:
- Context map (#122)
- Figure S4: PCA biplots (#123)
- Figure 4: richness vs PC1 (#124)

Aspects to the study to maybe discuss

~~Autocorrelative investigations are worthwhile~~
Temporal range of your data sources << that of the question you're asking

AND:

Cramer & Verboom 2016's methods § is NB!
Linder paper in review by Tony is also NB!

Use jackknife as opposed to bootstrap for CLES and U-tests

Following a chat with Tony earlier today, we think it is best that I rather use jackknife resampling, so as to really get at the issue of unequal number of pixels at different spatial scales when comparing Cape and SWA roughness.

What I was doing before:

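library(magrittr)  # for %<>%

# Draw n bootstrap resamples of x (with replacement), one resample per row
bootstrap_sample <- function(x, n) {
  t(replicate(n, {
    sample(x, size = length(x), replace = TRUE)
  }))
}

Cape %<>% bootstrap_sample(n = 1000)
SWA %<>% bootstrap_sample(n = 1000)
CLES(SWA, Cape)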
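
CLES() itself is not defined in this issue. Assuming it is the common language effect size, in the orientation implied by the pairwise x < y comparisons used further down, a stand-in could look like the sketch below. cles_sketch() is a hypothetical name, and the real CLES() may treat ties or missing values differently.

```r
# Hypothetical stand-in for CLES(): the proportion of all (x, y) pairs
# in which the x-value is smaller than the y-value
cles_sketch <- function(x, y) {
  mean(outer(x, y, FUN = "<"), na.rm = TRUE)
}

cles_sketch(c(1, 2), c(3, 4))  # every x is below every y: 1
```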

Rather, I should (i) limit resamples to the lowest common number of pixels (i.e. ca. 600 for 3QDS as the limiting size) and (ii) sample without replacement to make the jackknife distribution more "null". Also, Tony pointed out that I needn't resample my roughness-value distributions, but simply resample directly from the pairwise comparison matrix. This will save on computation time because I will only have to make that matrix once.

Thus:

# Resample without replacement, down to a common number of pixels
jackknife_sample <- function(x, size) {
  sample(x, size = size, replace = FALSE)
}
# `size` defaults to the smaller of the two inputs (an assumption); pass the
# limiting size (e.g. the ca. 600 3QDS pixels) to compare across scales
CLES_jackknife <- function(x, y, n = 1000, size = min(length(x), length(y))) {
  # Pairwise comparison matrix, made only once: is x[i] < y[j]?
  pw_comparisons <- outer(x, y, FUN = "<")
  CLES_values <- numeric(n)
  for (k in seq_len(n)) {
    # Resample rows and columns of the matrix, not the raw values
    rows <- jackknife_sample(seq_len(nrow(pw_comparisons)), size)
    cols <- jackknife_sample(seq_len(ncol(pw_comparisons)), size)
    jackknifed_pw <- as.vector(pw_comparisons[rows, cols])
    CLES_values[[k]] <- sum(jackknifed_pw, na.rm = TRUE) / length(jackknifed_pw)
  }
  CLES_values
}

CLES_jackknife(SWA, Cape, n = 1000)