Quantifying trends in biodiversity with generalized additive models

class: inverse, middle, left, my-title-slide, title-slide

.title[
# Quantifying trends in biodiversity with generalized additive models
]
.author[
### Gavin Simpson
]
.institute[
### Aarhus University
]
.date[
### August 18, 2022
]

---

# Biodiversity losses

.center[![:scale 75%](./resources/living-planet-index-by-region.svg)
]

???

Many of us will be familiar with trends like this

Global biodiversity is in trouble

Wherever we look we see declining populations

Living Plant Index is an average measure of change (so some time series in a region could be increasing even if overall we have a decline in LPI)

---

# Biodiversity losses..?

.center[![:scale 100%](./resources/vellend-et-al-2013.png)
]

Vellend *et al* (2013) *PNAS*

.center[![:scale 100%](./resources/dornelas-et-al-2014.png)
]

Dornelas *et al* 2014 *Science*

???

But studies showing declining biodiversity are not without issue

There has been a decade or more long discussion about the extent to which diversity is declining

---

# Biodiversity losses..?

.center[![:scale 100%](./resources/gonzalez-et-al-2016.png)
]

Gonzalez *et al* (2016) *Ecology*

.center[![:scale 100%](./resources/vellend-et-al-2017.png)
]

Vellend *et al* 2017 *Ecology*

---

# Biodiversity gains & losses

.center[![:scale 75%](./resources/dornelas-et-al-2019.png)
]

Dornelas *et al* (2019) *Ecology Letters*

In part disagreement relates to *population* versus *assemblage* level metrics

Partly due to how the data are analysed

---

# Linear trends

Majority of these large-scale syntheses use linear trend models

.center[![:scale 75%](./resources/dornelas-et-al-2014-fig-2.jpeg)
]

Fig 2 from Dornelas *et al* (2014) *Science*

`\(y_i\)` sometimes transformed

---

# Seibold et al 2019

Analysed 150 grassland and 140 forest sites across 3 regions of Germany 2008&ndash;2017

.center[![:scale 75%](./resources/seibold-et-al-2019.png)
]

Seibold *et al* (2019) *Nature*

---

# Declining arthopod biomass, abundance, and richness

.center[![:scale 60%](./resources/seibold-et-al-fig-1.svg)
]

Seibold *et al* (2019) *Nature*

---

# Declining arthopod biomass, abundance, and richness

* Grasslands

* Biomass &#8595; 67%

* Abundance &#8595; 78%

* Richness &#8595; 36%

* Forests (30 site subset)

* Biomass &#8595; 41%

* Richness &#8595; 36%

---

# Good and bad years

.center[
![:scale 60%](./resources/fournier-et-al-2019.svg)
]

Fournier *et al* (2019) *Conservation Biology* [doi: 10.1111/cobi.13371](https://doi.org/10.1111/cobi.13371)

---

# Good and bad years

Daskalova *et al* (2021) attempted to account for this effect using random effects of `Year`

.center[
![:scale 75%](./resources/daskalova-et-al-2021-fig-1.svg)
]

Daskalova *et al* (2021) *Insect Conservation & Diversity*

---

# Generalized Additive Models

Random effects of `Year` are one form of common trend

Random effects are a special case of a penalized spline

---

# Generalized Additive Models

<br />

![](resources/tradeoff-slider.png)

.references[Source: [GAMs in R by Noam Ross](https://noamross.github.io/gams-in-r-course/)]

GAMs are an intermediate-complexity model

* can learn from data without needing to be informed by the user
* remain interpretable because we can visualize the fitted features
* fit smooth functions using penalized splines

---

# Splines formed from basis functions

![](index_files/figure-html/basis-functions-1.svg)

???

Splines are built up from basis functions

Here I'm showing a cubic regression spline basis with 10 knots/functions

We weight each basis function to get a spline. Here all the basis functions have the same weight so they would fit a horizontal line

---

# Weight basis functions &#8680; spline

.center[![](resources/basis-fun-anim.gif)]

???

But if we choose different weights we get more wiggly spline

---

# Maximize penalized log-likelihood &#8680; &beta;

.center[![](resources/gam-crs-animation.gif)]

???

Fitting a GAM involves finding the weights for the basis functions that produce a spline that fits the data best, subject to some constraints

<!-- ---

# GAM summary

1. GAMs give us a framework to model flexible nonlinear relationships

2. Use little functions (**basis functions**) to make big functions (**smooths**)

3. Use a **penalty** to trade off wiggliness/generality

4. Need to make sure your smooths are **wiggly enough** -->

---

# Arthropod abundance

.center[![:scale 90%](./resources/arthropod-abundance.svg)
]

---

# Arthropod abundance

Negative binomial GAM fitted to single site (AEG1)

.center[![:scale 70%](./resources/arthropod-abundance-site-aeg1-negbin-gam.svg)
]

---

# Grassland arthropod abundance

Our example:

* Abundance of all identified Arthropods

* Grassland sites only

* Models

* Linear global and site-specific trends (random slopes & intercepts)
    * Smooth trend of `year` plus `site` random intercepts
    * Regional smooths of `year` plus `site` random intercepts
    * Regional smooths of `year`, `site` specific *random* smooths
    * Regional smooths of `year`, year-to-year effects, `site` specific *random* smooths

---

# Random smooths?

* Random intercepts
* Random slopes
* Random smooths?

Random smooths yield a separate smooth for each subject

Share a wiggliness penalty (plus `\(\lambda\)` for intercepts & slopes)

But individual trends can have different shape

---

# Random smooth model

`site` specific *random* smooths

.center[![:scale 70%](./resources/arthropod-abundance-m3.svg)
]

Pedersen _et al_ (2019) [PeerJ **7**:e6876](https://doi.org/10.7717/peerj.6876)

---

# Model A

Regional smooths of `year`, `site` specific *random* smooths

.center[![:scale 80%](./resources/arthropod-abundance-m4.svg)
]

---

# Model B

Regional smooths of `year`, year-to-year effects, `site` specific *random* smooths

.center[![:scale 80%](./resources/arthropod-abundance-m5.svg)
]

---

# Results

* Model A (without Y2Y effects): AIC 15122 (255.7 EDF)
* Model B (with Y2Y effects): AIC 15123 (254.2 EDF)
* Equivalent mixed linear trend model (with Y2Y effects): AIC 15468 (111.8 EDF)

Use year-to-year effects or not?

(For these data) **A** and **B** are different ways to decompose the temporal effects

* Without Y2Y &#8594;	wigglier regional smooths
* With Y2Y &#8594; less wiggly regional smooths

???

Even with relatively simple data, GAMs yield much better fit than linear equiv.

<!-- ---

# Nothing new under the sun

Fewster, Buckland, *et al* (2000) *Ecology*

(except Simon Wood's excellent {mgcv} software & ability to fit random smooths) -->

---

# Summary

How we model trends in biodiversity data can change our view of losses and gains

Using "smooth" trends gives much better fit for the arthropod data here

Y2Y effects and / or smooth trends represent different decompositions of the *temporal* effect

Non/linear methods needed if we want to assess changes in rates of change

---

# Slides

* &copy; Simpson (2022) [![Creative Commons Licence](https://i.creativecommons.org/l/by/4.0/88x31.png)](http://creativecommons.org/licenses/by/4.0/)

* HTML Slide deck [gavinsimpson.github.io/esa-csee-2022/](https://gavinsimpson.github.io/esa-csee-2022/)

* RMarkdown [https://github.com/gavinsimpson/esa-csee-2022](https://github.com/gavinsimpson/esa-csee-2022)

* DOI: 10.5281/zenodo.7003949 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7003948.svg)](https://doi.org/10.5281/zenodo.7003948)

* @ucfagls on Twitter

---

# Representing trends

With the penalty idea we can imagine several ways to account for temporal effects in GAMs

1. Fixed year effects
2. Random year effects
3. MRF equivalent of AR
4. Smooth
5. Linear
6. ...

1 & 5 typically not penalized (but could be via `paraPen`)

Could combine these hierarchically