Spring meadows example

In this example, we will use constrained ordination to explore the environmental controls on the species composition of the vegetation of spring meadows from the mountains of the westernmost Carpathians (Hájek et al, 2002). The species composition data are records of all vascular plants and bryophytes in spring fen meadows. In this field, the samples are commonly referred to as relevés. The data are cover estimates recorded using the Braun-Blanquet scale:

r <5% cover, very few individuals
+ <5% cover, few individuals
1 <5% cover, numerous individuals
2 5*–25% cover
3 25*–50% cover
4 50*–75% cover
5 75*–100% cover

which is a method widely used in ecology to survey large areas rapidly. The Braun-Blanquet values have been recoded to the numeric values 1–7: on this scale the relationship with species cover is not linear, but rather corresponds approximately to a logarithmic transformation of the original cover values. In that sense then, the original cover values represented by the data have already been subjected to a log-transformation.

The relevés are associated with a set of environmental data from measurements on the spring water chemistry, the organic content of the soil, and the slope of each sampling location.

# load vegan, dplyr & readr
library("vegan")
library("dplyr")
library("readr")

# load the data
spp <- read_csv("https://bit.ly/meadows-species") |>
    rename("sample_id" = "...1") |>
    tibble::column_to_rownames("sample_id")
env <- read_csv("https://bit.ly/meadows-env") |>
    rename("sample_id" = "...1")

We can look at the number of species, samples, and environmental variables:

dim(spp)

## [1]  70 285

dim(env)

## [1] 70 16

names(env)

##  [1] "sample_id" "Ca"        "Mg"        "Fe"        "K"         "Na"       
##  [7] "Si"        "SO4"       "PO4"       "NO3"       "NH3"       "Cl"       
## [13] "Corg"      "pH"        "conduct"   "slope"

We need to log-transform the ion concentration values, but not the other variables

env <- env |>
    mutate(across(Ca:Cl, .fns = log10))

It is easier to do this in the data than in the formula for the ordination model, but of course you could do this in the formula too. It is easier to do this as a data processing step here, because we will standardise the environmental data prior to fitting the model, to illustrate how to get standardised effect sizes.

To standardise the data we could use the scale() function, but it is instructive to see how data wrangling steps like this can be done with some simple dplyr commands.

z_score <- function(x) {
    (x - mean(x)) / sd(x)
}
env <- env |>
    mutate(across(-sample_id, .fns = z_score))

The final step is to remove the sample_id column:

env <- env |>
    tibble::column_to_rownames("sample_id")

Now we can fit a CCA, downweighting rare taxa:

ord <- cca(downweight(spp) ~ ., data = env)
ord

## Call: cca(formula = downweight(spp) ~ Ca + Mg + Fe + K + Na + Si + SO4
## + PO4 + NO3 + NH3 + Cl + Corg + pH + conduct + slope, data = env)
## 
##               Inertia Proportion Rank
## Total          3.0963     1.0000     
## Constrained    1.1002     0.3553   15
## Unconstrained  1.9961     0.6447   54
## Inertia is scaled Chi-square 
## 
## Eigenvalues for constrained axes:
##   CCA1   CCA2   CCA3   CCA4   CCA5   CCA6   CCA7   CCA8   CCA9  CCA10  CCA11 
## 0.4638 0.1225 0.0728 0.0625 0.0559 0.0489 0.0452 0.0407 0.0398 0.0330 0.0303 
##  CCA12  CCA13  CCA14  CCA15 
## 0.0251 0.0227 0.0200 0.0169 
## 
## Eigenvalues for unconstrained axes:
##     CA1     CA2     CA3     CA4     CA5     CA6     CA7     CA8 
## 0.15674 0.12581 0.09683 0.08406 0.07758 0.07382 0.07041 0.06289 
## (Showing 8 of 54 unconstrained eigenvalues)

We can do an omnibus test for the overall model using anova()

set.seed(1)
anova(ord, permutations = 999)

## Permutation test for cca under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: cca(formula = downweight(spp) ~ Ca + Mg + Fe + K + Na + Si + SO4 + PO4 + NO3 + NH3 + Cl + Corg + pH + conduct + slope, data = env)
##          Df ChiSquare      F Pr(>F)    
## Model    15    1.1002 1.9841  0.001 ***
## Residual 54    1.9962                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

and we can also test the significance of the constrained axes in turn

set.seed(42)
anova(ord, permutations = 499, by = "axis", parallel = 4)

## Permutation test for cca under reduced model
## Forward tests for axes
## Permutation: free
## Number of permutations: 499
## 
## Model: cca(formula = downweight(spp) ~ Ca + Mg + Fe + K + Na + Si + SO4 + PO4 + NO3 + NH3 + Cl + Corg + pH + conduct + slope, data = env)
##          Df ChiSquare       F Pr(>F)   
## CCA1      1   0.46381 12.5471  0.002 **
## CCA2      1   0.12246  3.3129  0.004 **
## CCA3      1   0.07284  1.9705  0.374   
## CCA4      1   0.06249  1.6906  0.724   
## CCA5      1   0.05594  1.5133  0.918   
## CCA6      1   0.04890  1.3230  0.998   
## CCA7      1   0.04525  1.2240  1.000   
## CCA8      1   0.04073  1.1019  1.000   
## CCA9      1   0.03978  1.0761  1.000   
## CCA10     1   0.03296  0.8917  1.000   
## CCA11     1   0.03033  0.8204  1.000   
## CCA12     1   0.02507  0.6781  1.000   
## CCA13     1   0.02267  0.6134  1.000   
## CCA14     1   0.02002  0.5416  1.000   
## CCA15     1   0.01689  0.4570  1.000   
## Residual 54   1.99615                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

While the omnibus test indicates that the model includes some significant effects somewhere, the tests of the constrained axes indicates that most of the effect is in the first two constrained axes, suggesting that we have used too many superfluous environmental variables. This is also suggested by the strong arch in the fitted ordination

plot(ord)

We can run forward selection with the adjusted \(R^2\) stopping rules to select a more parsimonious model

set.seed(67)
lwr <- cca(downweight(spp) ~ 1, data = env)
ord2 <- ordiR2step(lwr, scope = formula(ord),
    permutations = how(nperm = 9999))

The resulting model includes seven environmental variables

ord2$anova

##                  R2.adj Df    AIC       F Pr(>F)    
## + Ca            0.12477  1 401.70 10.8236 0.0001 ***
## + NO3           0.13567  1 401.80  1.8376 0.0014 ** 
## + Na            0.14582  1 401.92  1.8000 0.0004 ***
## + Si            0.15580  1 402.04  1.7647 0.0007 ***
## + Corg          0.16346  1 402.32  1.6009 0.0024 ** 
## + NH3           0.16836  1 402.77  1.4055 0.0193 *  
## + Fe            0.17280  1 403.25  1.3608 0.0310 *  
## <All variables> 0.17679                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

An overall test of the significance can be computed using anova()

set.seed(3)
anova(ord2, permutations = how(nperm = 9999), parallel = 4)

## Permutation test for cca under reduced model
## Permutation: free
## Number of permutations: 9999
## 
## Model: cca(formula = downweight(spp) ~ Ca + NO3 + Na + Si + Corg + NH3 + Fe, data = env)
##          Df ChiSquare      F Pr(>F)    
## Model     7   0.79547 3.0622  1e-04 ***
## Residual 62   2.30083                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

and the marginal effects of the remaining variables can also be assessed

set.seed(34)
anova(ord2, by = "margin", permutations = how(nperm = 9999), parallel = 4)

## Permutation test for cca under reduced model
## Marginal effects of terms
## Permutation: free
## Number of permutations: 9999
## 
## Model: cca(formula = downweight(spp) ~ Ca + NO3 + Na + Si + Corg + NH3 + Fe, data = env)
##          Df ChiSquare      F Pr(>F)    
## Ca        1   0.16510 4.4490 0.0001 ***
## NO3       1   0.05731 1.5443 0.0379 *  
## Na        1   0.05815 1.5669 0.0341 *  
## Si        1   0.06288 1.6944 0.0187 *  
## Corg      1   0.05808 1.5651 0.0338 *  
## NH3       1   0.05236 1.4109 0.0651 .  
## Fe        1   0.05050 1.3608 0.0845 .  
## Residual 62   2.30083                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

For a more parsimonious model we might also consider dropping the effects of log NH3 and log Fe:

ord3 <- update(ord2, . ~ . - Fe - NH3)

set.seed(3)
anova(ord3, permutations = how(nperm = 9999), parallel = 4)

## Permutation test for cca under reduced model
## Permutation: free
## Number of permutations: 9999
## 
## Model: cca(formula = downweight(spp) ~ Ca + NO3 + Na + Si + Corg, data = env)
##          Df ChiSquare      F Pr(>F)    
## Model     5   0.69252 3.6876  1e-04 ***
## Residual 64   2.40379                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

set.seed(34)
anova(ord3, by = "margin", permutations = how(nperm = 9999), parallel = 4)

## Permutation test for cca under reduced model
## Marginal effects of terms
## Permutation: free
## Number of permutations: 9999
## 
## Model: cca(formula = downweight(spp) ~ Ca + NO3 + Na + Si + Corg, data = env)
##          Df ChiSquare      F Pr(>F)    
## Ca        1   0.16745 4.4584 0.0001 ***
## NO3       1   0.06924 1.8434 0.0105 *  
## Na        1   0.06979 1.8580 0.0106 *  
## Si        1   0.06706 1.7854 0.0147 *  
## Corg      1   0.06013 1.6009 0.0288 *  
## Residual 64   2.40379                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can prepare an ordination diagram of the fitted CCA, using the LC scores to limit the visual impact of the arch, which remains even now, because of the strong turnover in the species composition among sites

plot(ord3, display = c("lc", "bp"), type = "n")
points(ord3, display = "lc")
ordipointlabel(ord3, display = "lc", add = TRUE)
text(ord3, display = "bp", col = "blue")

Also with species scores

scl <- "species"
plot(ord3, display = c("species", "lc", "bp"), type = "n", scaling = scl)
points(ord3, display = "species", pch = 3, col = "red", cex = 0.7)
points(ord3, display = "lc", scaling = scl)
ordipointlabel(ord3, display = "lc", add = TRUE, scaling = scl)
text(ord3, display = "bp", col = "blue", scaling = scl)

Spring meadows example

Gavin Simpson

May 29, 2024