Variation partitioning

The data used in this example come from a field fertilisation experiment (Pyšek & Lepš 1991). A barley field was fertilised with three types of nitrogen fertiliser

ammonium sulphate,
calcium-ammonium nitrate, and
liquid urea

and two different total nitrogen doses.

In 122 plot, the weed community species composition was recorded using Braun-Blanquet scale, and transformed to the ordinal 1–7 ordinal scale. The % cover of barley was also estimated for all plots. The aim of this analysis is to consider how species composition of the weed community changes following application of the nitrogen-based fertiliser. A problem however is that the fertiliser may change the composition of the weed community, but it will also increase the cover of barley in the fields, which will indirectly affect the composition of the weeds. Here we will attempt to partition variation in the composition of the weed community into a component that is explained by the dose of the nitrogen fertiliser applied and that component that can be explained by the change in the cover of barley.

## Loading required package: permute

Load the data

library("vegan")
library("readr")
library("dplyr")

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

fspp <- read_csv("https://bit.ly/fertil-sp") %>%
  tibble::column_to_rownames("sample")

## Rows: 122 Columns: 45

## ── Column specification ──────────────────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): sample
## dbl (44): veroar, anaarv, falcon, arenar, thlasp, myosar, violar, stelme, eu...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

fenv <- read_csv("https://bit.ly/fertil-env") %>%
  tibble::column_to_rownames("sample")

## Rows: 122 Columns: 4
## ── Column specification ──────────────────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): sample
## dbl (3): dose, cover, nsp
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Variation partitioning by hand

First, fit the three RDA models we need to decompose variation in the weed community data:

# all fractions [dose + cover + shared]:
m_all <- rda(fspp ~ dose + cover, data = fenv)

# fractions [dose + shared]:
m_dose <- rda(fspp ~ dose, data = fenv)

# fractions [cover + shared]:
m_cover <- rda(fspp ~ cover, data = fenv)

Next, compute the adjusted R² for each model

# fraction [dose + shared + cover]
dsc <- RsquareAdj(m_all)$adj.r.squared
dsc

## [1] 0.1286354

# fraction [dose + shared]:
ds <- RsquareAdj(m_dose)$adj.r.squared
ds

## [1] 0.06113442

# fraction [cover + shared]:
cs <- RsquareAdj(m_cover)$adj.r.squared
cs

## [1] 0.08875315

Now we can compute the variation of the specific components

the unique dose component is computed from dsc - cs
the unique cover component is computed from dsc - ds
the shared component is computed as (ds + cs) - dsc

If the last one is not clear, ds + cs could be written d2sc = dssc and by subtracting a d a c and one s we get the unique shared component s. Another way to think of it is that each of the two models we fitted with a single covariate contains an estimate of the shared component, so the sum of their variations explained contains \(2 \times \text{shared}\), while m_all only contains a single estimate of the shared component.

The unique and shared components can be computed using

dose_r2   <- dsc - cs
cover_r2  <- dsc - ds
shared_r2 <- (ds + cs) - dsc

The adjusted R² components are then

dose = 0.04
cover = 0.07
shared = 0.02

The residual or unexplained variation is computed from the total intertia minus the inertia of the full model m_all

var_resid <- m_all$tot.chi - sum(eigenvals(m_all, model = "constrained"))
var_resid

## [1] 18.28077

The variation explained by each component needs some additional models fitting, from which we can extract the sum of the constrained eigenvalues for the model to get the individual components

# unique dose
dose <- rda(fspp ~ dose + Condition(cover), data = fenv)
sum(eigenvals(dose, model = "constrained"))

## [1] 0.9973597

# unique cover
cover <- rda(fspp ~ cover + Condition(dose), data = fenv)
sum(eigenvals(cover, model = "constrained"))

## [1] 1.581655

You can also do it using the original models and the same sums as for R²:

var_dsc <- sum(eigenvals(m_all, model = "constrained"))
var_ds  <- sum(eigenvals(m_dose, model = "constrained"))
var_cs  <- sum(eigenvals(m_cover, model = "constrained"))

var_dose   <- var_dsc - var_cs
var_cover  <- var_dsc - var_ds
var_shared <- (var_ds + var_cs) - var_dsc

The advantage of knowing how to get the estimate unique components directly is that you can test those components with a permutation test, as shown above.

For printing, let’s combine these into a table

results <- data.frame(Component = c("Total", "Dose", "Cover", "Shared", "Unexplained"),
    Variation = c(m_all$tot.chi, var_dose, var_cover, var_shared, var_resid),
    R2_adj = c(100, dose_r2, cover_r2, shared_r2, 1 - sum(dose_r2, cover_r2, shared_r2)))
results

##     Component  Variation       R2_adj
## 1       Total 21.3320688 100.00000000
## 2        Dose  0.9973597   0.03988225
## 3       Cover  1.5816553   0.06750099
## 4      Shared  0.4722841   0.02125216
## 5 Unexplained 18.2807697   0.87136460

Rendering the table with knitr::kable() (and rounding to 3 digits) we get

Component	Variation	R2_adj
Total	21.332	100.000
Dose	0.997	0.040
Cover	1.582	0.068
Shared	0.472	0.021
Unexplained	18.281	0.871

Variation partitioning the easy way

Vegan contains function varpart() which can do the variation partitioning automatically for up to four components. What we did above is for two components. Doing the variation partitioning by hand for three components is relatively simple, but needs a quiet room and good bookkeeping skills to be sure to fit all the required models and do the maths needed to compute the shared components. Doing this by hand for four components takes a lot of time and care.

vp <- varpart (fspp, ~ dose, ~ cover, data = fenv)
vp

## 
## Partition of variance in RDA 
## 
## Call: varpart(Y = fspp, X = ~dose, ~cover, data = fenv)
## 
## Explanatory tables:
## X1:  ~dose
## X2:  ~cover 
## 
## No. of explanatory tables: 2 
## Total variation (SS): 2581.2 
##             Variance: 21.332 
## No. of observations: 122 
## 
## Partition table:
##                      Df R.squared Adj.R.squared Testable
## [a+c] = X1            1   0.06889       0.06113     TRUE
## [b+c] = X2            1   0.09628       0.08875     TRUE
## [a+b+c] = X1+X2       2   0.14304       0.12864     TRUE
## Individual fractions                                    
## [a] = X1|X2           1                 0.03988     TRUE
## [b] = X2|X1           1                 0.06750     TRUE
## [c]                   0                 0.02125    FALSE
## [d] = Residuals                         0.87136    FALSE
## ---
## Use function 'rda' to test significance of fractions of interest

The Testable column contains an indicator of whether you can test the components. Here was can test all but the shared and residual components. For higher order variation partitioning, different components will be testable.

We have fitted all the models needed to test all the Testable components. These tests are:

# [a+c] = [dose + shared]
set.seed(1)
anova(m_dose)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = fspp ~ dose, data = fenv)
##           Df Variance      F Pr(>F)    
## Model      1   1.4696 8.8789  0.001 ***
## Residual 120  19.8624                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# [b+c] = [cover + shared]
set.seed(1)
anova(m_dose)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = fspp ~ dose, data = fenv)
##           Df Variance      F Pr(>F)    
## Model      1   1.4696 8.8789  0.001 ***
## Residual 120  19.8624                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# [a+b+c] = [dose + cover + shared]
set.seed(1)
anova(m_all)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = fspp ~ dose + cover, data = fenv)
##           Df Variance      F Pr(>F)    
## Model      2   3.0513 9.9313  0.001 ***
## Residual 119  18.2808                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# [a] = [dose]
set.seed(1)
anova(dose)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = fspp ~ dose + Condition(cover), data = fenv)
##           Df Variance      F Pr(>F)    
## Model      1   0.9974 6.4924  0.001 ***
## Residual 119  18.2808                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# [b] = [cover]
set.seed(1)
anova(cover)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = fspp ~ cover + Condition(dose), data = fenv)
##           Df Variance      F Pr(>F)    
## Model      1   1.5817 10.296  0.001 ***
## Residual 119  18.2808                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The notation in the table X1 | X2 means that we have Y ~ X1 + Condition(X2) and note that we can have multiple variables in one or more of the components.

We can plot the venn diagram

plot(vp, digits = 2, Xnames = c("dose", "cover"), bg = c("navy", "tomato"))

Variation partitioning

Gavin Simpson

May 31, 2024

Variation partitioning by hand

Variation partitioning the easy way