Post (1986) carried out an experiment to investigate the effect of time of ploughing on the subsequent weed composition in summer barley fields. Three different ploughing times were used. There are 13 weed species observed in 12 plots in a randomized block experiment of four complete blocks of three plots each.
Traditionally, a multivariate ANOVA could have been used to analyses data like this. However, as there are more species (response variables; 13) than there are samples (12), MANOVA cannot be used. Instead, RDA is a useful option.
library("vegan")
## Loading required package: permute
library("dplyr")
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library("readr")
plough <- read_csv(url("https://bit.ly/plough-spp"), skip = 1) |>
rename(label = "...1") |>
select(-`...2`) |>
janitor::clean_names() |>
filter(!is.na(label)) |>
tibble::column_to_rownames("label")
## New names:
## • `` -> `...1`
## • `` -> `...2`
## Rows: 12 Columns: 15
## ── Column specification ──────────────────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): ...1, ...2
## dbl (13): Che alb, Pol per, Pol avi, Spe arv, Mat rec, Gna uli, Cap bur, Ste...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
plough_design <- read_csv(url("https://bit.ly/plough-design"), skip = 1) |>
rename(label = "...1", ploughing_time = "Treatmnt") |>
select(-`...2`) |>
janitor::clean_names() |>
filter(!is.na(label)) |>
tibble::column_to_rownames("label")
## New names:
## Rows: 12 Columns: 4
## ── Column specification
## ────────────────────────────────────────────────────────────────────── Delimiter: "," chr
## (4): ...1, ...2, Block, Treatmnt
## ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column
## types or set `show_col_types = FALSE` to quiet this message.
## • `` -> `...1`
## • `` -> `...2`
We are interested in the effect of ploughing time and wish to remove
any possible effect of the blocks. We cannot achieve this by using
block as a covariate (in Condition()) when we
estimate the model using PERMANOVA. Instead we will have to include
block in our model as a variable. Some zero abundances are
recorded as NA values, so we must replace those with 0s
before we proceed. Instead of log-transforming the species counts, we
might use a non-Euclidean dissimilarity coefficient with PERMANOVA
instead of RDA.
plough <- plough |>
mutate(across(che_alb:ape_spi, .fns = tidyr::replace_na, replace = 0))
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `across(che_alb:ape_spi, .fns = tidyr::replace_na, replace = 0)`.
## Caused by warning:
## ! The `...` argument of `across()` is deprecated as of dplyr 1.1.0.
## Supply arguments directly to `.fns` through an anonymous function instead.
##
## # Previously
## across(a:b, mean, na.rm = TRUE)
##
## # Now
## across(a:b, \(x) mean(x, na.rm = TRUE))
A correct permutation test for these data should keep samples contained at the block level and only permute within, not between, blocks.
h <- with(plough_design,
how(blocks = block, nperm = 999))
The model and permutation test are specified as
ord <- adonis2(plough ~ ploughing_time, data = plough_design,
method = "bray", add = "lingoes", permutations = h)
## Set of permutations < 'minperm'. Generating entire set.
ord
## Permutation test for adonis under reduced model
## Blocks: block
## Permutation: free
## Number of permutations: 1295
##
## adonis2(formula = plough ~ ploughing_time, data = plough_design, permutations = h, method = "bray", add = "lingoes")
## Df SumOfSqs R2 F Pr(>F)
## Model 2 0.29341 0.24029 1.4233 0.004 **
## Residual 9 0.92768 0.75971
## Total 11 1.22109 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Notice how we get a warning/message from permute
Set of permutations < ‘minperm’. Generating entire set.
This means that there are not a large number of permutations of these data owing to the block restriction.
Note how with adonis2() we don’t have the option of
using Condition() in the model formula. As such we must
restrict the permutations to within the blocks of the experimental
design. We could also include the block effect in the
model
ord2 <- adonis2(plough ~ ploughing_time + block, data = plough_design,
method = "bray", add = "lingoes", permutations = h, by = "margin")
## Set of permutations < 'minperm'. Generating entire set.
ord2
## Permutation test for adonis under reduced model
## Marginal effects of terms
## Blocks: block
## Permutation: free
## Number of permutations: 1295
##
## adonis2(formula = plough ~ ploughing_time + block, data = plough_design, permutations = h, method = "bray", add = "lingoes", by = "margin")
## Df SumOfSqs R2 F Pr(>F)
## ploughing_time 2 0.29341 0.24029 3.6163 0.005 **
## block 3 0.68427 0.56038 5.6225 0.005 **
## Residual 6 0.24341 0.19934
## Total 11 1.22109 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We tested the marginal effects of the model terms as the default is to do sequential tests, but we don’t want the order in which we added the model terms to the formula to influence the results.
To have greater confidence in the results, we would like to know if there are different variances of the communities in the three groups defined by the ploughing time.
disp <- betadisper(vegdist(plough), plough_design$ploughing_time)
disp
##
## Homogeneity of multivariate dispersions
##
## Call: betadisper(d = vegdist(plough), group =
## plough_design$ploughing_time)
##
## No. of Positive Eigenvalues: 8
## No. of Negative Eigenvalues: 3
##
## Average distance to median:
## pltime 1 pltime 2 pltime 3
## 0.2301 0.2245 0.2716
##
## Eigenvalues for PCoA axes:
## (Showing 8 of 11 eigenvalues)
## PCoA1 PCoA2 PCoA3 PCoA4 PCoA5 PCoA6 PCoA7 PCoA8
## 0.513062 0.213070 0.156225 0.075461 0.070744 0.041542 0.015583 0.008201
The anova() method performs a classic ANOVA-based test
of the dispersions (H0: groups have equal dispersions)
anova(disp)
## Analysis of Variance Table
##
## Response: Distances
## Df Sum Sq Mean Sq F value Pr(>F)
## Groups 2 0.005309 0.0026544 0.1854 0.8338
## Residuals 9 0.128831 0.0143146
Are there any differences between groups in terms of their dispersions?
A permutation test can also be used, instead of assuming the data meet the assumptions of the ANOVA
permutest(disp, pairwise = TRUE, permutations = h)
## Set of permutations < 'minperm'. Generating entire set.
##
## Permutation test for homogeneity of multivariate dispersions
## Blocks: block
## Permutation: free
## Number of permutations: 1295
##
## Response: Distances
## Df Sum Sq Mean Sq F N.Perm Pr(>F)
## Groups 2 0.005309 0.0026544 0.1854 999 0.602
## Residuals 9 0.128831 0.0143146
##
## Pairwise comparisons:
## (Observed p-value below diagonal, permuted p-value above diagonal)
## pltime 1 pltime 2 pltime 3
## pltime 1 0.93600 0.447
## pltime 2 0.95708 0.306
## pltime 3 0.63784 0.51344
Here we asked for pairwise comparisons to investigate whether there are pairs of groups that have different dispersions instead of just looking at the overall test.
Tukey’s honest significant differences can also be used to pairwise compare the groups; while this is a parametric test, the confidence intervals are constructed to account for multiple comparisons
## Tukey's Honest Significant Differences
(disp_HSD <- TukeyHSD(disp))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = distances ~ group, data = df)
##
## $group
## diff lwr upr p adj
## pltime 2-pltime 1 -0.005560485 -0.2417662 0.2306452 0.9976218
## pltime 3-pltime 1 0.041577405 -0.1946283 0.2777831 0.8770854
## pltime 3-pltime 2 0.047137891 -0.1890678 0.2833436 0.8454733
plot(disp_HSD)
If we want to visualise the estimated dispersions, we can plot the betadisper object
plot(disp)
An alternative plot is produced using 1 SD data ellipses
## with data ellipses instead of hulls
plot(disp, ellipse = TRUE, hull = FALSE) # 1 sd data ellipse