au-multivariate-stats

Multivariate analysis of biological data using R

Aarhus University PhD Course

Course leader

Gavin Simpson, Assistant Professor, Department of Animal and Veterinary Sciences, Aarhus University gavin@anivet.au.dk

Getting started

To download the materials for the course, do the following:

  1. Create a new directory (folder) on your computer where you want to store the course materials. You must have read/write permissions for this location, and you should avoid placing it on a shared or cloud-synced drive (e.g. no OneDrive or Google Drive folders),
  2. Start RStudio (or the R editor of your choice),
    • If using RStudio, use the Files pane to navigate to the folder you created. Once you have the right folder open in the Files pane, click on the More entry in the toolbar for the Files pane, and select Set As Working Directory.
    • If using some other editor, set your working directory to the folder you created.
  3. In the R console, run install.packages("usethis"),
  4. As long as you have set the working directory to the new folder you created for the materials, you can now run

    usethis::use_course("gavinsimpson/au-multivariate-stats", destdir = "./")
    

    This will download the course materials to the folder, and it will start a new RStudio session in this working directory.

  5. Profit
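
Before running step 4, it can help to confirm that R is pointing at the folder you created and that you can write to it. The following is a minimal sketch using base R only; the object names wd and can_write are illustrative and not part of the course materials.

```r
# Sketch: check the current working directory before downloading the materials
wd <- getwd()
print(wd)  # should be the folder you created in step 1

# file.access() returns 0 when the requested access is permitted;
# mode = 2 asks about write permission
can_write <- file.access(wd, mode = 2) == 0

if (!can_write) {
  warning("You do not appear to have write permission for: ", wd)
}
```

If the printed path is not the folder you expected, set the working directory again as described in step 2 before calling usethis::use_course().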

Slides

Examples

Objectives of the course

The aim of the course is to provide an introduction to the analysis of multivariate data arising from observational and experimental studies with the R statistical software.

Learning outcomes and competences

After completing the course, participants will

  1. have a good introductory understanding of the main approaches used in the analysis of multivariate data sets
  2. be able to choose an appropriate method to use to analyse a data set
  3. understand how to use restricted permutation tests with constrained ordination methods to test the effects of predictor variables or experimental treatments
  4. be able to use the R statistical software to analyse multivariate data

Compulsory programme

Active participation in the course, including attendance at lectures and completion of computer-based classes and exercises. A learn-by-teaching approach will be used for the computer-based assessments, where each participant will be expected, as part of a small group, to demonstrate their approach to the analysis of a problem data set from the previous day. Participants must also complete short, computer-based assessments testing their understanding of a topic and the practical skills taught.

Course content

The course is based on a series of lectures and computer-based practical classes led by an international expert in the analysis of multivariate data, who is also one of the senior developers of the vegan R package and the creator of the permute R package.

The course covers the following topics:

Prerequisites

This course is suitable for PhD students (including senior thesis-based master's students) and researchers working with multivariate data sets in biology (inter alia animal science, ecology, agriculture, microbial ecology/microbiology). Some basic prior knowledge of R is required.

Computing requirements

Participants need to bring their own laptop with a recent version of R installed (version 4.5.0 or later), as well as the current version of RStudio. If you use another editor for your R code, feel free to use it instead of RStudio, but we cannot help you if you encounter problems with it.

You can download R from cloud.r-project.org and select from the three links at the top of the page as required for your operating system.

You can download RStudio from www.rstudio.com and choose from the list of installers as appropriate for your operating system.

If you have already installed R and RStudio, please check that they are both up-to-date. Within R you can run:

version

and look at the entry next to version.string:

> version
               _
platform       aarch64-apple-darwin20
arch           aarch64
os             darwin20
system         aarch64, darwin20
status
major          4
minor          5.3
year           2026
month          03
day            11
svn rev        89597
language       R
version.string R version 4.5.3 (2026-03-11)
nickname       Reassured Reassurer

This should include 4.5.3 if you are running the latest release (at the time of writing), but should be no lower than 4.5.0. If the installed version of R is < 4.5.0, install a newer version of R by downloading and running one of the installers from cloud.r-project.org as mentioned above.
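
As an alternative to reading the version output by eye, the comparison can be done in code. This is a small sketch using base R's getRversion(); the threshold "4.5.0" is taken from the text above, and the name r_ok is just illustrative.

```r
# getRversion() returns the running R version as an object that can be
# compared directly against a version string
r_ok <- getRversion() >= "4.5.0"

if (r_ok) {
  message("R ", getRversion(), " is recent enough for the course")
} else {
  message("R ", getRversion(), " is too old; please install a newer release")
}
```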

To check that RStudio is up-to-date, open RStudio, open the Help menu, and choose Check for Updates. RStudio will then check whether a newer version is available and, if there is one, give you the option to download it.

Prior to arriving at AU Viborg, make sure you have updated your installed R packages and that you have installed the following packages: tidyverse, vegan, ggvegan, rmarkdown, and knitr. To do this, open RStudio (or R) and, in the console window (usually lower left, with a prompt that looks like >), run

nc <- parallel::detectCores(logical = FALSE)

This checks to see how many CPU cores you have available, which we use in the next chunk.

update.packages(ask = FALSE, checkBuilt = TRUE, Ncpus = nc - 1)

Set Ncpus to the number of cores on your computer (nc) for the fastest updates when many of your installed packages need updating. If you want to keep working while this runs, set Ncpus to a number smaller than that returned by parallel::detectCores(logical = FALSE), as nc - 1 does above.
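
On some systems parallel::detectCores() can return NA, and on a single-core machine nc - 1 would be 0. The sketch below guards against both cases; the name n_use is hypothetical and introduced only for this illustration.

```r
nc <- parallel::detectCores(logical = FALSE)

# detectCores() may return NA if the core count cannot be determined;
# fall back to a single core in that case
if (is.na(nc)) {
  nc <- 1L
}

# leave one core free for other work, but never request fewer than 1
n_use <- max(1L, nc - 1L)

# then pass the guarded value to update.packages(), for example:
# update.packages(ask = FALSE, checkBuilt = TRUE, Ncpus = n_use)
```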

Now we can install the required packages:

install.packages(c("tidyverse", "vegan", "ggvegan", "rmarkdown", "knitr"))
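
Once the installation has finished, you may want to confirm that every package can be loaded. The following is a minimal sketch using base R's requireNamespace(); the object names pkgs, installed, and missing_pkgs are illustrative.

```r
pkgs <- c("tidyverse", "vegan", "ggvegan", "rmarkdown", "knitr")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise;
# quietly = TRUE suppresses the message it would print on failure
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed")
}
```

If any packages are reported as missing, rerunning install.packages() on just those names and reading the console output usually shows what went wrong.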