
Evolutionary Ecology Group

 
Logo of the R package tidysdm. Drawn by Michela Leonardi

Michela, Margherita and Andrea recently published a preprint presenting tidysdm, an R package for species distribution modelling with tidymodels.
A few resources to help you familiarise yourself with it are linked in the right column.

Michela Leonardi, Margherita Colucci, Andrea Manica (2023), tidysdm: leveraging the flexibility of tidymodels for Species Distribution Modelling in R. bioRxiv 2023.07.24.550358

Abstract
In species distribution modelling (SDM), it is common practice to explore multiple machine-learning algorithms and combine their results into ensembles. This is no easy task in R: different algorithms were developed independently, with inconsistent syntax and data structures. Specialised SDM packages integrate multiple algorithms by creating a complex interface between the user (who provides a unified input and receives a unified output) and the back-end code (which handles the specific needs of each algorithm). Creating and maintaining such an interface requires a lot of work, and it prevents easy integration of other methods that may become available.

Here we present tidysdm, an R package that solves this problem by taking advantage of the tidymodels universe. Being part of the tidyverse, tidymodels (i) has a standardised grammar and data structures that provide a coherent interface for modelling, (ii) includes packages designed for fitting, tuning, and validating various models, and (iii) allows easy integration of new algorithms and methods.

tidysdm allows easy, flexible and quick species distribution modelling by supporting standard algorithms, providing additional SDM-oriented functions, and making it possible to use any algorithm or procedure to fit, tune and validate a large number of different models. It also provides functions to easily fit models based on palaeo/time-scattered data.

The package includes two vignettes detailing standard procedures for present-day and time-scattered data. These vignettes also showcase the integration with pastclim (Leonardi et al. 2023) to allow easier access to palaeoclimatic data series, if needed, but users can bring in their own climatic data in standard formats.