Clustering metabarcoding data: a model-based approach

Luisa Ferrari, University of Modena and Reggio Emilia

Co-authors: Maria Franco-Villoria, University of Modena and Reggio Emilia; Garritt Page, Brigham Young University; Massimo Ventrucci, University of Bologna; Alex Laini, University of Turin

Abstract: Metabarcoding is a highly efficient molecular technique that yields large species occurrence datasets. Its major limitation is that it detects only the presence or absence of a species, not its abundance. Metabarcoding data therefore require statistical tools designed for multivariate binary data. We aim to develop a model-based clustering strategy for such data. After comparing existing methods from the literature, we propose to investigate an extension that incorporates the environmental covariates that often accompany occurrence data. In summary, this project seeks to maximize the utility of metabarcoding data with a context-appropriate clustering technique.
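
To make the idea of model-based clustering for multivariate binary data concrete, the sketch below fits a finite mixture of independent Bernoulli distributions by expectation-maximization. This is an illustrative assumption, not the method proposed in the abstract: the function name `fit_bernoulli_mixture`, its parameters, and the toy sites-by-species matrix are all hypothetical, and environmental covariates are not included.

```python
import math
import random


def fit_bernoulli_mixture(X, k, n_iter=100, seed=0, eps=1e-6):
    """EM for a mixture of independent Bernoullis (hypothetical helper).

    X is a list of 0/1 rows (sites x species); k is the number of clusters.
    Returns (weights, probs, labels).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    weights = [1.0 / k] * k
    # Random starting occurrence probabilities, kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(p)] for _ in range(k)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities per site, computed in logs.
        resp = []
        for row in X:
            logs = []
            for c in range(k):
                ll = math.log(weights[c])
                for x, q in zip(row, probs[c]):
                    ll += math.log(q) if x else math.log(1.0 - q)
                logs.append(ll)
            m = max(logs)
            unnorm = [math.exp(v - m) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update mixing weights and per-species occurrence
        # probabilities; eps smoothing keeps every value strictly in (0, 1).
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = (nc + eps) / (n + k * eps)
            for j in range(p):
                num = sum(r[c] * row[j] for r, row in zip(resp, X))
                probs[c][j] = (num + eps) / (nc + 2.0 * eps)
    # Hard assignment: most probable cluster for each site.
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return weights, probs, labels


# Toy presence/absence matrix: 8 sites (rows) by 6 species (columns).
toy = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
weights, probs, labels = fit_bernoulli_mixture(toy, k=2)
```

Here `labels` holds one cluster index per site and `probs` the estimated occurrence probability of each species within each cluster. Cluster indices are arbitrary, so only the grouping of sites is meaningful, not the label values themselves.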