Non-parametric Bayesian methods for classification

Boris Hejblum and Anaïs Rouanet (University of Bordeaux)

Course description

When performing clustering, a recurring issue is how to choose the appropriate number of clusters. The Bayesian non-parametric framework makes it possible to estimate the number of clusters directly within model-based clustering.
The first part of the course will be devoted to the Gaussian Dirichlet Process Mixture model and its Chinese Restaurant Process representation. We will cover theoretical concepts as well as hands-on R practicals illustrating them. The second part of the course will cover supervised clustering, where the clustering structure is guided by an outcome. We will illustrate this with the freely available R package PReMiuM, applied to real epidemiological data.
Requirements: participants should have a working knowledge of R. Previous exposure to Bayesian analysis and MCMC algorithms would be helpful for understanding the concepts covered in this course.

Message to attendees

Dear course participants,
You will find instructions and materials for the course in the following GitHub repository: https://github.com/borishejblum/BNPclusteringCNC21
We are looking forward to the class.
Anaïs & Boris