The summer school is a joint effort by ERS-IASC (European Regional Section of the International Association for Statistical Computing), ECAS (European Courses in Advanced Statistics), and CLADAG (Classification and Data Analysis Group of the Italian Statistical Society).

The course is intended to provide postgraduate training in special areas of statistics for both researchers and professional data analysts. The focus is on classification and clustering methods, in conjunction with related visualization techniques, with particular emphasis on modern high-dimensional data sets (MHDS). MHDS have recently emerged because of rapid improvements in data acquisition, storage and processing. The availability of massive data sets is also of great interest in machine learning, data science and computer science. Large data sets arise in many contexts, such as biological experiments, financial markets and astronomy. Classification and clustering play a key role in this new paradigm by uncovering the heterogeneous structure that often underlies these data, and they have consequently become even more central methods of modern data analysis. Starting from basic concepts, the course will introduce the audience to novel techniques and software through extensive applications to real data. Numerical applications will be performed with a variety of software, including some R packages and some cloud-computing platforms (SaaS, Software as a Service) that originate from research but target many kinds of practitioners.


Topic 1 // Introduction to Cluster Analysis and Classification
Multivariate data formats. Multivariate data and their visualization. Linear spaces, distances, dissimilarities, and geometric structures in several dimensions. Multivariate location-scale models. Clustering and classification. Types of clustering. Centroid-based clustering methods. Agglomerative hierarchical methods. Spectral clustering. Density-based methods.

Topic 2 // Mixture Models, Model-based Clustering and Algorithms
Mixture models. Sampling from mixture models and clustered populations. Elliptically shaped clusters and the Gaussian model. Finite Gaussian mixture models (GMM) and model-based clustering. Maximum likelihood estimation (MLE) for GMM. The EM algorithm and its variants. Computational aspects of MLE for GMM: scale restrictions and cluster initialization. Clusterwise linear regression and cluster-weighted models.

Topic 3 // Model Selection, Variable Selection and Cluster Validation
Model-based clustering and model selection criteria: AIC, BIC, ICL. Strategies for model specification for the GMM and its variants. High-dimensional data and variable selection. Dimension reduction for clustering and classification. Estimating the number of clusters. Cluster validation and cluster stability. Criteria for comparing clusterings.

Topic 4 // Further Topics in Cluster Analysis and Classification
Robustness and clustered data. Robust methods for cluster analysis. Clustering with categorical variables and mixed-type data. Network data clustering. Clustering strategies and method selection.

Topic 5 // Issues in Clustering Big Data
Three lectures on emerging fields in clustering big data: co-clustering, clustering of high-dimensional data, and clustering of time series.


Monday, May 21, 2018
Introduction to cluster analysis and classification

Time Topic Lecturer
09.00-09.30 Introduction
09.30-11.00 Lecture 1 (Topic 1) C. Biernacki
11.00-11.30 Coffee break
11.30-13.30 Lecture 2 (Topic 2) S. Ingrassia
13.30-15.00 Lunch
15.00-16.00 Lecture 3 (Topic 1) C. Biernacki
16.00-17.00 Practical lab session on lectures 1-3 TBA

Tuesday, May 22, 2018
Mixture models, model-based clustering and algorithms

Time Topic Lecturer
08.30-10.30 Lecture 4 (Topic 1) C. Biernacki
10.30-11.00 Coffee break
11.00-13.00 Lecture 5 (Topic 2) S. Ingrassia
13.00-14.30 Lunch
14.30-15.30 Practical lab session on lectures 1-3 TBA
15.30-16.00 Coffee break
16.00-17.00 Practical lab session on lectures 4-5 TBA

Wednesday, May 23, 2018
Model selection, variable selection and cluster validation

Time Topic Lecturer
08.30-10.30 Lecture 6 (Topic 3) S. Ingrassia
10.30-11.00 Coffee break
11.00-13.00 Lecture 7 (Topic 3) P. Coretto
13.00-14.30 Lunch
14.30-15.30 Practical lab session on lectures 6-7 TBA
15.30-18.30 Social Event

Thursday, May 24, 2018
Further topics in cluster analysis and classification

Time Topic Lecturer
08.30-10.30 Lecture 8 (Topic 4) P. Coretto
10.30-11.00 Coffee break
11.00-13.00 Lecture 9 (Topic 4) P. Coretto
13.00-14.30 Lunch
14.30-15.30 Practical lab session on lectures 8-9 TBA
15.30-16.00 Coffee break
16.00-17.30 Discussion on future advances related to lectures 8-9 TBA

Friday, May 25, 2018
Three topics in clustering big data

Time Topic Lecturer
08.30-10.30 Co-clustering C. Biernacki
10.30-11.00 Coffee break
11.00-13.00 Clustering of high-dimensional data C. Bouveyron
13.00-14.30 Lunch
14.30-16.30 Clustering of time series S. Frühwirth-Schnatter
16.30-16.45 Closing


Christophe Biernacki
UFR de Mathématiques
Université Lille 1

Charles Bouveyron
Laboratoire J.A. Dieudonné, UMR CNRS 7531,
and Equipe Asclepios, INRIA Sophia-Antipolis
Université Nice Côte d’Azur

Pietro Coretto
Department of Economics and Statistics
University of Salerno

Sylvia Frühwirth-Schnatter
Institute for Statistics and Mathematics
Vienna University of Economics and Business

Salvatore Ingrassia
Department of Economics and Business
University of Catania



Classes and coffee breaks will take place at
University of Catania,
Palazzo Centrale
Piazza Università, 2
95124 Catania

Lunches will be served at
Dimora De Mauro
Via Gesualdo Clementi, 5
95124 Catania


Shared apartments for four people have been reserved close to the school venue at Dimora De Mauro (the same place where the light lunches will be served).

Attendees can share these apartments at a cost of €35 per person per night, breakfast included. For information and reservations, contact Dimora De Mauro at reception@dimorademauro.com.

Registration and Deadlines

The summer school can host no more than 35 participants.

Registration Fee: €330

The fee includes school attendance, coffee breaks, and lunches at Dimora De Mauro.


Registration opens: November 11, 2017
Registration closes: April 8, 2018


To apply, please register using the online application form
and follow the instructions to complete the payment.