Scientific Calendar Event



Starts 11 Oct 2017 11:00
Ends 11 Oct 2017 12:00
Central European Time
ICTP
Central Area, 2nd floor, SISSA Building, via Beirut
A data set can be regarded as an ensemble of realizations drawn from a probability density. Obtaining a synthetic description of this distribution makes it possible to rationalize the underlying generating process and to build human-readable models. In simple cases, visualizing the distribution in a suitable low-dimensional projection is enough to capture its main features, but real-world data sets are often embedded in a high-dimensional space.
I present a procedure that obtains such a synthetic description automatically, using only the pairwise distances (or similarities) between data points. The methodology relies on reliable estimates of the intrinsic dimension of the data set and of its probability density function, coupled with a modified Density Peaks clustering algorithm.
The final outcome of all this machinery working together is a hierarchical tree that summarizes the main features of the data set, together with a classification that assigns each data point to the feature it belongs to.
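
As a rough illustration of how far pairwise distances alone can go, the sketch below estimates an intrinsic dimension and then clusters the points by density peaks. It is a minimal sketch under stated assumptions, not the procedure presented in the talk: the dimension estimate uses a maximum-likelihood fit to the ratio of second to first nearest-neighbor distances (the abstract does not name its estimator), and the clustering is the standard Density Peaks scheme with a Gaussian kernel rather than the modified version mentioned above. The function names and the parameters d_c and n_clusters are hypothetical, and the hierarchical tree is not built here.

```python
import numpy as np

def two_nn_dimension(dist):
    """Intrinsic dimension from the ratio of 2nd to 1st neighbor distances.

    Assumed estimator (not named in the abstract): maximum-likelihood fit
    of a Pareto law to mu = r2 / r1. Needs distinct points (r1 > 0).
    """
    ordered = np.sort(dist, axis=1)
    r1, r2 = ordered[:, 1], ordered[:, 2]  # column 0 is the self-distance
    mu = r2 / r1
    return len(mu) / np.sum(np.log(mu))

def density_peaks(dist, d_c, n_clusters):
    """Standard Density Peaks clustering on a distance matrix.

    d_c (kernel width) and n_clusters are hypothetical user choices.
    """
    n = dist.shape[0]
    # Gaussian-kernel local density, minus the self-contribution exp(0) = 1.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # point indices by decreasing density
    delta = np.empty(n)
    parent = np.full(n, -1)
    # The densest point has no denser neighbor: use its largest distance.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]  # distance to nearest point of higher density
        parent[i] = j
    # Centers are the points with the largest rho * delta.
    centers = np.argsort(-(rho * delta))[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Visit points by decreasing density so each parent is already labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers

# Toy data: two well-separated 2D blobs; only `dist` is passed on.
rng = np.random.default_rng(0)
pts = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 2)),
    rng.normal(10.0, 0.3, size=(50, 2)),
])
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

dim = two_nn_dimension(dist)
labels, centers = density_peaks(dist, d_c=1.0, n_clusters=2)
print("estimated intrinsic dimension:", dim)
print("cluster sizes:", np.bincount(labels))
```

Both functions take only the distance matrix, so the same calls would work with any dissimilarity measure for which nearest neighbors are meaningful.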