Scientific events

Identifying informative distance measures in high-dimensional feature spaces

21 November 2024
Time:
11:30
Location:
Meeting Room, Building C, Area Science Park, Padriciano 99, Trieste
Speaker:
Alessandro Laio, SISSA - International School for Advanced Studies

Abstract:

Real-world data in biology, materials science, medicine and beyond typically contain a large number of features that are heterogeneous in nature, relevance, and units of measure. When assessing the similarity between data points, for example between two cells or two patients, one can build various distance measures using subsets of these features. Finding a small set of features that still retains sufficient information about the dataset is important for successful data analysis.
We introduce a statistical test that assesses the relative information retained by different distance measures and determines whether they are equivalent, independent, or whether one is more informative than the other. This test can be used to identify the most informative distance measure and, therefore, the most informative set of features, out of a pool of candidates. The approach can be used to perform feature selection in molecular modeling and clinical analysis, and to infer causality in high-dimensional time series.
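
The abstract does not spell out the statistic behind the test. As a rough illustration only, the sketch below assumes a rank-based comparison known in the literature as the information imbalance: for each point, take its nearest neighbour under distance A and look up the rank of that same neighbour under distance B. The function names, the Euclidean metric, and the synthetic data are all assumptions made for this example, not details taken from the talk.

```python
import numpy as np


def distance_ranks(X):
    """Return ranks[i, j]: rank of point j among the neighbours of point i.

    Rank 1 is the nearest neighbour. Euclidean distance is an assumption
    of this sketch; any other distance could be substituted.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbour
    order = np.argsort(d, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(len(X))[:, None]
    ranks[rows, order] = np.arange(1, len(X) + 1)[None, :]
    return ranks


def information_imbalance(X_a, X_b):
    """Rank-based estimate of how well distances in space A predict space B.

    Values close to 0 mean A retains the neighbourhood information of B;
    values close to 1 mean A carries essentially no information about B.
    """
    n = len(X_a)
    ranks_a = distance_ranks(X_a)
    ranks_b = distance_ranks(X_b)
    nearest_in_a = np.argmin(ranks_a, axis=1)  # index of the rank-1 neighbour in A
    return 2.0 / n * ranks_b[np.arange(n), nearest_in_a].mean()


# Hypothetical data: two independent Gaussian features.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
full, first, second = x, x[:, :1], x[:, 1:]

print("full  -> first :", information_imbalance(full, first))
print("first -> full  :", information_imbalance(first, full))
print("first -> second:", information_imbalance(first, second))
```

Under this assumed statistic, the three outcomes named in the abstract correspond to the two directions of the comparison: both values small suggests equivalent distances, both close to 1 suggests independent ones, and one small with the other large suggests that one distance is more informative than the other.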