Interests
Scalability issues in Machine Learning and Classification
Currently, I am working on several topics related to scalability issues in indexing, classification, machine learning and knowledge discovery. I am particularly interested in tasks such as
- kNN search on giga-record databases;
- unsupervised learning (clustering) on large (>10^6 records), high-dimensional (>100 dimensions) databases;
- multimodal information retrieval over millions of documents.
High-dimensional indexing
The subject of my thesis and of my recent research is the indexing of high-dimensional multimedia descriptors in order to accelerate k-nearest-neighbours search (also known as kNN search or similarity search). I am studying how multiple moderate-dimensional indexes can help to tame the "curse of dimensionality" and how space-filling curves can help to store the indexes in fast, convenient and easy-to-update data structures like the one-dimensional B-tree.
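To make the space-filling-curve idea concrete, here is a minimal sketch, not the method developed in the thesis. It assumes a Z-order (Morton) curve, non-negative integer coordinates, and a sorted Python list standing in for the one-dimensional B-tree; the names `morton_key` and `CurveIndex` are hypothetical. Points that are close on the curve are often close in space, so the search inspects a small window of curve neighbours and returns an approximate answer.

```python
import bisect


def morton_key(point, bits=8):
    """Interleave the bits of non-negative integer coordinates into one Z-order key."""
    key = 0
    for bit in range(bits - 1, -1, -1):  # most significant bit first
        for coord in point:
            key = (key << 1) | ((coord >> bit) & 1)
    return key


class CurveIndex:
    """Points sorted by curve key; the sorted list is a stand-in for a B-tree (assumption)."""

    def __init__(self, points, bits=8):
        self.bits = bits
        self.entries = sorted((morton_key(p, bits), p) for p in points)
        self.keys = [k for k, _ in self.entries]

    def knn(self, query, k=1, window=4):
        """Approximate kNN: rank only the points whose keys lie near the query's key."""
        pos = bisect.bisect_left(self.keys, morton_key(query, self.bits))
        lo = max(0, pos - window)
        hi = min(len(self.entries), pos + window)
        candidates = [p for _, p in self.entries[lo:hi]]
        # Re-rank the candidates by squared Euclidean distance to the query.
        candidates.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, query)))
        return candidates[:k]


points = [(0, 0), (1, 1), (200, 200), (10, 12)]
index = CurveIndex(points)
nearest = index.knn((1, 2), k=1)
```

A single curve can place spatially close points far apart on the line, which is one reason to combine several indexes, as described above. Widening `window` trades speed for accuracy.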
Related publications:
Content-based information retrieval (CBIR) and Image Identification
My thesis was also concerned with the application of CBIR to image identification (also known as copy detection or near-duplicate detection, terms which I avoid because the target images, in my case, may have suffered strong modifications). This technique finds application in the detection of copyright violations and also, which was my main interest, in many Cultural Heritage activities.
Related publications:
Computer Science and Cultural Heritage
I am very fond of research at the interface between Computer Science and Cultural Heritage. My M.Sc. research was related to the serious (and still unanswered) problem of Digital Longevity: how to preserve digital data for decades and even centuries. During my Ph.D., my emphasis shifted from preservation to access: the use of CBIR to allow the retrieval of images whose metadata are missing. I am still interested in all problems pertaining to this rich interface: digital preservation, digitisation of collections, digital libraries, asset management for Cultural Heritage, and digital techniques for conservation and restoration.
Related publications: