Publications
This is a list of my most important publications, in reverse
chronological order (most recent first).
Faria, Fábio, Veloso, Adriano, Almeida, Humberto, Valle, Eduardo, Torres, Ricardo, Gonçalves, Marcos and Meira Jr., Wagner. “Learn to Rank for Content-Based Image Retrieval” in Proceedings of the 11th ACM International Conference on Multimedia Information Retrieval — MIR 2010. Philadelphia – PA, USA. March 29–31, 2010. (accepted)
fulltext |
Abstract: In Content-based Image Retrieval (CBIR), accurately ranking the returned images is of paramount importance, since users consider mostly the topmost results. The typical ranking strategy used by many CBIR systems is to employ image content descriptors, so that returned images that are most similar to the query image are placed higher in the rank. While this strategy is well accepted and widely used, improved results may be obtained by combining multiple image descriptors. In this paper we explore this idea, and introduce algorithms that learn to combine information coming from different descriptors. The proposed learning to rank algorithms are based on three diverse learning techniques: Support Vector Machines (CBIR-SVM), Genetic Programming (CBIR-GP), and Association Rules (CBIR-AR). Eighteen image content descriptors (color, texture, and shape information) are used as input and provided as training to the learning algorithms. We performed a systematic evaluation involving two complex and heterogeneous image databases (Corel and Caltech) and two evaluation measures (Precision and MAP). The empirical results show that all learning algorithms provide significant gains when compared to the typical ranking strategy in which descriptors are used in isolation. We concluded that, in general, CBIR-AR and CBIR-GP outperform CBIR-SVM. A fine-grained analysis revealed the lack of correlation between the results provided by CBIR-AR and the results provided by the other two algorithms, which indicates the opportunity of an advantageous hybrid approach.
blog entry
|
Valle, Eduardo and Cord, Matthieu. “Advanced Techniques in CBIR: Local Descriptors, Visual Dictionaries and Bags of Features” in Proceedings of the 22nd Brazilian Symposium on Computer Graphics and Image Processing — SIBGRAPI 2009 (Tutorials). Rio de Janeiro – RJ, Brazil. October 11–14, 2009.
fulltext |
Abstract: Local descriptors have been extensively used in CBIR systems, where their robustness to intense geometric and photometric transformations allows the identification of a target object/image with great reliability. However, due to their excessive discriminating power, their application to the retrieval of complex categories is challenging. The introduction of the technique of visual dictionaries (also known as dictionary of visual terms) is an important step towards the conciliation between the robustness of local descriptors and the flexibility of generalization needed by complex queries. As a bonus, we become able to employ advanced retrieval techniques which were so far available only for textual data.
blog entry
|
Valle, Eduardo and Cord, Matthieu. “Similarity Search and Indexing for High-Dimensional Data” in Proceedings of the 24th Brazilian Symposium on Databases — SBBD 2009 (Tutorials). Fortaleza – CE, Brazil. p. 263. October 5–9, 2009.
extended abstract |
Abstract: Searching by similarity is a critical operation on many systems, and thus has attracted the attention of many disciplines in Computer Science, including Computational Geometry, Machine Learning, Multimedia and, of course, Databases. To perform efficiently, similarity search requires the support of indexing, which suffers from the infamous “curse of dimensionality”. In this tutorial we will introduce the challenges of indexing and searching high-dimensional data, and present the most recent tools available to “tame the curse”. At the end, the audience will have a good grasp of the current state of the art, the most promising research trends and the challenges still faced by the technology.
blog entry | presentation
|
Valle, Eduardo, Picard, David and Cord, Matthieu. “Geometric Consistency Checking for Local-Descriptor Based Document Retrieval” in Proceedings of the 9th ACM Symposium on Document Engineering — DocEng 2009. pp. 135–138. Munich, Germany. September 15–18, 2009. DOI: 10.1145/1600193.1600224
fulltext |
Abstract: In this paper, we evaluate different geometric consistency schemes, which can be used in tandem with an efficient architecture, based on voting and local descriptors, to retrieve multimedia documents. In many contexts the geometric consistency enforcement is essential to boost the retrieval performance. Our empirical results show, however, that geometric consistency alone is unable to guarantee high-quality results in databases that contain too many non-discriminating descriptors.
blog entry
|
Bertholdo, Flávio, Valle, Eduardo and Araújo, Arnaldo. “Layout-Aware Limiarization for Readability Enhancement of Degraded Historical Documents” in Proceedings of the 9th ACM Symposium on Document Engineering — DocEng 2009. pp. 131–134. Munich, Germany. September 15–18, 2009. DOI: 10.1145/1600193.1600223
fulltext |
Abstract: In this paper we propose a technique of limiarization (also known as thresholding or binarization) tailored to improve the readability of degraded historical documents. Limiarization is a simple image processing technique, which is employed in many complex tasks like image compression, object segmentation and character recognition. The technique is also useful on its own: since it results in a high-contrast image, in which the foreground is clearly separated from the background, it can greatly improve the readability of a document, provided that other attributes (like character shape) do not suffer. Our technique exploits statistical characteristics of textual documents and applies both global and local thresholding. Under visual inspection in experiments on a collection of severely degraded historical documents, it compares favorably with the state of the art.
blog entry
|
Picard, David, Cord, Matthieu and Valle, Eduardo. “Study of SIFT Descriptors for Image Matching Based Localization in Urban Street View Context” in City Models, Roads and Traffic — ISPRS Workshop — CMRT 09. pp. 193–198. Paris, France. September 3–4, 2009.
|
Valle, Eduardo, Cord, Matthieu and Philipp-Foliguet, Sylvie. “High-Dimensional Descriptor Indexing for Large Multimedia Databases” in Proceedings of the 17th ACM Conference on Information and Knowledge Management — CIKM 2008. pp. 739–748. Napa Valley – CA, USA. October 26–30, 2008. DOI: 10.1145/1458082.1458181
fulltext |
Abstract: In this paper we address the subject of large multimedia database indexing for content-based retrieval. We introduce multicurves, a new scheme for indexing high-dimensional descriptors. This technique, based on the simultaneous use of moderate-dimensional space-filling curves, has as main advantages the ability to handle high-dimensional data (100 dimensions and over), to allow the easy maintenance of the indexes (inclusion and deletion of data), and to adapt well to secondary storage, thus providing scalability to huge databases (millions, or even thousands of millions of descriptors). We use multicurves to perform the approximate k nearest neighbors search with a very good compromise between precision and speed. The evaluation of multicurves, carried out on large databases, demonstrates that the strategy compares well to other up-to-date k nearest neighbor search strategies. We also test multicurves on the real-world application of image identification for cultural institutions. In this application, which requires the fast search of a large amount of local descriptors, multicurves allows a dramatic speed-up in comparison to the brute-force strategy of sequential search, without any noticeable precision loss.
blog entry | presentation | video | publisher link
|
Valle, Eduardo, Cord, Matthieu and Philipp-Foliguet, Sylvie. “Fast Identification of Visual Documents Using Local Descriptors” in Proceedings of the 8th ACM Symposium on Document Engineering — DocEng 2008. pp. 173–176. São Paulo, SP, Brazil. September 16–19, 2008. DOI: 10.1145/1410140.1410175
fulltext |
Abstract: In this paper we introduce a system for the identification of visual documents. Since it stems from content-based document indexing and retrieval, our system does not need to rely on textual annotations, watermarks or other metadata, which can be missing or incorrect. Our retrieval system is based on local descriptors, which have been shown to provide accurate and robust description. Because of the high computational costs associated with the matching of local descriptors, we propose Projection KD-Forest: an indexing technique which allows efficient approximate k nearest neighbors search. Experiments demonstrate that the Projection KD-Forest allows the system to provide prompt results with negligible loss of accuracy. The Projection KD-Forest also compares well when contrasted to other strategies of k nearest neighbors search.
blog entry | presentation | publisher link
|
Valle, Eduardo. Local-descriptor matching for image identification systems. Ph.D. thesis. 181 p. École Doctorale Sciences et Ingénierie, Université de Cergy-Pontoise, Cergy, France. June 12, 2008.
fulltext (r/v) | fulltext (recto only) |
Abstract: Image identification (or copy detection) consists in retrieving the original from which a query image possibly derives, as well as any related metadata, such as titles, authors, copyright information, etc. The task is challenging because of the variety of transformations that the original image may have suffered. Image identification systems based on local descriptors have shown excellent efficacy, but often suffer from efficiency issues, since hundreds, even thousands of descriptors, have to be matched in order to find a single image. Our goal is to provide fast methods for descriptor matching, by creating efficient ways to perform the k-nearest neighbours search in high-dimensional spaces. In this way, we can gain the advantages from the use of local descriptors, while minimising the efficiency issues. We propose three new methods for the k nearest neighbours search (a.k.a. kNN search or similarity search): the 3-way trees — an improvement over the KD-trees using redundant, overlapping nodes; the projection KD-forests — a technique which uses multiple moderate dimensional KD-trees; and the multicurves, which is based on multiple moderate dimensional Hilbert space-filling curves. Those techniques try to reduce the amount of random access to the data, in order to be well adapted to the implementation in secondary memory.
blog entry | presentation | presentation (pt)
|
Valle, Eduardo, Cord, Matthieu and Philipp-Foliguet, Sylvie. “Content-based Image Identification on Cultural Databases” in Proceedings of the International Cultural Heritage Informatics Meeting — ICHIM 2007. 5 p. Toronto – ON, Canada. October 24–26, 2007.
fulltext |
Abstract: Cultural institutions are often asked to identify images in newspapers, theses and even postcards, where the references are too brief, missing or incorrect; in those cases, the visual data is the only reliable evidence for the identification. Image identification is challenging because of the transformations the query image may have suffered, which include cropping, rotations, scale changes, etc. In this work we describe an automatic, content-based system for image identification based on local descriptors. The indexing scheme we have developed allows the use of those descriptors without the long processing times normally associated with them.
blog entry | publisher link
|
Valle, Eduardo, Cord, Matthieu and Philipp-Foliguet, Sylvie. “Matching Local Descriptors for Image Identification on Cultural Databases” in Proceedings of the 9th International Conference on Document Analysis and Recognition — ICDAR 2007. pp. 679–683. vol II. Curitiba – PR, Brazil. September 23–26, 2007. DOI: 10.1109/ICDAR.2007.4377001
fulltext |
Abstract: In this paper we present a new method for high-dimensional descriptor matching, based on the KD-Tree, which is a classic method for nearest neighbours search. This new method, which we name 3-Way Tree, avoids the boundary effects that disrupt the KD-Tree in higher dimensionalities, by the addition of redundant, overlapping sub-trees. That way, more precision is obtained for the same querying times. We evaluate our method in the context of image identification for cultural collections, a task which can greatly benefit from the use of high-dimensional local descriptors computed around PoI (Points of Interest).
blog entry | presentation | publisher link
|
Valle, Eduardo, Cord, Matthieu and Philipp-Foliguet, Sylvie. “3-way Trees: A Similarity Search Method for High-Dimensional Descriptor Matching” in Proceedings of the 14th IEEE International Conference on Image Processing — ICIP 2007. pp. 173–176. vol I. San Antonio – TX, USA. September 16–19, 2007. DOI: 10.1109/ICIP.2007.4378919
fulltext |
Abstract: In this paper we look into the problem of high-dimensional local descriptor matching for image identification on cultural databases, presenting an important improvement over a classic method, the KD-Tree. Our method, the 3-Way Tree, uses redundant, overlapping sub-trees, in order to avoid the boundary effects that disrupt the KD-Tree in higher dimensionalities, achieving more precision for the same querying times.
blog entry | publisher link
|
Valle, Eduardo, Philipp-Foliguet, Sylvie and Cord, Matthieu. “Image Identification using Local Descriptors” in ImagEVAL 2006 Workshop. 5 p. Amsterdam, Netherlands. July 12, 2007.
fulltext |
Abstract: Our participation in Task 1 of the ImagEVAL competition was motivated by our previous work in image identification for cultural institutions. Our approach is based on local descriptors computed around points of interest. We used KD-Trees with the best-bin-first traversal in order to match the descriptors. To eliminate false positives, we ensured the consistency in the matching with the RANSAC algorithm and an affine transform model. We obtained good results.
presentation | publisher link
|
Valle, Eduardo, Cord, Matthieu and Philipp-Foliguet, Sylvie. “Content-based Retrieval of Images for Cultural Institutions using Local Descriptors” in Geometric Modelling and Imaging — New Trends — GMAI 2006. pp. 177–182. London, England. July 05–06, 2006. DOI: 10.1109/GMAI.2006.16.
fulltext |
Abstract: The task of identifying an image whose metadata are missing is often demanded from holders of cultural image collections, such as museums and archives. The query image may present distortions (cropping, rescaling, rotations, colour changes, noise…) from the original, which poses an additional complication. The majority of proposed solutions are based on classic image signatures, such as the colour histogram. Our approach, however, follows computer vision methods, and is based on local descriptors. In this paper we describe our approach, explain the SIFT method on which it is based and compare it to the Multiscale-CCV, an established scheme employed in a large scale practical system. We demonstrate experimentally the efficacy of our approach, which achieved a 99.2% success rate, against 61.0% for the Multiscale-CCV, in a database of photos, drawings and paintings.
blog entry | presentation | publisher link
|
Valle, Eduardo and Araújo, Arnaldo. “Digitalização de acervos, desafio para o futuro” in Revista do Arquivo Público Mineiro, ano XLI, Julho–Dezembro 2005, pp. 128–143. Secretaria de Estado de Cultura / Arquivo Público Mineiro, Belo Horizonte – MG, Brazil. December, 2005. ISSN: 0104-8368. (In Portuguese)
fulltext |
Abstract: The countless possibilities that digitization offers for the preservation of collections call for long-term strategies for its use; otherwise, artifacts of permanent value are left at the mercy of the fragility of digital technology.
|
Valle, Eduardo. “Preservação digital e gestão eletrônica de documentos para museus e arquivos: O desafio dos acervos permanentes” in Anais do Museu Histórico Nacional, v. 37. 10 p. Museu Histórico Nacional, Rio de Janeiro – RJ, Brazil. 2005. (In Portuguese)
fulltext |
Abstract: This article, written for the Anais do Museu Histórico Nacional, discusses the challenges of electronic document management when applied to permanent collections, in particular the issues of access, preservation, workflow and metadata management.
|
Amorim, Eliane, Lopes, Carlos and Valle, Eduardo. Introdução à preservação de acervos digitais. 42 p. Secretaria de Estado de Cultura / Arquivo Público Mineiro, Belo Horizonte – MG, Brazil. 2005. ISBN 85-99528-01-7. (In Portuguese) |
Lopes, Carlos, Valle, Eduardo, Amorim, Eliane and Vieira, Fernanda. “Digitalizando para Durar: a Experiência do Arquivo Público Mineiro” in Anais do I Congresso Nacional de Arquivologia — ABARQ. Associação Brasiliense de Arquivologia, Brasília – DF, Brazil. November 23–26, 2004. (In Portuguese)
fulltext |
Abstract: The adoption of technology by archival institutions has resulted in the transformation and emergence of concepts, processes, methodologies and behaviors related to the preservation of, access to and handling of information. For five years, the Arquivo Público Mineiro has been using technology to optimize its preservation and access activities. This communication aims to convey the results of studies, reflections and research carried out by the institution's technical staff during the execution of several projects devoted to the digitization of collections and the development of information retrieval systems.
|
Valle, Eduardo. Sistemas de Informação Multimídia na Preservação de Acervos Permanentes. M.Sc. dissertation. 128 p. Departamento de Ciência da Computação, Universidade Federal de Minas Gerais / Belo Horizonte – MG, Brazil. February 21, 2003. (In Portuguese)
fulltext |
Abstract: Document preservation is a key activity for historical analysis and for the construction of cultural identity. However, this task faces the difficulty of classifying and storing enormous masses of documents, and the severe trade-off between conservation and access. Digital technology offers the possibility of breaking this trade-off, providing users with high-quality copies while protecting the original artifacts from unnecessary handling. However, computers do not offer simple solutions for document preservation. Digital information is more subject to tampering, vandalism and accidents than analog information. Worse still, rapid obsolescence is a constant threat of loss of access to the collections. This work explores the benefits and challenges that multimedia information systems bring to collections of permanent value.
|
Valle, Eduardo and Araújo, Arnaldo. “Preserving Historical Collections Using Multimedia Information Systems” in Annals of the 8th Brazilian Symposium on Multimedia and Hypermedia Systems — SBMidia 2002. 8 p. Universidade Federal do Ceará / Sociedade Brasileira de Computação, Fortaleza – CE, Brazil. October 7–10, 2002. |
Valle, Eduardo, Araújo, Arnaldo, Vieira, Fernanda and Costa, Carla. “A Tool for Workflow Management in the Composition of Multimedia Databases from Preexistent Documents” in Annals of the 8th Brazilian Symposium on Multimedia and Hypermedia Systems — SBMidia 2002. 4 p. Universidade Federal do Ceará / Sociedade Brasileira de Computação, Fortaleza – CE, Brazil. October 7–10, 2002. |
|
Valle, Eduardo, Araújo, Arnaldo and Andrade, Nelson. “Sistema de Informação Multimídia para Acervos Históricos de Valor Permanente” in Anais do 7º Simpósio Brasileiro de Sistemas Multimídia e Hipermídia — SBMidia 2001. pp. 234–235. Universidade Federal de Santa Catarina / Sociedade Brasileira de Computação, Florianópolis – SC, Brazil. October 15–19, 2001. (In Portuguese) |
|