FABIO JOSE AYRES
Search Results
Now showing 1 - 2 of 2
Journal Article
Detection of the Optic Nerve Head in Fundus Images of the Retina with Gabor Filters and Phase Portrait Analysis (2010)
Rangayyan, Rangaraj M.; Zhu, Xiaolu; FABIO JOSE AYRES; Ells, Anna L.

We propose a method using Gabor filters and phase portraits to automatically locate the optic nerve head (ONH) in fundus images of the retina. Because the center of the ONH is at or near the focal point of convergence of the retinal vessels, the method includes detection of the vessels using Gabor filters, detection of peaks in the node map obtained via phase portrait analysis, and an intensity-based condition. The method was tested on 40 images from the Digital Retinal Images for Vessel Extraction (DRIVE) database and 81 images from the Structured Analysis of the Retina (STARE) database. An ophthalmologist independently marked the center of the ONH for evaluation of the results. The evaluation includes free-response receiver operating characteristic (FROC) analysis and a measure of the distance between the manually marked and automatically detected centers. On the DRIVE database, the detected ONH centers were, on average, 0.36 mm (18 pixels) from the corresponding centers marked by the ophthalmologist, and FROC analysis indicated a sensitivity of 100% at 2.7 false positives per image. On the STARE database, FROC analysis indicated a sensitivity of 88.9% at 4.6 false positives per image.

Conference Paper
Unsupervised Improvement of Audio-Text Cross-Modal Representations (2023)
Wang, Zhepei; Subakan, Cem; Subramani, Krishna; Wu, Junkai; TIAGO FERNANDES TAVARES; FABIO JOSE AYRES; Smaragdis, Paris

Recent advances in using language models to obtain cross-modal audio-text representations have overcome the limitations of conventional training approaches that use predefined labels. This has allowed the community to make progress on tasks such as zero-shot classification, which would otherwise not be possible. However, learning such representations requires a large amount of human-annotated audio-text pairs. In this paper, we study unsupervised approaches to improve the learning framework of such representations with unpaired text and audio. We explore domain-unspecific and domain-specific curation methods to create audio-text pairs that we use to further improve the model. We also show that when domain-specific curation is used in conjunction with a soft-labeled contrastive loss, we obtain significant improvements in zero-shot classification performance on downstream sound event classification and acoustic scene classification tasks.
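As a companion to the first abstract: the front end of the pipeline it describes is an oriented Gabor filter bank whose per-pixel magnitude and winning orientation feed the subsequent phase portrait analysis. Below is a minimal sketch of that front end only, assuming a grayscale fundus image supplied as a NumPy array; the function names and parameter values (sigma, wavelength, number of orientations) are illustrative placeholders rather than the settings used in the paper, and the phase portrait and intensity-based stages are omitted.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(sigma, wavelength, theta, gamma=0.5):
    """Real-valued Gabor kernel oriented at angle theta (radians)."""
    half = int(np.ceil(4 * sigma))  # window extends ~4 sigma from center
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # along the filter axis
    yr = -x * np.sin(theta) + y * np.cos(theta)   # across the filter axis
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()  # zero DC: flat background responds ~0

def vessel_response(image, n_orientations=32, sigma=2.0, wavelength=8.0):
    """Max Gabor response over orientations, plus the winning angle per pixel."""
    image = image.astype(float)
    magnitude = np.full(image.shape, -np.inf)
    angle = np.zeros(image.shape)
    for theta in np.linspace(0, np.pi, n_orientations, endpoint=False):
        response = convolve(image, gabor_kernel(sigma, wavelength, theta))
        better = response > magnitude
        magnitude[better] = response[better]
        angle[better] = theta
    return magnitude, angle
```

The zero-mean kernel keeps flat background from responding, so elongated, ridge-like vessels dominate the magnitude map; the per-pixel winning angle is the orientation field that a phase portrait stage would then model to locate node-type convergence points such as the ONH.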
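The second abstract's key training ingredient is a soft-labeled contrastive loss: InfoNCE-style cross-entropy computed against soft rather than one-hot targets. The sketch below shows one common form of such a loss, assuming paired batches of audio and text embeddings; the soft-target construction from within-batch text similarity, the function names, and the temperature values are assumptions for illustration, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def soft_contrastive_loss(audio_emb, text_emb, soft_targets, temperature=0.07):
    """Cross-modal contrastive loss with soft (non-one-hot) targets.

    audio_emb, text_emb: (N, d) batches of paired embeddings.
    soft_targets: (N, N) row-stochastic matrix; row i gives the desired
    match distribution of audio i over the batch's texts (an identity
    matrix recovers the standard one-hot InfoNCE loss).
    """
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / temperature  # (N, N) audio-to-text similarity logits
    # Cross-entropy against the soft targets, symmetrized over directions.
    loss_a2t = -(soft_targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()
    loss_t2a = -(soft_targets.T * F.log_softmax(logits.T, dim=-1)).sum(-1).mean()
    return 0.5 * (loss_a2t + loss_t2a)

# Hypothetical soft-target construction from within-batch text similarity
# (an assumption for illustration, not necessarily the paper's method):
audio_emb = torch.randn(16, 512)
text_emb = torch.randn(16, 512)
with torch.no_grad():
    t = F.normalize(text_emb, dim=-1)
    soft_targets = F.softmax(t @ t.T / 0.1, dim=-1)
loss = soft_contrastive_loss(audio_emb, text_emb, soft_targets)
```

Softening the targets lets semantically similar but unpaired captions share probability mass instead of being pushed apart as hard negatives, which is the intuition behind combining such a loss with automatically curated audio-text pairs.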