1 result for Speech Recognition System using MFCC
in Digital Peer Publishing
Abstract:
Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, standard text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities, word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram, and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. These audio and video words provide a means of representing the different modalities in a uniform way. Audio-visual documents are then represented by the frequencies of their word analogues, following the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification, and supplementing speech and non-speech audio with video words improves it further. Optimal F-scores range between 62% and 94%, corresponding to 50%–84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio, and video.
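The classification step described in the abstract (bag-of-words frequencies over modality tokens fed to support vector machines in a 1 vs. n setting) can be illustrated with a minimal sketch. The example below is not the authors' implementation: it assumes scikit-learn, pretends the syllable, audio-word, and video-word tokens have already been extracted, and uses hypothetical document strings, token names, and category labels purely for illustration.

```python
# Minimal sketch (assumptions labeled): bag-of-words counts over modality "word"
# tokens plus a one-vs-rest linear SVM. Token streams, token names, and category
# labels below are hypothetical placeholders, not data from the paper.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Each string stands for one audio-visual document: its syllable tokens (speech),
# quantized "audio words" (non-speech audio), and "video words" concatenated.
train_docs = [
    "syl_ta syl_ge syl_schau aud_12 aud_87 vid_3 vid_41",
    "syl_bun syl_des syl_li syl_ga aud_55 aud_55 vid_9 vid_9 vid_17",
]
train_labels = ["politics", "sport"]  # placeholder IPTC-style categories

# Bag-of-words frequencies -> linear SVM trained one-vs-rest (1 vs. n).
model = make_pipeline(
    CountVectorizer(token_pattern=r"\S+"),  # every whitespace-separated token is a "word"
    OneVsRestClassifier(LinearSVC()),
)
model.fit(train_docs, train_labels)

# Predict the category of an unseen token stream.
print(model.predict(["syl_tor syl_schuss aud_55 vid_9"]))
```

Because all three modalities are mapped into the same token vocabulary, combining them amounts to concatenating token streams before counting, which is what makes per-category choices of modality combinations straightforward.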