873 results for Audio-visual Speech Recognition, Visual Feature Extraction, Free-parts, Monolithic, ROI
Abstract:
Single shortest path extraction algorithms have been used in a number of areas such as network flow and image analysis. In image analysis, shortest path techniques can be used for object boundary detection, crack detection, or stereo disparity estimation. Sometimes one needs to find multiple paths as opposed to a single path in a network or an image where the paths must satisfy certain constraints. In this paper, we propose a new algorithm to extract multiple paths simultaneously within an image using a constrained expanded trellis (CET) for feature extraction and object segmentation. We also give a number of application examples for our multiple paths extraction algorithm.
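To illustrate the single-path baseline that the abstract starts from, here is a minimal sketch of a dynamic-programming shortest path through an image treated as a trellis. It is not the constrained expanded trellis (CET) algorithm, whose details the abstract does not give. The function name `min_cost_path`, the left-to-right direction and the rule that the row may change by at most one per column are all assumptions made for illustration.

```python
def min_cost_path(cost):
    """Cheapest left-to-right path through a 2-D cost grid.

    Assumption (not from the paper): the path advances one column per step
    and its row index may change by at most 1 between adjacent columns.
    Returns (total_cost, list_of_row_indices_per_column).
    """
    n_rows, n_cols = len(cost), len(cost[0])
    # acc[r][c] holds the cheapest accumulated cost of reaching cell (r, c).
    acc = [[cost[r][0]] + [0.0] * (n_cols - 1) for r in range(n_rows)]
    # back[r][c] remembers which row in column c-1 gave that cheapest cost.
    back = [[0] * n_cols for _ in range(n_rows)]

    for c in range(1, n_cols):
        for r in range(n_rows):
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < n_rows]
            best = min(candidates, key=lambda p: acc[p][c - 1])
            acc[r][c] = acc[best][c - 1] + cost[r][c]
            back[r][c] = best

    # Pick the cheapest end point in the last column and trace back.
    end = min(range(n_rows), key=lambda r: acc[r][n_cols - 1])
    path = [end]
    for c in range(n_cols - 1, 0, -1):
        path.append(back[path[-1]][c])
    path.reverse()
    return acc[end][n_cols - 1], path


# Toy cost image with a cheap diagonal (values are illustrative only).
toy_cost = [
    [1, 9, 9],
    [9, 1, 9],
    [9, 9, 1],
]
total, rows = min_cost_path(toy_cost)
print(total, rows)
```

Extracting several paths at once, as the paper proposes, would additionally need constraints between paths (for example, that they do not cross), which this single-path sketch does not model.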
Abstract:
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index an image's multiple features into a single one-dimensional structure. For both text and visual feature spaces, the similarity between a point and the center of a local partition in the individual space is used as the indexing key, and similarity values in different features are distinguished by different scales. A single indexing tree can then be built on these keys. Because relevant images have similar similarity values with respect to the center of the same local partition in any feature space, a certain number of irrelevant images can be pruned quickly using the triangle inequality on the indexing keys. To remove the curse of dimensionality present in high-dimensional structures, we propose a new technique called Local Bit Stream (LBS). LBS transforms an image's text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on the local partition's center. Such BS representations are small in size and fast to compare, since only bit operations are involved. By comparing the common bits in two BSs, most irrelevant images can be filtered immediately. To integrate multiple features effectively, we also investigated the following evidence combination techniques: Certainty Factor, Dempster-Shafer Theory, Compound Probability, and Linear Combination. Our extensive experiments showed that a single one-dimensional index on multi-features greatly outperforms multiple indices on multi-features. Our LBS method outperforms sequential scan on high-dimensional spaces by an order of magnitude. Certainty Factor and Dempster-Shafer Theory perform best in combining multiple similarities from the corresponding multiple features.
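The pruning idea can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the MSI structure itself: it uses Euclidean distance rather than the paper's similarity values, a single partition center, and a sorted list in place of an indexing tree. The names `dist` and `range_query` are hypothetical.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(points, center, query, radius):
    """Find points within `radius` of `query` using a 1-D key per point.

    The key of a point is its distance to the partition center. By the
    triangle inequality, |d(p, center) - d(q, center)| <= d(p, q), so any
    point whose key differs from the query's key by more than `radius`
    cannot be a result and is skipped without a full comparison.
    """
    # One-dimensional "index": points sorted by their key.
    keyed = sorted((dist(p, center), p) for p in points)
    keys = [k for k, _ in keyed]

    dq = dist(query, center)
    lo = bisect.bisect_left(keys, dq - radius)
    hi = bisect.bisect_right(keys, dq + radius)

    survivors = [p for _, p in keyed[lo:hi]]          # passed the key filter
    results = [p for p in survivors if dist(p, query) <= radius]
    return survivors, results


# Illustrative data only.
pts = [(0, 0), (1, 0), (5, 0), (0, 5)]
survivors, results = range_query(pts, center=(0, 0), query=(1, 1), radius=1.0)
print(survivors, results)
```

Only the survivors of the key filter need the full distance computation, which is where the savings over a sequential scan would come from.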
Abstract:
Objective: To describe and evaluate the performance of a new real-time seizure detection algorithm for the newborn infant. Methods: The algorithm includes parallel fragmentation of the EEG signal into waves; wave-feature extraction and averaging; and elementary, preliminary and final detection. The algorithm detects EEG waves with heightened regularity, using wave intervals, amplitudes and shapes. Its performance was assessed using event-based and both liberal and conservative time-based approaches, and compared with the performance of Gotman's and Liu's algorithms. Results: The algorithm was assessed on multi-channel EEG records of 55 neonates, including 17 with seizures. The algorithm showed sensitivities ranging from 83% to 95%, with positive predictive values (PPV) of 48-77%. There were 2.0 false positive detections per hour. In comparison, Gotman's algorithm (with a 30 s gap-closing procedure) displayed sensitivities of 45-88% and PPV of 29-56%, with 7.4 false positives per hour, and Liu's algorithm displayed sensitivities of 96-99% and PPV of 10-25%, with 15.7 false positives per hour. Conclusions: The wave-sequence analysis based algorithm displayed higher sensitivity, higher PPV and a substantially lower level of false positives than two previously published algorithms. Significance: The proposed algorithm provides a basis for major improvements in neonatal seizure detection and monitoring. Published by Elsevier Ireland Ltd. on behalf of International Federation of Clinical Neurophysiology.
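For readers unfamiliar with the reported metrics, the sketch below shows how sensitivity, PPV and false positives per hour follow from detection counts, using their standard definitions. The function name `detection_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def detection_metrics(true_pos, false_neg, false_pos, hours):
    """Standard detection metrics from raw counts.

    sensitivity = TP / (TP + FN): share of real seizures that were detected.
    PPV         = TP / (TP + FP): share of detections that were real seizures.
    fp_per_hour = FP / recording duration in hours.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    ppv = true_pos / (true_pos + false_pos)
    fp_per_hour = false_pos / hours
    return sensitivity, ppv, fp_per_hour


# Hypothetical counts chosen only to make the arithmetic easy to follow.
sens, ppv, fph = detection_metrics(true_pos=90, false_neg=10, false_pos=30, hours=15)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}, false positives/hour={fph:.1f}")
```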
Abstract:
Much work has been done on texture feature extraction for rectangular images, but far less attention has been paid to the arbitrary-shaped regions available in region-based image retrieval (RBIR) systems. In this work, we present a texture feature extraction algorithm based on projection onto convex sets (POCS) theory. POCS iteratively concentrates more and more energy into the selected coefficients, from which texture features of an arbitrary-shaped region can be extracted. Experimental results demonstrate the effectiveness of the proposed algorithm for image retrieval purposes.
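The general POCS mechanism can be sketched in one dimension. This is an illustrative alternating-projection loop (in the style of Papoulis-Gerchberg extrapolation), not the paper's algorithm: the choice of the DFT as the transform, the 1-D signal, and the names `pocs_extrapolate`, `mask` and `keep` are all assumptions. One projection keeps only the selected coefficients; the other restores the known samples inside the region.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def pocs_extrapolate(samples, mask, keep, iterations=50):
    """Alternate between two convex sets.

    Set A: signals whose DFT is zero outside the index set `keep`
           (`keep` should contain k and n-k together so the result is real).
    Set B: signals that equal `samples` wherever `mask` is True.
    """
    x = [s if m else 0.0 for s, m in zip(samples, mask)]
    for _ in range(iterations):
        # Projection onto A: discard all non-selected coefficients.
        X = [c if k in keep else 0 for k, c in enumerate(dft(x))]
        x = [v.real for v in idft(X)]
        # Projection onto B: restore the known samples inside the region.
        x = [s if m else v for s, m, v in zip(samples, mask, x)]
    return x


# Illustrative signal: one cosine cycle, known on the first 12 of 16 samples.
n = 16
signal = [math.cos(2 * math.pi * t / n) for t in range(n)]
known = [t < 12 for t in range(n)]
estimate = pocs_extrapolate(signal, known, keep={0, 1, 15})
print([round(v, 3) for v in estimate])
```

After the loop, the samples inside the region are unchanged while the values outside it have been filled in so that most of the energy sits in the selected coefficients, which is the property the abstract describes.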
Abstract:
This paper presents a corpus-based descriptive analysis of the most prevalent transfer effects and connected speech processes observed in a comparison of 11 Vietnamese English speakers (6 females, 5 males) and 12 Australian English speakers (6 males, 6 females) over 24 grammatical paraphrase items. The phonetic processes are segmentally labelled in terms of IPA diacritic features using the EMU speech database system, with the aim of labelling departures from native-speaker pronunciation. An analysis of prosodic features was made using the ToBI framework. The results show many phonetic and prosodic processes that make non-native speakers' speech distinct from that of native speakers. The corpus-based methodology of analysing foreign accent may have implications for the evaluation of non-native accent, accented speech recognition and computer-assisted pronunciation learning.
Abstract:
This work proposes an investigation of audiovisual production in Piauí, seeking to understand how cultural issues and the specificities of the narratives are negotiated in the historical development of Piauí's visual culture. It analyses the discursive, imagetic and technological issues that address that cultural reality, which was influenced by the Brazilian cinematic avant-gardes, namely Cinema Novo and Cinema Marginal. The research corpus comprises the Super 8 productions made from 1972 until mid-1985, when a second cinematic cycle came to an end. This qualitative study employs bibliographic and documentary research, supported by interviews and the analysis of period documents, to study the narratives presented by the audiovisual productions. The influence of social, political, technological and economic factors in Piauí on the making of these films is taken into account. The study concludes that cultural practices and technological resources constitute a visual culture that represents local anxieties and criticisms and conveys the typification of the subject in Piauí.
Abstract:
Digital society embraces us in every aspect of daily life, and a significant part of the population lives connected across multiple platforms. With the immediacy of communication flows, we live a routine in which many kinds of access are a click or a touch away. Television, the predominant medium for several decades, takes on in its digital transition a function beyond the TV we knew: an interactive display that connects to and absorbs content from various sources. The world's established models of audiovisual distribution, especially broadcast, are suffering the consequences of changes in audience behaviour brought about by new opportunities to access content, now interactive and on demand. In this context, Smart TV (connected TV) models over broadband offer differentiated options and claim an ever larger space in the connection with all other displays. Against this backdrop, the present study seeks to describe and analyse the new offerings of content, applications, possibilities and trends in the hybridisation of sources for the future TV.
Abstract:
Keyword identification in one of two simultaneous sentences is improved when the sentences differ in F0, particularly when they are almost continuously voiced. Sentences of this kind were recorded, monotonised using PSOLA, and re-synthesised to give a range of harmonic ΔF0s (0, 1, 3, and 10 semitones). They were additionally re-synthesised by LPC with the LPC residual frequency shifted by 25% of F0, to give excitation with inharmonic but regularly spaced components. Perceptual identification of frequency-shifted sentences showed a similar large improvement with nominal ΔF0 as seen for harmonic sentences, although overall performance was about 10% poorer. We compared performance with that of two autocorrelation-based computational models comprising four stages: (i) peripheral frequency selectivity and half-wave rectification; (ii) within-channel periodicity extraction; (iii) identification of the two major peaks in the summary autocorrelation function (SACF); (iv) a template-based approach to speech recognition using dynamic time warping. One model sampled the correlogram at the target-F0 period and performed spectral matching; the other deselected channels dominated by the interferer and performed matching on the short-lag portion of the residual SACF. Both models reproduced the monotonic increase observed in human performance with increasing ΔF0 for the harmonic stimuli, but not for the frequency-shifted stimuli. A revised version of the spectral-matching model, which groups patterns of periodicity that lie on a curve in the frequency-delay plane, showed a closer match to the perceptual data for frequency-shifted sentences. The results extend the range of phenomena originally attributed to harmonic processing to grouping by common spectral pattern.
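The periodicity-extraction stage can be illustrated with a minimal sketch that estimates a signal's period from the peak of its autocorrelation. This covers a single channel only and omits the peripheral filterbank, half-wave rectification, summation across channels and the recognition stage described in the abstract. The function names and the lag search range are assumptions.

```python
import math


def autocorrelation(x, lag):
    """Unnormalised autocorrelation of x at a given positive lag."""
    return sum(x[t] * x[t + lag] for t in range(len(x) - lag))


def estimate_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the largest autocorrelation.

    The search range is restricted because lag 0 always gives the maximum
    and multiples of the true period give further peaks.
    """
    return max(range(min_lag, max_lag + 1), key=lambda lag: autocorrelation(x, lag))


# Illustrative test tone: a sinusoid with a period of 20 samples.
tone = [math.sin(2 * math.pi * t / 20) for t in range(200)]
period = estimate_period(tone, min_lag=5, max_lag=30)
print(period)
```

Dividing the sampling rate by the estimated period would give the F0 estimate in hertz.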
Abstract:
Dementia, including Alzheimer’s disease (AD), is a major disorder causing visual problems in the elderly population. The pathology of AD includes the deposition in the brain of abnormal aggregates of β-amyloid (Aβ) in the form of senile plaques (SP) and abnormally phosphorylated tau in the form of neurofibrillary tangles (NFT). A variety of visual problems have been reported in patients with AD including loss of visual acuity (VA), colour vision and visual fields; changes in pupillary response to mydriatics, defects in fixation and in smooth and saccadic eye movements; changes in contrast sensitivity and in visual evoked potentials (VEP); and disturbances of complex visual functions such as reading, visuospatial function, and in the naming and identification of objects. Many of these changes are controversial with conflicting data in the literature and no ocular or visual feature can be regarded as particularly diagnostic of AD. In addition, some pathological changes have been observed to affect the eye, visual pathway, and visual cortex in AD. The optometrist has a role in helping a patient with AD, if it is believed that signs and symptoms of the disease are present, so as to optimize visual function and improve the quality of life. (J Optom 2009;2:103-111 ©2009 Spanish Council of Optometry)
Abstract:
This paper discusses the first of three studies which collectively represent a convergence of two ongoing research agendas: (1) the empirically-based comparison of the effects of evaluation environment on mobile usability evaluation results; and (2) the effect of environment - in this case lobster fishing boats - on achievable speech-recognition accuracy. We describe, in detail, our study and outline our results to date based on preliminary analysis. Broadly speaking, the potential for effective use of speech for data collection and vessel control looks very promising - surprisingly so! We outline our ongoing analysis and further work.
Abstract:
Much has been written about the marketing aspects of promotional material in general, and several scholars (particularly in linguistics) have addressed questions relating to the structure and function of advertisements, focusing on images, rhetorical structure, semiotic functions, discourse features and audio-visual media, amongst other aspects of the genre. Not much, on the other hand, has been written within translation studies about the complexities involved in the transfer of an advertising message. Contributors to this volume explore various interdependent aspects of the interlingual and intercultural transfer of an advertising message. They emphasize features of culture specificity, of multi-medial semiotic interaction, of values and stereotypes, and most importantly, they recommend strategies and approaches to assist translators. Topics covered include a critique of the Western-based approach to advertising in the context of the Far East; different perceptions of the concept of cleanliness in advertising texts in Italy, Russia and the UK; the Walls Cornetto strategy of internationalization of product appeal, followed by localization; the role of the translator in recreating appeal in different lingua-cultural contexts; what constitutes 'Italianness' in advertisements for British consumers; and strategies for repackaging France as a tourist destination.
Abstract:
The article gives an account of the various microfilming initiatives taken in Malta during the last thirty years. Various archives have managed to microfilm their holdings under co-operation agreements with international societies, or manuscript libraries. The advent of digital technology is now posing new challenges and opportunities for the archives sector. The idea of a National Memory Project that will try to bridge the different approaches in the preservation of records in the various public, private, and ecclesiastical archives in Malta is discussed. Technical challenges are highlighted, as are the opportunities that arise from collaboration and active participation in international projects such as the European Visual Archives (EVA), and the SEEDI initiative.
Abstract:
Drawing on the newest findings of politeness research, this paper proposes an interactionally grounded approach to computer-mediated discourse (CMD). Through the analysis of naturally occurring text-based synchronous interactions of a virtual team the paper illustrates that the interactional politeness approach can account for linguistic phenomena not yet fully explored in computer-mediated discourse analysis. Strategies used for compensating for the lack of audio-visual information in computer-mediated communication, strategies to compensate for the technological constraints of the medium, and strategies to aid interaction management are examined from an interactional politeness viewpoint and compared to the previous findings of CMD analysis. The conclusion of this preliminary research suggests that the endeavour to communicate along the lines of politeness norms in a work-based virtual environment contradicts some of the previous findings of CMD research (unconventional orthography, capitalization, economizing), and that other areas (such as emoticons, backchannel signals and turn-taking strategies) need to be revisited and re-examined from an interactional perspective to fully understand how language functions in this merely text-based environment.
Abstract:
Good estimates of ecosystem complexity are essential for a number of ecological tasks, from biodiversity estimation and forest structure variable retrieval to feature extraction by edge detection and the generation of multifractal surfaces as neutral models for, e.g., feature change assessment. Hence, measuring ecological complexity over space becomes crucial in macroecology and geography. Many geospatial tools have been advocated in spatial ecology to estimate ecosystem complexity and its changes over space and time. Among these tools, free and open source options in particular offer opportunities to guarantee the robustness of algorithms and reproducibility. In this paper we summarize the most straightforward measures of spatial complexity available in the Free and Open Source Software GRASS GIS, relating them to key ecological patterns and processes.
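As an illustration of what a simple spatial complexity measure looks like, the sketch below computes Shannon's diversity index in a moving window over a small categorical raster. It is plain Python and does not use or reproduce any GRASS GIS module. The abstract does not list which measures the paper covers, so the choice of Shannon's index, the function names and the window size are assumptions.

```python
import math
from collections import Counter


def shannon_index(values):
    """Shannon's H = -sum(p_i * ln(p_i)) over the category proportions p_i."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def moving_window_shannon(grid, size=3):
    """Shannon's H for every full size x size window of a categorical grid.

    Edge cells without a complete window are skipped, so the output is
    smaller than the input by size - 1 rows and columns.
    """
    half = size // 2
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(half, n_rows - half):
        row = []
        for c in range(half, n_cols - half):
            window = [grid[i][j]
                      for i in range(r - half, r + half + 1)
                      for j in range(c - half, c + half + 1)]
            row.append(shannon_index(window))
        out.append(row)
    return out


# Illustrative land-cover raster with two categories.
raster = [
    [1, 1, 2],
    [1, 1, 2],
    [1, 1, 2],
]
print(moving_window_shannon(raster))
```

A window covering a single category yields 0, and the value grows as categories become more numerous and more evenly mixed.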