Biblioteca Digital

909 results for audio segmentation

Speaker diarization: Segmentation and clustering of speeches

Relevance:

70.00%

Publisher:

Abstract:

Speaker diarization is the process of sorting speeches according to the speaker. Diarization helps to search and retrieve what a certain speaker uttered in a meeting. Applications of diarization systems extend to domains other than meetings, for example, lectures, telephone, television, and radio. In addition, diarization enhances the performance of several speech technologies such as speaker recognition, automatic transcription, and speaker tracking. Methodologies previously used in developing diarization systems are discussed. Prior results and techniques are studied and compared. Methods such as Hidden Markov Models and Gaussian Mixture Models that are used in speaker recognition and other speech technologies are also used in speaker diarization. The objective of this thesis is to develop a speaker diarization system for the meeting domain. The experimental part of this work indicates that zero-crossing rate can be used effectively to break the audio stream into segments, and that adaptive Gaussian models adequately fit short audio segments. Results show that 35 Gaussian models and an average segment length of one second are the optimum values for building a diarization system for the tested data. Segments uttered by the same speaker are merged by bottom-up clustering using a new approach of categorizing the mixture weights.
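The abstract above names zero-crossing rate (ZCR) as the cue for splitting the audio stream. The following is a minimal sketch of that general idea, not the thesis's actual procedure: the function names, the frame length, the hop size, and the jump threshold are all assumptions made for illustration.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len, hop):
    """ZCR of each frame of `frame_len` samples, advancing by `hop` samples."""
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def change_points(zcr_values, threshold):
    """Frame indices where ZCR differs from the previous frame by more than
    `threshold` (hypothetical rule; the thesis does not state its criterion)."""
    return [
        i for i in range(1, len(zcr_values))
        if abs(zcr_values[i] - zcr_values[i - 1]) > threshold
    ]


# Toy signal: eight rapidly alternating samples followed by eight constant ones.
signal = [1, -1] * 4 + [1] * 8
zcr = frame_zcr(signal, frame_len=8, hop=8)
boundaries = change_points(zcr, threshold=0.5)
```

On real audio the frames would typically span tens of milliseconds, and the threshold would need tuning on development data.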

New classifier fusion technique applied to improving audio segmentation

Relevance:

70.00%

Publisher:

Abstract:

This article presents a new classifier fusion algorithm based on each classifier's confusion matrix, from which its precision and recall values are extracted. The only data required to apply this new fusion method are the classes or labels assigned by each system and the reference classes in the development portion of the database. The proposed algorithm is described, and results are reported for combining the outputs of two systems that took part in the Albayzin 2012 audio segmentation evaluation campaign. The robustness of the algorithm was verified, obtaining a relative reduction in segmentation error of 6.28% when fusing the systems with the lowest and highest error rates among those submitted to the evaluation.
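As an illustration of the inputs this abstract describes, the sketch below derives per-class precision and recall from a confusion matrix and then resolves disagreements between two systems. The fusion rule used here (trust the system whose predicted label has the higher development-set precision) is an assumption chosen for simplicity; the abstract does not state the paper's actual rule, and the class names and counts are invented.

```python
def precision_recall(confusion, labels):
    """Per-class (precision, recall) from confusion[ref][hyp] counts."""
    stats = {}
    for c in labels:
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in labels)  # column sum
        actual = sum(confusion[c][h] for h in labels)     # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        stats[c] = (precision, recall)
    return stats


def fuse(label_a, label_b, stats_a, stats_b):
    """Hypothetical rule: on disagreement, keep the label whose system
    has the higher development-set precision for that label."""
    if label_a == label_b:
        return label_a
    return label_a if stats_a[label_a][0] >= stats_b[label_b][0] else label_b


labels = ["speech", "music"]
# Invented development-set confusion matrices for two systems.
conf_a = {"speech": {"speech": 8, "music": 2}, "music": {"speech": 1, "music": 9}}
conf_b = {"speech": {"speech": 6, "music": 4}, "music": {"speech": 4, "music": 6}}
stats_a = precision_recall(conf_a, labels)
stats_b = precision_recall(conf_b, labels)
fused = fuse("speech", "music", stats_a, stats_b)
```

Only the labels from each system and the reference labels are needed to build the matrices, which matches the data requirement stated in the abstract.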

Detection of Raga-characteristic phrases from Hindustani Classical Music Audio

Relevance:

30.00%

Publisher:

Abstract:

Melodic motifs form essential building blocks in Indian Classical music. The motifs, or key phrases, provide strong cues to the identity of the underlying raga in both Hindustani and Carnatic styles of Indian music. Automatic identification and clustering of similar motifs is relevant in this context. The inherent variations across instances of a characteristic phrase in a bandish (composition) performance make it challenging to identify similar phrases in a performance. A nyas svara (long note) marks the ending of these phrases. The proposed method segments phrases by identifying nyas and computes similarity with the reference characteristic phrase.

Visual Speech Segmentation: Using Facial Cues to Locate Word Boundaries in Continuous Speech

Relevance:

30.00%

Publisher:

Abstract:

Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. Thus, we created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative to word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition.

A comparison of open-source segmentation architectures for dealing with imperfect data from the media in speech synthesis

Relevance:

30.00%

Publisher:

Abstract:

Traditional Text-To-Speech (TTS) systems have been developed using especially-designed non-expressive scripted recordings. In order to develop a new generation of expressive TTS systems in the Simple4All project, real recordings from the media should be used for training new voices with a whole new range of speaking styles. However, for processing this more spontaneous material, the new systems must be able to deal with imperfect data (multi-speaker recordings, background and foreground music and noise), filtering out low-quality audio segments and creating mono-speaker clusters. In this paper we compare several architectures for combining speaker diarization and music and noise detection which improve the precision and overall quality of the segmentation.

Method for synchronizing video cameras using the audio track

Relevance:

20.00%

Publisher:

Abstract:

Universidade Estadual de Campinas. Faculdade de Educação Física

Hearing measurements of parents of individuals with autosomal recessive hearing loss

Relevance:

20.00%

Publisher:

Abstract:

BACKGROUND: audiological evaluation of parents of individuals with autosomal recessive hearing loss. AIM: to study the audiological profile of parents of individuals with hearing loss of autosomal recessive inheritance, inferred from family history or from molecular tests that detected a mutation in the GJB2 gene, which encodes Connexin 26. METHOD: 36 individuals aged 30 to 60 years were evaluated and divided into two groups: a control group, with no hearing complaints and no family history of hearing loss, and a study group composed of parents heterozygous for genes of nonspecific autosomal recessive deafness or heterozygous carriers of a mutation in the Connexin 26 gene. All underwent pure-tone audiometry (0.25 to 8 kHz), high-frequency audiometry (9 to 20 kHz) and distortion product otoacoustic emissions (DPOAE). RESULTS: there were significant differences in DPOAE amplitude at 1001 and 1501 Hz between the groups, with greater amplitude in the control group. There was no significant difference between the groups in pure-tone thresholds from 0.25 to 20 kHz. CONCLUSION: DPOAE were more effective than pure-tone audiometry in detecting hearing differences between the groups. Further research is needed to verify the reliability of these data.

Factors related to self-perceived hearing among older adults in the city of São Paulo - Projeto SABE

Relevance:

20.00%

Publisher:

Abstract:

This article reports the influence of sociodemographic and health factors on self-perceived hearing among older adults in the "Saúde, Bem-Estar e Envelhecimento" project (Projeto SABE) in the city of São Paulo. The study included 2,143 individuals aged 60 years and over. An ordinal logistic regression model that accounted for the sample design was used in the multivariable analysis. Increasing age, male sex, living with others, reporting dizziness, fair/poor memory, and fair or poor health increased the odds of poor self-perceived hearing. Knowledge of self-perceived hearing and its related factors is important for assessing older adults' quality of life and the need for auditory rehabilitation.

Effects of meal size and proximal-distal segmentation on gastric activity

Relevance:

20.00%

Publisher:

Abstract:

AIM: To evaluate the effects of meal size and three segmentations on intragastric distribution of the meal and gastric motility, by scintigraphy. METHODS: Twelve healthy volunteers were randomly assessed, twice, by scintigraphy. The test meal consisted of 60 or 180 mL of yogurt labeled with 64 MBq (99m)Tc-tin colloid. Anterior and posterior dynamic frames were simultaneously acquired for 18 min and all data were analyzed in MatLab. Three proximal-distal segmentations using regions of interest were adopted for both meals. RESULTS: Intragastric distribution of the meal between the proximal and distal compartments was strongly influenced by the way in which the stomach was divided, showing greater proximal retention after the 180 mL. An important finding was that both dominant frequencies (1 and 3 cpm) were simultaneously recorded in the proximal and distal stomach; however, the power ratio of those dominant frequencies varied in agreement with the segmentation adopted and was independent of the meal size. CONCLUSION: It was possible to simultaneously evaluate the static intragastric distribution and phasic contractility from the same recording using our scintigraphic approach. (C) 2010 Baishideng. All rights reserved.

Gene Expression Noise in Spatial Patterning: hunchback Promoter Structure Affects Noise Amplitude and Distribution in Drosophila Segmentation

Relevance:

20.00%

Publisher:

Abstract:

Positional information in developing embryos is specified by spatial gradients of transcriptional regulators. One of the classic systems for studying this is the activation of the hunchback (hb) gene in early fruit fly (Drosophila) segmentation by the maternally-derived gradient of the Bicoid (Bcd) protein. Gene regulation is subject to intrinsic noise which can produce variable expression. This variability must be constrained in the highly reproducible and coordinated events of development. We identify means by which noise is controlled during gene expression by characterizing the dependence of hb mRNA and protein output noise on hb promoter structure and transcriptional dynamics. We use a stochastic model of the hb promoter in which the number and strength of Bcd and Hb (self-regulatory) binding sites can be varied. Model parameters are fit to data from WT embryos, the self-regulation mutant hb(14F), and lacZ reporter constructs using different portions of the hb promoter. We have corroborated model noise predictions experimentally. The results indicate that WT (self-regulatory) Hb output noise is predominantly dependent on the transcription and translation dynamics of its own expression, rather than on Bcd fluctuations. The constructs and mutant, which lack self-regulation, indicate that the multiple Bcd binding sites in the hb promoter (and their strengths) also play a role in buffering noise. The model is robust to the variation in Bcd binding site number across a number of fly species. This study identifies particular ways in which promoter structure and regulatory dynamics reduce hb output noise. Insofar as many of these are common features of genes (e. g. multiple regulatory sites, cooperativity, self-feedback), the current results contribute to the general understanding of the reproducibility and determinacy of spatial patterning in early development.

Improvements on ICA mixture models for image pre-processing and segmentation

Relevance:

20.00%

Publisher:

Abstract:

Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Especially in image data applications, self-organizing methods for unsupervised classification have been successfully applied for clustering pixels or groups of pixels in order to perform segmentation tasks. The first important contribution of this paper is the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications to the Independent Component Analysis Mixture Model (ICAMM). Such improvements were proposed by considering some of the model's limitations as well as by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which is based on combining Sparse Code Shrinkage (SCS) for image denoising with the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied to segment images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.

AUTOMATIC CORONARY WALL SEGMENTATION IN INTRAVASCULAR ULTRASOUND IMAGES USING BINARY MORPHOLOGICAL RECONSTRUCTION

Relevance:

20.00%

Publisher:

Abstract:

Intravascular ultrasound (IVUS) image segmentation can provide more detailed vessel and plaque information, resulting in better diagnostics, evaluation and therapy planning. A novel automatic segmentation proposal is described herein; the method relies on a binary morphological object reconstruction to segment the coronary wall in IVUS images. First, a preprocessing followed by a feature extraction block are performed, allowing for the desired information to be extracted. Afterward, binary versions of the desired objects are reconstructed, and their contours are extracted to segment the image. The effectiveness is demonstrated by segmenting 1300 images, in which the outcomes had a strong correlation to their corresponding gold standard. Moreover, the results were also corroborated statistically by having as high as 92.72% and 91.9% of true positive area fraction for the lumen and media adventitia border, respectively. In addition, this approach can be adapted easily and applied to other related modalities, such as intravascular optical coherence tomography and intravascular magnetic resonance imaging. (E-mail: matheuscardosomg@hotmail.com) (C) 2011 World Federation for Ultrasound in Medicine & Biology.
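For readers unfamiliar with the operation named in this abstract, binary morphological reconstruction keeps exactly those connected regions of a mask image that are touched by a marker image. The sketch below shows the generic operation only, under the assumption of 4-connectivity; it is not the paper's IVUS pipeline, and the function name and toy images are invented for illustration.

```python
from collections import deque


def binary_reconstruct(marker, mask):
    """Reconstruction by dilation of `marker` under `mask` (4-connectivity).

    Both inputs are equal-sized lists of rows holding 0/1 values. The result
    contains every mask pixel connected to at least one marker pixel.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed with marker pixels that lie inside the mask.
    for r in range(rows):
        for c in range(cols):
            if marker[r][c] and mask[r][c]:
                out[r][c] = 1
                queue.append((r, c))
    # Grow the seeds through the mask until nothing changes.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not out[nr][nc]:
                out[nr][nc] = 1
                queue.append((nr, nc))
    return out


mask = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
marker = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
reconstructed = binary_reconstruct(marker, mask)
```

In this toy case the left object is recovered in full from a single marker pixel, while the unmarked object on the right is discarded.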

Unsupervised cell nucleus segmentation with active contours

Relevance:

20.00%

Publisher:

Abstract:

The task of segmenting cell nuclei from cytoplasm in conventional Papanicolaou (Pap) stained cervical cell images is a classical image analysis problem which may prove to be crucial to the development of successful systems which automate the analysis of Pap smears for detection of cancer of the cervix. Although simple thresholding techniques will extract the nucleus in some cases, accurate unsupervised segmentation of very large image databases is elusive. Conventional active contour models as introduced by Kass, Witkin and Terzopoulos (1988) offer a number of advantages in this application, but suffer from the well-known drawbacks of initialisation and minimisation. Here we show that a Viterbi search-based dual active contour algorithm is able to overcome many of these problems and achieve over 99% accurate segmentation on a database of 20 130 Pap stained cell images. (C) 1998 Elsevier Science B.V. All rights reserved.

Personal Values as Basis for Strategic Segmentation: a study with Professionals from Sao Paulo

Relevance:

20.00%

Publisher:

Abstract:

An important segmentation basis used by firms is consumers' personal values, which are investigated in this study. A descriptive research design was used, with survey data collected from a sample of executives from Sao Paulo who are considered potential buyers of high-value, innovative goods. An exploratory factor analysis was employed to reduce the values scale used, and a cluster analysis was performed to identify groups of executives according to the importance attached to different personal values. A similarity was observed between the three personal values dimensions, named Civility (concerns about having a good conduct before society according to social rules of interaction), Self-Direction (intellectual aspects and practical orientation in their conduct) and Conformity (restriction of actions, inclinations and impulses that are likely to harm others and would violate expectations), and those reported in Rokeach's theory of instrumental personal values. Furthermore, three groups of executives were identified (good conduct group, low restriction group and high restriction group). The differences observed in the importance of personal values, represented here by the dimensions called Civility, Self-Direction and Conformity, can lead to different buying behaviors and product preferences. Based on the results of this study, companies could adapt their current and new product offers, as well as their communication, to better serve these segments of executives from Sao Paulo.

Targeted! Population segmentation, electronic surveillance and governing the unemployed in Australia

Relevance:

20.00%

Publisher:

Abstract:

Targeting is increasingly used to manage people. It operates by segmenting populations and providing different levels of opportunities and services to these groups. Each group is subject to different levels of surveillance and scrutiny. This article examines the deployment of targeting in Australian social security. Three case studies of targeting are presented in Australia's management of benefit overpayment and fraud, the distribution of employment services and the application of workfare. In conceptualizing surveillance as governance, the analysis examines the rationalities, technologies and practices that make targeting thinkable, practicable and achievable. In the case studies, targeting is variously conceptualized and justified by calculative risk discourses, moral discourses of obligation and notions of welfare dependency. Advanced information technologies are also seen as particularly important in giving rise to the capacity to think about and act on population segments.
