998 resultados para speech segmentation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) represents an established method for the detection and diagnosis of breast lesions. While mass-like enhancing lesions can be easily categorized according to the Breast Imaging Reporting and Data System (BI-RADS) MRI lexicon, a majority of diagnostically challenging lesions, the so called non-mass-like enhancing lesions, remain both qualitatively as well as quantitatively difficult to analyze. Thus, the evaluation of kinetic and/or morphological characteristics of non-masses represents a challenging task for an automated analysis and is of crucial importance for advancing current computer-aided diagnosis (CAD) systems. Compared to the well-characterized mass-enhancing lesions, non-masses have no well-defined and blurred tumor borders and a kinetic behavior that is not easily generalizable and thus discriminative for malignant and benign non-masses. To overcome these difficulties and pave the way for novel CAD systems for non-masses, we will evaluate several kinetic and morphological descriptors separately and a novel technique, the Zernike velocity moments, to capture the joint spatio-temporal behavior of these lesions, and additionally consider the impact of non-rigid motion compensation on a correct diagnosis.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We have examined the ability of observers to parse bimodal local-motion distributions into two global motion surfaces, either overlapping (yielding transparent motion) or spatially segregated (yielding a motion boundary). The stimuli were random dot kinematograms in which the direction of motion of each dot was drawn from one of two rectangular probability distributions. A wide range of direction distribution widths and separations was tested. The ability to discriminate the direction of motion of one of the two motion surfaces from the direction of a comparison stimulus was used as an objective test of the perception of two discrete surfaces. Performance for both transparent and spatially segregated motion was remarkably good, being only slightly inferior to that achieved with a single global motion surface. Performance was consistently better for segregated motion than for transparency. Whereas transparent motion was only perceived with direction distributions which were separated by a significant gap, segregated motion could be seen with abutting or even partially overlapping direction distributions. For transparency, the critical gap increased with the range of directions in the distribution. This result does not support models in which transparency depends on detection of a minimum size of gap defining a bimodal direction distribution. We suggest, instead, that the operations which detect bimodality are scaled (in the direction domain) with the overall range of distributions. This yields a flexible, adaptive system that determines whether a gap in the direction distribution serves as a segmentation cue or is smoothed as part of a unitary computation of global motion.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The mechanisms underlying the parsing of a spatial distribution of velocity vectors into two adjacent (spatially segregated) or overlapping (transparent) motion surfaces were examined using random dot kinematograms. Parsing might occur using either of two principles. Surfaces might be defined on the basis of similarity of motion vectors and then sharp perceptual boundaries drawn between different surfaces (continuity-based segmentation). Alternatively, detection of a high gradient of direction or speed separating the motion surfaces might drive the process (discontinuity-based segmentation). To establish which method is used, we examined the effect of blurring the motion direction gradient. In the case of a sharp direction gradient, each dot had one of two directions differing by 135°. With a shallow gradient, most dots had one of two directions but the directions of the remainder spanned the range between one motion-defined surface and the other. In the spatial segregation case the gradient defined a central boundary separating two regions. In the transparent version the dots were randomly positioned. In both cases all dots moved with the same speed and existed for only two frames before being randomly replaced. The ability of observers to parse the motion distribution was measured in terms of their ability to discriminate the direction of one of the two surfaces. Performance was hardly affected by spreading the gradient over at least 25% of the dots (corresponding to a 1° strip in the segregation case). We conclude that detection of sharp velocity gradients is not necessary for distinguishing different motion surfaces.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Research on speech and emotion is moving from a period of exploratory research into one where there is a prospect of substantial applications, notably in human-computer interaction. Progress in the area relies heavily on the development of appropriate databases. This paper addresses the issues that need to be considered in developing databases of emotional speech, and shows how the challenge of developing apropriate databases is being addressed in three major recent projects - the Belfast project, the Reading-Leeds project and the CREST-ESP project. From these and other studies the paper draws together the tools and methods that have been developed, addresses the problems that arise and indicates the future directions for the development of emotional speech databases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper provides a summary of our studies on robust speech recognition based on a new statistical approach – the probabilistic union model. We consider speech recognition given that part of the acoustic features may be corrupted by noise. The union model is a method for basing the recognition on the clean part of the features, thereby reducing the effect of the noise on recognition. To this end, the union model is similar to the missing feature method. However, the two methods achieve this end through different routes. The missing feature method usually requires the identity of the noisy data for noise removal, while the union model combines the local features based on the union of random events, to reduce the dependence of the model on information about the noise. We previously investigated the applications of the union model to speech recognition involving unknown partial corruption in frequency band, in time duration, and in feature streams. Additionally, a combination of the union model with conventional noise-reduction techniques was studied, as a means of dealing with a mixture of known or trainable noise and unknown unexpected noise. In this paper, a unified review, in the context of dealing with unknown partial feature corruption, is provided into each of these applications, giving the appropriate theory and implementation algorithms, along with an experimental evaluation.