998 results for Speech segmentation


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Tissue microarray (TMA) is a high-throughput analysis tool for identifying new diagnostic and prognostic markers in human cancers. However, standard automated methods for tumour detection on both routine histochemical and immunohistochemistry (IHC) images are underdeveloped. This paper presents a robust automated tumour cell segmentation model that can be applied to both routine histochemical tissue slides and IHC slides, and that performs finer pixel-based segmentation than the blob-based or area-based segmentation of existing approaches. The presented technique greatly improves the process of TMA construction and plays an important role in automated IHC quantification in biomarker analysis, where excluding stroma areas is critical. Under the finest pixel-based evaluation (instead of area-based or object-based evaluation), the experimental results show that the proposed method achieves 80% and 78% accuracy on two different types of pathological virtual slides, i.e., routine histochemical H&E and IHC images, respectively. The presented technique greatly reduces the labor-intensive workload of pathologists, substantially speeds up TMA construction, and opens the possibility of fully automated IHC quantification.
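To make the pixel-based evaluation mentioned in this abstract concrete, here is a minimal sketch of how a pixel-level accuracy score could be computed from two binary masks. The function name `pixel_accuracy` and the toy masks are assumptions for illustration only; the abstract does not specify the exact metric definition or data format the authors used.

```python
def pixel_accuracy(predicted, reference):
    """Return the fraction of pixels whose predicted label equals the reference label.

    Both arguments are 2-D lists of labels (1 = tumour, 0 = non-tumour here).
    This is a generic pixel-wise accuracy, not necessarily the paper's metric.
    """
    same_shape = len(predicted) == len(reference) and all(
        len(p_row) == len(r_row) for p_row, r_row in zip(predicted, reference)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")

    total = sum(len(row) for row in reference)
    if total == 0:
        raise ValueError("masks must not be empty")

    # Count every pixel position where the two masks agree.
    correct = sum(
        p == r
        for p_row, r_row in zip(predicted, reference)
        for p, r in zip(p_row, r_row)
    )
    return correct / total


# Hypothetical 2x4 masks: one pixel out of eight disagrees.
predicted = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
reference = [[1, 1, 1, 0],
             [1, 0, 0, 0]]

print(pixel_accuracy(predicted, reference))  # 0.875
```

Unlike an area-based or object-based score, this measure penalizes every individual mislabeled pixel, which is why it is the strictest of the three evaluation styles the abstract contrasts.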

Relevance:

20.00% 20.00%

Publisher:

Abstract:

There are multiple reasons to expect that recognising the verbal content of emotional speech will be a difficult problem, and recognition rates reported in the literature are in fact low. Including information about prosody improves recognition rate for emotions simulated by actors, but its relevance to the freer patterns of spontaneous speech is unproven. This paper shows that recognition rate for spontaneous emotionally coloured speech can be improved by using a language model based on increased representation of emotional utterances. The models are derived by adapting an already existing corpus, the British National Corpus (BNC). An emotional lexicon is used to identify emotionally coloured words, and sentences containing these words are recombined with the BNC to form a corpus with a raised proportion of emotional material. Using a language model based on that technique improves recognition rate by about 20%. (c) 2005 Elsevier Ltd. All rights reserved.
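The corpus construction described in this abstract can be illustrated with a small sketch: select the sentences that contain a word from an emotional lexicon, then add them back to the base corpus so that emotional material makes up a larger share. The names `enrich_corpus` and `emotional_fraction`, the `copies` parameter, and the toy sentences are all assumptions; the abstract does not say how the recombination with the BNC was actually weighted.

```python
import re


def contains_emotional_word(sentence, lexicon):
    """True if any lower-cased word of the sentence is in the lexicon."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return bool(words & lexicon)


def enrich_corpus(sentences, emotional_lexicon, copies=1):
    """Return the base corpus plus `copies` extra copies of each emotional sentence.

    `copies` is a hypothetical knob for how strongly emotional material is raised.
    """
    lexicon = {word.lower() for word in emotional_lexicon}
    emotional = [s for s in sentences if contains_emotional_word(s, lexicon)]
    return list(sentences) + emotional * copies


def emotional_fraction(sentences, emotional_lexicon):
    """Share of sentences that contain at least one lexicon word."""
    lexicon = {word.lower() for word in emotional_lexicon}
    hits = sum(contains_emotional_word(s, lexicon) for s in sentences)
    return hits / len(sentences)


# Toy stand-ins for the base corpus and the emotional lexicon.
sentences = [
    "The train leaves at noon.",
    "I am furious about the delay.",
    "She was delighted with the news.",
    "The report is on the desk.",
]
lexicon = {"furious", "delighted"}

enriched = enrich_corpus(sentences, lexicon, copies=2)
print(emotional_fraction(sentences, lexicon))  # 0.5
print(emotional_fraction(enriched, lexicon))   # 0.75
```

A language model trained on the enriched list would see emotionally coloured word sequences more often than one trained on the base corpus alone, which is the effect the abstract attributes to its adapted corpus.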

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Speech recognition and language analysis of spontaneous speech arising in naturally spoken conversations are becoming the subject of much research. However, there is a shortage of spontaneous speech corpora that are freely available for academics. We therefore undertook the building of a natural conversation speech database, recording over 200 hours of conversations in English by over 600 local university students. With few exceptions, the students used their own cell phones from their own rooms or homes to speak to one another, and they were permitted to speak on any topic they chose. Although they knew that they were being recorded and that they would receive a small payment, their conversations in the corpus are probably very close to being natural and spontaneous. This paper describes a detailed case study of the problems we faced and the methods we used to make the recordings and control the collection of these social science data on a limited budget.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper studies single-channel speech separation, assuming unknown, arbitrary temporal dynamics for the speech signals to be separated. A data-driven approach is described, which matches each mixed speech segment against a composite training segment to separate the underlying clean speech segments. To improve separation accuracy, the new approach seeks and separates the longest mixed speech segments that have matching composite training segments. Lengthening the mixed speech segments to be matched reduces the uncertainty of the constituent training segments, and hence the separation error. For convenience, we call the new approach Composition of Longest Segments, or CLOSE. The CLOSE method includes a data-driven approach to model long-range temporal dynamics of speech signals, and a statistical approach to identify the longest mixed speech segments with matching composite training segments. Experiments are conducted on the Wall Street Journal database, separating mixtures of two simultaneous large-vocabulary speech utterances spoken by two different speakers. The results are evaluated using various objective and subjective measures, including the challenging task of large-vocabulary continuous speech recognition. It is shown that the new separation approach leads to significant improvement in all these measures.

Relevance:

20.00% 20.00%

Publisher: