992 results for speech features


Relevance:

20.00% 20.00%

Publisher:

Abstract:

In a Text-to-Speech system based on time-domain techniques that employ pitch-synchronous manipulation of the speech waveforms, one of the most important factors affecting output quality is how the analysis points of the speech signal, i.e. the analysis pitchmarks, are estimated and where they are actually placed. In this paper we present our methodology for calculating the pitchmarks of a speech waveform: a pitchmark detection algorithm which, after thorough experimentation and comparison with other algorithms, proved to perform better with our TD-PSOLA-based Text-to-Speech synthesizer (Time-Domain Pitch-Synchronous Overlap-Add Text-to-Speech system).

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Most HMM-based TTS systems use a hard voiced/unvoiced classification to produce a discontinuous F0 signal, which is used to generate the source excitation. When a mixed source excitation is used, this decision can be based on two different sources of information: the state-specific MSD-prior of the F0 models, and/or the frame-specific features generated by the aperiodicity model. This paper examines the meaning of these variables in the synthesis process, their interaction, and how they affect the perceived quality of the generated speech. The results of several perceptual experiments show that when using mixed excitation, subjects consistently prefer samples with very few or no false unvoiced errors, whereas a reduction in the rate of false voiced errors does not produce any perceptual improvement. This suggests that rather than using any form of hard voiced/unvoiced classification, e.g., the MSD-prior, it is better for synthesis to use a continuous F0 signal and rely on the frame-level soft voiced/unvoiced decision of the aperiodicity model. © 2011 IEEE.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

For speech recognition, mismatches between training and testing conditions for speaker and noise are normally handled separately. The work presented in this paper aims to apply speaker adaptation and model-based noise compensation jointly by embedding speaker adaptation as part of the noise mismatch function. The proposed method gives faster and more optimal adaptation than compensating for these two factors separately. It is also more consistent with the basic assumptions of speaker and noise adaptation. Experimental results show significant and consistent gains from the proposed method. © 2011 IEEE.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Fundamental frequency, or F0, is critical for high-quality HMM-based speech synthesis. Traditionally, F0 values are considered to depend on a binary voicing decision such that they are continuous in voiced regions and undefined in unvoiced regions. The multi-space distribution HMM (MSDHMM) has been used for modelling the discontinuous F0. Recently, a continuous F0 modelling framework has been proposed and shown to be effective, where continuous F0 observations are assumed to always exist and voicing labels are explicitly modelled by an independent stream. In this paper, a refined continuous F0 modelling approach is proposed. Here, F0 values are assumed to be dependent on voicing labels and both are jointly modelled in a single stream. Due to the enforced dependency, the new method can effectively reduce the voicing classification error. Subjective listening tests also demonstrate that the new approach can yield significant improvements in the naturalness of the synthesised speech. A dynamic random unvoiced F0 generation method is also investigated. Experiments show that it has a significant effect on the quality of synthesised speech. © 2011 IEEE.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Structured precision modelling is an important approach to improving the intra-frame correlation modelling of the standard HMM, where Gaussian mixture models with diagonal covariances are used. Previous work has focused entirely on direct structured representation of the precision matrices. In this paper, a new framework is proposed in which the structure of the Cholesky square root of the precision matrix is investigated, referred to as Cholesky Basis Superposition (CBS). Each Cholesky matrix associated with a particular Gaussian distribution is represented as a linear combination of a set of Gaussian-independent basis upper-triangular matrices. Efficient optimization methods are derived for both the combination weights and the basis matrices. Experiments on a Chinese dictation task showed that the proposed approach significantly outperformed both direct structured precision modelling with a similar number of parameters and full covariance modelling. © 2011 IEEE.
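The superposition described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the 2x2 basis matrices, the weights, and the convention P = C^T C for recovering the precision matrix from its upper-triangular Cholesky factor are all assumptions chosen for illustration.

```python
# Illustrative sketch of Cholesky Basis Superposition (CBS).
# Assumptions (not from the abstract): the toy basis matrices, the weights,
# and the convention that the precision matrix is P = C^T C.

def combine_bases(weights, bases):
    """Return C = sum_k weights[k] * bases[k] for square matrices (lists of lists)."""
    d = len(bases[0])
    return [
        [sum(w * b[i][j] for w, b in zip(weights, bases)) for j in range(d)]
        for i in range(d)
    ]


def precision_from_cholesky(c):
    """Return P = C^T C for an upper-triangular factor C."""
    d = len(c)
    return [
        [sum(c[k][i] * c[k][j] for k in range(d)) for j in range(d)]
        for i in range(d)
    ]


# Basis matrices shared by all Gaussians (hypothetical values).
BASES = [
    [[1.0, 0.0], [0.0, 1.0]],  # diagonal part
    [[0.0, 1.0], [0.0, 0.0]],  # off-diagonal (correlation) part
]

# Gaussian-specific combination weights (hypothetical values).
weights = [2.0, 0.5]

cholesky = combine_bases(weights, BASES)
precision = precision_from_cholesky(cholesky)
```

Because each Gaussian stores only its weights while the basis matrices are shared, the per-Gaussian parameter count stays small, and building P from a Cholesky factor keeps it symmetric and positive semi-definite by construction.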

Relevance:

20.00% 20.00%

Publisher:

Abstract:

On the basis of observational data on water temperature and salinity, the mean seasonal geostrophic circulation in the open region of the South China Sea (SCS) was computed by the dynamic method relative to the 800 decibar reference surface. The results of the computation lead to the following observations. In both main monsoons (winter and summer) there are two main geostrophic eddies: an anticlockwise eddy in the northern and northwestern part of the SCS, and a clockwise eddy in the southern part, with corresponding divergent and convergent zones. The main frontal zones run along the middle latitudes of the sea from the southern continental shelf of Vietnam to the area west of Luzon Island. The strength and stability of the current are higher in winter than in summer. In winter the Kuroshio has a sufficiently strong branch intruding into the SCS through the Bashi Strait, creating a water structure in the sea similar to that of the Northwest Pacific subtropical and tropical regions. In summer the Kuroshio water can intrude directly only into the area southwest of Taiwan.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Based on the hydrodynamic model and the Shore Protection Manual (CERC, USA), we calculated wave field characteristics under typical wind conditions (wind velocity of 13 m/s in the most frequent direction of the wind regime). Measured and calculated wave parameters were compared and found to agree with each other. The following main wave characteristics were calculated:
- Pattern of the refracted wave field.
- Average wave height field.
- Longshore current velocity field in the surf zone.
From the distribution of wave field characteristics in the research areas, the findings can be summarized as follows:
- The formation of wave fields differed between the research areas because of local differences in hydrometeorological conditions, river discharge, bottom relief…
- In the Cuadai (Dai mouth, Hoian) area, with the incident wave field from the N direction, waves have caused serious changes to the coastline. The coastline in the whole region, especially south of the mouth, was eroded, and the foreland north of the mouth experienced deposition.
- In the Cai river mouth (Nhatrang) area, with the incident wave field from the E direction, waves have strongly and directly affected the inshore and channel structure.
- In the Phanthiet bay area, with the incident wave field from the SW direction, waves have strongly affected the whole shoreline from Da point to Ne point and caused serious erosion.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

728 human genes were divided into four groups according to the GC content of their coding sequences (from GC<0.43 to GC>0.58). Examination of synonymous-codon bias in the four groups shows that NTG (N represents any of the bases T, A, C, G) is most favored and NCG
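The grouping and codon counts mentioned in this entry can be sketched as follows. This is only an illustration: the abstract gives just the outermost GC thresholds (0.43 and 0.58), so the middle boundary of 0.50 used here is a hypothetical placeholder, as are the function names and the example sequence.

```python
# Illustrative sketch: GC content of a coding sequence, assignment to one of
# four GC groups, and counts of codons ending in a given dinucleotide.
# Assumption: the middle boundary (0.50) is hypothetical; the abstract only
# states the outermost thresholds GC<0.43 and GC>0.58.

def gc_content(coding_sequence):
    """Return the fraction of G and C bases in the sequence."""
    seq = coding_sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc_group(gc, low=0.43, mid=0.50, high=0.58):
    """Return a group index 0..3, from lowest to highest GC content."""
    if gc < low:
        return 0
    if gc < mid:
        return 1
    if gc <= high:
        return 2
    return 3


def count_codons_ending_with(coding_sequence, suffix):
    """Count in-frame codons whose second and third bases equal `suffix`."""
    seq = coding_sequence.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return sum(codon[1:] == suffix for codon in codons)


example = "ATGCTGACG"  # hypothetical sequence: codons ATG, CTG, ACG
ntg_count = count_codons_ending_with(example, "TG")
ncg_count = count_codons_ending_with(example, "CG")
```

Comparing such NTG and NCG counts within each GC group is one simple way to look for the kind of synonymous-codon bias the entry refers to.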

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper introduces a method by which intuitive feature entities can be created from ILP (InterLevel Product) coefficients. The ILP transform is a pyramid of decimated complex-valued coefficients at multiple scales, derived from dual-tree complex wavelets, whose phases indicate the presence of different feature types (edges and ridges). We use an Expectation-Maximization algorithm to cluster large ILP coefficients that are spatially adjacent and similar in phase. We then demonstrate the relationship that these clusters possess with respect to observable image content, and conclude with a look at potential applications of these clusters, such as rotation- and scale-invariant object recognition. © 2005 IEEE.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this study, Iranian and French male and female Oncorhynchus mykiss broodstocks were divided into two groups of 50 and 24, respectively, at the Research Center of Genetics and Breeding of Coldwater Fishes, Yasouj, Iran, and their genetic structure was investigated using 6 microsatellite markers. Then 19 morphometric and 5 meristic characters of the broodstock were measured and compared between the two populations. As the broodstock matured, 1:1 (female:male) fertilizations were randomly assigned and carried out in 25 and 12 Iranian and French treatments, respectively. Reproductive parameters were recorded for every family. The average number of observed alleles in the Iranian and French stocks was 6.68 and 6.83, respectively. The average number of effective alleles in the Iranian and French stocks was 3.13 and 3.45, respectively. The fixation index Fst, calculated from allelic frequencies between the two stocks, was 0.058, indicating a significant difference between the two stocks. Morphometric analysis showed a significant difference between the two stocks in 8 characters. Meristic characters showed no significant difference between the broodstock groups. The eyed percentage for the French broodstock was calculated as zero, and these were excluded. For the Iranian broodstocks, the fertilization rate (100-0), the eyed percentage (98-0), the hatch rate (98-0), the average fecundity (4114.708), the average egg size (4.88 mm), and survival in the first three months (19-73%) were calculated. Considering the quality of eggs and larvae at different stages, and after selection between and within families, 10 treatments remained and are kept as future broodstocks. The relationships between fecundity and egg size, fecundity and weight, fecundity and length, and egg size and weight were examined using regression. The results showed that fecundity was influenced more by weight and productive length. This research is a first step toward identifying the broodstock in our country.
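The two genetic statistics reported above can be illustrated with a minimal sketch. It assumes the standard textbook definitions, namely an effective number of alleles of 1 / sum(p_i^2) and Nei's Fst = (Ht - Hs) / Ht for two equally weighted populations; the abstract does not say which estimator or software the authors used, and the allele frequencies below are hypothetical.

```python
# Illustrative sketch of per-locus population-genetic statistics.
# Assumptions: textbook formulas, two equally weighted populations, and
# hypothetical allele frequencies (none of these come from the abstract).

def effective_alleles(freqs):
    """Effective number of alleles at one locus: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def fst_two_populations(freqs_a, freqs_b):
    """Nei's Fst = (Ht - Hs) / Ht for two equally weighted populations."""
    hs = (expected_heterozygosity(freqs_a) + expected_heterozygosity(freqs_b)) / 2.0
    pooled = [(a + b) / 2.0 for a, b in zip(freqs_a, freqs_b)]
    ht = expected_heterozygosity(pooled)
    if ht == 0.0:
        return 0.0  # no variation at all, so no differentiation to measure
    return (ht - hs) / ht


# Hypothetical allele frequencies at a single locus in two stocks.
stock_a = [0.5, 0.5]
stock_b = [0.5, 0.5]
ne_a = effective_alleles(stock_a)
fst_identical = fst_two_populations(stock_a, stock_b)
fst_fixed = fst_two_populations([1.0, 0.0], [0.0, 1.0])
```

Under these definitions, identical stocks give Fst = 0 and stocks fixed for different alleles give Fst = 1, so a reported value of 0.058 sits near the low end of that range.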

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Hidden Markov model (HMM)-based speech synthesis systems possess several advantages over concatenative synthesis systems. One such advantage is the relative ease with which HMM-based systems are adapted to speakers not present in the training dataset. Speaker adaptation methods used in the field of HMM-based automatic speech recognition (ASR) are adopted for this task. In the case of unsupervised speaker adaptation, previous work has used a supplementary set of acoustic models to estimate the transcription of the adaptation data. This paper first presents an approach to the unsupervised speaker adaptation task for HMM-based speech synthesis models which avoids the need for such supplementary acoustic models. This is achieved by defining a mapping between HMM-based synthesis models and ASR-style models, via a two-pass decision tree construction process. Second, it is shown that this mapping also enables unsupervised adaptation of HMM-based speech synthesis models without the need to perform linguistic analysis of the estimated transcription of the adaptation data. Third, this paper demonstrates how this technique lends itself to the task of unsupervised cross-lingual adaptation of HMM-based speech synthesis models, and explains the advantages of such an approach. Finally, listener evaluations reveal that the proposed unsupervised adaptation methods deliver performance approaching that of supervised adaptation.