778 results for PITCH
Abstract:
We present an extrema-based unwarping technique for signals with time-varying periodicity. We show that, for arbitrary variation of pitch periodicity in a speech signal, the unwarping technique maps the signal to a periodic signal, which enables efficient estimation of periodicity. We demonstrate the effectiveness of the new technique using both synthetic and real speech signals.
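To illustrate why a strictly periodic signal makes periodicity easy to estimate, here is a minimal sketch that recovers the period of a synthetic sinusoid from the peak of its autocorrelation. This is a generic textbook estimator swapped in for illustration, not the extrema-based unwarping technique of the abstract; the function name `autocorr_period`, the lag range, and the test signal are all assumptions.

```python
import math


def autocorr_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] that maximizes the raw autocorrelation."""
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        val = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Synthetic periodic signal (assumed for illustration): 50 samples per cycle.
period = 50
x = [math.sin(2 * math.pi * n / period) for n in range(400)]
estimated = autocorr_period(x, min_lag=20, max_lag=200)
```

Because the raw autocorrelation sums fewer terms at longer lags, it favors the fundamental period over its multiples. That only holds when the period is constant across the frame, which is the condition the unwarping step is meant to restore.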
Abstract:
A joint analysis-synthesis framework is developed for the compressive sensing (CS) recovery of speech signals. The signal is assumed to be sparse in the residual domain, with the linear prediction filter used as the sparsifying transform. Importantly, this transform is not known a priori, since estimating the predictor filter requires knowledge of the signal. Two prediction filters, a comb filter for pitch and an all-pole formant filter, are needed to induce maximum sparsity. An iterative method is proposed for estimating both the prediction filters and the signal itself. The formant prediction filter is used as the synthesis transform, while the pitch filter is used in analysis mode to model the periodicity in the residual excitation signal. A significant improvement in the LLR measure is seen over the previously reported formant filter estimation.
Abstract:
The Grating Compression Transform (GCT) is a two-dimensional analysis of the speech signal that has been shown to be effective for multi-pitch tracking in speech mixtures. Multi-pitch tracking methods using GCT apply a Kalman filter framework to obtain pitch tracks, which requires training the filter parameters on true pitch tracks. We propose an unsupervised method for obtaining multiple pitch tracks. In the proposed method, multiple pitch tracks are modeled using the time-varying means of a Gaussian mixture model (GMM), referred to as TVGMM. The TVGMM parameters are estimated from the multiple pitch values obtained at each frame of a given utterance by applying GCT to different patches of the spectrogram. We evaluate the performance of the proposed method on all-voiced speech mixtures as well as random speech mixtures having well-separated and close pitch tracks. With TVGMM, 51% and 53% of the multi-pitch estimates have an error <= 20% for random mixtures and all-voiced mixtures, respectively. TVGMM also yields a lower root mean squared error in pitch track estimation than Kalman filtering.
Abstract:
A characterization of the voice source (VS) signal by the pitch synchronous (PS) discrete cosine transform (DCT) is proposed. With the integrated linear prediction residual (ILPR) as the VS estimate, the PS DCT of the ILPR is evaluated as a feature vector for speaker identification (SID). On TIMIT and YOHO databases, using a Gaussian mixture model (GMM)-based classifier, it performs on par with existing VS-based features. On the NIST 2003 database, fusion with a GMM-based classifier using MFCC features improves the identification accuracy by 12% in absolute terms, proving that the proposed characterization has good promise as a feature for SID studies. (C) 2015 Acoustical Society of America
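As a rough illustration of what a pitch-synchronous DCT involves, the sketch below applies an unnormalized DCT-II to each segment between consecutive pitch marks and keeps the first few coefficients. It assumes the pitch marks are already known and that each segment spans one pitch cycle; computing the ILPR is not shown, and the function names, the truncation to `n_coeffs`, and the choice of DCT-II are assumptions rather than details taken from the abstract.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


def pitch_synchronous_dct(signal, pitch_marks, n_coeffs):
    """DCT of each segment between consecutive pitch marks, truncated to n_coeffs."""
    features = []
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        cycle = signal[start:end]  # one assumed pitch cycle
        features.append(dct_ii(cycle)[:n_coeffs])
    return features


# Toy input (assumed): a constant signal with pitch marks every 10 samples.
signal = [1.0] * 30
features = pitch_synchronous_dct(signal, pitch_marks=[0, 10, 20, 30], n_coeffs=4)
```

For the constant toy input, only the first coefficient of each cycle is non-zero, which is a quick sanity check on the transform.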
Abstract:
This work aims to optimize the wind turbine rotor speed setpoint algorithm. Several intelligent adjustment strategies were investigated in order to improve a reward function that takes into account the power captured from the wind and the turbine speed error. After testing different approaches, including Reinforcement Learning, the best results were obtained using a Particle Swarm Optimization (PSO)-based wind turbine speed setpoint algorithm. A reward improvement of up to 10.67% was achieved using PSO compared to a constant approach and 0.48% compared to a conventional approach. We conclude that the pitch angle is the most suitable input variable for the turbine speed setpoint algorithm, compared with alternatives such as rotor speed or rotor angular acceleration.
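For readers unfamiliar with PSO, the following is a minimal sketch of a global-best particle swarm maximizing a one-dimensional reward. The reward function `toy_reward`, its optimum at 12.0, the search bounds, and the swarm parameters are all hypothetical placeholders; the abstract does not give the actual reward function or the PSO settings used.

```python
import random


def pso_maximize(reward, lower, upper, n_particles=20, n_iters=100, seed=0):
    """Maximize a 1-D reward function with a basic global-best PSO."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                      # personal best positions
    best_val = [reward(p) for p in pos]    # personal best rewards
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]  # global best
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            # Move the particle and clamp it to the search bounds.
            pos[i] = min(max(pos[i] + vel[i], lower), upper)
            val = reward(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val > g_val:
                    g_pos, g_val = pos[i], val
    return g_pos, g_val


def toy_reward(setpoint):
    # Hypothetical reward: peaks when the setpoint equals 12.0.
    return -(setpoint - 12.0) ** 2


best_setpoint, best_reward = pso_maximize(toy_reward, lower=5.0, upper=20.0)
```

In a real setting the reward would combine captured power and speed error as the abstract describes, and the search space would cover the parameters of the setpoint algorithm rather than a single scalar.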
Abstract:
The aim of the present study is to analyse the influence of different large-sided games (LSGs) on physical and physiological variables in under-12 (U12) and under-13 (U13) soccer players. The study analysed the effects of combining different numbers of players per team, 7, 9, and 11 (P7, P9, and P11, respectively), with three relative pitch areas, 100, 200, and 300 m² (A100, A200, and A300, respectively). The variables analysed were: 1) global indicators such as total distance (TD), work:rest ratio (W:R), player load (PL) and maximal speed (V-max); 2) mean heart rate (HR) and time spent in different HR intensity zones (<75%, 75-84%, 84-90% and >90%); and 3) five absolute (<8, 8-13, 13-16 and >16 km h⁻¹) and three relative speed categories (<40%, 40-60% and >60% V-max). The results support the theory that a change in format (number of players and pitch dimensions) does not affect the two player categories in the same way. Although U13 players may appear to face greater demands in this kind of LSG, when the workload is assessed from a relative point of view, large pitch dimensions and/or a high number of players per team make the training task more demanding for the U12 players. The results of this study could alert coaches to avoid some types of LSGs for U12 players, such as P11 played in A100, A200 or A300, P9 played in A200 or A300, and P7 played in A300, because U13 players showed higher values than U12 players in several physical and physiological variables (W:R, time spent in 84-90% HRmax, distance covered at 8-13 and 13-16 km h⁻¹, and time spent at 40-60% V-max). These results may help youth soccer coaches to plan the progressive introduction of LSGs so that task demands are adapted to the physiological and physical development of participants.
Abstract:
In a Text-to-Speech system based on time-domain techniques that employ pitch-synchronous manipulation of speech waveforms, one of the most important issues affecting output quality is how the analysis points of the speech signal, i.e. the analysis pitchmarks, are estimated and where they are actually placed. In this paper we present our methodology for calculating the pitchmarks of a speech waveform: a pitchmark detection algorithm which, after thorough experimentation and in comparison with other algorithms, proves to perform better with our TD-PSOLA-based Text-to-Speech synthesizer (Time-Domain Pitch-Synchronous Overlap-Add Text-to-Speech system).