Digital Library

7 results for Singing Voice

in Indian Institute of Science - Bangalore - India

Epoch extraction based on integrated linear prediction residual using plosion index

Relevance: 60.00%

Abstract:

An epoch is defined as the instant of significant excitation within a pitch period of voiced speech. Epoch extraction continues to attract the interest of researchers because of its significance in speech analysis. Existing high-performance epoch extraction algorithms require either dynamic programming techniques or a priori information about the average pitch period. An algorithm without such requirements is proposed, based on the integrated linear prediction residual (ILPR), which resembles the voice source signal. The half-wave rectified and negated ILPR (or the Hilbert transform of the ILPR) is used as the pre-processed signal. A new non-linear temporal measure named the plosion index (PI) has been proposed for detecting 'transients' in the speech signal. An extension of the PI, called the dynamic plosion index (DPI), is applied to the pre-processed signal to estimate the epochs. The proposed DPI algorithm is validated using six large databases which provide simultaneous EGG recordings. Creaky and singing voice samples are also analyzed. The algorithm has been tested for its robustness in the presence of additive white and babble noise and on simulated telephone-quality speech. The performance of the DPI algorithm is found to be comparable to or better than that of five state-of-the-art techniques for the experiments considered.
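The pre-processing step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the ILPR is already available as a list of samples and that "half wave rectified and negated" means negating the residual and then keeping only the positive half-wave; the abstract does not state the order, and the function name `preprocess_residual` is hypothetical. It does not implement the PI or DPI measures, which the abstract does not define.

```python
def preprocess_residual(ilpr):
    """Negate each residual sample, then half-wave rectify.

    Assumption: negation is applied before rectification, so the large
    negative peaks of the residual become positive peaks and everything
    else is clipped to zero.
    """
    return [max(-sample, 0.0) for sample in ilpr]


# Toy residual: only the negative samples survive, with their sign flipped.
toy_ilpr = [1.0, -2.0, 0.5, -0.25]
processed = preprocess_residual(toy_ilpr)
```

Under this assumption, the strongest positive peaks of `processed` would mark candidate excitation instants for a later detection stage.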

A Pattern Recognition Model of Voice-Based Personal Verification Systems for Forensic Applications

Relevance: 20.00%

Abstract:

The success of automatic speaker recognition in laboratory environments suggests applications in forensic science for establishing the identity of individuals on the basis of features extracted from speech. A theoretical model for such a verification scheme for continuous normally distributed features is developed. The three cases of using a) a single feature, b) multiple independent measurements of a single feature, and c) multiple independent features are explored. The number of independent features needed for a reliable personal identification is computed based on the theoretical model and an exploratory study of some speech features.

Integrating voice and data on SALAN: an experimental local area network

Relevance: 20.00%

Abstract:

This paper is concerned with the integration of voice and data on an experimental local area network used by the School of Automation of the Indian Institute of Science. SALAN (School of Automation Local Area Network) consists of a number of microprocessor-based communication nodes linked to a shared coaxial cable transmission medium. The communication nodes handle the various low-level functions associated with computer communication, and interface user data equipment to the network. SALAN at present provides a file transfer facility between an Intel Series III microcomputer development system and a Texas Instruments Model 990/4 microcomputer system. A packet voice communication system has also been implemented on SALAN. The various aspects of the design and implementation of these two utilities are discussed.

Performance of cellular CDMA with voice/data traffic with an SIR based admission control

Relevance: 20.00%

Abstract:

We analyze the performance of an SIR based admission control strategy in cellular CDMA systems with both voice and data traffic. Most studies in the current literature that estimate CDMA system capacity with both voice and data traffic do not take signal-to-interference ratio (SIR) based admission control into account. In this paper, we present an analytical approach to evaluate the outage probability for voice traffic, the average system throughput and the mean delay for data traffic for a voice/data CDMA system which employs an SIR based admission control. We show that for a data-only system, an improvement of about 25% in both the Erlang capacity as well as the mean delay performance is achieved with an SIR based admission control as compared to code availability based admission control. For a mixed voice/data system with 10 Erlangs of voice traffic, the improvement in the mean delay performance for data is about 40%. Also, for a mean delay of 50 ms with 10 Erlangs of voice traffic, the data Erlang capacity improves by about 9%.

Performance analysis of data and voice connections in a cognitive radio network

Relevance: 20.00%

Abstract:

We study the performance of cognitive (secondary) users in a cognitive radio network which uses a channel whenever the primary users are not using the channel. The usage of the channel by the primary users is modelled by an ON-OFF renewal process. The cognitive users may be transmitting data using TCP connections and voice traffic. The voice traffic is given priority over the data traffic. We theoretically compute the mean delay of TCP and voice packets and also the mean throughput of the different TCP connections. We compare the theoretical results with simulations.
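As a rough illustration of the ON-OFF channel model in this abstract, the sketch below computes the long-run fraction of time the channel is free for secondary users. For an alternating renewal process with finite mean ON and OFF durations, that fraction is E[OFF] / (E[ON] + E[OFF]). The function names, the exponential duration distributions used in the simulation, and the numeric values are assumptions for illustration only; the abstract does not specify them, and this is not the paper's delay or throughput analysis.

```python
import random


def secondary_availability(mean_on, mean_off):
    """Long-run fraction of time the primary users are OFF."""
    return mean_off / (mean_on + mean_off)


def simulate_availability(mean_on, mean_off, cycles, seed=0):
    """Estimate the same fraction by simulating ON/OFF cycles.

    Assumption: ON and OFF durations are exponential; any distribution
    with the same finite means would give the same long-run fraction.
    """
    rng = random.Random(seed)
    total_on = sum(rng.expovariate(1.0 / mean_on) for _ in range(cycles))
    total_off = sum(rng.expovariate(1.0 / mean_off) for _ in range(cycles))
    return total_off / (total_on + total_off)


analytic = secondary_availability(mean_on=3.0, mean_off=1.0)
estimate = simulate_availability(mean_on=3.0, mean_off=1.0, cycles=100_000)
```

Comparing `analytic` with `estimate` mirrors, in a very small way, the theory-versus-simulation check the abstract describes.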

Estimation of voice-onset time in continuous speech using temporal measures

Relevance: 20.00%

Abstract:

This paper proposes an automatic acoustic-phonetic method for estimating voice-onset time of stops. This method requires neither transcription of the utterance nor training of a classifier. It makes use of the plosion index for the automatic detection of burst onsets of stops. Having detected the burst onset, the onset of the voicing following the burst is detected using the epochal information and a temporal measure named the maximum weighted inner product. For validation, several experiments are carried out on the entire TIMIT database and two of the CMU Arctic corpora. The performance of the proposed method compares well with three state-of-the-art techniques. (C) 2014 Acoustical Society of America

Voice source characterization using pitch synchronous discrete cosine transform for speaker identification

Relevance: 20.00%

Abstract:

A characterization of the voice source (VS) signal by the pitch synchronous (PS) discrete cosine transform (DCT) is proposed. With the integrated linear prediction residual (ILPR) as the VS estimate, the PS DCT of the ILPR is evaluated as a feature vector for speaker identification (SID). On TIMIT and YOHO databases, using a Gaussian mixture model (GMM)-based classifier, it performs on par with existing VS-based features. On the NIST 2003 database, fusion with a GMM-based classifier using MFCC features improves the identification accuracy by 12% in absolute terms, proving that the proposed characterization has good promise as a feature for SID studies. (C) 2015 Acoustical Society of America
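The idea of a pitch synchronous DCT can be sketched as follows: cut the voice source estimate into cycles at consecutive epoch locations and take a DCT of each cycle. This is only a sketch under stated assumptions: the epoch indices are taken as given, an unnormalized DCT-II is used, and the first `num_coeffs` coefficients are kept. The abstract does not state the paper's segmentation, normalization, or coefficient count, and the names `dct2` and `ps_dct` are hypothetical.

```python
import math


def dct2(cycle, num_coeffs):
    """First num_coeffs coefficients of the unnormalized DCT-II of one cycle."""
    n = len(cycle)
    return [
        sum(cycle[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(num_coeffs)
    ]


def ps_dct(signal, epochs, num_coeffs):
    """One DCT feature vector per pitch cycle, where a cycle spans
    the samples between two consecutive epoch indices."""
    features = []
    for start, end in zip(epochs, epochs[1:]):
        cycle = signal[start:end]
        if cycle:  # skip degenerate cycles with no samples
            features.append(dct2(cycle, num_coeffs))
    return features


# Toy input: a constant signal with an epoch every 4 samples.
toy_signal = [1.0] * 12
toy_features = ps_dct(toy_signal, epochs=[0, 4, 8, 12], num_coeffs=3)
```

For the constant toy signal, each cycle's energy lands in the zeroth coefficient, which is a quick sanity check on the transform.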
