Digital Library

883 results for Voice synthesization

An analytical model for an IEEE 802.11 WLAN using DCF with two types of VoIP calls

Relevance: 10.00%

Abstract:

In this paper, we develop and numerically explore the modeling heuristic of using saturation attempt probabilities as state-dependent attempt probabilities in an IEEE 802.11e infrastructure network carrying packet telephone calls and TCP-controlled file downloads, using enhanced distributed channel access (EDCA). We build upon the fixed point analysis and performance insights. When a certain number of nodes of each class are contending for the channel (i.e., have nonempty queues), their attempt probabilities are taken to be those obtained from saturation analysis for that number of nodes. We then model the system queue dynamics at the network nodes. With the proposed heuristic, the system evolution at channel slot boundaries becomes a Markov renewal process, and regenerative analysis yields the desired performance measures. The results obtained from this approach match well with ns2 simulations. We find that, with the default IEEE 802.11e EDCA parameters for AC 1 and AC 3, the voice call capacity decreases if even one file download is initiated by some station. Subsequently, reducing the number of voice calls increases the file download capacity almost linearly (by 1/3 Mbps per voice call for the 11 Mbps PHY).

Zero-phase inverse filtering for extraction of source characteristics

Relevance: 10.00%

Abstract:

The instants at which significant excitation of the vocal tract takes place during voicing are referred to as epochs. Epochs and the strengths of excitation pulses at epochs are useful in characterizing the voice source. The epoch filtering technique proposed by the authors determines epochs from the speech waveform. In this paper, we propose zero-phase inverse filtering to obtain the strengths of excitation pulses at epochs. The zero-phase inverse filter compensates for the gross spectral envelope of the short-time spectrum of speech without affecting phase characteristics. Linear prediction analysis is used to realize the zero-phase inverse filter. Source characteristics that can be derived from speech using this technique are illustrated with examples.

A Scalable Architecture for VoIP Conferencing

Relevance: 10.00%

Abstract:

Real-time services are traditionally supported on circuit-switched networks. However, there is a need to port these services to packet-switched networks. An architecture for an audio conferencing application over the Internet is considered in the light of the ITU-T H.323 recommendations. In a conference, considering packets only from a set of selected clients can reduce speech quality degradation, because mixing packets from all clients can lead to a lack of speech clarity. A distributed algorithm and architecture for selecting clients for mixing is suggested here, based on a new quantifier of voice activity called “Loudness Number” (LN). The proposed system distributes the computation load and reduces the load on client terminals. The highlights of this architecture are scalability, bandwidth saving and speech quality enhancement. Client selection for playing out tries to mimic a physical conference, where the most vocal participants attract more attention. The contributions of the paper are expected to aid implementations of the H.323 recommendations for Multipoint Processors (MP). A working prototype based on the proposed architecture is already functional.
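The client-selection idea can be illustrated with a minimal sketch. It assumes that each client's Loudness Number has already been computed elsewhere (the abstract does not give its formula) and simply keeps the loudest streams for mixing. The names `select_for_mixing`, `loudness_by_client` and `max_streams` are hypothetical and do not come from the paper.

```python
import heapq

def select_for_mixing(loudness_by_client, max_streams=3):
    """Return the IDs of the clients whose streams would be mixed.

    loudness_by_client: dict mapping a client ID to its current
    Loudness Number (assumed to be computed elsewhere).
    max_streams: how many streams to mix (an assumed parameter).
    """
    # Keep the clients with the largest Loudness Number; ties are
    # broken by client ID so the result is deterministic.
    ranked = heapq.nlargest(
        max_streams,
        loudness_by_client.items(),
        key=lambda item: (item[1], item[0]),
    )
    return [client for client, _ in ranked]

if __name__ == "__main__":
    ln = {"alice": 0.82, "bob": 0.10, "carol": 0.55, "dave": 0.47}
    print(select_for_mixing(ln, max_streams=2))
```

In a distributed deployment, each mixing node could apply the same selection to the clients it serves before forwarding streams, but how the paper partitions that work is not stated in the abstract.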

Performance analysis of UDP with energy efficient link layer on Markov fading channels

Relevance: 10.00%

Abstract:

In this paper, we analyze the throughput and energy efficiency performance of user datagram protocol (UDP) using linear, binary exponential, and geometric backoff algorithms at the link layer (LL) on point-to-point wireless fading links. Using a first-order Markov chain representation of the packet success/failure process on fading channels, we derive analytical expressions for throughput and energy efficiency of UDP/LL with and without LL backoff. The analytical results are verified through simulations. We also evaluate the mean delay and delay variation of voice packets and energy efficiency performance over a wireless link that uses UDP for transport of voice packets and the proposed backoff algorithms at the LL. We show that the proposed LL backoff algorithms achieve energy efficiency improvement of the order of 2-3 dB compared to LL with no backoff, without compromising much on the throughput and delay performance at the UDP layer. Such energy savings through protocol means will improve the battery life in wireless mobile terminals.
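A first-order Markov model of packet success and failure, together with a link-layer backoff, can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the paper's model: the channel is a two-state chain with parameters `p_ss` (success follows success) and `p_ff` (failure follows failure), throughput is counted as successes per slot, energy efficiency as successes per transmission attempt, and the backoff rules (wait `k` slots after the `k`-th consecutive failure, or `2**k` slots capped at 64) are hypothetical choices. The geometric backoff mentioned in the abstract is not covered.

```python
import random

def stationary_success(p_ss, p_ff):
    """Long-run fraction of good slots for the two-state Markov chain."""
    return (1.0 - p_ff) / (2.0 - p_ss - p_ff)

def simulate_link(p_ss, p_ff, n_slots, backoff="none", seed=0):
    """Simulate n_slots (>= 1) and return (throughput, energy_efficiency)."""
    rng = random.Random(seed)
    good = True      # channel state in the current slot
    wait = 0         # idle slots left before the next attempt
    fails = 0        # consecutive failed attempts
    attempts = 0
    successes = 0
    for _ in range(n_slots):
        # The channel evolves every slot, whether or not we transmit.
        stay = p_ss if good else p_ff
        if rng.random() >= stay:
            good = not good
        if wait > 0:
            wait -= 1
            continue
        attempts += 1
        if good:
            successes += 1
            fails = 0
        else:
            fails += 1
            if backoff == "linear":
                wait = fails                    # assumed rule
            elif backoff == "binary_exponential":
                wait = min(2 ** fails, 64)      # assumed rule and cap
    return successes / n_slots, successes / attempts

if __name__ == "__main__":
    for mode in ("none", "linear", "binary_exponential"):
        thr, eff = simulate_link(0.95, 0.9, 200_000, backoff=mode)
        print(f"{mode:>20}: throughput={thr:.3f} efficiency={eff:.3f}")
```

On a bursty channel, backing off after a failure avoids spending attempts while the channel is likely still bad, which is the intuition behind trading a little throughput for energy savings.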

Better QOS with W/T SPR codes for the transmission of multimedia signals in Optical CDMA networks

Relevance: 10.00%

Abstract:

In this paper, an optical code-division multiple-access (O-CDMA) packet network is considered, which offers inherent security in access networks. The application of O-CDMA to multimedia transmission (voice, data, and video) is investigated. The simultaneous transmission of various services is achieved by assigning each user multiple unique code signatures. Thus, by applying a parallel mapping technique, we achieve multi-rate services. A random access protocol is proposed in which all distinct codes are used for packet transmission. Two families of codes are analyzed: Optical Orthogonal Codes (OOC), or 1D codes, and Wavelength/Time Single-Pulse-per-Row (W/T SPR) codes, or 2D codes. These 1D and 2D codes with varied weight are used to differentiate the Quality of Service (QoS). The theoretical bit error probability corresponding to the quality of each service is established using 1D and 2D codes in the noiseless-receiver case and compared. The results show that QoS in multimedia transmission is better with 2D codes than with 1D codes.

Epoch extraction based on integrated linear prediction residual using plosion index

Relevance: 10.00%

Abstract:

An epoch is defined as the instant of significant excitation within a pitch period of voiced speech. Epoch extraction continues to attract the interest of researchers because of its significance in speech analysis. Existing high-performance epoch extraction algorithms require either dynamic programming techniques or a priori information about the average pitch period. An algorithm without such requirements is proposed, based on the integrated linear prediction residual (ILPR), which resembles the voice source signal. The half-wave rectified and negated ILPR (or the Hilbert transform of the ILPR) is used as the pre-processed signal. A new non-linear temporal measure named the plosion index (PI) is proposed for detecting 'transients' in the speech signal. An extension of PI, called the dynamic plosion index (DPI), is applied to the pre-processed signal to estimate the epochs. The proposed DPI algorithm is validated using six large databases which provide simultaneous EGG recordings. Creaky and singing voice samples are also analyzed. The algorithm has been tested for its robustness in the presence of additive white and babble noise and on simulated telephone-quality speech. The performance of the DPI algorithm is found to be comparable to or better than that of five state-of-the-art techniques for the experiments considered.

Subchannel allocation and power control in femtocells to provide quality of service

Relevance: 10.00%

Abstract:

Femtocells are a new concept which improves the coverage and capacity of a cellular system. We consider the problem of channel allocation and power control for different users within a Femtocell. Knowing the channels available, the channel states and the rate requirements of different users, the Femtocell base station (FBS) allocates the channels to different users to satisfy their requirements. The Femtocell should also use minimal power so as to cause the least interference to its neighboring Femtocells and outside users. We develop efficient, low-complexity algorithms which can be used online by the Femtocell. The users may want to transmit data or voice. We compare our algorithms with the optimal solutions.

Robust Whisper Activity Detection Using Long-Term Log Energy Variation of Sub-Band Signal

Relevance: 10.00%

Abstract:

The goal of whisper activity detection (WAD) is to find the whispered speech segments in a given noisy recording of whispered speech. Since whispering lacks periodic glottal excitation, it resembles unvoiced speech. This noise-like nature of whispered speech makes WAD a more challenging task than a typical voice activity detection (VAD) problem. In this paper, we propose a feature for WAD based on the long-term variation of the logarithm of the short-time sub-band signal energy. We also propose an automatic sub-band selection algorithm to maximally discriminate noisy whisper from noise. Experiments with eight noise types in four different signal-to-noise ratio (SNR) conditions show that, for most of the noises, the performance of the proposed WAD scheme is significantly better than that of the existing VAD schemes and whisper detection schemes when used for WAD.
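A feature of this kind can be sketched roughly as follows. This is a simplified illustration, not the paper's definition: sub-band filtering and sub-band selection are omitted (the full-band signal is used), and the "long-term variation" is assumed here to be the variance of the log frame energy over the last `span` frames. All function and parameter names are hypothetical.

```python
import math

def log_frame_energies(samples, frame_len):
    """Log energy of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # A small constant avoids log(0) on silent frames.
        energies.append(math.log(sum(x * x for x in frame) + 1e-12))
    return energies

def long_term_variation(log_energies, span):
    """Variance of the log energy over each window of `span` frames."""
    variation = []
    for end in range(span, len(log_energies) + 1):
        window = log_energies[end - span:end]
        mean = sum(window) / span
        variation.append(sum((e - mean) ** 2 for e in window) / span)
    return variation

if __name__ == "__main__":
    # A steady tone stands in for stationary noise; gating every other
    # frame stands in for speech-like energy fluctuation.
    steady = [math.sin(2 * math.pi * k / 16) for k in range(1600)]
    bursty = [x * (0.1 if (k // 80) % 2 else 1.0) for k, x in enumerate(steady)]
    for name, sig in (("steady", steady), ("bursty", bursty)):
        ltv = long_term_variation(log_frame_energies(sig, 80), span=10)
        print(name, round(max(ltv), 3))
```

A detector would then compare the variation against a threshold: stationary noise yields values near zero, while segments whose energy fluctuates over time yield larger values.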

A fast algorithm for speech polarity detection using long-term linear prediction

Relevance: 10.00%

Abstract:

Speech polarity detection is a crucial first step in many speech processing techniques. In this paper, an algorithm is proposed that improves on the existing technique that uses the skewness of the voice source (VS) signal. Here, the integrated linear prediction residual (ILPR) is used as the VS estimate, which is obtained using linear prediction on long-term frames of the low-pass filtered speech signal. This excludes the unvoiced regions from analysis and also reduces the computation. Further, a modified skewness measure is proposed for the decision, which considers the magnitude of the skewness of the ILPR along with its sign. With the detection error rate (DER) as the performance metric, the algorithm is tested on 8 large databases, and its performance (DER=0.20%) is found to be comparable to that of the best technique (DER=0.06%) on both clean and noisy speech. Further, the proposed method is found to be ten times faster than the best technique.
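The basic skewness-based decision can be sketched as below. This is a minimal illustration that takes a voice source estimate as given (the ILPR computation is not shown) and decides polarity from the sign of the sample skewness. The sign convention and the `threshold` parameter are assumptions, and the paper's modified skewness measure is not reproduced.

```python
def skewness(signal):
    """Sample skewness: third central moment over variance ** 1.5."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m3 = sum((x - mean) ** 3 for x in signal) / n
    return m3 / (m2 ** 1.5) if m2 > 0 else 0.0

def detect_polarity(source_estimate, threshold=0.0):
    """Return +1, -1, or 0 (undecided) from the skewness sign.

    Mapping positive skewness to +1 is an assumed convention.
    """
    s = skewness(source_estimate)
    if s > threshold:
        return 1
    if s < -threshold:
        return -1
    return 0

if __name__ == "__main__":
    # Sparse negative pulses mimic a source signal dominated by
    # sharp excitation peaks of one sign.
    pulses = [-1.0 if k % 10 == 0 else 0.0 for k in range(100)]
    print(detect_polarity(pulses), detect_polarity([-x for x in pulses]))
```

Inverting the signal flips the sign of the skewness, which is why the statistic can indicate whether a recording's polarity has been reversed.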

Stability Analysis of a Max-Min Fair Rate Control Protocol (RCP) in a Small Buffer Regime

Relevance: 10.00%

STOCHASTICALLY SCALABLE FLOW CONTROL

Relevance: 10.00%

Stability and fairness of explicit congestion control with small buffers.

Relevance: 10.00%

Robust noise reduction for speech and audio signals

Relevance: 10.00%

Abstract:

Statistical model-based methods are presented for the reconstruction of autocorrelated signals in impulsive plus continuous noise environments. Signals are modelled as autoregressive and noise sources as discrete and continuous mixtures of Gaussians, allowing for robustness in highly impulsive and non-Gaussian environments. Markov Chain Monte Carlo methods are used for reconstruction of the corrupted waveforms within a Bayesian probabilistic framework and results are presented for contaminated voice and audio signals.

The dilemmas of global collective action. A case study: the trade union movement in MERCOSUR (1991-2012)

Relevance: 10.00%

Abstract:

This work analyzes postnational trade union collective action in MERCOSUR during a historical period marked by profound political, economic, productive, and social changes (1991-2012), based on the accounts and representations of its protagonists. The qualitative work seeks to explain the configuration of international trade unionism under globalization and to describe the strategies of the MERCOSUR labor movement. The qualitative methodology draws on fieldwork consisting of in-depth interviews with 34 MERCOSUR trade unionists, plus additional interviews with three representatives of the Confederación Sindical de las Américas, two MERCOSUR business leaders, an academic specialist in the social and labor dimension of regional integration, and an ILO representative in the region. The interviews were analyzed and interpreted using grounded theory, understood as the most suitable technique for apprehending social processes through the voices of labor leaders and for understanding their reality, their representations and value system, their ideas, and their collective action. The literature on social movements under capitalist globalization has emphasized the emergence of new collectives whose claims center on the recognition (Fraser and Honneth, 2006) of identities that the Fordist model of production seemed to render invisible and sidestep, given the primacy of economic practices and distributive demands. This thesis combines a dualist perspective and shows that recognition strategies and class-based redistribution claims were resignified in the postnational arena through the Coordinadora de Centrales Sindicales del Cono Sur (CCSCS), at the subregional level, and, to a lesser extent, through the Global Unions (FSI, GUFs) in sectoral action [1991-2012].
To reach the configurative core of their representations and value system, the research examined the senses and meanings of work, changes in production and in working conditions, end-of-work theories, precarization, and the representation of the most vulnerable workers: women, young people, and migrants. It then inquired into global governance, international organizations, the international normative regime, and capitalist civilization, before turning to the specific study of MERCOSUR and labor action within that process. The core finding is that, for the labor representatives, trade union collective action must be postnational and its aim is to limit neoliberal capitalist globalization. From its beginnings, the CCSCS formed a movement capable of rising to the supranational level to represent the voice of MERCOSUR workers. Plurality was its greatest virtue during its first 20 years, reflecting a learning experience of tolerance and respect that its members define as unity in diversity. This body is a unique asset as a paradigm of postnational trade unionism. The Southern Cone unions adopted several modes of collective action: a) reactive (with repertoires of insubordination, struggle, and resistance to the neoliberal model), b) proactive (with repertoires of normative influence in MERCOSUR), and c) participatory (with repertoires of producing proposals to influence the social dimension of MERCOSUR). Their reactive, normative, and propositional collective action was effective in the medium term for participating in and influencing MERCOSUR, creating a social dimension for the bloc, and endowing the region's citizens with normative rights. Their action had a political meaning of great instituting power, with a capacity for mobilization and high public exposure.
In the second decade, however, its logic of construction became subordinated to national processes and governing parties; it ceased to be performative and politically creative, was settled in the social sphere alongside other emerging social movements, and triggered a cycle of demobilization. At the same time, another form of postnational trade unionism emerged forcefully with the merger and refounding of the Global Unions. Their sectoral action helps restore the distribution demands that had been sidelined, but this thesis reports that the protagonists affirm that their collective action frameworks must be joint in order to succeed. Postnational trade unionism in MERCOSUR defines itself as an agent of development and a protagonist of the socio-productive model, but also as a participating vehicle of democracy and of a sustainable development matrix.

Auditory-perceptual evaluation of degraded voices and its correlation with acoustic measures

Relevance: 10.00%

Abstract:

The aim is to determine, using acoustic measurements, which information is most relevant to the listener when categorizing the overall grade of dysphonia. Eight voices were chosen (4 female and 4 male). Each utterance was evaluated auditory-perceptually with item G of the GRBAS scale by 10 experienced listeners, and acoustically using measures of aperiodicity, noise, and chaos. Discriminant analysis indicates the importance of GNE, Jit, Jitter_cc, and Lyapunov as predictors of the overall grade of dysphonia. Applying the k-means method shows that the acoustic parameters used contain features that allow the studied voices to be grouped objectively, with 100% accuracy for class 0, 96% for class 2, and 79% for class 3. A larger number and greater variability of cases are needed to verify these preliminary results.
