Biblioteca Digital

9 resultados para fundamental frequency

em Universidad Politécnica de Madrid

Speaker Diarization Features: The UPM Contribution to the RT09 Evaluation

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Two new features have been proposed and used in the Rich Transcription Evaluation 2009 by the Universidad Politécnica de Madrid, which outperform the results of the baseline system. One of the features is the intensity channel contribution, a feature related to the location of the speaker. The second feature is the logarithm of the interpolated fundamental frequency. It is the first time that both features are applied to the clustering stage of multiple distant microphone meetings diarization. It is shown that the inclusion of both features improves the baseline results by 15.36% and 16.71% relative to the development set and the RT 09 set, respectively. If we consider speaker errors only, the relative improvement is 23% and 32.83% on the development set and the RT09 set, respectively.

Veja mais

Wheel-Rail Contact Forces in High-Speed Simply Supported Bridges at Resonance

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The response of high-speed bridges at resonance, particularly under flexural vibrations, constitutes a subject of research for many scientists and engineers at the moment. The topic is of great interest because, as a matter of fact, such kind of behaviour is not unlikely to happen due to the elevated operating speeds of modern rains, which in many cases are equal to or even exceed 300 km/h ( [1,2]). The present paper addresses the subject of the evolution of the wheel-rail contact forces during resonance situations in simply supported bridges. Based on a dimensionless formulation of the equations of motion presented in [4], very similar to the one introduced by Klasztorny and Langer in [3], a parametric study is conducted and the contact forces in realistic situations analysed in detail. The effects of rail and wheel irregularities are not included in the model. The bridge is idealised as an Euler-Bernoulli beam, while the train is simulated by a system consisting of rigid bodies, springs and dampers. The situations such that a severe reduction of the contact force could take place are identified and compared with typical situations in actual bridges. To this end, the simply supported bridge is excited at resonace by means of a theoretical train consisting of 15 equidistant axles. The mechanical characteristics of all axles (unsprung mass, semi-sprung mass, and primary suspension system) are identical. This theoretical train permits the identification of the key parameters having an influence on the wheel-rail contact forces. In addition, a real case of a 17.5 m bridges traversed by the Eurostar train is analysed and checked against the theoretical results. The influence of three fundamental parameters is investigated in great detail: a) the ratio of the fundamental frequency of the bridge and natural frequency of the primary suspension of the vehicle; b) the ratio of the total mass of the bridge and the semi-sprung mass of the vehicle and c) the ratio between the length of the bridge and the characteristic distance between consecutive axles. The main conclusions derived from the investigation are: The wheel-rail contact forces undergo oscillations during the passage of the axles over the bridge. During resonance, these oscillations are more severe for the rear wheels than for the front ones. If denotes the span of a simply supported bridge, and the characteristic distance between consecutive groups of loads, the lower the value of , the greater the oscillations of the contact forces at resonance. For or greater, no likelihood of loss of wheel-rail contact has been detected. The ratio between the frequency of the primary suspension of the vehicle and the fundamental frequency of the bridge is denoted by (frequency ratio), and the ratio of the semi-sprung mass of the vehicle (mass of the bogie) and the total mass of the bridge is denoted by (mass ratio). For any given frequency ratio, the greater the mass ratio, the greater the oscillations of the contact forces at resonance. The oscillations of the contact forces at resonance, and therefore the likelihood of loss of wheel-rail contact, present a minimum for approximately between 0.5 and 1. For lower or higher values of the frequency ratio the oscillations of the contact forces increase. Neglecting the possible effects of torsional vibrations, the metal or composite bridges with a low linear mass have been found to be the ones where the contact forces may suffer the most severe oscillations. If single-track, simply supported, composite or metal bridges were used in high-speed lines, and damping ratios below 1% were expected, the minimum contact forces at resonance could drop to dangerous values. Nevertheless, this kind of structures is very unusual in modern high-speed railway lines.

Veja mais

Towards glottal source controllability in expressive speech synthesis

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In order to obtain more human like sounding humanmachine interfaces we must first be able to give them expressive capabilities in the way of emotional and stylistic features so as to closely adequate them to the intended task. If we want to replicate those features it is not enough to merely replicate the prosodic information of fundamental frequency and speaking rhythm. The proposed additional layer is the modification of the glottal model, for which we make use of the GlottHMM parameters. This paper analyzes the viability of such an approach by verifying that the expressive nuances are captured by the aforementioned features, obtaining 95% recognition rates on styled speaking and 82% on emotional speech. Then we evaluate the effect of speaker bias and recording environment on the source modeling in order to quantify possible problems when analyzing multi-speaker databases. Finally we propose a speaking styles separation for Spanish based on prosodic features and check its perceptual significance.

Veja mais

Effects of traffic loads on reinforced concrete railroad culverts

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In the context of the present conference paper culverts are defined as an opening or conduit passing through an embankment usually for the purpose of conveying water or providing safe pedestrian and animal crossings under rail infrastructure. The clear opening of culverts may reach values of up to 12m however, values around 3m are encountered much more frequently. Depending on the topography, the number of culverts is about 10 times that of bridges. In spite of this, their dynamic behavior has received far less attention than that of bridges. The fundamental frequency of culverts is considerably higher than that of bridges even in the case of short span bridges. As the operational speed of modern high-speed passenger rail systems rises, higher frequencies are excited and thus more energy is encountered in frequency bands where the fundamental frequency of box culverts is located. Many research efforts have been spent on the subject of ballast instability due to bridge resonance, since it was first observed when high-speed trains were introduced to the Paris/Lyon rail line. To prevent this phenomenon from occurring, design codes establish a limit value for the vertical deck acceleration. Obviously one needs some sort of numerical model in order to estimate this acceleration level and at that point things get quite complicated. Not only acceleration but also displacement values are of interest e.g. to estimate the impact factor. According to design manuals the structural design should consider the depth of cover, trench width and condition, bedding type, backfill material, and compaction. The same applies to the numerical model however, the question is: What type of model is appropriate for this job? A 3D model including the embankment and an important part of the soil underneath the culvert is computationally very expensive and hard to justify taking into account the associated costs. Consequently, there is a clear need for simplified models and design rules in order to achieve reasonable costs. This paper will describe the results obtained from a 2D finite element model which has been calibrated by means of a 3D model and experimental data obtained at culverts that belong to the high-speed railway line that links the two towns of Segovia and Valladolid in Spain

Veja mais

Relevance of the glottal pulse and the vocal tract in gender detection

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Gender detection is a very important objective to improve efficiency in tasks as speech or speaker recognition, among others. Traditionally gender detection has been focused on fundamental frequency (f0) and cepstral features derived from voiced segments of speech. The methodology presented here consists in obtaining uncorrelated glottal and vocal tract components which are parameterized as mel-frequency coefficients. K-fold and cross-validation using QDA and GMM classifiers showed that better detection rates are reached when glottal source and vocal tract parameters are used in a gender-balanced database of running speech from 340 speakers.

Veja mais

Cepstral peak prominence: a comprehensive analysis

Relevância:

60.00% 60.00%

Publicador:

Resumo:

An analytical study of cepstral peak prominence (CPP) is presented, intended to provide an insight into its meaning and relation with voice perturbation parameters. To carry out this analysis, a parametric approach is adopted in which voice production is modelled using the traditional source-filter model and the first cepstral peak is assumed to have Gaussian shape. It is concluded that the meaning of CPP is very similar to that of the first rahmonic and some insights are provided on its dependence with fundamental frequency and vocal tract resonances. It is further shown that CPP integrates measures of voice waveform and periodicity perturbations, be them either amplitude, frequency or noise.

Veja mais

Desarrollo de una herramienta de análisis de la señal de voz

Relevância:

60.00% 60.00%

Publicador:

Resumo:

El habla es la principal herramienta de comunicación de la que dispone el ser humano que, no sólo le permite expresar su pensamiento y sus sentimientos sino que le distingue como individuo. El análisis de la señal de voz es fundamental para múltiples aplicaciones como pueden ser: síntesis y reconocimiento de habla, codificación, detección de patologías, identificación y reconocimiento de locutor… En el mercado se pueden encontrar herramientas comerciales o de libre distribución para realizar esta tarea. El objetivo de este Proyecto Fin de Grado es reunir varios algoritmos de análisis de la señal de voz en una única herramienta que se manejará a través de un entorno gráfico. Los algoritmos están siendo utilizados en el Grupo de investigación en Aplicaciones MultiMedia y Acústica de la Universidad Politécnica de Madrid para llevar a cabo su tarea investigadora y para ofertar talleres formativos a los alumnos de grado de la Escuela Técnica Superior de Ingeniería y Sistemas de Telecomunicación. Actualmente se ha encontrado alguna dificultad para poder aplicar los algoritmos ya que se han ido desarrollando a lo largo de varios años, por distintas personas y en distintos entornos de programación. Se han adaptado los programas existentes para generar una única herramienta en MATLAB que permite: . Detección de voz . Detección sordo/sonoro . Extracción y revisión manual de frecuencia fundamental de los sonidos sonoros . Extracción y revisión manual de formantes de los sonidos sonoros En todos los casos el usuario puede ajustar los parámetros de análisis y se ha mantenido y, en algunos casos, ampliado la funcionalidad de los algoritmos existentes. Los resultados del análisis se pueden manejar directamente en la aplicación o guardarse en un fichero. Por último se ha escrito el manual de usuario de la aplicación y se ha generado una aplicación independiente que puede instalarse y ejecutarse aunque no se disponga del software o de la versión adecuada de MATLAB. ABSTRACT. The speech is the main communication tool which has the human that as well as allowing to express his thoughts and feelings distinguishes him as an individual. The analysis of speech signal is essential for multiple applications such as: synthesis and recognition of speech, coding, detection of pathologies, identification and speaker recognition… In the market you can find commercial or open source tools to perform this task. The aim of this Final Degree Project is collect several algorithms of speech signal analysis in a single tool which will be managed through a graphical environment. These algorithms are being used in the research group Aplicaciones MultiMedia y Acústica at the Universidad Politécnica de Madrid to carry out its research work and to offer training workshops for students at the Escuela Técnica Superior de Ingeniería y Sistemas de Telecomunicación. Currently some difficulty has been found to be able to apply the algorithms as they have been developing over several years, by different people and in different programming environments. Existing programs have been adapted to generate a single tool in MATLAB that allows: . Voice Detection . Voice/Unvoice Detection . Extraction and manual review of fundamental frequency of voiced sounds . Extraction and manual review formant voiced sounds In all cases the user can adjust the scan settings, we have maintained and in some cases expanded the functionality of existing algorithms. The analysis results can be managed directly in the application or saved to a file. Finally we have written the application user’s manual and it has generated a standalone application that can be installed and run although the user does not have MATLAB software or the appropriate version.

Veja mais

High frequency high-order Rayleigh modes in ZnO/GaAs

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Strong high-order Rayleigh or Sezawa modes, in addition to the fundamental Rayleigh mode, have been observed in ZnO/GaAs(001) systems along the [110] propagation direction of GaAs. The dispersion of the different acoustic waves has been calculated and compared to the experimental data. The bandwidth and impedance matching characteristics of the multimode SAW delay lines operating at high frequencies (2.5-3.5 GHz regime) have been investigated.

Veja mais

Criteria for mathematical model selection for satellite vibro-acoustic analysis depending on frequency range

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Satellites and space equipment are exposed to diffuse acoustic fields during the launch process. The use of adequate techniques to model the response to the acoustic loads is a fundamental task during the design and verification phases. Considering the modal density of each element is necessary to identify the correct methodology. In this report selection criteria are presented in order to choose the correct modelling technique depending on the frequency ranges. A model satellite’s response to acoustic loads is presented, determining the modal densities of each component in different frequency ranges. The paper proposes to select the mathematical method in each modal density range and the differences in the response estimation due to the different used techniques. In addition, the methodologies to analyse the intermediate range of the system are discussed. The results are compared with experimental testing data obtained in an experimental modal test.

Veja mais

9 resultados para fundamental frequency

em Universidad Politécnica de Madrid

Filtro por publicador