914 results for Speech Recognition System using LPC


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The design of a modern aircraft is based on three pillars: theoretical results, experimental tests, and computational simulations. As a result, Computational Fluid Dynamics (CFD) solvers are widely used in the aeronautical field. These solvers require the correct selection of many parameters in order to obtain successful results. In addition, the computational time spent in the simulation depends on the proper choice of these parameters. In this paper we create an expert system that predicts the number of iterations and the time required for the convergence of a CFD solver. An artificial neural network (ANN) is used to design the expert system. It is shown that the developed expert system is capable of accurately predicting the number of iterations and the time required for the convergence of a CFD solver.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Functional validation of complex digital systems is a hard and critical task in the design flow. In particular, when dealing with communication systems such as Multiband Orthogonal Frequency Division Multiplexing Ultra Wideband (MB-OFDM UWB), the design decisions taken during the process have to be validated easily at different levels. In this work, a unified algorithm-architecture-circuit co-design environment for this type of system, to be implemented in FPGA, is presented. The main objective is to find an efficient methodology for designing a configurable, optimized MB-OFDM UWB system with as little effort as possible in the verification stage, so as to shorten the development period. Although this design methodology is tested and considered suitable for almost all types of complex FPGA designs, we propose a solution where both the circuit and the communication channel are tested at different levels (algorithmic, RTL, hardware device) using a common testbench.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There is clear evidence that investment in intelligent transportation system technologies brings major social and economic benefits. Technological advances in the area of automatic systems in particular are becoming vital for the reduction of road deaths. Here we describe our approach to automating one of the riskiest autonomous manœuvres involving vehicles: overtaking. The approach is based on a stereo vision system responsible for detecting any preceding vehicle and triggering the autonomous overtaking manœuvre. To this end, a fuzzy-logic-based controller was developed to emulate how humans overtake. Its input is information from the vision system and from a positioning system consisting of a differential global positioning system (DGPS) and an inertial measurement unit (IMU). Its output is the action applied to the vehicle’s actuators, i.e., the steering wheel and the throttle and brake pedals. The system has been incorporated into a commercial Citroën car and tested on the private driving circuit at the facilities of our research center, CAR, with different preceding vehicles (a motorbike, a car, and a truck), with encouraging results.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

New forms of natural interaction between human operators and unmanned aerial vehicles (UAVs) are demanded by the military industry to achieve a better balance between UAV control and the burden on the human operator. In this work, a human-machine interface (HMI) based on a novel gesture recognition system using depth imagery is proposed for the control of UAVs. Hand gesture recognition based on depth imagery is a promising approach for HMIs because it is more intuitive, natural, and non-intrusive than alternatives that use complex controllers. The proposed system is based on a Support Vector Machine (SVM) classifier that uses spatio-temporal depth descriptors as input features. The designed descriptor is based on a variation of the Local Binary Pattern (LBP) technique that works efficiently with depth video sequences. Another major consideration is the special hand sign language used for UAV control. A tradeoff between the use of natural hand signs and the minimization of inter-sign interference has been established. Promising results have been achieved on a depth-based database of hand gestures developed specifically for the validation of the proposed system.
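As a rough illustration of the kind of descriptor mentioned above, the following sketch computes the basic 8-neighbour Local Binary Pattern on a single depth frame and collects the codes into a histogram. This is the textbook LBP, not the spatio-temporal variation designed in the paper, whose details are not given in the abstract; the names `OFFSETS`, `lbp_code`, and `lbp_histogram` are hypothetical.

```python
# Neighbour offsets (row, col), clockwise starting from the top-left pixel.
# The ordering and bit weights are an assumed convention.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the interior pixel at (r, c).

    Bit i is set when the i-th neighbour is at least as large as the centre.
    """
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of one frame."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

A histogram like this, computed per frame or per block, is the sort of fixed-length feature vector that could be passed to an SVM classifier.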

Relevance:

100.00% 100.00%

Publisher:

Abstract:

An implementation of a real-time 3D videoconferencing system using currently available technology is presented. This approach is based on the side-by-side spatial compression of the stereoscopic images. The encoder and the decoder have been implemented on a standard personal computer, and a conventional 3D-compatible TV has been used to present the frames. Moreover, users without 3D technology can use the system because a 2D compatibility mode has been implemented in the decoder. The performance results show that a conventional computer can be used for encoding/decoding audio and video streams and that the delay in the transmission is lower than 200 ms.
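The side-by-side packing described above can be sketched as follows. This is a minimal illustration, not the authors' encoder: it assumes each view is a list of pixel rows, halves the horizontal resolution by averaging adjacent pixel pairs, and models the 2D compatibility mode as keeping only the left view. The function names and both of those choices are assumptions.

```python
def pack_side_by_side(left, right):
    """Pack two equally sized views into one frame of the original width.

    Each view is a list of rows; each row is a list of pixel values.
    Horizontal resolution is halved by averaging adjacent pixel pairs
    (an assumed, very simple decimation filter).
    """
    if len(left) != len(right) or any(len(l) != len(r) for l, r in zip(left, right)):
        raise ValueError("views must have the same size")

    def halve(row):
        # Average each pair of horizontally adjacent pixels.
        return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]

    # Left view occupies the left half of the frame, right view the right half.
    return [halve(l) + halve(r) for l, r in zip(left, right)]


def unpack_2d(frame):
    """Assumed 2D fallback: keep the left half, stretch it by pixel repetition."""
    half = len(frame[0]) // 2
    return [[p for p in row[:half] for _ in range(2)] for row in frame]
```

Because the packed frame has the same dimensions as a single view, it can be passed through an ordinary 2D video codec unchanged, which is the usual motivation for this layout.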

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work addresses the problem of object tracking, whose goal is to find the trajectory of an object in a video sequence. To this end, a tracking-by-detection method has been developed that builds an appearance model in a compressed domain using a novel technique: compressive sensing. The only information required is the location of the object to be tracked in the first frame of the sequence. Object tracking is a typical application in the field of computer vision and has been under development for many years. Even so, it remains a challenging task because of several factors: illumination changes, partial or total occlusion of the objects, and the complexity of the scene background, all of which must be considered to achieve robust tracking. To deal with these factors as effectively as possible, we propose a tracking algorithm that trains a Support Vector Machine (SVM) classifier online to separate the objects from the scene background. To this end, we generate our appearance model by means of a very robust feature descriptor that describes the objects and the background and returns a very high-dimensional vector. A subsequent step therefore reduces the dimensionality of these vectors so that the classifier can be trained in a much smaller domain, which we call the compressed domain. The dimensionality reduction of the feature vectors is based on the theory of compressive sensing, which states that a sparse signal (one with few nonzero components) can be well represented, and can even be reconstructed, from a very small set of samples.
The theory of compressive sensing has been applied successfully in this work, and different measurement and reconstruction techniques have been tested to evaluate our reduced vectors, verifying that they preserve the information of the original vectors. We also include an update of the appearance model of the tracked object by retraining our classifier at every frame of the sequence with positive and negative samples obtained from the position predicted by the tracking algorithm at each time instant. The proposed algorithm has been evaluated on several sequences and compared with other state-of-the-art tracking algorithms to demonstrate the success of our method.
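The measurement step of compressive sensing can be sketched as a random linear projection y = Φx, where Φ has far fewer rows than columns. The code below is only an illustration under stated assumptions: it uses a Gaussian measurement matrix, which is one common choice, whereas the abstract does not say which measurement techniques were tested; the names `make_measurement_matrix` and `compress` are hypothetical.

```python
import math
import random


def make_measurement_matrix(m, n, seed=0):
    """Build an m x n Gaussian measurement matrix (assumed choice of matrix).

    Entries are drawn from N(0, 1/m), so squared norms are preserved in expectation.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def compress(phi, x):
    """Project a length-n feature vector x to m measurements: y = phi * x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]
```

For example, `compress(make_measurement_matrix(50, 10000), features)` would map a 10000-dimensional feature vector to 50 values, on which a classifier could then be trained in the compressed domain.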

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this Master's thesis is the analysis, design, and development of a robust and reliable Human-Computer Interaction interface based on visual hand-gesture recognition. The implementation of the required functions is oriented to the simulation of a classical hardware interaction device, the mouse, by recognizing a specific hand-gesture vocabulary in color video sequences. For this purpose, a prototype of a hand-gesture recognition system has been designed and implemented, which is composed of three stages: detection, tracking, and recognition. This system is based on machine learning methods and pattern recognition techniques, which have been integrated with other image processing approaches to achieve high recognition accuracy at a low computational cost. Regarding pattern recognition techniques, several algorithms and strategies have been designed and implemented, which are applicable to color images and video sequences. These algorithms are designed to extract spatial and spatio-temporal features from static and dynamic hand gestures in order to identify them in a robust and reliable way. Finally, a visual database containing the necessary vocabulary of gestures for interacting with the computer has been created.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The high incidence of neurological disorders in patients afflicted with acquired immunodeficiency syndrome (AIDS) may result from human immunodeficiency virus type 1 (HIV-1) induction of chemotactic signals and cytokines within the brain by virus-encoded gene products. Transforming growth factor beta1 (TGF-beta1) is an immunomodulator and potent chemotactic molecule present at elevated levels in HIV-1-infected patients, and its expression may thus be induced by viral trans-activating proteins such as Tat. In this report, a replication-defective herpes simplex virus (HSV)-1 tat gene transfer vector, dSTat, was used to transiently express HIV-1 Tat in glial cells in culture and following intracerebral inoculation in mouse brain in order to directly determine whether Tat can increase TGF-beta1 mRNA expression. dSTat infection of Vero cells transiently transfected by a panel of HIV-1 long terminal repeat deletion mutants linked to the bacterial chloramphenicol acetyltransferase reporter gene demonstrated that vector-expressed Tat activated the long terminal repeat in a trans-activation response element-dependent fashion independent of the HSV-mediated induction of the HIV-1 enhancer, or NF-kappaB domain. Northern blot analysis of human astrocytic glial U87-MG cells transfected by dSTat vector DNA resulted in a substantial increase in steady-state levels of TGF-beta1 mRNA. Furthermore, intracerebral inoculation of dSTat followed by Northern blot analysis of whole mouse brain RNA revealed an increase in levels of TGF-beta1 mRNA similar to that observed in cultured glial cells transfected by dSTat DNA. These results provided direct in vivo evidence for the involvement of HIV-1 Tat in activation of TGF-beta1 gene expression in brain. Tat-mediated stimulation of TGF-beta1 expression suggests a novel pathway by which HIV-1 may alter the expression of cytokines in the central nervous system, potentially contributing to the development of AIDS-associated neurological disease.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper introduces the session on advanced speech recognition technology. The two papers comprising this session argue that current technology yields a performance that is only an order of magnitude in error rate away from human performance and that incremental improvements will bring us to that desired level. I argue that, to the contrary, present performance is far removed from human performance and a revolution in our thinking is required to achieve the goal. It is further asserted that to bring about the revolution more effort should be expended on basic research and less on trying to prematurely commercialize a deficient technology.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the past decade, tremendous advances in the state of the art of automatic speech recognition by machine have taken place. A reduction in the word error rate by more than a factor of 5 and an increase in recognition speeds by several orders of magnitude (brought about by a combination of faster recognition search algorithms and more powerful computers) have combined to make high-accuracy, speaker-independent, continuous speech recognition for large vocabularies possible in real time, on off-the-shelf workstations, without the aid of special hardware. These advances promise to make speech recognition technology readily available to the general public. This paper focuses on the speech recognition advances made through better speech modeling techniques, chiefly through more accurate mathematical modeling of speech sounds.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
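The search step described above can be illustrated with a small Viterbi decoder for a discrete hidden Markov model. This is a simplified sketch under stated assumptions: output probabilities are attached to states rather than to transitions as in the abstract, no language model factor is included, and the toy states and symbols in the usage note are hypothetical.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path and its log-probability.

    start_p[s], trans_p[r][s], and emit_p[s][o] are plain probabilities;
    the search runs in the log domain to avoid numerical underflow.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximizes the path score into s.
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]
```

Usage with a toy two-state model: with `start_p = {"s1": 0.6, "s2": 0.4}`, transitions `{"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}`, and emissions `{"s1": {"a": 0.5, "b": 0.4, "c": 0.1}, "s2": {"a": 0.1, "b": 0.3, "c": 0.6}}`, decoding the sequence `["a", "b", "c"]` yields the path `["s1", "s1", "s2"]` with probability 0.01512.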

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Federal Highway Administration, Implementation Division, Washington, D.C.