961 resultados para Robust speech recognition
Resumo:
This paper presents an efficient Online Handwritten character Recognition System for Malayalam Characters (OHR-M) using Kohonen network. It would help in recognizing Malayalam text entered using pen-like devices. It will be more natural and efficient way for users to enter text using a pen than keyboard and mouse. To identify the difference between similar characters in Malayalam a novel feature extraction method has been adopted-a combination of context bitmap and normalized (x, y) coordinates. The system reported an accuracy of 88.75% which is writer independent with a recognition time of 15-32 milliseconds
Resumo:
Modeling nonlinear systems using Volterra series is a century old method but practical realizations were hampered by inadequate hardware to handle the increased computational complexity stemming from its use. But interest is renewed recently, in designing and implementing filters which can model much of the polynomial nonlinearities inherent in practical systems. The key advantage in resorting to Volterra power series for this purpose is that nonlinear filters so designed can be made to work in parallel with the existing LTI systems, yielding improved performance. This paper describes the inclusion of a quadratic predictor (with nonlinearity order 2) with a linear predictor in an analog source coding system. Analog coding schemes generally ignore the source generation mechanisms but focuses on high fidelity reconstruction at the receiver. The widely used method of differential pnlse code modulation (DPCM) for speech transmission uses a linear predictor to estimate the next possible value of the input speech signal. But this linear system do not account for the inherent nonlinearities in speech signals arising out of multiple reflections in the vocal tract. So a quadratic predictor is designed and implemented in parallel with the linear predictor to yield improved mean square error performance. The augmented speech coder is tested on speech signals transmitted over an additive white gaussian noise (AWGN) channel.
Resumo:
This paper discusses the implementation details of a child friendly, good quality, English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use with little memory. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based, which use standard speech databases available in neutral adult voices.Here reduced memory is achieved by the concatenation of phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis was used to reduce the algorithmic complexity instead of signal processing techniques. Sufficient degree of customization and generalization catering to the needs of the child user had been included through the provision for vocabulary and voice selection to suit the requisites of the child. Prosody had also been incorporated. This inexpensive TTS systemwas implemented inMATLAB, with the synthesis presented by means of a graphical user interface (GUI), thus making it child friendly. This can be used not only as an interesting language learning aid for the normal child but it also serves as a speech aid to the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).
Resumo:
This paper describes certain findings of intonation and intensity study of emotive speech with the minimal use of signal processing algorithms. This study was based on six basic emotions and the neutral, elicited from 1660 English utterances obtained from the speech recordings of six Indian women. The correctness of the emotional content was verified through perceptual listening tests. Marked similarity was noted among pitch contours of like-worded, positive valence emotions, though no such similarity was observed among the four negative valence emotional expressions. The intensity patterns were also studied. The results of the study were validated using arbitrary television recordings for four emotions. The findings are useful to technical researchers, social psychologists and to the common man interested in the dynamics of vocal expression of emotions
Resumo:
Für große Windenergieanlagen werden neue Pitchregler wie Einzelblattregler oder Turmdämpfungsregler entwickelt. Während diese neuen Pitchregler die Elemente der Windenergieanlagen entlasten, wird das Pitchantriebssystem stärker belastet. Die Pitchantriebe müssen weitaus häufiger bei höherer Amplitude arbeiten. Um die neuen Pitchregler nutzen zu können, muss zunächst das Problem der Materialermüdung der Pitchantriebssysteme gelöst werden. Das Getriebespiel in Getrieben und zwischen Ritzeln und dem Zahnkranz erhöht die Materialermüdung in den Pitchantriebssystemen. In dieser Studie werden als Lösung zwei Pitchantriebe pro Blatt vorgeschlagen. Die beiden Pitchantriebe erzeugen eine Spannung auf dem Pitchantriebssystem und kompensieren das Getriebespiel. Drehmomentspitzen, die eine Materialermüdung verursachen, treten bei diesem System mit zwei Pitchmotoren nicht mehr auf. Ein Reglerausgang wird via Drehmomentverteiler auf die beiden Pitchantriebe übertragen. Es werden mehrere Methoden verglichen und der leistungsfähigste Drehmomentverteiler ausgewählt. Während die Pitchantriebe in Bewegung sind, ändert sich die Spannung auf den Getrieben. Die neuen Pitchregler verstellen den Pitchwinkel in einer sinusförmigen Welle. Der Profilgenerator, der derzeit als Pitchwinkelregler verwendet wird, kann eine Phasenverzögerung im sinusförmigen Pitchwinkel verursachen. Zusätzlich erzeugen große Windenergieanlagen eine hohe Last, die sich störend auf die Pitchbewegung auswirkt. Änderungen der viskosen Reibung und Nichtlinearität der Gleitreibung bzw. Coulombsche Reibung des Pitchregelsystems erschweren zudem die Entwicklung eines Pitchwinkelreglers. Es werden zwei robuste Regler (H∞ und μ–synthesis ) vorgestellt und mit zwei herkömmlichen Reglern (PD und Kaskadenregler) verglichen. Zur Erprobung des Pitchantriebssystems und des Pitchwinkelreglers wird eine Prüfanordnung verwendet. Da der Kranz nicht mit einem Positionssensor ausgestattet ist, wird ein Überwachungselement entwickelt, das die Kranzposition meldet. Neben den beiden Pitchantrieben sind zwei Lastmotoren mit dem Kranz verbunden. Über die beiden Lastmotoren wird das Drehmoment um die Pitchachse einer Windenergieanlage simuliert. Das Drehmoment um die Pitchachse setzt sich zusammen aus Schwerkraft, aerodynamischer Kraft, zentrifugaler Belastung, Reibung aufgrund des Kippmoments und der Beschleunigung bzw. Verzögerung des Rotorblatts. Das Blatt wird als Zweimassenschwinger modelliert. Große Windenergieanlagen und neue Pitchregler für die Anlagen erfordern ein neues Pitchantriebssystem. Als Hardware-Lösung bieten sich zwei Pitchantriebe an mit einem robusten Regler als Software.
Resumo:
We are currently at the cusp of a revolution in quantum technology that relies not just on the passive use of quantum effects, but on their active control. At the forefront of this revolution is the implementation of a quantum computer. Encoding information in quantum states as “qubits” allows to use entanglement and quantum superposition to perform calculations that are infeasible on classical computers. The fundamental challenge in the realization of quantum computers is to avoid decoherence – the loss of quantum properties – due to unwanted interaction with the environment. This thesis addresses the problem of implementing entangling two-qubit quantum gates that are robust with respect to both decoherence and classical noise. It covers three aspects: the use of efficient numerical tools for the simulation and optimal control of open and closed quantum systems, the role of advanced optimization functionals in facilitating robustness, and the application of these techniques to two of the leading implementations of quantum computation, trapped atoms and superconducting circuits. After a review of the theoretical and numerical foundations, the central part of the thesis starts with the idea of using ensemble optimization to achieve robustness with respect to both classical fluctuations in the system parameters, and decoherence. For the example of a controlled phasegate implemented with trapped Rydberg atoms, this approach is demonstrated to yield a gate that is at least one order of magnitude more robust than the best known analytic scheme. Moreover this robustness is maintained even for gate durations significantly shorter than those obtained in the analytic scheme. Superconducting circuits are a particularly promising architecture for the implementation of a quantum computer. Their flexibility is demonstrated by performing optimizations for both diagonal and non-diagonal quantum gates. In order to achieve robustness with respect to decoherence, it is essential to implement quantum gates in the shortest possible amount of time. This may be facilitated by using an optimization functional that targets an arbitrary perfect entangler, based on a geometric theory of two-qubit gates. For the example of superconducting qubits, it is shown that this approach leads to significantly shorter gate durations, higher fidelities, and faster convergence than the optimization towards specific two-qubit gates. Performing optimization in Liouville space in order to properly take into account decoherence poses significant numerical challenges, as the dimension scales quadratically compared to Hilbert space. However, it can be shown that for a unitary target, the optimization only requires propagation of at most three states, instead of a full basis of Liouville space. Both for the example of trapped Rydberg atoms, and for superconducting qubits, the successful optimization of quantum gates is demonstrated, at a significantly reduced numerical cost than was previously thought possible. Together, the results of this thesis point towards a comprehensive framework for the optimization of robust quantum gates, paving the way for the future realization of quantum computers.
Resumo:
The report describes a recognition system called GROPER, which performs grouping by using distance and relative orientation constraints that estimate the likelihood of different edges in an image coming from the same object. The thesis presents both a theoretical analysis of the grouping problem and a practical implementation of a grouping system. GROPER also uses an indexing module to allow it to make use of knowledge of different objects, any of which might appear in an image. We test GROPER by comparing it to a similar recognition system that does not use grouping.
Resumo:
This thesis addresses the problem of developing automatic grasping capabilities for robotic hands. Using a 2-jointed and a 4-jointed nmodel of the hand, we establish the geometric conditions necessary for achieving form closure grasps of cylindrical objects. We then define and show how to construct the grasping pre-image for quasi-static (friction dominated) and zero-G (inertia dominated) motions for sensorless and sensor-driven grasps with and without arm motions. While the approach does not rely on detailed modeling, it is computationally inexpensive, reliable, and easy to implement. Example behaviors were successfully implemented on the Salisbury hand and on a planar 2-fingered, 4 degree-of-freedom hand.
Resumo:
Two formulations of model-based object recognition are described. MAP Model Matching evaluates joint hypotheses of match and pose, while Posterior Marginal Pose Estimation evaluates the pose only. Local search in pose space is carried out with the Expectation--Maximization (EM) algorithm. Recognition experiments are described where the EM algorithm is used to refine and evaluate pose hypotheses in 2D and 3D. Initial hypotheses for the 2D experiments were generated by a simple indexing method: Angle Pair Indexing. The Linear Combination of Views method of Ullman and Basri is employed as the projection model in the 3D experiments.
Resumo:
The report addresses the problem of visual recognition under two sources of variability: geometric and photometric. The geometric deals with the relation between 3D objects and their views under orthographic and perspective projection. The photometric deals with the relation between 3D matte objects and their images under changing illumination conditions. Taken together, an alignment-based method is presented for recognizing objects viewed from arbitrary viewing positions and illuminated by arbitrary settings of light sources.