875 resultados para Linear Multi-step Formulae
Resumo:
This thesis presents an original approach to parametric speech coding at rates below 1 kbitsjsec, primarily for speech storage applications. Essential processes considered in this research encompass efficient characterization of evolutionary configuration of vocal tract to follow phonemic features with high fidelity, representation of speech excitation using minimal parameters with minor degradation in naturalness of synthesized speech, and finally, quantization of resulting parameters at the nominated rates. For encoding speech spectral features, a new method relying on Temporal Decomposition (TD) is developed which efficiently compresses spectral information through interpolation between most steady points over time trajectories of spectral parameters using a new basis function. The compression ratio provided by the method is independent of the updating rate of the feature vectors, hence allows high resolution in tracking significant temporal variations of speech formants with no effect on the spectral data rate. Accordingly, regardless of the quantization technique employed, the method yields a high compression ratio without sacrificing speech intelligibility. Several new techniques for improving performance of the interpolation of spectral parameters through phonetically-based analysis are proposed and implemented in this research, comprising event approximated TD, near-optimal shaping event approximating functions, efficient speech parametrization for TD on the basis of an extensive investigation originally reported in this thesis, and a hierarchical error minimization algorithm for decomposition of feature parameters which significantly reduces the complexity of the interpolation process. Speech excitation in this work is characterized based on a novel Multi-Band Excitation paradigm which accurately determines the harmonic structure in the LPC (linear predictive coding) residual spectra, within individual bands, using the concept 11 of Instantaneous Frequency (IF) estimation in frequency domain. The model yields aneffective two-band approximation to excitation and computes pitch and voicing with high accuracy as well. New methods for interpolative coding of pitch and gain contours are also developed in this thesis. For pitch, relying on the correlation between phonetic evolution and pitch variations during voiced speech segments, TD is employed to interpolate the pitch contour between critical points introduced by event centroids. This compresses pitch contour in the ratio of about 1/10 with negligible error. To approximate gain contour, a set of uniformly-distributed Gaussian event-like functions is used which reduces the amount of gain information to about 1/6 with acceptable accuracy. The thesis also addresses a new quantization method applied to spectral features on the basis of statistical properties and spectral sensitivity of spectral parameters extracted from TD-based analysis. The experimental results show that good quality speech, comparable to that of conventional coders at rates over 2 kbits/sec, can be achieved at rates 650-990 bits/sec.
Resumo:
Shell structures find use in many fields of engineering, notably structural, mechanical, aerospace and nuclear-reactor disciplines. Axisymmetric shell structures are used as dome type of roofs, hyperbolic cooling towers, silos for storage of grain, oil and industrial chemicals and water tanks. Despite their thin walls, strength is derived due to the curvature. The generally high strength-to-weight ratio of the shell form, combined with its inherent stiffness, has formed the basis of this vast application. With the advent in computation technology, the finite element method and optimisation techniques, structural engineers have extremely versatile tools for the optimum design of such structures. Optimisation of shell structures can result not only in improved designs, but also in a large saving of material. The finite element method being a general numerical procedure that could be used to treat any shell problem to any desired degree of accuracy, requires several runs in order to obtain a complete picture of the effect of one parameter on the shell structure. This redesign I re-analysis cycle has been achieved via structural optimisation in the present research, and MSC/NASTRAN (a commercially available finite element code) has been used in this context for volume optimisation of axisymmetric shell structures under axisymmetric and non-axisymmetric loading conditions. The parametric study of different axisymmetric shell structures has revealed that the hyperbolic shape is the most economical solution of shells of revolution. To establish this, axisymmetric loading; self-weight and hydrostatic pressure, and non-axisymmetric loading; wind pressure and earthquake dynamic forces have been modelled on graphical pre and post processor (PATRAN) and analysis has been performed on two finite element codes (ABAQUS and NASTRAN), numerical model verification studies are performed, and optimum material volume required in the walls of cylindrical, conical, parabolic and hyperbolic forms of axisymmetric shell structures are evaluated and reviewed. Free vibration and transient earthquake analysis of hyperbolic shells have been performed once it was established that hyperbolic shape is the most economical under all possible loading conditions. Effect of important parameters of hyperbolic shell structures; shell wall thickness, height and curvature, have been evaluated and empirical relationships have been developed to estimate an approximate value of the lowest (first) natural frequency of vibration. The outcome of this thesis has been the generation of new research information on performance characteristics of axisymmetric shell structures that will facilitate improved designs of shells with better choice of shapes and enhanced levels of economy and performance. Key words; Axisymmetric shell structures, Finite element analysis, Volume Optimisation_ Free vibration_ Transient response.
Resumo:
Artificial neural network (ANN) learning methods provide a robust and non-linear approach to approximating the target function for many classification, regression and clustering problems. ANNs have demonstrated good predictive performance in a wide variety of practical problems. However, there are strong arguments as to why ANNs are not sufficient for the general representation of knowledge. The arguments are the poor comprehensibility of the learned ANN, and the inability to represent explanation structures. The overall objective of this thesis is to address these issues by: (1) explanation of the decision process in ANNs in the form of symbolic rules (predicate rules with variables); and (2) provision of explanatory capability by mapping the general conceptual knowledge that is learned by the neural networks into a knowledge base to be used in a rule-based reasoning system. A multi-stage methodology GYAN is developed and evaluated for the task of extracting knowledge from the trained ANNs. The extracted knowledge is represented in the form of restricted first-order logic rules, and subsequently allows user interaction by interfacing with a knowledge based reasoner. The performance of GYAN is demonstrated using a number of real world and artificial data sets. The empirical results demonstrate that: (1) an equivalent symbolic interpretation is derived describing the overall behaviour of the ANN with high accuracy and fidelity, and (2) a concise explanation is given (in terms of rules, facts and predicates activated in a reasoning episode) as to why a particular instance is being classified into a certain category.
Resumo:
Automatic spoken Language Identi¯cation (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identi¯ca- tion systems provide. A prominent application arises in call centers dealing with speakers speaking di®erent languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable fea- tures for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Pho- netic speech information is extracted using existing speech recognition technol- ogy. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by di®erent speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.
Resumo:
The control and coordination of multiple mobile robots is a challenging task; particularly in environments with multiple, rapidly moving obstacles and agents. This paper describes a robust approach to multi-robot control, where robustness is gained from competency at every layer of robot control. The layers are: (i) a central coordination system (MAPS), (ii) an action system (AES), (iii) a navigation module, and (iv) a low level dynamic motion control system. The multi-robot coordination system assigns each robot a role and a sub-goal. Each robots action execution system then assumes the assigned role and attempts to achieve the specified sub-goal. The robots navigation system directs the robot to specific goal locations while ensuring that the robot avoids any obstacles. The motion system maps the heading and speed information from the navigation system to force-constrained motion. This multi-robot system has been extensively tested and applied in the robot soccer domain using both centralized and distributed coordination.