986 results for Training algorithms


Relevance: 60.00%

Abstract:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product-code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20 bits, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification.
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
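The coupling of product-code VQ with lossless coding of the codebook indices can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the abstract does not say which product-code structure is used, so a split VQ (one codebook per sub-vector) stands in for it, and the codebooks, frames and helper names are hypothetical. The zeroth-order entropy of the indices is used only as a rough indication of what a lossless coder could gain over fixed-rate indices.

```python
import math
from collections import Counter

def nearest_index(vector, codebook):
    """Return the index of the codeword closest to `vector` (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, codebook[i])))

def split_vq_encode(frame, codebooks, split_sizes):
    """Quantize each sub-vector of `frame` with its own codebook (split VQ, assumed here)."""
    indices, start = [], 0
    for codebook, size in zip(codebooks, split_sizes):
        indices.append(nearest_index(frame[start:start + size], codebook))
        start += size
    return indices

def empirical_entropy_bits(symbols):
    """Zeroth-order entropy in bits per symbol of the observed index stream."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

# Hypothetical toy codebooks: two sub-vectors of dimension 2, four codewords each (2 bits per index).
codebooks = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)],
]
split_sizes = [2, 2]
frames = [(0.1, 0.1, 0.2, 0.1), (0.0, 0.2, 0.1, 0.0), (0.9, 0.1, 0.0, 0.3), (0.1, 0.0, 1.9, 0.2)]

encoded = [split_vq_encode(f, codebooks, split_sizes) for f in frames]
# Fixed-rate cost per frame: log2(codebook size) bits for each index.
fixed_bits = sum(math.log2(len(cb)) for cb in codebooks)
# Entropy of each index stream, summed over the sub-vectors.
entropy_bits = sum(empirical_entropy_bits([e[k] for e in encoded]) for k in range(len(codebooks)))
```

When the indices are unevenly distributed, `entropy_bits` falls below `fixed_bits`, which is the kind of gain the lossy-plus-lossless coupling exploits. For scale, 24 bits per 20 ms frame corresponds to 1200 bit/s, and 20 bits per frame to 1000 bit/s.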

Relevance: 60.00%

Abstract:

Stationary processes are random variables whose value is a signal and whose distribution is invariant to translation in the domain of the signal. They are intimately connected to convolution, and therefore to the Fourier transform, since the covariance matrix of a stationary process is a Toeplitz matrix, and Toeplitz matrices are the expression of convolution as a linear operator. This thesis utilises this connection in the study of i) efficient training algorithms for object detection and ii) trajectory-based non-rigid structure-from-motion.
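The link between stationarity, convolution and the Fourier transform can be checked numerically on a small case. The sketch below assumes a periodic signal, so that the Toeplitz covariance becomes a circulant matrix (a special case of Toeplitz); the autocovariance values and the helper names are hypothetical. It compares a direct matrix-vector product with the same result computed through the discrete Fourier transform.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, sufficient for a small illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
           for j in range(n)]
    return [v / n for v in out] if inverse else out

def circulant(c):
    """Circulant matrix with first column `c`; entry (i, j) depends only on (i - j) mod n."""
    n = len(c)
    return [[c[(i - j) % n] for j in range(n)] for i in range(n)]

def matvec(a, x):
    """Plain matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

# Hypothetical autocovariance of a periodic stationary signal (symmetric in the lag).
c = [2.0, 1.0, 0.5, 1.0]
x = [1.0, -1.0, 3.0, 0.5]

cov = circulant(c)
direct = matvec(cov, x)  # covariance applied as a linear operator
# Convolution theorem: the same product is an element-wise multiplication in the Fourier domain.
via_dft = dft([a * b for a, b in zip(dft(c), dft(x))], inverse=True)
max_error = max(abs(d - v) for d, v in zip(direct, via_dft))
```

With a fast Fourier transform in place of the naive `dft`, the product costs O(n log n) instead of O(n^2), which is one reason this structure is attractive for efficient training.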

Relevance: 60.00%

Abstract:

Structural Support Vector Machines (SSVMs) and Conditional Random Fields (CRFs) are popular discriminative methods used for classifying structured and complex objects like parse trees, image segments and part-of-speech tags. The datasets involved are very high-dimensional, and the models designed using typical training algorithms for SSVMs and CRFs are non-sparse. This non-sparse nature of the models results in slow inference. Thus, there is a need to devise new algorithms for sparse SSVM and CRF classifier design. Use of elastic net and L1-regularizer has already been explored for solving primal CRF and SSVM problems, respectively, to design sparse classifiers. In this work, we focus on dual elastic net regularized SSVM and CRF. By exploiting the weakly coupled structure of these convex programming problems, we propose a new sequential alternating proximal (SAP) algorithm to solve these dual problems. This algorithm works by sequentially visiting each training set example and solving a simple subproblem restricted to a small subset of variables associated with that example. Numerical experiments on various benchmark sequence labeling datasets demonstrate that the proposed algorithm scales well. Further, the classifiers designed are sparser than those designed by solving the respective primal problems and demonstrate comparable generalization performance. Thus, the proposed SAP algorithm is a useful alternative for sparse SSVM and CRF classifier design.
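To see how an elastic net penalty produces sparse weights, it may help to look at its proximal operator, a standard building block of proximal methods. This is not the SAP algorithm from the abstract, whose subproblems are not described here; it is only a minimal sketch of the generic proximal step for the penalty l1*|w| + (l2/2)*w^2, with hypothetical function names and parameter values.

```python
def soft_threshold(v, tau):
    """Shrink `v` toward zero by `tau`; values within [-tau, tau] become exactly zero."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def prox_elastic_net(w, step, l1, l2):
    """Proximal operator of step * (l1 * |w| + (l2 / 2) * w^2), applied coordinate-wise."""
    return [soft_threshold(v, step * l1) / (1.0 + step * l2) for v in w]

# Hypothetical weights and penalty strengths.
weights = [0.05, -0.3, 1.2]
shrunk = prox_elastic_net(weights, step=1.0, l1=0.1, l2=1.0)
```

The L1 part sets small coordinates exactly to zero (the first weight above), while the quadratic part scales the surviving coordinates down; sparse models of this kind need fewer feature evaluations at inference time.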

Relevance: 60.00%

Abstract:

The optimization of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet, it is still the case that the commonly used training algorithms for SDS require a large number of dialogues and hence most systems still rely on artificial data generated by a user simulator. Optimization is therefore performed off-line before releasing the system to real users. Gaussian Processes (GP) for RL have recently been applied to dialogue systems. One advantage of GP is that they compute an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which use uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements to learning speed can be obtained and that rapid and safe online optimisation is possible, even on a complex task. Copyright © 2011 ISCA.
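One simple way to let uncertainty drive exploration is to sample a value for each action from its estimated distribution and act greedily on the samples. The sketch below assumes that a mean and a variance of the value estimate are already available per action, as a GP-based learner would provide; it does not implement a Gaussian process, and it is not claimed to be one of the strategies compared in the paper. The action names and numbers are hypothetical.

```python
import random

def choose_action(q_mean, q_var, rng):
    """Pick the action whose sampled value is largest; samples are drawn from N(mean, variance)."""
    sampled = {a: rng.gauss(q_mean[a], q_var[a] ** 0.5) for a in q_mean}
    return max(sampled, key=sampled.get)

rng = random.Random(0)
# Hypothetical value estimates for two dialogue actions.
q_mean = {"confirm": 1.0, "request": 0.0}
certain = {"confirm": 0.0, "request": 0.0}    # no uncertainty: behaves greedily
uncertain = {"confirm": 0.0, "request": 4.0}  # "request" is poorly explored

greedy_picks = [choose_action(q_mean, certain, rng) for _ in range(100)]
explore_picks = [choose_action(q_mean, uncertain, rng) for _ in range(1000)]
```

With zero variance the choice is always the action with the highest mean; as the variance of an action grows, it is tried more often, and exploration fades on its own once the estimates become confident.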

Relevance: 60.00%

Abstract:

This thesis presents a method for applying neural networks that iteratively combines classification and forecasting steps, with the aim of achieving better forecasts than a single sequential execution of these steps. The method is discussed using the example of forecasting wind power generation as a function of the weather situation. In this setting, an improvement is achieved, with a few individual outliers. Various aspects are discussed in three chapters. Chapter 1 presents the data used and their electronic processing. The data consist, on the one hand, of wind power extrapolations for the Federal Republic of Germany for the years 2011 and 2012, which the transmission system operators must publish as a transparency requirement of the Renewable Energy Sources Act (Erneuerbare-Energien-Gesetz). On the other hand, weather forecasts that the Deutscher Wetterdienst provides free of charge as part of its basic services are used. Chapter 2 explains two methods known from the literature, the online algorithm and the batch algorithm, for training a self-organizing map. The properties of these methods motivate the choice of the batch method for the approach explained in Chapter 3. In modelled operational use, the method presented in Chapter 3 follows the same sequence as a classification followed by a class-specific forecast. Training, however, proceeds iteratively: after the class-specific forecast models have been trained, it is determined to which class of the classification an input should belong in order to achieve the highest forecast quality with the available class-specific forecast models. The partition of the inputs obtained in this way can then be used to train a new classification stage whose classes enable an improved class-specific forecast.
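The batch algorithm for training a self-organizing map, mentioned in the abstract, can be sketched as follows. This is a generic textbook-style version on a one-dimensional map with a Gaussian neighbourhood, not the thesis implementation; the data, map size and `sigma` are hypothetical. In each epoch all inputs are first assigned to their best-matching unit, and then every unit is replaced by a neighbourhood-weighted mean of all inputs.

```python
import math

def batch_som(data, weights, epochs=10, sigma=1.0):
    """Batch training of a 1-D self-organizing map; `weights` holds one vector per map unit."""
    weights = [list(w) for w in weights]
    for _ in range(epochs):
        # Step 1: assign every input to its best-matching unit using the current weights.
        bmus = [min(range(len(weights)),
                    key=lambda j: sum((xi - wi) ** 2 for xi, wi in zip(x, weights[j])))
                for x in data]
        # Step 2: replace each unit by the neighbourhood-weighted mean of all inputs.
        for j in range(len(weights)):
            h = [math.exp(-((j - b) ** 2) / (2 * sigma ** 2)) for b in bmus]
            total = sum(h)
            weights[j] = [sum(hi * x[d] for hi, x in zip(h, data)) / total
                          for d in range(len(weights[j]))]
    return weights

# Hypothetical one-dimensional data with two clusters and a map of two units.
data = [(0.0,), (0.1,), (0.9,), (1.0,)]
trained = batch_som(data, weights=[(0.2,), (0.8,)], epochs=5, sigma=0.5)
```

Unlike the online variant, the batch update has no learning rate and does not depend on the order in which the inputs are presented, which makes runs reproducible for a given initialization.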

Relevance: 60.00%

Abstract:

Communication signal processing applications often involve complex-valued (CV) functional representations for signals and systems. CV artificial neural networks have been studied theoretically and applied widely in nonlinear signal and data processing [1–11]. Note that most artificial neural networks cannot be automatically extended from the real-valued (RV) domain to the CV domain because the resulting model would in general violate the Cauchy-Riemann conditions, and this means that the training algorithms become unusable. A number of analytic functions were introduced for the fully CV multilayer perceptron (MLP) [4]. A fully CV radial basis function (RBF) network was introduced in [8] for regression and classification applications. Alternatively, the problem can be avoided by using two RV artificial neural networks, one processing the real part and the other processing the imaginary part of the CV signal/system. An even more challenging problem is the inverse of a CV

Relevance: 60.00%

Abstract:

The search for ever smaller devices without loss of performance has been increasingly pursued by researchers in applied electromagnetics. Antennas using ceramic materials with a high dielectric constant, whether acting as the substrate of a radiating patch or as the radiating element itself, are prominent in current research because of the numerous advantages they offer, such as: low profile, reduced dimensions compared with other devices, high radiation efficiency, suitability for the microwave and/or millimeter-wave range, low temperature coefficient and low cost. The reason for this high efficiency is that the dielectric losses of ceramics are very low compared with the commercial materials used in printed circuit boards, such as fiberglass and phenolite. These characteristics make ceramic devices suitable for operation in the microwave band. By combining the design of patch antennas and/or dielectric resonator antennas (DRA) with certain materials and with the method used to synthesize these powders in the manufacture of devices, it is possible to choose a material with a dielectric constant appropriate for the design of an antenna of the desired size. The main aim of this work is the design of patch antennas and DRAs based on the synthesis of nanostructured ceramic powders (synthesis by combustion and by polymeric precursors, the Pechini method) with applications in the microwave band. The conventional mixed-oxide method was also used to obtain nanometric powders for the preparation of tablets and dielectric resonators. Because the devices manufactured and studied use high dielectric constant materials, they are good candidates for small size compared with other devices operating in the same frequency band. The structures analyzed are excited by three different techniques: i) microstrip line, ii) aperture coupling and iii) inductive coupling.
The efficiency of these techniques has been investigated experimentally and compared with simulations in Ansoft HFSS, which is used for accurate analysis of the electromagnetic behavior of antennas by the finite element method (FEM). This thesis presents a literature study on the theory of microstrip antennas and DRAs. A similar study is carried out on the materials and methods of synthesis of ceramic powders, which are used in the manufacture of the tablets and dielectric cylinders that make up the devices investigated. The dielectric media used to support the analysis of the DRA and/or patch antennas are analyzed through accurate finite difference time domain (FDTD) simulations based on the relative electrical permittivity (εr) and loss tangent (tan δ) of these media. This work also presents a study on artificial neural networks, showing the network architecture used and its characteristics, as well as the training algorithms that were used in training and in modeling some parameters associated with the devices investigated.

Relevance: 60.00%

Abstract:

Microstrip antennas are prominent in current research because of the several advantages they offer. Fractal geometry, coupled with the good performance and convenience of planar structures, is an excellent combination for the design and analysis of ever smaller, multi-resonant and broadband structures. This geometry has been applied to patch microstrip antennas to reduce their size and to bring out their multi-band behavior. Compared with conventional microstrip antennas, quasi-fractal patch antennas have lower resonance frequencies, enabling the manufacture of more compact antennas. The aim of this work is the design of quasi-fractal patch antennas through the use of Koch and Minkowski fractal curves applied to the radiating and non-radiating edges of a conventional rectangular patch fed by an inset-fed microstrip line, initially designed for a frequency of 2.45 GHz. The inset-fed technique is investigated for the impedance matching of fractal antennas, which are fed through microstrip lines. The efficiency of this technique is investigated experimentally and compared with simulations carried out in the commercial software Ansoft Designer, used for precise analysis of the electromagnetic behavior of antennas by the method of moments, and with the proposed neural model. This dissertation presents a literature study on the theory of microstrip antennas; a similar study is carried out on fractal geometry, with emphasis on its various forms, techniques for generating fractals and its applicability. This work also presents a study on artificial neural networks, showing the types and architectures of networks used and their characteristics, as well as the training algorithms that were used for their implementation. The parameter update equations for the networks used in this study were derived from the gradient method.
Research is also carried out with emphasis on the miniaturization of the proposed new structures, showing how an antenna designed with fractal contours can miniaturize a conventional rectangular patch antenna. The study also includes modeling, through artificial neural networks, of various electromagnetic parameters of the quasi-fractal antennas. The results presented demonstrate the excellent capacity of neural modeling techniques for microstrip antennas. All algorithms used in this work to obtain the proposed models were implemented in the commercial simulation software Matlab 7. In order to validate the results, several antenna prototypes were built, measured on a vector network analyzer and simulated in software for comparison.

Relevance: 60.00%

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance: 60.00%

Abstract:

This work proposes a methodology for non-destructive testing (NDT) of reinforced concrete structures, using surface magnetic fields and artificial neural networks, in order to identify the size and position of steel bars embedded in the concrete. For the purposes of this paper, magnetic induction curves were obtained by using a finite element program. Multilayer perceptron (MLP) ANNs trained with the Levenberg-Marquardt algorithm were used. The results showed very good agreement with the expected ones, encouraging the development of real systems based upon the proposed methodology.
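The Levenberg-Marquardt update used for such training can be illustrated on a much smaller problem than a neural network. The sketch below fits the two parameters of a hypothetical model y = a * exp(b * x) to synthetic data; the model, the data and the damping schedule (divide or multiply `mu` by 10) are assumptions chosen for illustration, not details taken from the paper. Each step solves (J^T J + mu I) delta = J^T e, where e are the residuals and J is the Jacobian of the model output.

```python
import math

def levenberg_marquardt(xs, ys, a, b, mu=1e-2, iterations=100):
    """Fit y = a * exp(b * x) with damped Gauss-Newton (Levenberg-Marquardt) steps."""
    def sse(a, b):
        return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

    for _ in range(iterations):
        # Residuals and Jacobian rows of the model output with respect to (a, b).
        e = [y - a * math.exp(b * x) for x, y in zip(xs, ys)]
        jac = [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]
        # Damped normal equations, solved in closed form for the 2x2 case.
        h11 = sum(j[0] * j[0] for j in jac) + mu
        h12 = sum(j[0] * j[1] for j in jac)
        h22 = sum(j[1] * j[1] for j in jac) + mu
        g1 = sum(j[0] * ei for j, ei in zip(jac, e))
        g2 = sum(j[1] * ei for j, ei in zip(jac, e))
        det = h11 * h22 - h12 * h12
        da = (g1 * h22 - g2 * h12) / det
        db = (g2 * h11 - g1 * h12) / det
        if sse(a + da, b + db) < sse(a, b):
            a, b, mu = a + da, b + db, mu / 10  # accept: behave more like Gauss-Newton
        else:
            mu = min(mu * 10, 1e8)              # reject: behave more like gradient descent
    return a, b

# Synthetic, noise-free data generated from a = 2.0, b = 0.5 (hypothetical values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_fit, b_fit = levenberg_marquardt(xs, ys, a=1.0, b=0.1)
```

For a multilayer perceptron the same update applies with the network weights in place of (a, b) and the Jacobian obtained by backpropagation; the linear system then has one row and column per weight, which is why the method is usually reserved for small networks.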

Relevance: 60.00%

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance: 60.00%

Abstract:

The aim of this project is the implementation of a system able to analyze body movement from kinematic points. These kinematic points are obtained with a previously existing program and captured with the Kinect camera. The first step is a study of the existing techniques and knowledge related to human movement. Rudolf Laban was one of the greatest exponents of this field, and his observations established a relation between an individual's personality, mood and way of moving. Laban coined the term effort, which refers to the way the energy that generates a movement is managed and how it is modulated in sequences; it is a way of describing the intention behind inner expressions. Effort is divided into 4 categories: weight, space, time and flow, and each of these categories has 2 polarities called effort elements. These 8 effort elements characterize a movement. To quantify the effort elements, movements that represent each of them are sought. The movements are recorded with the Kinect camera and their values are saved in a CSV file. A neural network is chosen to process these data owing to its flexibility and its capability to process non-linear inputs. Its implementation requires a broad study covering topologies, activation functions, types of learning and training algorithms, among others.
The network has two hidden layers, for better processing of the data; it is static, follows a feedforward computation, and its learning is governed by the backpropagation algorithm. In a static network the inputs must be fixed values, that is, they cannot vary in time, so an intermediate program is needed to compute the arithmetic mean of the values. A second test with the same network checks whether it can recognize movements characterized by more than one effort element. To do this, the movements are recorded again, this time in pairs, and the rest of the process remains the same.
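The two processing stages described above, averaging a recording into a fixed-size input and passing it through a static feedforward network with two hidden layers, can be sketched as follows. The layer sizes, the sigmoid activation, the random initial weights and the choice of one output per effort element are all assumptions made for illustration; backpropagation training is not shown.

```python
import math
import random

def average_frames(frames):
    """Collapse a time series of feature vectors into one static input (mean per feature)."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer with a sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers):
    """Feedforward pass: the output of each layer is the input of the next."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x

rng = random.Random(0)
# Hypothetical sizes: 3 features per frame, hidden layers of 5 and 4 units, 8 outputs.
layers = [init_layer(3, 5, rng), init_layer(5, 4, rng), init_layer(4, 8, rng)]
frames = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0], [1.0, 1.0, 1.0]]

static_input = average_frames(frames)
scores = forward(static_input, layers)
```

Averaging discards the order of the frames, so two movements with the same mean posture but different dynamics would produce the same input; this is the price of using a static network on time-varying data.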

Relevance: 60.00%

Abstract:

We present a framework for calculating globally optimal parameters, within a given time frame, for on-line learning in multilayer neural networks. We demonstrate the capability of this method by computing optimal learning rates in typical learning scenarios. A similar treatment allows one to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule as well as to compare different training methods.

Relevance: 60.00%

Abstract:

A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.