994 results for Training algorithms
Abstract:
Computer vision tasks such as object recognition remain unsolved to this day. Learning algorithms such as Artificial Neural Networks (ANNs) represent a promising approach for learning features useful for these tasks. This optimization process is nevertheless difficult. Deep networks based on Restricted Boltzmann Machines (RBMs) have recently been proposed to guide the extraction of intermediate representations by means of an unsupervised learning algorithm. This thesis presents, through three articles, contributions to this field of research. The first article deals with the convolutional RBM. The use of local receptive fields, together with the grouping of hidden units into layers sharing the same parameters, considerably reduces the number of parameters to learn and yields local feature detectors that are equivariant to translations. This leads to models with better likelihood than RBMs trained on image patches. The second article is motivated by recent discoveries in neuroscience. It analyzes the impact of quadratic units on visual classification tasks, as well as that of a new activation function. We observe that ANNs based on quadratic units using the softsign function give better generalization performance. The last article offers a critical view of popular RBM training algorithms. We show that the Contrastive Divergence (CD) algorithm and Persistent CD (PCD) are not robust: both require a relatively flat energy surface for their negative chain to mix. PCD with "fast weights" circumvents this problem by slightly perturbing the model; however, this generates noisy samples.
Using tempered chains in the negative phase is a robust way to address these problems and leads to better generative models.
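As a rough illustration of the training rule that the third article examines, the sketch below applies one-step Contrastive Divergence (CD-1) updates to a tiny binary RBM. The layer sizes, learning rate, toy patterns and all function names are assumptions made for this example and are not taken from the thesis; persistent chains, fast weights and tempering are not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for each hidden unit; W[i][j] links visible i to hidden j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, W, b):
    """P(v_i = 1 | h) for each visible unit."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def sample(probs, rng):
    """Draw independent binary states from Bernoulli probabilities."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]

def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step: positive phase on the data, negative phase after one Gibbs step."""
    ph0 = hidden_probs(v0, W, c)
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b), rng)  # short "negative chain"
    ph1 = hidden_probs(v1, W, c)
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])

# Hypothetical toy setup: 4 visible units, 2 hidden units, 2 training patterns.
rng = random.Random(0)
n_visible, n_hidden = 4, 2
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [0.0] * n_visible
c = [0.0] * n_hidden
patterns = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
for _ in range(500):
    for v in patterns:
        cd1_update(v, W, b, c, lr=0.1, rng=rng)

# Mean-field reconstruction of the first pattern after training.
reconstruction = visible_probs(hidden_probs(patterns[0], W, c), W, b)
```

Because the negative sample `v1` comes from a chain restarted at the data on every update, the chain only explores the neighbourhood of the training examples, which is one way to read the mixing concern raised in the abstract.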
Abstract:
The Optimum-Path Forest (OPF) classifier is a recent and promising method for pattern recognition, with a fast training algorithm and good accuracy results. Therefore, the investigation of a combining method for this kind of classifier can be important for many applications. In this paper we report a fast method to combine OPF-based classifiers trained with disjoint training subsets. Given a fixed number of subsets, the algorithm chooses random samples, without replacement, from the original training set. The accuracy of each subset is improved by a learning procedure. The final decision is given by majority vote. Experiments with simulated and real data sets showed that the proposed combining method is more efficient and effective than the naive approach, provided some conditions are met. It was also shown that the OPF training step runs faster for a series of small subsets than for the whole training set. The combining scheme was also designed to support parallel or distributed processing, speeding up the procedure even more. © 2011 Springer-Verlag.
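The combining scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: a 1-nearest-neighbour rule stands in for the OPF classifier (which is not implemented here), the per-subset learning procedure is omitted, and the function names and toy data are hypothetical.

```python
import random
from collections import Counter

def split_disjoint(samples, n_subsets, seed=0):
    """Draw samples without replacement into n_subsets disjoint subsets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]

def nearest_neighbor_predict(train, x):
    """Stand-in base classifier (1-NN); this is NOT the OPF classifier."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(train, key=lambda s: sq_dist(s[0], x))
    return label

def majority_vote_predict(subsets, x):
    """Each subset's classifier votes; the most common label wins."""
    votes = [nearest_neighbor_predict(subset, x) for subset in subsets]
    return Counter(votes).most_common(1)[0][0]

# Toy data (assumed): two well-separated clusters labelled "a" and "b".
data = ([((0.1 * i, 0.0), "a") for i in range(6)]
        + [((5.0 + 0.1 * i, 5.0), "b") for i in range(6)])
subsets = split_disjoint(data, n_subsets=3)
prediction = majority_vote_predict(subsets, (0.2, 0.1))
```

Since the subsets share no samples, each base classifier could be trained on a separate worker, which matches the abstract's remark about parallel or distributed processing.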
Abstract:
Attractor properties of a popular discrete-time neural network model are illustrated through numerical simulations. The most complex dynamics is found to occur within particular ranges of parameters controlling the symmetry and magnitude of the weight matrix. A small network model is observed to produce fixed points, limit cycles, mode-locking, the Ruelle-Takens route to chaos, and the period-doubling route to chaos. Training algorithms for tuning this dynamical behaviour are discussed. Training can be an easy or difficult task, depending on whether the problem requires the use of temporal information distributed over long time intervals. Such problems require training algorithms which can handle hidden nodes. The most prominent of these algorithms, back propagation through time, solves the temporal credit assignment problem in a way which can work only if the relevant information is distributed locally in time. The Moving Targets algorithm works for the more general case, but is computationally intensive, and prone to local minima.
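To make the notion of attractors concrete, the following sketch iterates a small discrete-time network. The update rule x(t+1) = tanh(W x(t)) is an assumption chosen for illustration, since the abstract does not specify the model; the weight values and names are likewise hypothetical.

```python
import math

def step(weights, state):
    """One synchronous update: x(t+1) = tanh(W x(t))."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]

def simulate(weights, state, n_steps):
    """Return the trajectory [x(0), x(1), ..., x(n_steps)]."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(weights, state)
        trajectory.append(state)
    return trajectory

# Small symmetric weights: the map is a contraction and settles at the origin.
fixed_point_run = simulate([[0.2, 0.1], [0.1, 0.2]], [0.5, -0.3], 200)

# A single unit with a strong negative self-connection: the origin becomes
# unstable and the state alternates in sign, a period-2 oscillation.
cycle_run = simulate([[-2.0]], [0.1], 100)
```

Changing only the magnitude and sign structure of the weights moves the same update rule from a fixed point to an oscillation, which is the kind of parameter dependence the abstract refers to.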
Abstract:
Spiking Neural Networks (SNNs) are bio-inspired Artificial Neural Networks (ANNs) utilizing discrete spiking signals, akin to neuron communication in the brain, making them ideal for real-time and energy-efficient Cyber-Physical Systems (CPSs). This thesis explores their potential in Structural Health Monitoring (SHM), leveraging low-cost MEMS accelerometers for early damage detection in motorway bridges. The study focuses on Long Short-Term SNNs (LSNNs), although their complex learning processes pose challenges. Comparing LSNNs with other ANN models and training algorithms for SHM, findings indicate LSNNs' effectiveness in damage identification, comparable to ANNs trained using traditional methods. Additionally, an optimized embedded LSNN implementation demonstrates a 54% reduction in execution time, but with longer pre-processing due to spike-based encoding. Furthermore, SNNs are applied in UAV obstacle avoidance, trained directly using a Reinforcement Learning (RL) algorithm with event-based input from a Dynamic Vision Sensor (DVS). Performance evaluation against Convolutional Neural Networks (CNNs) highlights SNNs' superior energy efficiency, showing a 6x decrease in energy consumption. The study also investigates embedded SNN implementations' latency and throughput in real-world deployments, emphasizing their potential for energy-efficient monitoring systems. This research contributes to advancing SHM and UAV obstacle avoidance through SNNs' efficient information processing and decision-making capabilities within CPS domains.
Abstract:
The ability to determine the location and relative strength of all transcription-factor binding sites in a genome is important both for a comprehensive understanding of gene regulation and for effective promoter engineering in biotechnological applications. Here we present a bioinformatically driven experimental method to accurately define the DNA-binding sequence specificity of transcription factors. A generalized profile was used as a predictive quantitative model for binding sites, and its parameters were estimated from in vitro-selected ligands using standard hidden Markov model training algorithms. Computer simulations showed that several thousand low- to medium-affinity sequences are required to generate a profile of desired accuracy. To produce data on this scale, we applied high-throughput genomics methods to the biochemical problem addressed here. A method combining systematic evolution of ligands by exponential enrichment (SELEX) and serial analysis of gene expression (SAGE) protocols was coupled to an automated quality-controlled sequence extraction procedure based on Phred quality scores. This allowed the sequencing of a database of more than 10,000 potential DNA ligands for the CTF/NFI transcription factor. The resulting binding-site model defines the sequence specificity of this protein with a high degree of accuracy not achieved earlier and thereby makes it possible to identify previously unknown regulatory sequences in genomic DNA. A covariance analysis of the selected sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism.
Abstract:
This thesis presents a method for applying neural networks that iteratively combines classification and forecasting steps, with the aim of achieving better forecast results than a single sequential execution of these steps. The method is discussed using the example of forecasting wind power generation as a function of the weather situation. In this setting an improvement is achieved, with a few outliers. Various aspects are discussed in three chapters. Chapter 1 presents the data used and their electronic processing. The data consist, on the one hand, of extrapolated wind power figures for the Federal Republic of Germany for the years 2011 and 2012, which the transmission system operators must publish as a transparency requirement of the Renewable Energy Sources Act (Erneuerbare-Energien-Gesetz). On the other hand, weather forecasts that the Deutscher Wetterdienst provides free of charge as part of its basic service are used. Chapter 2 explains two methods known from the literature, the online algorithm and the batch algorithm, for training a self-organizing map. The properties of these methods justify the choice of the batch method for the approach explained in Chapter 3. In the modeled operational use, the method presented in Chapter 3 follows the same sequence as a classification followed by a class-specific forecast. Training, however, proceeds iteratively: after the class-specific forecasts have been trained, it is determined to which class of the classification an input should belong in order to achieve the highest forecast quality with the existing class-specific forecast models. The resulting partition of the inputs can then be used to train a new classification stage whose classes enable an improved class-specific forecast.
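The iterative training idea can be sketched in a few lines. The sketch below is a simplified stand-in, not the thesis method: straight-line least-squares fits replace the class-specific neural forecasters, the retraining of the self-organizing map on the new partition is omitted, and all names and data are hypothetical.

```python
def fit_line(points):
    """Least-squares fit of y = a*x + b; falls back to a constant if x has no spread."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0.0:
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return a, mean_y - a * mean_x

def total_squared_error(points, labels, models):
    """Sum of squared forecast errors when each point uses its class model."""
    return sum((models[l][0] * x + models[l][1] - y) ** 2
               for (x, y), l in zip(points, labels))

def refine(points, labels, n_classes, n_rounds=10):
    """Alternate between fitting class-specific models and reassigning inputs."""
    models = {}
    for _ in range(n_rounds):
        # Step 1: train one forecast model per class on its current members.
        for c in range(n_classes):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                models[c] = fit_line(members)
        # Step 2: move each input to the class whose model forecasts it best.
        new_labels = [
            min(models, key=lambda c: (models[c][0] * x + models[c][1] - y) ** 2)
            for x, y in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, models

# Toy data (assumed): two regimes mixed together, with a poor initial partition.
points = [(float(x), 2.0 * x) for x in range(10)] + [(float(x), 10.0 - x) for x in range(10)]
initial_labels = [i % 2 for i in range(len(points))]
initial_models = {c: fit_line([p for p, l in zip(points, initial_labels) if l == c])
                  for c in range(2)}
initial_error = total_squared_error(points, initial_labels, initial_models)
final_labels, final_models = refine(points, initial_labels, n_classes=2)
final_error = total_squared_error(points, final_labels, final_models)
```

Each of the two steps can only lower or keep the total squared error, so the refined partition is never worse on the training data than the one-shot sequence of classification followed by forecasting.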
Abstract:
Communication signal processing applications often involve complex-valued (CV) functional representations for signals and systems. CV artificial neural networks have been studied theoretically and applied widely in nonlinear signal and data processing [1–11]. Note that most artificial neural networks cannot be automatically extended from the real-valued (RV) domain to the CV domain because the resulting model would in general violate Cauchy-Riemann conditions, and this means that the training algorithms become unusable. A number of analytic functions were introduced for the fully CV multilayer perceptrons (MLP) [4]. A fully CV radial basis function (RBF) network was introduced in [8] for regression and classification applications. Alternatively, the problem can be avoided by using two RV artificial neural networks, one processing the real part and the other processing the imaginary part of the CV signal/system. An even more challenging problem is the inverse of a CV
Abstract:
The search for ever smaller devices without loss of performance has been increasingly investigated by researchers in applied electromagnetics. Antennas using ceramic materials with a high dielectric constant, whether acting as the substrate of a radiating patch or as the radiating element itself, are prominent in current research owing to their numerous advantages: low profile, reduced dimensions compared with other devices, high radiation efficiency, suitability for the microwave and/or millimeter-wave range, low temperature coefficient, and low cost. The reason for this high efficiency is that the dielectric losses of ceramics are very low compared with commercially sold materials used in printed circuit boards, such as fiberglass and phenolite. These characteristics make ceramic devices suitable for operation in the microwave band. By combining the design of patch antennas and/or dielectric resonator antennas (DRAs) with particular materials and with the method used to synthesize the powders from which the devices are manufactured, it is possible to choose a material with a dielectric constant appropriate for an antenna of the desired size. The main aim of this work is the design of patch antennas and DRAs based on the synthesis of nanostructured ceramic powders (synthesis by combustion and by polymeric precursors, the Pechini method) with applications in the microwave band. The conventional mixed-oxide method was also used to obtain nanometric powders for the preparation of pellets and dielectric resonators. Because the devices manufactured and studied use high-dielectric-constant materials, they are good candidates for size reduction compared with other devices operating in the same frequency band. The structures analyzed are excited by three different techniques: i) microstrip line, ii) aperture coupling and iii) inductive coupling.
The efficiency of these techniques has been investigated experimentally and compared with simulations in Ansoft HFSS, which is used for accurate analysis of the electromagnetic behavior of antennas by the finite element method (FEM). This thesis includes a literature study on the theory of microstrip antennas and DRAs. A similar study covers the materials and the methods of synthesis of the ceramic powders used to manufacture the pellets and dielectric cylinders that make up the devices investigated. The dielectric media used to support the analysis of the DRAs and/or patch antennas are analyzed through accurate simulations using the finite-difference time-domain (FDTD) method, based on the relative electrical permittivity (εr) and loss tangent (tan δ) of these media. This work also presents a study on artificial neural networks, showing the network architecture used and its characteristics, as well as the training algorithms used in training and in modeling some parameters associated with the devices investigated.
Abstract:
Microstrip antennas are constantly prominent in current research owing to the several advantages they offer. Fractal geometry, coupled with the good performance and convenience of planar structures, is an excellent combination for the design and analysis of ever smaller, multi-resonant and broadband structures. This geometry has been applied to microstrip patch antennas to reduce their size and highlight their multi-band behavior. Compared with conventional microstrip antennas, quasi-fractal patch antennas have lower resonance frequencies, enabling the manufacture of more compact antennas. The aim of this work is the design of quasi-fractal patch antennas using Koch and Minkowski fractal curves applied to the radiating and non-radiating edges of a conventional rectangular patch antenna fed by an inset-fed microstrip line, initially designed for a frequency of 2.45 GHz. The inset-fed technique is investigated for the impedance matching of fractal antennas fed through microstrip lines. The efficiency of this technique is investigated experimentally and compared with simulations carried out in the commercial software Ansoft Designer, used for precise analysis of the electromagnetic behavior of antennas by the method of moments, and with the proposed neural model. This dissertation includes a literature study on the theory of microstrip antennas and a similar study on fractal geometry, with emphasis on its various forms, techniques for generating fractals, and its applicability. This work also presents a study on artificial neural networks, showing the types/architectures of the networks used and their characteristics, as well as the training algorithms used for their implementation. The parameter-adjustment equations for the networks used in this study were derived from the gradient method.
Research is also carried out with emphasis on the miniaturization of the proposed new structures, showing how an antenna designed with fractal contours can miniaturize a conventional rectangular patch antenna. The study also includes modeling, through artificial neural networks, of the various electromagnetic parameters of the quasi-fractal antennas. The results presented demonstrate the excellent capacity of neural modeling techniques for microstrip antennas. All algorithms used in this work to obtain the proposed models were implemented in the commercial simulation software Matlab 7. To validate the results, several antenna prototypes were built, measured on a vector network analyzer, and simulated in software for comparison.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
This work proposes a methodology for non-destructive testing (NDT) of reinforced concrete structures, using superficial magnetic fields and artificial neural networks, in order to identify the size and position of steel bars embedded in the concrete. For the purposes of this paper, magnetic induction curves were obtained by using a finite element program. Multilayer perceptron (MLP) ANNs with the Levenberg-Marquardt training algorithm were used. The results showed very good agreement with the expected ones, encouraging the development of real systems based upon the proposed methodology.
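For orientation, the weight update used by the Levenberg-Marquardt algorithm can be written in its standard textbook form. The symbols below are assumptions introduced for this note and do not come from the paper: w is the weight vector, e the vector of output errors over the training set, J the Jacobian of e with respect to w, and μ a damping factor adjusted during training.

```latex
\Delta \mathbf{w} = -\left( \mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I} \right)^{-1} \mathbf{J}^{\top}\mathbf{e}
```

For small μ the step approaches a Gauss-Newton step, and for large μ it approaches a short gradient-descent step, which is why the method tends to suit networks of modest size such as those used for curve-based regression.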
Abstract:
The aim of this project is to implement a system able to analyze body movement from a set of kinematic points. These kinematic points are obtained with a previous program and captured with the Kinect camera. The first step is a study of the existing techniques and knowledge related to human movement. Rudolf Laban was one of the greatest exponents of this field, and thanks to his observations a relation was established between an individual's personality, mood and way of moving. Laban coined the term effort, which refers to the way the energy that generates a movement is managed and how it is modulated across sequences; it is a way of describing the intention behind inner expressions. Effort is divided into 4 categories: weight, space, time and flow, and each category has two polarities called effort elements. These 8 effort elements characterize a movement. To quantify the effort elements, movements that represent each of them are sought. The movements are recorded with the Kinect camera and their values are saved in a CSV file. A neural network is chosen to process these data owing to its flexibility and its capacity to process non-linear inputs. Its implementation requires a broad study covering topologies, activation functions, types of learning and training algorithms, among others.
The network is given two hidden layers, for better processing of the data; it is static, follows a feedforward computation, and its learning is governed by the backpropagation algorithm. In a static network the inputs must be fixed values, that is, they cannot vary in time, so an intermediate program is needed to compute the arithmetic mean of the values. A second test with the same network checks whether it can recognize movements characterized by more than one effort element. To do this, the movements are recorded again, this time in pairs, and the rest of the process remains the same.
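The preprocessing and network shape described above can be sketched as follows. This is only an illustration under assumptions: the frame size, hidden-layer widths, tanh activation and random untrained weights are hypothetical, and the backpropagation training loop is not shown.

```python
import math
import random

def average_frames(frames):
    """Collapse a recording (list of frames) into one fixed-length input vector
    by taking the arithmetic mean of each coordinate over time."""
    n = len(frames)
    return [sum(frame[i] for frame in frames) / n for i in range(len(frames[0]))]

def random_layer(n_in, n_out, rng):
    """Untrained layer with small random weights and zero biases (assumed init)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(inputs, weights, biases):
    """Fully connected layer followed by a tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Feedforward pass through all layers in order."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x

# Hypothetical recording: 3 frames with 4 kinematic coordinates each.
frames = [[0.0, 1.0, 2.0, 3.0],
          [2.0, 1.0, 0.0, 3.0],
          [4.0, 1.0, 4.0, 3.0]]
static_input = average_frames(frames)

rng = random.Random(0)
layers = [random_layer(4, 6, rng),   # first hidden layer
          random_layer(6, 6, rng),   # second hidden layer
          random_layer(6, 8, rng)]   # one output per effort element
scores = forward(static_input, layers)
```

Averaging discards the timing of the movement, which is the price of using a static network; the eight outputs correspond to the eight effort elements named in the abstract.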
Abstract:
We present a framework for calculating globally optimal parameters, within a given time frame, for on-line learning in multilayer neural networks. We demonstrate the capability of this method by computing optimal learning rates in typical learning scenarios. A similar treatment allows one to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule as well as to compare different training methods.
Abstract:
A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.