42 results for Initialisation
Abstract:
In machine learning, a field that uses data to learn solutions to the problems we want machines to handle, the Artificial Neural Network (ANN) model is a valuable tool. It was invented nearly sixty years ago, yet it remains the subject of active research today. Recently, through deep learning, it has improved the state of the art in many application areas such as computer vision, speech processing and natural language processing. The ever-growing quantity of available data and improvements in computer hardware have made it easier to train high-capacity models such as deep ANNs. However, difficulties inherent to training such models, such as local minima, still have a significant impact. Deep learning research therefore seeks solutions, either by regularising or by easing the optimisation. Unsupervised pre-training and the "Dropout" technique are examples. The first two works presented in this thesis follow this line of research. The first studies the problems of vanishing and exploding gradients in deep architectures. It shows that simple choices, such as the activation function or the initialisation of the network weights, have a large influence. We propose normalised initialisation to facilitate learning. The second focuses on the choice of activation function and presents the rectifier, or rectified linear unit. This study was the first to emphasise piecewise-linear activation functions for deep neural networks in supervised learning. Today, this type of activation function is an essential component of deep neural networks.
The last two works presented focus on applications of ANNs to natural language processing. The first addresses domain adaptation for sentiment analysis using Denoising Auto-Encoders. It is still the state of the art today. The second deals with learning from multi-relational data with an energy-based model, which can be used for the task of word-sense disambiguation.
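The abstract above names normalised initialisation and the rectifier without giving formulas. As a minimal sketch, assuming the commonly cited form in which weights are drawn uniformly from [-r, r] with r = sqrt(6 / (fan_in + fan_out)), and assuming the rectifier is max(0, x), the two ideas can be illustrated as follows. The function names, layer sizes and random seed are illustrative choices, not taken from the thesis.

```python
import numpy as np

def normalized_init(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) weight matrix from U[-r, r].

    Assumed form: r = sqrt(6 / (fan_in + fan_out)), which gives the
    weights a variance of 2 / (fan_in + fan_out).
    """
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def rectifier(x):
    """Piecewise-linear activation: identity for x > 0, zero otherwise."""
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)              # seed chosen only for reproducibility
W = normalized_init(256, 128, rng)          # hypothetical layer sizes
h = rectifier(rng.standard_normal((4, 256)) @ W)
print(W.var(), 2.0 / (256 + 128))           # empirical vs. target variance
```

The scaling ties the weight variance to both the input and output width of the layer, which is one way to keep the magnitude of activations and back-propagated gradients roughly stable across layers.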
Abstract:
This thesis addresses the reconstruction of a 3D model from several images. The 3D model is built with a hierarchical voxel representation in the form of an octree. A bounding cube for the 3D model is computed from the camera positions. This cube contains the voxels and defines the positions of virtual cameras. The 3D model is initialised with a convex hull based on the uniform background colour of the images. This hull is used to carve the periphery of the 3D model. Next, a weighted cost is computed to evaluate how well each voxel qualifies as part of the object's surface. This cost takes into account the similarity of the pixels from each image associated with the virtual camera. Finally, for each virtual camera, a surface is computed from the cost using the SGM method. SGM takes the neighbourhood into account when computing depth, and this thesis presents a variation of the method that accounts for voxels previously excluded from the model by the initialisation step or by carving with another surface. The computed surfaces are then used to carve and finalise the 3D model. This thesis presents an innovative combination of steps for creating a 3D model from an existing set of images or from a sequence of images captured in series, which could lead to real-time 3D model creation.
Abstract:
The thesis focuses on Computer Vision and, more specifically, on image segmentation, which is one of the basic stages of image analysis and consists of dividing the image into a set of visually distinct and uniform regions with respect to intensity, colour or texture. A strategy is proposed based on the complementary use of region and boundary information during the segmentation process, an integration that mitigates some of the basic problems of traditional segmentation. Boundary information is first used to identify the number of regions present in the image and to place a seed inside each of them, in order to model the characteristics of the regions statistically and thereby define the region information. This information, together with the boundary information, is used to define an energy function that expresses the properties required of the desired segmentation: uniformity inside the regions and contrast with neighbouring regions at the boundaries. A set of active regions then begin to grow, competing for the pixels of the image, in order to optimise the energy function or, in other words, to find the segmentation that best fits the requirements expressed in that function. Finally, the whole process has been embedded in a pyramidal structure, which allows the segmentation result to be refined progressively and its computational cost to be reduced. The strategy has been extended to the texture segmentation problem, which involves some basic considerations such as modelling the regions from a set of texture features and extracting boundary information when texture is present in the image.
Finally, the approach has been extended to image segmentation that takes both colour and texture properties into account. To this end, the joint use of non-parametric density estimation techniques for describing colour and of texture features based on the co-occurrence matrix has been proposed to model the image regions adequately and completely. The proposal has been evaluated objectively and compared with different integration techniques using synthetic images. In addition, experiments with real images have been included, with very positive results.
Abstract:
Hidden Markov Models (HMMs) have been successfully applied to modelling and classification problems in many areas in recent years. An important step in using HMMs is the initialisation of the model parameters, because the subsequent learning of the HMM parameters depends on these values. This initialisation should take into account knowledge about the problem being addressed, as well as optimisation techniques that estimate the best initial parameters for a given cost function and, consequently, the best log-likelihood. This paper proposes initialising the parameters of Hidden Markov Models with the Differential Evolution optimisation algorithm, with the aim of obtaining the best log-likelihood.
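The idea in the abstract above can be sketched as a search over initial HMM parameters that maximises the log-likelihood. The sketch below is not the paper's implementation: it assumes a discrete-observation HMM, a softmax parameterisation to keep probabilities valid, and the basic DE/rand/1/bin variant of Differential Evolution with illustrative settings (`pop_size`, `F`, `CR`). All function names are hypothetical.

```python
import numpy as np

def unpack(theta, n_states, n_symbols):
    """Map an unconstrained vector to (pi, A, B) via row-wise softmax."""
    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    k = n_states
    pi = softmax(theta[:k])
    A = softmax(theta[k:k + k * k].reshape(k, k))
    B = softmax(theta[k + k * k:].reshape(k, n_symbols))
    return pi, A, B

def log_likelihood(theta, obs, n_states, n_symbols):
    """Scaled forward algorithm for a discrete-observation HMM."""
    pi, A, B = unpack(theta, n_states, n_symbols)
    alpha = pi * B[:, obs[0]]
    ll = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]
        c = alpha.sum()          # scaling factor avoids underflow
        ll += np.log(c)
        alpha = alpha / c
    return ll

def de_initialise(obs, n_states, n_symbols, pop_size=20,
                  generations=30, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin search for parameters with high log-likelihood."""
    rng = np.random.default_rng(seed)
    dim = n_states + n_states ** 2 + n_states * n_symbols
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    fit = np.array([log_likelihood(p, obs, n_states, n_symbols) for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = a + F * (b - c)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True      # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            f = log_likelihood(trial, obs, n_states, n_symbols)
            if f > fit[i]:                       # greedy selection
                pop[i], fit[i] = trial, f
    best = int(np.argmax(fit))
    return unpack(pop[best], n_states, n_symbols), fit[best]

# Synthetic observation sequence, made up purely for illustration.
obs = np.random.default_rng(1).integers(0, 3, size=60)
(pi, A, B), ll = de_initialise(obs, n_states=2, n_symbols=3)
print(ll)
```

The returned `pi`, `A` and `B` would then serve as the starting point for the subsequent parameter learning that the abstract mentions.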
Abstract:
In this article, a simple and effective algorithm is introduced for identifying a Wiener system from observed input/output data. The nonlinear static function in the Wiener system is modelled using a B-spline neural network. The Gauss–Newton algorithm is combined with the De Boor algorithm (for both the curve and its first-order derivatives) to estimate the parameters of the Wiener model, together with a parameter initialisation scheme. Numerical examples demonstrate the efficacy of the proposed approach.
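The De Boor recursions mentioned above evaluate B-spline basis functions and their first derivatives, which a Gauss-Newton step needs for its Jacobian. The following is a minimal sketch of the standard recursion, not the article's code: the uniform knot vector, the degree and the weights are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def basis(knots, degree, x):
    """All B-spline basis values N_{i,degree}(x) via the De Boor recursion."""
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator of the knot interval that contains x.
    N = np.where((t[:-1] <= x) & (x < t[1:]), 1.0, 0.0)
    for k in range(1, degree + 1):
        new = np.zeros(len(t) - k - 1)
        for i in range(len(new)):
            left = t[i + k] - t[i]
            right = t[i + k + 1] - t[i + 1]
            if left > 0:
                new[i] += (x - t[i]) / left * N[i]
            if right > 0:
                new[i] += (t[i + k + 1] - x) / right * N[i + 1]
        N = new
    return N

def basis_derivative(knots, degree, x):
    """First derivatives of the basis functions, built from degree - 1 values."""
    t = np.asarray(knots, dtype=float)
    N = basis(t, degree - 1, x)
    d = np.zeros(len(t) - degree - 1)
    for i in range(len(d)):
        left = t[i + degree] - t[i]
        right = t[i + degree + 1] - t[i + 1]
        if left > 0:
            d[i] += degree / left * N[i]
        if right > 0:
            d[i] -= degree / right * N[i + 1]
    return d

knots = np.arange(8.0)                       # assumed uniform knots 0..7
weights = np.array([0.2, -0.5, 1.0, 0.3])    # hypothetical network weights
y = basis(knots, 3, 3.5) @ weights           # B-spline model output at x = 3.5
dy = basis_derivative(knots, 3, 3.5) @ weights
print(y, dy)
```

The model output is a weighted sum of the basis values, so its derivative with respect to the input is the same weighted sum of the basis derivatives.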
Abstract:
We develop a complex-valued (CV) B-spline neural network approach for efficient identification and inversion of CV Wiener systems. The CV nonlinear static function in the Wiener system is represented using the tensor product of two univariate B-spline neural networks. With the aid of a least squares parameter initialisation, the Gauss-Newton algorithm effectively estimates the model parameters, which include the CV linear dynamic model coefficients and the B-spline neural network weights. The identification algorithm naturally incorporates the efficient De Boor algorithm with both the B-spline curve and first-order derivative recursions. An accurate inverse of the CV Wiener system is then obtained, in which the inverse of the CV nonlinear static function of the Wiener system is calculated efficiently using the Gauss-Newton algorithm based on the estimated B-spline neural network model, with the aid of the De Boor recursions. The effectiveness of our approach for identification and inversion of CV Wiener systems is demonstrated by applying it to digital predistorter design for high power amplifiers with memory.
Abstract:
We investigate the initialisation of Northern Hemisphere sea ice in the global climate model ECHAM5/MPI-OM by assimilating sea-ice concentration data. The analysis updates for concentration are given by Newtonian relaxation, and we discuss different ways of specifying the analysis updates for mean thickness. Because the conservation of mean ice thickness or actual ice thickness in the analysis updates leads to poor assimilation performance, we introduce a proportional dependence between concentration and mean thickness analysis updates. Assimilation with these proportional mean-thickness analysis updates leads to good assimilation performance for sea-ice concentration and thickness, both in identical-twin experiments and when assimilating sea-ice observations. The simulation of other Arctic surface fields in the coupled model is, however, not significantly improved by the assimilation. To understand the physical aspects of assimilation errors, we construct a simple prognostic model of the sea-ice thermodynamics, and analyse its response to the assimilation. We find that an adjustment of mean ice thickness in the analysis update is essential to arrive at plausible state estimates. To understand the statistical aspects of assimilation errors, we study the model background error covariance between ice concentration and ice thickness. We find that the spatial structure of covariances is best represented by the proportional mean-thickness analysis updates. Both physical and statistical evidence supports the experimental finding that assimilation with proportional mean-thickness updates outperforms the other two methods considered. The method described here is very simple to implement, and gives results that are sufficiently good to be used for initialising sea ice in a global climate model for seasonal to decadal predictions.
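As a rough illustration of the scheme described above, the sketch below nudges modelled sea-ice concentration towards observations by Newtonian relaxation and makes the mean-thickness update proportional to the concentration update. The relaxation coefficient `relax`, the proportionality constant `ratio` (in metres per unit concentration) and the input values are assumptions made for this example, not values from the paper.

```python
import numpy as np

def analysis_update(conc_model, conc_obs, thick_model, relax=0.1, ratio=2.0):
    """Newtonian relaxation of concentration with a proportional thickness update."""
    conc_model = np.asarray(conc_model, dtype=float)
    thick_model = np.asarray(thick_model, dtype=float)
    # Concentration increment: a fraction of the model-observation mismatch.
    d_conc = relax * (np.asarray(conc_obs, dtype=float) - conc_model)
    # Mean-thickness increment: proportional to the concentration increment.
    d_thick = ratio * d_conc
    conc = np.clip(conc_model + d_conc, 0.0, 1.0)     # keep within [0, 1]
    thick = np.maximum(thick_model + d_thick, 0.0)    # thickness cannot be negative
    return conc, thick

# Two hypothetical grid cells: one with too little ice, one with too much.
conc, thick = analysis_update([0.5, 0.9], [0.7, 0.8], [1.0, 2.0])
print(conc, thick)
```

With this coupling, a cell where the model has too little ice gains both concentration and mean thickness, which is the behaviour that the abstract reports as outperforming updates that conserve thickness.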
Abstract:
Urban land surface models (LSM) are commonly evaluated for short periods (a few weeks to months) because of limited observational data. This makes it difficult to distinguish the impact of initial conditions on model performance or to consider the response of a model to a range of possible atmospheric conditions. Drawing on results from the first urban LSM comparison, these two issues are considered. Assessment shows that the initial soil moisture has a substantial impact on the performance. Models initialised with soils that are too dry are not able to adjust their surface sensible and latent heat fluxes to realistic values until there is sufficient rainfall. Models initialised with too wet soils are not able to restrict their evaporation appropriately for periods in excess of a year. This has implications for short term evaluation studies and implies the need for soil moisture measurements to improve data assimilation and model initialisation. In contrast, initial conditions influencing the thermal storage have a much shorter adjustment timescale compared to soil moisture. Most models partition too much of the radiative energy at the surface into the sensible heat flux at the probable expense of the net storage heat flux.
Abstract:
In the 1960s North Atlantic sea surface temperatures (SST) cooled rapidly. The magnitude of the cooling was largest in the North Atlantic subpolar gyre (SPG), and was coincident with a rapid freshening of the SPG. Here we analyze hindcasts of the 1960s North Atlantic cooling made with the UK Met Office’s decadal prediction system (DePreSys), which is initialised using observations. It is shown that DePreSys captures, with a lead time of several years, the observed cooling and freshening of the North Atlantic SPG. DePreSys also captures changes in SST over the wider North Atlantic and surface climate impacts over the wider region, such as changes in atmospheric circulation in winter and sea ice extent. We show that initialisation of an anomalously weak Atlantic Meridional Overturning Circulation (AMOC), and hence weak northward heat transport, is crucial for DePreSys to predict the magnitude of the observed cooling. Such an anomalously weak AMOC is not captured when ocean observations are not assimilated (i.e. it is not a forced response in this model). The freshening of the SPG is also dominated by ocean salt transport changes in DePreSys; in particular, the simulation of advective freshwater anomalies analogous to the Great Salinity Anomaly was key. Therefore, DePreSys suggests that ocean dynamics played an important role in the cooling of the North Atlantic in the 1960s, and that this event was predictable.
Abstract:
Dynamical downscaling is frequently used to investigate the dynamical variables of extra-tropical cyclones, for example, precipitation, using very high-resolution models nested within coarser resolution models to understand the processes that lead to intense precipitation. It is also used in climate change studies, using long timeseries to investigate trends in precipitation, or to look at the small-scale dynamical processes for specific case studies. This study investigates some of the problems associated with dynamical downscaling and looks at the optimum configuration to obtain the distribution and intensity of a precipitation field to match observations. This study uses the Met Office Unified Model run in limited area mode with grid spacings of 12, 4 and 1.5 km, driven by boundary conditions provided by the ECMWF Operational Analysis to produce high-resolution simulations for the Summer of 2007 UK flooding events. The numerical weather prediction model is initiated at varying times before the peak precipitation is observed to test the importance of the initialisation and boundary conditions, and how long the simulation can be run for. The results are compared to raingauge data as verification and show that the model intensities are most similar to observations when the model is initialised 12 hours before the peak precipitation is observed. It was also shown that using non-gridded datasets makes verification more difficult, with the density of observations also affecting the intensities observed. It is concluded that the simulations are able to produce realistic precipitation intensities when driven by the coarser resolution data.
Abstract:
Operational forecasting centres are currently developing data assimilation systems for coupled atmosphere-ocean models. Strongly coupled assimilation, in which a single assimilation system is applied to a coupled model, presents significant technical and scientific challenges. Hence weakly coupled assimilation systems are being developed as a first step, in which the coupled model is used to compare the current state estimate with observations, but corrections to the atmosphere and ocean initial conditions are then calculated independently. In this paper we provide a comprehensive description of the different coupled assimilation methodologies in the context of four dimensional variational assimilation (4D-Var) and use an idealised framework to assess the expected benefits of moving towards coupled data assimilation. We implement an incremental 4D-Var system within an idealised single column atmosphere-ocean model. The system has the capability to run both strongly and weakly coupled assimilations as well as uncoupled atmosphere or ocean only assimilations, thus allowing a systematic comparison of the different strategies for treating the coupled data assimilation problem. We present results from a series of identical twin experiments devised to investigate the behaviour and sensitivities of the different approaches. Overall, our study demonstrates the potential benefits that may be expected from coupled data assimilation. When compared to uncoupled initialisation, coupled assimilation is able to produce more balanced initial analysis fields, thus reducing initialisation shock and its impact on the subsequent forecast. Single observation experiments demonstrate how coupled assimilation systems are able to pass information between the atmosphere and ocean and therefore use near-surface data to greater effect. 
We show that much of this benefit may also be gained from a weakly coupled assimilation system, but that this can be sensitive to the parameters used in the assimilation.
Abstract:
Decadal predictions on timescales from one year to one decade are gaining importance since this time frame falls within the planning horizon of politics, economy and society. The present study examines the decadal predictability of regional wind speed and wind energy potentials in three generations of the MiKlip (‘Mittelfristige Klimaprognosen’) decadal prediction system. The system is based on the global Max-Planck-Institute Earth System Model (MPI-ESM), and the three generations differ primarily in the ocean initialisation. Ensembles of uninitialised historical and yearly initialised hindcast experiments are used to assess the forecast skill for 10 m wind speeds and wind energy output (Eout) over Central Europe with lead times from one year to one decade. With this aim, a statistical-dynamical downscaling (SDD) approach is used for the regionalisation. Its added value is evaluated by comparison of skill scores for MPI-ESM large-scale wind speeds and SDD-simulated regional wind speeds. All three MPI-ESM ensemble generations show some forecast skill for annual mean wind speed and Eout over Central Europe on yearly and multi-yearly time scales. This forecast skill is mostly limited to the first years after initialisation. Differences between the three ensemble generations are generally small. The regionalisation preserves and sometimes increases the forecast skills of the global runs but results depend on lead time and ensemble generation. Moreover, regionalisation often improves the ensemble spread. Seasonal Eout skills are generally lower than for annual means. Skill scores are lowest during summer and persist longest in autumn. A large-scale westerly weather type with strong pressure gradients over Central Europe is identified as potential source of the skill for wind energy potentials, showing a similar forecast skill and a high correlation with Eout anomalies. 
These results are promising towards the establishment of a decadal prediction system for wind energy applications over Central Europe.
Abstract:
This paper describes the development and basic evaluation of decadal predictions produced using the HiGEM coupled climate model. HiGEM is a higher resolution version of the HadGEM1 Met Office Unified Model. The horizontal resolution in HiGEM has been increased to 1.25° × 0.83° in longitude and latitude for the atmosphere, and 1/3° × 1/3° globally for the ocean. The HiGEM decadal predictions are initialised using an anomaly assimilation scheme that relaxes anomalies of ocean temperature and salinity to observed anomalies. 10 year hindcasts are produced for 10 start dates (1960, 1965,..., 2000, 2005). To determine the relative contributions to prediction skill from initial conditions and external forcing, the HiGEM decadal predictions are compared to uninitialised HiGEM transient experiments. The HiGEM decadal predictions have substantial skill for predictions of annual mean surface air temperature and 100 m upper ocean temperature. For lead times up to 10 years, anomaly correlations (ACC) over large areas of the North Atlantic Ocean, the Western Pacific Ocean and the Indian Ocean exceed values of 0.6. Initialisation of the HiGEM decadal predictions significantly increases skill over regions of the Atlantic Ocean, the Maritime Continent and regions of the subtropical North and South Pacific Ocean. In particular, HiGEM produces skillful predictions of the North Atlantic subpolar gyre for up to 4 years lead time (with ACC > 0.7), which are significantly larger than the uninitialised HiGEM transient experiments.
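The anomaly correlation (ACC) used above as a skill score can be sketched as the correlation between forecast and observed departures from their climatologies across start dates. The abstract does not say which ACC variant is used, so the uncentred form below is an assumption, and the numbers are invented purely to show the call.

```python
import numpy as np

def anomaly_correlation(forecast, observed, forecast_clim, observed_clim):
    """Uncentred anomaly correlation between forecasts and observations."""
    f = np.asarray(forecast, dtype=float) - forecast_clim   # forecast anomalies
    o = np.asarray(observed, dtype=float) - observed_clim   # observed anomalies
    return float(np.sum(f * o) / np.sqrt(np.sum(f ** 2) * np.sum(o ** 2)))

# Made-up temperatures (degrees C) at one grid point, one value per start date.
observed = np.array([10.1, 9.7, 10.4, 10.2, 9.9])
forecast = np.array([10.3, 9.8, 10.2, 10.4, 9.6])
acc = anomaly_correlation(forecast, observed, forecast.mean(), observed.mean())
print(acc)
```

Comparing this score for initialised hindcasts against the same score for uninitialised runs is one way to separate skill that comes from initial conditions from skill that comes from external forcing.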
Abstract:
To plan testing activities, testers face the challenge of determining a strategy, including a test coverage criterion that offers an acceptable compromise between the available resources and test goals. Known theoretical properties of coverage criteria do not always help and, thus, empirical data are needed. The results of an experimental evaluation of several coverage criteria for finite state machines (FSMs) are presented, namely, state and transition coverage; initialisation fault and transition fault coverage. The first two criteria focus on FSM structure, whereas the other two on potential faults in FSM implementations. The authors elaborate a comparison approach that includes random generation of FSM, construction of an adequate test suite and test minimisation for each criterion to ensure that tests are obtained in a uniform way. The last step uses an improved greedy algorithm.
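The test minimisation step mentioned above can be pictured as a set-cover problem: keep a small subset of tests that still covers every required item (states, transitions or faults). The sketch below uses the plain greedy heuristic, not the improved variant the authors refer to, and the test names and covered transitions are hypothetical.

```python
def greedy_minimise(coverage):
    """Select tests greedily until every covered item is covered once.

    coverage maps a test name to the set of items (for example FSM
    transitions) that the test exercises.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        # Pick the test covering the most still-uncovered items;
        # sorting the names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected

# Hypothetical suite: each test covers some of the transitions tr1..tr5.
suite = {
    "t1": {"tr1", "tr2", "tr3"},
    "t2": {"tr3", "tr4"},
    "t3": {"tr4", "tr5"},
    "t4": {"tr1", "tr5"},
}
print(greedy_minimise(suite))
```

Applying the same minimisation to the suite built for each criterion keeps the comparison uniform, since differences in suite size then reflect the criterion and not how the tests were generated.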
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)