866 results for High-Dimensional Space Geometrical Informatics (HDSGI)
Abstract:
In this paper a multiple-classifier machine learning methodology for Predictive Maintenance (PdM) is presented. PdM is a prominent strategy for dealing with maintenance issues, given the increasing need to minimize downtime and associated costs. One of the challenges in PdM is generating so-called 'health factors', quantitative indicators of the status of a system with respect to a given maintenance issue, and determining their relationship to operating costs and failure risk. The proposed PdM methodology allows dynamic decision rules to be adopted for maintenance management and can be used with high-dimensional and censored data. This is achieved by training multiple classification modules with different prediction horizons, so as to provide different trade-offs between the frequency of unexpected breaks and the amount of unexploited lifetime, and then using this information in an operating-cost-based maintenance decision system to minimise expected costs. The effectiveness of the methodology is demonstrated on a simulated example and on a benchmark semiconductor manufacturing maintenance problem.
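As a concrete illustration of the idea, the sketch below trains one classifier per prediction horizon and picks the action whose expected cost is lowest. This is a minimal sketch, not the paper's implementation: the horizons, cost constants, random-forest learner, and cost formula are all assumptions.

    # Minimal sketch of a multi-horizon PdM classifier bank (illustrative only;
    # horizons, costs, and data layout are assumptions, not the paper's).
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    HORIZONS = [5, 10, 20]          # prediction horizons (cycles ahead), assumed
    C_BREAK, C_EARLY = 100.0, 1.0   # cost of an unexpected break vs. one cycle of unexploited lifetime

    def make_labels(rul, horizon):
        """Label a unit 'at risk' if its remaining useful life falls within the horizon."""
        return (np.asarray(rul) <= horizon).astype(int)

    def fit_bank(X, rul):
        """Train one classifier per prediction horizon."""
        bank = {}
        for h in HORIZONS:
            clf = RandomForestClassifier(n_estimators=100, random_state=0)
            clf.fit(X, make_labels(rul, h))
            bank[h] = clf
        return bank

    def decide(bank, x):
        """Pick the horizon whose expected cost of acting now is lowest."""
        costs = {}
        for h, clf in bank.items():
            p_fail = clf.predict_proba(x.reshape(1, -1))[0, 1]
            # acting at horizon h trades break risk against ~h cycles of unused life
            costs[h] = p_fail * C_BREAK + (1 - p_fail) * C_EARLY * h
        return min(costs, key=costs.get)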
Abstract:
Freshwater ecosystems, responsible for important environmental functions and for supplying irreplaceable goods and services, have been severely affected by anthropogenic disturbance. The conversion of forest to agricultural land affects aquatic systems through a series of mechanisms: sedimentation; nutrient enrichment; contamination; hydrological alterations; and removal of riparian vegetation. Freshwater macroinvertebrate communities, given their diversity, ubiquity and sensitivity to environmental disturbance, prove particularly well suited to studies assessing the ecological integrity of systems exposed simultaneously to multiple impact factors. The systematic use of biological responses to assess environmental change, or biomonitoring, can be carried out with a range of methodologies, which generally do not consider functional aspects of biological communities and have geographically restricted applicability. Biomonitoring through biological traits (characteristics reflecting the adaptation of species to their environment) emerges as a promising tool for resolving these problems and offers additional advantages: direct cause-effect relationships; improved discrimination of impacts; and integration of natural variability. This study presents a critical review of the current state of the art in the use of biological traits in biomonitoring. Up to the time of publication, no other work was available with the conceptual basis of using macroinvertebrate traits as community descriptors and for the biomonitoring and management of freshwater systems. The ecological theories underpinning these methodologies (the habitat templet and landscape filters concepts) and the studies that have applied them in real settings are described, with attention drawn to technical issues and possible solutions. Future needs in this field include: the development of a single, widely applicable biomonitoring tool; a better understanding of natural variability in biological communities; reduction of the effects of biological trade-offs and syndromes; additional autecological studies; and the detection of specific impacts in complex impact scenarios. One of the aims of this study was to contribute to the improvement of trait-based biomonitoring techniques, focusing on stream macroinvertebrate communities in different biogeographic regions (the catchments of the Little and Salmon rivers in New Brunswick, Canada; the Anllóns in Galicia, Spain; and the Reventazón in Cartago, Costa Rica). In each region, gradients of agricultural land use were studied, ranging from catchments almost entirely covered by forest to catchments dominated by intensive agriculture. Along each land-use gradient, the characterization of the biological community (by sampling macroinvertebrates in riffle reaches) was accompanied by the characterization of the surrounding habitat (including catchment properties, water chemistry and other local-scale properties).
The macroinvertebrate community was characterized using taxonomic information, structural metrics, diversity indices, tolerance metrics, biotic indices, and compiled biological traits covering general physiology, life history and resistance to disturbance. Univariate and multivariate statistical analyses were used to reveal the biological and physico-chemical gradients, confirm their co-variation, test the significance of impact-level discrimination and make inter-regional comparisons. Community structure reflected the complex impact gradients, which in turn co-varied significantly with the land-use gradients. The impact gradients were related mainly to nutrient input and sedimentation. The biological gradients defined by the selected structural measures co-varied with the impact gradients studied, although only some structural variables individually discriminated the a priori land-use categories. No consistency was detected in the responses of the structural measures across biogeographic regions, confirming that purely taxonomic interpretations of impacts are difficult to extrapolate between regions. The biological gradients defined by the selected traits also co-varied with the disturbance gradients, and yielded a better discrimination of land-use categories. In the different regions, the most impacted sites were discriminated on the basis of a similar set of traits, including size, voltinism, reproductive techniques, microhabitat, current and substrate preferences, feeding habits and resistance forms. This set may come to be used to predictively assess the effects of the severe land-use changes imposed by agriculture. When the communities of the three regions were analysed together through traits, a moderate but significant discrimination of impact levels was obtained. These analyses support the evidence that changes in aquatic macroinvertebrate communities at sites under intensive agriculture can follow a convergent trajectory in multidimensional space, regardless of geographic factors. Pointers were provided for identifying specific parameters that should be considered when planning new biomonitoring programmes based on benthic macroinvertebrate communities, for application in a truly ecological river management, in these and other regions. Possible future lines of research were also suggested.
Abstract:
In this paper a complex-order van der Pol oscillator is considered. The complex derivative $D^{\alpha \pm \jmath\beta}$, with $\alpha, \beta \in \mathbb{R}^{+}$, is a generalization of the concept of integer-order derivative, which corresponds to $\alpha = 1$, $\beta = 0$. Applying the concept of complex derivative yields a high-dimensional parameter space. Amplitude and period values of the periodic solutions of the two versions of the complex-order van der Pol oscillator are studied as these parameters are varied. Fourier transforms of the periodic solutions of the two oscillators are also analyzed.
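For orientation, the classical van der Pol oscillator, recovered at $\alpha = 1$, $\beta = 0$, is

    \ddot{x}(t) - \mu \left(1 - x^{2}(t)\right) \dot{x}(t) + x(t) = 0,

and one natural complex-order generalization, assuming it is the damping derivative that is generalized (the two sign choices in $\pm\jmath\beta$ presumably giving the two versions referred to above), reads

    \ddot{x}(t) - \mu \left(1 - x^{2}(t)\right) D^{\alpha \pm \jmath\beta} x(t) + x(t) = 0.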
Abstract:
This work presents new, efficient Markov chain Monte Carlo (MCMC) simulation methods for statistical analysis in various modelling applications. When using MCMC methods, the model is simulated repeatedly to explore the probability distribution describing the uncertainties in model parameters and predictions. In adaptive MCMC methods based on the Metropolis-Hastings algorithm, the proposal distribution needed by the algorithm learns from the target distribution as the simulation proceeds. Adaptive MCMC methods have been the subject of intensive research lately, as they make the methodology considerably easier to use. The lack of user-friendly computer programs has been a main obstacle to wider adoption of the methods. This work provides two new adaptive MCMC methods: DRAM and AARJ. The DRAM method has been built especially to work in high-dimensional and non-linear problems. The AARJ method is an extension of DRAM for model selection problems, where the mathematical formulation of the model is uncertain and we want to fit several different models to the same observations simultaneously. The methods were developed with the needs of modelling applications typical in environmental sciences in mind, and the development work was pursued in the course of several application projects. The applications presented in this work are: a wintertime oxygen concentration model for Lake Tuusulanjärvi and adaptive control of the aerator; a nutrient model for Lake Pyhäjärvi and lake management planning; validation of the algorithms of the GOMOS ozone remote sensing instrument on board the Envisat satellite of the European Space Agency; and a study of the effects of aerosol model selection on the GOMOS algorithm.
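The adaptive half of DRAM follows the well-known adaptive-Metropolis recipe of updating a Gaussian proposal covariance from the chain history. The sketch below shows that half only; the delayed-rejection stage is omitted, and the tuning constants are the usual textbook choices, not necessarily those of this work.

    # Minimal sketch of the adaptive-Metropolis component (delayed rejection omitted).
    import numpy as np

    def adaptive_metropolis(log_post, x0, n_iter=10000, adapt_start=500):
        d = len(x0)
        sd = 2.4**2 / d                      # standard adaptive-Metropolis scaling
        cov = np.eye(d) * 0.1                # initial proposal covariance, assumed
        chain = np.empty((n_iter, d))
        x, lp = np.asarray(x0, float), log_post(x0)
        for i in range(n_iter):
            prop = np.random.multivariate_normal(x, cov)
            lp_prop = log_post(prop)
            if np.log(np.random.rand()) < lp_prop - lp:   # accept/reject step
                x, lp = prop, lp_prop
            chain[i] = x
            if i >= adapt_start:             # adapt the proposal to the chain history
                cov = sd * (np.cov(chain[:i + 1].T) + 1e-8 * np.eye(d))
        return chain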
Abstract:
We study the workings of factor analysis of high-dimensional data using artificial series generated from a large, multi-sector dynamic stochastic general equilibrium (DSGE) model. The objective is to use the DSGE model as a laboratory that allows us to shed some light on the practical benefits and limitations of applying factor analysis techniques to economic data. We explain in what sense the artificial data can be thought of as having a factor structure, study the theoretical and finite-sample properties of the principal components estimates of the factor space, investigate the substantive reasons for the good performance of diffusion index forecasts, and assess the quality of the factor analysis of highly disaggregated data. In all our exercises, we explain the precise relationship between the factors and the basic macroeconomic shocks postulated by the model.
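The principal-components estimate of the factor space, and a diffusion index forecast built on it, can be sketched as follows. The standardization, single-lag structure, and variable names are illustrative choices, not the paper's exact setup.

    # Sketch of principal-components factor extraction and a diffusion index forecast.
    import numpy as np

    def extract_factors(X, n_factors):
        """Estimate the factor space of a T x N panel by principal components."""
        Xs = (X - X.mean(0)) / X.std(0)           # standardize each series
        U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
        return U[:, :n_factors] * S[:n_factors]   # T x r estimated factors

    def diffusion_index_forecast(y, F):
        """One-step forecast of y (length T, aligned with F) from lagged factors."""
        Z = np.column_stack([np.ones(len(F) - 1), F[:-1]])  # intercept + lagged factors
        beta, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)    # least-squares fit
        return np.r_[1.0, F[-1]] @ beta           # forecast for period T + 1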
Abstract:
This master's thesis presents a new unsupervised approach for detecting and segmenting urban areas in hyperspectral images. The proposed method involves three steps. First, in order to reduce the computational cost of our algorithm, a colour image of the spectral content is estimated. To this end, a non-linear dimensionality-reduction step, based on two complementary but conflicting criteria of good visualization, namely accuracy and contrast, is carried out to produce a colour display of each hyperspectral image. Then, to discriminate urban from non-urban regions, the second step extracts a few discriminant (and complementary) features from this colour rendering of the hyperspectral image. To this end, we extracted a set of discriminant parameters describing the characteristics of an urban area, which is mainly composed of man-made objects with simple, regular geometric shapes. We used textural features based on grey levels, on gradient magnitude or on parameters derived from the co-occurrence matrix, combined with structural features based on the local orientation of the image gradient and on local detection of straight line segments. To further reduce the computational complexity of our approach and to avoid the "curse of dimensionality" that arises when clustering high-dimensional data, in the final step we decided to classify each textural or structural feature individually with a simple K-means procedure and then to combine these coarse, cheaply obtained segmentations with an efficient segmentation-map fusion model. The experiments reported here show that this strategy is visually effective and compares favourably with other methods for detecting and segmenting urban areas in hyperspectral images.
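A minimal sketch of that final step might look as follows, with plain pixel-wise majority voting standing in for the fusion model actually used, which is more elaborate.

    # Rough sketch: cluster each texture/structure feature map separately with
    # K-means, then fuse the coarse binary segmentations by majority vote.
    import numpy as np
    from sklearn.cluster import KMeans

    def segment_feature(fmap, k=2):
        """Cluster one H x W feature map into k labels (k = 2: urban / non-urban)."""
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(
            fmap.reshape(-1, 1))
        # orient labels so that 1 marks the cluster with the higher feature mean
        flat = fmap.reshape(-1)
        if flat[labels == 0].mean() > flat[labels == 1].mean():
            labels = 1 - labels
        return labels.reshape(fmap.shape)

    def fuse(feature_maps):
        """Pixel-wise majority vote over the coarse segmentations."""
        votes = np.stack([segment_feature(f) for f in feature_maps])
        return (votes.mean(axis=0) > 0.5).astype(int)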
Abstract:
Markov chain Monte Carlo (MCMC) methods are methods for sampling from probability distributions. These techniques rely on running Markov chains whose stationary laws are the distributions to be sampled. Given their ease of application, they are among the most widely used approaches in the statistical community, particularly in Bayesian analysis. They are very popular tools for sampling from complex and/or high-dimensional probability distributions. Since the appearance of the first MCMC method in 1953 (the Metropolis method, see [10]), interest in these methods, as well as the range of available algorithms, has kept growing year after year. Although the Metropolis-Hastings algorithm (see [8]) can be considered one of the most general Markov chain Monte Carlo algorithms, it is also one of the simplest to understand and explain, which makes it an ideal algorithm to start with. It has been developed further by several researchers. The multiple-try Metropolis (MTM) algorithm, introduced into the statistical literature by [9], is considered an interesting development in this field, but unfortunately its implementation is very costly in terms of time. Recently, a new algorithm was developed by [1]: the revisited multiple-try Metropolis algorithm (revisited MTM), which recasts the standard MTM method mentioned above as a Metropolis-Hastings algorithm on an extended space. The objective of this work is, first, to present MCMC methods, and then to study and analyse the Metropolis-Hastings and standard MTM algorithms in order to give readers a better understanding of how these methods are implemented. A second objective is to study the prospects and drawbacks of the revisited MTM algorithm in order to see whether it meets the expectations of the statistical community. Finally, we attempt to combat the revisited MTM algorithm's tendency to remain at the current state (its sedentariness), which gives rise to an entirely new algorithm. This new algorithm performs well when the number of candidates generated at each iteration is small, but its performance degrades as the number of candidates grows.
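For concreteness, here is a minimal sketch of one standard MTM step with a symmetric Gaussian proposal and weights w(y|x) = pi(y); the revisited variant of [1], which operates on an extended space, is not shown.

    # One step of standard multiple-try Metropolis (symmetric proposal,
    # weights w(y|x) = pi(y)); k and step are illustrative choices.
    import numpy as np

    def mtm_step(pi, x, k=5, step=0.5):
        """pi is an unnormalized target density, assumed positive at the candidates."""
        ys = x + step * np.random.randn(k)            # k candidates from the current state
        wy = np.array([pi(y) for y in ys])
        j = np.random.choice(k, p=wy / wy.sum())      # select one candidate by weight
        y = ys[j]
        xs = y + step * np.random.randn(k - 1)        # reference points drawn from y
        wx = np.array([pi(z) for z in xs] + [pi(x)])  # current state completes the set
        alpha = min(1.0, wy.sum() / wx.sum())         # generalized acceptance ratio
        return y if np.random.rand() < alpha else x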
Abstract:
Deep learning algorithms are a new family of powerful machine learning methods. The idea is to combine layers of latent factors into hierarchies. This often entails a higher computational cost and also increases the number of model parameters. Applying these methods to larger-scale problems therefore requires reducing their cost as well as improving their regularization and optimization. This thesis addresses the question from these three perspectives. We first study the problem of reducing the cost of certain deep algorithms. We propose two methods for training restricted Boltzmann machines and denoising auto-encoders on sparse, high-dimensional distributions. This is important for applying these algorithms to natural language processing. Both methods (Dauphin et al., 2011; Dauphin and Bengio, 2013) use importance sampling to sample the objective of these models. We observe that this significantly reduces training time: the speed-up reaches two orders of magnitude on several benchmarks. Second, we introduce a powerful regularizer for deep methods. Experimental results show that a good regularizer is crucial for obtaining good performance with large networks (Hinton et al., 2012). In Rifai et al. (2011), we propose a new regularizer that combines unsupervised learning with tangent propagation (Simard et al., 1992). This method exploits geometric principles and, at the time of publication, achieved state-of-the-art results. Finally, we consider the problem of optimizing high-dimensional non-convex surfaces such as those of neural networks. Traditionally, the abundance of local minima was considered the main difficulty in these problems. In Dauphin et al. (2014a) we argue, drawing on results from statistical physics, random matrix theory, neural network theory and experiments, that a deeper difficulty comes from the proliferation of saddle points. In that paper we also propose a new method for non-convex optimization.
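The importance-sampling trick for sparse inputs can be sketched roughly as follows: evaluate the reconstruction error on all non-zero dimensions plus a reweighted random sample of the zeros, keeping the estimator unbiased. This is only the core idea; the actual estimator of Dauphin et al. (2011) differs in its details.

    # Illustrative sampled reconstruction loss for sparse, high-dimensional inputs.
    import numpy as np

    def sampled_reconstruction_loss(x, x_hat, n_sampled=64, rng=np.random):
        nz = np.flatnonzero(x)                         # always keep the non-zeros
        zeros = np.flatnonzero(x == 0)
        sampled = rng.choice(zeros, size=min(n_sampled, len(zeros)), replace=False)
        w = len(zeros) / max(len(sampled), 1)          # importance weight for the zeros
        err = (x - x_hat) ** 2                         # squared reconstruction error
        return err[nz].sum() + w * err[sampled].sum()  # unbiased estimate of the full loss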
Abstract:
We develop an algorithm that computes the gravitational potentials and forces on N point masses interacting in three-dimensional space. The algorithm, based on analytical techniques developed by Rokhlin and Greengard, runs in O(N) time. In contrast to other fast N-body methods such as tree codes, which only approximate the interaction potentials and forces, this method is exact: it computes the potentials and forces to within any prespecified tolerance, up to machine precision. We present an implementation of the algorithm for a sequential machine. We numerically verify the algorithm and compare its speed with that of an O(N^2) direct force computation. We also describe a parallel version of the algorithm that runs on the Connection Machine in O(log N) time. We compare experimental results with those of the sequential implementation and discuss how to minimize communication overhead on the parallel machine.
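The O(N^2) direct computation used as the comparison baseline is straightforward to write down; a vectorized sketch, with G = 1 and an optional softening length as assumptions, is given below.

    # Direct O(N^2) gravitational force computation (baseline; units with G = 1).
    import numpy as np

    def direct_forces(pos, mass, eps=0.0):
        """Pairwise forces on N point masses; pos is N x 3, mass has length N."""
        d = pos[None, :, :] - pos[:, None, :]          # displacement vectors r_j - r_i
        r2 = (d ** 2).sum(-1) + eps ** 2               # squared distances (+ softening)
        np.fill_diagonal(r2, np.inf)                   # exclude self-interaction
        inv_r3 = r2 ** -1.5
        # F_i = sum_j m_i m_j (r_j - r_i) / |r_j - r_i|^3
        return mass[:, None] * (d * (mass[None, :] * inv_r3)[..., None]).sum(1)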
Abstract:
We have developed a technique called RISE (Random Image Structure Evolution), by which one may systematically sample continuous paths in a high-dimensional image space. A basic RISE sequence depicts the evolution of an object's image from a random field, along with the reverse sequence which depicts the transformation of this image back into randomness. The processing steps are designed to ensure that important low-level image attributes such as the frequency spectrum and luminance are held constant throughout a RISE sequence. Experiments based on the RISE paradigm can be used to address some key open issues in object perception. These include determining the neural substrates underlying object perception, the role of prior knowledge and expectation in object perception, and the developmental changes in object perception skills from infancy to adulthood.
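One plausible way to generate such a sequence, holding the Fourier magnitude (and hence the frequency spectrum) fixed while morphing only the phases from random to the original image, is sketched below. Whether RISE interpolates phases exactly this way is an assumption; the published procedure should be consulted for details.

    # Sketch of a RISE-like sequence: fixed Fourier magnitude, interpolated phases.
    import numpy as np

    def rise_sequence(img, n_frames=20, rng=np.random):
        F = np.fft.fft2(img)
        mag, phase = np.abs(F), np.angle(F)
        rand_phase = rng.uniform(-np.pi, np.pi, size=phase.shape)
        frames = []
        for t in np.linspace(0.0, 1.0, n_frames):      # t = 0: random field, t = 1: object
            p = (1 - t) * rand_phase + t * phase
            # interpolated phases break conjugate symmetry, so keep the real part
            frames.append(np.real(np.fft.ifft2(mag * np.exp(1j * p))))
        return frames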
Abstract:
This paper tackles the path planning problem for oriented vehicles travelling in a non-Euclidean three-dimensional space, the spherical space S³. For such a problem, the orientation of the vehicle is naturally represented by the orthonormal frame bundle, the rotation group SO(4). Orthonormal frame bundles of space forms coincide with their isometry groups, and the focus therefore shifts to control systems defined on Lie groups. The oriented vehicles, in this case, are constrained to travel at constant speed in a forward direction, with their angular velocities directly controlled. In this paper we identify controls that induce steady motions of these oriented vehicles and yield closed-form parametric expressions for these motions. The paths these vehicles trace are defined explicitly in terms of the controls and are therefore invariant with respect to the coordinate system used to describe the motion.
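Because constant controls generate one-parameter subgroups, such steady motions take the closed form g(t) = g(0) exp(tA) for a fixed A in so(4). The sketch below evaluates a motion of this form; the particular generator A is an arbitrary illustration, not one derived in the paper.

    # Steady motion on SO(4) as a one-parameter subgroup: g(t) = g(0) expm(t A).
    import numpy as np
    from scipy.linalg import expm

    def steady_motion(g0, A, ts):
        """Frames along the one-parameter subgroup generated by A."""
        assert np.allclose(A, -A.T), "A must be skew-symmetric (an element of so(4))"
        return [g0 @ expm(t * A) for t in ts]

    # constant forward speed plus one constant angular velocity, as an example
    A = np.zeros((4, 4))
    A[0, 1], A[1, 0] = 1.0, -1.0    # forward translation component on S^3
    A[2, 3], A[3, 2] = 0.5, -0.5    # constant angular velocity, assumed value
    frames = steady_motion(np.eye(4), A, np.linspace(0, 2 * np.pi, 50))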
Abstract:
This report describes the analysis and development of novel tools for the global optimisation of relevant mission design problems. A taxonomy was created for mission design problems, and an empirical analysis of their optimisation complexity was performed: it was demonstrated that global optimisation is necessary for most classes, and this informed the selection of appropriate global algorithms. The selected algorithms were then applied to the different problem classes; Differential Evolution was found to be the most efficient. Considering the specific problem of multiple gravity assist trajectory design, a search space pruning algorithm was developed that displays both polynomial time and space complexity. Empirically, it was shown to typically achieve search space reductions of more than six orders of magnitude, thus significantly reducing the complexity of the subsequent optimisation. The algorithm was fully implemented in a software package that allows simple visualisation of high-dimensional search spaces and effective optimisation over the reduced search bounds.
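Since Differential Evolution came out as the most efficient algorithm, a minimal usage sketch is given below. The Rastrigin test function and the six-dimensional box are stand-ins for the actual mission design problems.

    # Minimal Differential Evolution usage sketch on a stand-in global test function.
    import numpy as np
    from scipy.optimize import differential_evolution

    def rastrigin(x):
        """Classic multimodal benchmark; global minimum 0 at the origin."""
        x = np.asarray(x)
        return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

    bounds = [(-5.12, 5.12)] * 6                      # 6-D box, dimension assumed
    result = differential_evolution(rastrigin, bounds, maxiter=500, seed=0)
    print(result.x, result.fun)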
Abstract:
For more than half a century, emotion researchers have attempted to establish the dimensional space that most economically accounts for similarities and differences in emotional experience. Today, many researchers focus exclusively on two-dimensional models involving valence and arousal. Adopting a theoretically based approach, we show for three languages that four dimensions are needed to satisfactorily represent similarities and differences in the meaning of emotion words. In order of importance, these dimensions are evaluation-pleasantness, potency-control, activation-arousal, and unpredictability. They were identified on the basis of the applicability of 144 features representing the six components of emotions: (a) appraisals of events, (b) psychophysiological changes, (c) motor expressions, (d) action tendencies, (e) subjective experiences, and (f) emotion regulation.
Abstract:
Mean field models (MFMs) of cortical tissue incorporate salient, average features of neural masses in order to model activity at the population level, thereby linking microscopic physiology to macroscopic observations, e.g., with the electroencephalogram (EEG). One of the common aspects of MFM descriptions is the presence of a high-dimensional parameter space capturing neurobiological attributes deemed relevant to the brain dynamics of interest. We study the physiological parameter space of a MFM of electrocortical activity and discover robust correlations between physiological attributes of the model cortex and its dynamical features. These correlations are revealed by the study of bifurcation plots, which show that the model responses to changes in inhibition belong to two archetypal categories or “families”. After investigating and characterizing them in depth, we discuss their essential differences in terms of four important aspects: power responses with respect to the modeled action of anesthetics, reaction to exogenous stimuli such as thalamic input, and distributions of model parameters and oscillatory repertoires when inhibition is enhanced. Furthermore, while the complexity of sustained periodic orbits differs significantly between families, we are able to show how metamorphoses between the families can be brought about by exogenous stimuli. We here unveil links between measurable physiological attributes of the brain and dynamical patterns that are not accessible by linear methods. They instead emerge when the nonlinear structure of parameter space is partitioned according to bifurcation responses. We call this general method “metabifurcation analysis”. The partitioning cannot be achieved by the investigation of only a small number of parameter sets and is instead the result of an automated bifurcation analysis of a representative sample of 73,454 physiologically admissible parameter sets. Our approach generalizes straightforwardly and is well suited to probing the dynamics of other models with large and complex parameter spaces.
Abstract:
The Asian summer monsoon is a high-dimensional and highly nonlinear phenomenon involving considerable moisture transport from the ocean towards land, and is critical for the whole region. We have used daily ECMWF reanalysis (ERA-40) sea-level pressure (SLP) anomalies relative to the seasonal cycle, over the region 50-145°E, 20°S-35°N, to study the nonlinearity of the Asian monsoon using Isomap. We have focused on the two-dimensional embedding of the SLP anomalies for ease of interpretation. Unlike the unimodality obtained from tests performed in empirical orthogonal function space, the probability density function within the two-dimensional Isomap space turns out to be bimodal. A clustering procedure applied to the SLP data, however, reveals support for three clusters, which are identified using a three-component bivariate Gaussian mixture model. The modes appear similar to the active and break phases of the monsoon over South Asia, in addition to a third phase showing active conditions over the Western North Pacific. Using the low-level wind field anomalies, the active phase over South Asia is found to be characterised by a strengthening and an eastward extension of the Somali jet, whereas during the break phase the Somali jet is weakened near southern India and the monsoon trough in northern India also weakens. Interpretation is aided by the APHRODITE gridded land precipitation product for monsoon Asia. The effect of the large-scale seasonal mean monsoon and of lower boundary forcing, in the form of ENSO, is also investigated and discussed. The outcome is that ENSO is shown to perturb the intraseasonal regimes, in agreement with conceptual ideas.
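The pipeline described above, a two-dimensional Isomap embedding followed by a three-component Gaussian mixture, can be sketched as follows; the neighbourhood size and other hyperparameters are stand-in values, not those of the study.

    # Sketch: 2-D Isomap embedding of SLP anomalies + 3-component Gaussian mixture.
    import numpy as np
    from sklearn.manifold import Isomap
    from sklearn.mixture import GaussianMixture

    def monsoon_regimes(slp_anomalies, n_neighbors=20, n_regimes=3):
        """slp_anomalies: days x gridpoints array of SLP anomalies."""
        emb = Isomap(n_neighbors=n_neighbors, n_components=2).fit_transform(
            slp_anomalies)
        gmm = GaussianMixture(n_components=n_regimes, covariance_type='full',
                              random_state=0).fit(emb)
        return emb, gmm.predict(emb)                  # 2-D coordinates + regime labels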