972 results for Gaussian and t-copulas


Relevance: 30.00%

Abstract:

Machine Learning for geospatial data: algorithms, software tools and case studies

The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can solve classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical coordinates and additional relevant spatially referenced features ("geo-features"). They are well suited to implementation as predictive engines in decision support systems, for purposes of environmental data mining ranging from pattern recognition to modeling and prediction, as well as automatic data mapping. Their efficiency is competitive with geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from a theoretical description of the concepts to their software implementation. The main algorithms and models considered are the multilayer perceptron (MLP, a workhorse of machine learning), general regression neural networks (GRNN), probabilistic neural networks (PNN), self-organising (Kohonen) maps (SOM), Gaussian mixture models (GMM), radial basis function networks (RBF) and mixture density networks (MDN). This set of models covers machine learning tasks such as classification, regression and density estimation.

Exploratory data analysis (EDA) is the initial and a very important part of any data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both the traditional geostatistical approach, experimental variography, and machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations and helps to detect the presence of spatial patterns, at least those described by two-point statistics. A machine learning approach to ESDA is presented through the k-nearest neighbours (k-NN) method, which is simple and has very good interpretation and visualisation properties. An important part of the thesis deals with a current hot topic, the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model for this task. The performance of the GRNN is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data, where it significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters: theory, applications, software tools and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office, developed over the last 15 years and used both in many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and in fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, the classification of soil types and hydrogeological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools for exploratory data analysis and visualisation were also developed, with care taken to provide a user-friendly and easy-to-use interface.
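The GRNN proposed here for automatic mapping is, at its core, Nadaraya-Watson kernel regression: each prediction is a Gaussian-kernel-weighted average of the training targets, with a single bandwidth to tune. A minimal sketch (illustrative only, not the Machine Learning Office implementation; all names and values are our own):

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma):
    """General Regression Neural Network, i.e. Nadaraya-Watson kernel
    regression with a Gaussian kernel of bandwidth sigma."""
    # Squared Euclidean distances between every query and training point
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))   # kernel weights
    return (w @ y_train) / w.sum(axis=1)   # weighted average of targets

# Hypothetical usage on 2-D spatial coordinates:
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))                  # sampled locations
y = np.sin(4 * X[:, 0]) + 0.1 * rng.normal(size=200)  # noisy field values
grid = rng.uniform(0, 1, size=(50, 2))                # prediction locations
z = grnn_predict(X, y, grid, sigma=0.05)
```

In practice sigma would be chosen by cross-validation, which is what makes the GRNN attractive for automatic mapping: it has essentially one free parameter.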

Relevance: 30.00%

Abstract:

We analyse the variations produced in tsunami propagation and impact on a straight coastline by the presence of a submarine canyon incised in the continental margin. For ease of calculation we assume that the shoreline and the shelf edge are parallel and that the incident wave approaches them normally. A total of 512 synthetic scenarios have been computed by combining the bathymetry of a continental margin incised by a parameterised single canyon with the incident tsunami waves. The margin bathymetry, the canyon and the tsunami waves have been generated using mathematical functions (e.g. Gaussian). Canyon parameters analysed are: (i) incision length into the continental shelf, which for a constant shelf width relates directly to the distance from the canyon head to the coast, (ii) canyon width, and (iii) canyon orientation with respect to the shoreline. Tsunami wave parameters considered are period and sign. The COMCOT tsunami model from Cornell University was applied to propagate the waves across the synthetic bathymetric surfaces. Five simulations of tsunami propagation over a non-canyoned margin were also performed for reference. The analysis of the results reveals a strong variation of tsunami arrival times and amplitudes reaching the coastline when a tsunami wave travels over a submarine canyon, with changes in the location and alongshore extension of the maximum height. In general, the presence of a submarine canyon shortens the arrival time to the shoreline but prevents wave build-up just over the canyon axis. This leads to a decrease in tsunami amplitude at the coastal stretch located just shoreward of the canyon head, which results in a lower run-up in comparison with a non-canyoned margin. Conversely, an increased wave build-up occurs on both sides of the canyon head, generating two coastal stretches with an enhanced run-up. These aggravated or reduced tsunami effects are modified by (i) the proximity of the canyon tip to the coast, amplifying the wave height, (ii) the canyon width, enlarging the coastal stretches with lower and higher maximum wave heights, and (iii) the canyon obliquity with respect to the shoreline and shelf edge, increasing wave height shoreward of the leeward flank of the canyon. Moreover, the presence of a submarine canyon near the coast produces a variation of wave energy along the shore, eventually resulting in edge waves shoreward of the canyon head. Edge waves subsequently spread out alongshore, reaching significant amplitudes especially when coupling with tsunami secondary waves occurs. Model results have been ground-truthed using the actual bathymetry of the Blanes Canyon area in the North Catalan margin. This paper underlines the presence, morphology and orientation of submarine canyons as determining factors in tsunami propagation and impact, which could prevail over other effects deriving from coastal configuration.
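The synthetic setup lends itself to a compact illustration. Below is a toy construction of a Gaussian canyon incised into a planar margin, in the spirit of the parameterisation described above; the functional forms, grid and all parameter values are our own assumptions, not the paper's:

```python
import numpy as np

def synthetic_bathymetry(nx=400, ny=200, shelf_depth=120.0, basin_depth=2000.0,
                         canyon_width=10.0, incision_length=60.0,
                         canyon_depth=800.0):
    """Planar shelf/slope deepening offshore, with a Gaussian-shaped canyon
    whose head sits `incision_length` grid cells shoreward of the margin
    edge. Illustrative only; the paper's parameterisation may differ."""
    x = np.arange(nx)                      # cross-shore axis (0 = coast)
    y = np.arange(ny)                      # alongshore axis
    X, Y = np.meshgrid(x, y)
    # Planar margin: shallow shelf deepening offshore
    z = -(shelf_depth + (basin_depth - shelf_depth) * X / nx)
    head = nx - incision_length            # canyon head: smaller = nearer coast
    incision = canyon_depth * np.exp(-((Y - ny / 2) ** 2)
                                     / (2 * canyon_width ** 2))
    # Incision ramps up seaward of the canyon head
    z -= incision * np.clip((X - head) / incision_length, 0.0, 1.0)
    return z
```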

Relevance: 30.00%

Abstract:

This paper presents a validation study on statistical unsupervised brain tissue classification techniques in magnetic resonance (MR) images. Several image models assuming different hypotheses regarding the intensity distribution model, the spatial model and the number of classes are assessed. The methods are tested on simulated data for which the classification ground truth is known. Different levels of noise and intensity nonuniformity are added to simulate real imaging conditions. No enhancement of the image quality is considered either before or during the classification process; this way, the accuracy of the methods and their robustness against image artifacts are tested. Classification is also performed on real data, where a quantitative validation compares the methods' results with a ground truth estimated from manual segmentations by experts. The validity of the various classification methods, both in labeling the image and in estimating tissue volume, is assessed with different local and global measures. Results demonstrate that methods relying on both intensity and spatial information are more robust to noise and field inhomogeneities. We also demonstrate that partial volume is not perfectly modeled, even though methods that account for mixture classes outperform methods that only consider pure Gaussian classes. Finally, we show that results on simulated data can also be extended to real data.
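For the intensity-only end of the model spectrum assessed here, a pure-Gaussian-class baseline is easy to sketch with scikit-learn. This is a hypothetical illustration with simulated intensities, not the paper's models (which also include spatial priors and mixture classes); with no spatial information it is exactly the kind of classifier the study finds least robust to noise and bias fields:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Simulated 1-D voxel intensities for three tissue classes (values invented)
rng = np.random.default_rng(1)
intensities = np.concatenate([
    rng.normal(40, 6, 5000),              # CSF-like
    rng.normal(90, 8, 8000),              # grey-matter-like
    rng.normal(130, 7, 7000),             # white-matter-like
]).reshape(-1, 1)

# Three pure Gaussian classes fitted to intensity alone
gmm = GaussianMixture(n_components=3, random_state=0).fit(intensities)
labels = gmm.predict(intensities)             # hard tissue labels per voxel
posteriors = gmm.predict_proba(intensities)   # soft memberships (partial-volume-like)
```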

Relevance: 30.00%

Abstract:

This paper addresses the estimation of the code phase (pseudorange) and the carrier phase of the direct signal received from a direct-sequence spread-spectrum satellite transmitter. The signal is received by an antenna array in a scenario with interference and multipath propagation. These two effects are generally the limiting error sources in most high-precision positioning applications. A new estimator of the code and carrier phases is derived by using a simplified signal model and the maximum likelihood (ML) principle. The simplified model consists essentially of gathering all signals, except for the direct one, into a component with unknown spatial correlation. The estimator exploits the knowledge of the direction-of-arrival of the direct signal and is much simpler than other estimators derived under more detailed signal models. Moreover, we present an iterative algorithm that is adequate for a practical implementation and explores an interesting link between the ML estimator and a hybrid beamformer. The mean squared error and bias of the new estimator are computed for a number of scenarios and compared with those of other methods. The presented estimator and the hybrid beamforming outperform the existing techniques of comparable complexity and attain, in many situations, the Cramér–Rao lower bound of the problem at hand.
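A plausible form of the simplified model described above, in our own notation (the paper's exact parameterisation may differ): the array snapshot is the direct signal along a known steering vector plus a residual of unknown spatial covariance,

```latex
\mathbf{x}(t) = \alpha\,\mathbf{a}(\theta_0)\,c(t-\tau)\,e^{j\phi} + \mathbf{n}(t),
\qquad \mathbf{n}(t) \sim \mathcal{CN}\!\left(\mathbf{0},\,\mathbf{Q}\right),
```

where c(t − τ) is the delayed spreading code (τ carries the code phase), φ is the carrier phase, a(θ0) is the steering vector toward the known direction of arrival, and Q absorbs multipath, interference and noise. Typically in such ML derivations the nuisance parameters (α, Q) are eliminated analytically, concentrating the likelihood on (τ, φ).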

Relevance: 30.00%

Abstract:

The parameter setting of a differential evolution algorithm must meet several requirements: efficiency, effectiveness, and reliability. Problems vary, and the solution of a particular problem can be represented in different ways; an algorithm most efficient with one representation may be less efficient with others. The development of differential evolution-based methods contributes substantially to research on evolutionary computing and global optimization in general. The objective of this study is to investigate the differential evolution algorithm, the intelligent adjustment of its control parameters, and its application. In the thesis, the differential evolution algorithm is first examined using different parameter settings and test functions. Fuzzy control is then employed to make the control parameters adaptive, based on the optimization process and expert knowledge. The developed algorithms are applied to training radial basis function networks for function approximation, with the tunable variables including the centers, widths, and weights of the basis functions, both with control parameters kept fixed and with them adjusted by the fuzzy controller. After the influence of the control variables on the performance of the differential evolution algorithm was explored, an adaptive version of the algorithm was developed and differential evolution-based radial basis function network training approaches were proposed. Experimental results showed that the performance of the differential evolution algorithm is sensitive to parameter setting, and the best setting was found to be problem dependent. The fuzzy adaptive differential evolution algorithm relieves the user of the burden of parameter setting and performs better than versions using all fixed parameters. Differential evolution-based approaches are effective for training Gaussian radial basis function networks.
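For reference, the classic DE/rand/1/bin scheme that such studies start from can be sketched compactly. This is generic differential evolution, not the thesis's implementation; F and CR below are exactly the control parameters a fuzzy controller would adapt online:

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=30, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Classic DE/rand/1/bin: mutation from three random individuals,
    binomial crossover, greedy selection."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = len(lo)
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([f(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: three distinct individuals, none equal to i
            a, b, c = rng.choice([j for j in range(pop_size) if j != i],
                                 size=3, replace=False)
            mutant = np.clip(pop[a] + F * (pop[b] - pop[c]), lo, hi)
            # Binomial crossover with one guaranteed mutant gene
            mask = rng.random(dim) < CR
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, pop[i])
            # Greedy selection
            ft = f(trial)
            if ft <= fit[i]:
                pop[i], fit[i] = trial, ft
    return pop[fit.argmin()], fit.min()

# Usage on the sphere test function in 10 dimensions:
best_x, best_f = differential_evolution(lambda x: np.sum(x ** 2),
                                        bounds=[(-5, 5)] * 10)
```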

Relevance: 30.00%

Abstract:

This paper proposes a very simple method for increasing the algorithm speed when separating sources from PNL mixtures or inverting Wiener systems. The method is based on a pertinent initialization of the inverse system, whose computational cost is very low. The nonlinear part is roughly approximated by pushing the observations to be Gaussian; this provides a surprisingly good approximation even when the basic assumption is not fully satisfied. The linear part is initialized so that the outputs are decorrelated. Experiments show an impressive speed improvement.
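One plausible reading of this initialization, sketched below under our own assumptions: gaussianize each observed channel through its empirical CDF (the "push to Gaussian"), then decorrelate the result with PCA whitening for the linear part. The paper's exact construction may differ:

```python
import numpy as np
from scipy.stats import norm

def gaussianize(v):
    """Marginal gaussianization via the empirical CDF (rank transform)."""
    ranks = v.argsort().argsort()       # 0..n-1 ranks
    u = (ranks + 0.5) / len(v)          # empirical CDF values in (0, 1)
    return norm.ppf(u)                  # map to a standard normal marginal

def decorrelate(X):
    """Linear part: PCA whitening so that the outputs are decorrelated."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return (Xc @ eigvec) / np.sqrt(eigval)

# Usage: gaussianize each observed channel, then decorrelate
X = np.random.default_rng(2).normal(size=(1000, 3)) ** 3   # toy nonlinear data
G = np.column_stack([gaussianize(X[:, k]) for k in range(X.shape[1])])
Y = decorrelate(G)
```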

Relevance: 30.00%

Abstract:

The electrical and electroluminescence (EL) properties at room and high temperatures of oxide/nitride/oxide (ONO)-based light-emitting capacitors are studied. The ONO multidielectric layer is enriched with silicon by means of ion implantation. The excess silicon distribution follows a Gaussian profile with a maximum of 19%, centered close to the lower oxide/nitride interface. The electrical measurements performed at room and high temperatures made it possible to unambiguously identify variable range hopping (VRH) as the dominant electrical conduction mechanism at low voltages, whereas at moderate and high voltages a hybrid conduction mechanism, combining variable range hopping and space-charge-limited current enhanced by the Poole-Frenkel effect, predominates. The EL spectra at different temperatures are also recorded, and the correlation between charge transport mechanisms and EL properties is discussed.
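For orientation, the textbook forms of the two mechanisms invoked above (standard expressions, not quoted from the paper):

```latex
% Mott variable range hopping (3-D):
\sigma_{\mathrm{VRH}}(T) = \sigma_0 \exp\!\left[-\left(\frac{T_0}{T}\right)^{1/4}\right]

% Poole-Frenkel lowering of the emission barrier:
J_{\mathrm{PF}} \propto E\,\exp\!\left[-\,\frac{q\left(\phi_B - \sqrt{qE/(\pi\varepsilon)}\right)}{k_B T}\right]
```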

Relevance: 30.00%

Abstract:

We use the analogy between scattering of a wave from a potential and the precession of a spin-half particle in a magnetic field to gain insight into the design of an antireflection coating for electrons in a semiconductor superlattice. It is shown that the classic recipes derived for optics are generally not applicable, due to the different dispersion law for electrons. Using the stability conditions, we show that a Poisson distribution of impedance steps is a better approximation than a Gaussian distribution. Examples are given of filters with average transmissivity exceeding 95% over an allowed band.

Relevance: 30.00%

Abstract:

We investigate how correlations between the diversity of the connectivity of networks and the dynamics at their nodes affect the macroscopic behavior. In particular, we study the synchronization transition of coupled stochastic phase oscillators that represent the node dynamics. Crucially in our work, the variability in the number of connections of the nodes is correlated with the width of the frequency distribution of the oscillators. By numerical simulations on Erdős–Rényi networks, where the frequencies of the oscillators are Gaussian distributed, we make the counterintuitive observation that an increase in the strength of the correlation is accompanied by an increase in the critical coupling strength for the onset of synchronization. We further observe that the critical coupling can depend solely on the average number of connections or even completely lose its dependence on the network connectivity. Only beyond this state does a weighted mean-field approximation break down. If noise is present, the correlations have to be stronger to yield similar observations.
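A minimal simulation sketch of the setting described above; the network construction, parameter values and the particular way the degree-frequency correlation is realised are our own assumptions, not the paper's:

```python
import numpy as np

def kuramoto_order(A, omega, K, dt=0.01, steps=2000, noise=0.0, seed=0):
    """Euler-Maruyama simulation of noisy Kuramoto phase oscillators on a
    network with adjacency matrix A; returns the time-averaged order
    parameter r after discarding a transient."""
    rng = np.random.default_rng(seed)
    n = len(omega)
    theta = rng.uniform(0, 2 * np.pi, n)
    rs = []
    for t in range(steps):
        # coupling[i] = sum_j A[i, j] * sin(theta_j - theta_i)
        coupling = (A * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
        theta += dt * (omega + K * coupling) \
                 + np.sqrt(2 * noise * dt) * rng.normal(size=n)
        if t >= steps // 2:
            rs.append(abs(np.exp(1j * theta).mean()))
    return np.mean(rs)

# One way to realise the degree-frequency correlation: the Gaussian
# frequency spread of a node grows with its number of connections.
rng = np.random.default_rng(3)
n, p = 100, 0.1
A = np.triu(rng.random((n, n)) < p, 1).astype(float)
A = A + A.T                                      # symmetric Erdős–Rényi graph
deg = A.sum(axis=1)
omega = rng.normal(0, 1, n) * deg / deg.mean()   # width correlated with degree
r = kuramoto_order(A, omega, K=0.2)
```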

Relevance: 30.00%

Abstract:

RX J1826.2-1450/LS 5039 has recently been proposed to be a radio-emitting high-mass X-ray binary. In this paper, we present an analysis of its X-ray timing and spectroscopic properties using different instruments on board the RXTE satellite. The timing analysis indicates the absence of pulsed or periodic emission on time scales of 0.02-2000 s and 2-200 d, respectively. The source spectrum is well represented by a power-law model plus a Gaussian component describing a strong iron line at 6.6 keV. Significant emission is seen up to 30 keV, and no exponential cut-off at high energy is required. We also study the radio properties of the system according to the GBI-NASA Monitoring Program. RX J1826.2-1450/LS 5039 continues to display moderate radio variability with a clearly non-thermal spectral index. No strong radio outbursts have been detected after several months.
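Schematically, the fitted photon spectrum has the form (our notation, not the paper's):

```latex
N(E) = K\,E^{-\Gamma} \;+\; \frac{A}{\sqrt{2\pi}\,\sigma}\,
       \exp\!\left[-\frac{(E - E_{\mathrm{Fe}})^2}{2\sigma^2}\right],
\qquad E_{\mathrm{Fe}} \simeq 6.6\ \mathrm{keV},
```

with no high-energy cut-off term required up to 30 keV.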

Relevance: 30.00%

Abstract:

A continuous random variable is expanded as a sum of a sequence of uncorrelated random variables. These variables are the principal dimensions in continuous scaling on a distance function, an extension of classic scaling on a distance matrix. For a particular distance, these dimensions are principal components. Some properties are then studied and an inequality is obtained. Diagonal expansions are considered from the same continuous scaling point of view, by means of the chi-square distance. The geometric dimension of a bivariate distribution is defined and illustrated with copulas. It is shown that the dimension can have the power of the continuum.
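The diagonal expansions mentioned are of the classical Lancaster form, which for a bivariate density h with marginals f and g reads (standard background, not quoted from the paper):

```latex
h(x,y) = f(x)\,g(y)\left[\,1 + \sum_{k \ge 1} \rho_k\, a_k(x)\, b_k(y)\right],
```

where {a_k} and {b_k} are orthonormal function systems with respect to f and g and the ρ_k are canonical correlations; the number of terms needed in such an expansion relates to the geometric dimension discussed above.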

Relevance: 30.00%

Abstract:

In this work, we use the rule of mixtures to develop an equivalent material model in which the total strain energy density is split into an isotropic part related to the matrix component and an anisotropic energy contribution related to the fiber effects. For the isotropic energy part, we select the amended non-Gaussian strain energy density model, while the fiber energy effects are added by considering the equivalent anisotropic volumetric fraction contribution, as well as the isotropized representation form of the eight-chain energy model that accounts for the material anisotropic effects. Furthermore, our proposed material model uses a phenomenological non-monotonous softening function that predicts stress-softening effects and has an energy term, derived from pseudo-elasticity theory, that accounts for residual strain deformations. The model's theoretical predictions are compared with experimental data collected from human vaginal tissues, mice skin, poly(glycolide-co-caprolactone) (PGC25 3-0) and polypropylene suture materials, and tracheal and brain human tissues. In all cases examined here, our equivalent material model closely follows the stress-softening and residual-strain effects exhibited by the experimental data.
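The energy split described above can be written schematically as (our notation; the paper's precise form is more elaborate):

```latex
\Psi = (1 - v_f)\,\Psi_{\mathrm{iso}} \;+\; v_f\,\Psi_{\mathrm{aniso}},
```

with v_f the equivalent fiber volume fraction, Ψ_iso the amended non-Gaussian matrix energy and Ψ_aniso the isotropized eight-chain contribution carrying the fiber effects; the phenomenological softening function and the pseudo-elastic energy term then account for stress softening and residual strains.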

Relevance: 30.00%

Abstract:

We generalize to arbitrary waiting-time distributions some results previously derived for discrete distributions. We show that, for any two waiting-time distributions with the same mean delay time, the one with higher dispersion leads to a faster front. Experimental data on the speed of virus infections in a plaque are correctly explained by the theoretical predictions using a Gaussian delay-time distribution, which is more realistic for this system than the Dirac delta distribution considered previously [J. Fort and V. Méndez, Phys. Rev. Lett. 89, 178101 (2002)].

Relevance: 30.00%

Abstract:

The ongoing development of digital media has brought a new set of challenges with it. As images containing more than three wavelength bands, often called spectral images, are becoming a more integral part of everyday life, problems in the quality of the RGB reproduction from spectral images have turned into an important area of research. The notion of image quality is often thought to comprise two distinct areas, image quality itself and image fidelity, both dealing with similar questions: image quality is the degree of excellence of the image, while image fidelity measures how closely the image under study matches the original. In this thesis, both image fidelity and image quality are considered, with an emphasis on the influence of color and spectral image features on both. Very few works are dedicated to the quality and fidelity of spectral images. Several novel image fidelity measures were developed in this study, including kernel similarity measures and 3D-SSIM (structural similarity index). The kernel measures incorporate the polynomial, Gaussian radial basis function (RBF) and sigmoid kernels. 3D-SSIM is an extension of the traditional gray-scale SSIM measure, developed to incorporate spectral data. The novel image quality model presented in this study is based on the assumption that the statistical parameters of the spectra of an image influence its overall appearance. The spectral image quality model comprises three parameters of quality: colorfulness, vividness and naturalness. Quality prediction is done by modeling the preference function expressed in JNDs (just noticeable differences). Both the image fidelity measures and the image quality model proved effective in the respective experiments.
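A minimal sketch of a kernel similarity between two spectra, under our own assumptions. The kernel choices follow the abstract (polynomial, Gaussian RBF, sigmoid); the cosine-style normalisation and all parameter values are illustrative, not the thesis's definitions:

```python
import numpy as np

def kernel_similarity(s1, s2, kind="rbf", gamma=1.0, degree=3, c=1.0):
    """Similarity between two spectra as a normalised kernel value,
    k(s1, s2) / sqrt(k(s1, s1) * k(s2, s2)).  Note the normalisation
    assumes positive self-similarities, which always holds for the RBF
    kernel but needs care for the sigmoid kernel."""
    def k(a, b):
        if kind == "rbf":
            return np.exp(-gamma * np.sum((a - b) ** 2))
        if kind == "poly":
            return (a @ b + c) ** degree
        if kind == "sigmoid":
            return np.tanh(gamma * (a @ b) + c)
        raise ValueError(kind)
    return k(s1, s2) / np.sqrt(k(s1, s1) * k(s2, s2))

# Usage on two hypothetical 31-band reflectance spectra:
rng = np.random.default_rng(4)
ref = rng.random(31)
test = ref + 0.05 * rng.normal(size=31)
print(kernel_similarity(ref, test, kind="rbf", gamma=0.5))
```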

Relevance: 30.00%

Abstract:

Speaker diarization is the process of sorting speech according to the speaker. Diarization helps to search and retrieve what a certain speaker uttered in a meeting. Applications of diarization systems extend to domains other than meetings, for example lectures, telephone, television, and radio. Besides, diarization enhances the performance of several speech technologies such as speaker recognition, automatic transcription, and speaker tracking. Methodologies previously used in developing diarization systems are discussed, and prior results and techniques are studied and compared. Methods such as Hidden Markov Models and Gaussian Mixture Models that are used in speaker recognition and other speech technologies are also used in speaker diarization. The objective of this thesis is to develop a speaker diarization system in the meeting domain. The experimental part of this work indicates that the zero-crossing rate can be used effectively to break the audio stream into segments, and that adaptive Gaussian models fit short audio segments adequately. Results show that 35 Gaussian models and an average segment length of one second are optimal values for building a diarization system for the tested data. Segments uttered by the same speaker are united in a bottom-up clustering, using a new approach of categorizing the mixture weights.
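A minimal sketch of the zero-crossing-rate segmentation step described above, under our own assumptions (frame sizes, threshold and the roughly one-second minimum segment length are illustrative, not the thesis's values):

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of successive samples whose sign differs."""
    return np.mean(np.abs(np.diff(np.signbit(frame).astype(int))))

def segment_by_zcr(signal, sr, frame_ms=25, hop_ms=10, threshold=0.15,
                   min_len_s=1.0):
    """Cut the audio stream at frames whose ZCR exceeds a threshold,
    enforcing a minimum segment length of about one second."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    boundaries = [0]
    for start in range(0, len(signal) - frame, hop):
        zcr = zero_crossing_rate(signal[start:start + frame])
        if zcr > threshold and start - boundaries[-1] >= min_len_s * sr:
            boundaries.append(start)
    boundaries.append(len(signal))
    return list(zip(boundaries[:-1], boundaries[1:]))

# Usage on ten seconds of hypothetical 16 kHz audio:
sr = 16000
audio = np.random.default_rng(5).normal(size=10 * sr)
segments = segment_by_zcr(audio, sr)
```

Each resulting segment would then be modeled (e.g. by an adaptive Gaussian model) and merged with same-speaker segments in the bottom-up clustering stage.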