965 results for "Maximum-entropy probability density"
Abstract:
The subdivision of space into cells by a process of random nucleation and growth is of interest in many scientific fields. In this paper, we deduce the expected value and variance of the resulting cell-size distribution, assuming that the space subdivision process follows the premises of the Kolmogorov-Johnson-Mehl-Avrami (KJMA) model. No restrictions are imposed on the time dependence of the nucleation and growth rates. We also develop an approximate analytical probability density function for the cell size. Finally, we apply our approach to the distributions resulting from solid-phase crystallization under isochronal heating conditions.
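For reference, the KJMA premises (random nucleation, unrestricted growth in three dimensions) lead to the standard expression for the transformed fraction with time-dependent nucleation rate I(t) and growth rate G(t); the abstract does not spell it out, so this is the textbook form:

```latex
% KJMA transformed fraction for time-dependent rates (3D growth)
X(t) = 1 - \exp\!\left[-\frac{4\pi}{3}\int_0^{t} I(t')
        \left(\int_{t'}^{t} G(t'')\,dt''\right)^{3} dt'\right]
```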
Abstract:
The properties and cosmological importance of a class of non-topological solitons, Q-balls, are studied. Aspects of Q-ball solutions and Q-ball cosmology discussed in the literature are reviewed. Q-balls are considered in particular within the Minimal Supersymmetric Standard Model, with supersymmetry broken by a hidden-sector mechanism mediated by either gravity or gauge interactions. Q-ball profiles, charge-energy relations, and evaporation rates for realistic Q-ball profiles are calculated for general polynomial potentials and for the gravity-mediated scenario. In all cases, the evaporation rates are found to increase with decreasing charge. Q-ball collisions are studied numerically in the two supersymmetry-breaking scenarios. The collision processes can be divided into three types: fusion, charge transfer, and elastic scattering. Cross-sections are calculated for the different types of processes in the different scenarios. The formation of Q-balls from the fragmentation of the Affleck-Dine condensate is studied by numerical and analytical means. The charge distribution is found to depend strongly on the initial energy-charge ratio of the condensate. The final state typically consists of Q-balls and anti-Q-balls in a state of maximum entropy. By studying the relaxation of excited Q-balls, the rate at which excess energy can be emitted is calculated in the gravity-mediated scenario. The Q-ball is also found to withstand excess energy well without significant charge loss. The possible cosmological consequences of these Q-ball properties are discussed.
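For orientation (standard textbook definitions, not taken from this thesis; sign conventions vary between authors), a Q-ball is the stationary configuration φ(x,t) = f(r)e^{iωt} of a complex scalar field with potential U, for which the conserved U(1) charge and the energy read:

```latex
Q = i\int d^{3}x\,\bigl(\phi\,\partial_t\phi^{*}-\phi^{*}\partial_t\phi\bigr)
  = 2\omega\int d^{3}x\, f^{2},\qquad
E = \int d^{3}x\,\bigl[\omega^{2}f^{2}+(\nabla f)^{2}+U(f)\bigr].
```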
Abstract:
The Proctor test is time-consuming and requires sampling several kilograms of soil. Proctor test parameters were predicted for Mollisols, Entisols, and Vertisols of the Pampean region of Argentina under different management systems. They were estimated from a minimal set of readily available soil properties (soil texture, total organic C) and management (training data set; n = 73). The results were used to generate a soil compaction susceptibility model, which was subsequently validated on a second, independent data set (test data set; n = 24). Soil maximum bulk density was estimated as follows:

Maximum bulk density (Mg m⁻³) = 1.4756 − 0.00599 × total organic C (g kg⁻¹) + 0.0000275 × sand (g kg⁻¹) + 0.0539 × management

where management equals 0 for uncropped, untilled soils and 1 for conventionally tilled soils. The established models predicted the Proctor test parameters reasonably well from readily available soil properties. Tillage systems induced changes in the maximum bulk density regardless of total organic matter content or soil texture. The lower maximum apparent bulk density values under no-tillage require a revision of the relative compaction thresholds for different no-tillage crops.
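Because the fitted coefficients appear in the abstract, the prediction can be coded directly. A minimal sketch (function and variable names are ours, not the authors'):

```python
def max_bulk_density(total_organic_c, sand, conventionally_tilled):
    """Predicted Proctor maximum bulk density (Mg m^-3).

    total_organic_c: total organic C in g kg^-1
    sand:            sand content in g kg^-1
    conventionally_tilled: True -> management = 1, False -> management = 0
    """
    management = 1.0 if conventionally_tilled else 0.0
    return (1.4756
            - 0.00599 * total_organic_c
            + 0.0000275 * sand
            + 0.0539 * management)

# Example: 20 g/kg organic C, 400 g/kg sand, conventional tillage
print(round(max_bulk_density(20.0, 400.0, True), 3))
```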
Abstract:
We study the effects of the time and space correlations of an external additive colored noise on the steady-state behavior of a time-dependent Ginzburg-Landau model. Simulations show the existence of nonequilibrium phase transitions controlled by both the correlation time and the correlation length of the noise. A Fokker-Planck equation and the steady-state probability density of the process are obtained by means of a theoretical approximation.
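As a concrete example of a noise with both a correlation time τ and a correlation length λ (a common modeling choice; the abstract does not specify the exact form), one may take Gaussian noise with exponential correlations:

```latex
\langle \eta(\mathbf{x},t)\,\eta(\mathbf{x}',t')\rangle \propto
  \exp\!\left(-\frac{|t-t'|}{\tau}\right)
  \exp\!\left(-\frac{|\mathbf{x}-\mathbf{x}'|}{\lambda}\right)
```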
Abstract:
Stochastic processes defined by a general Langevin equation of motion, where the noise is the non-Gaussian dichotomous Markov noise, are studied. A non-Fokker-Planck master differential equation is deduced for the probability density of these processes. Two different models are solved exactly. In the second one, a nonequilibrium bimodal distribution induced by the noise is observed for a critical value of its correlation time. Critical slowing down appears not at this point but at another one.
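For context, the standard properties of a symmetric dichotomous Markov noise (not restated in the abstract): it jumps between two values ±Δ at rate γ and is exponentially correlated,

```latex
\xi(t)\in\{-\Delta,+\Delta\},\qquad
\langle \xi(t)\,\xi(t')\rangle = \Delta^{2} e^{-2\gamma|t-t'|},\qquad
\tau_{c}=\frac{1}{2\gamma}.
```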
Abstract:
We apply the formalism of the continuous-time random walk (CTRW) to the study of financial data. The entire distribution of prices can be obtained once two auxiliary densities are known: the probability density for the pausing time between successive jumps and the corresponding probability density for the magnitude of a jump. We have applied the formalism to data on the U.S. dollar-Deutsche mark futures exchange, finding good agreement between theory and the observed data.
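In the decoupled CTRW, with pausing-time density ψ(t) and jump-size density λ(x), the full propagator follows from the classical Montroll-Weiss equation in Fourier-Laplace space (standard background; the abstract does not write it out):

```latex
\hat{p}(k,s)=\frac{1-\hat{\psi}(s)}{s}\,
             \frac{1}{1-\hat{\lambda}(k)\,\hat{\psi}(s)}
```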
Abstract:
We have studied the relaxation dynamics of a dilute assembly of ferromagnetic particles in suspension. A formalism based on the Smoluchowski equation, describing the evolution of the probability density for the directions of the magnetic moment and of the axis of easy magnetization of the particles, has been developed. We compute the rotational viscosity from a Green-Kubo formula and give an expression for the relaxation time of the particles, which follows from the dynamic equations for the correlation functions. Our results for the relaxation time agree quite well with experiments performed on different samples of ferromagnetic particles in which either the magnetic energy, associated with the interaction between the magnetic moments and the external field, or the anisotropy energy plays a dominant role.
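Schematically, a Green-Kubo formula of the type used here relates a viscosity coefficient to the equilibrium autocorrelation of a stress component (generic form under standard assumptions; the paper's precise expression for the rotational viscosity may differ):

```latex
\eta = \frac{1}{V k_{B} T}\int_{0}^{\infty}
       \bigl\langle \Pi_{xy}(t)\,\Pi_{xy}(0)\bigr\rangle\,dt
```

where Π_xy is the relevant component of the total stress tensor of the suspension.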
Abstract:
A dynamical model based on the continuous addition of colored shot noises is presented. The resulting process is colored and non-Gaussian. A general expression for the characteristic function of the process is obtained which, after a scaling assumption, takes on a form that is the basis of the results derived in the rest of the paper. One of these is an expansion for the cumulants, which are all finite subject to mild conditions on the functions defining the process. This is in contrast with the Lévy distribution (which can be obtained from our model in certain limits), which has no finite moments. The evaluation of the spectral density and of the form of the probability density function in the tails of the distribution shows that the model exhibits a power-law spectrum and long tails in a natural way. A careful analysis of the characteristic function shows that it may be separated into a part representing a Lévy process and another part representing the deviation of our model from the Lévy process.
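For a single filtered Poisson (shot-noise) process, the characteristic function has a classical closed form (Campbell's theorem; the paper's model superposes colored noises of this general type):

```latex
X(t)=\sum_{i} h(t-t_{i}),\qquad
\bigl\langle e^{iuX}\bigr\rangle
  =\exp\!\left\{\lambda\int\bigl(e^{iu\,h(s)}-1\bigr)\,ds\right\},
```

where the events t_i occur at Poisson rate λ.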
Abstract:
We propose a generalization of the persistent random walk to dimensions greater than 1. Based on a cubic lattice, the model is suitable for an arbitrary dimension d. We study the continuum limit and obtain the equation satisfied by the probability density function for the position of the random walker. An exact solution is obtained for the projected motion along an axis. This solution, which is written in terms of the free-space solution of the one-dimensional telegrapher's equation, may open a new way to address the problem of light propagation through thin slabs.
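For reference, the one-dimensional telegrapher's equation mentioned in the abstract reads, with propagation speed v and characteristic time T:

```latex
\frac{\partial^{2}p}{\partial t^{2}}
  +\frac{1}{T}\frac{\partial p}{\partial t}
  = v^{2}\frac{\partial^{2}p}{\partial x^{2}}
```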
Abstract:
Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.
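To illustrate how stochastic simulation differs from single-value regression (an illustrative sketch only, not the methods of the paper): generate many realizations of a spatially correlated field and summarize them as an exceedance-probability map.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-dimensional grid for clarity; real studies use 2-D grids and
# conditional simulation honoring the measurements.
x = np.linspace(0.0, 10.0, 50)

# Exponential covariance model, unit sill, range parameter a (assumed values).
a = 2.0
cov = np.exp(-np.abs(x[:, None] - x[None, :]) / a)

# Cholesky factor turns iid standard normals into correlated realizations.
L = np.linalg.cholesky(cov + 1e-10 * np.eye(len(x)))

n_real, threshold = 500, 1.0
fields = L @ rng.standard_normal((len(x), n_real))

# Risk map: fraction of realizations exceeding the threshold at each node.
p_exceed = (fields > threshold).mean(axis=1)
print(p_exceed.round(2))
```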
Abstract:
Machine Learning for geospatial data: algorithms, software tools and case studies. This thesis is devoted to the analysis, modeling, and visualization of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In short, most machine learning algorithms are universal, adaptive, nonlinear, robust, and efficient modeling tools. They can solve classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces composed of geographical coordinates and additional relevant, spatially referenced features. They are well suited to implementation as predictive engines in decision support systems, for purposes of environmental data mining ranging from pattern recognition to modeling, prediction, and automatic data mapping. Their efficiency is competitive with geostatistical models in low-dimensional geographical spaces, and they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from theoretical description of the concepts to software implementation: the multilayer perceptron (MLP, a workhorse of machine learning), general regression neural networks (GRNN), probabilistic neural networks (PNN), self-organizing (Kohonen) maps (SOM), Gaussian mixture models (GMM), radial basis function networks (RBF), and mixture density networks (MDN). This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is the initial and a very important part of any data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are treated using both the traditional geostatistical approach of experimental variography and machine learning. Experimental variography, which studies the relations between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations and helps to detect spatial patterns described by two-point statistics. The machine learning approach to ESDA is presented through the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a topical problem: the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model for this task. The performance of the GRNN model is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where it significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office, developed over the last 15 years and used both in many teaching courses, including international workshops in China, France, Italy, Ireland, and Switzerland, and in fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil, and water pollution by radionuclides and heavy metals; classification of soil types and hydrogeological units; decision-oriented mapping with uncertainties; and natural-hazard (landslide, avalanche) assessment and susceptibility mapping. Complementary tools for exploratory data analysis and visualization were developed as well, with a user-friendly, easy-to-use interface.
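Since GRNN is mathematically equivalent to Nadaraya-Watson kernel regression, the automatic-mapping engine can be sketched in a few lines (an illustrative sketch under that equivalence, not the thesis software; σ is the single bandwidth parameter that would be tuned, e.g., by cross-validation):

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma):
    """GRNN prediction = Nadaraya-Watson kernel regression, Gaussian kernel.

    x_train: (n, d) training locations/geo-features
    y_train: (n,)   measured values
    x_query: (m, d) prediction locations
    sigma:   kernel bandwidth, the model's only free parameter
    """
    # Squared Euclidean distance from every query point to every sample.
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma**2))        # (m, n) kernel weights
    return (w @ y_train) / w.sum(axis=1)      # weighted mean per query point

# Toy usage: interpolate noisy samples of a 1-D signal.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=(40, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(40)
xq = np.linspace(0.0, 10.0, 5)[:, None]
print(grnn_predict(x, y, xq, sigma=0.5).round(2))
```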
Abstract:
A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence-absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models and evaluated the quality of model predictions with the area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS-INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to the other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM-GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample sizes (n < 30); this should encourage highly conservative use of predictions based on small samples, restricting them to exploratory modelling.
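Model quality here reduces to how well a model ranks observed presences above absences; a minimal sketch of the AUC evaluation with scikit-learn (the data values below are illustrative):

```python
from sklearn.metrics import roc_auc_score

# 1 = observed presence, 0 = observed absence (independent evaluation data)
y_true = [1, 1, 0, 1, 0, 0, 0, 1]
# Continuous suitability scores predicted by any of the algorithms
y_score = [0.9, 0.7, 0.4, 0.65, 0.2, 0.5, 0.1, 0.8]

# AUC = probability that a random presence is scored above a random absence
print(roc_auc_score(y_true, y_score))
```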
Abstract:
The quality of environmental data analysis and the propagation of errors are heavily affected by the representativity of the initial sampling design [CRE 93, DEU 97, KAN 04a, LEN 06, MUL 07]. Geostatistical methods such as kriging rely on field samples, whose spatial distribution is crucial for the correct detection of the phenomena. The literature on the design of environmental monitoring networks (MN) is widespread, and several interesting books have recently been published [GRU 06, LEN 06, MUL 07] to clarify the basic principles of spatial sampling design (monitoring network optimization); an optimization approach based on Support Vector Machines has also been proposed. Nonetheless, modelers often receive real data from environmental monitoring networks that suffer from problems of non-homogeneity (clustering). Clustering can be related to preferential sampling or to the impossibility of reaching certain regions.
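A standard remedy for clustered (preferentially sampled) networks, offered here as background rather than taken from this text, is cell declustering: weight each sample inversely to the occupancy of its grid cell. A minimal sketch:

```python
import numpy as np

def cell_declustering_weights(coords, cell_size):
    """Cell-declustering weights: samples in crowded cells get less weight.

    coords:    (n, 2) sample coordinates
    cell_size: side length of the square declustering cells
    """
    cells = np.floor(coords / cell_size).astype(int)
    # Count how many samples share each occupied cell.
    _, inverse, counts = np.unique(cells, axis=0,
                                   return_inverse=True, return_counts=True)
    w = 1.0 / counts[inverse]
    return w * len(coords) / w.sum()   # normalize to mean weight 1

# Three clustered samples plus one isolated sample.
coords = np.array([[0.10, 0.10], [0.20, 0.15], [0.15, 0.20], [5.0, 5.0]])
print(cell_declustering_weights(coords, cell_size=1.0))
```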
Abstract:
This correspondence studies the formulation of members of the Cohen-Posch class of positive time-frequency energy distributions. Minimization of cross-entropy measures with respect to different priors, and the case of no prior (maximum entropy), were considered. It is concluded that, in general, the information provided by the classical marginal constraints is very limited, and thus the final distribution heavily depends on the prior distribution. To overcome this limitation, joint time and frequency marginals are derived based on a "direction invariance" criterion on the time-frequency plane that are directly related to the fractional Fourier transform.
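The classical marginal constraints referred to are the standard ones for a positive time-frequency distribution P(t,f) of a signal s(t) with Fourier transform S(f):

```latex
\int P(t,f)\,df = |s(t)|^{2},\qquad
\int P(t,f)\,dt = |S(f)|^{2},\qquad
P(t,f)\ge 0.
```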
Abstract:
This paper presents a novel image classification scheme for benthic coral reef images that can be applied to both single-image and composite mosaic datasets. The proposed method can be configured to the characteristics of individual datasets (e.g., dataset size, number of classes, sample resolution, color information availability, class types). It uses completed local binary patterns (CLBP), the grey-level co-occurrence matrix (GLCM), Gabor filter responses, and opponent-angle and hue-channel color histograms as feature descriptors. For classification, either k-nearest neighbor (KNN), a neural network (NN), a support vector machine (SVM), or probability density weighted mean distance (PDWMD) is used. The combination of features and classifiers that attains the best results is presented, together with guidelines for selection. The accuracy and efficiency of the proposed method are compared with other state-of-the-art techniques using three benthic and three texture datasets. The proposed method achieves the highest overall classification accuracy of all tested methods and has moderate execution time. Finally, the proposed classification scheme is applied to a large-scale image mosaic of the Red Sea to create a completely classified thematic map of the reef benthos.
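One of the listed feature-classifier pairings (GLCM texture features with an SVM) can be sketched with scikit-image and scikit-learn; the patches and labels below are synthetic placeholders, and `graycomatrix`/`graycoprops` are the scikit-image (>= 0.19) names:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(gray_patch):
    """Texture descriptor from a grey-level co-occurrence matrix (GLCM)."""
    glcm = graycomatrix(gray_patch, distances=[1],
                        angles=[0, np.pi / 2], levels=256,
                        symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# Illustrative training loop on random 8-bit patches with placeholder labels.
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(20, 32, 32), dtype=np.uint8)
labels = np.tile([0, 1], 10)           # e.g. 0 = sand, 1 = coral

X = np.array([glcm_features(p) for p in patches])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:5]))
```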