853 results for heterogeneous regressions algorithms


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Simulated-annealing-based conditional simulations provide a flexible means of quantitatively integrating diverse types of subsurface data. Although such techniques are being increasingly used in hydrocarbon reservoir characterization studies, their potential in environmental, engineering and hydrological investigations is still largely unexploited. Here, we introduce a novel simulated annealing (SA) algorithm geared towards the integration of high-resolution geophysical and hydrological data which, compared to more conventional approaches, provides significant advancements in the way that large-scale structural information in the geophysical data is accounted for. Model perturbations in the annealing procedure are made by drawing from a probability distribution for the target parameter conditioned to the geophysical data. This is the only place where geophysical information is utilized in our algorithm, which is in marked contrast to other approaches where model perturbations are made through the swapping of values in the simulation grid and agreement with soft data is enforced through a correlation coefficient constraint. Another major feature of our algorithm is the way in which available geostatistical information is utilized. Instead of constraining realizations to match a parametric target covariance model over a wide range of spatial lags, we constrain the realizations only at smaller lags where the available geophysical data cannot provide enough information. Thus we allow the larger-scale subsurface features resolved by the geophysical data to exert much greater control on the output realizations. Further, since the only component of the SA objective function required in our approach is a covariance constraint at small lags, our method has improved convergence and computational efficiency over more traditional methods. 
Here, we present the results of applying our algorithm to the integration of porosity log and tomographic crosshole georadar data to generate stochastic realizations of the local-scale porosity structure. Our procedure is first tested on a synthetic data set, and then applied to data collected at the Boise Hydrogeophysical Research Site.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Recently, several anonymization algorithms have appeared for privacy preservation on graphs. Some of them are based on randomization techniques and others on k-anonymity concepts. Either approach can be used to obtain an anonymized graph with a given k-anonymity value. In this paper we compare algorithms based on both techniques in order to obtain an anonymized graph with a desired k-anonymity value. We analyze the complexity of these methods for generating anonymized graphs and the quality of the resulting graphs.
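
As a rough illustration of what a "k-anonymity value" of a graph can mean, the sketch below assumes the common degree-based definition (a graph is k-anonymous if every degree value is shared by at least k nodes). The abstract does not state which definition the paper uses, so this choice, the function name `degree_k_anonymity`, and the example graphs are all assumptions made for illustration.

```python
from collections import Counter


def degree_k_anonymity(edges, nodes=None):
    """Return the largest k such that every degree value occurs at least k times.

    Assumes an undirected graph given as (u, v) pairs. `nodes` may list extra
    (possibly isolated) nodes that do not appear in any edge.
    """
    degree = Counter()
    if nodes is not None:
        for n in nodes:
            degree[n] = 0  # register isolated nodes with degree 0
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return 0
    # Group nodes by degree; the smallest group size is the k-anonymity value.
    group_sizes = Counter(degree.values())
    return min(group_sizes.values())


# Hypothetical examples: a 4-cycle (all degrees equal) and a star (unique hub).
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
star = [(0, 1), (0, 2), (0, 3)]
k_cycle = degree_k_anonymity(cycle)
k_star = degree_k_anonymity(star)
```

In the cycle every node has degree 2, so all four nodes are indistinguishable by degree; in the star the hub has a unique degree, which is the kind of node an anonymization algorithm would need to hide by adding or removing edges.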

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Summary: This thesis is devoted to the analysis, modeling and visualization of spatially referenced environmental data using machine learning algorithms. Machine learning can be considered, in a broad sense, as a subfield of artificial intelligence concerned in particular with the development of techniques and algorithms that allow a machine to learn from data. In this thesis, machine learning algorithms are adapted for application to environmental data and spatial prediction. Why machine learning? Because most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient for modeling. They can solve classification, regression and probability density modeling problems in high-dimensional spaces composed of spatially referenced informative variables ("geo-features") in addition to the geographical coordinates. Moreover, they are ideal for implementation as decision-support tools for environmental questions ranging from pattern recognition to modeling and prediction, including automatic mapping. Their efficiency is comparable to that of geostatistical models in the space of geographical coordinates, but they are indispensable for high-dimensional data that include geo-features. The most important and most popular machine learning algorithms are presented theoretically and implemented as software for the environmental sciences. 
The main algorithms described are the multilayer perceptron (MLP), the best-known algorithm in artificial intelligence; the general regression neural network (GRNN); the probabilistic neural network (PNN); self-organizing maps (SOM); Gaussian mixture models (GMM); radial basis function networks (RBF); and mixture density networks (MDN). This range of algorithms covers varied tasks such as classification, regression and probability density estimation. Exploratory data analysis (EDA) is the first step of any data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are treated both according to the traditional geostatistical approach, with experimental variography, and according to the principles of machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations that makes it possible to detect the presence of spatial patterns describable by a statistic. The machine learning approach to ESDA is presented through the application of the k-nearest-neighbors method, which is very simple and has excellent interpretation and visualization qualities. An important part of the thesis deals with topical subjects such as the automatic mapping of spatial data. The general regression neural network is proposed to solve this task efficiently. 
The performance of the GRNN is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, for which the GRNN significantly outperforms all other methods, particularly in emergency situations. The thesis is composed of four chapters: theory, applications, software tools and guided examples. An important part of the work consists of a collection of software: Machine Learning Office. This software collection has been developed over the last 15 years and has been used for teaching numerous courses, including international workshops in China, France, Italy, Ireland and Switzerland, as well as in fundamental and applied research projects. The case studies considered cover a broad spectrum of real low- and high-dimensional geo-environmental problems, such as air, soil and water pollution by radioactive products and heavy metals, the classification of soil types and hydrogeological units, uncertainty mapping for decision support, and natural hazard assessment (landslides, avalanches). Complementary tools for exploratory data analysis and visualization have also been developed, with care taken to create a user-friendly, easy-to-use interface.
Machine Learning for geospatial data: algorithms, software tools and case studies. Abstract: The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence. It is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? 
In a few words, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multilayer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks, and mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both the traditional geostatistical approach, such as experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations, which helps to detect the presence of spatial patterns, at least those described by two-point statistics. 
A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a current hot topic, namely the automatic mapping of geospatial data. General regression neural networks (GRNN) are proposed as an efficient model for this task. The performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed over the last 15 years and were used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydrogeological units, decision-oriented mapping with uncertainties, and natural hazard (landslide, avalanche) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
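
To make the GRNN idea concrete, the following is a minimal sketch of a general regression neural network prediction in its standard kernel-regression (Nadaraya-Watson) form with an isotropic Gaussian kernel. It is not the Machine Learning Office implementation; the function name `grnn_predict`, the single shared bandwidth `sigma`, the nearest-neighbor fallback, and the sample sites are assumptions for illustration only.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Predict at `query` as a Gaussian-kernel-weighted mean of training targets."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared Euclidean distance from the query to every training location.
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    weights = [math.exp(-d2 / (2.0 * sigma ** 2)) for d2 in sq_dists]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed to zero: fall back to the nearest training point
        # (an assumption of this sketch, not part of the GRNN definition).
        nearest = min(range(len(train_x)), key=sq_dists.__getitem__)
        return train_y[nearest]
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Two hypothetical measurement sites (x, y coordinates) with observed values.
sites = [(0.0, 0.0), (1.0, 0.0)]
values = [0.0, 2.0]
midpoint = grnn_predict(sites, values, (0.5, 0.0), sigma=0.5)
far_away = grnn_predict(sites, values, (1000.0, 0.0), sigma=0.5)
```

The only free parameter is the kernel bandwidth `sigma`, which in practice would be tuned, for example by cross-validation; having so few parameters to set is one reason this kind of model lends itself to automatic mapping.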

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Abstract

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This work focuses on the prediction of the two main nitrogenous variables that describe the water quality at the effluent of a Wastewater Treatment Plant. We have developed two kinds of neural network architectures, based on considering either only one output or, on the other hand, the usual five effluent variables that define the water quality: suspended solids, biochemical organic matter, chemical organic matter, total nitrogen and total Kjeldahl nitrogen. Two learning techniques, based on a classical adaptive gradient and on a Kalman filter, have been implemented. In order to improve generalization and performance, we have selected variables by means of genetic algorithms and fuzzy systems. The training, testing and validation sets show that the final networks are able to learn the available simulated data sufficiently well, especially for total nitrogen.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Inference of Markov random field image segmentation models is usually performed using iterative methods which adapt the well-known expectation-maximization (EM) algorithm for independent mixture models. However, some of these adaptations are ad hoc and may turn out to be numerically unstable. In this paper, we review three EM-like variants for Markov random field segmentation and compare their convergence properties at both the theoretical and practical levels. We specifically advocate a numerical scheme involving asynchronous voxel updating, for which general convergence results can be established. Our experiments on brain tissue classification in magnetic resonance images provide evidence that this algorithm may achieve significantly faster convergence than its competitors while yielding at least as good segmentation results.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Networks are evolving toward a ubiquitous model in which heterogeneous devices are interconnected. Cryptographic algorithms are required for developing security solutions that protect network activity. However, the computational and energy limitations of network devices jeopardize the actual implementation of such mechanisms. In this paper, we perform a wide analysis of the cost of running symmetric and asymmetric cryptographic algorithms, hash chain functions, elliptic curve cryptography and pairing-based cryptography on personal agendas, and compare them with the costs of basic operating system functions. Results show that although cryptographic power costs are high and such operations should be restricted in time, they are not the main limiting factor of the autonomy of a device.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The paper presents some contemporary approaches to spatial environmental data analysis. The main topics concentrate on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, and the development and application of conditional stochastic simulation models. The innovative part of the paper presents an integrated/hybrid model: machine learning (ML) residuals sequential simulations (MLRSS). The models are based on multilayer perceptron and support vector regression ML algorithms, used for modeling long-range spatial trends, and on sequential simulations of the residuals. ML algorithms deliver nonlinear solutions for spatially non-stationary problems, which are difficult for the geostatistical approach. Geostatistical tools (variography) are used to characterize the performance of ML algorithms by analyzing the quality and quantity of the spatially structured information extracted from data with ML algorithms. Sequential simulations provide an efficient assessment of uncertainty and spatial variability. A case study from the Chernobyl fallout illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data-driven and geostatistical model-based approaches, can be used efficiently in the decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a Bayesian approach to the design of transmit prefiltering matrices in closed-loop schemes robust to channel estimation errors. The algorithms are derived for a multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) system. Two different optimization criteria are analyzed: the minimization of the mean square error and the minimization of the bit error rate. In both cases, the transmitter design is based on the singular value decomposition (SVD) of the conditional mean of the channel response, given the channel estimate. The performance of the proposed algorithms is analyzed, and their relationship with existing algorithms is indicated. As with other previously proposed solutions, the minimum bit error rate algorithm converges to the open-loop transmission scheme for very poor CSI estimates.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Many engineering problems that can be formulated as constrained optimization problems result in solutions given by a waterfilling structure; the classical example is the capacity-achieving solution for a frequency-selective channel. For simple waterfilling solutions with a single waterlevel and a single constraint (typically, a power constraint), some algorithms have been proposed in the literature to compute the solutions numerically. However, some other optimization problems result in significantly more complicated waterfilling solutions that include multiple waterlevels and multiple constraints. For such cases, it may still be possible to obtain practical algorithms to evaluate the solutions numerically but only after a painstaking inspection of the specific waterfilling structure. In addition, a unified view of the different types of waterfilling solutions and the corresponding practical algorithms is missing. The purpose of this paper is twofold. On the one hand, it overviews the waterfilling results existing in the literature from a unified viewpoint. On the other hand, it bridges the gap between a wide family of waterfilling solutions and their efficient implementation in practice; to be more precise, it provides a practical algorithm to evaluate numerically a general waterfilling solution, which includes the currently existing waterfilling solutions and others that may possibly appear in future problems.
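
For readers unfamiliar with the simple case mentioned above, here is a minimal sketch of classical waterfilling with a single waterlevel and a single power constraint: each channel i with noise-to-gain ratio n_i receives power p_i = max(0, mu - n_i), with the waterlevel mu chosen so that the powers sum to the budget. This is the textbook single-constraint procedure, not the general algorithm proposed in the paper; the function name `waterfill` and the example numbers are assumptions.

```python
def waterfill(noise, total_power):
    """Return (waterlevel, powers) for p_i = max(0, level - noise_i), sum(p) = total_power.

    `noise` holds the noise-to-gain ratio of each channel (assumed non-negative).
    """
    if total_power < 0:
        raise ValueError("total_power must be non-negative")
    ordered = sorted(noise)  # best (least noisy) channels first
    level = 0.0
    running_noise = 0.0
    for k, n in enumerate(ordered, start=1):
        running_noise += n
        # Waterlevel if exactly the k best channels were active.
        candidate = (total_power + running_noise) / k
        if candidate > n:
            level = candidate  # the k-th channel still sits below the water
        else:
            break  # the k-th channel would get no power; stop adding channels
    powers = [max(0.0, level - n) for n in noise]
    return level, powers


# Hypothetical example: three channels and a power budget of 3.
level, powers = waterfill([1.0, 2.0, 5.0], 3.0)
```

With these numbers the two best channels are filled up to a common level while the noisiest channel stays dry, which is the characteristic behaviour of the waterfilling structure.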

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, two probabilistic adaptive algorithms for jointly detecting active users in a DS-CDMA system are reported. The first one, which is based on the theory of hidden Markov models (HMMs) and the Baum–Welch (BW) algorithm, is proposed within the CDMA scenario and compared with the second one, which is a previously developed Viterbi-based algorithm. Both techniques are completely blind in the sense that no knowledge of the signatures, channel state information, or training sequences is required for any user. Once convergence has been achieved, an estimate of the signature of each user convolved with its physical channel response (CR) and estimated data sequences are provided. This CR estimate can be used to switch to any decision-directed (DD) adaptation scheme. Performance of the algorithms is verified via simulations as well as on experimental data obtained in an underwater acoustics (UWA) environment. In both cases, performance is found to be highly satisfactory, showing the near–far resistance of the analyzed algorithms.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

To evaluate the impact of noninvasive ventilation (NIV) algorithms available on intensive care unit ventilators on the incidence of patient-ventilator asynchrony in patients receiving NIV for acute respiratory failure. Prospective multicenter randomized cross-over study. Intensive care units in three university hospitals. Patients consecutively admitted to the ICU and treated by NIV with an ICU ventilator were included. Airway pressure, flow and surface diaphragmatic electromyography were recorded continuously during two 30-min periods, with (NIV+) or without (NIV0) the NIV algorithm. Asynchrony events, the asynchrony index (AI) and a specific asynchrony index influenced by leaks (AIleaks) were determined from tracing analysis. Sixty-five patients were included. With and without the NIV algorithm, respectively, auto-triggering was present in 14 (22%) and 10 (15%) patients, ineffective breaths in 15 (23%) and 5 (8%) (p = 0.004), late cycling in 11 (17%) and 5 (8%) (p = 0.003), premature cycling in 22 (34%) and 21 (32%), and double triggering in 3 (5%) and 6 (9%). The mean number of asynchronies influenced by leaks was significantly reduced by the NIV algorithm (p < 0.05). A significant correlation was found between the magnitude of leaks and AIleaks when the NIV algorithm was not activated (p = 0.03). The global AI remained unchanged, mainly because premature cycling occurs on some ventilators with the NIV algorithm. In acute respiratory failure, NIV algorithms provided by ICU ventilators can reduce the incidence of asynchronies caused by leaks, thus confirming bench test results, but some of these algorithms can generate premature cycling.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Abstract: To understand the processes of evolution, biologists are interested in the ability of a population to respond to natural or artificial selection. The amount of genetic variation is often viewed as the main factor allowing a species to respond to selection. Many theories have thus focused on the maintenance of genetic variability. Ecologists and population geneticists have long suspected that the structure of the environment is connected to the maintenance of diversity. Theorists have shown that diversity can be permanently and stably maintained in temporally and spatially varying environments under certain conditions. Moreover, varying environments have also been theoretically demonstrated to cause the evolution of divergent life history strategies in the different niches constituting the environment. Although there is a huge number of theoretical studies on selection and on life history evolution in heterogeneous environments, there is a clear lack of empirical studies. The purpose of this thesis was to empirically study the evolutionary consequences of a heterogeneous environment in a freshwater snail, Galba truncatula. Indeed, G. truncatula lives in two habitat types that differ in water availability. First, it can be found in streams or ponds which never completely dry out: a permanent habitat. Second, G. truncatula can be found in pools that freeze during winter and dry during summer: a temporary habitat. Using a common garden approach, we empirically demonstrated local adaptation of G. truncatula to temporary and permanent habitats. We first compared molecular (FST) and quantitative (QST) genetic differentiation between temporary and permanent habitats. To confirm the pattern QST > FST between habitats, which suggests local adaptation, we then tested the desiccation resistance of individuals from temporary and permanent habitats. 
This study confirmed that drought resistance seemed to be the main factor selected between habitats, and life history traits linked to desiccation resistance were thus found to diverge between habitats. However, despite this evidence of selection acting on mean trait values between habitats, drift was suggested to be the main factor responsible for variation in variances-covariances between populations. Finally, we found variation in the life history traits of individuals in a heterogeneous environment varying in parasite prevalence. This thesis empirically demonstrated the importance of heterogeneous environments in local adaptation and life history evolution and suggested that more experimental studies are needed to investigate this topic. Summary: Biologists have always been interested in the ability of a population to respond to natural selection. This response depends on the amount of genetic variability present in the population. More specifically, theorists have examined the question of the maintenance of genetic variability within heterogeneous environments. They have shown that, under certain conditions, genetic diversity can be maintained stably and permanently in environments that vary spatially and temporally. Moreover, these variable environments have been shown to be responsible for the divergence of life history traits among the different niches constituting the environment. However, despite this large number of theoretical studies on selection and the evolution of life history traits in heterogeneous environments, empirical studies are rarer. The aim of this thesis was therefore to study the evolutionary consequences of a heterogeneous environment in a freshwater snail, Galba truncatula. Indeed, G. truncatula is found in two habitat types that differ in their water level. 
The first, the temporary habitat, consists of pools that can dry out during summer and freeze during winter. The second, the permanent habitat, corresponds to ponds or streams that keep a constant water level throughout the year. Using a "common garden" experimental approach, we demonstrated the local adaptation of individuals to their habitat type, permanent or temporary. We used the FST/QST approach, which compares molecular genetic differentiation with quantitative genetic differentiation between the two habitats. The local adaptation indicated by QST > FST was tested experimentally by measuring the desiccation resistance of individuals from temporary and permanent habitats. This study confirmed that drought resistance has been selected between habitats and that the traits responsible for this resistance differ between habitats. However, while selection acts on the mean value of traits between habitats, genetic drift appears to be chiefly responsible for the difference in variances-covariances between populations. Finally, variation in life history traits was found within a heterogeneous environment made up of populations differing in their rate of parasitism. In conclusion, this thesis demonstrated the importance of a heterogeneous environment for local adaptation and the evolution of life history traits, and suggests that more empirical studies on the subject are needed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

To optimally manage a metapopulation, managers and conservation biologists can favor a type of habitat spatial distribution (e.g. aggregated or random). However, the spatial distribution that provides the highest habitat occupancy remains ambiguous and numerous contradictory results exist. Habitat occupancy depends on the balance between local extinction and colonization. Thus, the issue becomes even more puzzling when various forms of relationships (positive or negative co-variation) between local extinction and colonization rates within habitat types exist. Using an analytical model, we first demonstrate that the habitat occupancy of a metapopulation is significantly affected by the presence of habitat types that display different extinction-colonization dynamics, considering: (i) variation in extinction or colonization rate and (ii) positive and negative co-variation between the two processes within habitat types. We then examine, with a spatially explicit stochastic simulation model, how different degrees of habitat aggregation affect occupancy predictions under similar scenarios. An aggregated distribution of habitat types provides the highest habitat occupancy when local extinction risk is spatially heterogeneous and high in some places, while a random distribution of habitat provides the highest habitat occupancy when colonization rates are high. Because spatial variability in local extinction rates always favors aggregation of habitats, we only need to know about spatial variability in colonization rates to determine whether or not aggregating habitat types increases metapopulation occupancy. From a comparison of the results obtained with the analytical model and with the spatially explicit stochastic simulation model, we determine the conditions under which a simple metapopulation model closely matches the results of a more complex spatial simulation model with explicit heterogeneity.
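
The statement that occupancy depends on the balance between local extinction and colonization can be illustrated with the classic Levins metapopulation model, dp/dt = c p (1 - p) - e p, whose equilibrium occupancy is p* = 1 - e/c when c > e. The abstract does not specify the authors' analytical model, so using the single-habitat-type Levins model here is an assumption, as are the function names and parameter values.

```python
def levins_equilibrium(colonization, extinction):
    """Equilibrium fraction of occupied patches in the Levins model."""
    if colonization <= 0:
        raise ValueError("colonization rate must be positive")
    return max(0.0, 1.0 - extinction / colonization)


def simulate_levins(colonization, extinction, p0=0.1, dt=0.01, steps=20000):
    """Integrate dp/dt = c*p*(1 - p) - e*p with explicit Euler steps."""
    p = p0
    for _ in range(steps):
        p += dt * (colonization * p * (1.0 - p) - extinction * p)
    return p


# Hypothetical rates: colonization 0.5 and extinction 0.1 per unit time.
analytic = levins_equilibrium(0.5, 0.1)
simulated = simulate_levins(0.5, 0.1)
```

Raising the extinction rate or lowering the colonization rate reduces the equilibrium occupancy, and the metapopulation goes extinct once extinction exceeds colonization; models with several habitat types, like the one described above, extend this balance to rates that vary and co-vary across patches.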

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The complex chemical and physical nature of combustion and secondary organic aerosols (SOAs) in general precludes the complete characterization of both bulk and interfacial components. The bulk composition reveals the history of the growth process and therefore the source region, whereas the interface controls, to a large extent, the interaction with gases, biological membranes, and solid supports. We summarize the development of a soft interrogation technique for the interfacial functional groups, based on the heterogeneous chemistry of selected probe gases [N(CH3)3, NH2OH, CF3COOH, HCl, O3, NO2] of different reactivity. The technique reveals the identity and density of surface functional groups. Examples include acidic and basic sites, olefinic and polycyclic aromatic hydrocarbon (PAH) sites, and partially and completely oxidized surface sites. We report on the surface composition and oxidation states of laboratory-generated aerosols and of aerosols sampled in several bus depots. In the latter case, the biomarker 8-hydroxy-2'-deoxyguanosine, signaling oxidative stress caused by aerosol exposure, was isolated. The increase in biomarker levels over a working day is correlated with the surface density N_i(O3) of olefinic and/or PAH sites obtained from O3 uptakes as well as with the initial uptake coefficient, γ0, of five probe gases used in the field. This correlation with γ0 suggests competing pathways at the interface of the aerosol particles between the generation of reactive oxygen species (ROS) responsible for oxidative stress and cellular antioxidants.