30 results for Statistical models of Box-Jenkins. Artificial neural networks (ANN). Oil flow curve

at Universit


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents general regression neural networks (GRNN) as a nonlinear regression method for the interpolation of monthly wind speeds in complex Alpine orography. GRNN is trained using data from Swiss meteorological networks to learn the statistical relationship between topographic features and wind speed. Terrain convexity, slope and exposure are taken into account by extracting features from the digital elevation model at different spatial scales using specialised convolution filters. A database of gridded monthly wind speeds is then constructed by applying GRNN in prediction mode over the period 1968-2008. This study demonstrates that using topographic features as inputs to GRNN significantly reduces cross-validation errors with respect to low-dimensional models that integrate only geographical coordinates and terrain height for the interpolation of wind speed. The spatial predictability of wind speed is found to be lower in summer than in winter due to more complex and weaker wind-topography relationships. The relevance of these relationships is studied using an adaptive version of the GRNN algorithm, which selects the useful terrain features by eliminating the noisy ones. This research provides a framework for extending low-dimensional interpolation models to high-dimensional spaces by integrating additional features accounting for the topographic conditions at multiple spatial scales. Copyright (c) 2012 Royal Meteorological Society.
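
As a rough illustration of the method named in this abstract, a GRNN prediction can be sketched as a Gaussian-kernel weighted average of the training targets (the Nadaraya-Watson form). This is only a minimal sketch, not the authors' implementation: the function name `grnn_predict`, the single isotropic bandwidth `sigma`, and the toy station data are assumptions made for the example. The adaptive variant mentioned in the abstract would presumably use one bandwidth per input feature.

```python
import math

def grnn_predict(train_x, train_y, query, sigma=1.0):
    """Kernel-weighted average of training targets (GRNN / Nadaraya-Watson).

    train_x: list of feature tuples, train_y: list of targets,
    query: feature tuple, sigma: isotropic kernel bandwidth (assumed).
    """
    weights = []
    for x in train_x:
        # Squared Euclidean distance between a training point and the query.
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical toy data: (easting, northing, slope) -> monthly wind speed.
stations = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.3), (0.0, 1.0, 0.2)]
speeds = [2.0, 4.0, 3.0]
estimate = grnn_predict(stations, speeds, (0.5, 0.5, 0.2), sigma=0.5)
```

With a very small `sigma` the prediction collapses onto the nearest station's value, and with a very large one it tends to the global mean, which is why the bandwidth is normally tuned by cross-validation.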

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In social insects, workers perform a multitude of tasks, such as foraging, nest construction, and brood rearing, without central control of how work is allocated among individuals. It has been suggested that workers choose a task by responding to stimuli gathered from the environment. Response-threshold models assume that individuals in a colony vary in the stimulus intensity (response threshold) at which they begin to perform the corresponding task. Here we highlight the limitations of these models with respect to colony performance in task allocation. First, we show with analysis and quantitative simulations that the deterministic response-threshold model constrains the workers' behavioral flexibility under some stimulus conditions. Next, we show that the probabilistic response-threshold model fails to explain precise colony responses to varying stimuli. Both of these limitations would be detrimental to colony performance when dynamic and precise task allocation is needed. To address these problems, we propose extensions of the response-threshold model by adding variables that weigh stimuli. We test the extended response-threshold model in a foraging scenario and show in simulations that it results in efficient task allocation. Finally, we show that response-threshold models can be formulated as artificial neural networks, which consequently provide a comprehensive framework for modeling task allocation in social insects.
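
The two model families contrasted in this abstract can be sketched as follows. The abstract does not give equations, so this is an assumption: the probabilistic rule below uses the sigmoid form s^n / (s^n + theta^n) that is common in the response-threshold literature, and the names `deterministic_response`, `response_probability` and `steepness` are chosen for illustration only.

```python
def deterministic_response(stimulus, threshold):
    """Deterministic rule: the worker engages once the stimulus reaches its threshold."""
    return stimulus >= threshold

def response_probability(stimulus, threshold, steepness=2):
    """Probabilistic rule (assumed form): P = s^n / (s^n + theta^n).

    Requires a positive threshold; steepness n controls how sharp the
    transition around the threshold is.
    """
    s = stimulus ** steepness
    return s / (s + threshold ** steepness)

# A worker with threshold 1.0 facing stimuli of different intensity.
p_at_threshold = response_probability(1.0, 1.0)   # exactly at the threshold
p_strong = response_probability(3.0, 1.0)         # stimulus well above it
```

In the deterministic rule every worker below its threshold stays idle regardless of need, whereas the probabilistic rule lets any worker respond with some probability, at the cost of less precise colony-level responses.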

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Spatial data analysis, mapping and visualization are of great importance in various fields: environment, pollution, natural hazards and risks, epidemiology, spatial econometrics, etc. A basic task of spatial mapping is to make predictions based on some empirical data (measurements). A number of state-of-the-art methods can be used for the task: deterministic interpolations; methods of geostatistics, i.e. the family of kriging estimators (Deutsch and Journel, 1997); machine learning algorithms such as artificial neural networks (ANN) of different architectures; hybrid ANN-geostatistics models (Kanevski and Maignan, 2004; Kanevski et al., 1996), etc. All the methods mentioned above can be used for solving the problem of spatial data mapping. Environmental empirical data are always contaminated/corrupted by noise, often of unknown nature. That is one of the reasons why deterministic models can be inconsistent, since they treat the measurements as values of some unknown function that should be interpolated. Kriging estimators treat the measurements as the realization of some spatial random process. To obtain an estimate with kriging one has to model the spatial structure of the data: the spatial correlation function or (semi-)variogram. This task can be complicated if there is an insufficient number of measurements, and the variogram is sensitive to outliers and extremes. ANN is a powerful tool, but it also suffers from a number of drawbacks. ANNs of a special type, multilayer perceptrons, are often used as a detrending tool in hybrid (ANN+geostatistics) models (Kanevski and Maignan, 2004). Therefore, the development and adaptation of a method that is nonlinear and robust to noise in measurements, can deal with small empirical datasets and has a solid mathematical background is of great importance. The present paper deals with such a model, based on Statistical Learning Theory (SLT): Support Vector Regression.
SLT is a general mathematical framework devoted to the problem of estimating dependencies from empirical data (Hastie et al., 2004; Vapnik, 1998). SLT models for classification, Support Vector Machines (SVM), have shown good results on different machine learning tasks. The results of SVM classification of spatial data are also promising (Kanevski et al., 2002). The properties of SVM for regression, Support Vector Regression (SVR), are less studied. First results of the application of SVR to spatial mapping of physical quantities were obtained by the authors for mapping of medium porosity (Kanevski et al., 1999) and for mapping of radioactively contaminated territories (Kanevski and Canu, 2000). The present paper is devoted to further understanding of the properties of the SVR model for spatial data analysis and mapping. A detailed description of the SVR theory can be found in (Cristianini and Shawe-Taylor, 2000; Smola, 1996), and basic equations for nonlinear modeling are given in section 2. Section 3 discusses the application of SVR to spatial data mapping on a real case study: soil pollution by the Cs137 radionuclide. Section 4 discusses the properties of the model applied to noisy data or data with outliers.
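
One reason SVR is often described as robust to noisy measurements is its epsilon-insensitive loss, which ignores residuals smaller than a tolerance and penalises larger ones only linearly. The sketch below shows that loss in isolation; the function name and the default tolerance of 0.1 are assumptions for the example, and the paper's own equations (its section 2) are not reproduced here.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss is zero inside the epsilon tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Residual inside the tube: no penalty.
inside = epsilon_insensitive_loss(1.0, 1.05, epsilon=0.1)
# Residual of 0.5 with a 0.1 tube: penalty of about 0.4.
outside = epsilon_insensitive_loss(1.0, 1.5, epsilon=0.1)
```

Because the penalty is linear rather than quadratic, a single outlying measurement influences the fit less than it would under a squared-error loss.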

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The neuropathology of Alzheimer disease is characterized by senile plaques, neurofibrillary tangles and cell death. These hallmarks develop according to the differential vulnerability of brain networks, with senile plaques accumulating preferentially in the associative cortical areas and neurofibrillary tangles in the entorhinal cortex and the hippocampus. We suggest that the main aetiological hypotheses, such as the beta-amyloid cascade hypothesis or its variant, the synaptic beta-amyloid hypothesis, will have to consider neural networks not just as targets of degenerative processes but also as contributors to the disease's progression and to its phenotype. Three domains of research are highlighted in this review. First, the cerebral reserve and the redundancy of the network's elements are related to brain vulnerability. Indeed, an enriched environment appears to increase the cerebral reserve as well as the threshold of disease onset. Second, disease progression and memory performance cannot be explained by synaptic or neuronal loss alone, but also depend on the presence of compensatory mechanisms, such as synaptic scaling, at the microcircuit level. Third, some phenotypes of Alzheimer disease, such as hallucinations, appear to be related to progressive dysfunction of neural networks as a result, for instance, of a decreased signal-to-noise ratio involving diminished activity of the cholinergic system. Overall, converging results from studies of biological as well as artificial neural networks lead to the conclusion that changes in neural networks contribute strongly to the progression of Alzheimer disease.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Closely related species may be very difficult to distinguish morphologically, yet sometimes morphology is the only reasonable possibility for taxonomic classification. Here we present learning-vector-quantization artificial neural networks as a powerful tool to classify specimens on the basis of geometric morphometric shape measurements. As an example, we trained a neural network to distinguish between field and root voles from Procrustes transformed landmark coordinates on the dorsal side of the skull, which is so similar in these two species that the human eye cannot make this distinction. Properly trained neural networks misclassified only 3% of specimens. Therefore, we conclude that the capacity of learning vector quantization neural networks to analyse spatial coordinates is a powerful tool among the range of pattern recognition procedures that is available to employ the information content of geometric morphometrics.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model dependent (geostatistics) and data driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, where the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns the consideration of real-space constraints such as geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached by using different nonlinear feature selection/feature extraction tools.
To demonstrate the application of machine learning algorithms, several interesting case studies are considered: digital soil mapping using SVM; automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides); assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of the spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article presents an experimental study of the classification ability of several classifiers for multi-class classification of cannabis seedlings. As the cultivation of drug type cannabis is forbidden in Switzerland, law enforcement authorities regularly ask forensic laboratories to determine the chemotype of a seized cannabis plant and then to conclude whether the plantation is legal or not. This classification is mainly performed when the plant is mature, as required by the EU official protocol, and the classification of cannabis seedlings is therefore a time consuming and costly procedure. A previous study by the authors investigated this problem [1] and showed that it is possible to differentiate between drug type (illegal) and fibre type (legal) cannabis at an early stage of growth using gas chromatography interfaced with mass spectrometry (GC-MS), based on the relative proportions of eight major leaf compounds. The aims of the present work are, on the one hand, to continue the former work and to optimize the methodology for the discrimination of drug- and fibre type cannabis developed in the previous study and, on the other hand, to investigate the possibility of predicting illegal cannabis varieties. Seven classifiers for differentiating between cannabis seedlings are evaluated in this paper, namely Linear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA), Nearest Neighbour Classification (NNC), Learning Vector Quantization (LVQ), Radial Basis Function Support Vector Machines (RBF SVMs), Random Forest (RF) and Artificial Neural Networks (ANN). The performance of each method was assessed using the same analytical dataset, which consists of 861 samples split into drug- and fibre type cannabis, with drug type cannabis being made up of 12 varieties (i.e. 12 classes). The results show that linear classifiers are not able to manage the distribution of classes in which some overlap areas exist for both classification problems.
Unlike linear classifiers, NNC and RBF SVMs best differentiate cannabis samples both for 2-class and 12-class classifications, with average classification results up to 99% and 98%, respectively. Furthermore, RBF SVMs correctly classified into drug type cannabis the independent validation set, which consists of cannabis plants coming from police seizures. In forensic case work this study shows that the discrimination between cannabis samples at an early stage of growth is possible with fairly high classification performance for discriminating between cannabis chemotypes or between drug type cannabis varieties.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Genetically engineered bioreporters are an excellent complement to traditional methods of chemical analysis. The application of fluorescence flow cytometry to detection of bioreporter response enables rapid and efficient characterization of bacterial bioreporter population response on a single-cell basis. In the present study, intrapopulation response variability was used to obtain higher analytical sensitivity and precision. We have analyzed flow cytometric data for an arsenic-sensitive bacterial bioreporter using an artificial neural network-based adaptive clustering approach (a single-layer perceptron model). Results for this approach are far superior to other methods that we have applied to this fluorescent bioreporter (e.g., the arsenic detection limit is 0.01 microM, substantially lower than for other detection methods/algorithms). The approach is highly efficient computationally and can be implemented on a real-time basis, thus having potential for future development of high-throughput screening applications.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Summary: Division of labour is one of the most fascinating aspects of social insects. The efficient allocation of individuals to a multitude of different tasks requires a dynamic adjustment in response to the demands of a changing environment. A considerable number of theoretical models have focussed on identifying the mechanisms allowing colonies to perform efficient task allocation. The large majority of these models are built on the observation that individuals in a colony vary in their propensity (response threshold) to perform different tasks. Since individuals with a low threshold for a given task stimulus are more likely to perform that task than individuals with a high threshold, intra-colony variation in individual thresholds results in colony division of labour. These theoretical models suggest that variation in individual thresholds is affected by the within-colony genetic diversity. However, the models have not considered the genetic architecture underlying the individual response thresholds. This is important because a better understanding of division of labour requires determining how genotypic variation relates to differences in intra-colony response threshold distributions. In this thesis, we investigated the combined influence on task allocation efficiency of both the within-colony genetic variability (stemming from variation in the number of matings by queens) and the number of genes underlying the response thresholds. We used an agent-based simulator to model a situation where workers in a colony had to perform either a regulatory task (where the amount of a given food item in the colony had to be maintained within predefined bounds) or a foraging task (where the quantity of a second type of food item collected had to be the highest possible). The performance of colonies was a function of workers being able to perform both tasks efficiently.
To study the effect of within-colony genetic diversity, we compared the performance of colonies with queens mated with varying numbers of males. The influence of genetic architecture, in turn, was investigated by varying the number of loci underlying the response thresholds of the foraging and regulatory tasks. Artificial evolution was used to evolve the allelic values underlying the task thresholds. The results revealed that multiple matings always translated into higher colony performance, whatever the number of loci encoding the thresholds of the regulatory and foraging tasks. However, the beneficial effect of additional matings was particularly important when the genetic architecture of queens comprised one or a few genes for the foraging task's threshold. By contrast, a higher number of genes encoding the foraging task reduced colony performance, with the detrimental effect being stronger when queens had mated with several males. Finally, the number of genes determining the threshold for the regulatory task had only a minor but incremental effect on colony performance. Overall, our numerical experiments indicate the importance of considering the effects of queen mating frequency, the genetic architecture underlying task thresholds and the type of task performed when investigating the factors regulating the efficiency of division of labour in social insects. In this thesis we also investigate the task allocation efficiency of response threshold models and compare them with neural networks. While response threshold models are widely used amongst theoretical biologists interested in division of labour in social insects, our simulation reveals that they perform poorly compared to a neural network model. A major shortcoming of response thresholds is that they fail at one of the most crucial requirements of division of labour: the ability of individuals in a colony to efficiently switch between tasks under varying environmental conditions.
Moreover, an intrinsic property of the threshold models is that they lead to a large proportion of idle workers. Our results highlight these limitations of the response threshold models and provide an adequate substitute. Altogether, the experiments presented in this thesis provide novel contributions to the understanding of how division of labour in social insects is influenced by queen mating frequency and the genetic architecture underlying worker task thresholds. Moreover, the thesis also provides a novel model of the mechanisms underlying worker task allocation that may be more generally applicable than the widely used response threshold models.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a review of methodology for semi-supervised modeling with kernel methods, when the manifold assumption is guaranteed to be satisfied. It concerns environmental data modeling on natural manifolds, such as complex topographies of the mountainous regions, where environmental processes are highly influenced by the relief. These relations, possibly regionalized and nonlinear, can be modeled from data with machine learning using the digital elevation models in semi-supervised kernel methods. The range of the tools and methodological issues discussed in the study includes feature selection and semisupervised Support Vector algorithms. The real case study devoted to data-driven modeling of meteorological fields illustrates the discussed approach.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A new strategy for the incremental building of multilayer feedforward neural networks is proposed in the context of approximating functions from R^p to R^q using noisy data. A stopping criterion based on the properties of the noise is also proposed. Experiments on both artificial and real data are reported, and two alternatives of the proposed construction strategy are compared.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The present research deals with an application of artificial neural networks to multitask learning from spatial environmental data. The real case study (sediment contamination of Lake Geneva) involves 8 pollutants. There are different relationships between these variables, from linear correlations to strong nonlinear dependencies. The main idea is to construct subsets of pollutants which can be efficiently modeled together within the multitask framework. The proposed two-step approach is based on: 1) the criterion of nonlinear predictability of each variable k, assessed by analyzing all possible models composed from the rest of the variables, using a General Regression Neural Network (GRNN) as the model; 2) multitask learning of the best model using a multilayer perceptron, followed by spatial predictions. The results of the study are analyzed using both machine learning and geostatistical tools.