26 results for Nearest neighbor


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Avalanche forecasting is a complex process involving the assimilation of multiple data sources to make predictions over varying spatial and temporal resolutions. Numerically assisted forecasting often uses nearest neighbour methods (NN), which are known to have limitations when dealing with high dimensional data. We apply Support Vector Machines to a dataset from Lochaber, Scotland to assess their applicability in avalanche forecasting. Support Vector Machines (SVMs) belong to a family of theoretically based techniques from machine learning and are designed to deal with high dimensional data. Initial experiments showed that SVMs gave results which were comparable with NN for categorical and probabilistic forecasts. Experiments utilising the ability of SVMs to deal with high dimensionality in producing a spatial forecast show promise, but require further work.
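
The nearest neighbour baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the forecasting system used in the study: the function name `knn_probability`, the two-feature vectors and the labels are all made up for the example. The probabilistic forecast is taken as the fraction of the k most similar past days that were avalanche days, and the categorical forecast thresholds that fraction.

```python
import math

def knn_probability(train_x, train_y, query, k=3):
    """Fraction of the k nearest past days that were avalanche days (label 1)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical, made-up feature vectors (e.g. two scaled weather variables).
days = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.7)]
labels = [0, 0, 1, 1, 1]  # 1 = avalanche day, 0 = no avalanche

p = knn_probability(days, labels, (0.7, 0.8), k=3)  # probabilistic forecast
forecast = int(p >= 0.5)                            # categorical forecast
```

Because every feature enters the Euclidean distance with equal weight, adding many weakly informative variables dilutes the neighbourhoods, which is one way to read the high-dimensionality limitation the abstract refers to.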

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The present research deals with an important public health threat: pollution created by radon gas accumulating inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because many influencing factors must be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. Because indoor radon is a multivariate process, it was important first to define the influence of each factor, in particular that of geology, which is closely associated with indoor radon. This association was indeed observed for the Swiss data but was not proved to be the sole determinant for the spatial modeling. The statistical analysis of data, at both univariate and multivariate levels, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving-window methods. The use of the Quantité Morisita Index (QMI) as a procedure to evaluate data clustering as a function of the radon level was proposed. The existing declustering methods were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase was accompanied by the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-down resolution approach, from regional to local levels, in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with the stationarity conditions of geostatistical models. Common methods of spatial modeling such as K Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied to a particular dataset.
A bottom-up approach in terms of method complexity was adopted, and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation (the CVMF) was tested with the purpose of reducing noise at the local scale. At the end of the chapter, a series of tests for data consistency and method robustness was performed. This led to conclusions about the importance of data splitting and the limitations of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations allowed the use of multigaussian models and helped take the uncertainty of the indoor radon pollution data into consideration. The categorization transform was presented as a solution for modeling extreme values through classification. Simulation scenarios were proposed, including an alternative proposal for reproducing the global histogram based on the sampling domain. Sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Among the classification methods, probabilistic neural networks (PNN) proved to be better adapted to modeling high-threshold categorization and to automation. Support vector machines (SVM), by contrast, performed well under balanced category conditions. In general, it was concluded that no particular prediction or estimation method is better under all conditions of scale and neighborhood definition. Simulations should be the basis, while other methods can provide complementary information to support efficient decision making on indoor radon.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Plants such as Arabidopsis thaliana respond to foliar shade and neighbors who may become competitors for light resources by elongation growth to secure access to unfiltered sunlight. Challenges faced during this shade avoidance response (SAR) are different under a light-absorbing canopy and during neighbor detection where light remains abundant. In both situations, elongation growth depends on auxin and transcription factors of the phytochrome interacting factor (PIF) class. Using a computational modeling approach to study the SAR regulatory network, we identify and experimentally validate a previously unidentified role for long hypocotyl in far red 1, a negative regulator of the PIFs. Moreover, we find that during neighbor detection, growth is promoted primarily by the production of auxin. In contrast, in true shade, the system operates with less auxin but with an increased sensitivity to the hormonal signal. Our data suggest that this latter signal is less robust, which may reflect a cost-to-robustness tradeoff, a system trait long recognized by engineers and forming the basis of information theory.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Plants propagate electrical signals in response to artificial wounding. However, little is known about the electrophysiological responses of the phloem to wounding, and whether natural damaging stimuli induce propagating electrical signals in this tissue. Here, we used living aphids and the direct current (DC) version of the electrical penetration graph (EPG) to detect changes in the membrane potential of Arabidopsis sieve elements (SEs) during caterpillar wounding. Feeding wounds in the lamina induced fast depolarization waves in the affected leaf, rising to maximum amplitude (c. 60 mV) within 2 s. Major damage to the midvein induced fast and slow depolarization waves in unwounded neighbor leaves, but only slow depolarization waves in non-neighbor leaves. The slow depolarization waves rose to maximum amplitude (c. 30 mV) within 14 s. Expression of a jasmonate-responsive gene was detected in leaves in which SEs displayed fast depolarization waves. No electrical signals were detected in SEs of unwounded neighbor leaves of plants with suppressed expression of GLR3.3 and GLR3.6. EPG applied as a novel approach to plant electrophysiology allows cell-specific, robust, real-time monitoring of early electrophysiological responses in plant cells to damage, and is potentially applicable to a broad range of plant-herbivore interactions.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Machine Learning for geospatial data: algorithms, software tools and case studies. The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence that is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning?
In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining, including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks and mixture density networks. This set of models covers machine learning tasks such as classification, regression and density estimation. Exploratory data analysis (EDA) is the initial and a very important part of data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are considered using both a traditional geostatistical approach, namely experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations, which helps to detect the presence of spatial patterns, at least those described by two-point statistics.
A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a current hot topic, namely the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model to solve this task. The performance of the GRNN model is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed over the last 15 years and were used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
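
One common way to use k-NN as an exploratory tool is to look at how the leave-one-out error changes with k. The sketch below is an assumption about what such an analysis can look like, not code from Machine Learning Office; the helper names `knn_predict` and `loo_rmse` and the synthetic 5 x 5 grid are invented for illustration.

```python
import math

def knn_predict(points, values, query, k):
    """Mean of the values at the k sample points closest to the query."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return sum(values[i] for i in order[:k]) / k

def loo_rmse(points, values, k):
    """Leave-one-out root mean squared error of k-NN prediction."""
    sq = 0.0
    for i in range(len(points)):
        rest_p = points[:i] + points[i + 1:]
        rest_v = values[:i] + values[i + 1:]
        sq += (knn_predict(rest_p, rest_v, points[i], k) - values[i]) ** 2
    return math.sqrt(sq / len(points))

# Synthetic data: a 5 x 5 grid carrying a smooth trend z = x + y.
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
values = [x + y for x, y in points]

errors = {k: loo_rmse(points, values, k) for k in (1, 2, 4, 8, 24)}
best_k = min(errors, key=errors.get)
```

If the error is lowest for a small k and grows as k approaches the full sample size, the data carry spatial structure that local neighbourhoods can exploit. A flat curve would suggest that the values behave like spatially uncorrelated noise.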

Relevance:

10.00% 10.00%

Publisher:

Abstract:

OBJECTIVES: Inequalities and inequities in health are an important public health concern. In Switzerland, mortality in the general population varies according to the socio-economic position (SEP) of neighbourhoods. We examined the influence of neighbourhood SEP on presentation and outcomes in HIV-positive individuals in the era of combination antiretroviral therapy (cART). METHODS: The neighbourhood SEP of patients followed in the Swiss HIV Cohort Study (SHCS) 2000-2013 was obtained on the basis of 2000 census data on the 50 nearest households (education and occupation of household head, rent, mean number of persons per room). We used Cox and logistic regression models to examine the probability of late presentation, virologic response to cART, loss to follow-up and death across quintiles of neighbourhood SEP. RESULTS: A total of 4489 SHCS participants were included. Presentation with advanced disease [CD4 cell count <200 cells/μl or AIDS] and with AIDS was less common in neighbourhoods of higher SEP: the age and sex-adjusted odds ratio (OR) comparing the highest with the lowest quintile of SEP was 0.71 [95% confidence interval (95% CI) 0.58-0.87] and 0.59 (95% CI 0.45-0.77), respectively. An undetectable viral load at 6 months of cART was more common in the highest than in the lowest quintile (OR 1.52; 95% CI 1.14-2.04). Loss to follow-up, mortality and causes of death were not associated with neighbourhood SEP. CONCLUSION: Late presentation was more common and virologic response to cART less common in HIV-positive individuals living in neighbourhoods of lower SEP, but in contrast to the general population, there was no clear trend for mortality.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The paper deals with the development and application of a methodology for the automatic mapping of pollution/contamination data. The General Regression Neural Network (GRNN) is considered in detail and is proposed as an efficient tool to solve this problem. The automatic tuning of isotropic and anisotropic GRNN models using a cross-validation procedure is presented. Results are compared with the k-nearest-neighbours interpolation algorithm using an independent validation data set. The quality of mapping is controlled by analysing the raw data and the residuals using variography. Maps of the probability of exceeding a given decision level and "thick" isoline visualization of the uncertainties are presented as examples of decision-oriented mapping. The real case study is based on the mapping of radioactively contaminated territories.
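
A GRNN prediction is a kernel-weighted average of the training values, so the only free parameter in the isotropic case is the kernel width. The sketch below illustrates tuning that width by leave-one-out cross-validation. It is a simplified illustration and not the paper's implementation: the names `grnn_predict`, `loo_mse` and `tune_sigma`, the candidate widths and the synthetic data are assumptions, and the anisotropic case (one width per coordinate) is not covered.

```python
import math

def grnn_predict(points, values, query, sigma):
    """Kernel-weighted average of training values (isotropic Gaussian kernel)."""
    weights = [math.exp(-math.dist(p, query) ** 2 / (2.0 * sigma ** 2)) for p in points]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed to zero: fall back to the nearest sample.
        nearest = min(range(len(points)), key=lambda i: math.dist(points[i], query))
        return values[nearest]
    return sum(w * v for w, v in zip(weights, values)) / total

def loo_mse(points, values, sigma):
    """Leave-one-out mean squared error for a given kernel width."""
    err = 0.0
    for i in range(len(points)):
        rest_p = points[:i] + points[i + 1:]
        rest_v = values[:i] + values[i + 1:]
        err += (grnn_predict(rest_p, rest_v, points[i], sigma) - values[i]) ** 2
    return err / len(points)

def tune_sigma(points, values, candidates):
    """Pick the candidate width with the smallest leave-one-out error."""
    return min(candidates, key=lambda s: loo_mse(points, values, s))

# Synthetic measurements along a transect (made-up data).
points = [(float(x), 0.0) for x in range(10)]
values = [math.sin(x / 2.0) for x in range(10)]

sigma = tune_sigma(points, values, [0.25, 0.5, 1.0, 2.0, 4.0])
estimate = grnn_predict(points, values, (4.5, 0.0), sigma)
```

Because the search needs no user input beyond the list of candidate widths, this kind of tuning is what makes the approach attractive for automatic mapping.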

Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: Previous studies on childhood cancer and nuclear power plants (NPPs) produced conflicting results. We used a cohort approach to examine whether residence near NPPs was associated with leukaemia or any childhood cancer in Switzerland. METHODS: We computed person-years at risk for children aged 0-15 years born in Switzerland from 1985 to 2009, based on the Swiss censuses 1990 and 2000 and identified cancer cases from the Swiss Childhood Cancer Registry. We geo-coded place of residence at birth and calculated incidence rate ratios (IRRs) with 95% confidence intervals (CIs) comparing the risk of cancer in children born <5 km, 5-10 km and 10-15 km from the nearest NPP with children born >15 km away, using Poisson regression models. RESULTS: We included 2925 children diagnosed with cancer during 21 117 524 person-years of follow-up; 953 (32.6%) had leukaemia. Eight and 12 children diagnosed with leukaemia at ages 0-4 and 0-15 years, and 18 and 31 children diagnosed with any cancer were born <5 km from a NPP. Compared with children born >15 km away, the IRRs (95% CI) for leukaemia in 0-4 and 0-15 year olds were 1.20 (0.60-2.41) and 1.05 (0.60-1.86), respectively. For any cancer, corresponding IRRs were 0.97 (0.61-1.54) and 0.89 (0.63-1.27). There was no evidence of a dose-response relationship with distance (P > 0.30). Results were similar for residence at diagnosis and at birth, and when adjusted for potential confounders. Results from sensitivity analyses were consistent with main results. CONCLUSIONS: This nationwide cohort study found little evidence of an association between residence near NPPs and the risk of leukaemia or any childhood cancer.
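
For readers unfamiliar with incidence rate ratios, the crude (unadjusted) version of the quantity reported above can be written as follows. This is the textbook approximation, not the Poisson regression model the study fitted. The symbols are introduced here for illustration: $a$ and $b$ are the case counts in the exposed and reference groups, and $T_1$ and $T_0$ are the corresponding person-years.

```latex
\[
\mathrm{IRR} = \frac{a / T_1}{b / T_0}, \qquad
\mathrm{SE}(\ln \mathrm{IRR}) \approx \sqrt{\frac{1}{a} + \frac{1}{b}}, \qquad
95\%\ \mathrm{CI} = \exp\!\bigl(\ln \mathrm{IRR} \pm 1.96\,\mathrm{SE}(\ln \mathrm{IRR})\bigr)
\]
```

With $a = 8$ exposed leukaemia cases and a much larger reference group, the approximation gives a standard error of about $\sqrt{1/8} \approx 0.35$, so the interval spans roughly a factor of two on either side of the estimate. That is in line with the reported 1.20 (0.60-2.41) and shows why so few exposed cases yield wide intervals.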

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Plants have the ability to use the composition of incident light as a cue to adapt development and growth to their environment. Arabidopsis thaliana, as well as many crops, is best adapted to sunny habitats. When subjected to shade, these plants exhibit a variety of physiological responses collectively called the shade avoidance syndrome (SAS). It includes increased growth of the hypocotyl and petioles, a decreased growth rate of the cotyledons, and reduced branching and crop yield. These responses are mainly mediated by phytochrome photoreceptors, which exist either in an active, far-red light (FR) absorbing isoform or in an inactive, red light (R) absorbing isoform. In direct sunlight, the R to FR light (R/FR) ratio is high and converts the phytochromes into their physiologically active state. The phytochromes interact with downstream transcription factors such as PHYTOCHROME INTERACTING FACTOR (PIF), which are subsequently degraded. Light filtered through a canopy is strongly depleted in R, which results in a low R/FR ratio and renders the phytochromes inactive. Protein levels of downstream transcription factors are stabilized, which initiates the expression of shade-induced genes such as HFR1, PIL1 or ATHB-2. In my thesis, I investigated transcriptional responses mediated by the SAS in whole Arabidopsis seedlings. Using microarray and chromatin immunoprecipitation data, we identified genome-wide PIF4- and PIF5-dependent shade-regulated genes as well as putative direct target genes of PIF5. This revealed evidence for a direct regulatory link between phytochrome signaling and the growth-promoting phytohormone auxin (IAA) at the level of biosynthesis, transport and signaling. Subsequently, it was shown that free IAA levels are upregulated in response to shade. It is assumed that shade-induced auxin production takes place predominantly in the cotyledons of seedlings. This implies that IAA is subsequently transported basipetally to the hypocotyl, where it enhances elongation growth.
The importance of auxin transport for growth responses has been established by chemical and genetic approaches. To gain a better understanding of the spatio-temporal transcriptional regulation of shade-induced auxin, I generated, in a second project, an organ-specific high-throughput dataset focusing on the cotyledons and hypocotyl of young Arabidopsis seedlings. Interestingly, the two organs show opposite growth regulation by shade. I first investigated the spatio-transcriptional regulation of auxin-responsive genes in order to determine how far broad gene expression patterns can be explained by the hypothesized movement of auxin from cotyledons to hypocotyls in shade. The analysis suggests that several genes are indeed regulated according to our prediction, while others are regulated in a more complex manner. In addition, analysis of gene families of auxin biosynthetic and transport components led to the identification of family members essential for shade-induced growth responses, which were subsequently confirmed experimentally. Finally, the analysis of expression patterns identified several candidate genes that may explain aspects of the opposite growth responses of the different organs.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Neurons and astrocytes, the two major cell populations in the adult brain, are each characterized by their own mode of intercellular communication: synapses and gap junctions (GJ), respectively. In addition, there is increasing evidence for dynamic and metabolic neuroglial interactions resulting in the modulation of synaptic transmission at the so-called "tripartite synapse". Based on this, we investigated at the ultrastructural level how excitatory synapses (ES) and astroglial GJ are spatially distributed in layer IV of the barrel cortex of the adult mouse. We used specific antibodies for connexin (Cx) 30 and 43 to identify astroglial GJ; these two proteins are known to be present in the majority of astroglial GJ in the cerebral cortex. In electron-microscopic images, we measured the distance between two ES, between two GJ, and between a GJ and its nearest ES. We found a ratio of two GJ per three ES in the hollow and septal areas. Taking into account the size of an astrocyte domain, the high density of GJ suggests the occurrence of reflexive GJ, i.e. GJ between processes of the same astrocyte. Interestingly, the distance between an ES and an astroglial GJ was found to be significantly lower than that between two synapses or between two GJ. These observations indicate that the two modes of cell-to-cell communication are not randomly distributed in layer IV of the barrel cortex. Consequently, this feature may provide the morphological support for the recently reported functional interactions between neuronal circuits and astroglial networks.
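
The three kinds of distance compared in this abstract are nearest-neighbour distances, either within one point set or between two. A minimal sketch of that measurement is given below, assuming 2D coordinates picked from an image. The function names and the coordinates are hypothetical and do not come from the study, and the statistical comparison of the resulting distributions is not shown.

```python
import math

def nearest_distances(sources, targets):
    """For each source point, the distance to the closest target point."""
    return [min(math.dist(s, t) for t in targets) for s in sources]

def nearest_distances_within(points):
    """For each point, the distance to its closest other point in the same set."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

# Hypothetical coordinates in arbitrary image units.
gap_junctions = [(0.0, 0.0), (10.0, 0.0)]
synapses = [(1.0, 0.0), (4.0, 0.0), (10.0, 3.0)]

gj_to_es = nearest_distances(gap_junctions, synapses)  # GJ to nearest ES
es_to_es = nearest_distances_within(synapses)          # ES to nearest other ES
```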

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The analysis of rockfall characteristics and spatial distribution is fundamental to understanding and modelling the main factors that predispose a slope to failure. In our study we analysed LiDAR point clouds with two aims: (1) to detect and characterise single rockfalls; (2) to investigate their spatial distribution. To this end, different clustering algorithms were applied: 1a) Nearest Neighbour Clutter Removal (NNCR) in combination with Expectation-Maximization (EM) to separate feature points from clutter; 1b) a density-based algorithm (DBSCAN) to isolate the single clusters (i.e. the rockfall events); 2) finally, Ripley's K-function was computed to investigate the global spatial pattern of the extracted rockfalls. The method allowed proper identification and characterization of more than 600 rockfalls that occurred on a cliff located in Puigcercos (Catalonia, Spain) during a time span of six months. The spatial distribution of these events showed that rockfalls were clustered within a well-defined distance range. Computations were carried out using R, a free software environment for statistical computing and graphics. Understanding the spatial distribution of precursory rockfalls may shed light on the forecasting of future failures.
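
Ripley's K-function compares the number of point pairs closer than a distance r with what complete spatial randomness would give, for which K(r) = πr². The sketch below uses the naive estimator without edge correction. It is written in Python for illustration, whereas the study used R, and the function name `ripley_k`, the coordinates and the window area are made up.

```python
import math

def ripley_k(points, r, area):
    """Naive Ripley's K estimate at distance r (no edge correction)."""
    n = len(points)
    # Count ordered pairs (i, j), i != j, separated by at most r.
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= r
    )
    return area * pairs / (n * (n - 1))

# Two tight pairs of hypothetical event centroids in a window of area 120.
events = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
k_hat = ripley_k(events, 1.5, 120.0)
k_csr = math.pi * 1.5 ** 2  # expected value under complete spatial randomness
```

An estimate well above πr² at a given r points to clustering at that scale. In practice an edge-corrected estimator and simulation envelopes would be needed before drawing that conclusion.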