993 results for Spatial Clustering
Abstract:
We present a new framework for large-scale data clustering. The main idea is to modify functional dimensionality reduction techniques to directly optimize over discrete labels using stochastic gradient descent. Compared to methods like spectral clustering, our approach solves a single optimization problem rather than relying on an ad hoc two-stage optimization, does not require a matrix inversion, can easily encode prior knowledge in the set of implementable functions, and does not have an "out-of-sample" problem. Experimental results on both artificial and real-world datasets show the usefulness of our approach.
Abstract:
It is estimated that around 230 people die each year due to radon (222Rn) exposure in Switzerland. 222Rn occurs mainly in closed environments like buildings and originates primarily from the subjacent ground. Therefore it depends strongly on geology and shows substantial regional variations. Correct identification of these regional variations would lead to substantial reduction of 222Rn exposure of the population based on appropriate construction of new and mitigation of already existing buildings. Prediction of indoor 222Rn concentrations (IRC) and identification of 222Rn prone areas is however difficult since IRC depend on a variety of different variables like building characteristics, meteorology, geology and anthropogenic factors. The present work aims at the development of predictive models and the understanding of IRC in Switzerland, taking into account a maximum of information in order to minimize the prediction uncertainty. The predictive maps will be used as a decision-support tool for 222Rn risk management. The construction of these models is based on different data-driven statistical methods, in combination with geographical information systems (GIS). In a first phase we performed univariate analysis of IRC for different variables, namely the detector type, building category, foundation, year of construction, the average outdoor temperature during measurement, altitude and lithology. All variables showed significant associations to IRC. Buildings constructed after 1900 showed significantly lower IRC compared to earlier constructions. We observed a further drop of IRC after 1970. In addition to that, we found an association of IRC with altitude. With regard to lithology, we observed the lowest IRC in sedimentary rocks (excluding carbonates) and sediments and the highest IRC in the Jura carbonates and igneous rock. The IRC data was systematically analyzed for potential bias due to spatially unbalanced sampling of measurements. 
In order to facilitate the modeling and the interpretation of the influence of geology on IRC, we developed an algorithm based on k-medoids clustering that makes it possible to define coherent geological classes in terms of IRC. We performed a soil gas 222Rn concentration (SRC) measurement campaign in order to determine the predictive power of SRC with respect to IRC. We found that the use of SRC is limited for IRC prediction. The second part of the project was dedicated to predictive mapping of IRC using models which take into account the multidimensionality of the process of 222Rn entry into buildings. We used kernel regression and ensemble regression trees for this purpose. We could explain up to 33% of the variance of the log-transformed IRC all over Switzerland. This compares well with earlier attempts at IRC modeling in Switzerland. As predictor variables we considered geographical coordinates, altitude, outdoor temperature, building type, foundation, year of construction and detector type. Ensemble regression trees like random forests make it possible to determine the role of each IRC predictor in a multidimensional setting. We found spatial information like geology, altitude and coordinates to have stronger influences on IRC than building-related variables like foundation type, building type and year of construction. Based on kernel estimation we developed an approach to determine the local probability that IRC exceeds 300 Bq/m³. In addition, we developed a confidence index in order to provide an estimate of the uncertainty of the map. All methods allow an easy creation of tailor-made maps for different building characteristics. Our work is an essential step towards a 222Rn risk assessment which accounts at the same time for different architectural situations as well as geological and geographical conditions. For the communication of 222Rn hazard to the population we recommend using the probability map based on kernel estimation.
The communication of 222Rn hazard could for example be implemented via a web interface where the users specify the characteristics and coordinates of their home in order to obtain the probability to be above a given IRC with a corresponding index of confidence. Taking into account the health effects of 222Rn, our results have the potential to substantially improve the estimation of the effective dose from 222Rn delivered to the Swiss population.
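The abstract does not say which kernel estimator was used for the exceedance probability. As a minimal sketch only, assuming a Gaussian kernel over geographical coordinates and a Nadaraya-Watson weighted average of the indicator "IRC above threshold", the idea could look like this. The function name, the bandwidth value and the sample data are hypothetical and are not taken from the original work.

```python
import numpy as np

def exceedance_probability(coords, irc, query, threshold=300.0, bandwidth=5.0):
    """Kernel-weighted share of nearby measurements above `threshold`.

    Assumption: Gaussian kernel on Euclidean distance (Nadaraya-Watson form).
    `coords` is (n, 2), `irc` is (n,), `query` is a single (x, y) location.
    """
    coords = np.asarray(coords, dtype=float)
    irc = np.asarray(irc, dtype=float)
    query = np.asarray(query, dtype=float)
    # Squared distance from the query location to every measurement.
    d2 = np.sum((coords - query) ** 2, axis=1)
    # Gaussian kernel weights: closer measurements count more.
    weights = np.exp(-0.5 * d2 / bandwidth ** 2)
    total = weights.sum()
    if total == 0.0:
        return float("nan")  # no measurement carries any weight
    # Weighted mean of the 0/1 indicator "measurement exceeds threshold".
    indicator = (irc > threshold).astype(float)
    return float(np.sum(weights * indicator) / total)

# Hypothetical measurements: two houses equally far from the query point.
p = exceedance_probability([[0.0, 0.0], [0.0, 1.0]], [400.0, 100.0], [0.0, 0.5])
```

A confidence index such as the one mentioned in the abstract could, under the same assumptions, be derived from the sum of the kernel weights, since a small sum means few measurements lie near the query location.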
Abstract:
The present research studies the spatial patterns of the distribution of the Swiss population (DSP). This description is carried out using a wide variety of global spatial structural analysis tools such as topological, statistical and fractal measures, which enable the estimation of the spatial degree of clustering of a point pattern. Particular attention is given to multifractal analysis in order to characterize the spatial structure of the DSP at different scales. This is achieved by measuring the generalized q-dimensions and the singularity spectrum. This research is based on high-quality data from the Swiss Population Census of the year 2000 at a hectometric resolution (100 x 100 m grid) issued by the Swiss Federal Statistical Office (FSO).
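The generalized q-dimensions mentioned above are commonly estimated by box counting. The following is a rough sketch under that assumption, not the procedure used in the study: the point pattern is covered by grids of decreasing cell size, the share of points per cell is computed, and the dimension is read off as a regression slope. The function name and the choice of box sizes are hypothetical.

```python
import numpy as np

def generalized_dimension(points, q, box_sizes):
    """Box-counting estimate of the generalized dimension D_q of a point set.

    Assumption: D_q is the slope of log(sum_i p_i^q) / (q - 1) against
    log(eps), with the usual q -> 1 limit sum_i p_i * log(p_i).
    """
    points = np.asarray(points, dtype=float)
    log_eps, log_moment = [], []
    for eps in box_sizes:
        # Assign every point to a grid cell of side eps and count per cell.
        cells = np.floor(points / eps).astype(np.int64)
        _, counts = np.unique(cells, axis=0, return_counts=True)
        p = counts / counts.sum()
        if np.isclose(q, 1.0):
            # Information dimension (limit q -> 1).
            log_moment.append(np.sum(p * np.log(p)))
        else:
            log_moment.append(np.log(np.sum(p ** q)) / (q - 1.0))
        log_eps.append(np.log(eps))
    # Least-squares slope over the chosen range of scales.
    slope, _ = np.polyfit(log_eps, log_moment, 1)
    return float(slope)

# Points spread evenly over the unit square should give D_q close to 2.
g = (np.arange(64) + 0.5) / 64
grid_points = np.array([(x, y) for x in g for y in g])
d2 = generalized_dimension(grid_points, q=2.0, box_sizes=[1/2, 1/4, 1/8, 1/16])
```

For a monofractal pattern all D_q coincide. A D_q that decreases with q indicates multifractality, which is what the singularity spectrum then summarizes.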
Abstract:
This paper examines the complex age structure of the population of Catalan municipalities in 2009. Catalonia is a very heterogeneous territory, and age pyramids vary considerably across its different areas, with geographical factors shaping municipalities’ age distributions. Using spatial statistics methodologies, this research assesses which spatial factors determine the location, scale and shape of local distributions. The results show that distributional patterns differ across the territory according to specific local determinants. Keywords: Spatial Models. JEL Classification: C21.
Abstract:
Community-level patterns of functional traits relate to community assembly and ecosystem functioning. By modelling the changes of different indices describing such patterns - trait means, extremes and diversity in communities - as a function of abiotic gradients, we could understand their drivers and build projections of the impact of global change on the functional components of biodiversity. We used five plant functional traits (vegetative height, specific leaf area, leaf dry matter content, leaf nitrogen content and seed mass) and non-woody vegetation plots to model several indices depicting community-level patterns of functional traits from a set of abiotic environmental variables (topographic, climatic and edaphic) over contrasting environmental conditions in a mountainous landscape. We performed a variation partitioning analysis to assess the relative importance of these variables for predicting patterns of functional traits in communities, and projected the best models under several climate change scenarios to examine future potential changes in vegetation functional properties. Not all indices of trait patterns within communities could be modelled with the same level of accuracy: the models for mean and extreme values of functional traits provided substantially better predictive accuracy than the models calibrated for diversity indices. Topographic and climatic factors were more important predictors of functional trait patterns within communities than edaphic predictors. Overall, model projections forecast an increase in mean vegetation height and in mean specific leaf area following climate warming. This trend was important at mid elevation particularly between 1000 and 2000 m asl. With this study we showed that topographic, climatic and edaphic variables can successfully model descriptors of community-level patterns of plant functional traits such as mean and extreme trait values. 
However, which factors determine the diversity of functional traits in plant communities remains unclear and requires further investigation.
Abstract:
Background: Conventional magnetic resonance imaging (MRI) techniques are highly sensitive in detecting multiple sclerosis (MS) plaques, enabling a quantitative assessment of inflammatory activity and lesion load. In quantitative analyses of focal lesions, manual or semi-automated segmentations have been widely used to compute the total number of lesions and the total lesion volume. These techniques, however, are both challenging and time-consuming, and are also prone to intra-observer and inter-observer variability. Aim: To develop an automated approach to segment brain tissues and MS lesions from brain MRI images. The goal is to reduce user interaction and to provide an objective tool that eliminates inter- and intra-observer variability. Methods: Based on the recent methods developed by Souplet et al. and de Boer et al., we propose a novel pipeline which includes the following steps: bias correction, skull stripping, atlas registration, tissue classification, and lesion segmentation. After the initial pre-processing steps, an MRI scan is automatically segmented into 4 classes: white matter (WM), grey matter (GM), cerebrospinal fluid (CSF) and partial volume. An expectation maximisation method which fits a multivariate Gaussian mixture model to T1-w, T2-w and PD-w images is used for this purpose. Based on the obtained tissue masks and using the estimated GM mean and variance, we apply an intensity threshold to the FLAIR image, which provides the lesion segmentation. To improve this initial result, spatial information from the neighbouring tissue labels is used to refine the final lesion segmentation. Results: The experimental evaluation was performed using real data sets of 1.5T and the corresponding ground truth annotations provided by expert radiologists. The following values were obtained: 64% of true positive (TP) fraction, 80% of false positive (FP) fraction, and an average surface distance of 7.89 mm.
The results of our approach were quantitatively compared to our implementations of the works of Souplet et al. and de Boer et al., obtaining higher TP and lower FP values. Conclusion: Promising MS lesion segmentation results have been obtained in terms of TP. However, the high number of FP, which is still a well-known problem of all automated MS lesion segmentation approaches, has to be reduced before they can be used in standard clinical practice. Our future work will focus on tackling this issue.
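The thresholding step described in the Methods can be illustrated with a small sketch. It assumes the threshold takes the common form "GM mean plus k standard deviations" computed on FLAIR intensities inside the GM mask. The abstract does not state the exact rule, so the factor k, the function name and the toy image below are hypothetical, and the neighbourhood-based refinement step is not shown.

```python
import numpy as np

def flair_lesion_candidates(flair, gm_mask, brain_mask, k=3.0):
    """Mark voxels brighter than mean(GM) + k * std(GM) as lesion candidates.

    Assumption: hyperintense lesions are outliers relative to the grey
    matter intensity distribution on the FLAIR image.
    """
    flair = np.asarray(flair, dtype=float)
    gm_mask = np.asarray(gm_mask, dtype=bool)
    brain_mask = np.asarray(brain_mask, dtype=bool)
    # Intensity statistics of the voxels classified as grey matter.
    gm_values = flair[gm_mask]
    threshold = gm_values.mean() + k * gm_values.std()
    # Keep only bright voxels that lie inside the brain.
    candidates = (flair > threshold) & brain_mask
    return candidates, float(threshold)

# Toy 2 x 4 "image": the first row is grey matter, one voxel is very bright.
toy_flair = np.array([[10.0, 10.0, 12.0, 12.0],
                      [10.0, 12.0, 50.0, 11.0]])
toy_gm = np.array([[True, True, True, True],
                   [False, False, False, False]])
toy_brain = np.ones_like(toy_gm, dtype=bool)
mask, thr = flair_lesion_candidates(toy_flair, toy_gm, toy_brain)
```

A lower k raises sensitivity at the cost of more false positives, which is the trade-off the reported TP and FP fractions reflect.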
Abstract:
We describe the spatial distribution of tree height of Pinus uncinata at two undisturbed altitudinal treeline ecotones in the southern Pyrenees (Ordesa, O, and Tessó, T). At each site, a rectangular plot (30 x 140 m) was located with its longest side parallel to the slope and encompassing treeline and timberline. At site O, height increased abruptly going downslope with a high spatial autocorrelation at short distances. In contrast, the changes of tree height across the ecotone at site T were gradual, and tree height was less spatially autocorrelated. These results can be explained by the greater importance of wind and snow avalanches at sites O and T, respectively.
Abstract:
Matrix sublimation has proven to be a powerful approach for high-resolution matrix-assisted laser desorption ionization (MALDI) imaging of lipids, providing very homogeneous solvent-free deposition. This work presents a comprehensive study aiming to evaluate current and novel matrix candidates for high spatial resolution MALDI imaging mass spectrometry of lipids from tissue sections after deposition by sublimation. For this purpose, 12 matrices including 2,5-dihydroxybenzoic acid (DHB), sinapinic acid (SA), α-cyano-4-hydroxycinnamic acid (CHCA), 2,6-dihydroxyacetophenone (DHA), 2',4',6'-trihydroxyacetophenone (THAP), 3-hydroxypicolinic acid (3-HPA), 1,8-bis(dimethylamino)naphthalene (DMAN), 1,8,9-anthracenetriol (DIT), 1,5-diaminonaphthalene (DAN), p-nitroaniline (NIT), 9-aminoacridine (9-AA), and 2-mercaptobenzothiazole (MBT) were investigated for lipid detection efficiency in both positive and negative ionization modes, matrix interferences, and stability under vacuum. For the most relevant matrices, ion maps of the different lipid species were obtained from tissue sections at high spatial resolution and the detected peaks were characterized by matrix-assisted laser desorption ionization time-of-flight/time-of-flight (MALDI-TOF/TOF) mass spectrometry. Proposed here for the first time for imaging mass spectrometry (IMS) after sublimation, DAN proved to be highly efficient, providing rich lipid signatures in both positive and negative polarities with high vacuum stability and sub-20 μm resolution capacity. Ion images from adult mouse brain were generated with a 10 μm scanning resolution. Furthermore, ion images from adult mouse brain and whole-body fish tissue sections were also acquired in both polarity modes from the same tissue section at 100 μm spatial resolution. Sublimation of DAN represents an interesting approach to improve information with respect to currently employed matrices, providing a deeper analysis of the lipidome by IMS.
Abstract:
Single-trial analysis of human electroencephalography (EEG) has been recently proposed for better understanding the contribution of individual subjects to a group-analysis effect as well as for investigating single-subject mechanisms. Independent Component Analysis (ICA) has been repeatedly applied to concatenated single-trial responses and at a single-subject level in order to extract those components that resemble activities of interest. More recently we have proposed a single-trial method based on topographic maps that determines which voltage configurations are reliably observed at the event-related potential (ERP) level taking advantage of repetitions across trials. Here, we investigated the correspondence between the maps obtained by ICA versus the topographies that we obtained by the single-trial clustering algorithm that best explained the variance of the ERP. To do this, we used exemplar data provided from the EEGLAB website that are based on a dataset from a visual target detection task. We show there to be robust correspondence both at the level of the activation time courses and at the level of voltage configurations of a subset of relevant maps. We additionally show the estimated inverse solution (based on low-resolution electromagnetic tomography) of two corresponding maps occurring at approximately 300 ms post-stimulus onset, as estimated by the two aforementioned approaches. The spatial distribution of the estimated sources significantly correlated and had in common a right parietal activation within Brodmann's Area (BA) 40. Despite their differences in terms of theoretical bases, the consistency between the results of these two approaches shows that their underlying assumptions are indeed compatible.
Abstract:
The objective of this study was to evaluate the efficiency of spatial statistical analysis in the selection of genotypes in a plant breeding program and, particularly, to demonstrate the benefits of the approach when experimental observations are not spatially independent. The basic material of this study was a yield trial of soybean lines, with five check varieties (of fixed effect) and 110 test lines (of random effects), in an augmented block design. The spatial analysis used a random field linear model (RFML), with a covariance function estimated from the residuals of the analysis considering independent errors. Results showed a residual autocorrelation of significant magnitude and extension (range), which allowed a better discrimination among genotypes (increase of the power of statistical tests, reduction in the standard errors of estimates and predictors, and a greater amplitude of predictor values) when the spatial analysis was applied. Furthermore, the spatial analysis led to a different ranking of the genetic materials, in comparison with the non-spatial analysis, and a selection less influenced by local variation effects was obtained.
Abstract:
We study the families of periodic orbits of the spatial isosceles 3-body problem (for small enough values of the mass lying on the symmetry axis) coming via the analytic continuation method from periodic orbits of the circular Sitnikov problem. Using the first integral of the angular momentum, we reduce the dimension of the phase space of the problem by two units. Since periodic orbits of the reduced isosceles problem generate invariant two-dimensional tori of the nonreduced problem, the analytic continuation of periodic orbits of the (reduced) circular Sitnikov problem at this level becomes the continuation of invariant two-dimensional tori from the circular Sitnikov problem to the nonreduced isosceles problem, each one filled with periodic or quasi-periodic orbits. These tori are not KAM tori but just isotropic, since we are dealing with a three-degrees-of-freedom system. The continuation of periodic orbits is done in two different ways, the first going directly from the reduced circular Sitnikov problem to the reduced isosceles problem, and the second one using two steps: first we continue the periodic orbits from the reduced circular Sitnikov problem to the reduced elliptic Sitnikov problem, and then we continue those periodic orbits of the reduced elliptic Sitnikov problem to the reduced isosceles problem. The continuation in one or two steps produces different results. This work is purely analytic and uses the variational equations in order to apply Poincaré’s continuation method.
Abstract:
The book presents the state of the art in machine learning algorithms (artificial neural networks of different architectures, support vector machines, etc.) as applied to the classification and mapping of spatially distributed environmental data. Basic geostatistical algorithms are presented as well. New trends in machine learning and their application to spatial data are given, and real case studies based on environmental and pollution data are carried out. The book provides a CD-ROM with the Machine Learning Office software, including sample sets of data, that will allow both students and researchers to put the concepts rapidly to practice.
Abstract:
We consider 2n masses located at the vertices of two nested regular polyhedra with the same number of vertices. Assuming that the masses in each polyhedron are equal, we prove that for each ratio of the masses of the inner and the outer polyhedron there exists a unique ratio of the length of the edges of the inner and the outer polyhedron such that the configuration is central.