947 resultados para spatial clustering algorithms


Relevância:

50.00% 50.00%

Publicador:

Resumo:

The increase in new electronic devices had generated a considerable increase in obtaining spatial data information; hence these data are becoming more and more widely used. As well as for conventional data, spatial data need to be analyzed so interesting information can be retrieved from them. Therefore, data clustering techniques can be used to extract clusters of a set of spatial data. However, current approaches do not consider the implicit semantics that exist between a region and an object’s attributes. This paper presents an approach that enhances spatial data mining process, so they can use the semantic that exists within a region. A framework was developed, OntoSDM, which enables spatial data mining algorithms to communicate with ontologies in order to enhance the algorithm’s result. The experiments demonstrated a semantically improved result, generating more interesting clusters, therefore reducing manual analysis work of an expert.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Biomédica

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents general problems and approaches for the spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of the machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most of the machines learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in a high dimensional geo-feature spaces, when the dimension of space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of models concerns considering of real space constrains like geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve modelling of environmental phenomena taking into account on geo-manifolds. An important part of the study deals with the analysis of relevant variables and models' inputs. This problem is approached by using different feature selection/feature extraction nonlinear tools. To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Distribution of socio-economic features in urban space is an important source of information for land and transportation planning. The metropolization phenomenon has changed the distribution of types of professions in space and has given birth to different spatial patterns that the urban planner must know in order to plan a sustainable city. Such distributions can be discovered by statistical and learning algorithms through different methods. In this paper, an unsupervised classification method and a cluster detection method are discussed and applied to analyze the socio-economic structure of Switzerland. The unsupervised classification method, based on Ward's classification and self-organized maps, is used to classify the municipalities of the country and allows to reduce a highly-dimensional input information to interpret the socio-economic landscape. The cluster detection method, the spatial scan statistics, is used in a more specific manner in order to detect hot spots of certain types of service activities. The method is applied to the distribution services in the agglomeration of Lausanne. Results show the emergence of new centralities and can be analyzed in both transportation and social terms.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Zonal management in vineyards requires the prior delineation of stable yield zones within the parcel. Among the different methodologies used for zone delineation, cluster analysis of yield data from several years is one of the possibilities cited in scientific literature. However, there exist reasonable doubts concerning the cluster algorithm to be used and the number of zones that have to be delineated within a field. In this paper two different cluster algorithms have been compared (k-means and fuzzy c-means) using the grape yield data corresponding to three successive years (2002, 2003 and 2004), for a ‘Pinot Noir’ vineyard parcel. Final choice of the most recommendable algorithm has been linked to obtaining a stable pattern of spatial yield distribution and to allowing for the delineation of compact and average sized areas. The general recommendation is to use reclassified maps of two clusters or yield classes (low yield zone and high yield zone) and, consequently, the site-specific vineyard management should be based on the prior delineation of just two different zones or sub-parcels. The two tested algorithms are good options for this purpose. However, the fuzzy c-means algorithm allows for a better zoning of the parcel, forming more compact areas and with more equilibrated zonal differences over time.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The analysis of rockfall characteristics and spatial distribution is fundamental to understand and model the main factors that predispose to failure. In our study we analysed LiDAR point clouds aiming to: (1) detect and characterise single rockfalls; (2) investigate their spatial distribution. To this end, different cluster algorithms were applied: 1a) Nearest Neighbour Clutter Removal (NNCR) in combination with the Expectation?Maximization (EM) in order to separate feature points from clutter; 1b) a density based algorithm (DBSCAN) was applied to isolate the single clusters (i.e. the rockfall events); 2) finally we computed the Ripley's K-function to investigate the global spatial pattern of the extracted rockfalls. The method allowed proper identification and characterization of more than 600 rockfalls occurred on a cliff located in Puigcercos (Catalonia, Spain) during a time span of six months. The spatial distribution of these events proved that rockfall were clustered distributed at a welldefined distance-range. Computations were carried out using R free software for statistical computing and graphics. The understanding of the spatial distribution of precursory rockfalls may shed light on the forecasting of future failures.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Clustering soil and crop data can be used as a basis for the definition of management zones because the data are grouped into clusters based on the similar interaction of these variables. Therefore, the objective of this study was to identify management zones using fuzzy c-means clustering analysis based on the spatial and temporal variability of soil attributes and corn yield. The study site (18 by 250-m in size) was located in Jaboticabal, São Paulo/Brazil. Corn yield was measured in one hundred 4.5 by 10-m cells along four parallel transects (25 observations per transect) over five growing seasons between 2001 and 2010. Soil chemical and physical attributes were measured. SAS procedure MIXED was used to identify which variable(s) most influenced the spatial variability of corn yield over the five study years. Basis saturation (BS) was the variable that better related to corn yield, thus, semivariograms models were fitted for BS and corn yield and then, data values were krigged. Management Zone Analyst software was used to carry out the fuzzy c-means clustering algorithm. The optimum number of management zones can change over time, as well as the degree of agreement between the BS and corn yield management zone maps. Thus, it is very important take into account the temporal variability of crop yield and soil attributes to delineate management zones accurately.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present some additions to a fuzzy variable radius niche technique called Dynamic Niche Clustering (DNC) (Gan and Warwick, 1999; 2000; 2001) that enable the identification and creation of niches of arbitrary shape through a mechanism called Niche Linkage. We show that by using this mechanism it is possible to attain better feature extraction from the underlying population.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper we deal with the problem of boosting the Optimum-Path Forest (OPF) clustering approach using evolutionary-based optimization techniques. As the OPF classifier performs an exhaustive search to find out the size of sample's neighborhood that allows it to reach the minimum graph cut as a quality measure, we compared several optimization techniques that can obtain close graph cut values to the ones obtained by brute force. Experiments in two public datasets in the context of unsupervised network intrusion detection have showed the evolutionary optimization techniques can find suitable values for the neighborhood faster than the exhaustive search. Additionally, we have showed that it is not necessary to employ many agents for such task, since the neighborhood size is defined by discrete values, with constrain the set of possible solution to a few ones.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

An integrated approach for multi-spectral segmentation of MR images is presented. This method is based on the fuzzy c-means (FCM) and includes bias field correction and contextual constraints over spatial intensity distribution and accounts for the non-spherical cluster's shape in the feature space. The bias field is modeled as a linear combination of smooth polynomial basis functions for fast computation in the clustering iterations. Regularization terms for the neighborhood continuity of intensity are added into the FCM cost functions. To reduce the computational complexity, the contextual regularizations are separated from the clustering iterations. Since the feature space is not isotropic, distance measure adopted in Gustafson-Kessel (G-K) algorithm is used instead of the Euclidean distance, to account for the non-spherical shape of the clusters in the feature space. These algorithms are quantitatively evaluated on MR brain images using the similarity measures.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

SUMMARY There is interest in the potential of companion animal surveillance to provide data to improve pet health and to provide early warning of environmental hazards to people. We implemented a companion animal surveillance system in Calgary, Alberta and the surrounding communities. Informatics technologies automatically extracted electronic medical records from participating veterinary practices and identified cases of enteric syndrome in the warehoused records. The data were analysed using time-series analyses and a retrospective space-time permutation scan statistic. We identified a seasonal pattern of reports of occurrences of enteric syndromes in companion animals and four statistically significant clusters of enteric syndrome cases. The cases within each cluster were examined and information about the animals involved (species, age, sex), their vaccination history, possible exposure or risk behaviour history, information about disease severity, and the aetiological diagnosis was collected. We then assessed whether the cases within the cluster were unusual and if they represented an animal or public health threat. There was often insufficient information recorded in the medical record to characterize the clusters by aetiology or exposures. Space-time analysis of companion animal enteric syndrome cases found evidence of clustering. Collection of more epidemiologically relevant data would enhance the utility of practice-based companion animal surveillance.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This study tested three hypotheses: (1) that there is clustering of the neuronal cytoplasmic inclusions (NCI), astrocytic plaques (AP) and ballooned neurons (BN) in corticobasal degeneration (CBD), (2) that the clusters of NCI and BN are not spatially correlated, and (3) that the lesions are correlated with disease ‘stage’. In 50% of the regions, clusters of lesions were 400–800 µm in diameter and regularly distributed parallel to the tissue boundary. Clusters of NCI and BN were larger in laminae II/III and V/VI, respectively. In a third of regions, the clusters of BN and NCI were negatively spatially correlated. Cluster size of the BN in the parahippocampal gyrus (PHG) was positively correlated with disease ‘stage’. The data suggest the following: (1) degeneration of the cortico-cortical pathways in CBD, (2) clusters of NCI and BN may affect different anatomical pathways and (3) BN may develop after the NCI in the PHG.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point processes methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but, often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point processes methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but, often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.