923 results for spatial clustering algorithms


Relevance:

80.00%

Publisher:

Abstract:

This paper presents a novel method, Exceptional Object Analysis for Finding Rare Environmental Events (EOAFREE). The major contribution of EOAFREE is a general Improved Exceptional Object Analysis based on Noises (IEOAN) algorithm that efficiently detects and ranks exceptional objects. IEOAN is more general than existing outlier detection algorithms in that it can find exceptional objects that may not lie on the border, and an experimental study shows that it is far more efficient at detecting rare events than recursively applying existing clustering algorithms that may not force every data instance to belong to a cluster. A second contribution is an approach to preprocessing heterogeneous real-world data using domain knowledge: the input to the IEOAN algorithm is defined as changes in the water data rather than the values themselves, which removes the geographical differences between any two sites and the temporal differences between any two years. The effectiveness of EOAFREE is demonstrated by a real-world application - that is, detecting water pollution events in the water quality datasets of 93 sites distributed across 10 river basins in Victoria, Australia, between 1975 and 2010. © 2012 Elsevier B.V.

Relevance:

80.00%

Publisher:

Abstract:

Extracellular data analysis has become a quintessential method for understanding neurophysiological responses to stimuli. It demands stringent techniques owing to the complicated nature of the recording environment. In this paper, we highlight the challenges in extracellular multi-electrode recording and data analysis, as well as the limitations of some currently employed methodologies. To address some of these challenges, we present a unified algorithm in the form of selective sorting. Selective sorting is modelled around a hypothesized generative model that addresses the natural phenomenon of spikes triggered by an intricate neuronal population. The algorithm incorporates the Cepstrum of Bispectrum, ad hoc clustering algorithms, wavelet transforms, and least-squares and correlation concepts, which together strategically tailor a sequence to characterize and form distinctive clusters. Additionally, we demonstrate the influence of noise-modelled wavelets in sorting overlapping spikes. The algorithm is evaluated on both raw and synthesized data sets with different levels of complexity, and its performance is tabulated for comparison using widely accepted qualitative and quantitative indicators.

Relevance:

80.00%

Publisher:

Abstract:

Forest trees, like oaks, rely on high levels of genetic variation to adapt to varying environmental conditions. Thus, genetic variation and its distribution are important for the long-term survival and adaptability of oak populations. Climate change is projected to lead to increased drought and fire events as well as a northward migration of tree species, including oaks. Additionally, decline in oak regeneration has become increasingly concerning since it may lead to decreased gene flow and increased inbreeding levels. This will in turn lead to lowered levels of genetic diversity, negatively affecting the growth and survival of populations. At the same time, populations at the species’ distribution edge, like those in this study, could possess important stores of genetic diversity and adaptive potential, while also being vulnerable to climatic or anthropogenic changes. A survey of the level and distribution of genetic variation and identification of potentially adaptive genes is needed since adaptive genetic variation is essential for their long-term survival. Oaks possess a remarkable characteristic in that they maintain their species identity and specific environmental adaptations despite their propensity to hybridize. Thus, in the face of interspecific gene flow, some areas of the genome remain differentiated due to selection. This characteristic allows the study of local environmental adaptation through genetic variation analyses. Furthermore, using genic markers with known putative functions makes it possible to link those differentiated markers to potential adaptive traits (e.g., flowering time, drought stress tolerance). Demographic processes like gene flow and genetic drift also play an important role in how genes (including adaptive genes) are maintained or spread. These processes are influenced by disturbances, both natural and anthropogenic. 
An examination of how genetic variation is geographically distributed can show how these genetic processes and geographical disturbances influence patterns of genetic variation. For example, the spatial clustering of closely related trees could promote inbreeding with associated negative effects (inbreeding depression) if gene flow is limited. In turn, this can have negative consequences for a species’ ability to adapt to changing environmental conditions. In contrast, interspecific hybridization may also allow the transfer of genes between species that increase their adaptive potential in a changing environment. I have studied the ecologically divergent, interfertile red oaks, Quercus rubra and Q. ellipsoidalis, to identify genes with potential roles in adaptation to abiotic stress through traits such as drought tolerance and flowering time, and to assess the level and distribution of genetic variation. I found evidence for moderate gene flow between the two species and low interspecific genetic differences at most genetic markers (Lind and Gailing 2013). However, the screening of genic markers with potential roles in phenology and drought tolerance led to the identification of a CONSTANS-like (COL) gene, a candidate gene for flowering time and growth. This marker, located in the coding region of the gene, was highly differentiated between the two species in multiple geographical areas, despite interspecific gene flow, and may play a role in reproductive isolation and adaptive divergence between the two species (Lind-Riehl et al. 2014). Since climate change could result in a northward migration of tree species like oaks, this gene could be important in maintaining species identity despite increased contact zones between species (e.g., increased gene flow). Finally, I examined differences in spatial genetic structure (SGS) and genetic variation between species and populations subjected to different management strategies and natural disturbances.
Diverse management activities combined with various natural disturbances as well as species specific life history traits influenced SGS patterns and inbreeding levels (Lind-Riehl and Gailing submitted).

Relevance:

80.00%

Publisher:

Abstract:

Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporates Big Data analysis into its business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer’s memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which significantly reduces both computational complexity and space complexity. The proposed stKFCM requires only O(n^2) memory, where n is the (predetermined) size of a data subset (or data chunk) at each time step, which makes the algorithm truly scalable (as n can be chosen based on the available memory). Furthermore, only 2n^2 elements of the full N × N (where N >> n) kernel matrix need to be calculated at each time step, which reduces both the computation time for producing the kernel elements and the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively small n, can cluster as accurately as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.
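The memory argument in this abstract can be illustrated with a minimal Python sketch. It is not the stKFCM algorithm itself: it only shows how a stream can be consumed in chunks of size n so that at most two n × n kernel blocks (2n^2 elements) are computed per time step. The function names, the Gaussian kernel, the `gamma` parameter, and the choice to pair each chunk with the previous chunk are all assumptions made for illustration; the abstract does not state which two blocks stKFCM evaluates.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_block(rows, cols, gamma=0.5):
    """Dense kernel block with len(rows) x len(cols) entries."""
    return [[rbf_kernel(r, c, gamma) for c in cols] for r in rows]


def stream_kernel_blocks(stream, n, gamma=0.5):
    """Yield (within, cross) kernel blocks for each full chunk of n points.

    within: the chunk against itself (n x n).
    cross:  the chunk against the previous chunk (n x n), or None for the
            first chunk. Pairing with the previous chunk is an assumption.
    Only one previous chunk is retained, so memory stays O(n^2) regardless
    of the stream length. A trailing partial chunk is dropped.
    """
    previous = None
    chunk = []
    for point in stream:
        chunk.append(point)
        if len(chunk) == n:
            within = kernel_block(chunk, chunk, gamma)
            cross = None
            if previous is not None:
                cross = kernel_block(chunk, previous, gamma)
            yield within, cross
            previous, chunk = chunk, []
```

A fuzzy c-means update would consume each pair of blocks before the next chunk arrives; that step is omitted here because the abstract does not describe it.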

Relevance:

80.00%

Publisher:

Abstract:

BACKGROUND African swine fever (ASF) is one of the most complex viral diseases affecting both domestic and wild pigs. It is caused by ASF virus (ASFV), the only DNA virus that can be efficiently transmitted by an arthropod vector, soft ticks of the genus Ornithodoros. These ticks can be part of ASFV-transmission cycles, and in Europe, O. erraticus was shown to be responsible for long-term maintenance of ASFV in Spain and Portugal. In 2014, the disease was reintroduced into the European Union, affecting domestic pigs and, importantly, also the Eurasian wild boar population. In a first attempt to assess the risk of a tick-wild boar transmission cycle in Central Europe that would further complicate eradication of the disease, over 700 pre-existing serum samples from wild boar hunted in four representative German Federal States were investigated for the presence of antibodies directed against salivary antigen of Ornithodoros erraticus ticks using an indirect ELISA format. RESULTS Of these samples, 16 reacted with moderate to high optical densities that could be indicative of tick bites in the sampled wild boar. However, these samples did not show spatial clustering (they were collected from distant geographical regions) and were of poor quality (hemolysis/impurities). Furthermore, all positive samples came from areas with a suboptimal climate for soft ticks. For these reasons, false positive reactions are likely. CONCLUSION The study did not provide stringent evidence for soft tick-wild boar contact in the investigated German Federal States, and thus a relevant involvement of soft ticks in the epidemiology of ASF in German wild boar is unlikely. This would facilitate the eradication of ASF in the area, although other complex relations (wild boar biology and interactions with domestic pigs) need to be considered.

Relevance:

80.00%

Publisher:

Abstract:

Introduction: Cancer is preventable in some cases if exposure to carcinogenic substances in the environment is avoided. In Colombia, Cundinamarca is one of the departments with the largest increases in the mortality rate, and in the municipality of Sibaté residents have expressed concern about the rise of the disease. In global environmental health, georeferencing applied to the study of health phenomena has been successful and has produced valid results. The study proposed using geographic information tools to generate time and space analyses that would make the behavior of cancer in Sibaté visible and support hypotheses of environmental influences on concentrations of cases. Objective: To obtain the incidence and prevalence of cancer cases among residents of Sibaté and to georeference the cases over a 5-year period, based on a review of records. Methodology: Exploratory, descriptive, cross-sectional study of all cancer diagnoses between 2010 and 2014 found in the archives of the municipal Secretaría de Salud. Only people who were permanent residents of the municipality and were diagnosed with cancer between 2010 and 2014 were included. For each case, gender, age, socioeconomic stratum, educational level, occupation and marital status were obtained. The date of diagnosis was used for the time analysis; the residential address, type of cancer and geographic coordinates were used for the spatial analysis. Geographic coordinates were generated with a Garmin GPS device, and maps were created showing the locations of the patients' homes. The information was processed with Epi Info 7. Results: 107 cancer cases were found registered at the Secretaría de Salud of Sibaté: 66 women and 41 men. Across both genders, 30.93% of the population presented cancer of the reproductive system, 18.56% of the digestive system and 17.53% of the integumentary system.
Two large spatial clusters of cases were observed in the study area: one in the Pablo Neruda neighborhood with 12 (21.05%) cases and one in the urban center of Sibaté with 38 (66.67%) cases. Conclusion: The study corroborated that geographic analysis with spatio-temporal and exposure variables can be a tool for generating hypotheses about associations between cancer cases and environmental factors.

Relevance:

50.00%

Publisher:

Abstract:

Social networks generally display a positively skewed degree distribution and higher values for clustering coefficient and degree assortativity than would be expected from the degree sequence. For some types of simulation studies, these properties need to be varied in the artificial networks over which simulations are to be conducted. Various algorithms to generate networks have been described in the literature but their ability to control all three of these network properties is limited. We introduce a spatially constructed algorithm that generates networks with constrained but arbitrary degree distribution, clustering coefficient and assortativity. Both a general approach and specific implementation are presented. The specific implementation is validated and used to generate networks with a constrained but broad range of property values. © Copyright JASSS.

Relevance:

50.00%

Publisher:

Abstract:

We present a novel approach to improving subspace clustering by exploiting spatial constraints. The new method encourages the sparse solution to be consistent with the spatial geometry of the tracked points by embedding weights into the sparse formulation. By doing so, we are able to correct sparse representations in a principled manner without introducing much additional computational cost. We discuss alternative ways to treat missing and corrupted data using the latest theory in robust lasso regression and suggest numerical algorithms to solve the proposed formulation. Experiments on the benchmark Johns Hopkins 155 dataset demonstrate that exploiting spatial constraints significantly improves motion segmentation.

Relevance:

50.00%

Publisher:

Abstract:

The increase in new electronic devices has generated a considerable increase in the acquisition of spatial data; hence these data are becoming more and more widely used. As with conventional data, spatial data need to be analyzed so that interesting information can be retrieved from them. Data clustering techniques can therefore be used to extract clusters from a set of spatial data. However, current approaches do not consider the implicit semantics that exist between a region and an object’s attributes. This paper presents an approach that enhances the spatial data mining process so that it can use the semantics that exist within a region. A framework, OntoSDM, was developed that enables spatial data mining algorithms to communicate with ontologies in order to enhance the algorithm’s result. The experiments demonstrated a semantically improved result that generates more interesting clusters and therefore reduces the manual analysis work required of an expert.

Relevance:

40.00%

Publisher:

Abstract:

Two algorithms are outlined, each of which has interesting features for modeling the spatial variability of rock depth. In this paper, the reduced level of rock at Bangalore, India, is derived from data from 652 boreholes in an area covering 220 sq. km. Support vector machine (SVM) and relevance vector machine (RVM) models have been used to predict the reduced level of rock in the subsurface of Bangalore and to study the spatial variability of the rock depth. The SVM, which is firmly based on statistical learning theory, is applied as a regression technique by introducing an epsilon-insensitive loss function. The RVM is a probabilistic model similar to the widespread SVM, but its training takes place in a Bayesian framework. Prediction results show the ability of learning machines to build accurate models of the spatial variability of rock depth with strong predictive capabilities. The paper also highlights the capability of RVM over the SVM model.
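For readers unfamiliar with the epsilon-insensitive loss mentioned in this abstract, a minimal Python sketch follows. It shows only the standard loss used in support vector regression, in which errors smaller than epsilon cost nothing and larger errors are penalized linearly. The function name, the default `epsilon`, and the example values are illustrative assumptions and are not taken from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Standard epsilon-insensitive loss: max(0, |y_true - y_pred| - epsilon).

    Predictions that fall within +/- epsilon of the target incur zero loss;
    beyond that tube the loss grows linearly with the absolute error.
    """
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical reduced-level values in metres, for illustration only.
inside_tube = epsilon_insensitive_loss(10.0, 10.05, epsilon=0.1)   # no penalty
outside_tube = epsilon_insensitive_loss(10.0, 10.5, epsilon=0.1)   # linear penalty
```

The flat region of the loss is what makes the fitted regression depend only on a subset of the training points (the support vectors).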

Relevance:

40.00%

Publisher:

Abstract:

A new method for automated coronal loop tracking, in both spatial and temporal domains, is presented. Applying this technique to TRACE data, obtained using the 171 angstrom filter on 1998 July 14, we detect a coronal loop undergoing a 270 s kink-mode oscillation, as previously found by Aschwanden et al. However, we also detect flare-induced, and previously unnoticed, spatial periodicities on a scale of 3500 km, which occur along the coronal loop edge. Furthermore, we establish a reduction in oscillatory power for these spatial periodicities of 45% over a 222 s interval. We relate the reduction in detected oscillatory power to the physical damping of these loop-top oscillations.

Relevance:

40.00%

Publisher:

Abstract:

The problem of detecting spatially coherent groups of data that exhibit anomalous behavior has started to attract attention due to applications in areas such as epidemic analysis and weather forecasting. Earlier efforts from the data mining community have largely focused on finding outliers, individual data objects that display deviant behavior. Such point-based methods are not easy to extend to finding groups of data that exhibit anomalous behavior. Scan statistics are methods from the statistics community that address the problem of identifying regions where data objects exhibit behavior that is atypical of the general dataset. The spatial scan statistic and the methods that build upon it mostly adopt the framework of defining a character for regions of objects (e.g., circular or elliptical) and repeatedly sampling regions of that character, followed by applying a statistical test for anomaly detection. In the past decade, there have been efforts from the statistics community to enhance the efficiency of scan statistics as well as to enable the discovery of arbitrarily shaped anomalous regions. On the other hand, the data mining community has started to look at determining anomalous regions whose behavior diverges from that of their neighborhood. In this chapter, we survey the space of techniques for detecting anomalous regions in spatial data from across the data mining and statistics communities, while outlining connections to well-studied problems in clustering and image segmentation. We analyze the techniques systematically by categorizing them appropriately to provide a structured bird's-eye view of the work on anomalous region detection; we hope that this will encourage better cross-pollination of ideas across communities to help advance the frontier in anomaly detection.
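The scan-statistic framework described in this abstract (fix a region shape, enumerate candidate regions, score each one) can be sketched in Python as follows. This is a minimal illustration that assumes circular regions centred on the observed sites and the Poisson log-likelihood ratio commonly used with the spatial scan statistic; the function names, the `(x, y, cases, population)` tuple layout, and the candidate radii are assumptions. The Monte Carlo significance test that normally follows the search is omitted.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for one candidate region.

    Returns 0.0 unless the region holds more cases than expected, so only
    elevated-rate regions are scored.
    """
    if cases_in <= expected_in:
        return 0.0
    inside = cases_in * math.log(cases_in / expected_in)
    cases_out = total_cases - cases_in
    if cases_out == 0:
        return inside
    expected_out = total_cases - expected_in
    return inside + cases_out * math.log(cases_out / expected_out)


def scan_circles(sites, radii):
    """Return (score, centre, radius) of the highest-scoring circle.

    sites: list of (x, y, cases, population) tuples, populations assumed > 0.
    radii: candidate radii, in the same units as x and y.
    """
    total_cases = sum(s[2] for s in sites)
    total_pop = sum(s[3] for s in sites)
    best = (0.0, None, None)
    for cx, cy, _, _ in sites:
        for radius in radii:
            members = [s for s in sites
                       if math.hypot(s[0] - cx, s[1] - cy) <= radius]
            cases_in = sum(s[2] for s in members)
            pop_in = sum(s[3] for s in members)
            # Expected cases if risk were uniform across the population.
            expected_in = total_cases * pop_in / total_pop
            score = poisson_llr(cases_in, expected_in, total_cases)
            if score > best[0]:
                best = (score, (cx, cy), radius)
    return best
```

Restricting the search to circles is what the abstract calls defining a character for regions; the arbitrarily shaped variants it mentions replace the inner enumeration with a different family of candidate regions.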

Relevance:

40.00%

Publisher:

Abstract:

We present some additions to a fuzzy variable radius niche technique called Dynamic Niche Clustering (DNC) (Gan and Warwick, 1999; 2000; 2001) that enable the identification and creation of niches of arbitrary shape through a mechanism called Niche Linkage. We show that by using this mechanism it is possible to attain better feature extraction from the underlying population.

Relevance:

40.00%

Publisher:

Abstract:

In this paper we deal with the problem of boosting the Optimum-Path Forest (OPF) clustering approach using evolutionary optimization techniques. Because the OPF classifier performs an exhaustive search for the sample neighborhood size that minimizes the graph cut used as a quality measure, we compared several optimization techniques that can obtain graph cut values close to those obtained by brute force. Experiments on two public datasets in the context of unsupervised network intrusion detection have shown that the evolutionary optimization techniques can find suitable values for the neighborhood size faster than the exhaustive search. Additionally, we have shown that it is not necessary to employ many agents for this task, since the neighborhood size is defined by discrete values, which constrains the set of possible solutions to a few.