180 results for k-Means algorithm
in University of Queensland eSpace - Australia
Abstract:
In this paper we present an efficient k-Means clustering algorithm for two-dimensional data. The proposed algorithm reorganizes the dataset into a nested binary tree. At each node, data items are compared with only the two nearest means in each dimension and assigned to the cluster with the closer mean. The main idea is as follows: we build the nested binary tree, scan the data in raster order by in-order traversal of the tree, and compare the data item at each node with only its two nearest means to assign it to the intended cluster. In this way we save significant computational cost by reducing the number of comparisons with means and by minimizing use of the Euclidean distance formula. Our results show that the method performs clustering much faster than classical methods. © Springer-Verlag Berlin Heidelberg 2005
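As a rough illustration of the "two nearest means" idea only, the following Python sketch handles a single dimension: if the means are kept sorted, a binary search finds the two means that bracket a value, so only those two need to be compared. This is not the authors' nested binary tree; the function name `assign_to_nearest_mean` and the sample numbers are hypothetical.

```python
from bisect import bisect_left

def assign_to_nearest_mean(values, means):
    """Assign each 1-D value to the index of its closest mean.

    Assumes `means` is sorted in ascending order. Only the two means
    that bracket a value are compared, rather than all k means.
    """
    labels = []
    for v in values:
        i = bisect_left(means, v)      # index of the first mean >= v
        if i == 0:                     # v lies below every mean
            labels.append(0)
        elif i == len(means):          # v lies above every mean
            labels.append(len(means) - 1)
        else:
            lower, upper = means[i - 1], means[i]
            labels.append(i - 1 if v - lower <= upper - v else i)
    return labels

# Hypothetical data: three sorted means and four values.
labels = assign_to_nearest_mean([1.0, 4.9, 5.1, 12.0], [0.0, 5.0, 10.0])
print(labels)
```

The binary search costs O(log k) per value instead of the O(k) comparisons of a naive assignment step, which is the kind of saving the abstract describes.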
Abstract:
Understanding the ecological role of benthic microalgae, a highly productive component of coral reef ecosystems, requires information on their spatial distribution. The spatial extent of benthic microalgae on Heron Reef (southern Great Barrier Reef, Australia) was mapped using data from the Landsat 5 Thematic Mapper sensor, integrated with field measurements of sediment chlorophyll concentration and reflectance. Field-measured sediment chlorophyll concentrations, ranging from 23 to 1,153 mg chl a m⁻², were classified into low, medium, and high concentration classes (1-170, 171-290, and > 291 mg chl a m⁻²) using a K-means clustering algorithm. The mapping process assumed that areas in the Thematic Mapper image exhibiting similar reflectance levels in red and blue bands would correspond to areas of similar chlorophyll a levels. Regions of homogeneous reflectance values corresponding to low, medium, and high chlorophyll levels were identified over the reef sediment zone by applying a standard image classification algorithm to the Thematic Mapper image. The resulting distribution map revealed large-scale (> 1 km²) patterns in chlorophyll a levels throughout the sediment zone of Heron Reef. Reef-wide estimates of chlorophyll a distribution indicate that benthic microalgae may constitute up to 20% of the total benthic chlorophyll a at Heron Reef, and thus contribute significantly to total primary productivity on the reef.
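To make the classification step concrete, here is a minimal sketch of one-dimensional K-means (Lloyd's algorithm) splitting concentration values into three classes. The values below are invented for illustration and are not the study's measurements; the quantile-based initialisation and the name `kmeans_1d` are likewise assumptions.

```python
def kmeans_1d(values, k, iters=100):
    """Cluster 1-D values into k groups with Lloyd's algorithm."""
    values = sorted(values)
    n = len(values)
    # Assumption: start the centres at evenly spaced quantiles of the data.
    centres = [values[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Recompute each centre as its cluster mean (keep it if the cluster is empty).
        updated = [sum(c) / len(c) if c else centres[j]
                   for j, c in enumerate(clusters)]
        if updated == centres:         # converged: assignments are stable
            break
        centres = updated
    return centres, clusters

# Hypothetical chlorophyll concentrations (mg chl a per square metre).
samples = [30, 50, 80, 200, 220, 250, 600, 900, 1100]
centres, clusters = kmeans_1d(samples, k=3)
print(clusters)
```

The class boundaries then fall between the largest value of one cluster and the smallest value of the next.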
Abstract:
Fuzzy data has grown to be an important factor in data mining. Whenever uncertainty exists, simulation can be used as a model. Simulation is very flexible, although it can involve significant levels of computation. This article discusses fuzzy decision-making using the grey related analysis method. Fuzzy models are expected to better reflect decision-making uncertainty, at some cost in accuracy relative to crisp models. Monte Carlo simulation is used to incorporate experimental levels of uncertainty into the data and to measure the impact of fuzzy decision tree models using categorical data. Results are compared with decision tree models based on crisp continuous data.
Abstract:
Collaborative recommendation is one of the widely used recommendation techniques; it recommends items to a visitor by referring to the preferences of other users who are similar to the current user. User profiling on Web transaction data can capture such knowledge of user tasks or interests. With the discovered usage pattern information, it becomes possible to recommend more preferred content to Web users, or to customize the Web presentation for visitors, via collaborative recommendation. In addition, it helps to identify the underlying relationships among Web users, items and latent tasks during Web mining. In this paper, we propose a Web recommendation framework based on user profiling. In this approach, we employ Probabilistic Latent Semantic Analysis (PLSA) to model the co-occurrence activities and develop a modified k-means clustering algorithm to build user profiles as the representatives of usage patterns. Moreover, the hidden task model is derived by characterizing the meaningful latent factor space. With the discovered user profiles, we then choose the best-matching profile, whose preferences most closely resemble those of the current user, and make collaborative recommendations based on the corresponding page weights in the selected profile. Preliminary experimental results on real-world data sets show that the proposed approach is capable of making recommendations accurately and efficiently.
Abstract:
Examples from the Murray-Darling basin in Australia are used to illustrate different methods of disaggregation of reconnaissance-scale maps. One approach for disaggregation revolves around the de-convolution of the soil-landscape paradigm elaborated during a soil survey. The descriptions of soil map units and block diagrams in a soil survey report detail soil-landscape relationships or soil toposequences that can be used to disaggregate map units into component landscape elements. Toposequences can be visualised on a computer by combining soil maps with digital elevation data. Expert knowledge or statistics can be used to implement the disaggregation. Use of a restructuring element and k-means clustering are illustrated. Another approach to disaggregation uses training areas to develop rules to extrapolate detailed mapping into other, larger areas where detailed mapping is unavailable. A two-level decision tree example is presented. At one level, the decision tree method is used to capture mapping rules from the training area; at another level, it is used to define the domain over which those rules can be extrapolated. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
Coarse-resolution thematic maps derived from remotely sensed data and implemented in GIS play an important role in coastal and marine conservation, research and management. Here, we describe an approach for fine-resolution mapping of land-cover types using aerial photography and ancillary GIS and ground data in a large (100 x 35 km) subtropical estuarine system (Moreton Bay, Queensland, Australia). We have developed and implemented a classification scheme representing 24 coastal (subtidal, intertidal, mangrove, supratidal and terrestrial) cover types relevant to the ecology of estuarine animals, nekton and shorebirds. The accuracy of classifications of the intertidal and subtidal cover types, as indicated by the agreement between the mapped (predicted) and reference (ground) data, was 77-88%, depending on the zone and level of generalization required. The variability and spatial distribution of habitat mosaics (landscape types) across the mapped environment were assessed using K-means clustering and validated with Classification and Regression Tree models. Seven broad landscape types could be distinguished, and ways of incorporating the information on landscape composition into site-specific conservation and field research are discussed. This research illustrates the importance and potential applications of fine-resolution mapping for conservation and management of estuarine habitats and their terrestrial and aquatic wildlife. (c) 2005 Elsevier Ltd. All rights reserved.
Abstract:
Background: Pain is defined as both a sensory and an emotional experience. Acute postoperative tooth extraction pain is assessed and treated as a physiological (sensory) pain while chronic pain is a biopsychosocial problem. The purpose of this study was to assess whether psychological and social changes occur in the acute pain state. Methods: A biopsychosocial pain questionnaire was completed by 438 subjects (165 males, 273 females) with acute postoperative pain at 24 hours following the surgical extraction of teeth and compared with 273 subjects (78 males, 195 females) with chronic orofacial pain. Statistical methods used a k-means cluster analysis. Results: Three clusters were identified in the acute pain group: 'unaffected', 'disabled' and 'depressed, anxious and disabled'. Psychosocial effects showed 24.8 per cent feeling 'distress/suffering' and 15.1 per cent 'sad and depressed'. Females reported higher pain intensity and more distress, depression and inadequate medication for pain relief (p
Abstract:
This paper describes the application of a new technique, rough clustering, to the problem of market segmentation. Rough clustering produces different solutions to k-means analysis because of the possibility of multiple cluster membership of objects. Traditional clustering methods generate extensional descriptions of groups, that show which objects are members of each cluster. Clustering techniques based on rough sets theory generate intensional descriptions, which outline the main characteristics of each cluster. In this study, a rough cluster analysis was conducted on a sample of 437 responses from a larger study of the relationship between shopping orientation (the general predisposition of consumers toward the act of shopping) and intention to purchase products via the Internet. The cluster analysis was based on five measures of shopping orientation: enjoyment, personalization, convenience, loyalty, and price. The rough clusters obtained provide interpretations of different shopping orientations present in the data without the restriction of attempting to fit each object into only one segment. Such descriptions can be an aid to marketers attempting to identify potential segments of consumers.
Abstract:
In this paper we consider the problem of providing standard errors of the component means in normal mixture models fitted to univariate or multivariate data by maximum likelihood via the EM algorithm. Two methods of estimation of the standard errors are considered: the standard information-based method and the computationally intensive bootstrap method. They are compared empirically by their application to three real data sets and by a small-scale Monte Carlo experiment.
Abstract:
Nitrogen adsorption on a surface of a non-porous reference material is widely used in the characterization. Traditionally, the enhancement of solid-fluid potential in a porous solid is accounted for by incorporating the surface curvature into the solid-fluid potential of the flat reference surface. However, this calculation procedure has not been justified experimentally. In this paper, we derive the solid-fluid potential of mesoporous MCM-41 solid by using solely the adsorption isotherm of that solid. This solid-fluid potential is then compared with that of the non-porous reference surface. In deriving the solid-fluid potential for both the reference surface and mesoporous MCM-41 silica (diameter ranging from 3 to 6.5 nm) we employ the nonlocal density functional theory developed for amorphous solids. It is found that, to our surprise, the solid-fluid potential of a porous solid is practically the same as that for the reference surface, indicating that there is no enhancement due to surface curvature. Further investigation is required to explain this unusual departure from our conventional wisdom of curvature-induced enhancement. Accepting the curvature-independent solid-fluid potential derived from the non-porous reference surface, we analyze the hysteresis features of a series of MCM-41 samples. (c) 2005 Elsevier Inc. All rights reserved.
Abstract:
Recently Adams and Bischof (1994) proposed a novel region growing algorithm for segmenting intensity images. The inputs to the algorithm are the intensity image and a set of seeds - individual points or connected components - that identify the individual regions to be segmented. The algorithm grows these seed regions until all of the image pixels have been assimilated. Unfortunately the algorithm is inherently dependent on the order of pixel processing. This means, for example, that raster order processing and anti-raster order processing do not, in general, lead to the same tessellation. In this paper we propose an improved seeded region growing algorithm that retains the advantages of the Adams and Bischof algorithm - fast execution, robust segmentation, and no tuning parameters - but is pixel order independent. (C) 1997 Elsevier Science B.V.
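The general shape of seeded region growing can be sketched as below. This is a simplified illustration of the basic scheme, not the improved algorithm the abstract proposes: candidate pixels wait in a priority queue keyed by how far their intensity lies from the mean of the adjacent region, and the closest candidate is assimilated first. The function name, the 4-connectivity and the tiny test image are assumptions. Ties are broken here by insertion order, which is the kind of processing-order dependence the abstract refers to.

```python
import heapq

def seeded_region_growing(image, seeds):
    """Grow labelled regions from seeds over a 2-D intensity image.

    image: list of rows of numbers; seeds: dict mapping label -> [(row, col), ...].
    Returns a grid of labels with the same shape as `image`.
    """
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    total, count = {}, {}              # running sum and size of each region
    heap, order = [], 0                # `order` breaks ties by insertion time

    def push_neighbours(r, c, lab):
        nonlocal order
        mean = total[lab] / count[lab]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] is None:
                delta = abs(image[nr][nc] - mean)
                heapq.heappush(heap, (delta, order, nr, nc, lab))
                order += 1

    for lab, points in seeds.items():
        total[lab] = sum(image[r][c] for r, c in points)
        count[lab] = len(points)
        for r, c in points:
            labels[r][c] = lab
    for lab, points in seeds.items():
        for r, c in points:
            push_neighbours(r, c, lab)

    while heap:
        _, _, r, c, lab = heapq.heappop(heap)
        if labels[r][c] is not None:   # already claimed by some region
            continue
        labels[r][c] = lab
        total[lab] += image[r][c]
        count[lab] += 1
        push_neighbours(r, c, lab)
    return labels

# Hypothetical 2 x 4 image with a dark half and a bright half.
image = [[10, 10, 200, 200],
         [10, 12, 198, 200]]
result = seeded_region_growing(image, {"a": [(0, 0)], "b": [(0, 3)]})
print(result)
```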
Abstract:
The popular Newmark algorithm, used for implicit direct integration of structural dynamics, is extended by means of a nodal partition to permit use of different timesteps in different regions of a structural model. The algorithm developed has as a special case an explicit-explicit subcycling algorithm previously reported by Belytschko, Yen and Mullen. That algorithm has been shown, in the absence of damping or other energy dissipation, to exhibit instability over narrow timestep ranges that become narrower as the number of degrees of freedom increases, making them unlikely to be encountered in practice. The present algorithm avoids such instabilities in the case of a one to two timestep ratio (two subcycles), achieving unconditional stability in an exponential sense for a linear problem. However, with three or more subcycles, the trapezoidal rule exhibits stability that becomes conditional, falling towards that of the central difference method as the number of subcycles increases. Instabilities over narrow timestep ranges, which become narrower as the model size increases, also appear with three or more subcycles. However, by moving the partition between timesteps one row of elements into the region suitable for integration with the larger timestep, the unstable timestep ranges become extremely narrow, even in simple systems with a few degrees of freedom. Accuracy is also improved. Use of a version of the Newmark algorithm that dissipates high frequencies minimises or eliminates these narrow bands of instability. Viscous damping is also shown to remove these instabilities, at the expense of having more effect on the low frequency response.
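For readers unfamiliar with the base method, the following sketch applies the standard single-timestep Newmark scheme to a one-degree-of-freedom linear oscillator. It does not implement the nodal partition or subcycling described in the abstract. The defaults beta = 1/4 and gamma = 1/2 give the trapezoidal rule; the function name and the test parameters are assumptions.

```python
def newmark_sdof(m, c, k, u0, v0, dt, steps, beta=0.25, gamma=0.5,
                 force=lambda t: 0.0):
    """Integrate m*a + c*v + k*u = force(t) with the Newmark method.

    Returns a list of (displacement, velocity) pairs, one per time level.
    """
    u, v = u0, v0
    a = (force(0.0) - c * v - k * u) / m           # initial acceleration
    m_eff = m + gamma * dt * c + beta * dt * dt * k
    states = [(u, v)]
    for n in range(1, steps + 1):
        # Predictors built from known quantities at the previous level.
        u_pred = u + dt * v + (0.5 - beta) * dt * dt * a
        v_pred = v + (1.0 - gamma) * dt * a
        # Enforce the equation of motion at the new time level.
        a = (force(n * dt) - c * v_pred - k * u_pred) / m_eff
        u = u_pred + beta * dt * dt * a
        v = v_pred + gamma * dt * a
        states.append((u, v))
    return states

# Undamped unit oscillator with a timestep above 2/omega = 2, the
# stability limit of the central difference method for this system.
states = newmark_sdof(m=1.0, c=0.0, k=1.0, u0=1.0, v0=0.0, dt=2.5, steps=1000)
print(max(abs(u) for u, _ in states))
```

With the trapezoidal rule the response of this undamped linear system stays bounded at any timestep, which is the unconditional stability that, according to the abstract, is lost with three or more subcycles.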