931 resultados para k-Means algorithm
Resumo:
The aim of this study was to group temporal profiles of 10-day composites NDVI product by similarity, which was obtained by the SPOT Vegetation sensor, for municipalities with high soybean production in the state of Paraná, Brazil, in the 2005/2006 cropping season. Data mining is a valuable tool that allows extracting knowledge from a database, identifying valid, new, potentially useful and understandable patterns. Therefore, it was used the methods for clusters generation by means of the algorithms K-Means, MAXVER and DBSCAN, implemented in the WEKA software package. Clusters were created based on the average temporal profiles of NDVI of the 277 municipalities with high soybean production in the state and the best results were found with the K-Means algorithm, grouping the municipalities into six clusters, considering the period from the beginning of October until the end of March, which is equivalent to the crop vegetative cycle. Half of the generated clusters presented spectro-temporal pattern, a characteristic of soybeans and were mostly under the soybean belt in the state of Paraná, which shows good results that were obtained with the proposed methodology as for identification of homogeneous areas. These results will be useful for the creation of regional soybean "masks" to estimate the planted area for this crop.
Resumo:
Positron Emission Tomography (PET) using 18F-FDG is playing a vital role in the diagnosis and treatment planning of cancer. However, the most widely used radiotracer, 18F-FDG, is not specific for tumours and can also accumulate in inflammatory lesions as well as normal physiologically active tissues making diagnosis and treatment planning complicated for the physicians. Malignant, inflammatory and normal tissues are known to have different pathways for glucose metabolism which could possibly be evident from different characteristics of the time activity curves from a dynamic PET acquisition protocol. Therefore, we aimed to develop new image analysis methods, for PET scans of the head and neck region, which could differentiate between inflammation, tumour and normal tissues using this functional information within these radiotracer uptake areas. We developed different dynamic features from the time activity curves of voxels in these areas and compared them with the widely used static parameter, SUV, using Gaussian Mixture Model algorithm as well as K-means algorithm in order to assess their effectiveness in discriminating metabolically different areas. Moreover, we also correlated dynamic features with other clinical metrics obtained independently of PET imaging. The results show that some of the developed features can prove to be useful in differentiating tumour tissues from inflammatory regions and some dynamic features also provide positive correlations with clinical metrics. If these proposed methods are further explored then they can prove to be useful in reducing false positive tumour detections and developing real world applications for tumour diagnosis and contouring.
Resumo:
Biometrics deals with the physiological and behavioral characteristics of an individual to establish identity. Fingerprint based authentication is the most advanced biometric authentication technology. The minutiae based fingerprint identification method offer reasonable identification rate. The feature minutiae map consists of about 70-100 minutia points and matching accuracy is dropping down while the size of database is growing up. Hence it is inevitable to make the size of the fingerprint feature code to be as smaller as possible so that identification may be much easier. In this research, a novel global singularity based fingerprint representation is proposed. Fingerprint baseline, which is the line between distal and intermediate phalangeal joint line in the fingerprint, is taken as the reference line. A polygon is formed with the singularities and the fingerprint baseline. The feature vectors are the polygonal angle, sides, area, type and the ridge counts in between the singularities. 100% recognition rate is achieved in this method. The method is compared with the conventional minutiae based recognition method in terms of computation time, receiver operator characteristics (ROC) and the feature vector length. Speech is a behavioural biometric modality and can be used for identification of a speaker. In this work, MFCC of text dependant speeches are computed and clustered using k-means algorithm. A backpropagation based Artificial Neural Network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with the VQ based Euclidean minimum classifier. Biometric systems that use a single modality are usually affected by problems like noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. Multifinger feature level fusion based fingerprint recognition is developed and the performances are measured in terms of the ROC curve. Score level fusion of fingerprint and speech based recognition system is done and 100% accuracy is achieved for a considerable range of matching threshold
Resumo:
Any automatically measurable, robust and distinctive physical characteristic or personal trait that can be used to identify an individual or verify the claimed identity of an individual, referred to as biometrics, has gained significant interest in the wake of heightened concerns about security and rapid advancements in networking, communication and mobility. Multimodal biometrics is expected to be ultra-secure and reliable, due to the presence of multiple and independent—verification clues. In this study, a multimodal biometric system utilising audio and facial signatures has been implemented and error analysis has been carried out. A total of one thousand face images and 250 sound tracks of 50 users are used for training the proposed system. To account for the attempts of the unregistered signatures data of 25 new users are tested. The short term spectral features were extracted from the sound data and Vector Quantization was done using K-means algorithm. Face images are identified based on Eigen face approach using Principal Component Analysis. The success rate of multimodal system using speech and face is higher when compared to individual unimodal recognition systems
Resumo:
The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD affected child. In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified
Resumo:
Global communicationrequirements andloadimbalanceof someparalleldataminingalgorithms arethe major obstacles to exploitthe computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication costin parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operationwhichhinders thescalabilityoftheapproach.Thisworkstudiesadifferentparallelformulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature ofthe centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means ofmulti-dimensional binary searchtrees. The approachcanalso be extended to accommodate an approximation error which allows a further reduction ofthe communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing element
Resumo:
Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
Resumo:
Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploit the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs.
Resumo:
The use of the maps obtained from remote sensing orbital images submitted to digital processing became fundamental to optimize conservation and monitoring actions of the coral reefs. However, the accuracy reached in the mapping of submerged areas is limited by variation of the water column that degrades the signal received by the orbital sensor and introduces errors in the final result of the classification. The limited capacity of the traditional methods based on conventional statistical techniques to solve the problems related to the inter-classes took the search of alternative strategies in the area of the Computational Intelligence. In this work an ensemble classifiers was built based on the combination of Support Vector Machines and Minimum Distance Classifier with the objective of classifying remotely sensed images of coral reefs ecosystem. The system is composed by three stages, through which the progressive refinement of the classification process happens. The patterns that received an ambiguous classification in a certain stage of the process were revalued in the subsequent stage. The prediction non ambiguous for all the data happened through the reduction or elimination of the false positive. The images were classified into five bottom-types: deep water; under-water corals; inter-tidal corals; algal and sandy bottom. The highest overall accuracy (89%) was obtained from SVM with polynomial kernel. The accuracy of the classified image was compared through the use of error matrix to the results obtained by the application of other classification methods based on a single classifier (neural network and the k-means algorithm). In the final, the comparison of results achieved demonstrated the potential of the ensemble classifiers as a tool of classification of images from submerged areas subject to the noise caused by atmospheric effects and the water column
Resumo:
Image segmentation is one of the image processing problems that deserves special attention from the scientific community. This work studies unsupervised methods to clustering and pattern recognition applicable to medical image segmentation. Natural Computing based methods have shown very attractive in such tasks and are studied here as a way to verify it's applicability in medical image segmentation. This work treats to implement the following methods: GKA (Genetic K-means Algorithm), GFCMA (Genetic FCM Algorithm), PSOKA (PSO and K-means based Clustering Algorithm) and PSOFCM (PSO and FCM based Clustering Algorithm). Besides, as a way to evaluate the results given by the algorithms, clustering validity indexes are used as quantitative measure. Visual and qualitative evaluations are realized also, mainly using data given by the BrainWeb brain simulator as ground truth
Resumo:
This work proposes a kinematic control scheme, using visual feedback for a robot arm with five degrees of freedom. Using computational vision techniques, a method was developed to determine the cartesian 3d position and orientation of the robot arm (pose) using a robot image obtained through a camera. A colored triangular label is disposed on the robot manipulator tool and efficient heuristic rules are used to obtain the vertexes of that label in the image. The tool pose is obtained from those vertexes through numerical methods. A color calibration scheme based in the K-means algorithm was implemented to guarantee the robustness of the vision system in the presence of light variations. The extrinsic camera parameters are computed from the image of four coplanar points whose cartesian 3d coordinates, related to a fixed frame, are known. Two distinct poses of the tool, initial and final, obtained from image, are interpolated to generate a desired trajectory in cartesian space. The error signal in the proposed control scheme consists in the difference between the desired tool pose and the actual tool pose. Gains are applied at the error signal and the signal resulting is mapped in joint incrementals using the pseudoinverse of the manipulator jacobian matrix. These incrementals are applied to the manipulator joints moving the tool to the desired pose
Resumo:
Navigation based on visual feedback for robots, working in a closed environment, can be obtained settling a camera in each robot (local vision system). However, this solution requests a camera and capacity of local processing for each robot. When possible, a global vision system is a cheapest solution for this problem. In this case, one or a little amount of cameras, covering all the workspace, can be shared by the entire team of robots, saving the cost of a great amount of cameras and the associated processing hardware needed in a local vision system. This work presents the implementation and experimental results of a global vision system for mobile mini-robots, using robot soccer as test platform. The proposed vision system consists of a camera, a frame grabber and a computer (PC) for image processing. The PC is responsible for the team motion control, based on the visual feedback, sending commands to the robots through a radio link. In order for the system to be able to unequivocally recognize each robot, each one has a label on its top, consisting of two colored circles. Image processing algorithms were developed for the eficient computation, in real time, of all objects position (robot and ball) and orientation (robot). A great problem found was to label the color, in real time, of each colored point of the image, in time-varying illumination conditions. To overcome this problem, an automatic camera calibration, based on clustering K-means algorithm, was implemented. This method guarantees that similar pixels will be clustered around a unique color class. The obtained experimental results shown that the position and orientation of each robot can be obtained with a precision of few millimeters. The updating of the position and orientation was attained in real time, analyzing 30 frames per second
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
O avanço nas áreas de comunicação sem fio e microeletrônica permite o desenvolvimento de equipamentos micro sensores com capacidade de monitorar grandes regiões. Formadas por milhares de nós sensores, trabalhando de forma colaborativa, as Redes de Sensores sem Fio apresentam severas restrições de energia, devido à capacidade limitada das baterias dos nós que compõem a rede. O consumo de energia pode ser minimizado, permitindo que apenas alguns nós especiais, chamados de Cluster Head, sejam responsáveis por receber os dados dos nós que formam seu cluster e propagar estes dados para um ponto de coleta denominado Estação Base. A escolha do Cluster Head ideal influencia no aumento do período de estabilidade da rede, maximizando seu tempo de vida útil. A proposta, apresentada nesta dissertação, utiliza Lógica Fuzzy e algoritmo k-means com base em informações centralizadas na Estação Base para eleição do Cluster Head ideal em Redes de Sensores sem Fio heterogêneas. Os critérios usados para seleção do Cluster Head são baseados na centralidade do nó, nível de energia e proximidade para a Estação Base. Esta dissertação apresenta as desvantagens de utilização de informações locais para eleição do líder do cluster e a importância do tratamento discriminatório sobre as discrepâncias energéticas dos nós que formam a rede. Esta proposta é comparada com os algoritmos Low Energy Adaptative Clustering Hierarchy (LEACH) e Distributed energy-efficient clustering algorithm for heterogeneous Wireless sensor networks (DEEC). Esta comparação é feita, utilizando o final do período de estabilidade, como também, o tempo de vida útil da rede.
Resumo:
The objective of this work was to typify, through physicochemical parameters, honey from Campos do Jordão’s microrregion, and verify how samples are grouped in accordance with the climatic production seasonality (summer and winter). It were assessed 30 samples of honey from beekeepers located in the cities of Monteiro Lobato, Campos do Jordão, Santo Antonio do Pinhal e São Bento do Sapucaí-SP, regarding both periods of honey production (November to February; July to September, during 2007 and 2008; n = 30). Samples were submitted to physicochemical analysis of total acidity, pH, humidity, water activity, density, aminoacids, ashes, color and electrical conductivity, identifying physicochemical standards of honey samples from both periods of production. Next, we carried out a cluster analysis of data using k-means algorithm, which grouped the samples into two classes (summer and winter). Thus, there was a supervised training of an Artificial Neural Network (ANN) using backpropagation algorithm. According to the analysis, the knowledge gained through the ANN classified the samples with 80% accuracy. It was observed that the ANNs have proved an effective tool to group samples of honey of the region of Campos do Jordao according to their physicochemical characteristics, depending on the different production periods.