859 results for Fuzzy c-means algorithm
Abstract:
The objective of this work was to typify, through physicochemical parameters, honey from the Campos do Jordão microregion and to verify how samples group according to the climatic seasonality of production (summer and winter). Thirty honey samples were assessed from beekeepers located in the cities of Monteiro Lobato, Campos do Jordão, Santo Antônio do Pinhal, and São Bento do Sapucaí-SP, covering both honey production periods (November to February and July to September, during 2007 and 2008; n = 30). Samples were submitted to physicochemical analyses of total acidity, pH, moisture, water activity, density, amino acids, ash, color, and electrical conductivity, identifying physicochemical standards for the honey samples from both production periods. Next, we carried out a cluster analysis of the data using the k-means algorithm, which grouped the samples into two classes (summer and winter). Then, an Artificial Neural Network (ANN) was trained in a supervised manner using the backpropagation algorithm. According to the analysis, the knowledge gained through the ANN classified the samples with 80% accuracy. ANNs thus proved to be an effective tool for grouping honey samples from the Campos do Jordão region according to their physicochemical characteristics across the different production periods.
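As a hedged sketch of the pipeline described above (k-means grouping into two seasonal classes followed by supervised ANN training), the snippet below uses synthetic stand-in data; the nine feature columns only mimic the listed physicochemical parameters and are not the study's measurements.

```python
# Minimal sketch of the clustering + ANN pipeline described above.
# Synthetic data stands in for the 30 honey samples; the feature meanings are assumptions.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# 30 samples x 9 physicochemical features (acidity, pH, moisture, water activity,
# density, amino acids, ash, color, electrical conductivity) -- synthetic values only.
X = rng.normal(size=(30, 9))
X_std = StandardScaler().fit_transform(X)

# Unsupervised step: group samples into two classes (nominally summer / winter).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_std)

# Supervised step: train a backpropagation ANN on the k-means labels.
X_tr, X_te, y_tr, y_te = train_test_split(X_std, labels, test_size=0.3, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
ann.fit(X_tr, y_tr)
print("hold-out accuracy:", ann.score(X_te, y_te))
```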
Abstract:
Recently there has been considerable interest in dynamic textures due to the explosive growth of multimedia databases. In addition, dynamic texture appears in a wide range of videos, which makes it very important in applications concerned with modeling physical phenomena. Thus, dynamic textures have emerged as a new field of investigation that extends static, or spatial, textures to the spatio-temporal domain. In this paper, we propose a novel approach for dynamic texture segmentation based on automata theory and the k-means algorithm. In this approach, a feature vector is extracted for each pixel by applying deterministic partially self-avoiding walks on three orthogonal planes of the video. Then, these feature vectors are clustered by the well-known k-means algorithm. Although the k-means algorithm has shown interesting results, it only guarantees convergence to a local minimum, which affects the final segmentation. To overcome this drawback, we compare six initialization methods for k-means. The experimental results demonstrate the effectiveness of the proposed approach compared to state-of-the-art segmentation methods.
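The paper compares six k-means initialization strategies; the sketch below illustrates only the comparison protocol, with two commonly available initializations ('random' and 'k-means++') as stand-ins and the final inertia as a rough quality proxy.

```python
# Sketch: compare k-means initializations on per-pixel feature vectors.
# The six strategies from the paper are not reproduced; 'random' and 'k-means++'
# serve only as stand-ins for the comparison protocol.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
features = rng.normal(size=(5000, 12))   # e.g., one 12-dimensional descriptor per pixel

for init in ("random", "k-means++"):
    km = KMeans(n_clusters=4, init=init, n_init=10, random_state=0).fit(features)
    # Lower inertia = tighter clusters; a crude proxy for segmentation quality here.
    print(f"{init:10s} inertia = {km.inertia_:.1f}")
```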
Abstract:
A non-hierarchical k-means algorithm is used to cluster 47 years (1960–2006) of 10-day HYSPLIT backward trajectories to the Pico Mountain (PM) observatory on a seasonal basis. The resulting cluster centers identify the major transport pathways and collectively comprise a long-term climatology of transport to the observatory. The transport climatology improves our ability to interpret the observations made there and our understanding of pollution source regions affecting the station and the central North Atlantic region. I determine which pathways dominate transport to the observatory and examine the impacts of these transport patterns on the O3, NOy, NOx, and CO measurements made there during 2001–2006. Transport from the U.S., Canada, and the Atlantic most frequently reaches the station, but Europe, east Africa, and the Pacific can also contribute significantly depending on the season. Transport from Canada was correlated with the North Atlantic Oscillation (NAO) in spring and winter, whereas transport from the Pacific was uncorrelated with the NAO. The highest CO and O3 are observed during spring. Summer is also characterized by high CO and O3 and the highest NOy and NOx of any season. Previous studies at the station attributed the summertime high CO and O3 to transport of boreal wildfire emissions (for 2002–2004), and boreal fires continued to affect the station during 2005 and 2006. The particle dispersion model FLEXPART was used to calculate anthropogenic and biomass-burning CO tracer values at the station in an attempt to identify the regions responsible for the high CO and O3 observations during spring and for the biomass-burning impacts in summer.
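A minimal sketch of the trajectory-clustering step only (not the HYSPLIT runs or the FLEXPART tracer calculations): each back trajectory is flattened into a vector of hourly latitude/longitude values and the vectors are clustered with k-means. The trajectory shapes, the cluster count, and the synthetic coordinates are illustrative assumptions.

```python
# Sketch: k-means clustering of back trajectories into transport pathways.
# Trajectory positions are synthetic; shapes and k are illustrative choices.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
n_traj, n_steps = 1200, 240            # e.g., 10-day trajectories at hourly steps
lat = 38.5 + rng.normal(scale=5.0, size=(n_traj, n_steps)).cumsum(axis=1) * 0.01
lon = -28.4 + rng.normal(scale=5.0, size=(n_traj, n_steps)).cumsum(axis=1) * 0.01

# Flatten each trajectory into one feature vector [lat_1..lat_T, lon_1..lon_T].
X = np.hstack([lat, lon])

km = KMeans(n_clusters=5, n_init=20, random_state=0).fit(X)
# Cluster centers, reshaped back into (lat, lon) paths, are the mean transport pathways.
centers = km.cluster_centers_.reshape(5, 2, n_steps)
print("trajectories per pathway:", np.bincount(km.labels_))
```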
Abstract:
Magnetic resonance temperature imaging (MRTI) is recognized as a noninvasive means to provide temperature imaging for guidance in thermal therapies. The most common method of estimating temperature changes in the body using MR is by measuring the water proton resonant frequency (PRF) shift. Calculation of the complex phase difference (CPD) is the method of choice for measuring the PRF indirectly since it facilitates temperature mapping with high spatiotemporal resolution. Chemical shift imaging (CSI) techniques can provide the PRF directly with high sensitivity to temperature changes while minimizing artifacts commonly seen in CPD techniques. However, CSI techniques are currently limited by poor spatiotemporal resolution. This research intends to develop and validate a CSI-based MRTI technique with intentional spectral undersampling which allows relaxed parameters to improve spatiotemporal resolution. An algorithm based on autoregressive moving average (ARMA) modeling is developed and validated to help overcome limitations of Fourier-based analysis allowing highly accurate and precise PRF estimates. From the determined acquisition parameters and ARMA modeling, robust maps of temperature using the k-means algorithm are generated and validated in laser treatments in ex vivo tissue. The use of non-PRF based measurements provided by the technique is also investigated to aid in the validation of thermal damage predicted by an Arrhenius rate dose model.
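For context, the complex-phase-difference baseline mentioned above maps a phase change to a temperature change through the PRF thermal coefficient (about -0.01 ppm/°C); the sketch below implements that standard relation rather than the dissertation's ARMA/CSI method, and the field strength and echo time are arbitrary choices.

```python
# Sketch: PRF-shift temperature mapping from a complex phase difference (CPD baseline).
# Delta_T = Delta_phi / (2*pi * gamma * alpha * B0 * TE), with alpha ~ -0.01 ppm/degC.
import numpy as np

GAMMA = 42.58e6        # proton gyromagnetic ratio [Hz/T]
ALPHA = -0.01e-6       # PRF thermal coefficient [1/degC]

def cpd_temperature_change(img_ref, img_heat, b0=3.0, te=0.012):
    """Temperature-change map from two complex images (baseline and heated).

    b0 [T] and te [s] are illustrative acquisition parameters."""
    dphi = np.angle(img_heat * np.conj(img_ref))   # phase difference, wrapped to (-pi, pi]
    return dphi / (2.0 * np.pi * GAMMA * ALPHA * b0 * te)

# Example: a uniform +5 degC change encoded into synthetic complex images.
ref = np.ones((64, 64), dtype=complex)
heat = ref * np.exp(1j * 2.0 * np.pi * GAMMA * ALPHA * 3.0 * 0.012 * 5.0)
print(cpd_temperature_change(ref, heat).mean())    # ~5.0
```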
Abstract:
Two series of volume numberings are used: Memorie ecclesiastiche di Città di Castello, v. [1]-5; Memorie civili di Città di Castello, v.1-2 [i.e., v. 6-7]
Abstract:
We present a test for identifying clusters in high dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct clusters more difficult. In addition to the original conservative test, we propose a practical test with the same asymptotic behavior that performs well for a moderate number of points and moderate dimensionality. ACM Computing Classification System (1998): I.5.3.
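A rough numerical illustration of the effect described above: data drawn from a single spherical normal distribution (so no clusters exist) are split by k-means with k = 2, and a naive separation measure, the distance between the two fitted centers, grows with the dimensionality. The sample size and this particular measure are arbitrary choices.

```python
# Sketch: apparent two-cluster separation in purely spherical-normal data
# grows with dimensionality when k-means is forced to split it (k = 2).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
n = 100
for d in (2, 10, 50, 200):
    X = rng.standard_normal((n, d))            # one spherical Gaussian, no cluster structure
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    gap = np.linalg.norm(km.cluster_centers_[0] - km.cluster_centers_[1])
    print(f"d={d:4d}  distance between forced cluster centers = {gap:.2f}")
```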
Abstract:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be better achieved than is possible from a single source. Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools SVM and KNN were used to successfully distinguish between several soil samples. The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the k-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm. The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database, called PlasmoTFBM and focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
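The GCC algorithm itself is not specified in the abstract; as a generic illustration of folding constraints into k-means, the sketch below handles must-link constraints by collapsing linked points into weighted chunklets before running standard k-means. This is a simplification for illustration, not the dissertation's method.

```python
# Sketch: incorporating must-link constraints into k-means by merging constrained
# points into weighted chunklets (a generic simplification, not the GCC algorithm).
import numpy as np
from sklearn.cluster import KMeans

def merge_must_links(X, must_link):
    """Union-find grouping of points that must share a cluster."""
    parent = list(range(len(X)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for a, b in must_link:
        parent[find(a)] = find(b)
    groups = {}
    for i in range(len(X)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

def constrained_kmeans(X, must_link, k):
    groups = merge_must_links(X, must_link)
    reps = np.array([X[g].mean(axis=0) for g in groups])      # one representative per chunklet
    weights = np.array([len(g) for g in groups], dtype=float)
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    rep_labels = km.fit_predict(reps, sample_weight=weights)
    labels = np.empty(len(X), dtype=int)
    for g, lab in zip(groups, rep_labels):
        labels[g] = lab                                        # propagate back to members
    return labels

# Toy usage: two well-separated blobs with one must-link pair bridging them.
X = np.vstack([np.random.default_rng(4).normal(0, 0.3, (20, 2)),
               np.random.default_rng(5).normal(3, 0.3, (20, 2))])
print(constrained_kmeans(X, must_link=[(0, 25)], k=2))
```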
Abstract:
SMART GRIDS - COMPUTATIONAL MODELS DEVELOPMENT AND DEMAND SIDE MANAGEMENT APPLICATIONS. This thesis focuses on the development of computational models and their applications to demand-side management within the smart grid scope. The performance of the electrical network players is studied and a domestic prosumer model is presented. The economic dispatch problem, considering production forecasts and energy consumption obtained from artificial neural networks, is also presented. The existing demand response models are studied and a computational tool based on the fuzzy subtractive clustering algorithm is developed. Energy consumption profiles and operating modes are analyzed, including a brief analysis of the electric vehicle and of contingencies in the electrical network. Consumer energy management applications within the scope of the InovGrid pilot project are presented. Computational systems are developed for the acquisition, monitoring, control, and supervision of consumption data provided by smart meters, allowing consumer actions to be incorporated into the management of their electrical energy consumption.
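The computational tool above is built on fuzzy subtractive clustering; the snippet below sketches that algorithm in its common potential-reduction form, applied to synthetic consumption profiles. The radii, stopping ratio, and data are illustrative assumptions rather than the thesis settings.

```python
# Sketch: fuzzy subtractive clustering (potential-based center selection).
# Radii r_a, r_b and the stopping ratio are illustrative, not the thesis settings.
import numpy as np

def subtractive_clustering(X, r_a=0.5, stop_ratio=0.15):
    r_b = 1.5 * r_a
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)      # pairwise squared distances
    potential = np.exp(-4.0 * d2 / r_a**2).sum(axis=1)       # initial potential of each point
    centers = []
    p_first = potential.max()
    while True:
        i = int(np.argmax(potential))
        if potential[i] < stop_ratio * p_first:
            break
        centers.append(X[i])
        # Reduce the potential of points near the newly accepted center.
        potential -= potential[i] * np.exp(-4.0 * d2[i] / r_b**2)
    return np.array(centers)

# Toy usage on synthetic daily consumption profiles (24 values each, two load groups).
rng = np.random.default_rng(6)
profiles = np.vstack([rng.normal(0.2, 0.05, (40, 24)), rng.normal(0.8, 0.05, (40, 24))])
# Radius chosen relative to the scale of these synthetic 24-dimensional profiles.
print(subtractive_clustering(profiles, r_a=2.0).shape)   # expect (2, 24): one center per group
```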
Abstract:
The data acquired by remote sensing systems allow thematic maps of the Earth's surface to be obtained by classifying the registered images. This implies the identification and categorization of all pixels into land cover classes. Traditionally, methods based on statistical parameters have been widely used, although they show some disadvantages. Some authors indicate that methods based on artificial intelligence may be a good alternative. Thus, fuzzy classifiers, which are based on fuzzy logic, include additional information in the classification process through rule-based systems. In this work, we propose the use of a genetic algorithm (GA) to select the optimal and minimal set of fuzzy rules to classify remotely sensed images. The GA input was obtained from the training space determined by two uncorrelated spectral bands (2-D scatter diagrams), irregularly partitioned by five linguistic terms defined on each band. The proposed methodology has been applied to Landsat-TM images and has shown that this set of rules provides a higher accuracy level in the classification process.
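A hedged sketch of the idea: a binary chromosome selects a subset of the 5 x 5 candidate fuzzy rules, and fitness rewards classification accuracy while penalizing rule count. The synthetic bands, the rule-consequent assignment, and all GA settings are assumptions, not the paper's configuration.

```python
# Sketch: a GA selecting a compact set of fuzzy rules for pixel classification.
# Synthetic "bands", triangular partitions, consequent assignment, and GA settings
# are illustrative assumptions, not the paper's configuration.
import numpy as np

rng = np.random.default_rng(7)

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

# Five evenly spaced linguistic terms per band, on reflectances scaled to [0, 1].
centers = np.linspace(0.0, 1.0, 5)
def memberships(band):
    return np.stack([tri(band, c - 0.25, c, c + 0.25) for c in centers], axis=1)

# Synthetic training pixels: two spectral bands, three land-cover classes.
X = rng.random((600, 2))
y = (X[:, 0] > 0.5).astype(int) + (X[:, 1] > 0.5).astype(int)

M1, M2 = memberships(X[:, 0]), memberships(X[:, 1])
fire = (M1[:, :, None] * M2[:, None, :]).reshape(len(X), 25)   # firing strength of all 25 rules
# Consequent of each candidate rule = class with the highest total firing strength.
consequent = np.array([np.bincount(y, weights=fire[:, r], minlength=3).argmax() for r in range(25)])

def fitness(mask):
    """Accuracy of the selected rule subset minus a small penalty on its size."""
    if mask.sum() == 0:
        return -1.0
    active = np.where(mask)[0]
    pred = consequent[active[fire[:, active].argmax(axis=1)]]   # winner-rule classification
    return (pred == y).mean() - 0.02 * mask.sum() / 25.0

def tourney(pop, fit):
    """Binary tournament selection."""
    i, j = rng.integers(len(pop), size=2)
    return pop[i] if fit[i] >= fit[j] else pop[j]

pop = rng.integers(0, 2, size=(30, 25))                # binary chromosomes over the 25 rules
for gen in range(40):
    fit = np.array([fitness(m) for m in pop])
    new = [pop[fit.argmax()].copy()]                   # elitism
    while len(new) < len(pop):
        child = np.where(rng.random(25) < 0.5, tourney(pop, fit), tourney(pop, fit))  # crossover
        child ^= (rng.random(25) < 0.03)               # bit-flip mutation
        new.append(child)
    pop = np.array(new)

best = pop[np.argmax([fitness(m) for m in pop])]
print("rules kept:", int(best.sum()), " fitness:", round(float(fitness(best)), 3))
```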
Abstract:
Active queue management (AQM) policies are router queue management policies that allow for the detection of network congestion, the notification of such occurrences to the hosts on the network borders, and the adoption of a suitable control policy. This paper proposes the adoption of a fuzzy proportional integral (FPI) controller as an active queue manager for Internet routers. The analytical design of the proposed FPI controller is carried out in analogy with a proportional integral (PI) controller, which has recently been proposed for AQM. A genetic algorithm is proposed for tuning the FPI controller parameters with respect to optimal disturbance rejection. The paper describes the FPI controller design methodology and presents the results of a comparison with random early detection (RED), tail drop, and the PI controller.
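As a generic illustration of a fuzzy PI structure for AQM (not the paper's analytically designed controller nor its GA-tuned parameters), the sketch below maps the queue-length error and its change to an increment of the packet-drop probability through a small zero-order Sugeno rule table.

```python
# Sketch: a generic fuzzy PI step for AQM. The membership partitions, rule table,
# and scaling are illustrative; they are not the paper's analytically derived design.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Three linguistic terms (Negative, Zero, Positive) on each normalized input.
def fuzzify(x):
    return np.array([tri(x, -2.0, -1.0, 0.0), tri(x, -1.0, 0.0, 1.0), tri(x, 0.0, 1.0, 2.0)])

# Zero-order Sugeno rule table: singleton increments of the drop probability,
# indexed by (error term, error-change term).
DELTA = np.array([[-0.02, -0.01,  0.00],
                  [-0.01,  0.00,  0.01],
                  [ 0.00,  0.01,  0.02]])

def fuzzy_pi_step(p, queue, queue_prev, q_ref=200.0, scale=200.0):
    """One control step: update drop probability p from the current queue length."""
    e = np.clip((queue - q_ref) / scale, -1.0, 1.0)            # normalized error
    de = np.clip((queue - queue_prev) / scale, -1.0, 1.0)      # normalized error change
    w = np.outer(fuzzify(e), fuzzify(de))                      # rule firing strengths
    dp = float((w * DELTA).sum() / (w.sum() + 1e-12))          # weighted-average defuzzification
    return float(np.clip(p + dp, 0.0, 1.0))

# Toy usage: a queue above the 200-packet reference pushes the drop probability up.
print(fuzzy_pi_step(p=0.05, queue=320, queue_prev=300))
```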
Abstract:
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed.
Abstract:
A deep theoretical analysis of the graph cut image segmentation framework presented in this paper simultaneously translates into important contributions in several directions. The most important practical contribution of this work is a full theoretical description, and implementation, of a novel powerful segmentation algorithm, GC(max). The output of GC(max) coincides with a version of a segmentation algorithm known as Iterative Relative Fuzzy Connectedness, IRFC. However, GC(max) is considerably faster than the classic IRFC algorithm, which we prove theoretically and show experimentally. Specifically, we prove that, in the worst case scenario, the GC(max) algorithm runs in linear time with respect to the variable M=|C|+|Z|, where |C| is the image scene size and |Z| is the size of the allowable range, Z, of the associated weight/affinity function. For most implementations, Z is identical to the set of allowable image intensity values, and its size can be treated as small with respect to |C|, meaning that O(M)=O(|C|). In such a situation, GC(max) runs in linear time with respect to the image size |C|. We show that the output of GC(max) constitutes a solution of a graph cut energy minimization problem, in which the energy is defined as the ℓ∞ norm ‖F_P‖_∞ of the map F_P that associates, with every element e from the boundary of an object P, its weight w(e). This formulation brings IRFC algorithms to the realm of the graph cut energy minimizers, with energy functions ‖F_P‖_q for q ∈ [1, ∞]. Of these, the best known minimization problem is for the energy ‖F_P‖_1, which is solved by the classic min-cut/max-flow algorithm, often referred to as the Graph Cut algorithm. We notice that a minimization problem for ‖F_P‖_q, q ∈ [1, ∞), is identical to that for ‖F_P‖_1 when the original weight function w is replaced by w^q. Thus, any algorithm GC(sum) solving the ‖F_P‖_1 minimization problem also solves the one for ‖F_P‖_q with q ∈ [1, ∞), so just two algorithms, GC(sum) and GC(max), are enough to solve all ‖F_P‖_q-minimization problems. We also show that, for any fixed weight assignment, the solutions of the ‖F_P‖_q-minimization problems converge to a solution of the ‖F_P‖_∞-minimization problem (the fact that ‖F_P‖_∞ = lim_{q→∞} ‖F_P‖_q is not enough to deduce that). An experimental comparison of the performance of the GC(max) and GC(sum) algorithms is included. This concentrates on comparing the actual (as opposed to provable worst-scenario) running times of the algorithms, as well as the influence of the choice of seeds on the output.
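Restating the energy family from the abstract in standard notation (a notational summary only, using the abstract's own definitions):

```latex
% Boundary-weight map of a segmented object P:  F_P(e) = w(e)  for every boundary element e of P.
% Family of graph-cut energies considered:
\varepsilon_q(P) = \lVert F_P \rVert_q
  = \Bigl(\textstyle\sum_{e \in \mathrm{bd}(P)} w(e)^{q}\Bigr)^{1/q},
  \quad q \in [1, \infty),
\qquad
\varepsilon_\infty(P) = \lVert F_P \rVert_\infty = \max_{e \in \mathrm{bd}(P)} w(e).
% GC_sum (min-cut/max-flow) minimizes \varepsilon_1; GC_max / IRFC minimizes \varepsilon_\infty.
% Minimizing \lVert F_P \rVert_q under weights w is equivalent to minimizing
% \lVert F_P \rVert_1 under weights w^{q}, so GC_sum also covers every finite q.
```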
Abstract:
This paper presents a structural damage detection methodology based on genetic algorithms and dynamic parameters. Three chromosomes are used to codify an individual in the population. The first and second chromosomes locate and quantify damage, respectively. The third permits the self-adaptation of the genetic parameters. The natural frequencies and mode shapes are used to formulate the objective function. A numerical analysis was performed for several truss structures under different damage scenarios. The results have shown that the methodology can reliably identify damage scenarios using noisy measurements and that it results in only a few misidentified elements.
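The abstract does not give the exact form of the objective function; the sketch below shows one common formulation that combines natural-frequency residuals with a mode-shape (MAC) term, purely as an illustration of how such a GA fitness could be assembled.

```python
# Sketch: a typical damage-identification objective built from natural frequencies
# and mode shapes (MAC). The weighting and exact form are illustrative, not the paper's.
import numpy as np

def mac(phi_a, phi_b):
    """Modal Assurance Criterion between two mode-shape vectors (1 = identical shape)."""
    return (phi_a @ phi_b) ** 2 / ((phi_a @ phi_a) * (phi_b @ phi_b))

def objective(freq_meas, freq_model, modes_meas, modes_model):
    """Candidate-damage fitness: frequency residuals plus mode-shape mismatch."""
    freq_term = np.sum(((freq_meas - freq_model) / freq_meas) ** 2)
    shape_term = sum(1.0 - mac(a, b) for a, b in zip(modes_meas, modes_model))
    return freq_term + shape_term     # the GA would minimize this value

# Toy usage with three modes of a hypothetical truss model.
f_meas, f_mod = np.array([12.1, 33.4, 61.0]), np.array([12.4, 33.0, 62.2])
phi = [np.array([0.2, 0.5, 1.0]), np.array([0.6, 1.0, -0.4]), np.array([1.0, -0.7, 0.3])]
print(objective(f_meas, f_mod, phi, [p * 0.98 for p in phi]))
```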
Abstract:
In this paper, a learning algorithm for adjusting the weight coefficients of the Cascade Neo-Fuzzy Neural Network (CNFNN) in sequential mode is introduced. The architecture has a structure similar to the Cascade-Correlation Learning Architecture proposed by S.E. Fahlman and C. Lebiere, but differs from it in the type of artificial neurons. The CNFNN consists of neo-fuzzy neurons, which can be adjusted using high-speed linear learning procedures. The proposed CNFNN is characterized by a high learning rate and a small required training sample, and its operation can be described by fuzzy linguistic “if-then” rules, providing “transparency” of the obtained results compared with conventional neural networks. The online learning algorithm allows input data to be processed sequentially in real-time mode.
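A minimal sketch of a single neo-fuzzy neuron trained online; the cascade construction and the paper's exact learning procedure are not reproduced, and the membership partition, learning rate, and normalized update are illustrative assumptions.

```python
# Sketch: one neo-fuzzy neuron trained online. Each input passes through a bank of
# triangular membership functions whose outputs are weighted and summed; the weights
# are adapted sample-by-sample. Partition size and learning rate are illustrative.
import numpy as np

class NeoFuzzyNeuron:
    def __init__(self, n_inputs, n_mf=5, lr=0.1):
        self.centers = np.linspace(0.0, 1.0, n_mf)          # MF centers on [0, 1]
        self.width = self.centers[1] - self.centers[0]
        self.w = np.zeros((n_inputs, n_mf))                  # one weight per (input, MF)
        self.lr = lr

    def _mu(self, x):
        """Triangular memberships, shape (n_inputs, n_mf)."""
        return np.clip(1.0 - np.abs(x[:, None] - self.centers) / self.width, 0.0, 1.0)

    def predict(self, x):
        return float((self._mu(x) * self.w).sum())

    def update(self, x, target):
        """One online (sequential) step: normalized gradient-style weight correction."""
        mu = self._mu(x)
        err = target - float((mu * self.w).sum())
        self.w += self.lr * err * mu / (np.square(mu).sum() + 1e-12)
        return err

# Toy usage: learn y = 0.3*x1 + 0.7*x2 from a stream of samples in [0, 1]^2.
rng = np.random.default_rng(8)
neuron = NeoFuzzyNeuron(n_inputs=2)
for _ in range(3000):
    x = rng.random(2)
    neuron.update(x, 0.3 * x[0] + 0.7 * x[1])
print(round(neuron.predict(np.array([0.5, 0.5])), 3))   # approximately 0.5
```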