164 results for Fuzzy C-Means clustering
Abstract:
This paper presents the design and implementation of an embedded soft sensor, i.e., a generic and autonomous hardware module that can be applied to many complex plants in which a certain variable cannot be directly measured. It is implemented based on a fuzzy identification algorithm called "Limited Rules", employed to model continuous nonlinear processes. The fuzzy model has a Takagi-Sugeno-Kang structure, and the premise parameters are defined based on the Fuzzy C-Means (FCM) clustering algorithm. The firmware contains the soft sensor and runs online, estimating the target variable from other available variables. Tests have been performed using a simulated pH neutralization plant. The results of the embedded soft sensor have been considered satisfactory. A complete embedded inferential control system is also presented, including a soft sensor and a PID controller. (c) 2007, ISA. Published by Elsevier Ltd. All rights reserved.
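As an illustration of the FCM step mentioned in this abstract, the following is a minimal pure-Python sketch of the standard Fuzzy C-Means iteration (alternating center and membership updates). It is not the paper's firmware or its "Limited Rules" algorithm; the function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, the random initialization, and the stopping tolerance are all assumptions chosen for the example.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length numeric tuples.

    Hypothetical helper for illustration; m is the fuzzifier (m > 1).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # Membership update from squared distances to each center.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim))
                  for k in range(c)]
            if min(d2) == 0.0:
                # Point coincides with a center: assign crisp membership.
                zero = d2.index(0.0)
                new = [1.0 if k == zero else 0.0 for k in range(c)]
            else:
                inv = [dk ** (-1.0 / (m - 1.0)) for dk in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:
            break
    return centers, u
```

In a Takagi-Sugeno-Kang model, cluster centers of this kind are commonly used to place the membership functions of the rule premises; how the paper maps them to premise parameters is not stated in the abstract.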
Abstract:
There is a family of well-known external clustering validity indexes to measure the degree of compatibility or similarity between two hard partitions of a given data set, including partitions with different numbers of categories. A unified, fully equivalent set-theoretic formulation for an important class of such indexes was derived and extended to the fuzzy domain in a previous work by the author [Campello, R.J.G.B., 2007. A fuzzy extension of the Rand index and other related indexes for clustering and classification assessment. Pattern Recognition Lett., 28, 833-841]. However, the proposed fuzzy set-theoretic formulation is not valid as a general approach for comparing two fuzzy partitions of data. Instead, it is an approach for comparing a fuzzy partition against a hard referential partition of the data into mutually disjoint categories. In this paper, generalized external indexes for comparing two data partitions with overlapping categories are introduced. These indexes can be used as general measures for comparing two partitions of the same data set into overlapping categories. An important issue that is seldom touched on in the literature is also addressed in the paper, namely, how to compare two partitions of different subsamples of data. A number of pedagogical examples and three simulation experiments are presented and analyzed in detail. A review of recent related work compiled from the literature is also provided. (c) 2010 Elsevier B.V. All rights reserved.
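For readers unfamiliar with external validity indexes, the sketch below computes the classical (hard) Rand index, the starting point that the cited work extends. It covers only the hard-partition case; the fuzzy and overlapping-category generalizations proposed in the paper are not reproduced here. The function name `rand_index` is an assumption for the example.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two hard partitions agree.

    A pair agrees when both partitions put the two objects together,
    or both put them apart. The label values themselves do not matter,
    and the partitions may have different numbers of categories.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total
```

Because the index is defined over pairs of objects rather than over label values, `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0 even though the labels are swapped.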
Abstract:
This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e., repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms of interest are provided. Examples with computational experiments and statistical analyses are also presented.
Abstract:
This work aimed to study the causes of variation in the prices of Nelore cattle from selection herds sold at auction, in order to assess the influence of genetic evaluations and conformation judging on those prices. To that end, the sale prices of 426 Nelore cattle were recorded at 12 auctions held in various Brazilian locations (Central-West, North, and Southeast regions) between 2002 and 2005. The mean price was R$ 3,325.49, with a minimum of R$ 1,400.00 and a maximum of R$ 10,500.00. These data were entered together with other information presented in the auction catalogs. The recorded information included the sex of each animal, the name of the auction, and the expected progeny differences (EPDs; DEPs in Portuguese) reported in the catalogs. Besides the influence of the catalog information, the influence of information on the sires of the animals sold at auction was also evaluated, covering their EPDs published in a sire summary for the breed and the scores of their progeny in judging. The statistical methods applied were analyses of variance and cluster analysis (K-means method). As a result, animals with genetic superiority in traits related to weight performance, considering direct and maternal effects, were found to be valued more highly when sold at auction. In contrast, the sires' scores in judging had no significant influence on the mean auction sale prices of their progeny.
Abstract:
Background: Since establishing universal free access to antiretroviral therapy in 1996, the Brazilian Health System has increased the number of centers providing HIV/AIDS outpatient care from 33 to 540. There had been no formal monitoring of the quality of these services until a survey of 336 AIDS health centers across 7 Brazilian states was undertaken in 2002. Managers of the services were asked to assess their clinics according to parameters of service inputs and service delivery processes. This report analyzes the survey results and identifies predictors of the overall quality of service delivery. Methods: The survey involved completion of a multiple-choice questionnaire comprising 107 parameters of service inputs and processes of delivering care, with responses assessed according to their likely impact on service quality using a 3-point scale. K-means clustering was used to group these services according to their scored responses. Logistic regression analysis was performed to identify predictors of high service quality. Results: The questionnaire was completed by 95.8% (322) of the managers of the sites surveyed. Most sites scored about 50% of the benchmark expectation. K-means clustering analysis identified four quality levels within which services could be grouped: 76 services (24%) were classed as level 1 (best), 53 (16%) as level 2 (medium), 113 (35%) as level 3 (poor), and 80 (25%) as level 4 (very poor). Parameters of service delivery processes were more important than those relating to service inputs for determining the quality classification. Predictors of quality services included larger care sites, specialization for HIV/AIDS, and location within large municipalities. Conclusion: The survey demonstrated highly variable levels of HIV/AIDS service quality across the sites. Many sites were found to have deficiencies in service delivery processes that could benefit from quality improvement initiatives. These findings could have implications for how HIV/AIDS services are planned in Brazil to achieve quality standards, such as where service sites should be located and their size and staffing requirements. A set of service delivery indicators has been identified that could be used for routine monitoring of HIV/AIDS service delivery in Brazil (and potentially in other similar settings).
Abstract:
Large parity-violating longitudinal single-spin asymmetries $A_L^{e^+} = 0.86^{+0.30}_{-0.14}$ and $A_L^{e^-} = 0.88^{+0.12}_{-0.71}$ are observed for inclusive high transverse momentum electrons and positrons in polarized $p + p$ collisions at a center-of-mass energy of $\sqrt{s} = 500$ GeV with the PHENIX detector at RHIC. These $e^{\pm}$ come mainly from the decay of $W^{\pm}$ and $Z^0$ bosons, and their asymmetries directly demonstrate parity violation in the couplings of the $W^{\pm}$ to the light quarks. The observed electron and positron yields were used to estimate $W^{\pm}$ boson production cross sections for the $e^{\pm}$ channels of $\sigma(pp \to W^+ X) \times \mathrm{BR}(W^+ \to e^+ \nu_e) = 144.1 \pm 21.2\,(\mathrm{stat})\,^{+3.4}_{-10.3}\,(\mathrm{syst}) \pm 21.6\,(\mathrm{norm})$ pb, and $\sigma(pp \to W^- X) \times \mathrm{BR}(W^- \to e^- \bar{\nu}_e) = 3.17 \pm 12.1\,(\mathrm{stat})\,^{+10.1}_{-8.2}\,(\mathrm{syst}) \pm 4.8\,(\mathrm{norm})$ pb.
Abstract:
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
In the current work, we studied the effect of the nonionic detergent dodecyloctaethyleneglycol, C(12)E(8), on the structure and oligomeric form of the Na,K-ATPase membrane enzyme (sodium-potassium pump) in aqueous suspension, by means of small-angle X-ray scattering (SAXS). Samples composed of 2 mg/mL of Na,K-ATPase, extracted from rabbit kidney medulla, in the presence of a small amount of C(12)E(8) (0.005 mg/mL) and in larger concentrations ranging from 2.7 to 27 mg/mL did not present catalytic activity. Under this condition, an oligomerization of the alpha subunits is expected. SAXS data were analyzed by means of a global fitting procedure assuming that the scattering is due to two independent contributions: one from the enzyme and the other from C(12)E(8) micelles. At the low detergent concentration (0.005 mg/mL), the SAXS results showed that Na,K-ATPase is associated into aggregates larger than the (alpha beta)(2) form. When 2.7 mg/mL of C(12)E(8) is added, the data analysis revealed the presence of alpha(4) aggregates in the solution and some free micelles. Increasing the detergent amount up to 27 mg/mL does not disturb the alpha(4) aggregate: more micelles of the same size and shape are simply formed in proportion. We believe that our results contribute to a better understanding of how nonionic detergents induce subunit dissociation and reassembly to minimize the exposure of hydrophobic residues to the aqueous solvent.
Abstract:
One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms of interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets. (C) 2010 Elsevier B.V. All rights reserved.
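The "systematic (repetitive)" baseline described in this abstract can be sketched as a double loop over candidate cluster counts and random restarts. The code below is only an illustration of that idea, not the paper's procedure: the function names (`kmeans`, `silhouette`, `systematic_search`), the number of restarts, and in particular the use of the average silhouette width as the selection criterion are assumptions, since the abstract does not say which validity criterion is used.

```python
import random

def _d2(p, q):
    """Squared Euclidean distance between two tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, iters=50):
    """Plain k-means from k randomly sampled prototypes; returns labels."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda c: _d2(p, centers[c])) for p in points]
        if new == labels:
            break
        labels = new
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old prototype if a cluster empties
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def silhouette(points, labels):
    """Average silhouette width (assumed criterion, not from the paper)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(_d2(p, q) ** 0.5 for q in own) / (len(own) - 1)
        b = min(sum(_d2(p, q) ** 0.5 for q in other) / len(other)
                for key, other in clusters.items() if key != l)
        biggest = max(a, b)
        total += (b - a) / biggest if biggest else 0.0
    return total / len(points)

def systematic_search(points, k_range, restarts=5, seed=0):
    """Run k-means for every k and restart; keep the best-scoring result."""
    rng = random.Random(seed)
    best = (-2.0, None, None)  # (score, k, labels)
    for k in k_range:
        for _ in range(restarts):
            labels = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if score > best[0]:
                best = (score, k, labels)
    return best
```

The cost of this baseline grows with the number of candidate values of k times the number of restarts, which is the overhead the evolutionary approach in the abstract aims to reduce.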
Abstract:
A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm that solves this formulation using GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm produced the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index. (C) 2008 Elsevier Inc. All rights reserved.
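To make the rank-based indexes named in this abstract concrete, here is a small sketch of the standard (unweighted) Goodman-Kruskal gamma and Kendall tau-a, both built from counts of concordant and discordant pairs. The weighted formulations and the Product-Moment variant derived in the paper are not reproduced; the function names are assumptions for the example.

```python
from itertools import combinations

def concordance_counts(x, y):
    """Count concordant and discordant pairs; tied pairs count as neither."""
    conc = disc = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    return conc, disc

def goodman_kruskal_gamma(x, y):
    """(C - D) / (C + D): ties are excluded from the denominator."""
    c, d = concordance_counts(x, y)
    return (c - d) / (c + d)

def kendall_tau_a(x, y):
    """(C - D) / (number of pairs): ties stay in the denominator."""
    c, d = concordance_counts(x, y)
    n = len(x)
    return (c - d) / (n * (n - 1) / 2)
```

Both indexes depend only on the ordering of the values, not on their magnitudes, so `goodman_kruskal_gamma([1, 2, 3, 4], [1, 10, 100, 1000])` is 1.0. They differ in how ties are treated: for `x = [1, 2, 3]` and `y = [1, 1, 2]`, gamma is 1.0 while tau-a is 2/3.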
Abstract:
We study a symplectic chain with a non-local form of coupling by means of a standard map lattice where the interaction strength decreases with the lattice distance as a power law, in such a way that one can pass continuously from a local (nearest-neighbor) to a global (mean-field) type of coupling. We investigate the formation of map clusters, or spatially coherent structures generated by the system dynamics. Such clusters are found to be related to stickiness of chaotic phase-space trajectories near periodic island remnants, and also to the behavior of the diffusion coefficient. An approximate two-dimensional map is derived to explain some of the features of this connection. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
Abstract:
Background: Supraceliac aortic cross-clamping can be an option to save patients with hypovolemic shock due to abdominal trauma. However, this maneuver is associated with ischemia/reperfusion (I/R) injury strongly related to oxidative stress and reduction of nitric oxide bioavailability. Moreover, several studies demonstrated impairment in relaxation after I/R, but the time course of I/R necessary to induce vascular dysfunction is still controversial. We investigated whether 60 minutes of ischemia followed by 30 minutes of reperfusion changes the relaxation of visceral arteries or the plasma and renal levels of malondialdehyde (MDA) and nitrite plus nitrate (NOx). Methods: Male mongrel dogs (n = 27) were randomly allocated to one of three groups: sham (no clamping, n = 9), ischemia (supraceliac aortic cross-clamping for 60 minutes, n = 9), and I/R (60 minutes of ischemia followed by reperfusion for 30 minutes, n = 9). Relaxation of visceral arteries (celiac trunk, renal and superior mesenteric arteries) was studied in organ chambers. MDA and NOx concentrations were determined using a commercially available kit and an ozone-based chemiluminescence assay, respectively. Results: Both acetylcholine and calcium ionophore caused relaxation in endothelium-intact rings, and no statistical differences were observed among the three groups. Sodium nitroprusside promoted relaxation in endothelium-denuded rings, and there were no inter-group statistical differences. Both plasma and renal concentrations of MDA and NOx showed no significant difference among the groups. Conclusion: Supraceliac aortic cross-clamping for 60 minutes, alone or followed by 30 minutes of reperfusion, did not impair relaxation of canine visceral arteries or evoke biochemical alterations in plasma or renal tissue.
Abstract:
The inorganic chemical characterization of suspended sediments is of utmost relevance for understanding the dynamics and movement of chemical elements in aquatic and wet ecosystems. Despite the complexity of designing an effective study of this ecological compartment, this work tested a procedure for analyzing suspended sediments by instrumental neutron activation analysis, k(0) method (k(0)-INAA). The chemical elements As, Ba, Br, Ca, Ce, Co, Cr, Cs, Eu, Fe, Hf, Hg, K, La, Mo, Na, Ni, Rb, Sb, Sc, Se, Sm, Sr, Ta, Tb, Th, Yb and Zn were quantified in the suspended sediment compartment by means of k(0)-INAA. When compared with the world average for rivers, high mass fractions of Fe (222,900 mg/kg), Ba (4990 mg/kg), Zn (1350 mg/kg), Cr (646 mg/kg), Co (74.5 mg/kg), Br (113 mg/kg) and Mo (31.9 mg/kg) were quantified in suspended sediments from the Piracicaba River, the Piracicamirim Stream and the Marins Stream. Results of the principal component analysis for standardized chemical element mass fractions indicated an intricate correlation among the chemical elements evaluated, reflecting the contribution of natural and anthropogenic sources of chemical elements to the ecosystems. (C) 2010 Elsevier B.V. All rights reserved.