40 results for means clustering
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
This study investigated the sources of variation in the prices of Nelore cattle from selection herds sold at auction, in order to assess the influence of genetic evaluations and conformation judging on those prices. To that end, the sale prices of 426 Nelore cattle were recorded at 12 auctions held in various Brazilian locations (Central-West, North, and Southeast regions) between 2002 and 2005. The mean price was R$ 3,325.49, with a minimum of R$ 1,400.00 and a maximum of R$ 10,500.00. These data were entered together with other information presented in the auction catalogs. The recorded information included the sex of each animal, the name of the auction, and the expected progeny differences (EPDs; DEPs in Portuguese) reported in the catalogs. In addition to the influence of the catalog information, the study evaluated the influence of information on the sires of the animals sold, covering their EPDs published in a sire summary for the breed and the scores of their progeny in judging. The statistical methods applied were analysis of variance and cluster analysis (k-means method). The results showed that animals with genetic superiority in traits related to weight performance, considering both direct and maternal effects, fetched higher prices at auction. In contrast, the sires' judging scores had no significant influence on the mean sale prices of their progeny at auction.
Abstract:
Background: Since establishing universal free access to antiretroviral therapy in 1996, the Brazilian Health System has increased the number of centers providing HIV/AIDS outpatient care from 33 to 540. There had been no formal monitoring of the quality of these services until a survey of 336 AIDS health centers across 7 Brazilian states was undertaken in 2002. Managers of the services were asked to assess their clinics according to parameters of service inputs and service delivery processes. This report analyzes the survey results and identifies predictors of the overall quality of service delivery. Methods: The survey involved completion of a multiple-choice questionnaire comprising 107 parameters of service inputs and processes of delivering care, with responses assessed according to their likely impact on service quality using a 3-point scale. K-means clustering was used to group these services according to their scored responses. Logistic regression analysis was performed to identify predictors of high service quality. Results: The questionnaire was completed by 95.8% (322) of the managers of the sites surveyed. Most sites scored about 50% of the benchmark expectation. K-means clustering analysis identified four quality levels within which services could be grouped: 76 services (24%) were classed as level 1 (best), 53 (16%) as level 2 (medium), 113 (35%) as level 3 (poor), and 80 (25%) as level 4 (very poor). Parameters of service delivery processes were more important than those relating to service inputs for determining the quality classification. Predictors of quality services included larger care sites, specialization for HIV/AIDS, and location within large municipalities. Conclusion: The survey demonstrated highly variable levels of HIV/AIDS service quality across the sites. Many sites were found to have deficiencies in service delivery processes that could benefit from quality improvement initiatives. These findings could have implications for how HIV/AIDS services are planned in Brazil to achieve quality standards, such as where service sites should be located and their size and staffing requirements. A set of service delivery indicators has been identified that could be used for routine monitoring of HIV/AIDS service delivery in Brazil (and potentially in other similar settings).
Abstract:
One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to the initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms of interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets. (C) 2010 Elsevier B.V. All rights reserved.
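The systematic (repetitive) strategy described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `kmeans` and `systematic_search`, the restart count, and the use of the sum of squared errors (SSE) to pick the best run for each k are assumptions made for illustration. Because SSE always tends to fall as k grows, choosing among different k values would additionally need a cluster validity criterion, which is left out here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's k-means; returns (centroids, labels, sse)."""
    centroids = rng.sample(points, k)  # random initial prototypes
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


def systematic_search(points, k_values, restarts=10, seed=0):
    """For each k, rerun k-means from several random starts and keep
    the run with the lowest SSE (hypothetical selection rule)."""
    rng = random.Random(seed)
    best = {}
    for k in k_values:
        runs = [kmeans(points, k, rng) for _ in range(restarts)]
        best[k] = min(runs, key=lambda run: run[2])
    return best


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for k, (_, labels, sse) in systematic_search(data, [1, 2, 3]).items():
        print(k, labels, round(sse, 3))
```

The cost of this approach grows with the number of candidate k values times the number of restarts, which is the overhead the evolutionary alternative aims to reduce.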
Abstract:
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its own bias and is better suited to particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm that solves this formulation using GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm produced the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index. (C) 2008 Elsevier Inc. All rights reserved.
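As background for the rank-based indexes named in this abstract, the following sketch computes the standard, unweighted Goodman-Kruskal gamma and Kendall tau-a from counts of concordant and discordant pairs. It does not reproduce the weighted formulations the paper derives; the function names are hypothetical.

```python
from itertools import combinations


def concordance_counts(x, y):
    """Count concordant and discordant pairs between two sequences.
    Pairs tied in either sequence count as neither."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return concordant, discordant


def goodman_kruskal_gamma(x, y):
    """gamma = (C - D) / (C + D); tied pairs are ignored entirely."""
    c, d = concordance_counts(x, y)
    return (c - d) / (c + d)


def kendall_tau_a(x, y):
    """tau-a = (C - D) / (n * (n - 1) / 2); ties lower the magnitude."""
    c, d = concordance_counts(x, y)
    n = len(x)
    return (c - d) / (n * (n - 1) / 2)
```

Both indexes depend only on the ordering of the values, not on their magnitudes, which is one of the sensitivities the abstract says it examines.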
Abstract:
This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e. repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms of interest are provided. Examples with computational experiments and statistical analyses are also presented.
Abstract:
We study a symplectic chain with a non-local form of coupling by means of a standard map lattice where the interaction strength decreases with the lattice distance as a power law, in such a way that one can pass continuously from a local (nearest-neighbor) to a global (mean-field) type of coupling. We investigate the formation of map clusters, or spatially coherent structures generated by the system dynamics. Such clusters are found to be related to stickiness of chaotic phase-space trajectories near periodic island remnants, and also to the behavior of the diffusion coefficient. An approximate two-dimensional map is derived to explain some of the features of this connection. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
In the southern region of Mato Grosso do Sul state, Brazil, a foot-and-mouth disease (FMD) epidemic started in September 2005. A total of 33 outbreaks were detected and 33,741 FMD-susceptible animals were slaughtered and destroyed. There were no reports of FMD cases in species other than bovines. Based on the data from this epidemic, an analysis using the K-function was carried out, and spatial clustering of outbreaks was observed within a range of 25 km. This observation may be related to the dynamics of foot-and-mouth disease spread and to the measures undertaken to control the dissemination of the disease. The control measures were effective, since the disease did not spread to farms more than 47 km from the initial outbreaks.
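Assuming the K-function mentioned in this abstract is Ripley's K, a naive estimator can be sketched as below. This is an illustration only, with no edge correction, and the function name `ripley_k`, the planar coordinates, and the study-area value are assumptions rather than details from the study.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate: (area / n^2) times the number of
    ordered pairs of distinct points no farther apart than `radius`.
    No edge correction is applied."""
    n = len(points)
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                close_pairs += 1
    return area * close_pairs / (n * n)


if __name__ == "__main__":
    # Hypothetical outbreak coordinates in km within a 100 km^2 area.
    outbreaks = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
    r = 1.5
    print(ripley_k(outbreaks, r, area=100.0), math.pi * r ** 2)
```

Under complete spatial randomness K(r) is approximately pi * r^2, so an estimate well above that value at a given radius suggests clustering at that scale.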
Abstract:
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
Abstract:
Background: Supraceliac aortic cross-clamping can be an option to save patients with hypovolemic shock due to abdominal trauma. However, this maneuver is associated with ischemia/reperfusion (I/R) injury strongly related to oxidative stress and reduction of nitric oxide bioavailability. Moreover, several studies have demonstrated impaired relaxation after I/R, but the time course of I/R necessary to induce vascular dysfunction is still controversial. We tested the hypothesis that 60 minutes of ischemia followed by 30 minutes of reperfusion does not change the relaxation of visceral arteries or the plasma and renal levels of malondialdehyde (MDA) and nitrite plus nitrate (NOx). Methods: Male mongrel dogs (n = 27) were randomly allocated to one of three groups: sham (no clamping, n = 9), ischemia (supraceliac aortic cross-clamping for 60 minutes, n = 9), and I/R (60 minutes of ischemia followed by reperfusion for 30 minutes, n = 9). Relaxation of visceral arteries (celiac trunk, renal and superior mesenteric arteries) was studied in organ chambers. MDA and NOx concentrations were determined using a commercially available kit and an ozone-based chemiluminescence assay, respectively. Results: Both acetylcholine and calcium ionophore caused relaxation in endothelium-intact rings and no statistical differences were observed among the three groups. Sodium nitroprusside promoted relaxation in endothelium-denuded rings, and there were no inter-group statistical differences. Both plasma and renal concentrations of MDA and NOx showed no significant difference among the groups. Conclusion: Supraceliac aortic cross-clamping for 60 minutes, alone or followed by 30 minutes of reperfusion, did not impair relaxation of canine visceral arteries or evoke biochemical alterations in plasma or renal tissue.
Abstract:
A great part of the interest in complex networks has been motivated by the presence of structured, frequently nonuniform, connectivity. Because diverse connectivity patterns tend to result in distinct network dynamics, and also because they provide the means to identify and classify several types of complex network, it becomes important to obtain meaningful measurements of the local network topology. In addition to traditional features such as the node degree, clustering coefficient, and shortest path, motifs have been introduced in the literature in order to provide complementary descriptions of the network connectivity. The current work proposes a different type of motif, namely, chains of nodes, that is, sequences of connected nodes with degree 2. These chains have been subdivided into cords, tails, rings, and handles, depending on the type of their extremities (e.g., open or connected). A theoretical analysis of the density of such motifs in random and scale-free networks is described, and an algorithm for identifying these motifs in general networks is presented. The potential of considering chains for network characterization has been illustrated with respect to five categories of real-world networks including 16 cases. Several interesting findings were obtained, including the fact that several chains were observed in real-world networks, especially the world wide web, books, and the power grid. The possibility of chains resulting from incompletely sampled networks is also investigated.
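The chain motif described in this abstract (sequences of connected nodes with degree 2) can be located with a simple traversal. The sketch below is an assumption-laden illustration, not the algorithm presented in the paper: it treats the network as a simple undirected graph given as an edge list, returns each maximal connected group of degree-2 nodes, and does not attempt the subdivision into cords, tails, rings, and handles. The function name `degree_two_chains` is hypothetical.

```python
from collections import defaultdict


def degree_two_chains(edges):
    """Return maximal connected groups of degree-2 nodes as sets.
    Assumes a simple undirected graph with sortable node labels."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    degree_two = {node for node, nbrs in adjacency.items() if len(nbrs) == 2}

    seen, chains = set(), []
    for start in sorted(degree_two):
        if start in seen:
            continue
        # Depth-first walk restricted to degree-2 nodes.
        stack, chain = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            chain.add(node)
            for nbr in adjacency[node]:
                if nbr in degree_two and nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        chains.append(chain)
    return chains


if __name__ == "__main__":
    # Two hubs (0 and 4) joined by the chain 1-2-3, plus four leaves.
    example = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 5), (0, 6), (4, 7), (4, 8)]
    print(degree_two_chains(example))
```

Inspecting the nodes adjacent to each returned group (for example, whether they are leaves, hubs, or the group itself) would be a natural starting point for classifying the chain by the type of its extremities.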
Abstract:
The generator-coordinate method is a flexible and powerful reformulation of the variational principle. Here we show that by introducing a generator coordinate in the Kohn-Sham equation of density-functional theory, excitation energies can be obtained from ground-state density functionals. As a viability test, the method is applied to ground-state energies and various types of excited-state energies of atoms and ions from the He and the Li isoelectronic series. Results are compared to a variety of alternative DFT-based approaches to excited states, in particular time-dependent density-functional theory with exact and approximate potentials.
Abstract:
The inorganic chemical characterization of suspended sediments is highly relevant to understanding the dynamics and movement of chemical elements in aquatic and wet ecosystems. Despite the complexity of designing effective studies of this ecological compartment, this work tested a procedure for analyzing suspended sediments by instrumental neutron activation analysis, k(0) method (k(0)-INAA). The chemical elements As, Ba, Br, Ca, Ce, Co, Cr, Cs, Eu, Fe, Hf, Hg, K, La, Mo, Na, Ni, Rb, Sb, Sc, Se, Sm, Sr, Ta, Tb, Th, Yb and Zn were quantified in the suspended sediment compartment by means of k(0)-INAA. Compared with the world average for rivers, high mass fractions of Fe (222,900 mg/kg), Ba (4990 mg/kg), Zn (1350 mg/kg), Cr (646 mg/kg), Co (74.5 mg/kg), Br (113 mg/kg) and Mo (31.9 mg/kg) were quantified in suspended sediments from the Piracicaba River, the Piracicamirim Stream and the Marins Stream. Results of the principal component analysis for standardized chemical element mass fractions indicated an intricate correlation among the chemical elements evaluated, reflecting the contribution of natural and anthropogenic sources of chemical elements to the ecosystems. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
The aim of the study was to evaluate the possible relationships between stress tolerance, training load, common infections and salivary parameters during 4 weeks of regular training in fifteen basketball players. The Daily Analysis of Life Demands for Athletes questionnaire (sources and symptoms of stress) and the Wisconsin Upper Respiratory Symptom Survey were used on a weekly basis. Salivary cortisol and salivary immunoglobulin A (SIgA) were collected at the beginning (before) and at the end (after) of the study, and measured by enzyme-linked immunosorbent assay (ELISA). Ratings of perceived exertion (training load) were also obtained. The results from ANOVA with repeated measures showed greater training loads, a greater number of upper respiratory tract infection episodes and more negative sensations for both symptoms and sources of stress at week 2 (p < 0.05). Significant increases in cortisol levels and decreases in SIgA secretion rate were noted (before to after). Negative sensations for symptoms of stress at week 4 were inversely and significantly correlated with SIgA secretion rate. A positive and significant relationship between sources and symptoms of stress at week 4 and cortisol levels was verified. In summary, an approach combining psychometric tools and salivary biomarkers could be an efficient means of monitoring reaction to stress in sport. Copyright (C) 2010 John Wiley & Sons, Ltd.