Biblioteca Digital

145 results for scenario clustering

in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)

On comparing two sequences of numbers and its applications to clustering analysis

Relevance: 30.00%

Abstract:

A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index. (C) 2008 Elsevier Inc. All rights reserved.
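As a point of reference for the indexes named in this abstract, the sketch below computes the standard, unweighted Goodman-Kruskal gamma and Kendall tau-a from concordant and discordant pairs. It is only an illustration of the classical definitions: the weighted formulations and the Product-Moment variant derived in the paper are not reproduced here, and the function names are assumptions made for this example.

```python
from itertools import combinations


def concordance_counts(x, y):
    """Count concordant and discordant pairs between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0: the pair is tied in at least one sequence and counts for neither
    return concordant, discordant


def goodman_kruskal_gamma(x, y):
    """Classical gamma: tied pairs are dropped from the denominator."""
    c, d = concordance_counts(x, y)
    if c + d == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (c - d) / (c + d)


def kendall_tau_a(x, y):
    """Classical tau-a: the denominator counts all pairs, tied or not."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    c, d = concordance_counts(x, y)
    return (c - d) / (n * (n - 1) / 2)


# Illustrative sequences with ties (made up for this example)
a = [1, 2, 2, 3]
b = [1, 2, 3, 3]
gamma = goodman_kruskal_gamma(a, b)  # 4 concordant, 0 discordant -> 1.0
tau = kendall_tau_a(a, b)            # 4 / 6, because tied pairs stay in the denominator
```

The two values differ only because of how ties are handled, which is one concrete way in which the choice of index can depend on the application scenario.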

Spatial clustering analysis of the foot-and-mouth disease outbreaks in Mato Grosso do Sul state, Brazil - 2005

Relevance: 20.00%

Abstract:

In the southern region of Mato Grosso do Sul state, Brazil, a foot-and-mouth disease (FMD) epidemic started in September 2005. A total of 33 outbreaks were detected and 33,741 FMD-susceptible animals were slaughtered and destroyed. There were no reports of FMD cases in species other than bovines. Based on the data from this epidemic, an analysis using the K-function was carried out, and spatial clustering of outbreaks was observed within a range of 25 km. This observation may be related to the dynamics of foot-and-mouth disease spread and to the measures undertaken to control the dissemination of the disease. The control measures were effective, since the disease did not spread to farms more than 47 km from the initial outbreaks.
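For readers unfamiliar with the K-function mentioned in this abstract, a naive estimate of Ripley's K can be sketched as below. This is a minimal illustration under stated assumptions: it applies no edge correction, the coordinates and study area are invented for the example, and it is not the analysis performed by the authors.

```python
import math


def ripley_k(points, r, area):
    """Naive Ripley's K estimate at distance r, without edge correction.

    K(r) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose Euclidean distance is at most r.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= r:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))


# Hypothetical outbreak locations (km) in a hypothetical 10 km x 10 km region
locations = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
k_hat = ripley_k(locations, r=1.5, area=100.0)

# Under complete spatial randomness K(r) is approximately pi * r**2,
# so an estimate well above that value suggests clustering at scale r.
csr_reference = math.pi * 1.5 ** 2
```

In practice the estimate is evaluated over a range of distances and compared with simulation envelopes rather than with a single reference value.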

The full Bayesian significance test for mixture models: results in gene expression clustering

Relevance: 20.00%

Abstract:

Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests whether it should in fact be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs neither model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.

Dynamics and constraints of the massive graviton dark matter flat cosmologies

Relevance: 20.00%

Abstract:

We discuss the dynamics of the Universe within the framework of the massive graviton cold dark matter scenario (MGCDM) in which gravitons are geometrically treated as massive particles. In this modified gravity theory, the main effect of the gravitons is to alter the density evolution of the cold dark matter component in such a way that the Universe evolves to an accelerating expanding regime, as presently observed. Tight constraints on the main cosmological parameters of the MGCDM model are derived by performing a joint likelihood analysis involving the recent supernovae type Ia data, the cosmic microwave background shift parameter, and the baryonic acoustic oscillations as traced by the Sloan Digital Sky Survey red luminous galaxies. The linear evolution of small density fluctuations is also analyzed in detail. It is found that the growth factor of the MGCDM model is slightly different (~1-4%) from the one provided by the conventional flat ΛCDM cosmology. The growth rates of clustering predicted by the MGCDM and ΛCDM models are compared with the observations, and the corresponding best-fit values of the growth index (γ) are also determined. By using the expectations of realistic future X-ray and Sunyaev-Zeldovich cluster surveys we derive the dark matter halo mass function and the corresponding redshift distribution of cluster-size halos for the MGCDM model. Finally, we also show that the Hubble flow differences between the MGCDM and the ΛCDM models provide a halo redshift distribution departing significantly from those predicted by other dark energy models. These results suggest that the MGCDM model can observationally be distinguished from ΛCDM and also from a large number of dark energy models recently proposed in the literature.

Constraints on cold dark matter accelerating cosmologies and cluster formation

Relevance: 20.00%

Abstract:

We discuss the properties of homogeneous and isotropic flat cosmologies in which the present accelerating stage is powered only by the gravitationally induced creation of cold dark matter (CCDM) particles ($\Omega_m = 1$). For some matter creation rates proposed in the literature, we show that the main cosmological functions such as the scale factor of the universe, the Hubble expansion rate, the growth factor, and the cluster formation rate are analytically defined. The best CCDM scenario has only one free parameter and our joint analysis involving baryonic acoustic oscillations + cosmic microwave background (CMB) + SNe Ia data yields $\tilde{\Omega}_m = 0.28 \pm 0.01$ ($1\sigma$), where $\tilde{\Omega}_m$ is the observed matter density parameter. In particular, this implies that the model has no dark energy but the part of the matter that is effectively clustering is in good agreement with the latest determinations from the large-scale structure. The growth of perturbations and the formation of galaxy clusters in such scenarios are also investigated. Despite the fact that both scenarios may share the same Hubble expansion, we find that matter creation cosmologies predict stronger small-scale dynamics, which implies a faster growth rate of perturbations with respect to the usual ΛCDM cosmology. Such results point to the possibility of a crucial observational test confronting CCDM with ΛCDM scenarios through a more detailed analysis involving CMB, weak lensing, as well as the large-scale structure.

The importance of the industrialization of Brazilian shale when faced with the world energy scenario

Relevance: 20.00%

Abstract:

This article discusses the importance of the industrialization of Brazilian shale based on factors such as: security of the national energy system, global oil geopolitics, resources available, production costs, oil prices, environmental impacts and the national oil reserves. The study shows that the industrialization of shale always arises when issues such as peak oil or its geopolitics appear as factors that raise the price of oil to unrealistic levels. The article concludes that in the Brazilian case, shale oil may be classified as a strategic resource, economically viable, currently in development by the success of the retorting technology for extraction of shale oil and the price of crude oil. The article presents the conclusion that shale may be the driving factor for the formation of a technology park in São Mateus do Sul, due to the city's economic dependence on Petrosix.

Is your vision consistent? A method for checking, based on scenario concepts

Relevance: 20.00%

Abstract:

Common sense tells us that the future is an essential element in any strategy. In addition, there is a good deal of literature on scenario planning, which is an important tool in considering the future in terms of strategy. However, in many organizations there is serious resistance to the development of scenarios, and they are not broadly implemented by companies. But even organizations that do not rely heavily on the development of scenarios do, in fact, construct visions to guide their strategies. But it might be asked, what happens when this vision is not consistent with the future? To address this problem, the present article proposes a method for checking the content and consistency of an organization's vision of the future, no matter how it was conceived. The proposed method is grounded in theoretical concepts from the field of future studies, which are described in this article. This study was motivated by the search for new ways of improving and using scenario techniques as a method for making strategic decisions. The method was then tested on a company in the field of information technology in order to check its operational feasibility. The test showed that the proposed method is, in fact, operationally feasible and was capable of analyzing the vision of the company being studied, indicating both its shortcomings and points of inconsistency. (C) 2007 Elsevier Ltd. All rights reserved.

A spectral clustering algorithm for manufacturing cell formation

Relevance: 20.00%

Abstract:

A graph clustering algorithm constructs groups of closely related parts and machines separately. After they are matched for the least intercell moves, a refining process runs on the initial cell formation to decrease the number of intercell moves. A simple modification of this main approach can deal with some practical constraints, such as the popular constraint of bounding the maximum number of machines in a cell. Our approach substantially improves the computational time. More importantly, an improvement is also seen in the number of intercell moves when the computational results are compared with the best known solutions from the literature. (C) 2009 Elsevier Ltd. All rights reserved.

Characterization of spliced leader genes of Trypanosoma (Megatrypanum) theileri: phylogeographical analysis of Brazilian isolates from cattle supports spatial clustering of genotypes and parity with ribosomal markers

Relevance: 20.00%

Abstract:

Trypanosoma (Megatrypanum) theileri from cattle and trypanosomes of other artiodactyls form a clade of closely related species in analyses using ribosomal sequences. Analysis of polymorphic sequences of a larger number of trypanosomes from broader geographical origins is required to evaluate the clustering of isolates as suggested by previous studies. Here, we determined the sequences of the spliced leader (SL) genes of 21 isolates from cattle and 2 from water buffalo from distant regions of Brazil. Analysis of SL gene repeats revealed that the 5S rRNA gene is inserted within the intergenic region. Phylogeographical patterns inferred using SL sequences showed at least 5 major genotypes of T. theileri distributed in 2 strongly divergent lineages. Lineage TthI comprises genotypes IA and IB from buffalo and cattle, respectively, from the Southeast and Central regions, whereas genotype IC is restricted to cattle from the Southern region. Lineage TthII includes cattle genotypes IIA, which is restricted to the North and Northeast, and IIB, found in the Centre, West, North and Northeast. PCR-RFLP of SL genes revealed valuable markers for genotyping T. theileri. The results of this study emphasize the genetic complexity and corroborate the geographical structuring of T. theileri genotypes found in cattle.

Comparative phylogeography of Trypanosoma cruzi TCIIc: New hosts, association with terrestrial ecotopes, and spatial clustering

Relevance: 20.00%

Abstract:

We characterized 28 new isolates of Trypanosoma cruzi IIc (TCIIc) of mammals and triatomines from Northern to Southern Brazil, confirming the widespread distribution of this lineage. Phylogenetic analyses using cytochrome b and SSU rDNA sequences clearly separated TCIIc from TCIIa according to terrestrial and arboreal ecotopes of their preferential mammalian hosts and vectors. TCIIc was more closely related to TCIId/e, followed by TCIIa, and separated by large distances from TCIIb and TCI. Despite being indistinguishable by traditional genotyping and generally being assigned to Z3, we provide evidence that TCIIa from South America and TCIIa from North America correspond to independent lineages that circulate in distinct hosts and ecological niches. Armadillos, terrestrial didelphids and rodents, and domestic dogs were found infected by TCIIc in Brazil. We believe that, in Brazil, this is the first description of TCIIc from rodents and domestic dogs. Terrestrial triatomines of genera Panstrongylus and Triatoma were confirmed as vectors of TCIIc. Together, habitat, mammalian host and vector association corroborated the link between TCIIc and terrestrial transmission cycles/ecological niches. Analysis of ITS1 rDNA sequences disclosed clusters of TCIIc isolates in accordance with their geographic origin, independent of their host species. (C) 2009 Elsevier B.V. All rights reserved.

Evolutionary fuzzy clustering of relational data

Relevance: 20.00%

Abstract:

This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.

Partitions selection strategy for set of clustering solutions

Relevance: 20.00%

Abstract:

Clustering is a difficult task: there is no single cluster definition and the data can have more than one underlying structure. Pareto-based multi-objective genetic algorithms (e.g., MOCK, Multi-Objective Clustering with automatic K-determination, and MOCLE, Multi-Objective Clustering Ensemble) were proposed to tackle these problems. However, the output of such algorithms often contains a large number of partitions, making it difficult for an expert to analyze all of them manually. In order to deal with this problem, we present two selection strategies, which are based on the corrected Rand index, to choose a subset of solutions. To test them, they are applied to the set of solutions produced by MOCK and MOCLE in the context of several datasets. The study was also extended to select a reduced set of partitions from the initial population of MOCLE. These analyses show that both proposed selection strategies are very effective. They can significantly reduce the number of solutions and, at the same time, keep the quality and the diversity of the partitions in the original set of solutions. (C) 2010 Elsevier B.V. All rights reserved.
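The corrected Rand index that these selection strategies build on can be sketched as follows. This is the usual chance-corrected (adjusted) form computed from the contingency table of two partitions; the selection strategies themselves are not reproduced, and the function name and the handling of degenerate partitions are assumptions made for this example.

```python
from collections import Counter
from math import comb


def corrected_rand(labels_a, labels_b):
    """Corrected (adjusted) Rand index between two partitions of the same objects."""
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two objects")
    # Pairs of objects placed together in both partitions, in A only, and in B only
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case (e.g. both partitions put everything in one cluster);
        # returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Same grouping under different label names: perfect agreement
same = corrected_rand([0, 0, 1, 1], [1, 1, 0, 0])
# Groupings that never place the same pair together: below chance level
crossed = corrected_rand([0, 0, 1, 1], [0, 1, 0, 1])
```

A value near 1 indicates two nearly identical partitions, so a pairwise matrix of such values gives one way to judge which solutions in a large set are redundant.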

Investigation of a new GRASP-based clustering algorithm applied to biological data

Relevance: 20.00%

Abstract:

A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other well-known algorithms. The proposed algorithm achieved the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.

Multi-objective clustering ensemble for gene expression data analysis

Relevance: 20.00%

Abstract:

In this paper, we present an algorithm for cluster analysis that integrates aspects from cluster ensemble and multi-objective clustering. The algorithm is based on a Pareto-based multi-objective genetic algorithm, with a special crossover operator, which uses clustering validation measures as objective functions. The proposed algorithm can deal with data sets presenting different types of clusters, without requiring expertise in cluster analysis. Its result is a concise set of partitions representing alternative trade-offs among the objective functions. We compare the results obtained with our algorithm, in the context of gene expression data sets, to those achieved with Multi-Objective Clustering with automatic K-determination (MOCK), the algorithm most closely related to ours. (C) 2009 Elsevier B.V. All rights reserved.

On the efficiency of evolutionary fuzzy clustering

Relevance: 20.00%

Abstract:

This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e., repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms of interest are provided. Examples with computational experiments and statistical analyses are also presented.
