957 results for Number of Clusters


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared to the above referred criteria, is the speed of execution, which is especially relevant when dealing with large data sets.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The identification of genetically homogeneous groups of individuals is a long standing issue in population genetics. A recent Bayesian algorithm implemented in the software STRUCTURE allows the identification of such groups. However, the ability of this algorithm to detect the true number of clusters (K) in a sample of individuals when patterns of dispersal among populations are not homogeneous has not been tested. The goal of this study is to carry out such tests, using various dispersal scenarios from data generated with an individual-based model. We found that in most cases the estimated 'log probability of data' does not provide a correct estimation of the number of clusters, K. However, using an ad hoc statistic DeltaK based on the rate of change in the log probability of data between successive K values, we found that STRUCTURE accurately detects the uppermost hierarchical level of structure for the scenarios we tested. As might be expected, the results are sensitive to the type of genetic marker used (AFLP vs. microsatellite), the number of loci scored, the number of populations sampled, and the number of individuals typed in each sample.
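The DeltaK idea can be illustrated with a short sketch. It assumes one common formulation: the absolute second-order difference of the mean log probability across successive K values, divided by the standard deviation of the replicate runs at K. The function name `delta_k`, the input layout, and the toy numbers are all hypothetical and are not taken from the study or from the STRUCTURE software.

```python
from statistics import mean, stdev

def delta_k(log_probs):
    """Return {K: DeltaK} for every K whose neighbours K-1 and K+1 were also run.

    log_probs maps each K to the log-probability values of its replicate runs.
    Assumed form: |L(K+1) - 2*L(K) + L(K-1)| on the run means, divided by the
    standard deviation of the runs at K (which must be non-zero).
    """
    result = {}
    for k in sorted(log_probs):
        if k - 1 in log_probs and k + 1 in log_probs:
            curvature = abs(mean(log_probs[k + 1])
                            - 2 * mean(log_probs[k])
                            + mean(log_probs[k - 1]))
            result[k] = curvature / stdev(log_probs[k])
    return result

# Invented replicate values for K = 1..4 (illustration only).
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-785.0, -780.0, -790.0],
}
scores = delta_k(toy_runs)
best_k = max(scores, key=scores.get)  # K with the sharpest change in slope
print(scores, best_k)
```

In this toy input the mean log probability is highest at K = 4, while the DeltaK score peaks at K = 2, which mirrors the contrast the abstract draws between the two criteria.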

Relevance:

100.00% 100.00%

Publisher:

Abstract:

An important problem in unsupervised data clustering is how to determine the number of clusters. Here we investigate how this can be achieved in an automated way by using interrelation matrices of multivariate time series. Two nonparametric and purely data driven algorithms are expounded and compared. The first exploits the eigenvalue spectra of surrogate data, while the second employs the eigenvector components of the interrelation matrix. Compared to the first algorithm, the second approach is computationally faster and not limited to linear interrelation measures.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question on the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We have performed Surface Evolver simulations of two-dimensional hexagonal bubble clusters consisting of a central bubble of area lambda surrounded by s shells or layers of bubbles of unit area. Clusters of up to twenty layers have been simulated, with lambda varying between 0.01 and 100. In monodisperse clusters (i.e., for lambda = 1) [M.A. Fortes, F. Morgan, M. Fatima Vaz, Philos. Mag. Lett. 87 (2007) 561] both the average pressure of the entire cluster and the pressure in the central bubble are decreasing functions of s and approach 0.9306 for very large s, which is the pressure in a bubble of an infinite monodisperse honeycomb foam. Here we address the effect of changing the central bubble area lambda. For small lambda the pressure in the central bubble and the average pressure were both found to decrease with s, as in monodisperse clusters. However, for large lambda, the pressure in the central bubble and the average pressure increase with s. The average pressure of large clusters was found to be independent of lambda and to approach 0.9306 asymptotically. We have also determined the cluster surface energies given by the equation of equilibrium for the total energy in terms of the area and the pressure in each bubble. When the pressures in the bubbles are not available, an approximate equation derived by Vaz et al. [M. Fatima Vaz, M.A. Fortes, F. Graner, Philos. Mag. Lett. 82 (2002) 575] was shown to provide good estimates of the cluster energy provided the bubble area distribution is narrow. This approach does not take cluster topology into account. Using this approximate equation, we find a good correlation between Surface Evolver simulations and the estimated values of energies and pressures. (C) 2008 Elsevier B.V. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The design of randomized controlled trials entails decisions that have economic as well as statistical implications. In particular, the choice of an individual or cluster randomization design may affect the cost of achieving the desired level of power, other things being equal. Furthermore, if cluster randomization is chosen, the researcher must decide how to balance the number of clusters, or sites, and the size of each site. This article investigates these interrelated statistical and economic issues. Its principal purpose is to elucidate the statistical and economic trade-offs to assist researchers to employ randomized controlled trials that have desired economic, as well as statistical, properties. (C) 2003 Elsevier Inc. All rights reserved.
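The trade-off between the number of clusters and the size of each cluster can be made concrete with a small sketch. It rests on assumptions that are not stated in the abstract: a fixed total budget, a fixed cost per cluster and per subject, equal cluster sizes, and the standard design effect 1 + (m - 1) * ICC for clusters of size m. All names and numbers below are hypothetical, and fractional clusters are allowed to keep the example smooth.

```python
def design_effect(m, icc):
    """Variance inflation for equal clusters of size m with intracluster correlation icc."""
    return 1 + (m - 1) * icc

def effective_sample_size(budget, cost_cluster, cost_subject, m, icc):
    """Effective number of independent subjects affordable at cluster size m."""
    clusters = budget / (cost_cluster + m * cost_subject)  # fractional clusters allowed
    return clusters * m / design_effect(m, icc)

# Hypothetical inputs: budget 10000, 100 per cluster, 1 per subject, ICC 0.05.
budget, cost_cluster, cost_subject, icc = 10000.0, 100.0, 1.0, 0.05

# Scan cluster sizes and keep the one that buys the most effective sample size.
best_m = max(range(1, 201),
             key=lambda m: effective_sample_size(budget, cost_cluster, cost_subject, m, icc))
print(best_m, effective_sample_size(budget, cost_cluster, cost_subject, best_m, icc))
```

Under these assumptions, small clusters waste budget on per-cluster costs while large clusters lose information to within-cluster correlation, so the effective sample size peaks at an intermediate cluster size.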

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this paper is to suggest a method for endogenously finding the points that group the individuals of a given distribution into k clusters, where k is itself endogenously determined. These points are the cut-points. Thus, we need to determine a partition of the N individuals into k groups, in such a way that individuals in the same group are as alike as possible, but as distinct as possible from individuals in other groups. This method can be applied to endogenously identify k groups in income distributions: possible applications can be poverty

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Affiliation: Pascal Michel : Département de pathologie et microbiologie, Faculté de médecine vétérinaire

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The influence of the occupation of the single-particle levels on the impact-parameter-dependent K-K charge transfer occurring in collisions of 90 keV Ne^{9+} on Ne was studied using coupled-channel calculations. The energy eigenvalues and matrix elements for the single-particle levels were taken from ab initio self-consistent MO-LCAO-DIRAC-FOCK-SLATER calculations with occupation numbers corresponding to the single-particle amplitudes given by the coupled-channel calculations.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Parasite virulence genes are usually associated with telomeres. The clustering of the telomeres, together with their particular spatial distribution in the nucleus of human parasites such as Plasmodium falciparum and Trypanosoma brucei, has been suggested to play a role in facilitating ectopic recombination and in the emergence of new antigenic variants. Leishmania parasites, as well as other trypanosomes, have unusual gene expression characteristics, such as polycistronic and constitutive transcription of protein-coding genes. Leishmania subtelomeric regions are even more unique because unlike these regions in other trypanosomes they are devoid of virulence genes. Given these peculiarities of Leishmania, we sought to investigate how telomeres are organized in the nucleus of Leishmania major parasites at both the human and insect stages of their life cycle. We developed a new automated and precise method for identifying telomere position in the three-dimensional space of the nucleus, and we found that the telomeres are organized in clusters present in similar numbers in both the human and insect stages. While the number of clusters remained the same, their distribution differed between the two stages. The telomeric clusters were found more concentrated near the center of the nucleus in the human stage than in the insect stage suggesting reorganization during the parasite's differentiation process between the two hosts. These data provide the first 3D analysis of Leishmania telomere organization. The possible biological implications of these findings are discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Aims: This study aimed to classify alcohol-dependent outpatients on the basis of clinical factors and to verify whether the resulting types show different treatment retention. Methods: The sample comprised 332 alcoholics who were enrolled in three different pharmacological trials carried out at Sao Paulo University, Brazil. Based on four clinical factors (problem drinking onset age, familial alcoholism, alcohol dependence severity, and depression), K-means cluster analysis was performed, using the average silhouette width to determine the number of clusters. A direct logistic regression was performed to analyze the influence of clusters, medication groups, and Alcoholics Anonymous (AA) attendance on treatment retention. Results: Two clusters were delineated. The cluster characterized by earlier onset age, more familial alcoholism, higher alcoholism severity, and fewer depression symptoms showed a higher chance of discontinuing the treatment, independently of medications used and AA attendance. Participation in AA was significantly related to treatment retention. Discussion: Health services should broaden the scope of services offered to meet heterogeneous needs of clients, and identify treatment practices and therapists which improve retention. Information about patients' characteristics linked to dropout should be used to make treatment programs more responsive and attractive, combining pharmacological agents with more intensive and diversified psychosocial interventions. Copyright (C) 2012 S. Karger AG, Basel
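Choosing the number of clusters by average silhouette width can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses a plain Lloyd-style K-means with a deterministic start, Euclidean distance, and invented two-dimensional points, and the helper names `kmeans` and `average_silhouette` are hypothetical.

```python
from math import dist

def kmeans(points, k, iters=50):
    """Lloyd's algorithm; initial centers are k evenly spaced input points (an assumption)."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def average_silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points; singletons contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)  # mean distance within own cluster
        b = min(sum(dist(p, q) for q in other) / len(other)  # nearest other cluster
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

# Invented data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]
widths = {k: average_silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(widths, key=widths.get)
print(widths, best_k)
```

The candidate K with the largest average width is retained. On these toy points that is K = 3, whereas the study reports two clusters for its clinical data.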

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In American history, changes between paradigms have had rather large-scale and deep effects, whereas the Japanese seem to have kept the focus on the Scientific Management (or production-oriented) paradigm, which was only partly altered by the others (HR, etc.). Based on this observation, our study examines the reactivity of Japanese society, that is, its attitudes towards change and adaptation, both on the basis of the literature and through a questionnaire survey. We first assumed that continuous adaptation to external conditions is an essential element of long-term competitiveness, which implies the regular questioning of old management practices. We presumed that one or several groups of people still strongly support traditional Japanese management practices and that these groups must be important in terms of number and/or social influence. We found evidence for our hypothesis: two significantly supportive clusters, which largely outnumbered the rest of the panel and also represented a higher level of social influence. We found that the old system is probably supported by senior Japanese males who work as regular (core) employees or managers in rather large companies and enjoy long-term employment. Another group was also identified in contrast to the former one, with younger members, more diversity and a higher level of education. As they grow older, the members of the second group may become socialized into the Japanese system, as happened in the past, or may continue to reject traditional methods. Based on our observations, it seems useful to broaden the focus of our research and carry out a similar study in Hungary as well. This would bring valuable information on competitiveness to the Hungarian business and political elite, enabling them to identify outdated practices and to describe the groups responsible for maintaining those practices.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a resolving power of ca. 500,000 and a mass accuracy better than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H]- ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g^-1. To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be the most appropriate for selecting important variables, reducing the number of variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g^-1. By reducing the size of the data, it was possible to relate the selected variables to their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

PURPOSE: To determine the association between language and number of citations of ophthalmology articles published in Brazilian journals. METHODS: This study was a systematic review. Original articles were identified by reviewing documents published in the two Brazilian ophthalmology journals indexed in the Science Citation Index Expanded - SCIE [Arquivos Brasileiros de Oftalmologia (ABO) and Revista Brasileira de Oftalmologia (RBO)]. All document types (articles and reviews) listed in SCIE in English (English Group) or in Portuguese (Portuguese Group) from January 1, 2008 to December 31, 2009 were included, except editorial materials, corrections, letters, and biographical items. The primary outcome was the number of citations through the end of the second year after the publication date. Subgroup analysis included likelihood of citation (cited at least once versus no citation), journal, and year of publication. RESULTS: The search in the Web of Science revealed 382 articles [107 (28%) in the English Group and 275 (72%) in the Portuguese Group]. Of those, 297 (77.7%) were published in the ABO and 85 (22.3%) in the RBO. The citation counts were significantly higher (P<0.001) in the English Group (1.51 - SD 1.98 - range 0 to 11) than in the Portuguese Group (0.57 - SD 1.06 - range 0 to 7). The likelihood of citation was significantly higher (P<0.001) in the English Group (70/107 - 65.4%) than in the Portuguese Group (89/275 - 32.7%). There were more articles published in English in the ABO (98/297 - 32.9%) than in the RBO (9/85 - 10.6%) [P<0.001]. There was no significant difference (P=0.967) in the proportion of articles published in English between 2008 (48/172 - 27.9%) and 2009 (59/210 - 28.1%). CONCLUSION: The number of citations of articles published in Portuguese in Brazilian ophthalmology journals is lower than that of articles published in English. The results of this study suggest that editorial boards should strongly encourage authors to adopt English as the main language in their future articles.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study evaluated the relationship among malocclusion, number of occlusal pairs, masticatory performance, masticatory time and masticatory ability in completely dentate subjects. Eighty healthy subjects (mean age = 19.40 ± 4.14 years) were grouped according to malocclusion diagnosis (n = 16 per group): Class I, Class II-1, Class II-2, Class III and Normocclusion (control). Number of occlusal pairs was determined clinically. Masticatory performance was evaluated by the sieving method, and the time used to comminute the test food was registered as the masticatory time. Masticatory ability was measured by a dichotomous self-perception questionnaire. Statistical analysis was done by one-way ANOVA, ANOVA on ranks, Chi-Square and Spearman tests. Class II-1 and III malocclusion groups presented a smaller number of occlusal pairs than Normocclusion (p < 0.0001), Class I (p < 0.001) and II-2 (p < 0.0001) malocclusion groups. Class I, II-1 and III malocclusion groups showed lower masticatory performance values compared to Normocclusion (p < 0.05) and Class II-2 (p < 0.05) malocclusion groups. There were no differences in masticatory time (p = 0.156) and ability (χ2 = 3.58/p = 0.465) among groups. Occlusal pairs were associated with malocclusion (rho = 0.444/p < 0.0001) and masticatory performance (rho = 0.393/p < 0.0001), but malocclusion was not correlated with masticatory performance (rho = 0.116/p = 0.306). In conclusion, masticatory performance and ability were not related to malocclusion, and subjects with Class I, II-1 and III malocclusions presented lower masticatory performance because of their smaller number of occlusal pairs.