853 results for Unsupervised clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present studies of the spatial clustering of inertial particles embedded in turbulent flow. A major part of the thesis is experimental, involving the technique of Phase Doppler Interferometry (PDI). The thesis also includes a significant amount of simulation work and some theoretical considerations. We describe the details of PDI and explain why it is suitable for studying particle clustering in turbulent flow with a strong mean velocity. We introduce the radial distribution function (RDF) as our chosen way of quantifying inertial particle clustering and present some original work on foundational and practical considerations related to it. These include methods of treating finite sampling size, interpretation of the magnitude of the RDF, and the possibility of isolating the RDF signature of inertial clustering from that of large-scale mixing. In the experimental work, we used PDI to observe clustering of water droplets in a turbulent wind tunnel. From that we present, in the form of a published paper, evidence of dynamical similarity (Stokes number similarity) of inertial particle clustering, together with other results in qualitative agreement with available theoretical predictions and simulation results. We next show detailed quantitative comparisons of results from our experiments, direct numerical simulation (DNS) and theory. Very promising agreement was found for like-sized (mono-disperse) particles. Theory is found to be incorrect regarding clustering of different-sized particles, and we propose an empirical correction based on the DNS and experimental results. Besides this, we also discovered a few interesting characteristics of inertial clustering. First, through observations, we found an intriguing possibility for modeling the RDF arising from inertial clustering that has only one (sensitive) parameter. Second, we found that clustering becomes saturated at high Reynolds number.
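
To make the RDF idea concrete, here is a minimal sketch of a pair-counting estimate along a one-dimensional sampling line. It is not the thesis's estimator: the function name, the periodic wrap-around used to sidestep edge effects, and the toy droplet positions are all assumptions made for illustration. Values of g(r) above 1 indicate more pairs at separation r than a uniformly random arrangement would give.

```python
def radial_distribution_1d(positions, length, bin_width):
    """Estimate g(r) for points on a line of the given length.

    Assumes periodic boundaries (an illustrative simplification), so
    separations lie in [0, length / 2].
    """
    n = len(positions)
    n_bins = int((length / 2) / bin_width)
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            sep = abs(positions[i] - positions[j])
            sep = min(sep, length - sep)  # periodic wrap-around
            k = int(sep / bin_width)
            if k < n_bins:
                counts[k] += 1
    # Expected pair count per bin if the points were uniformly random:
    # the wrapped separation is uniform on [0, length / 2].
    expected = n * (n - 1) / 2 * (2 * bin_width / length)
    return [c / expected for c in counts]


# Hypothetical data: ten tight droplet pairs spaced 10 units apart.
positions = []
for i in range(10):
    positions += [10.0 * i, 10.0 * i + 0.1]

g = radial_distribution_1d(positions, length=100.0, bin_width=1.0)
print(g[0], g[1])  # first bin is enhanced by the tight pairs, second is empty
```

In this toy case the first bin exceeds 1 because every point has a close neighbour, which is the signature of clustering that the RDF is meant to capture.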

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An important problem in unsupervised data clustering is how to determine the number of clusters. Here we investigate how this can be achieved in an automated way by using interrelation matrices of multivariate time series. Two nonparametric and purely data-driven algorithms are described and compared. The first exploits the eigenvalue spectra of surrogate data, while the second employs the eigenvector components of the interrelation matrix. Compared to the first algorithm, the second approach is computationally faster and not limited to linear interrelation measures.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Prevotella nigrescens, Prevotella intermedia and Porphyromonas gingivalis are oral pathogens from the family Bacteroidaceae, regularly isolated from cases of gingivitis and periodontitis. In this study, the phylogenetic variability of these three bacterial species was investigated by means of 16S rRNA (rrs) gene sequence comparisons of a set of epidemiologically and geographically diverse isolates. For each of the three species, the rrs gene sequences of 11 clinical isolates as well as the corresponding type strains were determined. Comparison of all rrs sequences obtained with those of closely related species revealed a clear clustering of species, with only a little intraspecies variability but a clear difference in the rrs gene with respect to the next related taxon. The results indicate that the three species form stable, homogeneous genetic groups, which favours an rrs-based species identification of these oral pathogens. This is especially useful given the 7% sequence divergence between Prevotella intermedia and Prevotella nigrescens, since phenotypic distinction between the two Prevotella species is inconsistent or involves techniques not applicable in routine identification.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

OBJECTIVES In dental research, multiple site observations within patients or taken at various time intervals are commonplace. These clustered observations are not independent; statistical analysis should be amended accordingly. This study aimed to assess whether adjustment for clustering effects during statistical analysis was undertaken in five specialty dental journals. METHODS Thirty recent consecutive issues of Orthodontics (OJ), Periodontology (PJ), Endodontology (EJ), Maxillofacial (MJ) and Paediatric Dentistry (PDJ) journals were hand searched. Articles requiring adjustment for clustering effects were identified and the statistical techniques used were scrutinized. RESULTS Of 559 studies considered to have inherent clustering effects, adjustment for this was made in the statistical analysis in 223 (39.1%). Studies published in the Periodontology specialty accounted for clustering effects in the statistical analysis more often than articles published in other journals (OJ vs. PJ: OR=0.21, 95% CI: 0.12, 0.37, p<0.001; MJ vs. PJ: OR=0.02, 95% CI: 0.00, 0.07, p<0.001; PDJ vs. PJ: OR=0.14, 95% CI: 0.07, 0.28, p<0.001; EJ vs. PJ: OR=0.11, 95% CI: 0.06, 0.22, p<0.001). A positive correlation was found between increasing prevalence of clustering effects in individual specialty journals and correct statistical handling of clustering (r=0.89). CONCLUSIONS The majority of the studies examined in five dental specialty journals (60.9%) failed to account for clustering effects in statistical analysis where indicated, raising the possibility of inappropriate decreases in p-values and the risk of inappropriate inferences.
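
Why ignoring clustering shrinks p-values can be illustrated with the standard design-effect formula, 1 + (m - 1) × ICC, where m is the number of observations per patient and ICC is the intracluster correlation. The sketch below is not taken from the study; the function names and the numbers (six sites per patient, ICC of 0.3, 600 observations) are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from treating clustered observations as independent."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_obs, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_obs / design_effect(cluster_size, icc)


# Hypothetical example: 100 patients, 6 sites each, ICC = 0.3.
deff = design_effect(cluster_size=6, icc=0.3)
n_eff = effective_sample_size(n_obs=600, cluster_size=6, icc=0.3)
se_inflation = math.sqrt(deff)  # factor by which naive standard errors are too small

print(deff, n_eff, se_inflation)
```

Under these assumed values the 600 site-level observations carry roughly the information of 240 independent ones, so an unadjusted analysis would understate standard errors by a factor of about 1.6.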

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND: HCV coinfection remains a major cause of morbidity and mortality among HIV-infected individuals and its incidence has increased dramatically in HIV-infected men who have sex with men (MSM). METHODS: Hepatitis C virus (HCV) coinfection in the Swiss HIV Cohort Study (SHCS) was studied by combining clinical data with HIV-1 pol sequences from the SHCS Drug Resistance Database (DRDB). We inferred maximum-likelihood phylogenetic trees, determined Swiss HIV-transmission pairs as monophyletic patient pairs, and then considered the distribution of HCV on those pairs. RESULTS: Among the 9748 patients in the SHCS-DRDB with known HCV status, 2768 (28%) were HCV-positive. Focusing on subtype B (7644 patients), we identified 1555 potential HIV-1 transmission pairs. There, we found that, even after controlling for transmission group, calendar year, age and sex, the odds for an HCV coinfection were increased by an odds ratio (OR) of 3.2 [95% confidence interval (CI) 2.2, 4.7] if a patient clustered with another HCV-positive case. This strong association persisted if transmission groups of intravenous drug users (IDUs), MSMs and heterosexuals (HETs) were considered separately (in all cases OR >2). Finally, we found that HCV incidence was increased by a hazard ratio of 2.1 (1.1, 3.8) for individuals paired with an HCV-positive partner. CONCLUSIONS: Patients whose HIV virus is closely related to the HIV virus of HIV/HCV-coinfected patients have a higher risk for carrying or acquiring HCV themselves. This indicates the occurrence of domestic and sexual HCV transmission and allows the identification of patients with a high HCV-infection risk.
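
For readers less familiar with odds ratios, the sketch below computes an unadjusted OR with a Wald 95% confidence interval from a 2×2 table. It does not reproduce the study's estimate, which was adjusted for transmission group, calendar year, age and sex; the function name and the counts are hypothetical and chosen only to show the arithmetic.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: "exposed" means paired with an HCV-positive case.
or_, lo, hi = odds_ratio_wald_ci(a=40, b=60, c=20, d=80)
print(or_, lo, hi)
```

An interval that excludes 1, as with the study's reported 2.2 to 4.7, indicates an association unlikely to be explained by sampling variation alone.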

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Against the background of a widely fragmented and diluted international environmental governance architecture, different reform options are currently being discussed. This issue brief considers whether streamlining international environmental regimes by grouping or ‘clustering’ international agreements could improve effectiveness and efficiency. It outlines the general idea of the clustering approach, draws lessons from the chemicals and waste cluster and examines the implications and potentials of clustering multilateral environmental agreements.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND The insertion element IS630 found in Aeromonas salmonicida belongs to the IS630-Tc1-mariner superfamily of transposons. It is present in multiple copies and represents approximately half of the IS present in the genome of A. salmonicida subsp. salmonicida A449. RESULTS By using High Copy Number IS630 Restriction Fragment Length Polymorphism (HCN-IS630-RFLP), strains of various subspecies of Aeromonas salmonicida showed conserved or clustering patterns, thus allowing their differentiation from each other. Fingerprints of A. salmonicida subsp. salmonicida showed the highest homogeneity while 'atypical' A. salmonicida strains were more heterogeneous. IS630 typing also differentiated A. salmonicida from other Aeromonas species. The copy number of IS630 in Aeromonas salmonicida ranges from 8 to 35 and is much lower in other Aeromonas species. CONCLUSIONS HCN-IS630-RFLP is a powerful tool for subtyping of A. salmonicida. The high stability of IS630 insertions in A. salmonicida subsp. salmonicida indicates that it might have played a role in pathoadaptation of A. salmonicida which has reached an optimal configuration in the highly virulent and specific fish pathogen A. salmonicida subsp. salmonicida.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND Follicular variant of papillary thyroid carcinoma (FVPTC) shares features of papillary (PTC) and follicular (FTC) thyroid carcinomas on a clinical, morphological, and genetic level. MicroRNA (miRNA) deregulation was extensively studied in PTCs and FTCs. However, very limited information is available for FVPTC. The aim of this study was to assess miRNA expression in FVPTC with the most comprehensive miRNA array panel and to correlate it with the clinicopathological data. METHODS Forty-four papillary thyroid carcinomas (17 FVPTC, 27 classic PTC) and eight normal thyroid tissue samples were analyzed for expression of 748 miRNAs using Human Microarray Assays on the ABI 7900 platform (Life Technologies, Carlsbad, CA). In addition, an independent set of 61 tumor and normal samples was studied for expression of novel miRNA markers detected in this study. RESULTS Overall, the miRNA expression profile demonstrated similar trends between FVPTC and classic PTC. Fourteen miRNAs were deregulated in FVPTC with a fold change of more than five (up/down), including miRNAs known to be upregulated in PTC (miR-146b-3p, -146-5p, -221, -222 and miR-222-5p) and novel miRNAs (miR-375, -551b, -181a-2-3p, -99b-3p). However, the levels of miRNA expression were different between these tumor types, and some miRNAs were uniquely dysregulated in FVPTC, allowing separation of these tumors in the unsupervised hierarchical clustering analysis. Upregulation of novel miR-375 was confirmed in a large independent set of follicular cell derived neoplasms and benign nodules and demonstrated specific upregulation for PTC. Two miRNAs (miR-181a-2-3p, miR-99b-3p) were associated with an adverse outcome in FVPTC patients by Kaplan-Meier (p < 0.05) and multivariate Cox regression analysis (p < 0.05). CONCLUSIONS Despite high similarity in miRNA expression between FVPTC and classic PTC, several miRNAs were uniquely expressed in each tumor type, supporting their histopathologic differences. The highly upregulated miRNA identified in this study (miR-375) can serve as a novel marker of papillary thyroid carcinoma, and miR-181a-2-3p and miR-99b-3p can predict relapse-free survival in patients with FVPTC, thus potentially providing important diagnostic and predictive value.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We consider the problem of fitting a union of subspaces to a collection of data points drawn from one or more subspaces and corrupted by noise and/or gross errors. We pose this problem as a non-convex optimization problem, where the goal is to decompose the corrupted data matrix as the sum of a clean and self-expressive dictionary plus a matrix of noise and/or gross errors. By self-expressive we mean a dictionary whose atoms can be expressed as linear combinations of themselves with low-rank coefficients. In the case of noisy data, our key contribution is to show that this non-convex matrix decomposition problem can be solved in closed form from the SVD of the noisy data matrix. The solution involves a novel polynomial thresholding operator on the singular values of the data matrix, which requires minimal shrinkage. For one subspace, a particular case of our framework leads to classical PCA, which requires no shrinkage. For multiple subspaces, the low-rank coefficients obtained by our framework can be used to construct a data affinity matrix from which the clustering of the data according to the subspaces can be obtained by spectral clustering. In the case of data corrupted by gross errors, we solve the problem using an alternating minimization approach, which combines our polynomial thresholding operator with the more traditional shrinkage-thresholding operator. Experiments on motion segmentation and face clustering show that our framework performs on par with state-of-the-art techniques at a reduced computational cost.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The biomedical literature is extensively catalogued and indexed in MEDLINE. MEDLINE indexing is done by trained human indexers, who identify the most important concepts in each article, and is expensive and inconsistent. Automating the indexing task is difficult: the National Library of Medicine produces the Medical Text Indexer (MTI), which suggests potential indexing terms to the indexers. MTI’s output is not good enough to work unattended. In my thesis, I propose a different way to approach the indexing task called MEDRank. MEDRank creates graphs representing the concepts in biomedical articles and their relationships within the text, and applies graph-based ranking algorithms to identify the most important concepts in each article. I evaluate the performance of several automated indexing solutions, including my own, by comparing their output to the indexing terms selected by the human indexers. MEDRank outperformed all other evaluated indexing solutions, including MTI, in general indexing performance and precision. MEDRank can be used to cluster documents, index any kind of biomedical text with standard vocabularies, or could become part of MTI itself.
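
The abstract does not say which graph-based ranking algorithms MEDRank applies. As one plausible illustration, the sketch below runs a plain PageRank iteration over a small concept graph; the choice of PageRank, the function name, the damping factor and the toy graph are all assumptions and are not taken from the thesis.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Rank nodes of a graph given as {node: [neighbours]}.

    Every node must appear as a key. Nodes without outgoing links
    spread their rank evenly over all nodes.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, neighbours in graph.items():
            targets = neighbours if neighbours else nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical concept graph for one article; edges link concepts that
# are related in the text (listed in both directions).
concepts = {
    "thyroid carcinoma": ["miRNA", "biomarker", "survival"],
    "miRNA": ["thyroid carcinoma", "biomarker"],
    "biomarker": ["thyroid carcinoma", "miRNA"],
    "survival": ["thyroid carcinoma"],
}

scores = pagerank(concepts)
top_concept = max(scores, key=scores.get)
print(top_concept)
```

The best-connected concept receives the highest score, which is the intuition behind using graph centrality to pick candidate indexing terms.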

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An integrated approach for multi-spectral segmentation of MR images is presented. The method is based on the fuzzy c-means (FCM) algorithm and includes bias field correction and contextual constraints over the spatial intensity distribution, and it accounts for the non-spherical shape of clusters in the feature space. The bias field is modeled as a linear combination of smooth polynomial basis functions for fast computation in the clustering iterations. Regularization terms for the neighborhood continuity of intensity are added to the FCM cost functions. To reduce the computational complexity, the contextual regularizations are separated from the clustering iterations. Since the feature space is not isotropic, the distance measure adopted in the Gustafson-Kessel (G-K) algorithm is used instead of the Euclidean distance to account for the non-spherical shape of the clusters in the feature space. These algorithms are quantitatively evaluated on MR brain images using similarity measures.
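
For orientation, the sketch below shows only the plain FCM core on scalar intensities with Euclidean distance. It omits the bias field correction, the contextual regularization and the Gustafson-Kessel distance that the abstract describes; the function name, the fuzzifier m = 2, the starting centres and the intensity values are illustrative assumptions.

```python
def fuzzy_c_means(values, centers, m=2.0, iterations=50):
    """Plain fuzzy c-means on scalar data.

    Returns the final centres and the membership rows (one row per
    value, one column per cluster) used for the last centre update.
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # Value sits exactly on a centre: assign it there fully.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [
                    1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                    for d_i in dists
                ]
            memberships.append(row)
        # Each centre is the mean of the data weighted by membership ** m.
        centers = [
            sum(u[i] ** m * x for u, x in zip(memberships, values))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships


# Hypothetical voxel intensities from two tissue classes.
intensities = [10.0, 12.0, 11.0, 50.0, 52.0, 51.0]
centers, memberships = fuzzy_c_means(intensities, centers=[0.0, 100.0])
print(centers)
```

Each voxel keeps a graded membership in every class rather than a hard label, which is what lets extensions such as the one in the abstract add spatial and bias-field terms to the same cost function.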