31 results for Incremental Clustering

in University of Queensland eSpace - Australia


Relevance:

60.00%

Publisher:

Abstract:

In this paper, we present ICICLE (Image ChainNet and Incremental Clustering Engine), a prototype system that we have developed to efficiently and effectively retrieve WWW images based on image semantics. ICICLE has two distinguishing features. First, it employs a novel image representation model called Weight ChainNet to capture the semantics of the image content. A new formula, called list space model, for computing semantic similarities is also introduced. Second, to speed up retrieval, ICICLE employs an incremental clustering mechanism, ICC (Incremental Clustering on ChainNet), to cluster images with similar semantics into the same partition. Each cluster has a summary representative and all clusters' representatives are further summarized into a balanced and full binary tree structure. We conducted an extensive performance study to evaluate ICICLE. Compared with some recently proposed methods, our results show that ICICLE provides better recall and precision. Our clustering technique ICC facilitates speedy retrieval of images without sacrificing recall and precision significantly.
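The abstract does not spell out how ICC assigns images to clusters, so the following is only a generic sketch of incremental (single-pass) clustering, not the ICC algorithm itself. It assumes each item is already a numeric feature vector, uses cosine similarity, and keeps a running-mean centroid as the "summary representative" of each cluster. The class name `IncrementalClusterer` and the `threshold` parameter are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class IncrementalClusterer:
    """Hypothetical leader-style clusterer: one pass, no re-clustering."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed similarity cut-off
        self.centroids = []         # one summary representative per cluster
        self.counts = []            # number of items absorbed by each cluster

    def add(self, vec):
        """Assign vec to the most similar cluster, or open a new one."""
        best, best_sim = -1, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= self.threshold:
            n = self.counts[best]
            # Update the representative as a running mean of its members.
            self.centroids[best] = [
                (c * n + v) / (n + 1)
                for c, v in zip(self.centroids[best], vec)
            ]
            self.counts[best] = n + 1
            return best
        self.centroids.append(list(vec))
        self.counts.append(1)
        return len(self.centroids) - 1
```

Each new item is compared only against the cluster representatives, which is what makes this style of clustering cheap to maintain as items arrive one at a time.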

Relevance:

20.00%

Publisher:

Abstract:

We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question on the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

Background: Although assessment of myocardial perfusion by myocardial contrast echocardiography (MCE) is feasible, its incremental benefit to stress echocardiography is not well defined. We examined whether the addition of MCE to combined dipyridamole-exercise echocardiography (DExE) provides incremental benefit for evaluation of coronary artery disease (CAD). Methods and Results: MCE was combined with DExE in 85 patients, 70 of whom were undergoing quantitative coronary angiography and 15 patients with a low probability of CAD. MCE was acquired by low-mechanical-index imaging in 3 apical views after acquisition of standard resting and poststress images. Wall motion, left ventricular opacification, and MCE components of the study were interpreted sequentially, blinded to other data. Significant (>50%) stenoses were present in 43 patients and involved 69 coronary territories. The addition of qualitative MCE improved sensitivity for the detection of CAD (91% versus 74%, P=0.02) and accurate recognition of disease extent (87% versus 65% of territories, P=0.003), with a nonsignificant reduction in specificity. Conclusions: The addition of low-mechanical-index MCE to standard imaging during DExE improves detection of CAD and enables a more accurate determination of disease extent.

Relevance:

20.00%

Publisher:

Abstract:

Stable social aggregations are rarely recorded in lizards, but have now been reported from several species in the Australian scincid genus Egernia. Most of those examples come from species using rock crevice refuges that are relatively easy to observe. But for many other Egernia species that occupy different habitats and are more secretive, it is hard to gather the observational data needed to deduce their social structure. Therefore, we used genotypes at six polymorphic microsatellite DNA loci of 229 individuals of Egernia frerei, trapped in 22 sampling sites over 3500 ha of eucalypt forest on Fraser Island, Australia. Each sampling site contained 15 trap locations in a 100 x 50 m grid. We estimated relatedness among pairs of individuals and found that relatedness was higher within than between sites. Relatedness of females within sites was higher than relatedness of males, and was higher than relatedness between males and females. Within sites we found that juvenile lizards were highly related to other juveniles and to adults trapped at the same location, or at adjacent locations, but relatedness decreased with increasing trap separation. We interpreted the results as suggesting high natal philopatry among juvenile lizards and adult females. This result is consistent with stable family group structure previously reported in rock dwelling Egernia species, and suggests that social behaviour in this genus is not habitat driven.

Relevance:

20.00%

Publisher:

Abstract:

Evolutionary algorithms perform optimization using a population of sample solution points. An interesting development has been to view population-based optimization as the process of evolving an explicit, probabilistic model of the search space. This paper investigates a formal basis for continuous, population-based optimization in terms of a stochastic gradient descent on the Kullback-Leibler divergence between the model probability density and the objective function, represented as an unknown density of assumed form. This leads to an update rule that is related and compared with previous theoretical work, a continuous version of the population-based incremental learning algorithm, and the generalized mean shift clustering framework. Experimental results are presented that demonstrate the dynamics of the new algorithm on a set of simple test problems.
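For orientation, here is a deliberately simplified sketch in the spirit of continuous population-based incremental learning. It is not the update rule derived in the paper, which the abstract does not state. It assumes an isotropic Gaussian search model with a fixed standard deviation and moves the model mean a fraction `alpha` toward the best sample of each generation. The function names, the test objective `sphere`, and all parameter values are hypothetical.

```python
import random


def sphere(x, target=(3.0, 3.0)):
    """Toy objective: squared distance to an assumed optimum."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def continuous_pbil(objective, mean, sigma=0.5, alpha=0.3,
                    pop_size=20, generations=200, seed=0):
    """Evolve the mean of a Gaussian search model toward low objective values."""
    rng = random.Random(seed)
    mean = list(mean)
    for _ in range(generations):
        # Sample a population from the current probabilistic model.
        population = [[rng.gauss(m, sigma) for m in mean]
                      for _ in range(pop_size)]
        # Select the best sample (minimization).
        best = min(population, key=objective)
        # Incremental update: shift the model mean toward the best sample.
        mean = [(1.0 - alpha) * m + alpha * b for m, b in zip(mean, best)]
    return mean
```

Calling `continuous_pbil(sphere, [0.0, 0.0])` returns the final model mean. The population is discarded every generation, and only the model parameters carry information forward.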

Relevance:

20.00%

Publisher:

Abstract:

Objectives: The objectives of this study were to examine the extent of clustering of smoking, high levels of television watching, overweight, and high blood pressure among adolescents and whether this clustering varies by socioeconomic position and cognitive function. Methods: This study was a cross-sectional analysis of 3613 (1742 females) participants of an Australian birth cohort who were examined at age 14. Results: Three hundred fifty-three (9.8%) of the participants had co-occurrence of three or four risk factors. Risk factors clustered in these adolescents: more participants than would be predicted under independence had no risk factors, and more had three or four risk factors. The extent of clustering tended to be greater in those from lower-income families and among those with lower cognitive function. The age-adjusted ratio of observed to expected co-occurrence of three or four risk factors was 2.70 (95% confidence interval [CI], 1.80-4.06) among those from low-income families and 1.70 (95% CI, 1.34-2.16) among those from more affluent families. The ratio among those with low Raven's scores (nonverbal reasoning) was 2.36 (95% CI, 1.69-3.30) and among those with higher scores was 1.51 (95% CI, 1.19-1.92); similar results for the WRAT 3 score (reading ability) were 2.69 (95% CI, 1.85-3.94) and 1.68 (95% CI, 1.34-2.11). Clustering did not differ by sex. Conclusion: Among adolescents, coronary heart disease risk factors cluster, and there is some evidence that this clustering is greater among those from families with low income and those who have lower cognitive function.
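The observed-to-expected ratio used above can be illustrated with a small sketch. It assumes the expected count comes from multiplying the marginal prevalences of the risk factors as if they were independent, and it omits the age adjustment the study applied. The function names and any prevalence values passed in are hypothetical; the abstract does not report the individual prevalences.

```python
from itertools import product


def expected_prob_at_least(prevalences, k):
    """P(at least k risk factors present) if the factors were independent."""
    total = 0.0
    # Enumerate every present/absent pattern across the risk factors.
    for combo in product([0, 1], repeat=len(prevalences)):
        if sum(combo) >= k:
            p = 1.0
            for present, prev in zip(combo, prevalences):
                p *= prev if present else (1.0 - prev)
            total += p
    return total


def observed_to_expected(observed_count, n, prevalences, k):
    """Ratio of the observed count to the count expected under independence."""
    expected_count = n * expected_prob_at_least(prevalences, k)
    return observed_count / expected_count
```

A ratio above 1 means that co-occurrence is more common than independence would predict, which is the sense in which the risk factors "cluster".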

Relevance:

20.00%

Publisher:

Abstract:

Motivation: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes. Results: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm for which the E (expectation) and M (maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is considered too.

Relevance:

20.00%

Publisher:

Abstract:

Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research work aims at developing an incremental expectation maximization (EM) based learning approach on mixture of experts (ME) system for on-line prediction of LOS. The use of a batch-mode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their patterns change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning. The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value = 0.044) increase of Prop(MAD

Relevance:

20.00%

Publisher:

Abstract:

We have undertaken two-dimensional gel electrophoresis proteomic profiling on a series of cell lines with different recombinant antibody production rates. Due to the nature of gel-based experiments not all protein spots are detected across all samples in an experiment, and hence datasets are invariably incomplete. New approaches are therefore required for the analysis of such graduated datasets. We approached this problem in two ways. Firstly, we applied a missing value imputation technique to calculate missing data points. Secondly, we combined a singular value decomposition based hierarchical clustering with the expression variability test to identify protein spots whose expression correlates with increased antibody production. The results have shown that while imputation of missing data was a useful method to improve the statistical analysis of such data sets, this was of limited use in differentiating between the samples investigated, and highlighted a small number of candidate proteins for further investigation. (c) 2006 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

Aims: Technological advances in cardiac imaging have led to dramatic increases in test utilization and consumption of a growing proportion of cardiovascular healthcare costs. The opportunity costs of strategies favouring exercise echocardiography or SPECT imaging have been incompletely evaluated. Methods and results: We examined prognosis and cost-effectiveness of exercise echocardiography (n=4884) vs. SPECT (n=4637) imaging in stable, intermediate risk, chest pain patients. Ischaemia extent was defined as the number of vascular territories with echocardiographic wall motion or SPECT perfusion abnormalities. Cox proportional hazard models were employed to assess time to cardiac death or myocardial infarction (MI). Total cardiovascular costs were summed (discounted and inflation-corrected) throughout follow-up. A cost-effectiveness ratio = 2% annual event risk), SPECT ischaemia was associated with earlier and greater utilization of coronary revascularization (P < 0.0001) resulting in an incremental cost-effectiveness ratio of $32 381/LYS. Conclusion: Health care policies aimed at allocating limited resources can be effectively guided by applying clinical and economic outcomes evidence. A strategy aimed at cost-effective testing would support using echocardiography in low-risk patients with suspected coronary disease, whereas those higher risk patients benefit from referral to SPECT imaging.

Relevance:

20.00%

Publisher:

Abstract:

In this article, we propose a framework, namely, Prediction-Learning-Distillation (PLD), for interactive document classification and distilling misclassified documents. Whenever a user points out misclassified documents, the PLD learns from the mistakes and identifies the same mistakes among all other classified documents. The PLD then enforces this learning for future classifications. If the classifier fails to accept relevant documents or reject irrelevant documents on certain categories, then PLD will assign those documents as new positive/negative training instances. The classifier can then address its weaknesses by learning from these new training instances. Our experimental results have demonstrated that the proposed algorithm can learn from user-identified misclassified documents, and then distil the rest successfully.

Relevance:

20.00%

Publisher:

Abstract:

Quality of life has been shown to be poor among people living with chronic hepatitis C. However, it is not clear how this relates to the presence of symptoms and their severity. The aim of this study was to describe the typology of a broad array of symptoms that were attributed to hepatitis C virus (HCV) infection. Phase 1 used qualitative methods to identify symptoms. In Phase 2, 188 treatment-naive people living with HCV participated in a quantitative survey. The most prevalent symptom was physical tiredness (86%) followed by irritability (75%), depression (70%), mental tiredness (70%), and abdominal pain (68%). Temporal clustering of symptoms was reported in 62% of participants. Principal components analysis identified four symptom clusters: neuropsychiatric (mental tiredness, poor concentration, forgetfulness, depression, irritability, physical tiredness, and sleep problems); gastrointestinal (day sweats, nausea, food intolerance, night sweats, abdominal pain, poor appetite, and diarrhea); algesic (joint pain, muscle pain, and general body pain); and dysesthetic (noise sensitivity, light sensitivity, skin problems, and headaches). These data demonstrate that symptoms are prevalent in treatment-naive people with HCV and support the hypothesis that symptom clustering occurs.