38 results for Cluster Analysis. Information Theory. Entropy. Cross Information Potential. Complex Data

in University of Queensland eSpace - Australia


Relevance: 100.00%

Abstract:

This paper describes the application of a new technique, rough clustering, to the problem of market segmentation. Rough clustering produces different solutions from k-means analysis because objects can belong to more than one cluster. Traditional clustering methods generate extensional descriptions of groups that show which objects are members of each cluster. Clustering techniques based on rough set theory generate intensional descriptions, which outline the main characteristics of each cluster. In this study, a rough cluster analysis was conducted on a sample of 437 responses from a larger study of the relationship between shopping orientation (the general predisposition of consumers toward the act of shopping) and intention to purchase products via the Internet. The cluster analysis was based on five measures of shopping orientation: enjoyment, personalization, convenience, loyalty, and price. The rough clusters obtained provide interpretations of different shopping orientations present in the data without the restriction of attempting to fit each object into only one segment. Such descriptions can be an aid to marketers attempting to identify potential segments of consumers.
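The idea of multiple cluster membership can be illustrated with a minimal sketch. The abstract does not specify the paper's algorithm, so the rule below is an assumption: it follows a common rough k-means style convention in which an object joins the upper approximation of every cluster whose centroid lies within a chosen ratio of its nearest centroid's distance. The function name `rough_assign`, the `ratio` threshold, and the example centroids are all hypothetical.

```python
import math

def rough_assign(point, centroids, ratio=1.3):
    """Assign a point to one cluster (lower approximation) or to several
    clusters (upper approximations only). Illustrative rule, not the
    method used in the paper."""
    dists = [math.dist(point, c) for c in centroids]
    nearest = min(range(len(dists)), key=dists.__getitem__)
    d_min = dists[nearest]
    # Clusters whose centroid is almost as close as the nearest one.
    close = [k for k, d in enumerate(dists) if d <= ratio * d_min]
    if len(close) == 1:
        # Unambiguous: definite member of exactly one cluster.
        return {"lower": nearest, "upper": [nearest]}
    # Ambiguous: possible member of several clusters, definite member of none.
    return {"lower": None, "upper": close}

# Hypothetical centroids for two segments.
centroids = [(0.0, 0.0), (10.0, 0.0)]
clear_case = rough_assign((1.0, 0.0), centroids)
boundary_case = rough_assign((4.5, 0.0), centroids)
```

Here the first point belongs definitely to cluster 0, while the second sits near the boundary and is kept as a possible member of both clusters rather than being forced into one segment.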

Relevance: 100.00%

Abstract:

Normal mixture models are often used to cluster continuous data. However, conventional approaches for fitting these models will have problems in producing nonsingular estimates of the component-covariance matrices when the dimension of the observations is large relative to the number of observations. In this case, methods such as principal components analysis (PCA) and the mixture of factor analyzers model can be adopted to avoid these estimation problems. We examine these approaches applied to the Cabernet wine data set of Ashenfelter (1999), considering the clustering of both the wines and the judges, and comparing our results with another analysis. The mixture of factor analyzers model proves particularly effective in clustering the wines, accurately classifying many of the wines by location.

Relevance: 100.00%

Abstract:

Finite mixture models are being increasingly used to model the distributions of a wide variety of random phenomena. While normal mixture models are often used to cluster data sets of continuous multivariate data, a more robust clustering can be obtained by considering the t mixture model-based approach. Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data where the number of observations n is not very large relative to their dimension p. As the approach using the multivariate normal family of distributions is sensitive to outliers, it is more robust to adopt the multivariate t family for the component error and factor distributions. The computational aspects associated with robustness and high dimensionality in these approaches to cluster analysis are discussed and illustrated.

Relevance: 100.00%

Abstract:

The purpose of this study was to investigate the relationship between self-awareness, emotional distress, motivation, and outcome in adults with severe traumatic brain injury. A sample of 55 patients was selected from 120 consecutive patients with severe traumatic brain injury admitted to the rehabilitation unit of a large metropolitan public hospital. Subjects received multidisciplinary inpatient rehabilitation and different types of outpatient rehabilitation and community-based services according to availability and need. Measures used in the cluster analysis were the Patient Competency Rating Scale, Self-Awareness of Deficits Interview, Head Injury Behavior Scale, Change Assessment Questionnaire, the Beck Depression Inventory, and Beck Anxiety Inventory; outcome measures were the Disability Rating Scale, Community Integration Questionnaire, and Sickness Impact Profile. A three-cluster solution was selected, with groups labeled as high self-awareness (n = 23), low self-awareness (n = 23), and good recovery (n = 8). The high self-awareness cluster had significantly higher levels of self-awareness, motivation, and emotional distress than the low self-awareness cluster but did not differ significantly in outcome. Self-awareness after brain injury is associated with greater motivation to change behavior and higher levels of depression and anxiety; however, it was not clear that this heightened motivation actually led to any improvement in outcome. Rehabilitation timing and approach may need to be tailored to match the individual's level of self-awareness, motivation, and emotional distress.

Relevance: 100.00%

Abstract:

Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to which it has the highest estimated posterior probability of belonging; that is, the ith cluster consists of those observations assigned to the ith component (i = 1,..., g). The focus is on the use of mixtures of normal components for the cluster analysis of data that can be regarded as being continuous. But attention is also given to the case of mixed data, where the observations consist of both continuous and discrete variables.
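The assignment step described above can be sketched as follows, under stated assumptions: a univariate normal mixture with g = 2 components whose parameters are taken as already fitted. The parameter values, data, and function names are hypothetical and chosen only for illustration; the fitting step itself (for example by the EM algorithm) is not shown.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_probs(x, weights, means, sds):
    """Estimated posterior probability that x belongs to each component."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

def assign_clusters(data, weights, means, sds):
    """Outright clustering: each observation goes to the component
    with the highest posterior probability."""
    labels = []
    for x in data:
        post = posterior_probs(x, weights, means, sds)
        labels.append(max(range(len(post)), key=post.__getitem__))
    return labels

# Hypothetical fitted parameters for a two-component mixture.
weights = [0.5, 0.5]
means = [0.0, 5.0]
sds = [1.0, 1.0]

data = [-0.3, 0.8, 4.6, 5.9, 2.4]
labels = assign_clusters(data, weights, means, sds)
```

With equal mixing proportions and equal variances, the decision boundary falls at the midpoint of the two means (2.5), so the observation 2.4 is assigned to the first component.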

Relevance: 100.00%

Abstract:

This paper considers a model-based approach to the clustering of tissue samples of a very large number of genes from microarray experiments. It is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. Frequently in practice, there are also clinical data available on those cases on which the tissue samples have been obtained. Here we investigate how to use the clinical data in conjunction with the microarray gene expression data to cluster the tissue samples. We propose two mixture model-based approaches in which the number of components in the mixture model corresponds to the number of clusters to be imposed on the tissue samples. One approach specifies the components of the mixture model to be the conditional distributions of the microarray data given the clinical data with the mixing proportions also conditioned on the latter data. Another takes the components of the mixture model to represent the joint distributions of the clinical and microarray data. The approaches are demonstrated on some breast cancer data, as studied recently in van't Veer et al. (2002).

Relevance: 100.00%

Abstract:

We describe a network module detection approach which combines a rapid and robust clustering algorithm with an objective measure of the coherence of the modules identified. The approach is applied to the network of genetic regulatory interactions surrounding the tumor suppressor gene p53. This algorithm identifies ten clusters in the p53 network, which are visually coherent and biologically plausible.

Relevance: 100.00%

Abstract:

Background: Cnidarian-dinoflagellate intracellular symbioses are one of the most important mutualisms in the marine environment. They form the trophic and structural foundation of coral reef ecosystems, and have played a key role in the evolutionary radiation and biodiversity of cnidarian species. Despite the prevalence of these symbioses, we still know very little about the molecular modulators that initiate, regulate, and maintain the interaction between these two different biological entities. In this study, we conducted a comparative host anemone transcriptome analysis using a cDNA microarray platform to identify genes involved in cnidarian-algal symbiosis. Results: We detected statistically significant differences in host gene expression profiles between sea anemones (Anthopleura elegantissima) in a symbiotic and non-symbiotic state. The group of genes whose expression is altered is diverse, suggesting that the molecular regulation of the symbiosis is governed by changes in multiple cellular processes. In the context of cnidarian-dinoflagellate symbioses, we discuss pivotal host gene expression changes involved in lipid metabolism, cell adhesion, cell proliferation, apoptosis, and oxidative stress. Conclusion: Our data do not support the existence of symbiosis-specific genes involved in controlling and regulating the symbiosis. Instead, it appears that the symbiosis is maintained by altering expression of existing genes involved in vital cellular processes. Specifically, the finding of key genes involved in cell cycle progression and apoptosis has led us to hypothesize that a suppression of apoptosis, together with a deregulation of the host cell cycle, creates a platform that might be necessary for symbiont and/or symbiont-containing host cell survival. This first comprehensive molecular examination of the cnidarian-dinoflagellate associations provides critical insights into the maintenance and regulation of the symbiosis.

Relevance: 100.00%

Abstract:

1. There are a variety of methods that could be used to increase the efficiency of the design of experiments. However, it is only recently that such methods have been considered in the design of clinical pharmacology trials. 2. Two such methods, termed data-dependent (e.g. simulation) and data-independent (e.g. analytical evaluation of the information in a particular design), are becoming increasingly used as efficient methods for designing clinical trials. These two design methods have tended to be viewed as competitive, although a complementary role in design is proposed here. 3. The impetus for the use of these two methods has been the need for a more fully integrated approach to the drug development process that specifically allows for sequential development (i.e. where the results of early phase studies influence later-phase studies). 4. The present article briefly presents the background and theory that underpins both the data-dependent and -independent methods with the use of illustrative examples from the literature. In addition, the potential advantages and disadvantages of each method are discussed.

Relevance: 100.00%

Abstract:

Experimental data for E. coli debris size reduction during high-pressure homogenisation at 55 MPa are presented. A mathematical model based on grinding theory is developed to describe the data. The model is based on first-order breakage and compensation conditions. It does not require any assumption of a specified distribution for debris size and can be used given information on the initial size distribution of whole cells and the disruption efficiency during homogenisation. The number of homogeniser passes is incorporated into the model and used to describe the size reduction of non-induced stationary and induced E. coli cells during homogenisation. Regressing the results to the model equations gave an excellent fit to experimental data (> 98.7% of variance explained for both fermentations), confirming the model's potential for predicting size reduction during high-pressure homogenisation. This study provides a means to optimise both homogenisation and disc-stack centrifugation conditions for recombinant product recovery. (C) 1997 Elsevier Science Ltd.

Relevance: 100.00%

Abstract:

Existing procedures for the generation of polymorphic DNA markers are not optimal for insect studies in which the organisms are often tiny and background molecular information is often non-existent. We have used a new high-throughput DNA marker generation protocol called randomly amplified DNA fingerprints (RAF) to analyse the genetic variability in three separate strains of the stored grain pest, Rhyzopertha dominica. This protocol is quick, robust and reliable even though it requires minimal sample preparation, minute amounts of DNA and no prior molecular analysis of the organism. Arbitrarily selected oligonucleotide primers routinely produced approximately 50 scoreable polymorphic DNA markers between individuals of three independent field isolates of R. dominica. Multivariate cluster analysis using forty-nine arbitrarily selected polymorphisms generated from a single primer reliably separated individuals into three clades corresponding to their geographical origin. The resulting clades were quite distinct, with an average genetic difference of 37.5 +/- 6.0% between clades and of 21.0 +/- 7.1% between individuals within clades. As a prelude to future gene mapping efforts, we have also assessed the performance of RAF under conditions commonly used in gene mapping. In this analysis, fingerprints from pooled DNA samples accurately and reproducibly reflected RAF profiles obtained from individual DNA samples that had been combined to create the bulked samples.

Relevance: 100.00%

Abstract:

A marker database was compiled for isolates of the potato and tomato late blight pathogen, Phytophthora infestans, originating from 41 locations which include 31 countries plus 10 regions within Mexico. Presently, the database contains information on 1,776 isolates for one or more of the following markers: restriction fragment length polymorphism (RFLP) fingerprint consisting of 23 bands; mating type; dilocus allozyme genotype; mitochondrial DNA haplotype; sensitivity to the fungicide metalaxyl; and virulence. In the database, 305 entries have unique RFLP fingerprints and 258 entries have unique multilocus genotypes based on RFLP fingerprint, dilocus allozyme genotype, and mating type. A nomenclature is described for naming multilocus genotypes based on the International Organization for Standardization (ISO) two-letter country code and a unique number. Forty-two previously published multilocus genotypes are represented in the database with references to publications. As a result of compilation of the database, seven new genotypes were identified and named. Cluster analysis of genotypes from clonally propagated populations worldwide generally confirmed a previously published classification of old and new genotypes. Genotypes from geographically distant countries were frequently clustered, and several old and new genotypes were found in two or more distant countries. The cluster analysis also demonstrated that A2 genotypes from Argentina differed from all others. The database is available via the Internet, and thus can serve as a resource for Phytophthora workers worldwide.