25 results for Cluster Analysis. Information Theory. Entropy. Cross Information Potential. Complex Data

in University of Queensland eSpace - Australia


Relevance:

100.00%

Abstract:

This paper describes the application of a new technique, rough clustering, to the problem of market segmentation. Rough clustering produces different solutions from k-means analysis because objects can belong to more than one cluster. Traditional clustering methods generate extensional descriptions of groups that show which objects are members of each cluster. Clustering techniques based on rough set theory generate intensional descriptions, which outline the main characteristics of each cluster. In this study, a rough cluster analysis was conducted on a sample of 437 responses from a larger study of the relationship between shopping orientation (the general predisposition of consumers toward the act of shopping) and intention to purchase products via the Internet. The cluster analysis was based on five measures of shopping orientation: enjoyment, personalization, convenience, loyalty, and price. The rough clusters obtained provide interpretations of different shopping orientations present in the data without the restriction of attempting to fit each object into only one segment. Such descriptions can be an aid to marketers attempting to identify potential segments of consumers.

Relevance:

100.00%

Abstract:

Normal mixture models are often used to cluster continuous data. However, conventional approaches for fitting these models will have problems in producing nonsingular estimates of the component-covariance matrices when the dimension of the observations is large relative to the number of observations. In this case, methods such as principal components analysis (PCA) and the mixture of factor analyzers model can be adopted to avoid these estimation problems. We examine these approaches applied to the Cabernet wine data set of Ashenfelter (1999), considering the clustering of both the wines and the judges, and comparing our results with another analysis. The mixture of factor analyzers model proves particularly effective in clustering the wines, accurately classifying many of the wines by location.

Relevance:

100.00%

Abstract:

Finite mixture models are being increasingly used to model the distributions of a wide variety of random phenomena. While normal mixture models are often used to cluster data sets of continuous multivariate data, a more robust clustering can be obtained by considering the t mixture model-based approach. Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data where the number of observations n is not very large relative to their dimension p. As the approach using the multivariate normal family of distributions is sensitive to outliers, it is more robust to adopt the multivariate t family for the component error and factor distributions. The computational aspects associated with robustness and high dimensionality in these approaches to cluster analysis are discussed and illustrated.
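The robustness argument in this abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes a single univariate component (p = 1), fixed values for the location, scale, and degrees of freedom `nu`, and a made-up data set. It uses the standard weight that appears when a t component is fitted by EM, (nu + p) / (nu + squared Mahalanobis distance), which shrinks toward zero for observations far from the component mean.

```python
def t_weight(x, mean, sd, nu):
    """EM weight of one observation under a univariate t component (p = 1)."""
    delta2 = ((x - mean) / sd) ** 2  # squared standardized distance
    return (nu + 1.0) / (nu + delta2)


# Hypothetical data: three typical points and one gross outlier.
data = [-1.0, 0.0, 1.0, 10.0]

# Assumed current parameter values for the component (illustrative only).
mean, sd, nu = 0.0, 1.0, 4.0

weights = [t_weight(x, mean, sd, nu) for x in data]

# Ordinary mean (what a normal component would use) versus one
# weighted update of the location under the t component.
plain_mean = sum(data) / len(data)
robust_mean = sum(w * x for w, x in zip(weights, data)) / sum(weights)

print(weights)
print(plain_mean, robust_mean)
```

The outlier at 10 receives a weight close to zero, so the weighted location stays near the bulk of the data while the ordinary mean is pulled toward the outlier. This is a single update step under stated assumptions, not a full fitting procedure.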

Relevance:

100.00%

Abstract:

Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to which it has the highest estimated posterior probability of belonging; that is, the ith cluster consists of those observations assigned to the ith component (i = 1,..., g). The focus is on the use of mixtures of normal components for the cluster analysis of data that can be regarded as being continuous. But attention is also given to the case of mixed data, where the observations consist of both continuous and discrete variables.
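The assignment rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes univariate normal components whose mixing proportions, means, and standard deviations have already been estimated (the values below are hypothetical), and it covers only the final step of assigning each observation to the component with the highest posterior probability.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_probabilities(x, weights, means, sds):
    """Posterior probability that x belongs to each of the g components."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


def assign_cluster(x, weights, means, sds):
    """Index of the component with the highest posterior probability."""
    post = posterior_probabilities(x, weights, means, sds)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical fitted mixture with g = 2 components.
weights = [0.5, 0.5]
means = [0.0, 5.0]
sds = [1.0, 1.0]

data = [-0.3, 0.8, 4.6, 5.9, 2.4]
labels = [assign_cluster(x, weights, means, sds) for x in data]
print(labels)
```

With equal mixing proportions and equal variances the decision boundary lies midway between the two means, so the observation at 2.4 is assigned to the first component.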

Relevance:

100.00%

Abstract:

This paper considers a model-based approach to the clustering of tissue samples of a very large number of genes from microarray experiments. It is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. Frequently in practice, there are also clinical data available on those cases on which the tissue samples have been obtained. Here we investigate how to use the clinical data in conjunction with the microarray gene expression data to cluster the tissue samples. We propose two mixture model-based approaches in which the number of components in the mixture model corresponds to the number of clusters to be imposed on the tissue samples. One approach specifies the components of the mixture model to be the conditional distributions of the microarray data given the clinical data with the mixing proportions also conditioned on the latter data. Another takes the components of the mixture model to represent the joint distributions of the clinical and microarray data. The approaches are demonstrated on some breast cancer data, as studied recently in van't Veer et al. (2002).

Relevance:

100.00%

Abstract:

We describe a network module detection approach which combines a rapid and robust clustering algorithm with an objective measure of the coherence of the modules identified. The approach is applied to the network of genetic regulatory interactions surrounding the tumor suppressor gene p53. This algorithm identifies ten clusters in the p53 network, which are visually coherent and biologically plausible.

Relevance:

100.00%

Abstract:

Background: Cnidarian-dinoflagellate intracellular symbioses are one of the most important mutualisms in the marine environment. They form the trophic and structural foundation of coral reef ecosystems, and have played a key role in the evolutionary radiation and biodiversity of cnidarian species. Despite the prevalence of these symbioses, we still know very little about the molecular modulators that initiate, regulate, and maintain the interaction between these two different biological entities. In this study, we conducted a comparative host anemone transcriptome analysis using a cDNA microarray platform to identify genes involved in cnidarian-algal symbiosis. Results: We detected statistically significant differences in host gene expression profiles between sea anemones (Anthopleura elegantissima) in a symbiotic and non-symbiotic state. The group of genes whose expression is altered is diverse, suggesting that the molecular regulation of the symbiosis is governed by changes in multiple cellular processes. In the context of cnidarian-dinoflagellate symbioses, we discuss pivotal host gene expression changes involved in lipid metabolism, cell adhesion, cell proliferation, apoptosis, and oxidative stress. Conclusion: Our data do not support the existence of symbiosis-specific genes involved in controlling and regulating the symbiosis. Instead, it appears that the symbiosis is maintained by altering expression of existing genes involved in vital cellular processes. Specifically, the finding of key genes involved in cell cycle progression and apoptosis has led us to hypothesize that a suppression of apoptosis, together with a deregulation of the host cell cycle, creates a platform that might be necessary for symbiont and/or symbiont-containing host cell survival. This first comprehensive molecular examination of the cnidarian-dinoflagellate associations provides critical insights into the maintenance and regulation of the symbiosis.

Relevance:

100.00%

Abstract:

The data structure of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. This research develops a methodology for evaluating, ex ante, the relative desirability of alternative data structures for end user queries. This research theorizes that the data structure that yields the lowest weighted average complexity for a representative sample of information requests is the most desirable data structure for end user queries. The theory was tested in an experiment that compared queries from two different relational database schemas. As theorized, end users querying the data structure associated with the less complex queries performed better. Complexity was measured using three different Halstead metrics. Each of the three metrics provided excellent predictions of end user performance. This research supplies strong evidence that organizations can use complexity metrics to evaluate, ex ante, the desirability of alternative data structures. Organizations can use these evaluations to enhance the efficient and effective retrieval of information by creating data structures that minimize end user query complexity.
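The weighted average complexity idea in this abstract can be sketched as follows. The abstract does not say which three Halstead metrics were used, so volume, difficulty, and effort are assumed here. The two schemas, the hand-tokenized queries, and the weights (standing in for how often each information request occurs) are all hypothetical.

```python
import math


def halstead(operators, operands):
    """Halstead volume, difficulty and effort from lists of tokens."""
    n1, n2 = len(set(operators)), len(set(operands))  # distinct tokens
    N1, N2 = len(operators), len(operands)            # total tokens
    volume = (N1 + N2) * math.log2(n1 + n2)
    difficulty = (n1 / 2) * (N2 / n2)
    return {"volume": volume, "difficulty": difficulty,
            "effort": difficulty * volume}


def weighted_average_complexity(queries, metric="volume"):
    """queries: list of (operators, operands, weight) tuples."""
    total_weight = sum(w for _, _, w in queries)
    weighted = sum(halstead(ops, opds)[metric] * w for ops, opds, w in queries)
    return weighted / total_weight


# Hypothetical request sample. Schema A answers the first request from
# one table; schema B needs a join for the same request.
simple_lookup = (["SELECT", "FROM"], ["name", "customer"], 1.0)

schema_a = [
    (["SELECT", "FROM", "WHERE", "="],
     ["name", "customer", "id", "42"], 3.0),
    simple_lookup,
]
schema_b = [
    (["SELECT", "FROM", "JOIN", "ON", "=", "WHERE", "="],
     ["name", "customer", "account", "customer_id", "id", "id", "42"], 3.0),
    simple_lookup,
]

for metric in ("volume", "difficulty", "effort"):
    print(metric,
          weighted_average_complexity(schema_a, metric),
          weighted_average_complexity(schema_b, metric))
```

Under these assumptions schema A has the lower weighted average on all three metrics, so it would be the preferred structure according to the theory stated in the abstract.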

Relevance:

100.00%

Abstract:

As part of a comparative mapping study between sugarcane and sorghum, a sugarcane cDNA clone with homology to the maize Rp1-D rust resistance gene was mapped in sorghum. The cDNA probe hybridised to multiple loci, including one on sorghum linkage group (LG) E in a region where a major rust resistance QTL had been previously mapped. Partial sorghum Rp1-D homologues were isolated from genomic DNA of rust-resistant and -susceptible progeny selected from a sorghum mapping population. Sequencing of the Rp1-D homologues revealed five discrete sequence classes: three from resistant progeny and two from susceptible progeny. PCR primers specific to each sequence class were used to amplify products from the progeny and confirmed that the five sequence classes mapped to the same locus on LG E. Cluster analysis of these sorghum sequences and available sugarcane, maize and sorghum Rp1-D homologue sequences showed that the maize Rp1-D sequence and the partial sugarcane Rp1-D homologue were clustered with one of the sorghum resistant progeny sequence classes, while previously published sorghum Rp1-D homologue sequences clustered with the susceptible progeny sequence classes. Full-length sequence information was obtained for one member of a resistant progeny sequence class (Rp1-SO) and compared with the maize Rp1-D sequence and a previously identified sorghum Rp1 homologue (Rph1-2). There was considerable similarity between the two sorghum sequences and less similarity between the sorghum and maize sequences. These results suggest a conservation of function and gene sequence homology at the Rp1 loci of maize and sorghum and provide a basis for convenient PCR-based screening tools for putative rust resistance alleles in sorghum.

Relevance:

100.00%

Abstract:

Various marker systems exist for genetic analysis of horticultural species. Isozymes were first applied to the woody perennial nut crop, macadamia, in the early 1990s. The advent of DNA markers saw the development, for macadamia, of STMS (sequence-tagged microsatellite site), RAPD (randomly amplified polymorphic DNA), and RAF (randomly amplified DNA fingerprinting). The RAF technique typically generates dominant markers, but within the dominant marker profiles, certain primers also amplify multi-allelic co-dominant markers that are suspected to be microsatellites. In this paper, we confirm this for one such marker, and describe how RAF primers can be chosen that amplify one or more putative microsatellites. This approach of genotyping anonymous microsatellite markers via RAF is designated RAMiFi (randomly amplified microsatellite fingerprinting). Several marker systems were compared for the type, amount, and cost-efficiency of the information generated, using data from published studies on macadamia. The markers were also compared for the way they clustered a common set of accessions. The RAMiFi approach was identified as the most efficient and economical. The availability of such a versatile tool offers many advantages for the genetic characterisation of horticultural species.

Relevance:

100.00%

Abstract:

Rumor discourse has been conceptualized as an attempt to reduce anxiety and uncertainty via a process of social sensemaking. Fourteen rumors transmitted on various Internet discussion groups were observed and content analyzed over the life of each rumor. With this (previously unavailable) more ecologically robust methodology, the intertwined threads of sensemaking and the gaining of interpretive control are clearly evident in the tapestry of rumor discourse. We propose a categorization of statements (the Rumor Interaction Analysis System) and find differences between dread rumors and wish rumors in anxiety-related content categories. Cluster analysis of these statements reveals a typology of voices (communicative postures) exhibiting sensemaking activities of the rumor discussion group, such as hypothesizing, skeptical critique, directing of activities to gain information, and presentation of evidence. These findings enrich our understanding of the long-implicated sensemaking function of rumor by clarifying the elements of communication that operate in rumor's social context.

Relevance:

100.00%

Abstract:

Human social organization can deeply affect levels of genetic diversity. This fact implies that genetic information can be used to study social structures, which is the basis of ethnogenetics. Recently, methods have been developed to extract this information from genetic data gathered from subdivided populations that have gone through recent spatial expansions, which is typical of most human populations. Here, we perform a Bayesian analysis of mitochondrial and Y chromosome diversity in three matrilocal and three patrilocal groups from northern Thailand to infer the number of males and females arriving in these populations each generation and to estimate the age of their range expansion. We find that the number of male immigrants is 8 times smaller in patrilocal populations than in matrilocal populations, whereas women move 2.5 times more in patrilocal populations than in matrilocal populations. In addition to providing genetic quantification of sex-specific dispersal rates in human populations, we show that although men and women are exchanged at a similar rate between matrilocal populations, there are far fewer men than women moving into patrilocal populations. This finding is compatible with the hypothesis that men are strictly controlling male immigration and promoting female immigration in patrilocal populations and that immigration is much less regulated in matrilocal populations.

Relevance:

100.00%

Abstract:

Recognising the laterality of a pictured hand involves making an initial decision and confirming that choice by mentally moving one's own hand to match the picture. This depends on an intact body schema. Because patients with complex regional pain syndrome type 1 (CRPS1) take longer to recognise a hand's laterality when it corresponds to their affected hand, it has been proposed that nociceptive input disrupts the body schema. However, chronic pain is associated with physiological and psychosocial complexities that may also explain the results. In three studies, we investigated whether the effect is simply due to nociceptive input. Study one evaluated the temporal and perceptual characteristics of acute hand pain elicited by intramuscular injection of hypertonic saline into the thenar eminence. In studies two and three, subjects performed a hand laterality recognition task before, during, and after acute experimental hand pain, and experimental elbow pain, respectively. During hand pain and during elbow pain, when the laterality of the pictured hand corresponded to the painful side, there was no effect on response time (RT). That suggests that nociceptive input alone is not sufficient to disrupt the working body schema. In contrast to patients with CRPS1, when the laterality of the pictured hand corresponded to the non-painful hand, RT increased by approximately 380 ms (95% confidence interval 190 ms-590 ms). The results highlight the differences between acute and chronic pain and may reflect a bias in information processing in acute pain toward the affected part.