891 resultados para Hierarchical Clustering Model


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Motivation: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes. Results: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm for which the E(expectation) and M(maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is considered too.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Visualization has proven to be a powerful and widely-applicable tool the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

System compositional approach to model construction and research of informational processes, which take place in biological hierarchical neural networks, is being discussed. A computer toolbox has been successfully developed for solution of tasks from this scientific sphere. A series of computational experiments investigating the work of this toolbox on olfactory bulb model has been carried out. The well-known psychophysical phenomena have been reproduced in experiments.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Results of numerical experiments are introduced. Experiments were carried out by means of computer simulation on olfactory bulb for the purpose of checking of thinking mechanisms conceptual model, introduced in [2]. Key role of quasisymbol neurons in processes of pattern identification, existence of mental view, functions of cyclic connections between symbol and quasisymbol neurons as short-term memory, important role of synaptic plasticity in learning processes are confirmed numerically. Correctness of fundamental ideas put in base of conceptual model is confirmed on olfactory bulb at quantitative level.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In recent years, there has been an increasing interest in learning a distributed representation of word sense. Traditional context clustering based models usually require careful tuning of model parameters, and typically perform worse on infrequent word senses. This paper presents a novel approach which addresses these limitations by first initializing the word sense embeddings through learning sentence-level embeddings from WordNet glosses using a convolutional neural networks. The initialized word sense embeddings are used by a context clustering based model to generate the distributed representations of word senses. Our learned representations outperform the publicly available embeddings on half of the metrics in the word similarity task, 6 out of 13 sub tasks in the analogical reasoning task, and gives the best overall accuracy in the word sense effect classification task, which shows the effectiveness of our proposed distributed distribution learning model.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Using survey data from 358 online customers, the study finds that the e-service quality construct conforms to the structure of a third-order factor model that links online service quality perceptions to distinct and actionable dimensions, including (1) website design, (2) fulfilment, (3) customer service, and (4) security/privacy. Each dimension is found to consist of several attributes that define the basis of e-service quality perceptions. A comprehensive specification of the construct, which includes attributes not covered in existing scales, is developed. The study contrasts a formative model consisting of 4 dimensions and 16 attributes against a reflective conceptualization. The results of this comparison indicate that studies using an incorrectly specified model overestimate the importance of certain e-service quality attributes. Global fit criteria are also found to support the detection of measurement misspecification. Meta-analytic data from 31,264 online customers are used to show that the developed measurement predicts customer behavior better than widely used scales, such as WebQual and E-S-Qual. The results show that the new measurement enables managers to assess e-service quality more accurately and predict customer behavior more reliably.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper proposes a novel demand response model using a fuzzy subtractive cluster approach. The model development provides support to domestic consumer decisions on controllable loads management, considering consumers' consumption needs and the appropriate load shape or rescheduling in order to achieve possible economic benefits. The model based on fuzzy subtractive clustering method considers clusters of domestic consumption covering an adequate consumption range. Analysis of different scenarios is presented considering available electric power and electric energy prices. Simulation results are presented and conclusions of the proposed demand response model are discussed. (C) 2016 Elsevier Ltd. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A flexible and multipurpose bio-inspired hierarchical model for analyzing musical timbre is presented in this paper. Inspired by findings in the fields of neuroscience, computational neuroscience, and psychoacoustics, not only does the model extract spectral and temporal characteristics of a signal, but it also analyzes amplitude modulations on different timescales. It uses a cochlear filter bank to resolve the spectral components of a sound, lateral inhibition to enhance spectral resolution, and a modulation filter bank to extract the global temporal envelope and roughness of the sound from amplitude modulations. The model was evaluated in three applications. First, it was used to simulate subjective data from two roughness experiments. Second, it was used for musical instrument classification using the k-NN algorithm and a Bayesian network. Third, it was applied to find the features that characterize sounds whose timbres were labeled in an audiovisual experiment. The successful application of the proposed model in these diverse tasks revealed its potential in capturing timbral information.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper proposes a novel demand response model using a fuzzy subtractive cluster approach. The model development provides support to domestic consumer decisions on controllable loads management, considering consumers’ consumption needs and the appropriate load shape or rescheduling in order to achieve possible economic benefits. The model based on fuzzy subtractive clustering method considers clusters of domestic consumption covering an adequate consumption range. Analysis of different scenarios is presented considering available electric power and electric energy prices. Simulation results are presented and conclusions of the proposed demand response model are discussed.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper proposes a novel demand response model using a fuzzy subtractive cluster approach. The model development provides support to domestic consumer decisions on controllable loads management, considering consumers’ consumption needs and the appropriate load shape or rescheduling in order to achieve possible economic benefits. The model based on fuzzy subtractive clustering method considers clusters of domestic consumption covering an adequate consumption range. Analysis of different scenarios is presented considering available electric power and electric energy prices. Simulation results are presented and conclusions of the proposed demand response model are discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We numerically study the dynamics of a discrete spring-block model introduced by Olami, Feder, and Christensen (OFC) to mimic earthquakes and investigate to what extent this simple model is able to reproduce the observed spatiotemporal clustering of seismicity. Following a recently proposed method to characterize such clustering by networks of recurrent events [J. Davidsen, P. Grassberger, and M. Paczuski, Geophys. Res. Lett. 33, L11304 (2006)], we find that for synthetic catalogs generated by the OFC model these networks have many nontrivial statistical properties. This includes characteristic degree distributions, very similar to what has been observed for real seismicity. There are, however, also significant differences between the OFC model and earthquake catalogs, indicating that this simple model is insufficient to account for certain aspects of the spatiotemporal clustering of seismicity.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

When the data consist of certain attributes measured on the same set of items in different situations, they would be described as a three-mode three-way array. A mixture likelihood approach can be implemented to cluster the items (i.e., one of the modes) on the basis of both of the other modes simultaneously (i.e,, the attributes measured in different situations). In this paper, it is shown that this approach can be extended to handle three-mode three-way arrays where some of the data values are missing at random in the sense of Little and Rubin (1987). The methodology is illustrated by clustering the genotypes in a three-way soybean data set where various attributes were measured on genotypes grown in several environments.