987 results for cosmology, clustering, AP-test


Relevance: 20.00%

Abstract:

The fundamental aim of clustering algorithms is to partition data points. We consider tasks where the discovered partition is allowed to vary with some covariate such as space or time. One approach would be to use fragmentation-coagulation processes, but these, being Markov processes, are restricted to linear or tree structured covariate spaces. We define a partition-valued process on an arbitrary covariate space using Gaussian processes. We use the process to construct a multitask clustering model which partitions datapoints in a similar way across multiple data sources, and a time series model of network data which allows cluster assignments to vary over time. We describe sampling algorithms for inference and apply our method to defining cancer subtypes based on different types of cellular characteristics, finding regulatory modules from gene expression data from multiple human populations, and discovering time varying community structure in a social network.

Relevance: 20.00%

Abstract:

We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data. In this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm, before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which is available for download from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts which can be used to reproduce the analyses carried out in this paper. These are available from the following URL: https://sites.google.com/site/randomisedbhc/.

Relevance: 20.00%

Abstract:

Clustering behavior is studied in a model of integrate-and-fire oscillators with excitatory pulse coupling. When considering a population of identical oscillators, the main result is a proof of global convergence to a phase-locked clustered behavior. The robustness of this clustering behavior is then investigated in a population of nonidentical oscillators by studying the transition from total clustering to the absence of clustering as the group coherence decreases. A robust intermediate situation of partial clustering, characterized by few oscillators traveling among nearly phase-locked clusters, is of particular interest. The analysis complements earlier studies of synchronization in a closely related model. © 2008 American Institute of Physics.

Relevance: 20.00%

Abstract:

Results are presented of systematic studies of vibration damping in steel of a type, and processed by a route, relevant to Caribbean steel pans. Damping is likely to be a significant factor in the variation of sound quality between different pans. The main stages in pan manufacture are simulated in a controlled manner using sheet steel, cold-rolled to a prescribed level of thickness reduction then annealed at a chosen temperature in a laboratory furnace. Small test strips were cut from the resulting material, and tested in free-free beam bending to deduce the Young’s modulus and its associated loss factor. It is shown that the steel type, the degree of cold working and the annealing temperature all have a significant influence on damping. Furthermore, for each individual specimen damping is found to decrease with rising frequency, approximately following a power law. Comparison with samples cut from a real pan shows that there are further influences from the pan’s geometrical details.
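
The power-law trend mentioned in this abstract can be illustrated with a short fit in log-log space. This is only a sketch: the functional form loss_factor = a * f^(-b), the function name fit_power_law, and all numerical values are assumptions chosen for illustration, not data or methods taken from the paper.

```python
import numpy as np

def fit_power_law(freq_hz, loss_factor):
    """Fit loss_factor = a * freq_hz**(-b) by least squares in log-log space."""
    # A power law is a straight line in log-log coordinates:
    # log(loss) = log(a) - b * log(f)
    slope, intercept = np.polyfit(np.log(freq_hz), np.log(loss_factor), 1)
    return np.exp(intercept), -slope

# Hypothetical measurements (NOT from the paper): damping falls with frequency.
freq_hz = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
loss_factor = 2e-3 * freq_hz ** -0.3

a, b = fit_power_law(freq_hz, loss_factor)
print(f"prefactor a = {a:.3e}, exponent b = {b:.3f}")
```

A positive fitted exponent b corresponds to damping that decreases as frequency rises; with real measurements the points would scatter around the fitted line rather than lie on it exactly.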

Relevance: 20.00%

Abstract:

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input. Dirichlet process mixture models are appealing as they can infer the number of clusters from the data. However, these models do not deal with high-dimensional data well and can encounter difficulties in inference. We present a novel nonparametric Bayesian kernel-based method to cluster data points without the need to prespecify the number of clusters or to model complicated densities from which data points are assumed to be generated. The key insight is to use determinants of submatrices of a kernel matrix as a measure of how close together a set of points are. We explore some theoretical properties of the model and derive a natural Gibbs-based algorithm with MCMC hyperparameter learning. The model is implemented on a variety of synthetic and real-world data sets.
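
The idea of using kernel-submatrix determinants as a closeness measure can be sketched as follows. The abstract does not name a kernel, so the RBF kernel, the jitter term, the helper names rbf_kernel and subset_logdet, and the point coordinates are all assumptions made for illustration. Under an RBF kernel, a tight group of points gives a nearly singular submatrix (determinant close to 0), while well-separated points give a submatrix close to the identity (determinant close to 1).

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    """Squared-exponential kernel matrix for the rows of X (assumed kernel)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def subset_logdet(K, idx, jitter=1e-8):
    """Log-determinant of the kernel submatrix indexed by idx."""
    # Small diagonal jitter keeps the nearly singular submatrix well behaved.
    sub = K[np.ix_(idx, idx)] + jitter * np.eye(len(idx))
    _, logdet = np.linalg.slogdet(sub)
    return logdet

# Hypothetical points: four packed closely together, four far apart.
tight = np.array([[0.0, 0.0], [0.05, 0.0], [0.0, 0.05], [0.05, 0.05]])
spread = np.array([[20.0, 20.0], [25.0, 20.0], [20.0, 25.0], [25.0, 25.0]])
K = rbf_kernel(np.vstack([tight, spread]))

tight_logdet = subset_logdet(K, [0, 1, 2, 3])
spread_logdet = subset_logdet(K, [4, 5, 6, 7])
print(f"tight: {tight_logdet:.2f}, spread: {spread_logdet:.2f}")
```

In this sketch a more negative log-determinant indicates a tighter set of points; how the paper turns such determinants into a clustering model is not specified in the abstract and is not reproduced here.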

Relevance: 20.00%

Abstract:

The paper is concerned with the identification of theoretical preview steering controllers using data obtained from five test subjects in a fixed-base driving simulator. An understanding of human steering control behaviour is relevant to the design of autonomous and semi-autonomous vehicle controls. The driving task involved steering a linear vehicle along a randomly curving path. The theoretical steering controllers identified from the data were based on optimal linear preview control. A direct-identification method was used, and the steering controllers were identified so that the predicted steering angle matched as closely as possible the measured steering angle of the test subjects. It was found that identification of the driver's time delay and noise is necessary to avoid bias in identification of the controller parameters. Most subjects' steering behaviour was predicted well by a theoretical controller based on the lateral/yaw dynamics of the vehicle. There was some evidence that an inexperienced driver's steering action was better represented by a controller based on a simpler model of the vehicle dynamics, perhaps reflecting incomplete learning by the driver. Copyright © 2014 Inderscience Enterprises Ltd.

Relevance: 20.00%

Abstract:

The classification of a concrete mixture as self-compacting (SCC) is performed by a series of empirical characterization tests that have been designed to assess not only the flowability of the mixture but also its segregation resistance and filling ability. The objective of the present work is to correlate the rheological parameters of the SCC matrix, yield stress and plastic viscosity, to slump flow measurements. The slump flow test investigation focused on the fully yielded flow regime, and an empirical model relating the yield stress to material and flow parameters is proposed. Our experimental data revealed that the time for a spread of 500 mm, which is used in engineering practice as a reference for measurement parameters, is an arbitrary choice. Our findings indicate that the non-dimensional final spread is linearly related to the non-dimensional yield stress. Finally, there are strong indications that the non-dimensional viscosity of the mixture is associated with the non-dimensional final spread as well as the stopping time of the slump flow; this experimental data set suggests an exponential decay of the final spread and stopping time with viscosity. © Appl. Rheol.

Relevance: 20.00%

Abstract:

Knowledge of the effect of geographic factors on the assemblages of protozoan testate amoebae is still limited, despite there having been a number of studies on this fauna. We applied statistical analyses to data on the distribution of testate amoebae from nine major lakes in the Yunnan Plateau, southwest China. Cluster analysis, based on community structure, separated the lakes into two groups - the oligotrophic/mesotrophic lakes and the hypereutrophic lakes - confirming the idea that the testate amoebae assemblages in lakes are closely related to the trophic status. Additionally, within the oligotrophic/mesotrophic lakes, there was distinct geographic clustering. Linear regression analysis and the Mantel test both revealed that similarity of species composition decreased with increasing geographic distance among the oligotrophic/mesotrophic lakes.
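
A Mantel test of the kind mentioned in this abstract can be sketched as a permutation test on the correlation between two distance matrices. The version below is a minimal one-sided implementation using Pearson correlation; the function name mantel_test, the eight site coordinates, and the synthetic dissimilarities are hypothetical and do not come from the study, whose exact test settings are not given in the abstract.

```python
import numpy as np

def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices,
    with a p-value from permuting the site labels of dist_b."""
    rng = np.random.default_rng(seed)
    n = dist_a.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of sites counted once
    r_obs = np.corrcoef(dist_a[upper], dist_b[upper])[0, 1]
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Rows and columns are permuted together to keep the matrix symmetric.
        permuted = dist_b[np.ix_(perm, perm)]
        r_perm = np.corrcoef(dist_a[upper], permuted[upper])[0, 1]
        if r_perm >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

# Hypothetical data: 8 sites whose compositional dissimilarity grows with distance.
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 100.0, size=(8, 2))
geo_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
noise = rng.normal(0.0, 0.02, size=geo_dist.shape)
noise = (noise + noise.T) / 2.0  # keep the dissimilarity matrix symmetric
dissimilarity = geo_dist / geo_dist.max() + noise
np.fill_diagonal(dissimilarity, 0.0)

r_obs, p_value = mantel_test(geo_dist, dissimilarity)
print(f"Mantel r = {r_obs:.3f}, p = {p_value:.3f}")
```

A positive correlation between geographic distance and compositional dissimilarity is equivalent to similarity decreasing with distance. Permuting whole sites, rather than individual matrix entries, respects the fact that pairwise distances are not independent of one another.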