146 results for Semi-supervised clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Chapter 20: Clustering User Data for User Modelling in the GUIDE Multi-modal Set-top Box. PM Langdon and P. Biswas. 20.1 ... It utilises advanced user modelling and simulation in conjunction with a single layer interface that permits a ...

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a new co-clustering problem of images and visual features. The problem involves a set of non-object images in addition to a set of object images and features to be co-clustered. Co-clustering is performed in a way that maximises discrimination of object images from non-object images, thus emphasizing discriminative features. This provides a way of obtaining perceptual joint-clusters of object images and features. We tackle the problem by simultaneously boosting multiple strong classifiers which compete for images by their expertise. Each boosting classifier is an aggregation of weak-learners, i.e. simple visual features. The obtained classifiers are useful for object detection tasks which exhibit multimodalities, e.g. multi-category and multi-view object detection tasks. Experiments on a set of pedestrian images and a face data set demonstrate that the method yields intuitive image clusters with associated features and is much superior to conventional boosting classifiers in object detection tasks.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A combination of singular systems analysis and analytic phase techniques is used to investigate the possible occurrence in observations of coherent synchronization between quasi-biennial and semi-annual oscillations (QBOs; SAOs) in the stratosphere and troposphere. Time series of zonal mean zonal winds near the Equator are analysed from the ERA-40 and ERA-interim reanalysis datasets over a ∼ 50-year period. In the stratosphere, the QBO is found to synchronize with the SAO almost all the time, but with a frequency ratio that changes erratically between 4:1, 5:1 and 6:1. A similar variable synchronization is also evident in the tropical troposphere between semi-annual and quasi-biennial cycles (known as TBOs). Mean zonal winds from ERA-40 and ERA-interim, and also time series of indices for the Indian and West Pacific monsoons, are commonly found to exhibit synchronization, with SAO/TBO ratios that vary between 4:1 and 7:1. Coherent synchronization between the QBO and tropical TBO does not appear to persist for long intervals, however. This suggests that both the QBO and tropical TBOs may be separately synchronized to SAOs that are themselves enslaved to the seasonal cycle, or to the annual cycle itself. However, the QBO and TBOs are evidently only weakly coupled between themselves and are frequently found to lose mutual coherence when each changes its frequency ratio to its respective SAO. This suggests a need to revise a commonly cited paradigm that advocates the use of stratospheric QBO indices as a predictor for tropospheric phenomena such as monsoons and hurricanes. © 2012 Royal Meteorological Society.
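
As a rough illustration of the analytic phase idea mentioned in this abstract (not the authors' analysis), the sketch below estimates the phase of two time series from their analytic signals and measures how well they lock at a given n:1 frequency ratio. The function names, the FFT-based construction of the analytic signal, and the use of the mean resultant length as a locking index are all assumptions made for this example.

```python
import numpy as np

def analytic_phase(x):
    """Unwrapped phase of the analytic signal of x (FFT-based Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x - x.mean())
    # Keep the zero frequency, double positive frequencies, drop negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * weights)
    return np.unwrap(np.angle(analytic))

def sync_index(slow, fast, ratio):
    """Locking index in [0, 1] for a ratio:1 relation between fast and slow phases.

    Values near 1 mean phi_fast - ratio * phi_slow stays nearly constant.
    """
    difference = analytic_phase(fast) - ratio * analytic_phase(slow)
    return float(np.abs(np.mean(np.exp(1j * difference))))

# Hypothetical monthly series: a 30-month cycle and a 6-month cycle.
months = np.arange(600)
slow_cycle = np.cos(2 * np.pi * months / 30.0)
fast_cycle = np.cos(2 * np.pi * months / 6.0)
locked = sync_index(slow_cycle, fast_cycle, ratio=5)
unlocked = sync_index(slow_cycle, fast_cycle, ratio=4)
```

With these synthetic inputs the index should be high for the 5:1 ratio and low for 4:1; on real reanalysis data the index would normally be computed in sliding windows so that changes in the ratio over time become visible.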

Relevance:

20.00% 20.00%

Publisher:

Abstract:

For many applications, it is necessary to produce speech transcriptions in a causal fashion. To produce high quality transcripts, speaker adaptation is often used. This requires online speaker clustering and incremental adaptation techniques to be developed. This paper presents an integrated approach to online speaker clustering and adaptation which allows efficient clustering of speakers using the same accumulated statistics that are normally used for adaptation. Using a consistent criterion for both clustering and adaptation should yield gains for both stages. The proposed approach is evaluated on a meetings transcription task using audio from multiple distant microphones. Consistent gains over standard clustering and adaptation were obtained. Copyright © 2011 ISCA.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

MOTIVATION: The integration of multiple datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct, but often complementary, information. We present a Bayesian method for the unsupervised integrative modelling of multiple datasets, which we refer to as MDI (Multiple Dataset Integration). MDI can integrate information from a wide range of different datasets and data types simultaneously (including the ability to model time series data explicitly using Gaussian processes). Each dataset is modelled using a Dirichlet-multinomial allocation (DMA) mixture model, with dependencies between these models captured through parameters that describe the agreement among the datasets. RESULTS: Using a set of six artificially constructed time series datasets, we show that MDI is able to integrate a significant number of datasets simultaneously, and that it successfully captures the underlying structural similarity between the datasets. We also analyse a variety of real Saccharomyces cerevisiae datasets. In the two-dataset case, we show that MDI's performance is comparable with the present state-of-the-art. We then move beyond the capabilities of current approaches and integrate gene expression, chromatin immunoprecipitation-chip and protein-protein interaction data, to identify a set of protein complexes for which genes are co-regulated during the cell cycle. Comparisons to other unsupervised data integration techniques, as well as to non-integrative approaches, demonstrate that MDI is competitive, while also providing information that would be difficult or impossible to extract using other methods.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Semi-implicit, second order temporal and spatial finite volume computations of the flow in a differentially heated rotating annulus are presented. For the regime considered, three cyclones and anticyclones separated by a relatively fast moving jet of fluid or "jet stream" are predicted. Two second order methods are compared with first order spatial predictions and with experimental measurements. Velocity vector plots are used to illustrate the predicted flow structure. Computations made using second order central differences are shown to agree best with experimental measurements, and to be stable for integrations over long time periods (> 1000 s). No periodic smoothing is required to prevent divergence.
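
To make the distinction between first order and second order spatial schemes concrete, here is a minimal sketch that compares a one-sided difference with a central difference on a smooth test function and estimates the observed order of accuracy. It is a generic finite difference illustration, not the finite volume solver described in the abstract; the function names and the test function are assumptions chosen for the example.

```python
import math

def forward_difference(f, x, h):
    """One-sided difference: truncation error proportional to h (first order)."""
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h):
    """Central difference: truncation error proportional to h**2 (second order)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def observed_order(scheme, f, dfdx, x, h):
    """Estimate the order of accuracy by halving the step and comparing errors."""
    error_coarse = abs(scheme(f, x, h) - dfdx(x))
    error_fine = abs(scheme(f, x, h / 2.0) - dfdx(x))
    return math.log2(error_coarse / error_fine)

# Test function sin(x), whose exact derivative is cos(x).
order_forward = observed_order(forward_difference, math.sin, math.cos, 1.0, 1e-2)
order_central = observed_order(central_difference, math.sin, math.cos, 1.0, 1e-2)
```

Halving the step roughly halves the error of the one-sided scheme but quarters the error of the central scheme, which is the sense in which the latter is second order.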

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The fundamental aim of clustering algorithms is to partition data points. We consider tasks where the discovered partition is allowed to vary with some covariate such as space or time. One approach would be to use fragmentation-coagulation processes, but these, being Markov processes, are restricted to linear or tree structured covariate spaces. We define a partition-valued process on an arbitrary covariate space using Gaussian processes. We use the process to construct a multitask clustering model which partitions datapoints in a similar way across multiple data sources, and a time series model of network data which allows cluster assignments to vary over time. We describe sampling algorithms for inference and apply our method to defining cancer subtypes based on different types of cellular characteristics, finding regulatory modules from gene expression data from multiple human populations, and discovering time varying community structure in a social network.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data. In this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm, before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which can be downloaded from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts which can be used to reproduce the analyses carried out in this paper. These are available from the following URL: https://sites.google.com/site/randomisedbhc/.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An investigation into the potential for reducing road damage by optimising the design of heavy vehicle suspensions is described. In the first part of the paper two simple mathematical models are used to study the optimisation of conventional passive suspensions. Simple modifications are made to the steel spring suspension of a tandem axle trailer and it is found experimentally that RMS dynamic tyre forces can be reduced by 15% and theoretical road damage by 5.2%. A mathematical model of an air-sprung articulated vehicle is validated, and its suspension is optimised according to the simple models. This vehicle generates about 9% less damage than the leaf-sprung vehicle in the unmodified state and it is predicted that, for the operating conditions examined, the road damage caused by this vehicle can be reduced by a further 5.4%. Finally, it is shown experimentally that computer-controlled semi-active dampers have the potential to reduce road damage by a further 5-6%, compared to an air suspension with optimum passive damping. © Copyright 1994 Society of Automotive Engineers, Inc.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The ground movements induced by the construction of supported excavation systems are generally predicted by empirical/semi-empirical methods in the design stage. However, these methods cannot account for the site-specific conditions and for information that becomes available as an excavation proceeds. A Bayesian updating methodology is proposed to update the predictions of ground movements in the later stages of excavation based on recorded deformation measurements. As an application, the proposed framework is used to predict the three-dimensional deformation shapes at four incremental excavation stages of an actual supported excavation project. © 2011 Taylor & Francis Group, London.
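
The abstract does not state the probabilistic model used, so the following is only a toy sketch of what Bayesian updating across excavation stages can look like. It assumes a single scalar quantity (for example a peak settlement in millimetres) with a normal prior and normally distributed measurement noise of known variance; the function name and all numbers are hypothetical.

```python
def update_normal(prior_mean, prior_var, observations, noise_var):
    """Conjugate normal update of a scalar mean, given noisy observations.

    Returns the posterior mean and variance. Assumes known noise variance.
    """
    count = len(observations)
    if count == 0:
        return prior_mean, prior_var
    posterior_precision = 1.0 / prior_var + count / noise_var
    posterior_var = 1.0 / posterior_precision
    posterior_mean = posterior_var * (
        prior_mean / prior_var + sum(observations) / noise_var
    )
    return posterior_mean, posterior_var

# Hypothetical design-stage estimate: 30 mm with variance 100 mm^2.
mean, var = 30.0, 100.0
# Hypothetical measurements recorded at two successive excavation stages.
stage_measurements = [[42.0, 38.0], [40.0]]
for measurements in stage_measurements:
    # The posterior from one stage becomes the prior for the next.
    mean, var = update_normal(mean, var, measurements, noise_var=25.0)
```

Each stage pulls the estimate from the design-stage prediction toward the recorded measurements and narrows its variance, which mirrors the idea of refining predictions as the excavation proceeds.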