786 resultados para machine learning, decision tree, concept drift, ensemble learning, classication, random forest


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ideally, one would like to perform image search using an intuitive and friendly approach. Many existing image search engines, however, present users with sets of images arranged in some default order on the screen, typically the relevance to a query, only. While this certainly has its advantages, arguably, a more flexible and intuitive way would be to sort images into arbitrary structures such as grids, hierarchies, or spheres so that images that are visually or semantically alike are placed together. This paper focuses on designing such a navigation system for image browsers. This is a challenging task because arbitrary layout structure makes it difficult - if not impossible - to compute cross-similarities between images and structure coordinates, the main ingredient of traditional layouting approaches. For this reason, we resort to a recently developed machine learning technique: kernelized sorting. It is a general technique for matching pairs of objects from different domains without requiring cross-domain similarity measures and hence elegantly allows sorting images into arbitrary structures. Moreover, we extend it so that some images can be preselected for instance forming the tip of the hierarchy allowing to subsequently navigate through the search results in the lower levels in an intuitive way. Copyright 2010 ACM.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Conventional Hidden Markov models generally consist of a Markov chain observed through a linear map corrupted by additive noise. This general class of model has enjoyed a huge and diverse range of applications, for example, speech processing, biomedical signal processing and more recently quantitative finance. However, a lesser known extension of this general class of model is the so-called Factorial Hidden Markov Model (FHMM). FHMMs also have diverse applications, notably in machine learning, artificial intelligence and speech recognition [13, 17]. FHMMs extend the usual class of HMMs, by supposing the partially observed state process is a finite collection of distinct Markov chains, either statistically independent or dependent. There is also considerable current activity in applying collections of partially observed Markov chains to complex action recognition problems, see, for example, [6]. In this article we consider the Maximum Likelihood (ML) parameter estimation problem for FHMMs. Much of the extant literature concerning this problem presents parameter estimation schemes based on full data log-likelihood EM algorithms. This approach can be slow to converge and often imposes heavy demands on computer memory. The latter point is particularly relevant for the class of FHMMs where state space dimensions are relatively large. The contribution in this article is to develop new recursive formulae for a filter-based EM algorithm that can be implemented online. Our new formulae are equivalent ML estimators, however, these formulae are purely recursive and so, significantly reduce numerical complexity and memory requirements. A computer simulation is included to demonstrate the performance of our results. © Taylor & Francis Group, LLC.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Consider the following problem: given sets of unlabeled observations, each set with known label proportions, predict the labels of another set of observations, also with known label proportions. This problem appears in areas like e-commerce, spam filtering and improper content detection. We present consistent estimators which can reconstruct the correct labels with high probability in a uniform convergence sense. Experiments show that our method works well in practice. Copyright 2008 by the author(s)/owner(s).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a nonparametric Bayesian method for disease subtype discovery in multi-dimensional cancer data. Our method can simultaneously analyse a wide range of data types, allowing for both agreement and disagreement between their underlying clustering structure. It includes feature selection and infers the most likely number of disease subtypes, given the data. We apply the method to 277 glioblastoma samples from The Cancer Genome Atlas, for which there are gene expression, copy number variation, methylation and microRNA data. We identify 8 distinct consensus subtypes and study their prognostic value for death, new tumour events, progression and recurrence. The consensus subtypes are prognostic of tumour recurrence (log-rank p-value of $3.6 \times 10^{-4}$ after correction for multiple hypothesis tests). This is driven principally by the methylation data (log-rank p-value of $2.0 \times 10^{-3}$) but the effect is strengthened by the other 3 data types, demonstrating the value of integrating multiple data types. Of particular note is a subtype of 47 patients characterised by very low levels of methylation. This subtype has very low rates of tumour recurrence and no new events in 10 years of follow up. We also identify a small gene expression subtype of 6 patients that shows particularly poor survival outcomes. Additionally, we note a consensus subtype that showly a highly distinctive data signature and suggest that it is therefore a biologically distinct subtype of glioblastoma. The code is available from https://sites.google.com/site/multipledatafusion/

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we tackle the problem of learning a linear regression model whose parameter is a fixed-rank matrix. We study the Riemannian manifold geometry of the set of fixed-rank matrices and develop efficient line-search algorithms. The proposed algorithms have many applications, scale to high-dimensional problems, enjoy local convergence properties and confer a geometric basis to recent contributions on learning fixed-rank matrices. Numerical experiments on benchmarks suggest that the proposed algorithms compete with the state-of-the-art, and that manifold optimization offers a versatile framework for the design of rank-constrained machine learning algorithms. Copyright 2011 by the author(s)/owner(s).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paper addresses the problem of learning a regression model parameterized by a fixed-rank positive semidefinite matrix. The focus is on the nonlinear nature of the search space and on scalability to high-dimensional problems. The mathematical developments rely on the theory of gradient descent algorithms adapted to the Riemannian geometry that underlies the set of fixedrank positive semidefinite matrices. In contrast with previous contributions in the literature, no restrictions are imposed on the range space of the learned matrix. The resulting algorithms maintain a linear complexity in the problem size and enjoy important invariance properties. We apply the proposed algorithms to the problem of learning a distance function parameterized by a positive semidefinite matrix. Good performance is observed on classical benchmarks. © 2011 Gilles Meyer, Silvere Bonnabel and Rodolphe Sepulchre.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we develop a new approach to sparse principal component analysis (sparse PCA). We propose two single-unit and two block optimization formulations of the sparse PCA problem, aimed at extracting a single sparse dominant principal component of a data matrix, or more components at once, respectively. While the initial formulations involve nonconvex functions, and are therefore computationally intractable, we rewrite them into the form of an optimization program involving maximization of a convex function on a compact set. The dimension of the search space is decreased enormously if the data matrix has many more columns (variables) than rows. We then propose and analyze a simple gradient method suited for the task. It appears that our algorithm has best convergence properties in the case when either the objective function or the feasible set are strongly convex, which is the case with our single-unit formulations and can be enforced in the block case. Finally, we demonstrate numerically on a set of random and gene expression test problems that our approach outperforms existing algorithms both in quality of the obtained solution and in computational speed. © 2010 Michel Journée, Yurii Nesterov, Peter Richtárik and Rodolphe Sepulchre.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work applies a variety of multilinear function factorisation techniques to extract appropriate features or attributes from high dimensional multivariate time series for classification. Recently, a great deal of work has centred around designing time series classifiers using more and more complex feature extraction and machine learning schemes. This paper argues that complex learners and domain specific feature extraction schemes of this type are not necessarily needed for time series classification, as excellent classification results can be obtained by simply applying a number of existing matrix factorisation or linear projection techniques, which are simple and computationally inexpensive. We highlight this using a geometric separability measure and classification accuracies obtained though experiments on four different high dimensional multivariate time series datasets. © 2013 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Standard forms of density-functional theory (DFT) have good predictive power for many materials, but are not yet fully satisfactory for cluster, solid, and liquid forms of water. Recent work has stressed the importance of DFT errors in describing dispersion, but we note that errors in other parts of the energy may also contribute. We obtain information about the nature of DFT errors by using a many-body separation of the total energy into its 1-body, 2-body, and beyond-2-body components to analyze the deficiencies of the popular PBE and BLYP approximations for the energetics of water clusters and ice structures. The errors of these approximations are computed by using accurate benchmark energies from the coupled-cluster technique of molecular quantum chemistry and from quantum Monte Carlo calculations. The systems studied are isomers of the water hexamer cluster, the crystal structures Ih, II, XV, and VIII of ice, and two clusters extracted from ice VIII. For the binding energies of these systems, we use the machine-learning technique of Gaussian Approximation Potentials to correct successively for 1-body and 2-body errors of the DFT approximations. We find that even after correction for these errors, substantial beyond-2-body errors remain. The characteristics of the 2-body and beyond-2-body errors of PBE are completely different from those of BLYP, but the errors of both approximations disfavor the close approach of non-hydrogen-bonded monomers. We note the possible relevance of our findings to the understanding of liquid water.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Copyright © (2014) by the International Machine Learning Society (IMLS) All rights reserved. Classical methods such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are ubiquitous in statistics. However, these techniques are only able to reveal linear re-lationships in data. Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale. In a separate strand of recent research, randomized methods have been proposed to construct features that help reveal nonlinear patterns in data. For basic tasks such as regression or classification, random features exhibit little or no loss in performance, while achieving drastic savings in computational requirements. In this paper we leverage randomness to design scalable new variants of nonlinear PCA and CCA; our ideas extend to key multivariate analysis tools such as spectral clustering or LDA. We demonstrate our algorithms through experiments on real- world data, on which we compare against the state-of-the-art. A simple R implementation of the presented algorithms is provided.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We investigate the Student-t process as an alternative to the Gaussian process as a non-parametric prior over functions. We derive closed form expressions for the marginal likelihood and predictive distribution of a Student-t process, by integrating away an inverse Wishart process prior over the co-variance kernel of a Gaussian process model. We show surprising equivalences between different hierarchical Gaussian process models leading to Student-t processes, and derive a new sampling scheme for the inverse Wishart process, which helps elucidate these equivalences. Overall, we show that a Student-t process can retain the attractive properties of a Gaussian process - a nonparamet-ric representation, analytic marginal and predictive distributions, and easy model selection through covariance kernels - but has enhanced flexibility, and predictive covariances that, unlike a Gaussian process, explicitly depend on the values of training observations. We verify empirically that a Student-t process is especially useful in situations where there are changes in covariance structure, or in applications such as Bayesian optimization, where accurate predictive covariances are critical for good performance. These advantages come at no additional computational cost over Gaussian processes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Choosing appropriate architectures and regularization strategies of deep networks is crucial to good predictive performance. To shed light on this problem, we analyze the analogous problem of constructing useful priors on compositions of functions. Specifically, we study the deep Gaussian process, a type of infinitely-wide, deep neural network. We show that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases, retaining only a single degree of freedom in the limit. We propose an alternate network architecture which does not suffer from this pathology. We also examine deep covariance functions, obtained by composing infinitely many feature transforms. Lastly, we characterize the class of models obtained by performing dropout on Gaussian processes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Copyright 2014 by the author(s). We present a nonparametric prior over reversible Markov chains. We use completely random measures, specifically gamma processes, to construct a countably infinite graph with weighted edges. By enforcing symmetry to make the edges undirected we define a prior over random walks on graphs that results in a reversible Markov chain. The resulting prior over infinite transition matrices is closely related to the hierarchical Dirichlet process but enforces reversibility. A reinforcement scheme has recently been proposed with similar properties, but the de Finetti measure is not well characterised. We take the alternative approach of explicitly constructing the mixing measure, which allows more straightforward and efficient inference at the cost of no longer having a closed form predictive distribution. We use our process to construct a reversible infinite HMM which we apply to two real datasets, one from epigenomics and one ion channel recording.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present and test an extension of slow feature analysis as a novel approach to nonlinear blind source separation. The algorithm relies on temporal correlations and iteratively reconstructs a set of statistically independent sources from arbitrary nonlinear instantaneous mixtures. Simulations show that it is able to invert a complicated nonlinear mixture of two audio signals with a high reliability. The algorithm is based on a mathematical analysis of slow feature analysis for the case of input data that are generated from statistically independent sources. © 2014 Henning Sprekeler, Tiziano Zito and Laurenz Wiskott.