11 resultados para Classification Methods
em Cambridge University Engineering Department Publications Database
Resumo:
Data quality (DQ) assessment can be significantly enhanced with the use of the right DQ assessment methods, which provide automated solutions to assess DQ. The range of DQ assessment methods is very broad: from data profiling and semantic profiling to data matching and data validation. This paper gives an overview of current methods for DQ assessment and classifies the DQ assessment methods into an existing taxonomy of DQ problems. Specific examples of the placement of each DQ method in the taxonomy are provided and illustrate why the method is relevant to the particular taxonomy position. The gaps in the taxonomy, where no current DQ methods exist, show where new methods are required and can guide future research and DQ tool development.
Resumo:
We provide a comprehensive overview of many recent algorithms for approximate inference in Gaussian process models for probabilistic binary classification. The relationships between several approaches are elucidated theoretically, and the properties of the different algorithms are corroborated by experimental results. We examine both 1) the quality of the predictive distributions and 2) the suitability of the different marginal likelihood approximations for model selection (selecting hyperparameters) and compare to a gold standard based on MCMC. Interestingly, some methods produce good predictive distributions although their marginal likelihood approximations are poor. Strong conclusions are drawn about the methods: The Expectation Propagation algorithm is almost always the method of choice unless the computational budget is very tight. We also extend existing methods in various ways, and provide unifying code implementing all approaches.
Resumo:
We present in this paper a new multivariate probabilistic approach to Acoustic Pulse Recognition (APR) for tangible interface applications. This model uses Principle Component Analysis (PCA) in a probabilistic framework to classify tapping pulses with a high degree of variability. It was found that this model, achieves a higher robustness to pulse variability than simpler template matching methods, specifically when allowed to train on data containing high variability. © 2011 IEEE.
Resumo:
The amount of original imaging information produced yearly during the last decade has experienced a tremendous growth in all industries due to the technological breakthroughs in digital imaging and electronic storage capabilities. This trend is affecting the construction industry as well, where digital cameras and image databases are gradually replacing traditional photography. Owners demand complete site photograph logs and engineers store thousands of images for each project to use in a number of construction management tasks like monitoring an activity's progress and keeping evidence of the "as built" in case any disputes arise. So far, retrieval methodologies are done manually with the user being responsible for imaging classification according to specific rules that serve a limited number of construction management tasks. New methods that, with the guidance of the user, can automatically classify and retrieve construction site images are being developed and promise to remove the heavy burden of manually indexing images. In this paper, both the existing methods and a novel image retrieval method developed by the authors for the classification and retrieval of construction site images are described and compared. Specifically a number of examples are deployed in order to present their advantages and limitations. The results from this comparison demonstrates that the content based image retrieval method developed by the authors can reduce the overall time spent for the classification and retrieval of construction images while providing the user with the flexibility to retrieve images according different classification schemes.
Resumo:
McCullagh and Yang (2006) suggest a family of classification algorithms based on Cox processes. We further investigate the log Gaussian variant which has a number of appealing properties. Conditioned on the covariates, the distribution over labels is given by a type of conditional Markov random field. In the supervised case, computation of the predictive probability of a single test point scales linearly with the number of training points and the multiclass generalization is straightforward. We show new links between the supervised method and classical nonparametric methods. We give a detailed analysis of the pairwise graph representable Markov random field, which we use to extend the model to semi-supervised learning problems, and propose an inference method based on graph min-cuts. We give the first experimental analysis on supervised and semi-supervised datasets and show good empirical performance.