731 resultados para internet traffic classification machine learning apache spark hadoop big data word2vec


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several research studies have been recently initiated to investigate the use of construction site images for automated infrastructure inspection, progress monitoring, etc. In these studies, it is always necessary to extract material regions (concrete or steel) from the images. Existing methods made use of material's special color/texture ranges for material information retrieval, but they do not sufficiently discuss how to find these appropriate color/texture ranges. As a result, users have to define appropriate ones by themselves, which is difficult for those who do not have enough image processing background. This paper presents a novel method of identifying concrete material regions using machine learning techniques. Under the method, each construction site image is first divided into regions through image segmentation. Then, the visual features of each region are calculated and classified with a pre-trained classifier. The output value determines whether the region is composed of concrete or not. The method was implemented using C++ and tested over hundreds of construction site images. The results were compared with the manual classification ones to indicate the method's validity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Numerical integration is a key component of many problems in scientific computing, statistical modelling, and machine learning. Bayesian Quadrature is a modelbased method for numerical integration which, relative to standard Monte Carlo methods, offers increased sample efficiency and a more robust estimate of the uncertainty in the estimated integral. We propose a novel Bayesian Quadrature approach for numerical integration when the integrand is non-negative, such as the case of computing the marginal likelihood, predictive distribution, or normalising constant of a probabilistic model. Our approach approximately marginalises the quadrature model's hyperparameters in closed form, and introduces an active learning scheme to optimally select function evaluations, as opposed to using Monte Carlo samples. We demonstrate our method on both a number of synthetic benchmarks and a real scientific problem from astronomy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work applies a variety of multilinear function factorisation techniques to extract appropriate features or attributes from high dimensional multivariate time series for classification. Recently, a great deal of work has centred around designing time series classifiers using more and more complex feature extraction and machine learning schemes. This paper argues that complex learners and domain specific feature extraction schemes of this type are not necessarily needed for time series classification, as excellent classification results can be obtained by simply applying a number of existing matrix factorisation or linear projection techniques, which are simple and computationally inexpensive. We highlight this using a geometric separability measure and classification accuracies obtained though experiments on four different high dimensional multivariate time series datasets. © 2013 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Copyright © (2014) by the International Machine Learning Society (IMLS) All rights reserved. Classical methods such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are ubiquitous in statistics. However, these techniques are only able to reveal linear re-lationships in data. Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale. In a separate strand of recent research, randomized methods have been proposed to construct features that help reveal nonlinear patterns in data. For basic tasks such as regression or classification, random features exhibit little or no loss in performance, while achieving drastic savings in computational requirements. In this paper we leverage randomness to design scalable new variants of nonlinear PCA and CCA; our ideas extend to key multivariate analysis tools such as spectral clustering or LDA. We demonstrate our algorithms through experiments on real- world data, on which we compare against the state-of-the-art. A simple R implementation of the presented algorithms is provided.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Studies on learning problems from geometry perspective have attracted an ever increasing attention in machine learning, leaded by achievements on information geometry. This paper proposes a different geometrical learning from the perspective of high-dimensional descriptive geometry. Geometrical properties of high-dimensional structures underlying a set of samples are learned via successive projections from the higher dimension to the lower dimension until two-dimensional Euclidean plane, under guidance of the established properties and theorems in high-dimensional descriptive geometry. Specifically, we introduce a hyper sausage like geometry shape for learning samples and provides a geometrical learning algorithm for specifying the hyper sausage shapes, which is then applied to biomimetic pattern recognition. Experimental results are presented to show that the proposed approach outperforms three types of support vector machines with either a three degree polynomial kernel or a radial basis function kernel, especially in the cases of high-dimensional samples of a finite size. (c) 2005 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We investigate the use of independent component analysis (ICA) for speech feature extraction in digits speech recognition systems. We observe that this may be true for recognition tasks based on Geometrical Learning with little training data. In contrast to image processing, phase information is not essential for digits speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time shift information. The digits speech recognition results show promising accuracy. Experiments show that the method based on ICA and Geometrical Learning outperforms HMM in a different number of training samples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We investigate the use of independent component analysis (ICA) for speech feature extraction in digits speech recognition systems. We observe that this may be true for recognition tasks based on Geometrical Learning with little training data. In contrast to image processing, phase information is not essential for digits speech recognition. We therefore propose a new scheme that shows how the phase sensitivity can be removed by using an analytical description of the ICA-adapted basis functions. Furthermore, since the basis functions are not shift invariant, we extend the method to include a frequency-based ICA stage that removes redundant time shift information. The digits speech recognition results show promising accuracy. Experiments show that the method based on ICA and Geometrical Learning outperforms HMM in a different number of training samples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis examines the problem of an autonomous agent learning a causal world model of its environment. Previous approaches to learning causal world models have concentrated on environments that are too "easy" (deterministic finite state machines) or too "hard" (containing much hidden state). We describe a new domain --- environments with manifest causal structure --- for learning. In such environments the agent has an abundance of perceptions of its environment. Specifically, it perceives almost all the relevant information it needs to understand the environment. Many environments of interest have manifest causal structure and we show that an agent can learn the manifest aspects of these environments quickly using straightforward learning techniques. We present a new algorithm to learn a rule-based causal world model from observations in the environment. The learning algorithm includes (1) a low level rule-learning algorithm that converges on a good set of specific rules, (2) a concept learning algorithm that learns concepts by finding completely correlated perceptions, and (3) an algorithm that learns general rules. In addition this thesis examines the problem of finding a good expert from a sequence of experts. Each expert has an "error rate"; we wish to find an expert with a low error rate. However, each expert's error rate and the distribution of error rates are unknown. A new expert-finding algorithm is presented and an upper bound on the expected error rate of the expert is derived.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Clare, A. and King R.D. (2003) Predicting gene function in Saccharomyces cerevisiae. 2nd European Conference on Computational Biology (ECCB '03). (published as a journal supplement in Bioinformatics 19: ii42-ii49)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ferr?, S. and King, R. D. (2004) A dichotomic search algorithm for mining and learning in domain-specific logics. Fundamenta Informaticae. IOS Press. To appear

Relevância:

100.00% 100.00%

Publicador:

Resumo:

R. Jensen and Q. Shen. Fuzzy-Rough Sets Assisted Attribute Selection. IEEE Transactions on Fuzzy Systems, vol. 15, no. 1, pp. 73-89, 2007.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

R. Jensen and Q. Shen. Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough Based Approaches. IEEE Transactions on Knowledge and Data Engineering, 16(12): 1457-1471. 2004.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

R. Jensen and Q. Shen, 'Fuzzy-Rough Data Reduction with Ant Colony Optimization,' Fuzzy Sets and Systems, vol. 149, no. 1, pp. 5-20, 2005.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

R. Jensen, Q. Shen and A. Tuson, 'Finding Rough Set Reducts with SAT,' Proceedings of the 10th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, LNAI 3641, pp. 194-203, 2005.