21 resultados para Machine Learning Techniques
Resumo:
This thesis is concerned with the state and parameter estimation in state space models. The estimation of states and parameters is an important task when mathematical modeling is applied to many different application areas such as the global positioning systems, target tracking, navigation, brain imaging, spread of infectious diseases, biological processes, telecommunications, audio signal processing, stochastic optimal control, machine learning, and physical systems. In Bayesian settings, the estimation of states or parameters amounts to computation of the posterior probability density function. Except for a very restricted number of models, it is impossible to compute this density function in a closed form. Hence, we need approximation methods. A state estimation problem involves estimating the states (latent variables) that are not directly observed in the output of the system. In this thesis, we use the Kalman filter, extended Kalman filter, Gauss–Hermite filters, and particle filters to estimate the states based on available measurements. Among these filters, particle filters are numerical methods for approximating the filtering distributions of non-linear non-Gaussian state space models via Monte Carlo. The performance of a particle filter heavily depends on the chosen importance distribution. For instance, inappropriate choice of the importance distribution can lead to the failure of convergence of the particle filter algorithm. In this thesis, we analyze the theoretical Lᵖ particle filter convergence with general importance distributions, where p ≥2 is an integer. A parameter estimation problem is considered with inferring the model parameters from measurements. For high-dimensional complex models, estimation of parameters can be done by Markov chain Monte Carlo (MCMC) methods. In its operation, the MCMC method requires the unnormalized posterior distribution of the parameters and a proposal distribution. In this thesis, we show how the posterior density function of the parameters of a state space model can be computed by filtering based methods, where the states are integrated out. This type of computation is then applied to estimate parameters of stochastic differential equations. Furthermore, we compute the partial derivatives of the log-posterior density function and use the hybrid Monte Carlo and scaled conjugate gradient methods to infer the parameters of stochastic differential equations. The computational efficiency of MCMC methods is highly depend on the chosen proposal distribution. A commonly used proposal distribution is Gaussian. In this kind of proposal, the covariance matrix must be well tuned. To tune it, adaptive MCMC methods can be used. In this thesis, we propose a new way of updating the covariance matrix using the variational Bayesian adaptive Kalman filter algorithm.
Resumo:
A new area of machine learning research called deep learning, has moved machine learning closer to one of its original goals: artificial intelligence and general learning algorithm. The key idea is to pretrain models in completely unsupervised way and finally they can be fine-tuned for the task at hand using supervised learning. In this thesis, a general introduction to deep learning models and algorithms are given and these methods are applied to facial keypoints detection. The task is to predict the positions of 15 keypoints on grayscale face images. Each predicted keypoint is specified by an (x,y) real-valued pair in the space of pixel indices. In experiments, we pretrained deep belief networks (DBN) and finally performed a discriminative fine-tuning. We varied the depth and size of an architecture. We tested both deterministic and sampled hidden activations and the effect of additional unlabeled data on pretraining. The experimental results show that our model provides better results than publicly available benchmarks for the dataset.
Resumo:
The growing population in cities increases the energy demand and affects the environment by increasing carbon emissions. Information and communications technology solutions which enable energy optimization are needed to address this growing energy demand in cities and to reduce carbon emissions. District heating systems optimize the energy production by reusing waste energy with combined heat and power plants. Forecasting the heat load demand in residential buildings assists in optimizing energy production and consumption in a district heating system. However, the presence of a large number of factors such as weather forecast, district heating operational parameters and user behavioural parameters, make heat load forecasting a challenging task. This thesis proposes a probabilistic machine learning model using a Naive Bayes classifier, to forecast the hourly heat load demand for three residential buildings in the city of Skellefteå, Sweden over a period of winter and spring seasons. The district heating data collected from the sensors equipped at the residential buildings in Skellefteå, is utilized to build the Bayesian network to forecast the heat load demand for horizons of 1, 2, 3, 6 and 24 hours. The proposed model is validated by using four cases to study the influence of various parameters on the heat load forecast by carrying out trace driven analysis in Weka and GeNIe. Results show that current heat load consumption and outdoor temperature forecast are the two parameters with most influence on the heat load forecast. The proposed model achieves average accuracies of 81.23 % and 76.74 % for a forecast horizon of 1 hour in the three buildings for winter and spring seasons respectively. The model also achieves an average accuracy of 77.97 % for three buildings across both seasons for the forecast horizon of 1 hour by utilizing only 10 % of the training data. The results indicate that even a simple model like Naive Bayes classifier can forecast the heat load demand by utilizing less training data.
Resumo:
Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of geneticvariants that are predictive of complex diseases. Modern studies, in the formof genome-wide association studies (GWAS) have afforded researchers with the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining this with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing studies (NGS) will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to have the ability to effectively create predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated to not only be effective at predicting the disease phenotypes, but also doing so efficiently through the use of computational shortcuts. While much of the work was able to be run on high-end desktops, some work was further extended so that it could be implemented on parallel computers helping to assure that they will also scale to the NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm. It was shown that there is no universally optimal algorithm for variant selection in GWAS, but rather methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can result in overly-optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype–phenotype relationships and biological insights from genetic data sets.
Resumo:
This work investigates theoretical properties of symmetric and anti-symmetric kernels. First chapters give an overview of the theory of kernels used in supervised machine learning. Central focus is on the regularized least squares algorithm, which is motivated as a problem of function reconstruction through an abstract inverse problem. Brief review of reproducing kernel Hilbert spaces shows how kernels define an implicit hypothesis space with multiple equivalent characterizations and how this space may be modified by incorporating prior knowledge. Mathematical results of the abstract inverse problem, in particular spectral properties, pseudoinverse and regularization are recollected and then specialized to kernels. Symmetric and anti-symmetric kernels are applied in relation learning problems which incorporate prior knowledge that the relation is symmetric or anti-symmetric, respectively. Theoretical properties of these kernels are proved in a draft this thesis is based on and comprehensively referenced here. These proofs show that these kernels can be guaranteed to learn only symmetric or anti-symmetric relations, and they can learn any relations relative to the original kernel modified to learn only symmetric or anti-symmetric parts. Further results prove spectral properties of these kernels, central result being a simple inequality for the the trace of the estimator, also called the effective dimension. This quantity is used in learning bounds to guarantee smaller variance.
Resumo:
Tämän kandidaatintyön tavoitteena on käsitellä tekoälyjärjestelmien käyttöä liiketoiminnassa. Tekoälyä on tutkittu pitkään, mutta sen soveltaminen liiketoimintaan on suhteellisen uutta. Työssä esitellään IBM Watson Analytics- tekoälyjärjestelmän käyttöä. Tämän esittelyn kautta on tarkoitus näyttää, kuinka helposti tekoälyjärjestelmät todellisuudessa ovat hyödynnettävissä. Kirjallisuudesta löytyvien esimerkkien kautta työssä esitellään, minkälaisia järjestelmiä tällä hetkellä käytetään, ja millaisiin tarkoituksiin ne on luotu. Tekoälyjärjestelmien monimuotoisuuden vuoksi niitä käytetäänkin laajalti erilaisiin sovelluksiin. Kirjallisuudesta huomataan, että tekoälyjärjestelmät koostuvat usein monesta eri tavasta toteuttaa tekoälyä. Kirjallisuuden ja tekoälyn toteutuksen teorian pohjalta huomataan myös, että tekoälyjärjestelmät toimivat useimmiten erilaisissa päätöksentekoa tukevissa tai helpottavissa tehtävissä. Työssä esitetään myös IBM Watson Analyticsin ja avoimen datan avulla, kuinka helposti tekoälyjärjestelmiä pystytään hyödyntämään. Työssä näytetään tämän esimerkin kautta, miten ja minkä tyyppistä liiketoimintaa tukevaa informaatiota tekoälyjärjestelmä pystyy helposti tuottamaan.