23 resultados para Kernel Density


Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model, where ridges of the density estimated from the data are considered as relevant features. Finding ridges, that are generalized maxima, necessitates development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications, where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data are also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but has also potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The aim of this work is to invert the ionospheric electron density profile from Riometer (Relative Ionospheric opacity meter) measurement. The newly Riometer instrument KAIRA (Kilpisjärvi Atmospheric Imaging Receiver Array) is used to measure the cosmic HF radio noise absorption that taking place in the D-region ionosphere between 50 to 90 km. In order to invert the electron density profile synthetic data is used to feed the unknown parameter Neq using spline height method, which works by taking electron density profile at different altitude. Moreover, smoothing prior method also used to sample from the posterior distribution by truncating the prior covariance matrix. The smoothing profile approach makes the problem easier to find the posterior using MCMC (Markov Chain Monte Carlo) method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Electric energy demand has been growing constantly as the global population increases. To avoid electric energy shortage, renewable energy sources and energy conservation are emphasized all over the world. The role of power electronics in energy saving and development of renewable energy systems is significant. Power electronics is applied in wind, solar, fuel cell, and micro turbine energy systems for the energy conversion and control. The use of power electronics introduces an energy saving potential in such applications as motors, lighting, home appliances, and consumer electronics. Despite the advantages of power converters, their penetration into the market requires that they have a set of characteristics such as high reliability and power density, cost effectiveness, and low weight, which are dictated by the emerging applications. In association with the increasing requirements, the design of the power converter is becoming more complicated, and thus, a multidisciplinary approach to the modelling of the converter is required. In this doctoral dissertation, methods and models are developed for the design of a multilevel power converter and the analysis of the related electromagnetic, thermal, and reliability issues. The focus is on the design of the main circuit. The electromagnetic model of the laminated busbar system and the IGBT modules is established with the aim of minimizing the stray inductance of the commutation loops that degrade the converter power capability. The circular busbar system is proposed to achieve equal current sharing among parallel-connected devices and implemented in the non-destructive test set-up. In addition to the electromagnetic model, a thermal model of the laminated busbar system is developed based on a lumped parameter thermal model. The temperature and temperature-dependent power losses of the busbars are estimated by the proposed algorithm. The Joule losses produced by non-sinusoidal currents flowing through the busbars in the converter are estimated taking into account the skin and proximity effects, which have a strong influence on the AC resistance of the busbars. The lifetime estimation algorithm was implemented to investigate the influence of the cooling solution on the reliability of the IGBT modules. As efficient cooling solutions have a low thermal inertia, they cause excessive temperature cycling of the IGBTs. Thus, a reliability analysis is required when selecting the cooling solutions for a particular application. The control of the cooling solution based on the use of a heat flux sensor is proposed to reduce the amplitude of the temperature cycles. The developed methods and models are verified experimentally by a laboratory prototype.