22 resultados para Kernel polynomials
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
Recent advances in machine learning methods enable increasingly the automatic construction of various types of computer assisted methods that have been difficult or laborious to program by human experts. The tasks for which this kind of tools are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. The machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight of the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms incorporating this kind of prior knowledge of the task in question in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, the incorporation of prior knowledge is often done by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit to the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suitable for the task of information retrieval and for more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms such as text categorization, and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions. We also design a fast cross-validation algorithm for regularized least-squares type of learning algorithm. Further, an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks is proposed. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and novel advanced kernels and cost functions can be used in algorithms efficiently.
Resumo:
Learning of preference relations has recently received significant attention in machine learning community. It is closely related to the classification and regression analysis and can be reduced to these tasks. However, preference learning involves prediction of ordering of the data points rather than prediction of a single numerical value as in case of regression or a class label as in case of classification. Therefore, studying preference relations within a separate framework facilitates not only better theoretical understanding of the problem, but also motivates development of the efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, natural language processing, etc. For example, algorithms that learn to rank are frequently used in search engines for ordering documents retrieved by the query. Preference learning methods have been also applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from well founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, contribution of this thesis is twofold: kernel functions for structured data that are used to take advantage of various non-vectorial data representations and the preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amount of training data, and semi-supervised preference learning. Proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics domain. Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computation complexity scales linearly with the number of training data points. We also introduce sparse approximation of the algorithm that can be efficiently trained with large amount of data. For situations when small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, proposed algorithms lead to notably better performance in many preference learning tasks considered.
Resumo:
Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.
Resumo:
The main topic of the thesis is optimal stopping. This is treated in two research articles. In the first article we introduce a new approach to optimal stopping of general strong Markov processes. The approach is based on the representation of excessive functions as expected suprema. We present a variety of examples, in particular, the Novikov-Shiryaev problem for Lévy processes. In the second article on optimal stopping we focus on differentiability of excessive functions of diffusions and apply these results to study the validity of the principle of smooth fit. As an example we discuss optimal stopping of sticky Brownian motion. The third research article offers a survey like discussion on Appell polynomials. The crucial role of Appell polynomials in optimal stopping of Lévy processes was noticed by Novikov and Shiryaev. They described the optimal rule in a large class of problems via these polynomials. We exploit the probabilistic approach to Appell polynomials and show that many classical results are obtained with ease in this framework. In the fourth article we derive a new relationship between the generalized Bernoulli polynomials and the generalized Euler polynomials.
Resumo:
Yhteisön langaton palvelualusta on konsepti langattomien yhteisöllisten verkkopalveluiden tarjoamiseen.Konsepti perustuu langattomaan WLAN-reititinlaitteeseen sekä Linux-käyttöjärjestelmäpohjaiseen laiteohjelmistoon, joiden avulla voidaan tarjota paikallisia, erilaisten pienten yhteisöjen käyttöön tarkoitettuja langattomia verkkopalveluita.Soveltuvia käyttökohteita voivat olla esimerkiksi asuntoyhteisöjen välinen tiedotuskanava tai perheen sisäinen viihdekeskus, jonka kautta voidaan tarjota yhteisön toimintaa helpottavia ja tavoitteita edistäviä palveluita. Yhteisön langattomat palvelut perustuvat vapaisiin ohjelmistoihin, jotka mahdollistavat monipuolisen palveluvalikoiman luomisen. Langattoman palvelualustan avulla pienten yhteisöjen tarvitsemat langattomat palvelut voidaan luoda helposti, edullisesti ja joustavasti, ilman kaupallisten palveluoperaattoreiden asettamia rajoituksia.
Resumo:
Diplomityön tarkoituksena oli luoda menetelmä Symbian-käyttöjärjestelmää käyttävien älypuhelinten suorituskyvyn määrittämiseen, jotta laitteiden välisiä eroja voitaisiin mitata. Aluksi Symbian-käyttöjärjestelmää ja älypuhelinlaitteistoja tutkittiin suorituskykyyn ja sen vaihteluun vaikuttavien tekijöiden ja osien löytämiseksi. Tämän jälkeen kehitettiin useita testitapauksia sisältävä testikirjasto, jolla voitiin mitata joidenkin suorituskykyyn vaikuttavien käyttöjärjestelmän rajapintojen suoritusaikoja. Testikirjaston testit ajettiin kolmella eri älypuhelinmallilla, jotta testien toimivuutta voitiin arvioida. Lopuksi testituloksia analysoitiin mahdollisten pullonkaulojen havaitsemiseksi suorituskyvystä. Testikirjaston pystyttiin havaitsemaan eroja laitteiden suorituskyvyssä. Viimeisin, uudella Symbianin EKA2-ytimellä varustettu älypuhelin, Nokia E70, jäi mittauksissa viimeiseksi, koska se pärjäsi huonosti muistinvarauksia ja TRAP-poikkeuksia testaavissa tapauksissa. Muilla mitatuilla osa-alueilla se kuitenkin päihitti selvästi muut testatut puhelimet, Nokia N90:n ja Nokia 6630:n. Näiden kahden muun laitteen tulosten skaalan tasaisuus osoittaa, että kehitetyn testikirjaston avulla saadaan johdonmukaisia ja uskottavia mittaustuloksia.
Resumo:
In this paper, a new two-dimensional shear deformable beam element based on the absolute nodal coordinate formulation is proposed. The nonlinear elastic forces of the beam element are obtained using a continuum mechanics approach without employing a local element coordinate system. In this study, linear polynomials are used to interpolate both the transverse and longitudinal components of the displacement. This is different from other absolute nodal-coordinate-based beam elements where cubic polynomials are used in the longitudinal direction. The accompanying defects of the phenomenon known as shear locking are avoided through the adoption of selective integration within the numerical integration method. The proposed element is verified using several numerical examples, and the results are compared to analytical solutions and the results for an existing shear deformable beam element. It is shown that by using the proposed element, accurate linear and nonlinear static deformations, as well as realistic dynamic behavior, can be achieved with a smaller computational effort than by using existing shear deformable two-dimensional beam elements.
Resumo:
Nykyään kolmeen kerrokseen perustuvat client-server –sovellukset ovat suuri kinnostuskohde sekä niiden kehittäjille etta käyttäjille. Tietotekniikan nopean kehityksen ansiosta näillä sovelluksilla on monipuolinen käyttö teollisuuden eri alueilla. Tällä hetkellä on olemassa paljon työkaluja client-server –sovellusten kehittämiseen, jotka myös tyydyttävät asiakkaiden asettamia vaatimuksia. Nämä työkalut eivät kuitenkaan mahdollista joustavaa toimintaa graafisen käyttöliittyman kanssa. Tämä diplomityö käsittelee client-server –sovellusten kehittamistä XML –kielen avulla. Tämä lähestymistapa mahdollistaa client-server –sovellusten rakentamista niin, että niiden graafinen käyttöliittymä ja ulkonäkö olisivat helposti muokattavissa ilman ohjelman ytimen uudelleenkääntämistä. Diplomityö koostuu kahdesta ostasta: teoreettisesta ja käytännöllisestä. Teoreettinen osa antaa yleisen tiedon client-server –arkkitehtuurista ja kuvailee ohjelmistotekniikan pääkohdat. Käytannöllinen osa esittää tulokset, client-server –sovellusten kehittämisteknologian kehittämislähestymistavan XML: ää käyttäen ja tuloksiin johtavat usecase– ja sekvenssidiagrammit. Käytännöllinen osa myos sisältää esimerkit toteutetuista XML-struktuureista, jotka kuvaavat client –sovellusten kuvaruutukaavakkeiden esintymisen ja serverikyselykaaviot.
Resumo:
VDSL on teknologia, joka mahdollistaa nopeat Internet-yhteydet tavallista puhelinlinjaa käyttäen. Tätä varten käyttäjä tarvitsee VDSL-modeemin ja Internet-operaattori reitittimen, johon VDSL-linjat kytketään. Reitittimen on oltava suorituskykyinen, jotta kaikki VDSL-liikenne voidaan reittittää eteenpäin. Tehokkuutta haetaan tekemällä suuri osa reitityksestä erityisillä reititinpiireillä. Tässä diplomityössä käsitellään reititinpiirien teoriaa ja niiden hallintaa. Lisäksi vertailtiin kolmen suuren valmistajan tuotteita. Tuotteiden tarjoamat ominaisuudet vaikuttivat hyvin yhteneväisiltä. Ominaisuuksien hallinta ja toteutus olivat erilaisia. Työn tavoitteena oli löytää ohjelmistoarkkitehtuuri piirien ohjaamiseen niin, että Linux-käyttöjärjestelmän ytimen palveluja voitaisiin käyttää mahdollisimman hyödyllisesti. Työssä havaittiin, että ohjelmistoarkkitehtuurin voi määritellä monella eri tavalla riippuen siitä, miten piiri on kytketty prosessoriin, mitä piirin ominaisuuksia halutaan käyttää ja miten arkkitehtuuria halutaan jatkossa laajentaa.
Resumo:
Työn tarkoituksena on tutkia pinon ylikirjoitukseen perustuvien hyökkäysten toimintaa ja osoittaa kokeellisesti nykyisten suojaustekniikoiden olevan riittämättömiä. Tutkimus suoritetaan testaamalla miten valitut tietoturvatuotteet toimivat eri testitilanteissa. Testatut tuotteet ovat Openwall, PaX, Libsafe 2.0 ja Immunix 6.2. Testaus suoritetaan pääasiassa RedHat 7.0 ympäristössä testiohjelman avulla. Testeissä mitataan sekä tuotteiden kyky havaita hyökkäyksiä että niiden nopeusvaikutukset. Myös erityyppisten hyökkäysten ja niitä vastaan kehitettyjen metodien toimintaperiaatteet esitellään seikkaperäisesti ja havainnollistetaan yksinkertaistetuilla esimerkeillä. Esitellyt tekniikat sisältävät puskurin ylivuodot, laittomat muotoiluparametrit, loppumerkittömät merkkijonot ja taulukoiden ylivuodot. Testit osoittavat, etteivät valitut tuotteet estä kaikkia hyökkäyksiä, joten lopuksi perehdytään myös vahinkojen minimointiin onnistuneiden hyökkäysten varalta.
Resumo:
Reaaliaikaisten käyttöjärjestelmien käyttö sulautetuissa järjestelmissä on kasvamassa koko ajan. Sulautettuja tietokoneita käytetään yhä useammassa kohteessa kuten sähkökäyttöjen ohjauksessa. Sähkökäyttöjen ohjaus hoidetaan nykyisin yleensä nopealla digitaalisella signaaliprosessorilla (DSP), jolloin ohjelmointi ja päivittäminen on hidasta ja vaikeaa johtuen käytettävästä matalan tason Assembler-kielestä. Ratkaisuna yleiskäyttöisten prosessorien ja reaaliaikakäyttöjärjestelmien käyttö. Kaupalliset reaaliaikakäyttöjärjestelmät ovat kalliita ja lähdekoodin saaminen omaan käyttöön jopa mahdotonta. Linux on ei-kaupallinen avoimen lähdekoodin käyttöjärjestelmä, joten sen käyttö on ilmaista ja sitä voi muokata vapaasti. Linux:iin on saatavana useita laajennuksia, jotka tekevät siitä reaaliaikaisen käyttöjärjestelmän. Vaihtoehtoina joko kova (hard) tai pehmeä (soft) reaaliaikaisuus. Linux:iin on olemassa valmiita kehitysympäristöjä mutta ne kaipaavat parannusta ennen kuin niitä voidaan käyttää suuressa mittakaavassa teollisuudessa. Reaaliaika Linux ei sovellus nopeisiin ohjauslooppeihin (<100 ms) koska nopeus ei riitä vielä mutta nopeus kasvaa samalla kun prosessorit kehittyvät. Linux soveltuu hyvin rajapinnaksi nopean ohjauksen ja käyttäjän välille ja hitaampaan ohjaukseen.
Resumo:
This thesis deals with distance transforms which are a fundamental issue in image processing and computer vision. In this thesis, two new distance transforms for gray level images are presented. As a new application for distance transforms, they are applied to gray level image compression. The new distance transforms are both new extensions of the well known distance transform algorithm developed by Rosenfeld, Pfaltz and Lay. With some modification their algorithm which calculates a distance transform on binary images with a chosen kernel has been made to calculate a chessboard like distance transform with integer numbers (DTOCS) and a real value distance transform (EDTOCS) on gray level images. Both distance transforms, the DTOCS and EDTOCS, require only two passes over the graylevel image and are extremely simple to implement. Only two image buffers are needed: The original gray level image and the binary image which defines the region(s) of calculation. No other image buffers are needed even if more than one iteration round is performed. For large neighborhoods and complicated images the two pass distance algorithm has to be applied to the image more than once, typically 3 10 times. Different types of kernels can be adopted. It is important to notice that no other existing transform calculates the same kind of distance map as the DTOCS. All the other gray weighted distance function, GRAYMAT etc. algorithms find the minimum path joining two points by the smallest sum of gray levels or weighting the distance values directly by the gray levels in some manner. The DTOCS does not weight them that way. The DTOCS gives a weighted version of the chessboard distance map. The weights are not constant, but gray value differences of the original image. The difference between the DTOCS map and other distance transforms for gray level images is shown. The difference between the DTOCS and EDTOCS is that the EDTOCS calculates these gray level differences in a different way. It propagates local Euclidean distances inside a kernel. Analytical derivations of some results concerning the DTOCS and the EDTOCS are presented. Commonly distance transforms are used for feature extraction in pattern recognition and learning. Their use in image compression is very rare. This thesis introduces a new application area for distance transforms. Three new image compression algorithms based on the DTOCS and one based on the EDTOCS are presented. Control points, i.e. points that are considered fundamental for the reconstruction of the image, are selected from the gray level image using the DTOCS and the EDTOCS. The first group of methods select the maximas of the distance image to new control points and the second group of methods compare the DTOCS distance to binary image chessboard distance. The effect of applying threshold masks of different sizes along the threshold boundaries is studied. The time complexity of the compression algorithms is analyzed both analytically and experimentally. It is shown that the time complexity of the algorithms is independent of the number of control points, i.e. the compression ratio. Also a new morphological image decompression scheme is presented, the 8 kernels' method. Several decompressed images are presented. The best results are obtained using the Delaunay triangulation. The obtained image quality equals that of the DCT images with a 4 x 4
Resumo:
The ongoing development of the digital media has brought a new set of challenges with it. As images containing more than three wavelength bands, often called spectral images, are becoming a more integral part of everyday life, problems in the quality of the RGB reproduction from the spectral images have turned into an important area of research. The notion of image quality is often thought to comprise two distinctive areas – image quality itself and image fidelity, both dealing with similar questions, image quality being the degree of excellence of the image, and image fidelity the measure of the match of the image under study to the original. In this thesis, both image fidelity and image quality are considered, with an emphasis on the influence of color and spectral image features on both. There are very few works dedicated to the quality and fidelity of spectral images. Several novel image fidelity measures were developed in this study, which include kernel similarity measures and 3D-SSIM (structural similarity index). The kernel measures incorporate the polynomial, Gaussian radial basis function (RBF) and sigmoid kernels. The 3D-SSIM is an extension of a traditional gray-scale SSIM measure developed to incorporate spectral data. The novel image quality model presented in this study is based on the assumption that the statistical parameters of the spectra of an image influence the overall appearance. The spectral image quality model comprises three parameters of quality: colorfulness, vividness and naturalness. The quality prediction is done by modeling the preference function expressed in JNDs (just noticeable difference). Both image fidelity measures and the image quality model have proven to be effective in the respective experiments.
Resumo:
Preference relations, and their modeling, have played a crucial role in both social sciences and applied mathematics. A special category of preference relations is represented by cardinal preference relations, which are nothing other than relations which can also take into account the degree of relation. Preference relations play a pivotal role in most of multi criteria decision making methods and in the operational research. This thesis aims at showing some recent advances in their methodology. Actually, there are a number of open issues in this field and the contributions presented in this thesis can be grouped accordingly. The first issue regards the estimation of a weight vector given a preference relation. A new and efficient algorithm for estimating the priority vector of a reciprocal relation, i.e. a special type of preference relation, is going to be presented. The same section contains the proof that twenty methods already proposed in literature lead to unsatisfactory results as they employ a conflicting constraint in their optimization model. The second area of interest concerns consistency evaluation and it is possibly the kernel of the thesis. This thesis contains the proofs that some indices are equivalent and that therefore, some seemingly different formulae, end up leading to the very same result. Moreover, some numerical simulations are presented. The section ends with some consideration of a new method for fairly evaluating consistency. The third matter regards incomplete relations and how to estimate missing comparisons. This section reports a numerical study of the methods already proposed in literature and analyzes their behavior in different situations. The fourth, and last, topic, proposes a way to deal with group decision making by means of connecting preference relations with social network analysis.