23 results for reduced rank regression

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

30.00%

Abstract:

Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, and natural language processing. For example, algorithms that learn to rank are frequently used in search engines to order the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems, predicting individual customer choices from vast amounts of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. To improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, which take advantage of various non-vectorial data representations, and preference learning algorithms suited to different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to parse ranking in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics.
Training kernel-based ranking algorithms can be infeasible when the training set is large. We address this problem by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be trained efficiently on large amounts of data. For situations where only a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.

Relevance:

30.00%

Abstract:

Machine learning provides tools for the automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, owing to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance in a wide range of application domains. Historically, the problems of classification and regression have received the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, which corresponds to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction, and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, or how these techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, one of the most well-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternatives. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
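The link this abstract draws between bipartite ranking and AUC can be made concrete with a small sketch. It is not code from the thesis: the function name `pairwise_auc` and the convention of counting ties as half-correct are assumptions made here for illustration. It computes AUC as the fraction of (positive, negative) pairs that a scoring function orders correctly, which is one minus the pairwise misranking rate.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Illustrative sketch only: labels are assumed to be 1 (positive) or
    0 (negative), and tied scores are counted as half-correct.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0   # pair ordered correctly
            elif p == n:
                correct += 0.5   # tie: half credit
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    # Three of the four positive-negative pairs are ordered correctly.
    print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The double loop is quadratic in the number of examples, which is acceptable for illustration; sorting-based formulations are commonly used for large data sets.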

Relevance:

20.00%

Abstract:

Summary: The effect of reducing salt content on the perceived saltiness of sliced ham

Relevance:

20.00%

Abstract:

Summary: The effect of ploughless tillage on erosion and nutrient leaching in a clay-soil field in southern Finland

Relevance:

20.00%

Abstract:

The fatigue life estimates of three different welded joints were analysed using multivariate regression analysis. The regression is based on an extensive S-N database collected from the literature. The joints examined are a flat-plate joint, a cruciform joint, and a longitudinal attachment on a plate. The variables are the stress range, the thickness of the loaded plate, and the loading mode. The thickness effect was re-examined for all three joints, and this re-examination confirmed its existence before the multivariate regression was carried out. Linear fatigue life equations were fitted for the three welded joints, taking into account the thickness of the loaded plate and the loading mode. The fatigue life equations are compared and discussed in the light of test results collected from the literature. Four case studies were drawn from the collected fatigue tests, and different fatigue life assessment methods were used to estimate the fatigue life. The results are examined and discussed in the light of the actual tests. The case studies cover 2 mm and 6 mm symmetric longitudinal attachments on a plate, a 12.7 mm asymmetric longitudinal attachment, a 38 mm symmetric longitudinal attachment under torsional loading, and a 25 mm/38 mm load-carrying cruciform joint under torsional loading. The modelling was done as close to the test joint as possible. The fatigue life assessment methods include the hot-spot method, in which the hot-spot stress was calculated using two linear extrapolations and one nonlinear extrapolation as well as through-thickness integration. Notch stress and fracture mechanics methods were used in the calculations for the cruciform joint.

Relevance:

20.00%

Abstract:

The aim of the study is to determine whether equity funds investing in Finland exhibit performance persistence. The data consist of all Finnish equity funds that operated during the period from 15 January 1998 to 13 January 2005. The data are free of survivorship bias. The CAPM alpha and the three- and four-factor alphas are used as performance measures. In the empirical part, the performance persistence of equity funds is tested with the Spearman rank correlation test. Evidence of performance persistence remained weak, although it appeared sporadically with all performance measures for some combinations of ranking and investment periods. Measured with the CAPM alpha, statistically significant performance persistence occurred clearly more often than with the other performance measures. The results support recent international studies according to which performance persistence often depends on how it is measured. Significance tests of the regression models used as performance measures show that multifactor models explain equity fund returns better than the CAPM. The added variables significantly improve the explanatory power of the CAPM.
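The Spearman rank correlation used in this kind of persistence test can be sketched as follows. This is an illustrative sketch, not the study's code: the function names and the sample alpha values are hypothetical. Funds are ranked by performance in a ranking period and again in the following investment period, and the correlation between the two rankings is computed; values near 1 suggest persistence, values near -1 suggest reversal.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Hypothetical alphas for five funds in two consecutive periods.
    ranking_period = [0.012, -0.004, 0.007, 0.001, -0.010]
    investment_period = [0.009, 0.002, 0.004, -0.001, -0.006]
    print(spearman_rho(ranking_period, investment_period))
```

The sketch returns only the coefficient; a significance test of the kind reported in the abstract would additionally need a p-value, for example from a statistics library.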

Relevance:

20.00%

Abstract:

The central theme of this study is graduate employment and employability in the European-wide discussion. The complex relationships between higher education and the world of work are explored from the vantage point of how individuals make use of the higher education system in their transition from education to employment. The variation among individual transition processes in nine European countries is analysed with the help of a comparable graduate survey. The countries in this study are Italy, Spain, France, Austria, Germany, the Netherlands, the United Kingdom, Finland, and Norway. The data used for the study are commonly known as “CHEERS”, or “Careers after Higher Education, A European Research Survey.” The data were collected in 1999. The study discusses the possibilities and limitations the higher education system has in supporting the initial education-to-work transitions of youth. The study also addresses problems with comparing national higher education systems in terms of enrolment and graduate employability. A central purpose of this study is to reflect on concerns about the prolongation of individual transitions with a framework that simultaneously considers both graduate employability and the duration of the education-to-work transition process. The key concept of this study is the standard student/graduate; synonymous concepts are the traditional and the conventional student/graduate. Standard graduates are relatively young individuals who are making their initial transition from education to working life and who complete the degree-earning process within the stipulated time frame. In all nine countries, standard graduates make up a considerable share of the student flow passing from higher education to the labour markets. The share of standard graduates is by far the largest in France, where they comprise the overwhelming majority.
The proportion of standard graduates is the lowest in Italy, Finland, and Austria, where approximately one in four graduates completed the process of higher education within the stipulated time frame. Of the nine countries compared, employability of the whole graduate population is the greatest in Norway, the UK, Finland, and the Netherlands. Compared with the employability of the whole graduate population, variation among the countries is considerably reduced when only the employability of standard graduates is reviewed. Thus, even though the ranking among countries remains largely unchanged, the variations among them are smaller when the duration of the degree-earning process is standardized. The study also discusses other ideal types of student careers (or transition processes) besides the standard student/graduate. Results of regression analyses indicate that, at the pan-European level of analysis, the graduate labour markets are not heavily segmented in terms of the type of individual transition process. When considering within-country differences between graduates, the field of study is clearly a more powerful explanatory variable than the type of transition process. There are, nevertheless, clear indications that, irrespective of the country, the chances of finding a high-status job are, on average, highest among those who graduate within the stipulated duration of the degree program and who have thereby experienced the standard student career, whereas participating in working life while studying protects against unemployment after finishing one’s degree.

Relevance:

20.00%

Abstract:

This thesis investigates performance persistence among equity funds investing in Russia during 2003-2007. Fund performance is measured using several methods, including the Jensen alpha, the Fama-French 3-factor alpha, the Sharpe ratio, and two of its variations. Moreover, we apply Bayesian shrinkage estimation in performance measurement and evaluate its usefulness compared with the OLS 3-factor alphas. The pattern of performance persistence is analyzed using the Spearman rank correlation test, cross-sectional regression analysis, and stacked return time series. First, the empirical results indicate that the Bayesian shrinkage estimates may provide more accurate estimates of fund performance than the OLS 3-factor alphas. Second, the results suggest that the degree of performance persistence is strongly related to the length of the observation period. For the full sample period the results show strong signs of performance reversal, whereas for the subperiod analysis the results indicate performance persistence during the most recent years.

Relevance:

20.00%

Abstract:

The power transformer is the most expensive piece of equipment in a substation, and the required benefit should always be obtained at the lowest possible cost. Producing power transformers with reduced insulation strength is one possible way to reduce costs. Such transformers came into operation in the late 1970s. Overvoltage protection was provided by valve-type magnetic combined surge arresters with increased blanking voltage during switching overvoltages. These devices now need to be replaced, which is why a modernized nonlinear surge arrester was developed. This master’s thesis investigates the use of the modernized device in comparison with conventional nonlinear surge arresters. The goal is to determine the lightning overvoltage levels obtained with different types of nonlinear surge arresters and then to calculate the reliability of the lightning protection.

Relevance:

20.00%

Abstract:

The underlying cause of many human autoimmune diseases is unknown, but several environmental factors are implicated in triggering the self-destructive immune reactions. Multiple Sclerosis (MS) is a chronic autoimmune disease of the central nervous system, potentially leading to persistent neurological deterioration. The cause of MS is not known, and apart from immunomodulatory treatments there is no cure. In the early phase of the disease, relapsing-remitting MS (RR-MS) is characterized by unpredictable exacerbations of the neurological symptoms, called relapses, which can occur at intervals ranging from 4 weeks to several years. Microbial infections are known to be able to trigger MS relapses, and patients are instructed to avoid all factors that might increase the risk of infections, to use antibiotics properly, and to take care of dental hygiene. Among the environmental factors known to increase susceptibility to infections, high levels of inhalable particulate matter in ambient air affect all people within a geographical region. During the period of interest in this thesis, the occurrence of MS relapses could be effectively reduced by injections of interferon, which has immunomodulatory and antiviral properties. In this thesis, ecological and epidemiological analyses were used to study the possible connection between MS relapse occurrence, population-level viral infections, and air quality factors, as well as the effects of interferon medication. Hospital archive data were collected retrospectively for 1986-2001, a period ranging from when interferon medication first became available until just before other disease-modifying MS therapies arrived on the market. The grouped data were studied with logistic regression and intervention analysis, and individual patient data with survival analysis.
Interferons proved to be effective in the treatment of MS in this observational study, as the number of MS exacerbations was lower during interferon use than in the time before interferon treatment. A statistically significant temporal relationship between MS relapses and inhalable particulate matter (PM10) concentrations was found in this study, which implies that MS patients are affected by exposure to PM10. Interferon probably protected against the effect of PM10, because a significant increase in the risk of exacerbations following environmental exposure to population-level specific viral infections and PM10 was observed only in MS patients without interferon medication. Apart from being antiviral, interferon could thus also attenuate the enhancement of immune reactions caused by ambient air PM10. The retrospective approach utilizing carefully constructed hospital records proved to be an economical and reliable source of MS disease information for statistical analyses.