16 results for predictive algorithm
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
The parameter setting of a differential evolution algorithm must meet several requirements: efficiency, effectiveness, and reliability. Problems vary, and the solution of a particular problem can be represented in different ways. An algorithm that is most efficient with one representation may be less efficient with others. The development of differential evolution-based methods contributes substantially to research on evolutionary computing and global optimization in general. The objective of this study is to investigate the differential evolution algorithm, the intelligent adjustment of its control parameters, and its application. In the thesis, the differential evolution algorithm is first examined using different parameter settings and test functions. Fuzzy control is then employed to make the control parameters adaptive, based on the optimization process and expert knowledge. The developed algorithms are applied to training radial basis function networks for function approximation, with the possible variables including the centers, widths, and weights of the basis functions, and with the control parameters either kept fixed or adjusted by a fuzzy controller. After the influence of control variables on the performance of the differential evolution algorithm was explored, an adaptive version of the differential evolution algorithm was developed and the differential evolution-based radial basis function network training approaches were proposed. Experimental results showed that the performance of the differential evolution algorithm is sensitive to parameter setting, and the best setting was found to be problem dependent. The fuzzy adaptive differential evolution algorithm relieves the user of the burden of parameter setting and performs better than versions using all fixed parameters. Differential evolution-based approaches are effective for training Gaussian radial basis function networks.
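For readers unfamiliar with the method, the following is a minimal sketch of the classic DE/rand/1/bin scheme with fixed control parameters, assuming the usual names `F` (mutation scale factor) and `CR` (crossover rate). It is not the thesis's fuzzy adaptive variant; the function name, default values, and the sphere test function are illustrative assumptions.

```python
import random


def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Minimize f over box bounds with DE/rand/1/bin (fixed F and CR)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip back into the box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = f(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Illustrative test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

Rerunning the sketch with different `F` and `CR` values on other test functions is a simple way to observe the parameter sensitivity that the abstract reports.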
Abstract:
Coherent anti-Stokes Raman scattering (CARS) is a powerful method of laser spectroscopy that has achieved significant successes. However, the non-linear nature of CARS complicates the analysis of the recorded spectra. The objective of this thesis is to develop a new phase retrieval algorithm for CARS. It utilizes the maximum entropy method and a new wavelet approach for spectroscopic background correction of a phase function. The method was developed to be easily automated and used on a large number of spectra of different substances. The algorithm was successfully tested on experimental data.
Abstract:
The identifiability of the parameters of a heat exchanger model without phase change was studied in this Master’s thesis using synthetic data. A fast, two-step Markov chain Monte Carlo (MCMC) method was tested with a couple of case studies and a heat exchanger model. The two-step MCMC method worked well and decreased the computation time compared to the traditional MCMC method. The effect of the measurement accuracy of certain control variables on the identifiability of parameters was also studied. The accuracy used did not seem to have a notable effect on the identifiability of parameters. The use of the posterior distribution of parameters in different heat exchanger geometries was studied. It would be computationally most efficient to use the same posterior distribution among different geometries in the optimisation of heat exchanger networks. According to the results, this was possible when the frontal surface areas were the same among different geometries. In the other cases the same posterior distribution can also be used for optimisation, but that will give a wider predictive distribution as a result. For condensing surface heat exchangers the numerical stability of the simulation model was studied. As a result, a stable algorithm was developed.
Abstract:
In the Russian Wholesale Market, electricity and capacity are traded separately. Capacity is a special good, the sale of which obliges suppliers to keep their generating equipment ready to produce the quantity of electricity indicated by the System Operator. The purpose of the formation of capacity trading was the maintenance of reliable and uninterrupted delivery of electricity in the wholesale market. The price of capacity reflects constant investments in construction, modernization and maintenance of power plants. So, the capacity sale creates favorable conditions to attract investments in the energy sector because it guarantees the investor that his investments will be returned.
Abstract:
In this work a fuzzy linear system is used to solve the Leontief input-output model with fuzzy entries. For solving this model, we assume that the consumption matrix from different sectors of the economy and the demand are known. These assumptions depend heavily on the information obtained from the industries. Hence uncertainties are involved in this information. The aim of this work is to model these uncertainties and to address them by fuzzy entries such as fuzzy numbers and LR-type fuzzy numbers (triangular and trapezoidal). A fuzzy linear system has been developed using fuzzy data, and it is solved using the Gauss-Seidel algorithm. Numerical examples show the efficiency of this algorithm. The famous example from Prof. Leontief, where he solved the production levels for the U.S. economy in 1958, is also further analyzed.
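As a point of reference, the crisp (non-fuzzy) Leontief model x = Ax + d can be rewritten as (I - A)x = d and solved with Gauss-Seidel iteration. The sketch below assumes ordinary real-valued entries rather than the fuzzy numbers used in the work, and the two-sector consumption matrix and demand vector are invented for illustration; they are not Leontief's 1958 data.

```python
def gauss_seidel(M, b, tol=1e-12, max_iter=10_000):
    """Solve M x = b by Gauss-Seidel sweeps, starting from the zero vector."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Use the most recent values of x for the off-diagonal terms.
            s = sum(M[i][j] * x[j] for j in range(n) if j != i)
            new_value = (b[i] - s) / M[i][i]
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            return x
    raise RuntimeError("Gauss-Seidel did not converge")


# Hypothetical two-sector economy (illustrative numbers only).
A = [[0.2, 0.3],
     [0.4, 0.1]]      # consumption matrix
d = [10.0, 20.0]      # final demand

# Leontief system: (I - A) x = d
n = len(d)
M = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
x = gauss_seidel(M, d)   # production levels, approximately [25.0, 33.33]
```

Because the row sums of this `A` are below 1, the matrix I - A is strictly diagonally dominant, which is a sufficient condition for the iteration to converge.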
Abstract:
The doctoral thesis examines the solving capability of a number of solvers for optimization problems and reveals a number of difficulties in making a fair solver comparison. In addition, some improvements made to one of the solvers, GAMS/AlphaECP, are presented. Optimization means, in this context, finding the best possible solution to a problem. The class of problems studied can be characterized as hard to solve and occurs in several industrial fields. The goal has been to investigate whether there is a solver that is universally faster and finds higher-quality solutions than any of the other solvers. The commercial optimization system GAMS (General Algebraic Modeling System) and extensive problem libraries have been used to compare solvers. The improvements presented have been made to the GAMS/AlphaECP solver, which is based on the Extended Cutting Plane (ECP) cutting plane method. The ECP method has been developed mainly by Professor Tapio Westerlund at Anläggnings- och systemteknik (Process Design and Systems Engineering) at Åbo Akademi.
Abstract:
Machine learning provides tools for automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction, and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the best-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
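The link between pairwise ranking and AUC mentioned in the abstract can be made concrete: AUC equals the fraction of (positive, negative) pairs that a scoring function orders correctly, so the pairwise misranking rate is 1 - AUC. The sketch below is only an illustration of that identity, not RankRLS or the leave-pair-out procedure; the function name, the tie convention (half credit), and the toy data are assumptions.

```python
def pairwise_auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked in the right order.

    Labels are 1 (positive) or 0 (negative); tied scores count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example: one of the four positive/negative pairs is misordered.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = pairwise_auc(scores, labels)        # 0.75
misranking_rate = 1.0 - auc               # 0.25
```

The double loop is quadratic in the number of examples, which hints at why computational shortcuts matter for pairwise methods at scale.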
Abstract:
This Master's thesis defines an operation-time production optimization method for a biofuel-fired power plant. The specification work is part of the further development project of MW Power's MultiPower CHP power plant concept. From among the various existing optimization approaches, a suitable method based on a plant model and a cost function is selected, and its results are passed to the automation system in the form of setpoints for PID controllers. The plant's energy and mass balances are calculated from process measurements, and their results are used as input data for the next optimization step. The objective function of the optimization is a cost function whose terms are the revenues and costs arising from operating the power plant. The process is optimized, taking into account the limits given for the controllers, so that the total margin is maximized. As the plant accumulates operating age and historical data, process optimization can be sped up by statistically searching the historical data for a moment whose conditions correspond to the current situation. The margin of that historical moment is compared with the margin obtained from optimizing the cost function. The setpoints computed by the method giving the better margin are taken into use for process control. If neither the cost function calculation nor the search based on historical data gives an improved margin, the setpoints they compute are not taken into use. Instead, the optimum is sought with a deterministic optimization algorithm, which searches the neighborhood of the current operating point for controller setpoints that give a better margin. The control system can also be implemented in a predictive form. In the practical part of the work, the power plant model is created with two different modeling programs, one describing the operation of the boiler and the other the operation of the power plant process. The process values obtained from the modeling are used as input data in the calculation of the operating margin. The margin is calculated on the basis of the cost function.
The largest revenues relate to the sale of electricity and heat and to the production subsidy, and the largest costs relate to repayment of the investment and to fuel purchases. A sensitivity analysis is performed on the cost function, tracking the change in the margin as the technical values of the process are varied. The results are compared with the results of verification measurements performed at a reference power plant, and it is observed that the results are not fully consistent. The differences are due both to shortcomings in the modeling and to the rather short observation periods of the measurements. The practical implementation of the automated optimization system is prepared by defining the optimization approach to be adopted, the associated control loops, and the required input data. The project will be continued with programming, testing, and tuning of the system in a real power plant environment, and later with the implementation of predictive control.
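The setpoint selection procedure described in this abstract (compare the cost-function optimum with the best historical match, and fall back to a deterministic search near the current operating point if neither improves the margin) can be sketched roughly as follows. This is a hypothetical illustration only: the function name, the coordinate search used as the deterministic fallback, the step size, and the quadratic margin function are all assumptions, not the thesis's implementation.

```python
def select_setpoints(current, model_based, history_based, margin, limits,
                     step=1.0):
    """Pick controller setpoints that improve the margin, if any exist."""
    # 1) Take the better of the two candidate setpoint vectors.
    best = max((model_based, history_based), key=margin)
    if margin(best) > margin(current):
        return list(best)
    # 2) Neither candidate helps: deterministic coordinate search around
    #    the current point, staying inside the controller limits.
    candidate = list(current)
    improved = True
    while improved:
        improved = False
        for i in range(len(candidate)):
            for delta in (step, -step):
                trial = list(candidate)
                trial[i] += delta
                lo, hi = limits[i]
                if not lo <= trial[i] <= hi:
                    continue
                if margin(trial) > margin(candidate):
                    candidate = trial
                    improved = True
    return candidate


def toy_margin(s):
    """Made-up margin, maximal at setpoints (50, 10)."""
    return -((s[0] - 50.0) ** 2) - ((s[1] - 10.0) ** 2)


limits = [(0.0, 100.0), (0.0, 20.0)]
# Both candidates are worse than the current point, so the fallback runs.
chosen = select_setpoints([48.0, 9.0], [40.0, 9.0], [45.0, 12.0],
                          toy_margin, limits)
```

In this toy case the fallback search walks from (48, 9) to (50, 10); with a real plant model the margin evaluation would be far more expensive, which is presumably why the historical lookup is used as a shortcut.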
Abstract:
Tissue-based biomarkers are studied to obtain information about pathologic processes and cancer outcome, and to enable the development of patient-tailored treatments. The aim of this study was to investigate the potential prognostic and/or predictive value of selected biomarkers in colorectal cancer (CRC). Group IIA secretory phospholipase A2 (IIA PLA2) expression was assessed in 114 samples representing different phases of human colorectal carcinogenesis. Securin, Ki-67, CD44 variant 6 (CD44v6), aldehyde dehydrogenase 1 (ALDH1) and β-catenin were studied in a material including 227 rectal carcinoma patients treated with short-course preoperative radiotherapy (RT), long-course preoperative (chemo)RT (CRT) or surgery only. Epidermal growth factor receptor (EGFR) gene copy number (GCN), its heterogeneity in CRC tissue, and its association with response to the EGFR-targeted antibodies cetuximab and panitumumab were analyzed in a cohort of 76 metastatic CRC cases. IIA PLA2 expression was decreased in invasive carcinomas compared to adenomas, but was not related to patient survival. High securin expression after long-course (C)RT and high ALDH1 expression in node-negative rectal cancer were independent adverse prognostic factors, ALDH1 specifically in patients treated with adjuvant chemotherapy. The lack of membranous CD44v6 in the rectal cancer invasive front was associated with an infiltrative growth pattern and the risk of disease recurrence. Heterogeneous EGFR GCN increase predicted benefit from EGFR-targeted antibodies, also in the chemorefractory patient population. In summary, high securin and ALDH1 protein expression independently relate to poor outcome in subgroups of rectal cancer patients, potentially because of resistance to conventional chemotherapeutics. Heterogeneous increase in EGFR GCN was validated to be a promising predictive factor in the treatment of metastatic CRC.
Abstract:
Prediction means estimating the future value of an observable quantity. A characteristic of the Bayesian paradigm is that uncertainty about unknown quantities is expressed in the form of probabilities. A Bayesian predictive model is thus a probability distribution over the possible values that an observable, but not yet observed, quantity can take. In the articles included in the thesis, methods are developed which are applied, among other things, to the analysis of chromatographic data in criminal investigations. With the exception of the first article, all of the methods are based on Bayesian predictive modeling. The articles mainly consider three different types of problems related to chromatographic data: quantification, pairwise matching, and clustering. In the first article, a non-parametric model is developed for the measurement error of chromatographic analyses of blood alcohol content. In the second article, a predictive inference method is developed for the comparison of two samples. The method is applied in the third article to the comparison of oil samples in order to identify the polluting source in connection with oil spills. In the fourth article, a predictive model is derived for clustering data of mixed discrete and continuous type, which is applied, among other things, to the classification of amphetamine samples with respect to production batches.
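In generic textbook notation (an assumption here, not necessarily the notation of the thesis), the Bayesian predictive distribution described above is obtained by averaging the sampling model for a future observation over the posterior of the unknown parameters, where $y$ denotes the observed data, $\tilde{y}$ the not yet observed quantity, and $\theta$ the model parameters:

```latex
p(\tilde{y} \mid y)
  = \int p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, \mathrm{d}\theta ,
\qquad
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, \mathrm{d}\theta'} .
```

The first expression assumes that $\tilde{y}$ is conditionally independent of $y$ given $\theta$, which is the standard setting for this formula.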
Abstract:
Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining this with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing (NGS) studies will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to be able to create effective predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated not only to be effective at predicting the disease phenotypes, but also to do so efficiently through the use of computational shortcuts. While much of the work could be run on high-end desktops, some work was further extended so that it could be implemented on parallel computers, helping to ensure that the methods will also scale to the NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm.
It was shown that there is no universally optimal algorithm for variant selection in GWAS; rather, methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can result in overly optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype-phenotype relationships and biological insights from genetic data sets.
Abstract:
The purpose of this paper is to examine the stability and predictive abilities of the beta coefficients of individual equities in the Finnish stock market. As beta is widely used in several areas of finance, including risk management, asset pricing and performance evaluation among others, it is important to understand its characteristics and find out whether its estimates can be trusted and utilized.
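For context, an equity's beta is conventionally estimated as the covariance of its returns with the market's returns divided by the variance of the market's returns. The sketch below shows that standard estimator on made-up return series; the function name and the data are illustrative assumptions and do not come from the paper.

```python
def beta_coefficient(asset_returns, market_returns):
    """Estimate beta = Cov(asset, market) / Var(market) from paired returns."""
    n = len(asset_returns)
    if n != len(market_returns) or n < 2:
        raise ValueError("need two return series of equal length >= 2")
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    # Sample covariance and variance (n - 1 in the denominator).
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns)) / (n - 1)
    var = sum((m - mean_m) ** 2 for m in market_returns) / (n - 1)
    if var == 0.0:
        raise ValueError("market returns have zero variance")
    return cov / var


# Made-up returns: the asset moves exactly twice as much as the market.
market = [0.01, -0.02, 0.03, 0.00]
asset = [0.02, -0.04, 0.06, 0.00]
beta = beta_coefficient(asset, market)   # approximately 2.0
```

One simple way to probe stability in this setting is to apply the same estimator to consecutive sub-periods and compare the resulting values.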