21 resultados para Ant-based algorithm
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
Paperin pinnan karheus on yksi paperin laatukriteereistä. Sitä mitataan fyysisestipaperin pintaa mittaavien laitteiden ja optisten laitteiden avulla. Mittaukset vaativat laboratorioolosuhteita, mutta nopeammille, suoraan linjalla tapahtuville mittauksilla olisi tarvetta paperiteollisuudessa. Paperin pinnan karheus voidaan ilmaista yhtenä näytteelle kohdistuvana karheusarvona. Tässä työssä näyte on jaettu merkitseviin alueisiin, ja jokaiselle alueelle on laskettu erillinen karheusarvo. Karheuden mittaukseen on käytetty useita menetelmiä. Yleisesti hyväksyttyä tilastollista menetelmää on käytetty tässä työssä etäisyysmuunnoksen lisäksi. Paperin pinnan karheudenmittauksessa on ollut tarvetta jakaa analysoitava näyte karheuden perusteella alueisiin. Aluejaon avulla voidaan rajata näytteestä selvästi karheampana esiintyvät alueet. Etäisyysmuunnos tuottaa alueita, joita on analysoitu. Näistä alueista on muodostettu yhtenäisiä alueita erilaisilla segmentointimenetelmillä. PNN -menetelmään (Pairwise Nearest Neighbor) ja naapurialueiden yhdistämiseen perustuvia algoritmeja on käytetty.Alueiden jakamiseen ja yhdistämiseen perustuvaa lähestymistapaa on myös tarkasteltu. Segmentoitujen kuvien validointi on yleensä tapahtunut ihmisen tarkastelemana. Tämän työn lähestymistapa on verrata yleisesti hyväksyttyä tilastollista menetelmää segmentoinnin tuloksiin. Korkea korrelaatio näiden tulosten välillä osoittaa onnistunutta segmentointia. Eri kokeiden tuloksia on verrattu keskenään hypoteesin testauksella. Työssä on analysoitu kahta näytesarjaa, joidenmittaukset on suoritettu OptiTopolla ja profilometrillä. Etäisyysmuunnoksen aloitusparametrit, joita muutettiin kokeiden aikana, olivat aloituspisteiden määrä ja sijainti. Samat parametrimuutokset tehtiin kaikille algoritmeille, joita käytettiin alueiden yhdistämiseen. Etäisyysmuunnoksen jälkeen korrelaatio oli voimakkaampaa profilometrillä mitatuille näytteille kuin OptiTopolla mitatuille näytteille. Segmentoiduilla OptiTopo -näytteillä korrelaatio parantui voimakkaammin kuin profilometrinäytteillä. PNN -menetelmän tuottamilla tuloksilla korrelaatio oli paras.
Resumo:
Main purpose of this thesis is to introduce a new lossless compression algorithm for multispectral images. Proposed algorithm is based on reducing the band ordering problem to the problem of finding a minimum spanning tree in a weighted directed graph, where set of the graph vertices corresponds to multispectral image bands and the arcs’ weights have been computed using a newly invented adaptive linear prediction model. The adaptive prediction model is an extended unification of 2–and 4–neighbour pixel context linear prediction schemes. The algorithm provides individual prediction of each image band using the optimal prediction scheme, defined by the adaptive prediction model and the optimal predicting band suggested by minimum spanning tree. Its efficiency has been compared with respect to the best lossless compression algorithms for multispectral images. Three recently invented algorithms have been considered. Numerical results produced by these algorithms allow concluding that adaptive prediction based algorithm is the best one for lossless compression of multispectral images. Real multispectral data captured from an airplane have been used for the testing.
Resumo:
This work presents synopsis of efficient strategies used in power managements for achieving the most economical power and energy consumption in multicore systems, FPGA and NoC Platforms. In this work, a practical approach was taken, in an effort to validate the significance of the proposed Adaptive Power Management Algorithm (APMA), proposed for system developed, for this thesis project. This system comprise arithmetic and logic unit, up and down counters, adder, state machine and multiplexer. The essence of carrying this project firstly, is to develop a system that will be used for this power management project. Secondly, to perform area and power synopsis of the system on these various scalable technology platforms, UMC 90nm nanotechnology 1.2v, UMC 90nm nanotechnology 1.32v and UMC 0.18 μmNanotechnology 1.80v, in order to examine the difference in area and power consumption of the system on the platforms. Thirdly, to explore various strategies that can be used to reducing system’s power consumption and to propose an adaptive power management algorithm that can be used to reduce the power consumption of the system. The strategies introduced in this work comprise Dynamic Voltage Frequency Scaling (DVFS) and task parallelism. After the system development, it was run on FPGA board, basically NoC Platforms and on these various technology platforms UMC 90nm nanotechnology1.2v, UMC 90nm nanotechnology 1.32v and UMC180 nm nanotechnology 1.80v, the system synthesis was successfully accomplished, the simulated result analysis shows that the system meets all functional requirements, the power consumption and the area utilization were recorded and analyzed in chapter 7 of this work. This work extensively reviewed various strategies for managing power consumption which were quantitative research works by many researchers and companies, it's a mixture of study analysis and experimented lab works, it condensed and presents the whole basic concepts of power management strategy from quality technical papers.
Resumo:
With the development of electronic devices, more and more mobile clients are connected to the Internet and they generate massive data every day. We live in an age of “Big Data”, and every day we generate hundreds of million magnitude data. By analyzing the data and making prediction, we can carry out better development plan. Unfortunately, traditional computation framework cannot meet the demand, so the Hadoop would be put forward. First the paper introduces the background and development status of Hadoop, compares the MapReduce in Hadoop 1.0 and YARN in Hadoop 2.0, and analyzes the advantages and disadvantages of them. Because the resource management module is the core role of YARN, so next the paper would research about the resource allocation module including the resource management, resource allocation algorithm, resource preemption model and the whole resource scheduling process from applying resource to finishing allocation. Also it would introduce the FIFO Scheduler, Capacity Scheduler, and Fair Scheduler and compare them. The main work has been done in this paper is researching and analyzing the Dominant Resource Fair algorithm of YARN, putting forward a maximum resource utilization algorithm based on Dominant Resource Fair algorithm. The paper also provides a suggestion to improve the unreasonable facts in resource preemption model. Emphasizing “fairness” during resource allocation is the core concept of Dominant Resource Fair algorithm of YARM. Because the cluster is multiple users and multiple resources, so the user’s resource request is multiple too. The DRF algorithm would divide the user’s resources into dominant resource and normal resource. For a user, the dominant resource is the one whose share is highest among all the request resources, others are normal resource. The DRF algorithm requires the dominant resource share of each user being equal. But for these cases where different users’ dominant resource amount differs greatly, emphasizing “fairness” is not suitable and can’t promote the resource utilization of the cluster. By analyzing these cases, this thesis puts forward a new allocation algorithm based on DRF. The new algorithm takes the “fairness” into consideration but not the main principle. Maximizing the resource utilization is the main principle and goal of the new algorithm. According to comparing the result of the DRF and new algorithm based on DRF, we found that the new algorithm has more high resource utilization than DRF. The last part of the thesis is to install the environment of YARN and use the Scheduler Load Simulator (SLS) to simulate the cluster environment.
Resumo:
Tehoelektoniikkalaitteella tarkoitetaan ohjaus- ja säätöjärjestelmää, jolla sähköä muokataan saatavilla olevasta muodosta haluttuun uuteen muotoon ja samalla hallitaan sähköisen tehon virtausta lähteestä käyttökohteeseen. Tämä siis eroaa signaalielektroniikasta, jossa sähköllä tyypillisesti siirretään tietoa hyödyntäen eri tiloja. Tehoelektroniikkalaitteita vertailtaessa katsotaan yleensä niiden luotettavuutta, kokoa, tehokkuutta, säätötarkkuutta ja tietysti hintaa. Tyypillisiä tehoelektroniikkalaitteita ovat taajuudenmuuttajat, UPS (Uninterruptible Power Supply) -laitteet, hitsauskoneet, induktiokuumentimet sekä erilaiset teholähteet. Perinteisesti näiden laitteiden ohjaus toteutetaan käyttäen mikroprosessoreja, ASIC- (Application Specific Integrated Circuit) tai IC (Intergrated Circuit) -piirejä sekä analogisia säätimiä. Tässä tutkimuksessa on analysoitu FPGA (Field Programmable Gate Array) -piirien soveltuvuutta tehoelektroniikan ohjaukseen. FPGA-piirien rakenne muodostuu erilaisista loogisista elementeistä ja niiden välisistä yhdysjohdoista.Loogiset elementit ovat porttipiirejä ja kiikkuja. Yhdysjohdot ja loogiset elementit ovat piirissä kiinteitä eikä koostumusta tai lukumäärää voi jälkikäteen muuttaa. Ohjelmoitavuus syntyy elementtien välisistä liitännöistä. Piirissä on lukuisia, jopa miljoonia kytkimiä, joiden asento voidaan asettaa. Siten piirin peruselementeistä voidaan muodostaa lukematon määrä erilaisia toiminnallisia kokonaisuuksia. FPGA-piirejä on pitkään käytetty kommunikointialan tuotteissa ja siksi niiden kehitys on viime vuosina ollut nopeaa. Samalla hinnat ovat pudonneet. Tästä johtuen FPGA-piiristä on tullut kiinnostava vaihtoehto myös tehoelektroniikkalaitteiden ohjaukseen. Väitöstyössä FPGA-piirien käytön soveltuvuutta on tutkittu käyttäen kahta vaativaa ja erilaista käytännön tehoelektroniikkalaitetta: taajuudenmuuttajaa ja hitsauskonetta. Molempiin testikohteisiin rakennettiin alan suomalaisten teollisuusyritysten kanssa soveltuvat prototyypit,joiden ohjauselektroniikka muutettiin FPGA-pohjaiseksi. Lisäksi kehitettiin tätä uutta tekniikkaa hyödyntävät uudentyyppiset ohjausmenetelmät. Prototyyppien toimivuutta verrattiin vastaaviin perinteisillä menetelmillä ohjattuihin kaupallisiin tuotteisiin ja havaittiin FPGA-piirien mahdollistaman rinnakkaisen laskennantuomat edut molempien tehoelektroniikkalaitteiden toimivuudessa. Työssä on myösesitetty uusia menetelmiä ja työkaluja FPGA-pohjaisen säätöjärjestelmän kehitykseen ja testaukseen. Esitetyillä menetelmillä tuotteiden kehitys saadaan mahdollisimman nopeaksi ja tehokkaaksi. Lisäksi työssä on kehitetty FPGA:n sisäinen ohjaus- ja kommunikointiväylärakenne, joka palvelee tehoelektroniikkalaitteiden ohjaussovelluksia. Uusi kommunikointirakenne edistää lisäksi jo tehtyjen osajärjestelmien uudelleen käytettävyyttä tulevissa sovelluksissa ja tuotesukupolvissa.
Resumo:
The parameter setting of a differential evolution algorithm must meet several requirements: efficiency, effectiveness, and reliability. Problems vary. The solution of a particular problem can be represented in different ways. An algorithm most efficient in dealing with a particular representation may be less efficient in dealing with other representations. The development of differential evolution-based methods contributes substantially to research on evolutionary computing and global optimization in general. The objective of this study is to investigatethe differential evolution algorithm, the intelligent adjustment of its controlparameters, and its application. In the thesis, the differential evolution algorithm is first examined using different parameter settings and test functions. Fuzzy control is then employed to make control parameters adaptive based on an optimization process and expert knowledge. The developed algorithms are applied to training radial basis function networks for function approximation with possible variables including centers, widths, and weights of basis functions and both having control parameters kept fixed and adjusted by fuzzy controller. After the influence of control variables on the performance of the differential evolution algorithm was explored, an adaptive version of the differential evolution algorithm was developed and the differential evolution-based radial basis function network training approaches were proposed. Experimental results showed that the performance of the differential evolution algorithm is sensitive to parameter setting, and the best setting was found to be problem dependent. The fuzzy adaptive differential evolution algorithm releases the user load of parameter setting and performs better than those using all fixedparameters. Differential evolution-based approaches are effective for training Gaussian radial basis function networks.
Resumo:
The purpose of this thesis is to present a new approach to the lossy compression of multispectral images. Proposed algorithm is based on combination of quantization and clustering. Clustering was investigated for compression of the spatial dimension and the vector quantization was applied for spectral dimension compression. Presenting algo¬rithms proposes to compress multispectral images in two stages. During the first stage we define the classes' etalons, another words to each uniform areas are located inside the image the number of class is given. And if there are the pixels are not yet assigned to some of the clusters then it doing during the second; pass and assign to the closest eta¬lons. Finally a compressed image is represented with a flat index image pointing to a codebook with etalons. The decompression stage is instant too. The proposed method described in this paper has been tested on different satellite multispectral images from different resources. The numerical results and illustrative examples of the method are represented too.
Resumo:
Diplomityössä on käsitelty uudenlaisia menetelmiä riippumattomien komponenttien analyysiin(ICA): Menetelmät perustuvat colligaatioon ja cross-momenttiin. Colligaatio menetelmä perustuu painojen colligaatioon. Menetelmässä on käytetty kahden tyyppisiä todennäköisyysjakaumia yhden sijasta joka perustuu yleiseen itsenäisyyden kriteeriin. Työssä on käytetty colligaatio lähestymistapaa kahdella asymptoottisella esityksellä. Gram-Charlie ja Edgeworth laajennuksia käytetty arvioimaan todennäköisyyksiä näissä menetelmissä. Työssä on myös käytetty cross-momentti menetelmää joka perustuu neljännen asteen cross-momenttiin. Menetelmä on hyvin samankaltainen FastICA algoritmin kanssa. Molempia menetelmiä on tarkasteltu lineaarisella kahden itsenäisen muuttajan sekoituksella. Lähtö signaalit ja sekoitetut matriisit ovattuntemattomia signaali lähteiden määrää lukuunottamatta. Työssä on vertailtu colligaatio menetelmään ja sen modifikaatioita FastICA:an ja JADE:en. Työssä on myös tehty vertailu analyysi suorituskyvyn ja keskusprosessori ajan suhteen cross-momenttiin perustuvien menetelmien, FastICA:n ja JADE:n useiden sekoitettujen parien kanssa.
Resumo:
This master’s thesis aims to study and represent from literature how evolutionary algorithms are used to solve different search and optimisation problems in the area of software engineering. Evolutionary algorithms are methods, which imitate the natural evolution process. An artificial evolution process evaluates fitness of each individual, which are solution candidates. The next population of candidate solutions is formed by using the good properties of the current population by applying different mutation and crossover operations. Different kinds of evolutionary algorithm applications related to software engineering were searched in the literature. Applications were classified and represented. Also the necessary basics about evolutionary algorithms were presented. It was concluded, that majority of evolutionary algorithm applications related to software engineering were about software design or testing. For example, there were applications about classifying software production data, project scheduling, static task scheduling related to parallel computing, allocating modules to subsystems, N-version programming, test data generation and generating an integration test order. Many applications were experimental testing rather than ready for real production use. There were also some Computer Aided Software Engineering tools based on evolutionary algorithms.
Resumo:
Learning of preference relations has recently received significant attention in machine learning community. It is closely related to the classification and regression analysis and can be reduced to these tasks. However, preference learning involves prediction of ordering of the data points rather than prediction of a single numerical value as in case of regression or a class label as in case of classification. Therefore, studying preference relations within a separate framework facilitates not only better theoretical understanding of the problem, but also motivates development of the efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, natural language processing, etc. For example, algorithms that learn to rank are frequently used in search engines for ordering documents retrieved by the query. Preference learning methods have been also applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from well founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, contribution of this thesis is twofold: kernel functions for structured data that are used to take advantage of various non-vectorial data representations and the preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amount of training data, and semi-supervised preference learning. Proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics domain. Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computation complexity scales linearly with the number of training data points. We also introduce sparse approximation of the algorithm that can be efficiently trained with large amount of data. For situations when small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, proposed algorithms lead to notably better performance in many preference learning tasks considered.
Resumo:
This thesis concentrates on developing a practical local approach methodology based on micro mechanical models for the analysis of ductile fracture of welded joints. Two major problems involved in the local approach, namely the dilational constitutive relation reflecting the softening behaviour of material, and the failure criterion associated with the constitutive equation, have been studied in detail. Firstly, considerable efforts were made on the numerical integration and computer implementation for the non trivial dilational Gurson Tvergaard model. Considering the weaknesses of the widely used Euler forward integration algorithms, a family of generalized mid point algorithms is proposed for the Gurson Tvergaard model. Correspondingly, based on the decomposition of stresses into hydrostatic and deviatoric parts, an explicit seven parameter expression for the consistent tangent moduli of the algorithms is presented. This explicit formula avoids any matrix inversion during numerical iteration and thus greatly facilitates the computer implementation of the algorithms and increase the efficiency of the code. The accuracy of the proposed algorithms and other conventional algorithms has been assessed in a systematic manner in order to highlight the best algorithm for this study. The accurate and efficient performance of present finite element implementation of the proposed algorithms has been demonstrated by various numerical examples. It has been found that the true mid point algorithm (a = 0.5) is the most accurate one when the deviatoric strain increment is radial to the yield surface and it is very important to use the consistent tangent moduli in the Newton iteration procedure. Secondly, an assessment of the consistency of current local failure criteria for ductile fracture, the critical void growth criterion, the constant critical void volume fraction criterion and Thomason's plastic limit load failure criterion, has been made. Significant differences in the predictions of ductility by the three criteria were found. By assuming the void grows spherically and using the void volume fraction from the Gurson Tvergaard model to calculate the current void matrix geometry, Thomason's failure criterion has been modified and a new failure criterion for the Gurson Tvergaard model is presented. Comparison with Koplik and Needleman's finite element results shows that the new failure criterion is fairly accurate indeed. A novel feature of the new failure criterion is that a mechanism for void coalescence is incorporated into the constitutive model. Hence the material failure is a natural result of the development of macroscopic plastic flow and the microscopic internal necking mechanism. By the new failure criterion, the critical void volume fraction is not a material constant and the initial void volume fraction and/or void nucleation parameters essentially control the material failure. This feature is very desirable and makes the numerical calibration of void nucleation parameters(s) possible and physically sound. Thirdly, a local approach methodology based on the above two major contributions has been built up in ABAQUS via the user material subroutine UMAT and applied to welded T joints. By using the void nucleation parameters calibrated from simple smooth and notched specimens, it was found that the fracture behaviour of the welded T joints can be well predicted using present methodology. This application has shown how the damage parameters of both base material and heat affected zone (HAZ) material can be obtained in a step by step manner and how useful and capable the local approach methodology is in the analysis of fracture behaviour and crack development as well as structural integrity assessment of practical problems where non homogeneous materials are involved. Finally, a procedure for the possible engineering application of the present methodology is suggested and discussed.
Resumo:
Metaheuristic methods have become increasingly popular approaches in solving global optimization problems. From a practical viewpoint, it is often desirable to perform multimodal optimization which, enables the search of more than one optimal solution to the task at hand. Population-based metaheuristic methods offer a natural basis for multimodal optimization. The topic has received increasing interest especially in the evolutionary computation community. Several niching approaches have been suggested to allow multimodal optimization using evolutionary algorithms. Most global optimization approaches, including metaheuristics, contain global and local search phases. The requirement to locate several optima sets additional requirements for the design of algorithms to be effective in both respects in the context of multimodal optimization. In this thesis, several different multimodal optimization algorithms are studied in regard to how their implementation in the global and local search phases affect their performance in different problems. The study concentrates especially on variations of the Differential Evolution algorithm and their capabilities in multimodal optimization. To separate the global and local search search phases, three multimodal optimization algorithms are proposed, two of which hybridize the Differential Evolution with a local search method. As the theoretical background behind the operation of metaheuristics is not generally thoroughly understood, the research relies heavily on experimental studies in finding out the properties of different approaches. To achieve reliable experimental information, the experimental environment must be carefully chosen to contain appropriate and adequately varying problems. The available selection of multimodal test problems is, however, rather limited, and no general framework exists. As a part of this thesis, such a framework for generating tunable test functions for evaluating different methods of multimodal optimization experimentally is provided and used for testing the algorithms. The results demonstrate that an efficient local phase is essential for creating efficient multimodal optimization algorithms. Adding a suitable global phase has the potential to boost the performance significantly, but the weak local phase may invalidate the advantages gained from the global phase.
Resumo:
The development of correct programs is a core problem in computer science. Although formal verification methods for establishing correctness with mathematical rigor are available, programmers often find these difficult to put into practice. One hurdle is deriving the loop invariants and proving that the code maintains them. So called correct-by-construction methods aim to alleviate this issue by integrating verification into the programming workflow. Invariant-based programming is a practical correct-by-construction method in which the programmer first establishes the invariant structure, and then incrementally extends the program in steps of adding code and proving after each addition that the code is consistent with the invariants. In this way, the program is kept internally consistent throughout its development, and the construction of the correctness arguments (proofs) becomes an integral part of the programming workflow. A characteristic of the approach is that programs are described as invariant diagrams, a graphical notation similar to the state charts familiar to programmers. Invariant-based programming is a new method that has not been evaluated in large scale studies yet. The most important prerequisite for feasibility on a larger scale is a high degree of automation. The goal of the Socos project has been to build tools to assist the construction and verification of programs using the method. This thesis describes the implementation and evaluation of a prototype tool in the context of the Socos project. The tool supports the drawing of the diagrams, automatic derivation and discharging of verification conditions, and interactive proofs. It is used to develop programs that are correct by construction. The tool consists of a diagrammatic environment connected to a verification condition generator and an existing state-of-the-art theorem prover. Its core is a semantics for translating diagrams into verification conditions, which are sent to the underlying theorem prover. We describe a concrete method for 1) deriving sufficient conditions for total correctness of an invariant diagram; 2) sending the conditions to the theorem prover for simplification; and 3) reporting the results of the simplification to the programmer in a way that is consistent with the invariantbased programming workflow and that allows errors in the program specification to be efficiently detected. The tool uses an efficient automatic proof strategy to prove as many conditions as possible automatically and lets the remaining conditions be proved interactively. The tool is based on the verification system PVS and i uses the SMT (Satisfiability Modulo Theories) solver Yices as a catch-all decision procedure. Conditions that were not discharged automatically may be proved interactively using the PVS proof assistant. The programming workflow is very similar to the process by which a mathematical theory is developed inside a computer supported theorem prover environment such as PVS. The programmer reduces a large verification problem with the aid of the tool into a set of smaller problems (lemmas), and he can substantially improve the degree of proof automation by developing specialized background theories and proof strategies to support the specification and verification of a specific class of programs. We demonstrate this workflow by describing in detail the construction of a verified sorting algorithm. Tool-supported verification often has little to no presence in computer science (CS) curricula. Furthermore, program verification is frequently introduced as an advanced and purely theoretical topic that is not connected to the workflow taught in the early and practically oriented programming courses. Our hypothesis is that verification could be introduced early in the CS education, and that verification tools could be used in the classroom to support the teaching of formal methods. A prototype of Socos has been used in a course at Åbo Akademi University targeted at first and second year undergraduate students. We evaluate the use of Socos in the course as part of a case study carried out in 2007.
Resumo:
Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.
Resumo:
The maintenance of electric distribution network is a topical question for distribution system operators because of increasing significance of failure costs. In this dissertation the maintenance practices of the distribution system operators are analyzed and a theory for scheduling maintenance activities and reinvestment of distribution components is created. The scheduling is based on the deterioration of components and the increasing failure rates due to aging. The dynamic programming algorithm is used as a solving method to maintenance problem which is caused by the increasing failure rates of the network. The other impacts of network maintenance like environmental and regulation reasons are not included to the scope of this thesis. Further the tree trimming of the corridors and the major disturbance of the network are not included to the problem optimized in this thesis. For optimizing, four dynamic programming models are presented and the models are tested. Programming is made in VBA-language to the computer. For testing two different kinds of test networks are used. Because electric distribution system operators want to operate with bigger component groups, optimal timing for component groups is also analyzed. A maintenance software package is created to apply the presented theories in practice. An overview of the program is presented.