58 results for Selection Algorithms

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

70.00%

Publisher:

Abstract:

Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining these data with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing (NGS) studies will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to be able to effectively create predictive models for various genotype-phenotype relationships. This work explores the problem of selecting the genetic variant subsets that are most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper, and embedded algorithms. The examined machine learning algorithms were demonstrated not only to be effective at predicting the disease phenotypes, but also to do so efficiently through the use of computational shortcuts. While much of the work could be run on high-end desktops, some of it was further extended for implementation on parallel computers, helping to ensure that the methods will also scale to NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm.
It was shown that there is no universally optimal algorithm for variant selection in GWAS; rather, methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can yield overly optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype-phenotype relationships and biological insights from genetic data sets.
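The pitfall that such validation guards against can be sketched in a few lines: feature selection must happen inside each training fold, never on the full data set before splitting, or the reported accuracy becomes overly optimistic. The minimal Python sketch below (synthetic data, a simple univariate filter, and a threshold classifier — all invented for illustration, not the thesis code) shows the fold-internal selection step:

```python
import random

random.seed(0)

# Synthetic data: feature 0 separates the two classes, features 1-9 are noise.
X, y = [], []
for i in range(60):
    label = i % 2
    row = [label * 2.0 + random.gauss(0, 0.3)]       # informative feature
    row += [random.gauss(0, 1.0) for _ in range(9)]  # noise features
    X.append(row)
    y.append(label)

def filter_select(X, y, k=1):
    """Univariate filter: rank features by |mean(class 1) - mean(class 0)|."""
    def score(j):
        a = [x[j] for x, t in zip(X, y) if t == 0]
        b = [x[j] for x, t in zip(X, y) if t == 1]
        return abs(sum(b) / len(b) - sum(a) / len(a))
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def train_threshold(X, y, feat):
    """Fit a midpoint threshold on one selected feature."""
    a = [x[feat] for x, t in zip(X, y) if t == 0]
    b = [x[feat] for x, t in zip(X, y) if t == 1]
    return (sum(a) / len(a) + sum(b) / len(b)) / 2.0

# Outer 3-fold CV: selection and fitting see only the training fold,
# so the test fold never influences which feature is chosen.
accs = []
for fold in [range(0, 20), range(20, 40), range(40, 60)]:
    test = set(fold)
    Xtr = [X[i] for i in range(60) if i not in test]
    ytr = [y[i] for i in range(60) if i not in test]
    feat = filter_select(Xtr, ytr)[0]   # selection inside the fold
    thr = train_threshold(Xtr, ytr, feat)
    correct = sum((X[i][feat] > thr) == (y[i] == 1) for i in test)
    accs.append(correct / len(test))

print(sum(accs) / len(accs))
```

A full nested cross-validation would add an inner loop over the training fold to tune, e.g., the number of selected features; the essential point shown here is that nothing touching the test fold leaks into the selection.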

Relevance:

20.00%

Publisher:

Abstract:

Summary: The embryo transfer breeding programme "ASMO", its objectives, and a summary of the results of the initial selection

Relevance:

20.00%

Publisher:

Abstract:

The objective of this Master's thesis is to identify, from the perspective of the Finnish sawmill machinery manufacturer Veisto Oy, the most significant near-future market areas, i.e. those whose sawmill industries will receive the largest high-technology investments over the coming years. The market areas are selected using ranking methods based on both numerical statistics and expert interviews. The first part of the thesis discusses the characteristics of international industrial markets and their analysis, with the main emphasis on screening methods, methods for comparing market areas, and decision-making tools. The second part focuses on the screening and analysis of the market areas and on comparing the characteristics of the candidate countries. Using decision matrices, the three currently most attractive market areas are selected for Veisto Oy: Russia, the Southern Yellow Pine region of the southeastern USA, and the largest sawmilling countries of South America (Brazil, Argentina and Chile) treated as a single region. Each selected region poses its own challenges for Veisto: in the USA, strong domestic competitors and obtaining new references; in Russia, the uncertainty of investments and the diversity arising from the sheer size of the market area; and in South America, strong Swedish competitors and, particularly in the case of Brazil, substantial protective tariffs.

Relevance:

20.00%

Publisher:

Abstract:

The objective of this study was to create guidelines for supplier selection and supplier performance evaluation for the case company, Exel Oyj. The guidelines were intended to serve as a starting point for developing the supplier selection and performance evaluation processes. The study focuses on presenting supplier selection criteria and supplier performance evaluation criteria. The criteria were selected and analysed on the basis of theory and empirical data, and clear lists of the criteria were compiled. These lists were used when considering the new selection and performance evaluation criteria that the case company can use in the future. The study also covers the supplier selection process as well as tools and metrics for supplier evaluation. The empirical data was collected by interviewing the purchasing manager and by gathering information from the annual report and the company's web site. The outcome of the study was a set of criteria lists the company can utilize in the future, together with lists of the criteria preliminarily selected for the company's use.

Relevance:

20.00%

Publisher:

Abstract:

The objective of the dissertation is to increase understanding and knowledge in the field where group decision support system (GDSS) and technology selection research overlap in the strategic sense. The purpose is to develop pragmatic, unique and competent management practices and processes for strategic technology assessment and selection from the whole company's point of view. The combination of the GDSS and technology selection is approached from the points of view of the core competence concept, the lead user method, and different technology types. This research aims to find out how the GDSS contributes to the technology selection process, what aspects should be considered when selecting technologies to be developed or acquired, and what advantages and restrictions the GDSS has in the selection processes. These research objectives are discussed on the basis of experiences and findings in real-life selection meetings. The research has been carried out mainly with constructive, case study research methods. The study contributes novel ideas to the present knowledge and prior literature in the GDSS and technology selection arena. Academic and pragmatic research has been conducted in four areas: 1) the potential benefits of the group support system with the lead user method, where the need assessment process is positioned as information gathering for the selection of wireless technology development projects; 2) integrated technology selection and core competence management processes, both in theory and in practice; 3) the potential benefits of the group decision support system in the technology selection processes of different technology types; and 4) linkages between technology selection and R&D project selection in innovative product development networks. A new type of knowledge and understanding has been created on the practical utilization of the GDSS in technology selection decisions.
The study demonstrates that technology selection requires close cooperation between different departments, functions, and strategic business units in order to gather the best knowledge for the decision making. The GDSS is shown to be an effective way to promote communication and cooperation between the selectors. The constructs developed in this study have been tested in many industry fields, for example in the information and communication, forest, telecommunication, metal, software, and miscellaneous industries, as well as in non-profit organizations. The pragmatic results in these organizations are among the most relevant proofs confirming the scientific contribution of the study, according to the principles of the constructive research approach.

Relevance:

20.00%

Publisher:

Abstract:

Superheater corrosion causes vast annual losses for power companies. With a reliable corrosion prediction method, plants can be designed accordingly, and knowledge of fuel selection and the determination of process conditions can be utilized to minimize superheater corrosion. Growing interest in using recycled fuels creates additional demands for the prediction of corrosion potential. Models depending on corrosion theories will fail if the relations between the inputs and the output are poorly known. A prediction model based on fuzzy logic and an artificial neural network is able to improve its performance as the amount of data increases. The corrosion rate of a superheater material can most reliably be detected with a test done in a test combustor or in a commercial boiler. The steel samples can be located in a special, temperature-controlled probe and exposed to the corrosive environment for a desired time. These tests give information about the average corrosion potential in that environment. Samples may also be cut from superheaters during shutdowns. The analysis of samples taken from probes or superheaters after exposure to a corrosive environment is a demanding task: if the corrosive contaminants can be reliably analyzed, the corrosion chemistry can be determined and an estimate of the material lifetime can be given. In cases where the reason for corrosion is not clear, the determination of the corrosion chemistry and the lifetime estimation is more demanding. In order to provide a laboratory tool for the analysis and prediction, a new approach was chosen. During this study, the following tools were generated:
· A model for the prediction of superheater fireside corrosion, based on fuzzy logic and an artificial neural network, built upon a corrosion database developed from fuel and bed material analyses and measured corrosion data. The developed model predicts superheater corrosion with high accuracy at the early stages of a project.
· An adaptive corrosion analysis tool based on image analysis, constructed as an expert system. This system utilizes the implementation of user-defined algorithms, which allows the development of an artificially intelligent system for the task. According to the results of the analyses, several new rules were developed for the determination of the degree and type of corrosion.
By combining these two tools, a user-friendly expert system for the prediction and analysis of superheater fireside corrosion was developed. This tool may also be used for the minimization of corrosion risks in the design of fluidized bed boilers.
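The fuzzy-logic side of such a predictor can be sketched in a few lines: inputs are mapped to fuzzy memberships and combined by rules into a risk score. The input variables, membership ranges, and the single rule below are invented for illustration and are not taken from the thesis:

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def corrosion_risk(chlorine_pct, steel_temp_c):
    """Toy fuzzy rule: risk is high when fuel chlorine is high AND the steel is hot.
    The ranges (0.1-1.0 wt-% Cl, 450-650 degC) are hypothetical."""
    high_cl = tri(chlorine_pct, 0.1, 0.5, 1.0)
    hot = tri(steel_temp_c, 450, 550, 650)
    return min(high_cl, hot)   # fuzzy AND via the minimum operator

print(corrosion_risk(0.5, 550))   # → 1.0 (both memberships at their peak)
```

In the combined approach described above, outputs of rules like this would feed, together with measured corrosion data, into a neural network that refines the prediction as the database grows.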

Relevance:

20.00%

Publisher:

Abstract:

The objective of this study was to determine criteria for selecting a new market for an industrial product. The study concentrated on the established approaches to international market selection and sought to apply one method in practice in the empirical part by means of a case study. The research approach was exploratory and based on secondary analysis. The data sources used were largely secondary, yielding qualitative data; however, interviews were also conducted. A comprehensive literature review of the established theoretical approaches to international market selection formed part of the study, and the three most important approaches were presented in more detail. One of them, the non-systematic approach, formed the framework for the empirical part of the study. The empirical part sought to apply one of the non-systematic models in the international paper industry. The aim was to identify the most attractive countries for potential marketing actions in one end-use area of the product. Climatic conditions, poultry headcount, and poultry growth rate were used as filters to reduce the number of candidate countries. The empirical part of the study clearly suffered from a lack of relevant data; the reliability and validity of the study can therefore be questioned to some extent.

Relevance:

20.00%

Publisher:

Abstract:

This work presents new, efficient Markov chain Monte Carlo (MCMC) simulation methods for statistical analysis in various modelling applications. When using MCMC methods, the model is simulated repeatedly to explore the probability distribution describing the uncertainties in the model parameters and predictions. In adaptive MCMC methods based on the Metropolis-Hastings algorithm, the proposal distribution needed by the algorithm learns from the target distribution as the simulation proceeds. Adaptive MCMC methods have been the subject of intensive research lately, as they open the way for essentially easier use of the methodology; the lack of user-friendly computer programs has been a main obstacle to wider acceptance of the methods. This work provides two new adaptive MCMC methods: DRAM and AARJ. The DRAM method has been built especially to work in high-dimensional and non-linear problems. The AARJ method is an extension of DRAM for model selection problems, where the mathematical formulation of the model is uncertain and we want to fit several different models to the same observations simultaneously. The methods were developed with the needs of modelling applications typical of the environmental sciences in mind, and the development work was pursued in the course of several application projects. The applications presented in this work are: a winter-time oxygen concentration model for Lake Tuusulanjärvi and adaptive control of the aerator; a nutrition model for Lake Pyhäjärvi and lake management planning; validation of the algorithms of the GOMOS ozone remote sensing instrument on board the Envisat satellite of the European Space Agency; and a study of the effects of aerosol model selection on the GOMOS algorithm.
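The adaptive-Metropolis idea behind methods of this kind can be sketched in one dimension: the proposal width is tuned from the chain's own history instead of being hand-picked by the user. The following is a toy illustration with a Gaussian target, not the DRAM implementation (DRAM additionally uses delayed rejection and full covariance adaptation):

```python
import math
import random

random.seed(1)

def log_target(x):
    """Unnormalized log-density of a N(3, 1) target distribution."""
    return -0.5 * (x - 3.0) ** 2

x, chain = 0.0, []
prop_sd = 1.0                     # initial guess at the proposal width
for i in range(20000):
    cand = x + random.gauss(0, prop_sd)
    # Metropolis accept/reject step (symmetric proposal).
    if math.log(random.random()) < log_target(cand) - log_target(x):
        x = cand
    chain.append(x)
    # Adaptation: periodically rescale the proposal from the chain's
    # empirical variance, so no manual step-size tuning is needed.
    if i >= 1000 and i % 100 == 0:
        mean = sum(chain) / len(chain)
        var = sum((c - mean) ** 2 for c in chain) / len(chain)
        prop_sd = 2.4 * math.sqrt(var + 1e-6)

burn = chain[5000:]               # discard burn-in samples
est_mean = sum(burn) / len(burn)
print(est_mean)                   # close to the true mean 3.0
```

In real applications the target is the posterior of a simulation model's parameters, and each `log_target` evaluation runs the model; the adaptation logic stays the same.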

Relevance:

20.00%

Publisher:

Abstract:

Drying is a major step in the manufacturing process in the pharmaceutical industry, and the selection of the dryer and its operating conditions is sometimes a bottleneck. In spite of the difficulties, these bottlenecks are handled with the utmost care because of good manufacturing practices (GMP) and the industry's image in the global market. The purpose of this work is to research the use of existing knowledge for the selection of a dryer and its operating conditions for the drying of pharmaceutical materials, with the help of methods such as case-based reasoning and decision trees, in order to reduce the time and expenditure required for research. The work consisted of two major parts: a literature survey on the theories of spray drying, case-based reasoning, and decision trees; and a working part comprising data acquisition and the testing of the models on existing and upgraded data. The testing resulted in a combination of the two models, case-based reasoning and decision trees, leading to more specific results when compared to conventional methods.
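The case-based reasoning half of such an approach can be sketched as a nearest-case retrieval: a new drying problem is compared attribute by attribute against stored past cases, and the dryer choice of the most similar case is reused. The cases, attributes, and dryer names below are invented for illustration and are not from the thesis:

```python
# Hypothetical case base: (feed properties, dryer that worked for them).
CASE_BASE = [
    ({"state": "liquid", "heat_sensitive": False, "batch": False}, "spray dryer"),
    ({"state": "liquid", "heat_sensitive": True,  "batch": True},  "freeze dryer"),
    ({"state": "solid",  "heat_sensitive": False, "batch": False}, "fluidized bed dryer"),
    ({"state": "paste",  "heat_sensitive": False, "batch": True},  "vacuum tray dryer"),
]

def similarity(query, case):
    """Fraction of attributes on which the query and a stored case agree."""
    return sum(query[k] == case[k] for k in query) / len(query)

def suggest_dryer(query):
    """Retrieve the most similar past case and reuse its solution."""
    best = max(CASE_BASE, key=lambda entry: similarity(query, entry[0]))
    return best[1]

print(suggest_dryer({"state": "liquid", "heat_sensitive": True, "batch": True}))
# → "freeze dryer" (exact match with the second stored case)
```

A decision-tree model would instead encode the same knowledge as explicit branching rules; combining the two, as the abstract describes, lets the rules narrow the options and the case base refine them.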

Relevance:

20.00%

Publisher:

Abstract:

Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves the prediction of an ordering of the data points rather than of a single numerical value, as in regression, or of a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem, but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data that are used to take advantage of various non-vectorial data representations, and preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics.
Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be efficiently trained with large amounts of data. For situations where only a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the efficient training of the algorithms but also fast regularization parameter selection, multiple-output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
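The regularized least-squares view of preference learning can be sketched with a linear scorer trained on pairwise comparisons: for each preferred pair (a, b), the score difference score(a) - score(b) is pushed toward a margin of 1, with an L2 penalty on the weights. The toy data and the plain gradient-descent solver below are illustrative only, not the thesis algorithms (which use kernels and closed-form solvers):

```python
def score(w, x):
    """Linear scoring function: the dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Each pair means "the first item is preferred over the second".
pairs = [
    ((2.0, 0.0), (1.0, 0.0)),
    ((1.5, 1.0), (0.5, 1.0)),
    ((3.0, 2.0), (2.0, 2.0)),
]

w = [0.0, 0.0]
lam, lr = 0.01, 0.05            # L2 regularization strength, learning rate
for _ in range(500):
    grad = [2 * lam * wi for wi in w]          # gradient of the L2 penalty
    for a, b in pairs:
        diff = score(w, a) - score(w, b) - 1.0  # residual to the margin of 1
        for j in range(len(w)):
            grad[j] += 2 * diff * (a[j] - b[j])
    w = [wi - lr * g for wi, g in zip(w, grad)]

# The learned scorer should rank every preferred item above its partner.
ranked_ok = all(score(w, a) > score(w, b) for a, b in pairs)
print(ranked_ok)   # → True
```

Squaring the residual on pairwise score differences is what makes this a least-squares formulation of ranking; replacing the dot product with a kernel evaluation extends the same objective to the non-linear and structured-data settings mentioned above.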

Relevance:

20.00%

Publisher:

Abstract:

Technology scaling has proceeded into dimensions in which the reliability of manufactured devices is becoming endangered. The decrease in reliability is a consequence of, among other things, physical limitations, the relative increase of variations, and decreasing noise margins. A promising solution for bringing the reliability of circuits back to a desired level is the use of design methods which introduce tolerance against possible faults in an integrated circuit. This thesis studies and presents fault tolerance methods for the network-on-chip (NoC), a design paradigm targeted at very large systems-on-chip. In a NoC, resources such as processors and memories are connected to a communication network, comparable to the Internet. Fault tolerance in such a system can be achieved at many abstraction levels. The thesis studies the origin of faults in modern technologies and explains their classification into transient, intermittent, and permanent faults. A survey of fault tolerance methods is presented to demonstrate the diversity of available methods. Networks-on-chip are approached by exploring their main design choices: the selection of a topology, a routing protocol, and a flow control method. Fault tolerance methods for NoCs are studied at different layers of the OSI reference model. The data link layer provides a reliable communication link over a physical channel. Error control coding is an efficient fault tolerance method at this abstraction level, especially against transient faults. Error control coding methods suitable for on-chip communication are studied and their implementations presented. Error control coding loses its effectiveness in the presence of intermittent and permanent faults; therefore, other solutions against them are presented. The introduction of spare wires and split transmissions is shown to provide good tolerance against intermittent and permanent errors, and their combination with error control coding is illustrated.
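A standard example of the kind of error control coding used at the data link layer is a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword — the signature of a transient fault on one wire. The sketch below is a generic textbook construction, not the thesis implementation:

```python
def encode(d):
    """Encode 4 data bits as a 7-bit Hamming codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4     # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4     # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4     # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode(c):
    """Locate and correct a single-bit error via the syndrome, return the data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 * 1 + s2 * 2 + s3 * 4   # 1-based position of the flipped bit, 0 if none
    if pos:
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

word = [1, 0, 1, 1]
cw = encode(word)
cw[4] ^= 1                   # simulate a transient fault flipping one bit in transit
print(decode(cw) == word)    # → True: the receiver recovers the original data
```

As the abstract notes, a code like this masks transient faults but cannot cope with a permanently broken wire on its own, which is why it is combined with spare wires and split transmissions.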
At the network layer, positioned above the data link layer, fault tolerance can be achieved through the design of fault-tolerant network topologies and routing algorithms. Both of these approaches are presented in the thesis, together with realizations in both categories. The thesis concludes that an optimal fault tolerance solution contains carefully co-designed elements from different abstraction levels.