975 results for Selection Algorithms


Relevance:

20.00%

Publisher:

Abstract:

Machine learning provides tools for automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, owing to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance in a wide range of application domains. Historically, the problems of classification and regression have received the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, which corresponds to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, or how these techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the best-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
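The pairwise view described in this abstract can be made concrete with a small sketch. The code below is illustrative only and is not taken from the thesis: the function names are hypothetical, the loss is written without the regularization term, and the brute-force loops ignore the matrix-algebra shortcuts the abstract mentions. It shows AUC computed as the fraction of correctly ordered (positive, negative) pairs, and an unregularized pairwise squared loss over all pairs of examples.

```python
from itertools import combinations, product


def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half correct.
    Labels are assumed to be 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p, n in product(pos, neg):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos) * len(neg))


def pairwise_squared_loss(scores, targets):
    """Sum over all pairs of ((y_i - y_j) - (f_i - f_j))^2.

    Hypothetical, unregularized form of a pairwise least-squares loss,
    computed by brute force over every pair of examples.
    """
    total = 0.0
    for i, j in combinations(range(len(scores)), 2):
        diff = (targets[i] - targets[j]) - (scores[i] - scores[j])
        total += diff * diff
    return total
```

A call such as `pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` counts three of the four positive-negative pairs as correctly ordered.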

Relevance:

20.00%

Publisher:

Abstract:

The benefit promoted by ectomycorrhizae depends on the interaction between the symbionts and the phosphorus (P) content. The effect of P on ectomycorrhiza formation, and the effectiveness of ectomycorrhizae in promoting plant growth as a basis for fungal pre-selection, were assessed under in vitro conditions. To evaluate the P effect, Eucalyptus urophylla seedlings inoculated with four Pisolithus sp. isolates, together with non-inoculated seedlings, were grown on substrate containing 0.87, 1.16 and 1.72 mg P per plant. To evaluate effectiveness and pre-select fungi, a further 30 isolates of Pisolithus sp., the Pisolithus microcarpus isolate ITA06, the Amanita muscaria isolate AM16 and the Scleroderma areolatum isolate SC129 were studied. Isolate D26 promoted the greatest plant height at all three P doses, D51 at the lowest dose and D72 at the intermediate dose. P doses did not influence shoot fresh weight or fungal colonization. In the pre-selection of fungi, 14 isolates of Pisolithus sp., the P. microcarpus isolate ITA06 and the S. areolatum isolate SC129 increased plant height and fresh weight. Isolate D82 of Pisolithus sp. affected plant height only, whereas D17 and D58 affected fresh weight only. Of these, only D15, D17, D58 and ITA06 formed typical ectomycorrhizae. In vitro cultivation proved adequate for the pre-selection of ectomycorrhizal fungi. Colonization and benefits depend on species and isolate. D15, D17 and D58 of Pisolithus sp. and the P. microcarpus isolate ITA06 are the most promising for nursery studies.

Relevance:

20.00%

Publisher:

Abstract:

This study estimates the repeatability coefficients of two production traits in two native populations of Brazil nut trees. It also determines the number of evaluation years needed for an efficient selection process, estimates the permanent phenotypic correlation between the production traits, and selects promising trees in these populations. The populations, located in the Itã region (ITA) and in the Cujubim region (CUJ), both belong to the municipality of Caracaraí, state of Roraima, Brazil, and consist of 85 and 51 adult trees, respectively. Each tree was evaluated for the number of fruits per plant (NFP) and fresh seed weight per plant (SWP), over eight (ITA) and five (CUJ) consecutive years. Statistical analyses were performed according to the mixed model methodology, using the software Selegen-REML/BLUP (RESENDE, 2007). The repeatability coefficients were low for NFP (0.3145 and 0.3269 for ITA and CUJ, respectively) and also for SWP (0.2957 and 0.3436 for ITA and CUJ, respectively). On average, nine evaluation years are needed to reach coefficients of determination higher than 80%. Permanent phenotypic correlation values higher than 0.95 were obtained between NFP and SWP in both populations. Although trees with a high number of fruits and high seed weight were identified, more evaluation years are needed to perform the selection process more efficiently.
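The link between a repeatability coefficient and the number of evaluation years can be illustrated with the relation commonly used in repeatability analysis, R² = m·r / (1 + (m − 1)·r), where r is the repeatability and m the number of measurements. The abstract does not state which formula the authors used, so the sketch below is an assumption, and the function names are hypothetical.

```python
import math


def determination(r, m):
    """Coefficient of determination after m measurements with repeatability r.

    Assumes the standard relation R^2 = m*r / (1 + (m - 1)*r).
    """
    return m * r / (1 + (m - 1) * r)


def years_needed(r, target=0.80):
    """Smallest whole number of measurements with determination >= target.

    Obtained by solving the relation above for m and rounding up.
    """
    return math.ceil(target * (1 - r) / ((1 - target) * r))


# Repeatability coefficients reported in the abstract.
reported = {
    "NFP/ITA": 0.3145,
    "NFP/CUJ": 0.3269,
    "SWP/ITA": 0.2957,
    "SWP/CUJ": 0.3436,
}
years = {name: years_needed(r) for name, r in reported.items()}
```

Under this assumed relation, the four reported coefficients call for between 8 and 10 evaluation years each, with a mean of 9, which is in line with the nine years stated in the abstract.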

Relevance:

20.00%

Publisher:

Abstract:

The objective of this study was to select lipase-producing microorganisms for the removal of oils and grease in the pre-treatment of biodiesel washing wastewater. To this end, the physicochemical characteristics of the biodiesel washing wastewater were analysed, and microorganisms were then isolated and selected on the basis of their lipase activity. A fat biodegradation test was then carried out at pH 5.95, 35 °C and 180 rpm, with ammonium sulfate (3 g L-1) as the nitrogen source, taking as variables the two preselected microorganisms and the time (24, 48, 72, 96 and 120 h). The biodiesel purification wastewater showed a high potential for environmental impact, with a concentration of O of 6.76 g L-1. Of the six isolated microbial cultures, two microorganisms (A and B) were selected, with enzymatic indices of 0.56 and 0.57, respectively. Treatment of the wastewater with the isolated microorganism (Klebsiella oxytoca) removed 80% of the fat in 48 h.

Relevance:

20.00%

Publisher:

Abstract:

This study addresses feature selection in classification problems. The role of feature selection methods is to select important features by discarding redundant and irrelevant features in the data set; we investigated this using fuzzy entropy measures. We developed a fuzzy entropy based feature selection method using Yu's similarity and tested it with a similarity classifier. The similarity classifier was based on Yu's similarity, and we tested it on a real-world dermatological data set. Performing feature selection based on fuzzy entropy measures before classification gave very promising empirical results: the highest classification accuracy, 98.83%, was achieved when applying our similarity measure to the data set. These results were then compared with results previously obtained using different similarity classifiers, and they show better accuracy than was achieved before. The methods used helped to reduce the dimensionality of the data set and to speed up the computation of the learning algorithm, and therefore simplified the classification task.
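As a rough illustration of entropy-based feature ranking, the sketch below uses the De Luca-Termini fuzzy entropy, one common fuzzy entropy measure. The abstract does not say which entropy measure was used, and Yu's similarity is not reproduced here: the membership degrees are simply taken as given inputs in [0, 1]. The function names, and the rule that lower entropy marks a more informative feature, are assumptions made for the example.

```python
import math


def fuzzy_entropy(memberships):
    """De Luca-Termini fuzzy entropy of membership degrees in [0, 1].

    Each degree mu contributes -(mu*ln(mu) + (1 - mu)*ln(1 - mu));
    terms with a zero argument are taken as 0.
    """
    h = 0.0
    for mu in memberships:
        if not 0.0 <= mu <= 1.0:
            raise ValueError("membership degrees must lie in [0, 1]")
        for p in (mu, 1.0 - mu):
            if p > 0.0:
                h -= p * math.log(p)
    return h


def rank_features(memberships_by_feature):
    """Return feature names ordered from lowest to highest fuzzy entropy.

    Assumption for this sketch: features at the end of the list (highest
    entropy) are the first candidates for removal.
    """
    return sorted(
        memberships_by_feature,
        key=lambda name: fuzzy_entropy(memberships_by_feature[name]),
    )
```

Degrees close to 0 or 1 contribute little entropy, while degrees near 0.5 contribute the most, so a feature whose similarity values are mostly ambiguous ends up last in the ranking.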

Relevance:

20.00%

Publisher:

Abstract:

The objective of this thesis work is to develop and study the Differential Evolution algorithm for multi-objective optimization with constraints. Differential Evolution is an evolutionary algorithm that has gained in popularity because of its simplicity and good observed performance. Multi-objective evolutionary algorithms have become popular since they are able to produce a set of compromise solutions during the search process to approximate the Pareto-optimal front. The starting point for this thesis was an idea of how Differential Evolution, with simple changes, could be extended for optimization with multiple constraints and objectives. This approach is implemented, experimentally studied, and further developed in the work. Development and study concentrate on the multi-objective optimization aspect. The main outcomes of the work are versions of a method called Generalized Differential Evolution. The versions aim to improve the performance of the method in multi-objective optimization. A diversity preservation technique that is effective and efficient compared to previous diversity preservation techniques is developed. The thesis also studies the influence of the control parameters of Differential Evolution in multi-objective optimization. Proposals for initial control parameter value selection are given. Overall, the work contributes to the diversity preservation of solutions in multi-objective optimization.
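The notion of a set of compromise solutions approximating the Pareto-optimal front rests on Pareto dominance, which can be sketched in a few lines. This is a generic illustration for minimization problems, not the selection rule of Generalized Differential Evolution; the function names are hypothetical and constraints are not handled.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized).

    a dominates b when a is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For the objective vectors `(1, 5)`, `(2, 2)`, `(3, 3)` and `(5, 1)`, only `(3, 3)` is dominated (by `(2, 2)`), so the other three form the non-dominated set.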

Relevance:

20.00%

Publisher:

Abstract:

In the development of human medicines, it is important to predict early and accurately enough the disease and patient population to be treated, as well as the effective and safe dose range of the studied medicine. This is pursued by using preclinical research models, clinical pharmacology and early clinical studies with small sample sizes. When successful, this enables effective development of medicines and reduces unnecessary exposure of healthy subjects and patients to ineffective or harmful doses of experimental compounds. Toremifene is a selective estrogen receptor modulator (SERM) used for the treatment of breast cancer. Its development was initiated in the 1980s, when the selection of treatment indications and doses was based on research in cell and animal models and on noncomparative clinical studies including small numbers of patients. Since the early development phase, the treatment indication, the patient population and the dose range have been confirmed in large comparative clinical studies in patients. Based on the currently available large and long-term clinical study data, the aim of this study was to investigate how well the early phase studies were able to predict the treatment indication, patient population and dose range of the SERM. In conclusion, and based on the estrogen receptor mediated mechanism of action, early studies were able to predict the treatment indication, the target patient population and a dose range to be studied in confirmatory clinical studies. However, comparative clinical studies are needed to optimize dose selection of the SERM in the treatment of breast cancer.

Relevance:

20.00%

Publisher:

Abstract:

Parameter estimation still remains a challenge in many important applications. There is a need to develop methods that exploit the growing capabilities of modern computational systems. For this reason, different kinds of Evolutionary Algorithms are becoming an especially promising field of research. The main aim of this thesis is to explore theoretical aspects of a specific class of Evolutionary Algorithms, the Differential Evolution (DE) method, and to implement this algorithm as code capable of solving a large range of problems. Matlab, a numerical computing environment provided by MathWorks, Inc., has been utilized for this purpose. Our implementation empirically demonstrates the benefits of stochastic optimizers over deterministic optimizers in the case of stochastic and chaotic problems. Furthermore, the advanced features of Differential Evolution are discussed as well as taken into account in the Matlab realization. Test "toycase" examples are presented in order to show the advantages and disadvantages caused by additional aspects involved in extensions of the basic algorithm. Another aim of this paper is to apply the DE approach to the parameter estimation problem of a system exhibiting chaotic behavior, where the well-known Lorenz system with a specific set of parameter values is taken as an example. Finally, the DE approach for estimation of chaotic dynamics is compared to the Ensemble prediction and parameter estimation system (EPPES) approach, which was recently proposed as a possible solution for similar problems.
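For readers unfamiliar with the basic algorithm, a minimal sketch of the classic DE/rand/1/bin scheme follows. It is written in Python rather than the Matlab used in the thesis, it is not the thesis implementation, and it omits the advanced features the abstract refers to. The function name, the default control parameter values (`f`, `cr`, `pop_size`, `generations`) and the clipping of mutants to the bounds are all assumptions made for this example.

```python
import random


def differential_evolution(cost, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimize cost(x) over box bounds with a basic DE/rand/1/bin loop.

    bounds is a list of (low, high) pairs, one per parameter.
    Returns the best parameter vector found and its cost.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(x) for x in pop]
    for _ in range(generations):
        new_pop, new_costs = list(pop), list(costs)
        for i in range(pop_size):
            # Three distinct members, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    # Mutation: base vector plus scaled difference vector.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to bounds (assumed policy)
                else:
                    v = pop[i][j]  # binomial crossover keeps the target gene
                trial.append(v)
            trial_cost = cost(trial)
            # Greedy selection: the trial replaces the target if not worse.
            if trial_cost <= costs[i]:
                new_pop[i], new_costs[i] = trial, trial_cost
        pop, costs = new_pop, new_costs
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]


def shifted_sphere(x):
    """Toy cost function with its minimum of 0 at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best_x, best_cost = differential_evolution(shifted_sphere, [(-5.0, 5.0)] * 2)
```

In a parameter estimation setting such as the one described in the abstract, `cost` would instead measure the mismatch between model output and observations; that part is left out here because the abstract gives no details of the cost function used.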

Relevance:

20.00%

Publisher:

Abstract:

In this work, mathematical programming models for structural and operational optimisation of energy systems are developed and applied to a selection of energy technology problems. The studied cases are taken from industrial processes and from large regional energy distribution systems. The models are based on Mixed Integer Linear Programming (MILP), Mixed Integer Non-Linear Programming (MINLP) and on a hybrid approach combining Non-Linear Programming (NLP) and Genetic Algorithms (GA). The optimisation of the structure and operation of energy systems in urban regions is treated in the work. Firstly, distributed energy systems (DES) with different energy conversion units and annual variations of consumer heating and electricity demands are considered. Secondly, district cooling systems (DCS) with cooling demands for a large number of consumers are studied from a long-term planning perspective, with regard to given predictions of the development of consumer cooling demand in a region. The work also comprises the development of applications for heat recovery systems (HRS), where the HRS of a paper machine dryer section is taken as an illustrative example. The heat sources in these systems are moist air streams. Models are developed for different types of equipment price functions. The approach is based on partitioning the overall temperature range of the system into a number of temperature intervals in order to take into account the strong nonlinearities due to condensation in the heat recovery exchangers. The influence of parameter variations on the solutions of heat recovery systems is analysed firstly by varying cost factors and secondly by varying process parameters. Point-optimal solutions obtained by a fixed parameter approach are compared to robust solutions with given parameter variation ranges.
The work also studies enhanced utilisation of excess heat in heat recovery systems with impingement drying, electricity generation with low-grade excess heat, and the use of absorption heat transformers to elevate a stream temperature above the excess heat temperature.

Relevance:

20.00%

Publisher:

Abstract:

Acquisitions are a way for a company to grow, enter new geographical areas, buy out competition or diversify. Acquisitions have recently grown in both size and value. Despite this, only approximately 25 percent of acquisitions reach their targets and goals. Companies making serial acquisitions seem to be exceptionally successful and succeed in the majority of their acquisitions. The main research question this study aims to answer is: “What issues impact the selection of acquired companies from the point of view of a serial acquirer?” The main research question is answered through three sub-questions: “What is a buying process for a serial acquirer like?”, “What are the motives for a serial acquirer to buy companies?” and “What is the connection between company strategy and serial acquisitions?”. The case company KONE is a globally operating company which mainly produces and maintains elevators and escalators. Its headquarters are located in Helsinki, Finland. The company has a long history of making acquisitions and makes 20-30 acquisitions a year. Through a key person interview, the acquisition process of the case company is compared with the literature on successful serial acquirers. The acquisition motives in this case are examined against three of the acquisition motive theories by Trautwein: efficiency theory, monopoly theory and valuation theory. The linkage between serial acquisitions and company strategy is studied through the key person interview. The main research findings are that the acquisition process of KONE is compatible with a successful acquisition process recognized in the literature (RAID). This study confirms the efficiency theory as an acquisition motive, and more specifically the operational synergies. The monopoly theory can only vaguely be supported by this study, but cannot be totally rejected because of the structure of the industry. The valuation theory does not get any support in this study and can therefore be rejected.
The linkage between company strategy and serial acquisitions is obvious, and making acquisitions can be seen as a growth strategy and as a part of other company strategies.

Relevance:

20.00%

Publisher:

Abstract:

The aim of this work was to propose, apply and evaluate a methodical approach to selecting welding processes in a productive environment based on the market requirements of Quality and Costs. A case study was used. The welds were carried out in the laboratory, simulating the joint conditions of a manufacturer and using several welding processes: SMAW, GTAW, pulsed GTAW, GMAW with CO2 and Ar based shielding gases, and pulsed GMAW. For the Quality analysis, geometrical aspects of the beads were considered; for the Cost analysis, welding parameters and consumable prices were considered. Quantitative indices were proposed and evaluated. Quality and Costs were then evaluated together, showing that it is possible to select the welding process most suitable for a specific application, taking into account the market conditions of a company.

Relevance:

20.00%

Publisher:

Abstract:

This thesis examines the application of data envelopment analysis (DEA) as an equity portfolio selection criterion in the Finnish stock market during the period 2001-2011. The sample examined covers the majority of the publicly traded firms in the Helsinki Stock Exchange. Data envelopment analysis is used to determine the efficiency of firms using a set of input and output financial parameters. The set of financial parameters consists of asset utilization, liquidity, capital structure, growth, valuation and profitability measures. The firms are divided into artificial industry categories because of the industry-specific nature of the input and output parameters. Comparable portfolios are formed within each industry category according to the efficiency scores given by the DEA, and the performance of the portfolios is evaluated with several measures. The empirical evidence of this thesis suggests that, with certain limitations, data envelopment analysis can successfully be used as a portfolio selection criterion in the Finnish stock market when the portfolios are rebalanced annually according to the efficiency scores given by the data envelopment analysis. However, when the portfolios are rebalanced every two or three years, the results are mixed and inconclusive.