929 results for Parameter tuning
Abstract:
The scheduling problem is considered in complexity theory to be an NP-hard combinatorial optimization problem. Meta-heuristics have proved very useful in solving this class of problems. However, these techniques require parameter tuning, which is a very hard task to perform. A Case-based Reasoning (CBR) module is proposed to solve the parameter tuning problem in a Multi-Agent Scheduling System. A computational study is performed to evaluate the performance of the proposed CBR module.
Abstract:
The growing population on earth, along with diminishing fossil deposits and the climate change debate, calls for a better utilization of renewable, bio-based materials. In a biorefinery perspective, renewable biomass is converted into many different products such as fuels, chemicals, and materials, quite similar to the petroleum refinery industry. Since forests cover about one third of the land surface on earth, ligno-cellulosic biomass is the most abundant renewable resource available. The natural first step in a biorefinery is separation and isolation of the different compounds the biomass is composed of. The major components in wood are cellulose, hemicellulose, and lignin, all of which can be made into various end-products. Today, the focus normally lies on utilizing only one component, e.g., the cellulose in the Kraft pulping process. It would be highly desirable to utilize all the different compounds, from both an economic and an environmental point of view. The separation process should therefore be optimized. Hemicelluloses can partly be extracted with hot water prior to pulping. Depending on the severity of the extraction, the hemicelluloses are degraded to various degrees. In order to be able to choose from a variety of different end-products, the hemicelluloses should be as intact as possible after the extraction. The main focus of this work has been on preserving the hemicellulose molar mass throughout the extraction at a high yield by actively controlling the extraction pH at the high temperatures used. Since it has not been possible to measure pH during an extraction due to the high temperatures, the extraction pH has remained a “black box”. Therefore, a high-temperature in-line pH measuring system was developed, validated, and tested for hot-water wood extractions. Calibration is a crucial step in the measurements, so extensive effort was put into developing a reliable calibration procedure.
Initial extractions with wood showed that the actual extraction pH was ~0.35 pH units higher than previously believed. The measuring system was also equipped with a controller connected to a pump. With this addition it was possible to control the extraction to any desired pH set point. When the pH dropped below the set point, the controller started pumping in alkali, and in this way the desired set point was maintained very accurately. Analyses of the extracted hemicelluloses showed that less hemicellulose was extracted at higher pH, but with a higher molar mass. Monomer formation could, at a certain pH level, be completely inhibited. Increasing the temperature while maintaining a specific pH set point would speed up the extraction without degrading the molar mass of the hemicelluloses, thereby intensifying the extraction. The diffusion of the dissolved hemicelluloses from the wood particle is a major part of the extraction process. Therefore, a particle size study ranging from 0.5 mm wood particles to industrial-size wood chips was conducted to investigate the internal mass transfer of the hemicelluloses. Unsurprisingly, it showed that hemicelluloses were extracted faster from smaller wood particles than from larger ones, although particle size did not seem to have a substantial effect on the average molar mass of the extracted hemicelluloses. However, smaller particle sizes require more energy to manufacture and thus increase the economic cost. Since bark comprises 10–15 % of a tree, it is important to also consider it in a biorefinery concept. Spruce inner and outer bark were hot-water extracted separately to investigate the possibility of isolating the bark hemicelluloses. It was shown that the bark hemicelluloses consisted mostly of pectic material and differed considerably from the wood hemicelluloses. The bark hemicelluloses, or pectins, could be extracted at lower temperatures than the wood hemicelluloses.
A chemical characterization, done separately on inner and outer bark, showed that inner bark contained over 10 % stilbene glucosides that could be extracted already at 100 °C with aqueous acetone.
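The set-point control described in the abstract above (alkali is pumped in whenever the pH falls below the set point) can be illustrated with a toy on/off controller. This is only a sketch: the linear pH drift, the dose size and the starting pH are invented for illustration and do not model real extraction chemistry, and the function name `simulate` is hypothetical.

```python
def simulate(setpoint, steps=200, drift=-0.02, dose=0.05, ph0=4.5):
    """Toy on/off pH control: pump alkali whenever pH is below the set point.

    drift, dose and ph0 are illustrative assumptions, not measured values.
    """
    ph = ph0
    history = []
    for _ in range(steps):
        pump_on = ph < setpoint          # controller decision
        ph += drift                      # acids released during extraction lower the pH
        if pump_on:
            ph += dose                   # alkali addition raises the pH
        history.append((ph, pump_on))
    return history

history = simulate(setpoint=4.0)
```

In this toy model the pH settles into a narrow band around the set point whose width is governed by the dose size and the drift per step.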
Abstract:
Support vector machines (SVMs) were originally formulated for the solution of binary classification problems. In multiclass problems, a decomposition approach is often employed, in which the multiclass problem is divided into multiple binary subproblems, whose results are combined. Generally, the performance of SVM classifiers is affected by the selection of values for their parameters. This paper investigates the use of genetic algorithms (GAs) to tune the parameters of the binary SVMs in common multiclass decompositions. The developed GA may search for a set of parameter values common to all binary classifiers or for differentiated values for each binary classifier. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
A neural network enhanced proportional, integral and derivative (PID) controller is presented that combines the attributes of neural network learning with a generalized minimum-variance self-tuning control (STC) strategy. The neuro PID controller is structured with plant model identification and PID parameter tuning. The plants to be controlled are approximated by an equivalent model composed of a simple linear submodel, which approximates plant dynamics around operating points, plus an error agent, which accommodates the errors induced by linear submodel inaccuracy due to non-linearities and other complexities. A generalized recursive least-squares algorithm is used to identify the linear submodel, and a layered neural network is used to detect the error agent, with the weights updated on the basis of the error between the plant output and the output from the linear submodel. The procedure for controller design is based on the equivalent model, and therefore the error agent is naturally incorporated into the control law. In this way the controller can deal not only with a wide range of linear dynamic plants but also with complex plants characterized by severe non-linearity, uncertainties and non-minimum phase behaviours. Two simulation studies are provided to demonstrate the effectiveness of the controller design procedure.
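As a rough illustration of the identification step mentioned above, the sketch below applies plain recursive least squares (not the generalized variant the abstract refers to) to a first-order linear submodel y[t] = a*y[t-1] + b*u[t-1]. The model order, the forgetting factor, the simulated plant and all names are assumptions made for the example.

```python
import random

def rls_identify(samples, lam=0.99, p0=1000.0):
    """Estimate (a, b) in y[t] = a*y[t-1] + b*u[t-1] by recursive least squares.

    samples: iterable of ((y_prev, u_prev), y); lam is the forgetting factor.
    """
    theta = [0.0, 0.0]                       # parameter estimates [a, b]
    P = [[p0, 0.0], [0.0, p0]]               # covariance matrix, large = little prior trust
    for phi, y in samples:
        Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                P[1][0] * phi[0] + P[1][1] * phi[1]]
        denom = lam + phi[0] * Pphi[0] + phi[1] * Pphi[1]
        gain = [Pphi[0] / denom, Pphi[1] / denom]
        err = y - (phi[0] * theta[0] + phi[1] * theta[1])   # one-step prediction error
        theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
        # P is symmetric, so phi^T P equals Pphi transposed.
        P = [[(P[i][j] - gain[i] * Pphi[j]) / lam for j in range(2)] for i in range(2)]
    return theta

# Simulated noise-free plant with hypothetical true parameters.
rng = random.Random(1)
a_true, b_true = 0.8, 0.5
y_prev, samples = 0.0, []
for _ in range(200):
    u_prev = rng.uniform(-1.0, 1.0)          # random excitation input
    y = a_true * y_prev + b_true * u_prev
    samples.append(((y_prev, u_prev), y))
    y_prev = y

a_hat, b_hat = rls_identify(samples)
```

In the scheme the abstract describes, the residual that such a linear submodel cannot explain would be what the neural network "error agent" is trained on.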
Abstract:
This thesis deals with the advantages obtainable from automatic parameter tuning techniques, applying an implementation of iterated racing to an innovative self-organizing traffic-light control system inspired by swarm intelligence concepts.
Abstract:
The performance of Metaheuristics is highly dependent on their parameters, which need to be tuned. Parameter tuning may allow greater flexibility and robustness but requires careful initialization. Deciding which parameter settings should be used is not obvious. The values of the parameters depend mainly on the problem, the instance to be solved, the search time available for solving the problem, and the required solution quality. This paper proposes a learning module for the autonomous parameterization of Metaheuristics, integrated into a Multi-Agent System for solving Dynamic Scheduling problems. The proposed learning module is inspired by the Autonomic Computing Self-Optimization concept, which states that systems must continuously and proactively improve their performance. Learning is implemented with Case-based Reasoning, which uses previous similar data to solve new cases. Case-based Reasoning assumes that similar cases have similar solutions. After a literature review of the topics involved, both the AutoDynAgents system and the Self-Optimization module are described. Finally, a computational study is presented in which the proposed module is evaluated, the results obtained are compared with previous ones, conclusions are drawn, and future work is outlined. It is expected that this proposal can be a significant contribution to the self-parameterization of Metaheuristics and to the resolution of scheduling problems in dynamic environments.
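The idea that "similar cases have similar solutions" can be sketched as a nearest-neighbour lookup over stored cases. The following is a minimal illustration, not the AutoDynAgents implementation: the case features (number of jobs, number of machines), the parameter names and all values are hypothetical.

```python
import math

# Hypothetical case base: each case pairs a problem description
# (number of jobs, number of machines) with parameter values that worked for it.
CASE_BASE = [
    {"features": (10, 1), "params": {"tabu_length": 5, "iterations": 200}},
    {"features": (50, 1), "params": {"tabu_length": 12, "iterations": 800}},
    {"features": (100, 5), "params": {"tabu_length": 20, "iterations": 2000}},
]

def distance(a, b):
    """Euclidean distance between two feature tuples (unnormalized, for brevity)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(features, case_base):
    """Retrieve step: return the stored case most similar to the new problem."""
    return min(case_base, key=lambda case: distance(case["features"], features))

def retain(features, params, case_base):
    """Retain step: store the solved problem so later queries can reuse it."""
    case_base.append({"features": tuple(features), "params": dict(params)})

new_problem = (45, 1)
suggested = dict(retrieve(new_problem, CASE_BASE)["params"])   # reuse the nearest case
retain(new_problem, suggested, CASE_BASE)
```

A fuller system would also revise the reused parameters after running the metaheuristic and would normalize the features before measuring similarity.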
Batch effect confounding leads to strong bias in performance estimates obtained by cross-validation.
Abstract:
BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, is performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.
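The nested cross-validation scheme mentioned under METHODS can be sketched as follows: an inner loop tunes the parameter using only the outer-training data, and the outer loop scores the tuned classifier on held-out data. This is a simplified illustration with a one-dimensional kNN classifier and synthetic data; the fold counts, parameter grid and data are assumptions, and no feature selection or batch effects are modelled.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D features, labels 0/1)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def folds(data, n):
    """Yield (train, test) pairs for n interleaved folds."""
    for i in range(n):
        test = data[i::n]
        train = [p for j, p in enumerate(data) if j % n != i]
        yield train, test

def nested_cv(data, k_grid, outer=5, inner=4):
    outer_scores = []
    for train, test in folds(data, outer):
        # Inner loop: choose k using the outer-training data only.
        def inner_score(k):
            return sum(accuracy(tr, va, k) for tr, va in folds(train, inner)) / inner
        best_k = max(k_grid, key=inner_score)
        # Outer loop: score the tuned classifier on data not used for tuning.
        outer_scores.append(accuracy(train, test, best_k))
    return sum(outer_scores) / len(outer_scores)

# Synthetic two-group data (illustrative only).
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(40)] + \
       [(rng.gauss(2.0, 1.0), 1) for _ in range(40)]
rng.shuffle(data)
estimate = nested_cv(data, k_grid=[1, 3, 5, 7])
```

Because the test fold of the outer loop never influences the choice of k, the outer score is not inflated by the tuning itself; the abstract's point is that it can still be biased when batch and group are confounded.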
Abstract:
A recurring task in the analysis of mass genome annotation data from high-throughput technologies is the identification of peaks or clusters in a noisy signal profile. Examples of such applications are the definition of promoters on the basis of transcription start site profiles, the mapping of transcription factor binding sites based on ChIP-chip data and the identification of quantitative trait loci (QTL) from whole genome SNP profiles. Input to such an analysis is a set of genome coordinates associated with counts or intensities. The output consists of a discrete number of peaks with respective volumes, extensions and center positions. We have developed for this purpose a flexible one-dimensional clustering tool, called MADAP, which we make available as a web server and as a standalone program. A set of parameters enables the user to customize the procedure to a specific problem. The web server, which returns results in textual and graphical form, is useful for small to medium-scale applications, as well as for evaluation and parameter tuning in view of large-scale applications, which require a local installation. The program, written in C++, can be freely downloaded from ftp://ftp.epd.unil.ch/pub/software/unix/madap. The MADAP web server can be accessed at http://www.isrec.isb-sib.ch/madap/.
Abstract:
Forest inventories are used to estimate forest characteristics and the condition of forest for many different applications: operational tree logging for forest industry, forest health state estimation, carbon balance estimation, land-cover and land use analysis in order to avoid forest degradation etc. Recent inventory methods are strongly based on remote sensing data combined with field sample measurements, which are used to define estimates covering the whole area of interest. Remote sensing data from satellites, aerial photographs or aerial laser scannings are used, depending on the scale of inventory. To be applicable in operational use, forest inventory methods need to be easily adjusted to local conditions of the study area at hand. All the data handling and parameter tuning should be objective and automated as much as possible. The methods also need to be robust when applied to different forest types. Since there generally are no extensive direct physical models connecting the remote sensing data from different sources to the forest parameters that are estimated, mathematical estimation models are of "black-box" type, connecting the independent auxiliary data to dependent response data with linear or nonlinear arbitrary models. To avoid redundant complexity and over-fitting of the model, which is based on up to hundreds of possibly collinear variables extracted from the auxiliary data, variable selection is needed. To connect the auxiliary data to the inventory parameters that are estimated, field work must be performed. In larger study areas with dense forests, field work is expensive, and should therefore be minimized. To get cost-efficient inventories, field work could partly be replaced with information from formerly measured sites, databases. The work in this thesis is devoted to the development of automated, adaptive computation methods for aerial forest inventory. 
The mathematical model parameter definition steps are automated, and the cost-efficiency is improved by setting up a procedure that utilizes databases in the estimation of new area characteristics.
Abstract:
Several studies and applications address laser welding monitoring and parameter control, but none has been developed for controlling laser scribing processes. Laser scribing is considered a very fast and accurate process, so an accurate tuning and monitoring system for it would be needed. This research investigates whether real-time adaptive control can be developed for ultra-fast laser scribing processes using online spectrometer monitoring. The thesis presents how control code for laser parameter tuning is developed in National Instruments' LabVIEW and how the spectrometer is used in online monitoring. Results are based on the behavior of the control code and the accuracy of the spectrometer monitoring when scribing different steel materials. Finally, the success of the control code is evaluated and possible ideas for future development are presented.
Abstract:
Data mining is one of the most active research areas today, with a wide variety of applications in everyday life. It is about finding interesting hidden patterns in a large historical database. As an example, from a sales database one can use data mining to find an interesting pattern such as “people who buy magazines tend to buy newspapers also”. From the sales point of view, the advantage is that these items can be placed together in the shop to increase sales. In this research work, data mining is applied to a domain called placement chance prediction, since making a wise career decision is crucial for everyone. In India, technical manpower analysis is carried out by an organization named the National Technical Manpower Information System (NTMIS), established in 1983-84 by India's Ministry of Education & Culture. The NTMIS comprises a lead centre in the IAMR, New Delhi, and 21 nodal centres located in different parts of the country. The Kerala State Nodal Centre is located at Cochin University of Science and Technology. The Nodal Centre collects placement information by sending postal questionnaires to graduated students on a regular basis. From this raw data available in the nodal centre, a historical database was prepared. Each record in this database includes entrance rank range, reservation, sector, sex, and a particular engineering branch. For each such combination of attributes in the historical database of student records, the corresponding placement chance is computed and stored in the database. From this data, various popular data mining models are built and tested. These models can be used to predict the most suitable branch for a particular new student with one of the above combinations of criteria.
A detailed performance comparison of the various data mining models is also carried out. This research work proposes to use a combination of data mining models, namely a hybrid stacking ensemble, for better predictions. A strategy to predict the overall absorption rate for various branches, as well as the time it takes for all the students of a particular branch to get placed, is also proposed. Finally, this research work puts forward a new data mining algorithm for numeric data sets, namely C 4.5 * stat, which has been shown to have competitive accuracy on standard benchmark data sets (the UCI data sets). It also proposes an optimization strategy called parameter tuning to improve the standard C 4.5 algorithm. In summary, this research work covers all four dimensions of a typical data mining research work, namely application to a domain, development of classifier models, optimization and ensemble methods.
Abstract:
An efficient model identification algorithm for a large class of linear-in-the-parameters models is introduced that simultaneously optimises the model approximation ability, sparsity and robustness. The derived model parameters in each forward regression step are initially estimated via the orthogonal least squares (OLS), followed by being tuned with a new gradient-descent learning algorithm based on the basis pursuit that minimises the l(1) norm of the parameter estimate vector. The model subset selection cost function includes a D-optimality design criterion that maximises the determinant of the design matrix of the subset to ensure model robustness and to enable the model selection procedure to automatically terminate at a sparse model. The proposed approach is based on the forward OLS algorithm using the modified Gram-Schmidt procedure. Both the parameter tuning procedure, based on basis pursuit, and the model selection criterion, based on the D-optimality that is effective in ensuring model robustness, are integrated with the forward regression. As a consequence the inherent computational efficiency associated with the conventional forward OLS approach is maintained in the proposed algorithm. Examples demonstrate the effectiveness of the new approach.
Abstract:
Over Arctic sea ice, pressure ridges and floe and melt pond edges all introduce discrete obstructions to the flow of air or water past the ice and are a source of form drag. In current climate models form drag is only accounted for by tuning the air–ice and ice–ocean drag coefficients, that is, by effectively altering the roughness length in a surface drag parameterization. The existing approach of tuning the skin drag parameter is poorly constrained by observations and fails to describe correctly the physics associated with the air–ice and ocean–ice drag. Here, the authors combine recent theoretical developments to deduce the total neutral form drag coefficients from properties of the ice cover such as ice concentration, vertical extent and area of the ridges, freeboard and floe draft, and the size of floes and melt ponds. The drag coefficients are incorporated into the Los Alamos Sea Ice Model (CICE), and the authors show the influence of the new drag parameterization on the motion and state of the ice cover, the most noticeable effect being a depletion of sea ice over the west boundary of the Arctic Ocean and over the Beaufort Sea. The new parameterization allows the drag coefficients to be coupled to the sea ice state and therefore to evolve spatially and temporally. It is found that the range of values predicted for the drag coefficients agrees with the range of values measured in several regions of the Arctic. Finally, the implications of the new form drag formulation for the spinup or spindown of the Arctic Ocean are discussed.
Abstract:
The Capacitated Arc Routing Problem (CARP) is a well-known NP-hard combinatorial optimization problem where, given an undirected graph, the objective is to find a minimum cost set of tours servicing a subset of required edges under vehicle capacity constraints. There are numerous applications for the CARP, such as street sweeping, garbage collection, mail delivery, school bus routing, and meter reading. A Greedy Randomized Adaptive Search Procedure (GRASP) with Path-Relinking (PR) is proposed and compared with other successful CARP metaheuristics. Some features of this GRASP with PR are (i) reactive parameter tuning, where the parameter value is selected stochastically, biased in favor of those values which historically produced the best solutions on average; (ii) a statistical filter, which discards initial solutions if they are unlikely to improve the incumbent best solution; (iii) infeasible local search, where high-quality solutions, though infeasible, are used to explore the feasible/infeasible boundaries of the solution space; (iv) evolutionary PR, a recent trend where the pool of elite solutions is progressively improved by successive relinking of pairs of elite solutions. Computational tests were conducted using a set of 81 instances, and results reveal that the GRASP is very competitive, achieving the best overall deviation from lower bounds and the highest number of best solutions found. © 2011 Elsevier Ltd. All rights reserved.
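Feature (i) above, reactive parameter tuning, can be sketched in the style commonly used for reactive GRASP: each candidate parameter value keeps a running average of the solution costs it produced, and selection probabilities are periodically re-biased toward values with lower averages. The probability rule, the update schedule and the toy cost function below are assumptions for illustration, not the paper's exact procedure.

```python
import random

def reactive_probabilities(avg_cost, best_cost, delta=1.0):
    """Selection probabilities favouring values with low average cost (minimization).

    Assumes strictly positive costs; delta sharpens or flattens the bias.
    """
    q = [(best_cost / a) ** delta for a in avg_cost]
    total = sum(q)
    return [x / total for x in q]

def reactive_tuning(values, evaluate, iterations=300, update_every=30, delta=1.0, seed=0):
    rng = random.Random(seed)
    probs = [1.0 / len(values)] * len(values)    # start from a uniform distribution
    totals = [0.0] * len(values)
    counts = [0] * len(values)
    best = float("inf")
    for it in range(1, iterations + 1):
        i = rng.choices(range(len(values)), weights=probs)[0]
        cost = evaluate(values[i], rng)          # stands in for one GRASP iteration
        totals[i] += cost
        counts[i] += 1
        best = min(best, cost)
        # Periodically re-bias the probabilities once every value has been tried.
        if it % update_every == 0 and all(counts):
            avg = [t / c for t, c in zip(totals, counts)]
            probs = reactive_probabilities(avg, best, delta)
    return probs, best

def toy_cost(alpha, rng):
    """Hypothetical cost: best near alpha = 0.3, plus bounded noise."""
    return 100.0 + 50.0 * abs(alpha - 0.3) + rng.uniform(0.0, 5.0)

values = [0.1, 0.3, 0.5, 0.7, 0.9]
probs, best = reactive_tuning(values, toy_cost)
```

With this toy cost, the value 0.3 ends up with the highest selection probability, while the other values keep a nonzero chance of being sampled, so the search continues to explore.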