947 results for stochastic search variable selection
Abstract:
A model based on chemical structure was developed for the accurate prediction of the octanol/water partition coefficient (K_OW) of polychlorinated biphenyls (PCBs), molecules of environmental interest. Partial least squares (PLS) regression was used to build the model, with topological indices as molecular descriptors. Variable selection was performed by Hierarchical Cluster Analysis (HCA). In the modeling process, the experimental K_OW values measured for 30 PCBs by thin-layer chromatography retention time (TLC-RT) were used. The developed model (Q² = 0.990 and r² = 0.994) was used to estimate log K_OW values for the 179 PCB congeners whose K_OW has not yet been measured by the TLC-RT method. The results show that topological indices can be very useful for predicting K_OW.
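As a rough illustration of the kind of PLS/Q² workflow this abstract describes, the sketch below fits a PLS model and computes a leave-one-out Q²; it uses scikit-learn and placeholder descriptor data, not the authors' 30-PCB dataset.

    # Hedged sketch: PLS regression with fitted r^2 and leave-one-out Q^2 (placeholder data).
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 6))         # stand-in for topological indices of 30 PCBs
    y = rng.normal(loc=6.0, size=30)     # stand-in for log K_OW measured by TLC-RT

    pls = PLSRegression(n_components=3).fit(X, y)
    r2 = pls.score(X, y)                                           # fitted r^2
    y_cv = cross_val_predict(pls, X, y, cv=LeaveOneOut()).ravel()  # leave-one-out predictions
    q2 = 1 - np.sum((y - y_cv) ** 2) / np.sum((y - y.mean()) ** 2) # cross-validated Q^2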
Abstract:
Calibration transfer has received considerable attention in the recent literature, and several standardization methods have been proposed for transferring calibration models between instruments. The goal of this paper is to present a general review of calibration transfer techniques. Basic concepts are reviewed, as well as the main advantages and drawbacks of each technique. A case study based on a set of 80 NIR spectra of maize samples recorded on two different instruments is used to illustrate the main calibration transfer techniques (direct standardization, piecewise direct standardization, orthogonal signal correction and robust variable selection).
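For orientation, direct standardization (the first technique listed) amounts to estimating a linear transfer matrix from spectra of the same transfer samples measured on both instruments; the sketch below shows that idea with made-up data, not the 80 maize spectra of the case study.

    # Hedged sketch of direct standardization (DS) with placeholder spectra.
    import numpy as np

    rng = np.random.default_rng(1)
    master = rng.normal(size=(20, 700))                    # 20 transfer samples on the master
    slave = master + 0.05 * rng.normal(size=master.shape)  # same samples on the slave

    F = np.linalg.pinv(slave) @ master  # transfer matrix mapping slave spectra to master space
    x_new = slave[0]                    # a new spectrum measured on the slave
    x_std = x_new @ F                   # standardized spectrum, usable with the master model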
Abstract:
Problem-Based Learning (PBL) can be used as a strategy for methodological change in conventional learning environments. In this paper, the integration of laboratory work into PBL-grounded activities during an introductory organic chemistry course is described, and the most decisive issues of its implementation are discussed. The results show how this methodology favours the contextualization of laboratory work within the subject matter and promotes Science-Technology-Society-Environment relationships. It also contributes to the development of competences such as planning and organization skills, information search and selection, and cooperative work, as well as to the improvement of tutorial action.
Abstract:
QSAR modeling is a novel computer program developed to generate and validate QSAR or QSPR (quantitative structure-activity or structure-property relationship) models. With QSAR modeling, users can build partial least squares (PLS) regression models, perform variable selection with the ordered predictors selection (OPS) algorithm, and validate models using y-randomization and leave-N-out cross-validation. An additional new feature is outlier detection, carried out by simultaneously comparing sample leverage with the respective Studentized residuals. The program was developed in Java version 6 and runs on any operating system that supports Java Runtime Environment version 6. The use of the program is illustrated. The program is available for download at lqta.iqm.unicamp.br.
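The leverage-versus-Studentized-residual check mentioned here is a standard diagnostic; the sketch below shows one common way to compute both quantities for an ordinary least-squares fit (the program itself works with PLS models in Java, so this is only an illustration of the idea, on placeholder data).

    # Hedged sketch: sample leverages and Studentized residuals on placeholder data.
    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(25, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=25)

    H = X @ np.linalg.pinv(X.T @ X) @ X.T   # hat matrix; its diagonal gives the leverages
    leverage = np.diag(H)
    resid = y - H @ y
    s = np.sqrt(resid @ resid / (X.shape[0] - X.shape[1]))
    t = resid / (s * np.sqrt(1 - leverage))  # (internally) Studentized residuals
    suspect = (leverage > 3 * X.shape[1] / X.shape[0]) & (np.abs(t) > 2.5)  # flagged samples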
Abstract:
This study developed and validated a method for moisture determination in artisanal Minas cheese using near-infrared spectroscopy and partial least squares. The robustness of the model was ensured by broad sample diversity, real routine-analysis conditions, variable selection, outlier detection and analytical validation. The model was built over the range 28.5-55.5% w/w, with a root mean square error of prediction of 1.6%. After its adoption, the stability of the method was confirmed over a period of two years through a control chart. Beyond this specific method, the present study sought to provide an example of a multivariate metrological methodology with potential for application in several areas, including new aspects such as a more stringent evaluation of the linearity of multivariate methods.
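For reference, the prediction-error figure quoted (1.6%) is a root mean square error of prediction; a minimal sketch of its computation, on made-up reference and predicted moisture values, is:

    # Hedged sketch: RMSEP from reference vs. predicted moisture values (placeholder numbers).
    import numpy as np

    y_ref = np.array([30.2, 41.5, 48.0, 52.3])    # reference moisture, % w/w
    y_pred = np.array([31.0, 40.2, 49.1, 51.5])   # NIR/PLS predictions, % w/w
    rmsep = np.sqrt(np.mean((y_ref - y_pred) ** 2))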
Abstract:
The aim of the present work was to provide a faster, simpler and less expensive way to analyze the sulfur content of diesel samples than the standard methods currently in use. Diesel samples with sulfur concentrations ranging from 400 to 2500 mg kg-1 were analyzed by two methodologies: X-ray fluorescence, according to ASTM D4294, and Fourier transform infrared spectrometry (FTIR). The spectral data obtained from FTIR were used to build multivariate calibration models by partial least squares (PLS). Four models were built in three different ways: 1) a model using the full spectrum (665 to 4000 cm-1); 2) two models using specific spectral regions; and 3) a model with variables selected by the classic stepwise variable selection method. The model obtained by stepwise variable selection and the model built with the spectral regions between 665 and 856 cm-1 and between 1145 and 2717 cm-1 gave the best results for the determination of sulfur content.
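As a rough illustration of the third strategy, the sketch below runs a forward selection of spectral variables before a PLS fit; it stands in for the classical stepwise procedure and uses placeholder data, not the FTIR spectra of the paper.

    # Hedged sketch: forward variable selection followed by a PLS model (placeholder data).
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.feature_selection import SequentialFeatureSelector

    rng = np.random.default_rng(3)
    X = rng.normal(size=(40, 50))                          # 40 diesel samples x 50 FTIR variables
    y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=40)   # stand-in for sulfur contents

    selector = SequentialFeatureSelector(
        PLSRegression(n_components=1),                     # one latent variable while selecting
        n_features_to_select=10, direction="forward", cv=5)
    selector.fit(X, y)
    X_sel = selector.transform(X)                          # spectral variables kept for the model
    final_model = PLSRegression(n_components=3).fit(X_sel, y)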
Abstract:
Forest inventories are used to estimate forest characteristics and forest condition for many different applications: operational tree logging for the forest industry, forest health estimation, carbon balance estimation, land-cover and land-use analysis aimed at avoiding forest degradation, and so on. Recent inventory methods rely heavily on remote sensing data combined with field sample measurements, which are used to produce estimates covering the whole area of interest. Remote sensing data from satellites, aerial photographs or aerial laser scanning are used, depending on the scale of the inventory. To be applicable in operational use, forest inventory methods need to be easily adjustable to the local conditions of the study area at hand. All data handling and parameter tuning should be objective and automated as much as possible, and the methods also need to be robust when applied to different forest types. Since there are generally no comprehensive physical models connecting the remote sensing data from different sources to the forest parameters being estimated, the mathematical estimation models are of the "black-box" type, connecting the independent auxiliary data to the dependent response data with arbitrary linear or nonlinear models. To avoid redundant complexity and over-fitting of a model based on up to hundreds of possibly collinear variables extracted from the auxiliary data, variable selection is needed. To connect the auxiliary data to the inventory parameters being estimated, field work must be performed. In large study areas with dense forests, field work is expensive and should therefore be minimized. To obtain cost-efficient inventories, field work can be partly replaced with information from previously measured sites stored in databases. The work in this thesis is devoted to the development of automated, adaptive computation methods for aerial forest inventory. The steps that define the mathematical model parameters are automated, and cost-efficiency is improved by setting up a procedure that utilizes databases in the estimation of the characteristics of new areas.
Abstract:
The simulations were implemented in Java.
Abstract:
Although the cerebral substrates of semantic memory (SM) have long been considered to remain intact in normal aging (NA), owing to the preserved performance of older adults on semantic tasks, several recent studies suggest that brain changes underlying semantic processing do occur with aging. These changes are thought to affect mainly the regions responsible for the executive aspects of semantic processing, which are involved in the search, selection and strategic manipulation of semantic information. However, the specific mechanisms governing the cerebral reorganization of semantic processing in NA remain poorly understood, partly because of methodological differences between studies. In addition, data from the literature suggest that age-related brain changes may also occur in relation to the visual perceptual aspects of word processing. Since word reading is an interactive and dynamic process between low-level perceptual functions and higher-level functions such as SM, there may be age-related changes in the cerebral interactions between the perceptual and semantic aspects of word processing. Overall, the objective of this thesis was to characterize the brain changes, and the time course of the brain signal, associated with the semantic and perceptual processing of words in NA, as well as the relations and modulations between semantic and perceptual processes in NA, using magnetoencephalography (MEG) as the investigation technique. First (chapter 2), the brain activation patterns of a group of young participants and a group of healthy older participants were compared while they performed a semantic judgment task on words in MEG, focusing on the signal around the N400, a component associated with semantic processing. The results show that age-related brain changes mainly affect the structures involved in the executive aspects of semantic processing. Greater activation of the inferior prefrontal cortex (IPC) was observed in young participants than in older participants, whereas the latter activated temporo-parietal regions more than young adults did. Moreover, the left anterior temporal lobe (ATL), considered a central and amodal region for semantic processing, was also more strongly activated by older participants than by young adults. Second (chapter 3), the brain activation patterns of a group of young participants and a group of healthy older participants were compared focusing on the signal associated with visual perceptual processing, i.e. within the first 200 milliseconds of word processing. The results show that age-related brain changes affect the fusiform gyrus as well as the semantic network, with greater activation in the older group, despite the absence of any activation difference in the extrastriate visual cortex between the two groups.
The theoretical implications of the results of these two studies are then discussed, and the limitations and future perspectives are finally addressed (chapter 4).
Abstract:
This paper presents a Reinforcement Learning (RL) approach to economic dispatch (ED) using a radial basis function neural network. We formulate ED as an N-stage decision-making problem. We propose a novel architecture to store Q-values and present a learning algorithm to learn the weights of the neural network. Even though many stochastic search techniques such as simulated annealing, genetic algorithms and evolutionary programming have been applied to ED, they require searching for the optimal solution anew for each load demand, and they are limited in handling stochastic cost functions. In our approach, once the Q-values are learned, the dispatch can be found for any load demand. We have recently proposed an RL approach to ED in which only the optimum dispatch for a set of specified discrete values of power demand could be found. The performance of the proposed algorithm is validated on the IEEE 6-bus system, considering transmission losses.
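To make the Q-value idea concrete, the sketch below performs one temporal-difference update of a radial-basis-function approximation of Q; the state/action encoding, reward and parameters are placeholders, not the economic dispatch formulation of the paper.

    # Hedged sketch: TD update of Q-values represented by an RBF network (placeholder problem).
    import numpy as np

    centers = np.linspace(0.0, 1.0, 10)   # RBF centres over a normalized state-action input
    width = 0.1
    w = np.zeros(10)                      # RBF weights, i.e. the stored Q-value parameters

    def phi(x):
        return np.exp(-((x - centers) ** 2) / (2 * width ** 2))

    def q(x):
        return w @ phi(x)

    alpha, gamma = 0.1, 0.95
    x, x_next, reward = 0.3, 0.5, -1.0              # one observed transition (made up)
    td_error = reward + gamma * q(x_next) - q(x)    # temporal-difference error
    w = w + alpha * td_error * phi(x)               # gradient-style weight update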
Abstract:
Over-sampling sigma-delta analogue-to-digital converters (ADCs) are one of the key building blocks of state-of-the-art wireless transceivers. In sigma-delta modulator design, the scaling coefficients determine the overall signal-to-noise ratio, so selecting the optimum coefficient values is very important. To this end, this paper addresses the design of a fourth-order multi-bit sigma-delta modulator for a Wireless Local Area Network (WLAN) receiver with a feed-forward path, in which the optimum coefficients are selected using a genetic algorithm (GA)-based search method. In particular, the proposed converter makes use of a low-distortion swing-suppression SDM architecture, which is highly suitable for low oversampling ratios, to attain high linearity over a wide bandwidth. The focus of this paper is the identification of the best coefficients for the proposed topology, as well as the optimization of a set of system parameters, in order to achieve the desired signal-to-noise ratio. The GA-based search engine is a stochastic search method that can find the optimum solution within the given constraints.
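The GA-based search can be pictured as below: a population of candidate coefficient sets is scored, the fitter half is kept, and mutated copies form the next generation. This is a stripped-down sketch (selection and mutation only), and the fitness function is a placeholder; in the paper it would be the simulated signal-to-noise ratio of the modulator.

    # Hedged sketch: a simple genetic search over four scaling coefficients (placeholder fitness).
    import numpy as np

    rng = np.random.default_rng(4)

    def fitness(coeffs):
        return -np.sum((coeffs - 0.5) ** 2)            # placeholder for the simulated SNR

    pop = rng.uniform(0.0, 1.0, size=(50, 4))          # 50 candidate sets of 4 coefficients
    for generation in range(100):
        scores = np.array([fitness(c) for c in pop])
        parents = pop[np.argsort(scores)[-25:]]        # keep the fitter half
        children = parents[rng.integers(0, 25, size=50)]
        pop = np.clip(children + 0.02 * rng.normal(size=(50, 4)), 0.0, 1.0)  # mutate
    best = pop[np.argmax([fitness(c) for c in pop])]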
Abstract:
The aim of this project is to identify which concepts of health, disease, epidemiology and risk are applicable to companies in the oil and natural gas extraction sector in Colombia. Given the low predictive power of traditional financial analyses and their inadequacy for investment and long-term decision making, in addition to their failure to consider variables such as risk and future expectations, there is a need to consider different perspectives and integrative models. This consideration is pertinent to the oil and natural gas extraction sector because of the growing foreign investment it has attracted: US$2,862 million in 2010, more than ten times its value in 2003. Multidimensional models could thus be developed, based on concepts of financial health, epidemiology and statistics. The term "health" and its adoption in the business sector are useful and conceptually coherent, revealing the presence of different interacting and interconnected subsystems or factors. It should also be mentioned that a multidimensional (multi-stage) model must take risk into account, and epidemiological analysis has proven useful for determining risk and integrating it into the system together with other concepts, such as the risk ratio and relative risk. This will be analyzed through a theoretical-conceptual study, which complements a previous study, in order to contribute to the corporate finance project of the research line in Management.
Abstract:
Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
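The seed-driven recursive cycle can be summarized as in the sketch below; the function names and stopping rule are illustrative assumptions, not GenSeed's actual interface (GenSeed itself is a Perl program).

    # Hedged sketch of a seed-driven recursive assembly loop (illustrative, not GenSeed's API).
    def seed_driven_assembly(seed, reads, similarity_search, assemble, max_rounds=20):
        contig = seed
        for _ in range(max_rounds):
            hits = similarity_search(contig, reads)   # select reads overlapping the current contig
            if not hits:
                break
            new_contig = assemble(hits + [contig])    # assemble selected reads with the contig
            if len(new_contig) <= len(contig):        # stop once the seed no longer extends
                break
            contig = new_contig
        return contig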
Abstract:
We study stochastic billiards on general tables: a particle moves with constant velocity inside some domain D ⊂ R^d until it hits the boundary and bounces randomly back inside, according to some reflection law. We assume that the boundary of the domain is locally Lipschitz and almost everywhere continuously differentiable. The angle of the outgoing velocity with the inner normal vector has a specified, absolutely continuous density. We construct the discrete-time and the continuous-time processes recording the sequence of hitting points on the boundary and the pair location/velocity. We mainly focus on the case of bounded domains. We then prove exponential ergodicity of these two Markov processes, and study their invariant distributions and their normal (Gaussian) fluctuations. Of particular interest is the case of the cosine reflection law: the stationary distributions for the two processes are uniform in this case, and the discrete-time chain is reversible though the continuous-time process is quasi-reversible. Also in this case, we give a natural construction of a chord "picked at random" in D, and we study the angle of intersection of the process with a (d-1)-dimensional manifold contained in D.
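For concreteness, one standard way to write the cosine reflection law in the planar case (d = 2), under a normalization assumed here since the abstract does not spell it out, is as a density on the outgoing angle θ measured from the inner normal:

    \[
      \gamma(\theta) \;=\; \tfrac{1}{2}\cos\theta,
      \qquad \theta \in \left(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right),
    \]

under which, as stated above, the stationary distribution of the boundary chain is uniform.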
Abstract:
We define personalisation as the set of capabilities that enables a user or an organisation to customise their working environment to suit their specific needs, preferences and circumstances. In the context of service discovery on the Grid, the demand for personalisation comes from individual users who want their preferences to be taken into account during the search for and selection of suitable services. These preferences can express, for example, the reliability of a service, the quality of its results, its functionality, and so on. In this paper, we identify the problems related to personalising service discovery and present our solution: a personalised service registry, or View. We describe scenarios in which personalised service discovery would be useful and explain how our technology realises them.