971 resultados para Sequential Monte Carlo methods
Resumo:
This Thesis describes the application of automatic learning methods for a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces(PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds. NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs) taking as input the difference between the 1H NMR spectra of the products and the reactants. The development of such a representation can be applied in automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, evaluation of the stability of chemicals, or even the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases was able to correctly classify 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave origin to similar results. The use of supervised learning techniques allowed an improvement in the results. They were improved to 77% of correct assignments when an ensemble of ten FFNNs were used and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data was combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions. Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications ranging from the computer validation of classification systems, genome-scale reconstruction (or comparison) of metabolic pathways, to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by the EC numbers that are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and for the automatic assignment of EC numbers to reactions still not officially classified. In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by the MOLMAP descriptors, and was submitted to Kohonen SOMs to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass and full EC number level,respectively. Experiments for the classification of reactions from the main reactants and products were performed with RFs - EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively. In the course of the experiments with metabolic reactions we suggested that the MOLMAP / SOM concept could be extended to the representation of other levels of metabolic information such as metabolic pathways. Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway - a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) allowed to map and perceive chemical similarities between metabolic pathways even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers. Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NNs-PES and from the LJ function. The results indicated that for LJ-type potentials, NNs can be trained to generate accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. A remarkable ability of the NNs models to interpolate between distant curves and accurately reproduce potentials to be used in molecular simulations is shown. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task. The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover well the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform equally well or even better than analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au (111) interface which is the case studied in the present Thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
Resumo:
Myocardial perfusion gated-single photon emission computed tomography (gated-SPECT) imaging is used for the combined evaluation of myocardial perfusion and left ventricular (LV) function. The aim of this study is to analyze the influence of counts/pixel and concomitantly the total counts in the myocardium for the calculation of myocardial functional parameters. Material and methods: Gated-SPECT studies were performed using a Monte Carlo GATE simulation package and the NCAT phantom. The simulations of these studies use the radiopharmaceutical 99mTc-labeled tracers (250, 350, 450 and 680MBq) for standard patient types, effectively corresponding to the following activities of myocardium: 3, 4.2, 5.4-8.2MBq. All studies were simulated using 15 and 30s/projection. The simulated data were reconstructed and processed by quantitative-gated-SPECT software, and the analysis of functional parameters in gated-SPECT images was done by using Bland-Altman test and Mann-Whitney-Wilcoxon test. Results: In studies simulated using different times (15 and 30s/projection), it was noted that for the activities for full body: 250 and 350MBq, there were statistically significant differences in parameters Motility and Thickness. For the left ventricular ejection fraction (LVEF), end-systolic volume (ESV) it was only for 250MBq, and 350MBq in the end-diastolic volume (EDV), while the simulated studies with 450 and 680MBq showed no statistically significant differences for global functional parameters: LVEF, EDV and ESV. Conclusion: The number of counts/pixel and, concomitantly, the total counts per simulation do not significantly interfere with the determination of gated-SPECT functional parameters, when using the administered average activity of 450MBq, corresponding to the 5.4MBq of the myocardium, for standard patient types.
Resumo:
Aim: Optimise a set of exposure factors, with the lowest effective dose, to delineate spinal curvature with the modified Cobb method in a full spine using computed radiography (CR) for a 5-year-old paediatric anthropomorphic phantom. Methods: Images were acquired by varying a set of parameters: positions (antero-posterior (AP), posteroanterior (PA) and lateral), kilo-voltage peak (kVp) (66-90), source-to-image distance (SID) (150 to 200cm), broad focus and the use of a grid (grid in/out) to analyse the impact on E and image quality (IQ). IQ was analysed applying two approaches: objective [contrast-to-noise-ratio/(CNR] and perceptual, using 5 observers. Monte-Carlo modelling was used for dose estimation. Cohen’s Kappa coefficient was used to calculate inter-observer-variability. The angle was measured using Cobb’s method on lateral projections under different imaging conditions. Results: PA promoted the lowest effective dose (0.013 mSv) compared to AP (0.048 mSv) and lateral (0.025 mSv). The exposure parameters that allowed lower dose were 200cm SID, 90 kVp, broad focus and grid out for paediatrics using an Agfa CR system. Thirty-seven images were assessed for IQ and thirty-two were classified adequate. Cobb angle measurements varied between 16°±2.9 and 19.9°±0.9. Conclusion: Cobb angle measurements can be performed using the lowest dose with a low contrast-tonoise ratio. The variation on measurements for this was ±2.9° and this is within the range of acceptable clinical error without impact on clinical diagnosis. Further work is recommended on improvement to the sample size and a more robust perceptual IQ assessment protocol for observers.
Resumo:
Dissertação para a obtenção do grau de Mestre em Engenharia Electrotécnica Ramo de Energia
Resumo:
A Thesis submitted for the co-tutelle degree of Doctor in Physics at Universidade Nova de Lisboa and Université Pierre et Marie Curie
Resumo:
Old timber structures may show significant variation in the cross section geometry along the same element, as a result of both construction methods and deterioration. As consequence, the definition of the geometric parameters in situ may be both time consuming and costly. This work presents the results of inspections carried out in different timber structures. Based on the obtained results, different simplified geometric models are proposed in order to efficiently model the geometry variations found. Probabilistic modelling techniques are also used to define safety parameters of existing timber structures, when subjected to dead and live loads, namely self-weight and wind actions. The parameters of the models have been defined as probabilistic variables, and safety of a selected case study was assessed using the Monte Carlo simulation technique. Assuming a target reliability index, a model was defined for both the residual cross section and the time dependent deterioration evolution. As a consequence, it was possible to compute probabilities of failure and reliability indices, as well as, time evolution deterioration curves for this structure. The results obtained provide a proposal for definition of the cross section geometric parameters of existing timber structures with different levels of decay, using a simplified probabilistic geometry model and considering a remaining capacity factor for the decayed areas. This model can be used for assessing the safety of the structure at present and for predicting future performance.
Resumo:
Assessing the safety of existing timber structures is of paramount importance for taking reliable decisions on repair actions and their extent. The results obtained through semi-probabilistic methods are unrealistic, as the partial safety factors present in codes are calibrated considering the uncertainty present in new structures. In order to overcome these limitations, and also to include the effects of decay in the safety analysis, probabilistic methods, based on Monte-Carlo simulation are applied here to assess the safety of existing timber structures. In particular, the impact of decay on structural safety is analyzed and discussed, using a simple structural model, similar to that used for current semi-probabilistic analysis.
Resumo:
INTRODUCTION: Malaria is a serious problem in the Brazilian Amazon region, and the detection of possible risk factors could be of great interest for public health authorities. The objective of this article was to investigate the association between environmental variables and the yearly registers of malaria in the Amazon region using Bayesian spatiotemporal methods. METHODS: We used Poisson spatiotemporal regression models to analyze the Brazilian Amazon forest malaria count for the period from 1999 to 2008. In this study, we included some covariates that could be important in the yearly prediction of malaria, such as deforestation rate. We obtained the inferences using a Bayesian approach and Markov Chain Monte Carlo (MCMC) methods to simulate samples for the joint posterior distribution of interest. The discrimination of different models was also discussed. RESULTS: The model proposed here suggests that deforestation rate, the number of inhabitants per km², and the human development index (HDI) are important in the prediction of malaria cases. CONCLUSIONS: It is possible to conclude that human development, population growth, deforestation, and their associated ecological alterations are conducive to increasing malaria risk. We conclude that the use of Poisson regression models that capture the spatial and temporal effects under the Bayesian paradigm is a good strategy for modeling malaria counts.
Resumo:
ABSTRACTINTRODUCTION: Monte Carlo simulations have been used for selecting optimal antibiotic regimens for treatment of bacterial infections. The aim of this study was to assess the pharmacokinetic and pharmacodynamic target attainment of intravenous β-lactam regimens commonly used to treat bloodstream infections (BSIs) caused by Gram-negative rod-shaped organisms in a Brazilian teaching hospital.METHODS: In total, 5,000 patients were included in the Monte Carlo simulations of distinct antimicrobial regimens to estimate the likelihood of achieving free drug concentrations above the minimum inhibitory concentration (MIC; fT > MIC) for the requisite periods to clear distinct target organisms. Microbiological data were obtained from blood culture isolates harvested in our hospital from 2008 to 2010.RESULTS: In total, 614 bacterial isolates, including Escherichia coli, Enterobacterspp., Klebsiella pneumoniae, Acinetobacter baumannii, and Pseudomonas aeruginosa, were analyzed Piperacillin/tazobactam failed to achieve a cumulative fraction of response (CFR) > 90% for any of the isolates. While standard dosing (short infusion) of β-lactams achieved target attainment for BSIs caused by E. coliand Enterobacterspp., pharmacodynamic target attainment against K. pneumoniaeisolates was only achieved with ceftazidime and meropenem (prolonged infusion). Lastly, only prolonged infusion of high-dose meropenem approached an ideal CFR against P. aeruginosa; however, no antimicrobial regimen achieved an ideal CFR against A. baumannii.CONCLUSIONS:These data reinforce the use of prolonged infusions of high-dose β-lactam antimicrobials as a reasonable strategy for the treatment of BSIs caused by multidrug resistant Gram-negative bacteria in Brazil.
Resumo:
Given a model that can be simulated, conditional moments at a trial parameter value can be calculated with high accuracy by applying kernel smoothing methods to a long simulation. With such conditional moments in hand, standard method of moments techniques can be used to estimate the parameter. Since conditional moments are calculated using kernel smoothing rather than simple averaging, it is not necessary that the model be simulable subject to the conditioning information that is used to define the moment conditions. For this reason, the proposed estimator is applicable to general dynamic latent variable models. Monte Carlo results show that the estimator performs well in comparison to other estimators that have been proposed for estimation of general DLV models.
Resumo:
Background: Over the last two decades, mortality from coronary heart disease (CHD) and cerebrovascular disease (CVD) declined by about 30% in the European Union (EU). Design: We analyzed trends in CHD (X ICD codes: I20-I25) and CVD (X ICD codes: I60-I69) mortality in young adults (age 35-44 years) in the EU as a whole and in 12 selected European countries, over the period 1980-2007. Methods: Data were derived from the World Health Organization mortality database. With joinpoint regression analysis, we identified significant changes in trends and estimated average annual percent changes (AAPC). Results: CHD mortality rates at ages 35-44 years have decreased in both sexes since the 1980s for most countries, except for Russia (130/100,000 men and 24/100,000 women, in 2005-7). The lowest rates (around 9/100,000 men, 2/100,000 women) were in France, Italy and Sweden. In men, the steepest declines in mortality were in the Czech Republic (AAPC = -6.1%), the Netherlands (-5.2%), Poland (-4.5%), and England and Wales (-4.5%). Patterns were similar in women, though with appreciably lower rates. The AAPC in the EU was -3.3% for men (rate = 16.6/100,000 in 2005-7) and -2.1% for women (rate = 3.5/100,000). For CVD, Russian rates in 2005-7 were 40/100,000 men and 16/100,000 women, 5 to 10-fold higher than in most western European countries. The steepest declines were in the Czech Republic and Italy for men, in Sweden and the Czech Republic for women. The AAPC in the EU was -2.5% in both sexes, with steeper declines after the mid-late 1990s (rates = 6.4/100,000 men and 4.3/100,000 women in 2005-7). Conclusions: CHD and CVD mortality steadily declined in Europe, except in Russia, whose rates were 10 to 15-fold higher than those of France, Italy or Sweden. Hungary and Poland, and also Scotland, where CHD trends were less favourable than in other western European countries, also emerge as priorities for preventive interventions.
Resumo:
There are both theoretical and empirical reasons for believing that the parameters of macroeconomic models may vary over time. However, work with time-varying parameter models has largely involved Vector autoregressions (VARs), ignoring cointegration. This is despite the fact that cointegration plays an important role in informing macroeconomists on a range of issues. In this paper we develop time varying parameter models which permit cointegration. Time-varying parameter VARs (TVP-VARs) typically use state space representations to model the evolution of parameters. In this paper, we show that it is not sensible to use straightforward extensions of TVP-VARs when allowing for cointegration. Instead we develop a specification which allows for the cointegrating space to evolve over time in a manner comparable to the random walk variation used with TVP-VARs. The properties of our approach are investigated before developing a method of posterior simulation. We use our methods in an empirical investigation involving a permanent/transitory variance decomposition for inflation.
Resumo:
This paper contributes to the on-going empirical debate regarding the role of the RBC model and in particular of technology shocks in explaining aggregate fluctuations. To this end we estimate the model’s posterior density using Markov-Chain Monte-Carlo (MCMC) methods. Within this framework we extend Ireland’s (2001, 2004) hybrid estimation approach to allow for a vector autoregressive moving average (VARMA) process to describe the movements and co-movements of the model’s errors not explained by the basic RBC model. The results of marginal likelihood ratio tests reveal that the more general model of the errors significantly improves the model’s fit relative to the VAR and AR alternatives. Moreover, despite setting the RBC model a more difficult task under the VARMA specification, our analysis, based on forecast error and spectral decompositions, suggests that the RBC model is still capable of explaining a significant fraction of the observed variation in macroeconomic aggregates in the post-war U.S. economy.
Resumo:
This paper develops methods for Stochastic Search Variable Selection (currently popular with regression and Vector Autoregressive models) for Vector Error Correction models where there are many possible restrictions on the cointegration space. We show how this allows the researcher to begin with a single unrestricted model and either do model selection or model averaging in an automatic and computationally efficient manner. We apply our methods to a large UK macroeconomic model.
Resumo:
This paper considers the instrumental variable regression model when there is uncertainty about the set of instruments, exogeneity restrictions, the validity of identifying restrictions and the set of exogenous regressors. This uncertainty can result in a huge number of models. To avoid statistical problems associated with standard model selection procedures, we develop a reversible jump Markov chain Monte Carlo algorithm that allows us to do Bayesian model averaging. The algorithm is very exible and can be easily adapted to analyze any of the di¤erent priors that have been proposed in the Bayesian instrumental variables literature. We show how to calculate the probability of any relevant restriction (e.g. the posterior probability that over-identifying restrictions hold) and discuss diagnostic checking using the posterior distribution of discrepancy vectors. We illustrate our methods in a returns-to-schooling application.