971 results for Sequential Monte Carlo methods
Abstract:
A family of scaling corrections aimed at improving the chi-square approximation of goodness-of-fit test statistics in small samples, large models, and nonnormal data was proposed in Satorra and Bentler (1994). For structural equation models, Satorra-Bentler's (SB) scaling corrections are available in standard computer software. Often, however, the interest is not in the overall fit of a model, but in a test of the restrictions that a null model, say ${\cal M}_0$, implies on a less restricted one, ${\cal M}_1$. If $T_0$ and $T_1$ denote the goodness-of-fit test statistics associated with ${\cal M}_0$ and ${\cal M}_1$, respectively, then typically the difference $T_d = T_0 - T_1$ is used as a chi-square test statistic with degrees of freedom equal to the difference in the number of independent parameters estimated under the models ${\cal M}_0$ and ${\cal M}_1$. As in the case of the goodness-of-fit test, it is of interest to scale the statistic $T_d$ in order to improve its chi-square approximation in realistic, i.e., nonasymptotic and nonnormal, applications. In a recent paper, Satorra (1999) shows that the difference between two Satorra-Bentler scaled test statistics for overall model fit does not yield the correct SB scaled difference test statistic. Satorra developed an expression that permits scaling the difference test statistic, but his formula has some practical limitations, since it requires heavy computations that are not available in standard computer software. The purpose of the present paper is to provide an easy way to compute the scaled difference chi-square statistic from the scaled goodness-of-fit test statistics of models ${\cal M}_0$ and ${\cal M}_1$. A Monte Carlo study is provided to illustrate the performance of the competing statistics.
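As an illustration of the kind of computation this abstract describes, here is a minimal Python sketch. The abstract does not state the formula, so the expression below is an assumption: it is the scaled difference statistic as commonly stated in the structural equation modeling literature, where each scaled statistic equals the unscaled one divided by its scaling correction. The function and argument names are hypothetical.

```python
def scaled_difference(t0_scaled, c0, d0, t1_scaled, c1, d1):
    """Scaled chi-square difference from two scaled fit statistics.

    t0_scaled, t1_scaled: SB-scaled statistics of M0 (restricted) and M1
    c0, c1: scaling corrections reported for each model
    d0, d1: degrees of freedom of M0 and M1, with d0 > d1
    """
    if d0 <= d1:
        raise ValueError("M0 must have more degrees of freedom than M1")
    # Recover the unscaled statistics: T = c * T_scaled
    t0 = c0 * t0_scaled
    t1 = c1 * t1_scaled
    # Scaling correction for the difference (assumed form, see lead-in)
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive scaling correction for the difference")
    # Scaled difference statistic and its degrees of freedom
    return (t0 - t1) / cd, d0 - d1


td, df = scaled_difference(80.0, 1.25, 20, 50.0, 1.2, 15)
print(td, df)
```

The check for a non-positive correction is included because nothing in this assumed form guarantees a positive value in finite samples.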
Abstract:
Any electoral system has an electoral formula that converts vote proportions into parliamentary seats. Pre-electoral polls usually focus on estimating vote proportions and then applying the electoral formula to give a forecast of the parliament's composition. We here describe the problems arising from this approach: there is always a bias in the forecast. We study the origin of the bias and some methods to evaluate and to reduce it. We propose some rules to compute the sample size required for a given forecast accuracy. We show by Monte Carlo simulation the performance of the proposed methods using data from Spanish elections in recent years. We also propose graphical methods to visualize how electoral formulae and parliamentary forecasts work (or fail).
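The bias mentioned in this abstract can be made concrete with a small Monte Carlo sketch in Python. The abstract does not name an electoral formula, so the D'Hondt rule used here is an assumption chosen for illustration, and the function names, vote shares, and sample sizes are hypothetical. Polls are simulated as simple random samples of voters.

```python
import random
from collections import Counter


def dhondt(votes, seats):
    """Allocate seats with the D'Hondt highest-averages rule."""
    alloc = [0] * len(votes)
    for _ in range(seats):
        # The next seat goes to the party with the largest quotient v / (s + 1)
        winner = max(range(len(votes)), key=lambda k: votes[k] / (alloc[k] + 1))
        alloc[winner] += 1
    return alloc


def forecast_bias(true_shares, seats, sample_size, n_polls=2000, seed=0):
    """Mean seat forecast over simulated polls minus the true allocation."""
    rng = random.Random(seed)
    parties = range(len(true_shares))
    true_alloc = dhondt(true_shares, seats)
    totals = [0] * len(true_shares)
    for _ in range(n_polls):
        # One simulated poll: a simple random sample of voters
        sample = Counter(rng.choices(parties, weights=true_shares, k=sample_size))
        poll_alloc = dhondt([sample[p] for p in parties], seats)
        totals = [t + a for t, a in zip(totals, poll_alloc)]
    return [t / n_polls - a for t, a in zip(totals, true_alloc)]


print(forecast_bias([0.40, 0.35, 0.15, 0.10], seats=8, sample_size=500))
```

Because the seat allocation is a step function of the vote shares, the average forecast generally differs from the true allocation even when the estimated shares are unbiased; rerunning with a larger `sample_size` shows how the difference changes.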
Abstract:
A national survey designed for estimating a specific population quantity is sometimes used for estimation of this quantity also for a small area, such as a province. Budget constraints do not allow a greater sample size for the small area, and so other means of improving estimation have to be devised. We investigate such methods and assess them by a Monte Carlo study. We explore how a complementary survey can be exploited in small area estimation. We use the context of the Spanish Labour Force Survey (EPA) and the Barometer in Spain for our study.
Abstract:
We study the statistical properties of three estimation methods for a model of learning that is often fitted to experimental data: quadratic deviation measures without unobserved heterogeneity, and maximum likelihood with and without unobserved heterogeneity. After discussing identification issues, we show that the estimators are consistent and provide their asymptotic distribution. Using Monte Carlo simulations, we show that ignoring unobserved heterogeneity can lead to seriously biased estimates in samples which have the typical length of actual experiments. Better small sample properties are obtained if unobserved heterogeneity is introduced. That is, rather than estimating the parameters for each individual, the individual parameters are considered random variables, and the distribution of those random variables is estimated.
Abstract:
This paper investigates the comparative performance of five small area estimators. We use Monte Carlo simulation in the context of both theoretical and empirical populations. In addition to the direct and indirect estimators, we consider the optimal composite estimator with population weights, and two composite estimators with estimated weights: one that assumes homogeneity of within-area variance and squared bias, and another one that uses area-specific estimates of variance and squared bias. It is found that among the feasible estimators, the best choice is the one that uses area-specific estimates of variance and squared bias.
Abstract:
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
Abstract:
The efficient use of geothermal systems, the sequestration of CO2 to mitigate climate change, and the prevention of seawater intrusion in coastal aquifers are only some examples that demonstrate the need for novel technologies to monitor subsurface processes from the surface. A main challenge is to ensure optimal performance of such technologies at different temporal and spatial scales. Plane-wave electromagnetic (EM) methods are sensitive to subsurface electrical conductivity and consequently to fluid conductivity, fracture connectivity, temperature, and rock mineralogy. These methods have governing equations that are the same over a large range of frequencies, which allows processes to be studied in an analogous manner on scales ranging from a few meters below the surface down to several hundreds of kilometers depth. Unfortunately, they suffer from a significant resolution loss with depth due to the diffusive nature of the electromagnetic fields. Therefore, estimations of subsurface models that use these methods should incorporate a priori information to better constrain the models, and provide appropriate measures of model uncertainty. During my thesis, I have developed approaches to improve the static and dynamic characterization of the subsurface with plane-wave EM methods. 
In the first part of this thesis, I present a two-dimensional deterministic approach to perform time-lapse inversion of plane-wave EM data. The strategy is based on the incorporation of prior information into the inversion algorithm regarding the expected temporal changes in electrical conductivity. This is done by incorporating a flexible stochastic regularization and constraints regarding the expected ranges of the changes by using Lagrange multipliers. I use non-l2 norms to penalize the model update in order to obtain sharp transitions between regions that experience temporal changes and regions that do not. I also incorporate a time-lapse differencing strategy to remove systematic errors in the time-lapse inversion. This work presents improvements in the characterization of temporal changes with respect to the classical approach of performing separate inversions and computing differences between the models. In the second part of this thesis, I adopt a Bayesian framework and use Markov chain Monte Carlo (MCMC) simulations to quantify model parameter uncertainty in plane-wave EM inversion. For this purpose, I present a two-dimensional pixel-based probabilistic inversion strategy for separate and joint inversions of plane-wave EM and electrical resistivity tomography (ERT) data. I compare the uncertainties of the model parameters when considering different types of prior information on the model structure and different likelihood functions to describe the data errors. The results indicate that model regularization is necessary when dealing with a large number of model parameters because it helps to accelerate the convergence of the chains and leads to more realistic models. These constraints also lead to smaller uncertainty estimates, which imply posterior distributions that do not include the true underlying model in regions where the method has limited sensitivity. 
This situation can be improved by combining plane-wave EM methods with complementary geophysical methods such as ERT. In addition, I show that an appropriate regularization weight and the standard deviation of the data errors can be retrieved by the MCMC inversion. Finally, I evaluate the possibility of characterizing the three-dimensional distribution of an injected water plume by performing three-dimensional time-lapse MCMC inversion of plane-wave EM data. Since MCMC inversion involves a significant computational burden in high parameter dimensions, I propose a model reduction strategy where the coefficients of a Legendre moment decomposition of the injected water plume and its location are estimated. For this purpose, a base resistivity model is needed, which is obtained prior to the time-lapse experiment. A synthetic test shows that the methodology works well when the base resistivity model is correctly characterized. The methodology is also applied to an injection experiment performed in a geothermal system in Australia, and compared to a three-dimensional time-lapse inversion performed within a deterministic framework. The MCMC inversion better constrains the water plumes due to the larger amount of prior information that is included in the algorithm. However, the conductivity changes needed to explain the time-lapse data are much larger than what is physically possible based on present-day understanding. This issue may be related to the base resistivity model used, indicating that more effort should be devoted to obtaining high-quality base models prior to dynamic experiments. The studies described herein give clear evidence that plane-wave EM methods are useful to characterize and monitor the subsurface at a wide range of scales. The presented approaches contribute to an improved appraisal of the obtained models, both in terms of the incorporation of prior information in the algorithms and the posterior uncertainty quantification. 
In addition, the developed strategies can be applied to other geophysical methods, and offer great flexibility to incorporate additional information when available.
Abstract:
The activity of radiopharmaceuticals in nuclear medicine is measured before patient injection with radionuclide calibrators. In Switzerland, the general requirements for quality controls are defined in a federal ordinance and a directive of the Federal Office of Metrology (METAS) which require each instrument to be verified. A set of three gamma sources (Co-57, Cs-137 and Co-60) is used to verify the response of radionuclide calibrators in the gamma energy range of their use. A beta source, a mixture of Sr-90 and Y-90 in secular equilibrium, is used as well. Manufacturers are responsible for the calibration factors. The main goal of the study was to monitor the validity of the calibration factors by using two sources: a Sr-90/Y-90 source and an F-18 source. The three types of commercial radionuclide calibrators tested do not have a calibration factor for the mixture but only for Y-90. Activity measurements of a Sr-90/Y-90 source with the Y-90 calibration factor are performed in order to correct for the extra contribution of Sr-90. The value of the correction factor was found to be 1.113 whereas Monte Carlo simulations of the radionuclide calibrators estimate the correction factor to be 1.117. Measurements with F-18 sources in a specific geometry are also performed. Since this radionuclide is widely used in Swiss hospitals equipped with PET and PET-CT, the metrology of F-18 is very important. The F-18 response normalized to the Cs-137 response shows that the difference with a reference value does not exceed 3% for the three types of radionuclide calibrators.
Abstract:
In this paper we analyse, using Monte Carlo simulation, the possible consequences of incorrect assumptions about the true structure of the random effects covariance matrix and the true correlation pattern of residuals on the performance of an estimation method for nonlinear mixed models. The procedure under study is the well-known linearization method due to Lindstrom and Bates (1990), implemented in the nlme library of S-Plus and R. Its performance is studied in terms of bias, mean square error (MSE), and true coverage of the associated asymptotic confidence intervals. Setting aside other criteria, such as the convenience of avoiding overparameterised models, it seems worse to erroneously assume some structure than to assume no structure when a structure would be adequate.
Abstract:
PURPOSE: To assess how different diagnostic decision aids perform in terms of sensitivity, specificity, and harm. METHODS: Four diagnostic decision aids were compared, as applied to a simulated patient population: a findings-based algorithm following a linear or branched pathway, a serial threshold-based strategy, and a parallel threshold-based strategy. Headache in immune-compromised HIV patients in a developing country was used as an example. Diagnoses included cryptococcal meningitis, cerebral toxoplasmosis, tuberculous meningitis, bacterial meningitis, and malaria. Data were derived from literature and expert opinion. Diagnostic strategies' validity was assessed in terms of sensitivity, specificity, and harm related to mortality and morbidity. Sensitivity analyses and Monte Carlo simulation were performed. RESULTS: The parallel threshold-based approach led to a sensitivity of 92% and a specificity of 65%. Sensitivities of the serial threshold-based approach and the branched and linear algorithms were 47%, 47%, and 74%, respectively, and the specificities were 85%, 95%, and 96%. The parallel threshold-based approach resulted in the least harm, with the serial threshold-based approach, the branched algorithm, and the linear algorithm being associated with 1.56-, 1.44-, and 1.17-times higher harm, respectively. Findings were corroborated by sensitivity and Monte Carlo analyses. CONCLUSION: A threshold-based diagnostic approach is designed to find the optimal trade-off that minimizes expected harm, enhancing sensitivity and lowering specificity when appropriate, as in the given example of a symptom pointing to several life-threatening diseases. Findings-based algorithms, in contrast, solely consider clinical observations. A parallel workup, as opposed to a serial workup, additionally allows for all potential diseases to be reviewed, further reducing false negatives. The parallel threshold-based approach might, however, not be as good in other disease settings.
Abstract:
Elastic scattering of relativistic electrons and positrons by atoms is considered in the framework of the static field approximation. The scattering field is expressed as a sum of Yukawa terms to allow the use of various approximations. Accurate phase shifts have been computed by combining Bühring's power-series method with the WKB and Born approximations. This combined procedure allows the evaluation of differential cross sections for kinetic energies up to several tens of MeV. Numerical results are used to analyze the validity of several approximate methods, namely the first- and second-order Born approximations and the screened Mott formula, which are frequently adopted as the basis of multiple scattering theories and Monte Carlo simulations of electron and positron transport.
Abstract:
PURPOSE: In the radiopharmaceutical therapy approach to the fight against cancer, in particular when it comes to translating laboratory results to the clinical setting, modeling has served as an invaluable tool for guidance and for understanding the processes operating at the cellular level and how these relate to macroscopic observables. Tumor control probability (TCP) is the dosimetric end point quantity of choice which relates to experimental and clinical data: it requires knowledge of individual cellular absorbed doses since it depends on the assessment of the treatment's ability to kill each and every cell. Macroscopic tumors, seen in both clinical and experimental studies, contain too many cells to be modeled individually in Monte Carlo simulation; yet, in particular for low ratios of decays to cells, a cell-based model that does not smooth away statistical considerations associated with low activity is a necessity. The authors present here an adaptation of the simple sphere-based model from which cellular level dosimetry for macroscopic tumors and their end point quantities, such as TCP, may be extrapolated more reliably. METHODS: Ten homogeneous spheres representing tumors of different sizes were constructed in GEANT4. The radionuclide I-131 was randomly allowed to decay for each model size and for seven different ratios of number of decays to number of cells, $N_r$: 1000, 500, 200, 100, 50, 20, and 10 decays per cell. The deposited energy was collected in radial bins and divided by the bin mass to obtain the average bin absorbed dose. To simulate a cellular model, the number of cells present in each bin was calculated and an absorbed dose attributed to each cell equal to the bin average absorbed dose with a randomly determined adjustment based on a Gaussian probability distribution with a width equal to the statistical uncertainty consistent with the ratio of decays to cells, i.e., equal to $N_r^{-1/2}$. 
From dose volume histograms the surviving fraction of cells, equivalent uniform dose (EUD), and TCP for the different scenarios were calculated. Comparably sized spherical models containing individual spherical cells (15 µm diameter) in hexagonal lattices were constructed, and Monte Carlo simulations were executed for all the same previous scenarios. The dosimetric quantities were calculated and compared to the adjusted simple sphere model results. The model was then applied to the Bortezomib-induced enzyme-targeted radiotherapy (BETR) strategy of targeting Epstein-Barr virus (EBV)-expressing cancers. RESULTS: The TCP values were comparable to within 2% between the adjusted simple sphere and full cellular models. Additionally, models were generated for a nonuniform distribution of activity, and results were compared between the adjusted spherical and cellular models with similar comparability. The TCP values from the experimental macroscopic tumor results were consistent with the experimental observations for BETR-treated 1 g EBV-expressing lymphoma tumors in mice. CONCLUSIONS: The adjusted spherical model presented here provides more accurate TCP values than simple spheres, on par with full cellular Monte Carlo simulations while maintaining the simplicity of the simple sphere model. This model provides a basis for complementing and understanding laboratory and clinical results pertaining to radiopharmaceutical therapy.
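The per-cell dose adjustment described in the METHODS of this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the Gaussian width is read as a relative uncertainty (a fraction of the bin average dose), negative draws are clamped to zero, and the function name and example values are hypothetical. None of these choices is confirmed by the abstract.

```python
import random
import statistics


def sample_cell_doses(bin_dose, n_cells, decays_per_cell, rng):
    """Draw one absorbed dose per cell around the bin average dose."""
    # Assumption: the width decays_per_cell ** -0.5 is relative to the bin dose
    sigma = bin_dose * decays_per_cell ** -0.5
    # Assumption: a dose cannot be negative, so negative draws are set to zero
    return [max(0.0, rng.gauss(bin_dose, sigma)) for _ in range(n_cells)]


rng = random.Random(0)
doses = sample_cell_doses(bin_dose=2.0, n_cells=1000, decays_per_cell=100, rng=rng)
print(statistics.mean(doses), statistics.stdev(doses))
```

With 100 decays per cell the relative spread is 10%; at 10 decays per cell it grows to about 32%, which is the regime where the abstract argues that cell-level statistics matter.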
Abstract:
BACKGROUND: Anal condylomata acuminata (ACA) are caused by human papilloma virus (HPV) infection which is transmitted by close physical and sexual contact. Surgical treatment of ACA has an overall success rate of 71% to 93%, with a recurrence rate between 4% and 29%. The aim of this study was to assess a possible association between HPV type and ACA recurrence after surgical treatment. METHODS: We performed a retrospective analysis of 140 consecutive patients who underwent surgery for ACA from January 1990 to December 2005 at our tertiary University Hospital. We confirmed ACA by histopathological analysis and determined the HPV type using the polymerase chain reaction. Patients gave consent for HPV testing and completed a questionnaire. We looked at the association of ACA, HPV typing, and HIV disease. We used chi-square, Monte Carlo simulation, and Wilcoxon tests for statistical analysis. RESULTS: Among the 140 patients (123 M/17 F), HPV 6 and 11 were the most frequently encountered viruses (51% and 28%, respectively). Recurrence occurred in 35 (25%) patients. HPV 11 was present in 19 (41%) of these recurrences, which is statistically significant, when compared with other HPVs. There was no significant difference between recurrence rates in the 33 (24%) HIV-positive and the HIV-negative patients. CONCLUSIONS: HPV 11 is associated with higher recurrence rate of ACA. This makes routine clinical HPV typing questionable. Follow-up is required to identify recurrence and to treat it early, especially if HPV 11 has been identified.
Abstract:
A number of geophysical methods, such as ground-penetrating radar (GPR), have the potential to provide valuable information on hydrological properties in the unsaturated zone. In particular, the stochastic inversion of such data within a coupled geophysical-hydrological framework may allow for the effective estimation of vadose zone hydraulic parameters and their corresponding uncertainties. A critical issue in stochastic inversion is choosing prior parameter probability distributions from which potential model configurations are drawn and tested against observed data. A well chosen prior should reflect as honestly as possible the initial state of knowledge regarding the parameters and be neither overly specific nor too conservative. In a Bayesian context, combining the prior with available data yields a posterior state of knowledge about the parameters, which can then be used statistically for predictions and risk assessment. Here we investigate the influence of prior information regarding the van Genuchten-Mualem (VGM) parameters, which describe vadose zone hydraulic properties, on the stochastic inversion of crosshole GPR data collected under steady state, natural-loading conditions. We do this using a Bayesian Markov chain Monte Carlo (MCMC) inversion approach, considering first noninformative uniform prior distributions and then more informative priors derived from soil property databases. For the informative priors, we further explore the effect of including information regarding parameter correlation. Analysis of both synthetic and field data indicates that the geophysical data alone contain valuable information regarding the VGM parameters. However, significantly better results are obtained when we combine these data with a realistic, informative prior.
Abstract:
The present work focuses on the skew-symmetry index as a measure of social reciprocity. This index is based on the correspondence between the amount of behaviour that each individual addresses to its partners and what it receives from them in return. Although the skew-symmetry index enables researchers to describe social groups, statistical inferential tests are required. The main aim of the present study is to propose an overall statistical technique for testing symmetry in experimental conditions, calculating the skew-symmetry statistic (Φ) at group level. Sampling distributions for the skew-symmetry statistic have been estimated by means of a Monte Carlo simulation in order to allow researchers to make statistical decisions. Furthermore, this study will allow researchers to choose the optimal experimental conditions for carrying out their research, as the power of the statistical test has been estimated. This statistical test could be used in experimental social psychology studies in which researchers may control the group size and the number of interactions within dyads.
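A rough Python sketch of how a sampling distribution for Φ might be simulated is given below. The abstract does not define the statistic, so the definition used here is an assumption: Φ is taken as the sum of squared entries of the skew-symmetric part K = (X − X')/2 of the sociomatrix X, divided by the sum of squared entries of X, which lies between 0 (full reciprocity) and 0.5. The null model (each act within a dyad is equally likely to go either way) and all names are likewise hypothetical.

```python
import random


def skew_symmetry_index(x):
    """Phi = sum(k_ij^2) / sum(x_ij^2) over off-diagonal cells (assumed definition)."""
    n = len(x)
    cells = [(i, j) for i in range(n) for j in range(n) if i != j]
    total = sum(x[i][j] ** 2 for i, j in cells)
    skew = sum(((x[i][j] - x[j][i]) / 2) ** 2 for i, j in cells)
    return skew / total


def null_distribution(n_individuals, acts_per_dyad, n_sims=1000, seed=0):
    """Simulated Phi values under the assumed null of complete reciprocity."""
    rng = random.Random(seed)
    n = n_individuals
    values = []
    for _ in range(n_sims):
        x = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                # Each act in the dyad is addressed by i or by j with probability 0.5
                from_i = sum(rng.random() < 0.5 for _ in range(acts_per_dyad))
                x[i][j] = from_i
                x[j][i] = acts_per_dyad - from_i
        values.append(skew_symmetry_index(x))
    return sorted(values)


null = null_distribution(n_individuals=5, acts_per_dyad=10)
observed = 0.12  # hypothetical observed value of Phi
p_value = sum(v >= observed for v in null) / len(null)
print(p_value)
```

Group size and the number of acts per dyad are the two inputs here because the abstract names them as the conditions a researcher can control; `acts_per_dyad` must be at least 1 so that the denominator is nonzero.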