12 results for maximized Monte Carlo test

in Helda - Digital Repository of the University of Helsinki


Relevance:

100.00%

Publisher:

Abstract:

Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of the biological sciences such as evolution and cell functioning. The field of genetics is currently developing rapidly because of recent advances in technologies for obtaining molecular data from living organisms. To extract as much information as possible from such data, the analyses need to be carried out using statistical models tailored to the particular genetic processes involved. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on modeling the unobserved recent ancestry of the sampled individuals (say, over tens of generations), which is carried out through explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. Over such a recent history, recombination is the major genetic force shaping the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between adjacent markers are known. The posterior distribution of the unobserved history of the individuals, conditional on the observed marker data, is studied using a Markov chain Monte Carlo (MCMC) algorithm. The example analyses consider estimation of the population structure, the relatedness structure (both at the level of whole genomes and at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis extends the model for the recent genetic history to situations where a quantitative phenotype has also been measured from the contemporary individuals. In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis extends a widely used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together and the allele frequencies of each pool are determined in a single genotyping.
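
As a toy illustration of the general inference strategy described above (sampling an unobserved quantity conditional on observed marker data with MCMC), the sketch below runs a Metropolis random-walk sampler for a single latent allele frequency under a Beta-Binomial model. The model, the function names, and the example counts are assumptions made for the illustration, not the pedigree model of the thesis.

```python
# Minimal Metropolis-Hastings sketch (hypothetical, not the thesis's pedigree model):
# sample the posterior of an unobserved allele frequency p given observed marker counts,
# illustrating the "condition on observed marker data via MCMC" idea in miniature.
import math
import random

def log_posterior(p, k, n, alpha=1.0, beta=1.0):
    """Log of Beta(alpha, beta) prior times Binomial(n, p) likelihood (up to a constant)."""
    if not 0.0 < p < 1.0:
        return float("-inf")
    return (alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p) \
        + k * math.log(p) + (n - k) * math.log(1 - p)

def metropolis(k, n, n_iter=20000, step=0.05, seed=1):
    rng = random.Random(seed)
    p, samples = 0.5, []
    for _ in range(n_iter):
        proposal = p + rng.gauss(0.0, step)                      # random-walk proposal
        log_accept = log_posterior(proposal, k, n) - log_posterior(p, k, n)
        if log_accept >= 0 or rng.random() < math.exp(log_accept):   # Metropolis rule
            p = proposal
        samples.append(p)
    return samples

if __name__ == "__main__":
    draws = metropolis(k=37, n=100)       # 37 copies of an allele among 100 sampled chromosomes
    burn = draws[5000:]                   # discard burn-in
    print("posterior mean of allele frequency:", sum(burn) / len(burn))
```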

Relevance:

100.00%

Publisher:

Abstract:

A better understanding of the limiting step in a first-order phase transition, the nucleation process, is of major importance to a variety of scientific fields ranging from atmospheric sciences to nanotechnology and even to cosmology. This is because in most phase transitions the new phase is separated from the mother phase by a free energy barrier, which is crossed in a process called nucleation. Nowadays a significant fraction of all atmospheric particles is considered to be produced by vapour-to-liquid nucleation. In atmospheric sciences, as in other fields, the theoretical treatment of nucleation is mostly based on the Classical Nucleation Theory. However, the Classical Nucleation Theory is known to have only limited success in predicting the rate at which vapour-to-liquid nucleation takes place under given conditions. This thesis studies unary homogeneous vapour-to-liquid nucleation from a statistical mechanics viewpoint. We apply Monte Carlo simulations of molecular clusters to calculate the free energy barrier separating the vapour and liquid phases and compare our results against laboratory measurements and Classical Nucleation Theory predictions. According to our results, the work of adding a monomer to a cluster in equilibrium vapour is accurately described by the liquid drop model applied by the Classical Nucleation Theory once the clusters are larger than some threshold size. The threshold cluster sizes range from a few to some tens of molecules, depending on the interaction potential and temperature. However, the error made in modelling the smallest clusters as liquid drops results in an erroneous absolute value for the cluster work of formation throughout the size range, as predicted by the McGraw-Laaksonen scaling law. By calculating correction factors to the Classical Nucleation Theory predictions for the nucleation barriers of argon and water, we show that the corrected predictions produce nucleation rates that are in good agreement with experiments. For the smallest clusters, the deviation between the simulation results and the liquid drop values is accurately modelled by the low-order virial coefficients at modest temperatures and vapour densities, in other words, in the validity range of the non-interacting cluster theory of Frenkel, Band and Bijl. Our results do not indicate a need for a size-dependent replacement free energy correction. The results also indicate that the Classical Nucleation Theory predicts the size of the critical cluster correctly. We also present a new method for calculating the equilibrium vapour density, the size dependence of the surface tension, and the planar surface tension directly from cluster simulations, and we show that the size dependence of the cluster surface tension at the equimolar surface is a function of the virial coefficients, a result confirmed by our cluster simulations.
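
To make the liquid-drop picture concrete, the sketch below evaluates the Classical Nucleation Theory work of formation W(n) = -n*kT*ln(S) + theta*n^(2/3) over integer cluster sizes and locates the barrier top (the critical cluster). The surface-energy coefficient theta, the supersaturation S, and the temperature are made-up illustration values, not parameters or results from the thesis.

```python
# Illustrative sketch of the Classical Nucleation Theory liquid-drop work of formation,
# W(n) = -n*kT*ln(S) + theta*n**(2/3), and the resulting barrier height and critical size.
# The numbers below (theta, S, T) are invented for illustration only.
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def work_of_formation(n, supersaturation, theta_j, temperature):
    """Liquid-drop work of forming an n-molecule cluster (in joules)."""
    bulk = -n * K_B * temperature * math.log(supersaturation)   # volume (bulk) term
    surface = theta_j * n ** (2.0 / 3.0)                         # surface term, ~ cluster area
    return bulk + surface

def cnt_barrier(supersaturation, theta_j, temperature, n_max=2000):
    """Scan cluster sizes and return (critical size, barrier height in units of kT)."""
    kt = K_B * temperature
    works = [(n, work_of_formation(n, supersaturation, theta_j, temperature))
             for n in range(1, n_max)]
    n_star, w_star = max(works, key=lambda nw: nw[1])            # barrier top
    return n_star, w_star / kt

if __name__ == "__main__":
    n_star, barrier_kt = cnt_barrier(supersaturation=3.0, theta_j=2.0e-20, temperature=240.0)
    print(f"critical cluster size ~ {n_star} molecules, barrier ~ {barrier_kt:.1f} kT")
```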

Relevance:

100.00%

Publisher:

Abstract:

This paper is concerned with using the bootstrap to obtain improved critical values for the error correction model (ECM) cointegration test in dynamic models. We investigate the effects of dynamic specification on the size and power of the ECM cointegration test with bootstrap critical values. The results from a Monte Carlo study show that the size of the bootstrap ECM cointegration test is close to the nominal significance level. We find that overspecifying the lag length results in a loss of power, while underspecifying it results in size distortion; the performance of the bootstrap ECM cointegration test thus deteriorates if the correct lag length is not used in the ECM, and the test is therefore not robust to model misspecification.
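
As a hedged illustration of how bootstrap critical values for an ECM-based cointegration test can be obtained (a simplified residual-resampling scheme with one fixed lag and a known cointegrating vector, not the paper's exact procedure), the sketch below computes the t-statistic on the error-correction coefficient and resamples the ECM residuals under a no-cointegration null. The data-generating process and all function names are assumptions for the example.

```python
# Simplified bootstrap ECM cointegration test: t-statistic on the error-correction term,
# with critical values from residual resampling under a no-cointegration (random walk) null.
import numpy as np

def ecm_tstat(y, x):
    """One-lag ECM: dy_t = a + b*(y_{t-1} - x_{t-1}) + c*dy_{t-1} + d*dx_{t-1} + e_t.
    Returns the t-statistic of b and the OLS residuals."""
    dy, dx = np.diff(y), np.diff(x)
    resp = dy[1:]                                   # dy_t
    ec = (y - x)[1:-1]                              # error-correction term at t-1
    X = np.column_stack([np.ones_like(resp), ec, dy[:-1], dx[:-1]])
    beta, *_ = np.linalg.lstsq(X, resp, rcond=None)
    resid = resp - X @ beta
    sigma2 = resid @ resid / (len(resp) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1]), resid

def bootstrap_critical_value(y, x, n_boot=999, alpha=0.05, seed=0):
    """Resample ECM residuals and rebuild y as a random walk (no cointegration)."""
    rng = np.random.default_rng(seed)
    _, resid = ecm_tstat(y, x)
    stats = []
    for _ in range(n_boot):
        e = rng.choice(resid, size=len(y) - 1, replace=True)
        y_null = np.concatenate([[y[0]], y[0] + np.cumsum(e)])
        stats.append(ecm_tstat(y_null, x)[0])
    return np.quantile(stats, alpha)                # left-tail critical value

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.cumsum(rng.standard_normal(200))          # I(1) regressor
    y = x + rng.standard_normal(200)                 # cointegrated with x by construction
    t_obs, _ = ecm_tstat(y, x)
    cv = bootstrap_critical_value(y, x)
    print(f"ECM t-stat = {t_obs:.2f}, 5% bootstrap critical value = {cv:.2f}")
    print("reject no cointegration" if t_obs < cv else "fail to reject")
```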

Relevance:

100.00%

Publisher:

Abstract:

This work belongs to the field of computational high-energy physics (HEP). The key methods used in this thesis to meet the challenges raised by the Large Hadron Collider (LHC) era experiments are object-oriented software engineering, Monte Carlo simulation, computing clusters, and artificial neural networks (ANNs). The first aspect discussed is the development of hadronic cascade models used for the accurate simulation of medium-energy hadron-nucleus reactions, up to 10 GeV. These models are typically needed in hadronic calorimeter studies and in the estimation of radiation backgrounds. Applications outside HEP include medicine (such as hadron treatment simulations), space science (satellite shielding), and nuclear physics (spallation studies). Validation results are presented for several significant improvements released in the Geant4 simulation toolkit, and the significance of the new models for computing in the Large Hadron Collider era is estimated. In particular, we estimate the ability of the Bertini cascade to simulate the Compact Muon Solenoid (CMS) hadron calorimeter (HCAL). LHC test beam activity involves a tightly coupled cycle of simulation and data analysis; typically, a Geant4 computer experiment is used to understand test beam measurements. Thus another aspect of this thesis is a description of studies related to developing new CMS H2 test beam data analysis tools and performing data analysis on the basis of CMS Monte Carlo events. These events have been simulated in detail using Geant4 physics models, a full CMS detector description, and event reconstruction. Using the ROOT data analysis framework, we have developed an offline ANN-based approach to tag b-jets associated with heavy neutral Higgs particles, and we show that this kind of neural network methodology can be successfully used to separate the Higgs signal from the background in the CMS experiment.
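
As a generic, self-contained illustration of the neural-network signal/background separation idea (not the ROOT-based tools or the full CMS simulation used in the thesis), the sketch below trains a tiny one-hidden-layer classifier on two invented Gaussian "jet feature" classes; the features, network size, and training settings are all assumptions for the example.

```python
# Toy neural-network signal/background separation with a one-hidden-layer classifier
# trained by plain batch gradient descent on synthetic two-dimensional features.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_nn(X, y, hidden=8, lr=0.1, epochs=2000, seed=0):
    """Train a one-hidden-layer binary classifier with cross-entropy loss."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                  # hidden activations
        p = sigmoid(h @ W2 + b2)                  # predicted signal probability
        grad_out = (p - y) / len(y)               # d(cross-entropy)/d(output logit)
        W2 -= lr * h.T @ grad_out
        b2 -= lr * grad_out.sum()
        grad_h = np.outer(grad_out, W2) * (1 - h ** 2)   # backprop through tanh
        W1 -= lr * X.T @ grad_h
        b1 -= lr * grad_h.sum(axis=0)
    return W1, b1, W2, b2

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    signal = rng.normal(loc=[1.5, 1.0], scale=1.0, size=(500, 2))      # "signal-like" features
    background = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(500, 2))  # "background-like"
    X = np.vstack([signal, background])
    y = np.concatenate([np.ones(500), np.zeros(500)])
    W1, b1, W2, b2 = train_nn(X, y)
    p = sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)
    print("training accuracy:", ((p > 0.5) == y).mean())
```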

Relevance:

100.00%

Publisher:

Abstract:

The polar regions are an energy sink of the Earth system: the Sun's rays do not reach the poles for half of the year and hit them only at very low angles during the other half. In summer, solar radiation is the dominant energy source for the polar areas, so even small changes in the surface albedo strongly affect the surface energy balance and thus the rate and amount of snow and ice melt. In winter, the main heat sources for the atmosphere are the cyclones approaching from lower latitudes, and the atmosphere-surface heat transfer takes place through turbulent mixing and longwave radiation, the latter dominated by clouds. The aim of this thesis is to improve our knowledge of the surface and atmospheric processes that control the surface energy budget over snow and ice, with particular focus on albedo during the spring and summer seasons, and on horizontal advection of heat, cloud longwave forcing, and turbulent mixing during the winter season. The critical importance of a correct albedo representation in models is illustrated through an analysis of the causes of the errors in the surface and near-surface air temperatures produced in a short-range numerical weather forecast by the HIRLAM model. The daily and seasonal variability of snow and ice albedo has then been examined by analysing field measurements of albedo carried out in different environments. On the basis of this data analysis, simple albedo parameterizations have been derived that can be implemented into thermodynamic sea ice models as well as numerical weather prediction and climate models. Field measurements of radiation and turbulent fluxes over the Bay of Bothnia (Baltic Sea) also made it possible to examine the impact of a large albedo change during the melting season on the surface energy and ice mass budgets. When high contrasts in surface albedo are present, as in the case of snow-covered areas next to open water, the effect of surface albedo heterogeneity on the downwelling solar irradiance under overcast conditions is very significant, although it is usually not accounted for in single-column radiative transfer calculations. To account for this effect, an effective albedo parameterization based on three-dimensional Monte Carlo radiative transfer calculations has been developed, and, as a potentially relevant application, its performance in the ground-based retrieval of cloud optical depth was illustrated. Finally, the factors causing the large variations of the surface and near-surface temperatures over the central Arctic during winter were examined. The relative importance of cloud radiative forcing, turbulent mixing, and lateral heat advection for the Arctic surface temperature was quantified through the analysis of direct observations from Russian drifting ice stations, with the lateral heat advection calculated from reanalysis products.
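
As a generic illustration of how a simple albedo parameterization might enter a surface energy budget calculation (the functional form and the constants below are illustrative assumptions, not the parameterizations derived in the thesis), the sketch lets the albedo ramp down from a cold dry-snow value to a melting-snow value as the surface temperature approaches 0 °C and applies it to the net shortwave flux.

```python
# Toy temperature-dependent snow/ice albedo parameterization and its effect on the
# absorbed (net) shortwave flux. All constants are invented for illustration.
def surface_albedo(surface_temp_c, albedo_cold=0.85, albedo_melt=0.60, ramp_c=2.0):
    """Albedo decreases linearly from a cold dry-snow value to a melting-snow value
    as the surface temperature rises over the last `ramp_c` degrees below 0 C."""
    if surface_temp_c <= -ramp_c:
        return albedo_cold
    if surface_temp_c >= 0.0:
        return albedo_melt
    frac = (surface_temp_c + ramp_c) / ramp_c           # 0 at -ramp_c, 1 at 0 C
    return albedo_cold + frac * (albedo_melt - albedo_cold)

def net_shortwave(downwelling_sw, surface_temp_c):
    """Net shortwave flux absorbed by the surface (W m^-2)."""
    return (1.0 - surface_albedo(surface_temp_c)) * downwelling_sw

if __name__ == "__main__":
    for t in (-10.0, -1.0, 0.0):
        print(f"T_s = {t:5.1f} C  albedo = {surface_albedo(t):.2f}  "
              f"net SW = {net_shortwave(300.0, t):6.1f} W m^-2")
```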

Relevance:

100.00%

Publisher:

Abstract:

Aims: To develop and validate tools for estimating the residual noise covariance in Planck frequency maps, to quantify signal error effects, and to compare different techniques for producing low-resolution maps. Methods: We derive analytical estimates of the covariance of the residual noise contained in low-resolution maps produced using a number of map-making approaches. We test these analytical predictions using Monte Carlo simulations and assess their impact on angular power spectrum estimation. We use simulations to quantify the level of signal errors incurred in the different resolution downgrading schemes considered in this work. Results: We find excellent agreement between the optimal residual noise covariance matrices and the Monte Carlo noise maps. For destriping map-makers, the extent of the agreement is dictated by the knee frequency of the correlated noise component and the chosen baseline offset length. The effect of signal striping is shown to be insignificant when properly dealt with. In map resolution downgrading, we find that a carefully selected window function is required to reduce aliasing to the sub-percent level at multipoles ℓ > 2Nside, where Nside is the HEALPix resolution parameter. We show that sufficient characterization of the residual noise is unavoidable if one is to draw reliable constraints on large-scale anisotropy. Conclusions: We have described how to compute low-resolution maps with a controlled sky signal level and a reliable estimate of the residual noise covariance. We have also presented a method to smooth the residual noise covariance matrices to describe the noise correlations in smoothed, bandwidth-limited maps.
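
The comparison between analytical and Monte Carlo noise covariance estimates can be illustrated with a toy pixel-space example: draw many noise realizations from a known pixel-pixel covariance and check that their sample covariance reproduces it. The toy covariance below (white noise plus a smooth correlated component) is an assumption standing in for a map-maker's analytical estimate; it is not any Planck product.

```python
# Validate a known ("analytical") pixel-pixel noise covariance against Monte Carlo noise maps
# by comparing it with the sample covariance of many simulated realizations.
import numpy as np

def sample_noise_covariance(noise_maps):
    """Sample covariance of a stack of noise maps with shape (n_realizations, n_pixels)."""
    centered = noise_maps - noise_maps.mean(axis=0)
    return centered.T @ centered / (noise_maps.shape[0] - 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_pix, n_mc = 64, 5000
    idx = np.arange(n_pix)
    # Toy covariance: white noise plus a smooth, exponentially correlated component.
    analytic_cov = 0.1 * np.eye(n_pix) + 0.05 * np.exp(-np.abs(idx[:, None] - idx[None, :]) / 8.0)
    chol = np.linalg.cholesky(analytic_cov)
    noise_maps = rng.standard_normal((n_mc, n_pix)) @ chol.T      # correlated realizations
    mc_cov = sample_noise_covariance(noise_maps)
    rel_err = np.abs(mc_cov - analytic_cov).max() / np.abs(analytic_cov).max()
    print(f"max relative deviation between MC and analytical covariance: {rel_err:.3f}")
```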

Relevance:

100.00%

Publisher:

Abstract:

The likelihood ratio test of cointegration rank is the most widely used test for cointegration. Many studies have shown that its finite-sample distribution is not well approximated by the limiting distribution. The article introduces bootstrap and fast double bootstrap (FDB) algorithms for the likelihood ratio test and evaluates them in Monte Carlo simulation experiments. It finds that the performance of the bootstrap test is very good. The more sophisticated FDB produces a further improvement in cases where the performance of the asymptotic test is very unsatisfactory and the ordinary bootstrap does not work as well as it might. Furthermore, the Monte Carlo simulations provide a number of guidelines on when the bootstrap and FDB tests can be expected to work well. Finally, the tests are applied to US interest rate and international stock price series. It is found that the asymptotic test tends to overestimate the cointegration rank, while the bootstrap and FDB tests choose the correct cointegration rank.
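
The bootstrap and fast double bootstrap p-value machinery can be sketched with a much simpler test than the cointegration rank test: below, a likelihood ratio statistic for a zero mean in a normal model with unknown variance is bootstrapped parametrically under the null, and one second-level sample per first-level draw yields the FDB correction in the style of Davidson and MacKinnon. The toy model and helper names are assumptions for the illustration, not the article's test.

```python
# Ordinary bootstrap and fast double bootstrap (FDB) p-values for a toy likelihood ratio test
# of H0: mean = 0 in a normal model with unknown variance, with the null imposed on the DGP.
import numpy as np

def lr_statistic(x):
    """LR statistic: n * log(sigma^2 restricted / sigma^2 unrestricted)."""
    sigma2_null = np.mean(x ** 2)        # variance MLE with the null (zero mean) imposed
    sigma2_alt = np.var(x)               # unrestricted variance MLE
    return len(x) * np.log(sigma2_null / sigma2_alt)

def fdb_pvalues(x, n_boot=999, seed=0):
    """Return (ordinary bootstrap p-value, fast double bootstrap p-value)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    tau = lr_statistic(x)
    sigma_null = np.sqrt(np.mean(x ** 2))              # null-imposed DGP estimated from data
    first, second = np.empty(n_boot), np.empty(n_boot)
    for j in range(n_boot):
        xb = rng.normal(0.0, sigma_null, size=n)       # first-level sample under H0
        first[j] = lr_statistic(xb)
        sigma_b = np.sqrt(np.mean(xb ** 2))            # null DGP re-estimated from xb
        xbb = rng.normal(0.0, sigma_b, size=n)         # one second-level sample per draw
        second[j] = lr_statistic(xbb)
    p_boot = (first > tau).mean()
    q = np.quantile(second, 1.0 - p_boot)              # FDB correction quantile
    p_fdb = (first > q).mean()
    return p_boot, p_fdb

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(0.3, 1.0, size=25)                  # true mean 0.3: mild violation of H0
    p1, p2 = fdb_pvalues(x)
    print(f"bootstrap p-value = {p1:.3f}, fast double bootstrap p-value = {p2:.3f}")
```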

Relevance:

100.00%

Publisher:

Abstract:

This work is a case study of applying nonparametric statistical methods to corpus data. We show how to use ideas from permutation testing to answer linguistic questions related to morphological productivity and type richness. In particular, we study the use of the suffixes -ity and -ness in the 17th-century part of the Corpus of Early English Correspondence within the framework of historical sociolinguistics. Our hypothesis is that the productivity of -ity, as measured by type counts, is significantly low in letters written by women. To test such hypotheses, and to facilitate exploratory data analysis, we take the approach of computing accumulation curves for types and hapax legomena. We have developed an open-source computer program which uses Monte Carlo sampling to compute the upper and lower bounds of these curves for one or more levels of statistical significance. By comparing the type accumulation curve for women's letters with these bounds, we are able to confirm our hypothesis.
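
The Monte Carlo approach to type accumulation curves can be sketched as follows: shuffle the order in which texts enter the corpus many times, record the number of distinct types after each step, and take empirical lower and upper bounds at a chosen significance level. The tiny toy "letters" below are placeholders, not data from the Corpus of Early English Correspondence, and the sketch compares a single ordering against its own bounds rather than a subcorpus against whole-corpus bounds as in the real analysis.

```python
# Monte Carlo bounds for a type accumulation curve via repeated shuffling of text order.
import random

def type_accumulation(letters):
    """Cumulative count of distinct types as letters are added in the given order."""
    seen, curve = set(), []
    for tokens in letters:
        seen.update(tokens)
        curve.append(len(seen))
    return curve

def mc_bounds(letters, n_shuffles=1000, level=0.05, seed=0):
    """Pointwise Monte Carlo lower/upper bounds for the curve under random text order."""
    rng = random.Random(seed)
    curves = []
    for _ in range(n_shuffles):
        order = letters[:]
        rng.shuffle(order)
        curves.append(type_accumulation(order))
    k = int(level / 2 * n_shuffles)                 # index offset for two-sided bounds
    lower, upper = [], []
    for step in range(len(letters)):
        values = sorted(c[step] for c in curves)
        lower.append(values[k])
        upper.append(values[-1 - k])
    return lower, upper

if __name__ == "__main__":
    # Toy letters, each a list of -ity / -ness type tokens (hypothetical).
    letters = [["purity", "vanity"], ["goodness", "purity"], ["civility"],
               ["kindness", "vanity", "rudeness"], ["scarcity"], ["boldness", "civility"]]
    observed = type_accumulation(letters)           # one particular ordering of the letters
    lower, upper = mc_bounds(letters)
    print("observed:", observed)
    print("lower   :", lower)
    print("upper   :", upper)
```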