8 resultados para Monte Carlo EM algorithm
em Helda - Digital Repository of University of Helsinki
Resumo:
Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. Currently the field of genetics is under a rapid development because of the recent advances in technologies by which molecular data can be obtained from living organisms. In order that most information from such data can be extracted, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on the modeling of the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between the adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo algorithm (MCMC). The example analyses consider estimation of the population structure, relatedness structure (both at the level of whole genomes as well as at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where also a quantitative phenotype has been measured from the contemporary individuals. In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely-used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.
Resumo:
A better understanding of the limiting step in a first order phase transition, the nucleation process, is of major importance to a variety of scientific fields ranging from atmospheric sciences to nanotechnology and even to cosmology. This is due to the fact that in most phase transitions the new phase is separated from the mother phase by a free energy barrier. This barrier is crossed in a process called nucleation. Nowadays it is considered that a significant fraction of all atmospheric particles is produced by vapor-to liquid nucleation. In atmospheric sciences, as well as in other scientific fields, the theoretical treatment of nucleation is mostly based on a theory known as the Classical Nucleation Theory. However, the Classical Nucleation Theory is known to have only a limited success in predicting the rate at which vapor-to-liquid nucleation takes place at given conditions. This thesis studies the unary homogeneous vapor-to-liquid nucleation from a statistical mechanics viewpoint. We apply Monte Carlo simulations of molecular clusters to calculate the free energy barrier separating the vapor and liquid phases and compare our results against the laboratory measurements and Classical Nucleation Theory predictions. According to our results, the work of adding a monomer to a cluster in equilibrium vapour is accurately described by the liquid drop model applied by the Classical Nucleation Theory, once the clusters are larger than some threshold size. The threshold cluster sizes contain only a few or some tens of molecules depending on the interaction potential and temperature. However, the error made in modeling the smallest of clusters as liquid drops results in an erroneous absolute value for the cluster work of formation throughout the size range, as predicted by the McGraw-Laaksonen scaling law. By calculating correction factors to Classical Nucleation Theory predictions for the nucleation barriers of argon and water, we show that the corrected predictions produce nucleation rates that are in good comparison with experiments. For the smallest clusters, the deviation between the simulation results and the liquid drop values are accurately modelled by the low order virial coefficients at modest temperatures and vapour densities, or in other words, in the validity range of the non-interacting cluster theory by Frenkel, Band and Bilj. Our results do not indicate a need for a size dependent replacement free energy correction. The results also indicate that Classical Nucleation Theory predicts the size of the critical cluster correctly. We also presents a new method for the calculation of the equilibrium vapour density, surface tension size dependence and planar surface tension directly from cluster simulations. We also show how the size dependence of the cluster surface tension in equimolar surface is a function of virial coefficients, a result confirmed by our cluster simulations.
Resumo:
A better understanding of the limiting step in a first order phase transition, the nucleation process, is of major importance to a variety of scientific fields ranging from atmospheric sciences to nanotechnology and even to cosmology. This is due to the fact that in most phase transitions the new phase is separated from the mother phase by a free energy barrier. This barrier is crossed in a process called nucleation. Nowadays it is considered that a significant fraction of all atmospheric particles is produced by vapor-to liquid nucleation. In atmospheric sciences, as well as in other scientific fields, the theoretical treatment of nucleation is mostly based on a theory known as the Classical Nucleation Theory. However, the Classical Nucleation Theory is known to have only a limited success in predicting the rate at which vapor-to-liquid nucleation takes place at given conditions. This thesis studies the unary homogeneous vapor-to-liquid nucleation from a statistical mechanics viewpoint. We apply Monte Carlo simulations of molecular clusters to calculate the free energy barrier separating the vapor and liquid phases and compare our results against the laboratory measurements and Classical Nucleation Theory predictions. According to our results, the work of adding a monomer to a cluster in equilibrium vapour is accurately described by the liquid drop model applied by the Classical Nucleation Theory, once the clusters are larger than some threshold size. The threshold cluster sizes contain only a few or some tens of molecules depending on the interaction potential and temperature. However, the error made in modeling the smallest of clusters as liquid drops results in an erroneous absolute value for the cluster work of formation throughout the size range, as predicted by the McGraw-Laaksonen scaling law. By calculating correction factors to Classical Nucleation Theory predictions for the nucleation barriers of argon and water, we show that the corrected predictions produce nucleation rates that are in good comparison with experiments. For the smallest clusters, the deviation between the simulation results and the liquid drop values are accurately modelled by the low order virial coefficients at modest temperatures and vapour densities, or in other words, in the validity range of the non-interacting cluster theory by Frenkel, Band and Bilj. Our results do not indicate a need for a size dependent replacement free energy correction. The results also indicate that Classical Nucleation Theory predicts the size of the critical cluster correctly. We also presents a new method for the calculation of the equilibrium vapour density, surface tension size dependence and planar surface tension directly from cluster simulations. We also show how the size dependence of the cluster surface tension in equimolar surface is a function of virial coefficients, a result confirmed by our cluster simulations.
Resumo:
Aerosols impact the planet and our daily lives through various effects, perhaps most notably those related to their climatic and health-related consequences. While there are several primary particle sources, secondary new particle formation from precursor vapors is also known to be a frequent, global phenomenon. Nevertheless, the formation mechanism of new particles, as well as the vapors participating in the process, remain a mystery. This thesis consists of studies on new particle formation specifically from the point of view of numerical modeling. A dependence of formation rate of 3 nm particles on the sulphuric acid concentration to the power of 1-2 has been observed. This suggests nucleation mechanism to be of first or second order with respect to the sulphuric acid concentration, in other words the mechanisms based on activation or kinetic collision of clusters. However, model studies have had difficulties in replicating the small exponents observed in nature. The work done in this thesis indicates that the exponents may be lowered by the participation of a co-condensing (and potentially nucleating) low-volatility organic vapor, or by increasing the assumed size of the critical clusters. On the other hand, the presented new and more accurate method for determining the exponent indicates high diurnal variability. Additionally, these studies included several semi-empirical nucleation rate parameterizations as well as a detailed investigation of the analysis used to determine the apparent particle formation rate. Due to their high proportion of the earth's surface area, oceans could potentially prove to be climatically significant sources of secondary particles. In the lack of marine observation data, new particle formation events in a coastal region were parameterized and studied. Since the formation mechanism is believed to be similar, the new parameterization was applied in a marine scenario. The work showed that marine CCN production is feasible in the presence of additional vapors contributing to particle growth. Finally, a new method to estimate concentrations of condensing organics was developed. The algorithm utilizes a Markov chain Monte Carlo method to determine the required combination of vapor concentrations by comparing a measured particle size distribution with one from an aerosol dynamics process model. The evaluation indicated excellent agreement against model data, and initial results with field data appear sound as well.
Resumo:
In this thesis we deal with the concept of risk. The objective is to bring together and conclude on some normative information regarding quantitative portfolio management and risk assessment. The first essay concentrates on return dependency. We propose an algorithm for classifying markets into rising and falling. Given the algorithm, we derive a statistic: the Trend Switch Probability, for detection of long-term return dependency in the first moment. The empirical results suggest that the Trend Switch Probability is robust over various volatility specifications. The serial dependency in bear and bull markets behaves however differently. It is strongly positive in rising market whereas in bear markets it is closer to a random walk. Realized volatility, a technique for estimating volatility from high frequency data, is investigated in essays two and three. In the second essay we find, when measuring realized variance on a set of German stocks, that the second moment dependency structure is highly unstable and changes randomly. Results also suggest that volatility is non-stationary from time to time. In the third essay we examine the impact from market microstructure on the error between estimated realized volatility and the volatility of the underlying process. With simulation-based techniques we show that autocorrelation in returns leads to biased variance estimates and that lower sampling frequency and non-constant volatility increases the error variation between the estimated variance and the variance of the underlying process. From these essays we can conclude that volatility is not easily estimated, even from high frequency data. It is neither very well behaved in terms of stability nor dependency over time. Based on these observations, we would recommend the use of simple, transparent methods that are likely to be more robust over differing volatility regimes than models with a complex parameter universe. In analyzing long-term return dependency in the first moment we find that the Trend Switch Probability is a robust estimator. This is an interesting area for further research, with important implications for active asset allocation.