15 resultados para Bayesian risk prediction models

em Helda - Digital Repository of University of Helsinki


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modern-day weather forecasting is highly dependent on Numerical Weather Prediction (NWP) models as the main data source. The evolving state of the atmosphere with time can be numerically predicted by solving a set of hydrodynamic equations, if the initial state is known. However, such a modelling approach always contains approximations that by and large depend on the purpose of use and resolution of the models. Present-day NWP systems operate with horizontal model resolutions in the range from about 40 km to 10 km. Recently, the aim has been to reach operationally to scales of 1 4 km. This requires less approximations in the model equations, more complex treatment of physical processes and, furthermore, more computing power. This thesis concentrates on the physical parameterization methods used in high-resolution NWP models. The main emphasis is on the validation of the grid-size-dependent convection parameterization in the High Resolution Limited Area Model (HIRLAM) and on a comprehensive intercomparison of radiative-flux parameterizations. In addition, the problems related to wind prediction near the coastline are addressed with high-resolution meso-scale models. The grid-size-dependent convection parameterization is clearly beneficial for NWP models operating with a dense grid. Results show that the current convection scheme in HIRLAM is still applicable down to a 5.6 km grid size. However, with further improved model resolution, the tendency of the model to overestimate strong precipitation intensities increases in all the experiment runs. For the clear-sky longwave radiation parameterization, schemes used in NWP-models provide much better results in comparison with simple empirical schemes. On the other hand, for the shortwave part of the spectrum, the empirical schemes are more competitive for producing fairly accurate surface fluxes. Overall, even the complex radiation parameterization schemes used in NWP-models seem to be slightly too transparent for both long- and shortwave radiation in clear-sky conditions. For cloudy conditions, simple cloud correction functions are tested. In case of longwave radiation, the empirical cloud correction methods provide rather accurate results, whereas for shortwave radiation the benefit is only marginal. Idealised high-resolution two-dimensional meso-scale model experiments suggest that the reason for the observed formation of the afternoon low level jet (LLJ) over the Gulf of Finland is an inertial oscillation mechanism, when the large-scale flow is from the south-east or west directions. The LLJ is further enhanced by the sea-breeze circulation. A three-dimensional HIRLAM experiment, with a 7.7 km grid size, is able to generate a similar LLJ flow structure as suggested by the 2D-experiments and observations. It is also pointed out that improved model resolution does not necessary lead to better wind forecasts in the statistical sense. In nested systems, the quality of the large-scale host model is really important, especially if the inner meso-scale model domain is small.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis report attempts to improve the models for predicting forest stand structure for practical use, e.g. forest management planning (FMP) purposes in Finland. Comparisons were made between Weibull and Johnson s SB distribution and alternative regression estimation methods. Data used for preliminary studies was local but the final models were based on representative data. Models were validated mainly in terms of bias and RMSE in the main stand characteristics (e.g. volume) using independent data. The bivariate SBB distribution model was used to mimic realistic variations in tree dimensions by including within-diameter-class height variation. Using the traditional method, diameter distribution with the expected height resulted in reduced height variation, whereas the alternative bivariate method utilized the error-term of the height model. The lack of models for FMP was covered to some extent by the models for peatland and juvenile stands. The validation of these models showed that the more sophisticated regression estimation methods provided slightly improved accuracy. A flexible prediction and application for stand structure consisted of seemingly unrelated regression models for eight stand characteristics, the parameters of three optional distributions and Näslund s height curve. The cross-model covariance structure was used for linear prediction application, in which the expected values of the models were calibrated with the known stand characteristics. This provided a framework to validate the optional distributions and the optional set of stand characteristics. Height distribution is recommended for the earliest state of stands because of its continuous feature. From the mean height of about 4 m, Weibull dbh-frequency distribution is recommended in young stands if the input variables consist of arithmetic stand characteristics. In advanced stands, basal area-dbh distribution models are recommended. Näslund s height curve proved useful. Some efficient transformations of stand characteristics are introduced, e.g. the shape index, which combined the basal area, the stem number and the median diameter. Shape index enabled SB model for peatland stands to detect large variation in stand densities. This model also demonstrated reasonable behaviour for stands in mineral soils.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Assessment of the outcome of critical illness is complex. Severity scoring systems and organ dysfunction scores are traditional tools in mortality and morbidity prediction in intensive care. Their ability to explain risk of death is impressive for large cohorts of patients, but insufficient for an individual patient. Although events before intensive care unit (ICU) admission are prognostically important, the prediction models utilize data collected at and just after ICU admission. In addition, several biomarkers have been evaluated to predict mortality, but none has proven entirely useful in clinical practice. Therefore, new prognostic markers of critical illness are vital when evaluating the intensive care outcome. The aim of this dissertation was to investigate new measures and biological markers of critical illness and to evaluate their predictive value and association with mortality and disease severity. The impact of delay in emergency department (ED) on intensive care outcome, measured as hospital mortality and health-related quality of life (HRQoL) at 6 months, was assessed in 1537 consecutive patients admitted to medical ICU. Two new biological markers were investigated in two separate patient populations: in 231 ICU patients and 255 patients with severe sepsis or septic shock. Cell-free plasma DNA is a surrogate marker of apoptosis. Its association with disease severity and mortality rate was evaluated in ICU patients. Next, the predictive value of plasma DNA regarding mortality and its association with the degree of organ dysfunction and disease severity was evaluated in severe sepsis or septic shock. Heme oxygenase-1 (HO-1) is a potential regulator of apoptosis. Finally, HO-1 plasma concentrations and HO-1 gene polymorphisms and their association with outcome were evaluated in ICU patients. The length of ED stay was not associated with outcome of intensive care. The hospital mortality rate was significantly lower in patients admitted to the medical ICU from the ED than from the non-ED, and the HRQoL in the critically ill at 6 months was significantly lower than in the age- and sex-matched general population. In the ICU patient population, the maximum plasma DNA concentration measured during the first 96 hours in intensive care correlated significantly with disease severity and degree of organ failure and was independently associated with hospital mortality. In patients with severe sepsis or septic shock, the cell-free plasma DNA concentrations were significantly higher in ICU and hospital nonsurvivors than in survivors and showed a moderate discriminative power regarding ICU mortality. Plasma DNA was an independent predictor for ICU mortality, but not for hospital mortality. The degree of organ dysfunction correlated independently with plasma DNA concentration in severe sepsis and plasma HO-1 concentration in ICU patients. The HO-1 -413T/GT(L)/+99C haplotype was associated with HO-1 plasma levels and frequency of multiple organ dysfunction. Plasma DNA and HO-1 concentrations may support the assessment of outcome or organ failure development in critically ill patients, although their value is limited and requires further evaluation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The planet Mars is the Earth's neighbour in the Solar System. Planetary research stems from a fundamental need to explore our surroundings, typical for mankind. Manned missions to Mars are already being planned, and understanding the environment to which the astronauts would be exposed is of utmost importance for a successful mission. Information of the Martian environment given by models is already now used in designing the landers and orbiters sent to the red planet. In particular, studies of the Martian atmosphere are crucial for instrument design, entry, descent and landing system design, landing site selection, and aerobraking calculations. Research of planetary atmospheres can also contribute to atmospheric studies of the Earth via model testing and development of parameterizations: even after decades of modeling the Earth's atmosphere, we are still far from perfect weather predictions. On a global level, Mars has also been experiencing climate change. The aerosol effect is one of the largest unknowns in the present terrestrial climate change studies, and the role of aerosol particles in any climate is fundamental: studies of climate variations on another planet can help us better understand our own global change. In this thesis I have used an atmospheric column model for Mars to study the behaviour of the lowest layer of the atmosphere, the planetary boundary layer (PBL), and I have developed nucleation (particle formation) models for Martian conditions. The models were also coupled to study, for example, fog formation in the PBL. The PBL is perhaps the most significant part of the atmosphere for landers and humans, since we live in it and experience its state, for example, as gusty winds, nightfrost, and fogs. However, PBL modelling in weather prediction models is still a difficult task. Mars hosts a variety of cloud types, mainly composed of water ice particles, but also CO2 ice clouds form in the very cold polar night and at high altitudes elsewhere. Nucleation is the first step in particle formation, and always includes a phase transition. Cloud crystals on Mars form from vapour to ice on ubiquitous, suspended dust particles. Clouds on Mars have a small radiative effect in the present climate, but it may have been more important in the past. This thesis represents an attempt to model the Martian atmosphere at the smallest scales with high resolution. The models used and developed during the course of the research are useful tools for developing and testing parameterizations for larger-scale models all the way up to global climate models, since the small-scale models can describe processes that in the large-scale models are reduced to subgrid (not explicitly resolved) scale.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Both inherited genetic variations and somatically acquired mutations drive cancer development. The aim of this thesis was to gain insight into the molecular mechanisms underlying colorectal cancer (CRC) predisposition and tumor progression. Whereas one-third of CRC may develop in the context of hereditary predisposition, the known highly penetrant syndromes only explain a small fraction of all cases. Genome-wide association studies have shown that ten common single nucleotide polymorphisms (SNPs) modestly predispose to CRC. Our population-based sample series of around thousand CRC cases and healthy controls was genotyped for these SNPs. Tumors of heterozygous patients were analyzed for allelic imbalance, in an attempt to reveal the role of these SNPs in somatic tumor progression. The risk allele of rs6983267 at 8q24 was favored in the tumors significantly more often than the neutral allele, indicating that this germline variant is somatically selected for. No imbalance targeting the risk allele was observed in the remaining loci, suggesting that most of the low-penetrance CRC SNPs mainly play a role in the early stages of the neoplastic process. The ten SNPs were further analyzed in 788 CRC cases, 97 of which had a family history of CRC, to evaluate their combined contribution. A significant association appeared between the overall number of risk alleles and familial CRC and these ten SNPs seem to explain around 9% of the familial clustering of CRC. Finding more CRC susceptibility alleles may facilitate individualized risk prediction and cancer prevention in the future. Microsatellite instability (MSI), resulting from defective mismatch repair function, is a hallmark of Lynch syndrome and observed in a subset of all CRCs. Our aim was to identify microsatellite frameshift mutations that inactivate tumor suppressor genes in MSI CRCs. By sequencing microsatellite repeats of underexpressed genes we found six novel MSI target genes that were frequently mutated in 100 MSI CRCs: 51% in GLYR1, 47% in ABCC5, 43% in WDTC1, 33% in ROCK1, 30% in OR51E2, and 28% in TCEB3. Immunohistochemical staining of GLYR1 revealed defective protein expression in homozygously mutated tumors, providing further support for the loss of function hypothesis. Another mutation screening effort sought to identify MSI target genes with putative oncogenic functions. Microsatellites were similarly sequenced in genes that were overexpressed and, upon mutation, predicted to avoid nonsense-mediated mRNA decay. The mitotic checkpoint kinase TTK harbored protein-elongating mutations in 59% of MSI CRCs and the mutant protein was detected in heterozygous MSI CRC cells. No checkpoint dysregulation or defective protein localization was observable however, and the biological relevance of this mutation may hence be related to other mechanisms. In conclusion, these two large-scale and unbiased efforts identified frequently mutated genes that are likely to contribute to the development of this cancer type and may be utilized in developing diagnostic and therapeutic applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In meteorology, observations and forecasts of a wide range of phenomena for example, snow, clouds, hail, fog, and tornados can be categorical, that is, they can only have discrete values (e.g., "snow" and "no snow"). Concentrating on satellite-based snow and cloud analyses, this thesis explores methods that have been developed for evaluation of categorical products and analyses. Different algorithms for satellite products generate different results; sometimes the differences are subtle, sometimes all too visible. In addition to differences between algorithms, the satellite products are influenced by physical processes and conditions, such as diurnal and seasonal variation in solar radiation, topography, and land use. The analysis of satellite-based snow cover analyses from NOAA, NASA, and EUMETSAT, and snow analyses for numerical weather prediction models from FMI and ECMWF was complicated by the fact that we did not have the true knowledge of snow extent, and we were forced simply to measure the agreement between different products. The Sammon mapping, a multidimensional scaling method, was then used to visualize the differences between different products. The trustworthiness of the results for cloud analyses [EUMETSAT Meteorological Products Extraction Facility cloud mask (MPEF), together with the Nowcasting Satellite Application Facility (SAFNWC) cloud masks provided by Météo-France (SAFNWC/MSG) and the Swedish Meteorological and Hydrological Institute (SAFNWC/PPS)] compared with ceilometers of the Helsinki Testbed was estimated by constructing confidence intervals (CIs). Bootstrapping, a statistical resampling method, was used to construct CIs, especially in the presence of spatial and temporal correlation. The reference data for validation are constantly in short supply. In general, the needs of a particular project drive the requirements for evaluation, for example, for the accuracy and the timeliness of the particular data and methods. In this vein, we discuss tentatively how data provided by general public, e.g., photos shared on the Internet photo-sharing service Flickr, can be used as a new source for validation. Results show that they are of reasonable quality and their use for case studies can be warmly recommended. Last, the use of cluster analysis on meteorological in-situ measurements was explored. The Autoclass algorithm was used to construct compact representations of synoptic conditions of fog at Finnish airports.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches for the statistical inference of ascertained data: a conditional and full likelihood approaches for the disease with variable age at onset phenotype using familial data obtained from population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using latent variable approach, as well as for prospective studies of genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation of the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Despite the fact that this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. In addition, non-Mendelian transmission of T1D susceptibility genes did not make a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, reference sample of healthy subjects and birth cohort information of the Finnish population. Finally, a substantial familial variation in the susceptibility of T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling to explore risk factors for complex diseases.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. Currently the field of genetics is under a rapid development because of the recent advances in technologies by which molecular data can be obtained from living organisms. In order that most information from such data can be extracted, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on the modeling of the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between the adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo algorithm (MCMC). The example analyses consider estimation of the population structure, relatedness structure (both at the level of whole genomes as well as at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where also a quantitative phenotype has been measured from the contemporary individuals. In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely-used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The future use of genetically modified (GM) plants in food, feed and biomass production requires a careful consideration of possible risks related to the unintended spread of trangenes into new habitats. This may occur via introgression of the transgene to conventional genotypes, due to cross-pollination, and via the invasion of GM plants to new habitats. Assessment of possible environmental impacts of GM plants requires estimation of the level of gene flow from a GM population. Furthermore, management measures for reducing gene flow from GM populations are needed in order to prevent possible unwanted effects of transgenes on ecosystems. This work develops modeling tools for estimating gene flow from GM plant populations in boreal environments and for investigating the mechanisms of the gene flow process. To describe spatial dimensions of the gene flow, dispersal models are developed for the local and regional scale spread of pollen grains and seeds, with special emphasis on wind dispersal. This study provides tools for describing cross-pollination between GM and conventional populations and for estimating the levels of transgenic contamination of the conventional crops. For perennial populations, a modeling framework describing the dynamics of plants and genotypes is developed, in order to estimate the gene flow process over a sequence of years. The dispersal of airborne pollen and seeds cannot be easily controlled, and small amounts of these particles are likely to disperse over long distances. Wind dispersal processes are highly stochastic due to variation in atmospheric conditions, so that there may be considerable variation between individual dispersal patterns. This, in turn, is reflected to the large amount of variation in annual levels of cross-pollination between GM and conventional populations. Even though land-use practices have effects on the average levels of cross-pollination between GM and conventional fields, the level of transgenic contamination of a conventional crop remains highly stochastic. The demographic effects of a transgene have impacts on the establishment of trangenic plants amongst conventional genotypes of the same species. If the transgene gives a plant a considerable fitness advantage in comparison to conventional genotypes, the spread of transgenes to conventional population can be strongly increased. In such cases, dominance of the transgene considerably increases gene flow from GM to conventional populations, due to the enhanced fitness of heterozygous hybrids. The fitness of GM plants in conventional populations can be reduced by linking the selectively favoured primary transgene to a disfavoured mitigation transgene. Recombination between these transgenes is a major risk related to this technique, especially because it tends to take place amongst the conventional genotypes and thus promotes the establishment of invasive transgenic plants in conventional populations.

Relevância:

40.00% 40.00%

Publicador:

Relevância:

40.00% 40.00%

Publicador:

Resumo:

One of the most fundamental and widely accepted ideas in finance is that investors are compensated through higher returns for taking on non-diversifiable risk. Hence the quantification, modeling and prediction of risk have been, and still are one of the most prolific research areas in financial economics. It was recognized early on that there are predictable patterns in the variance of speculative prices. Later research has shown that there may also be systematic variation in the skewness and kurtosis of financial returns. Lacking in the literature so far, is an out-of-sample forecast evaluation of the potential benefits of these new more complicated models with time-varying higher moments. Such an evaluation is the topic of this dissertation. Essay 1 investigates the forecast performance of the GARCH (1,1) model when estimated with 9 different error distributions on Standard and Poor’s 500 Index Future returns. By utilizing the theory of realized variance to construct an appropriate ex post measure of variance from intra-day data it is shown that allowing for a leptokurtic error distribution leads to significant improvements in variance forecasts compared to using the normal distribution. This result holds for daily, weekly as well as monthly forecast horizons. It is also found that allowing for skewness and time variation in the higher moments of the distribution does not further improve forecasts. In Essay 2, by using 20 years of daily Standard and Poor 500 index returns, it is found that density forecasts are much improved by allowing for constant excess kurtosis but not improved by allowing for skewness. By allowing the kurtosis and skewness to be time varying the density forecasts are not further improved but on the contrary made slightly worse. In Essay 3 a new model incorporating conditional variance, skewness and kurtosis based on the Normal Inverse Gaussian (NIG) distribution is proposed. The new model and two previously used NIG models are evaluated by their Value at Risk (VaR) forecasts on a long series of daily Standard and Poor’s 500 returns. The results show that only the new model produces satisfactory VaR forecasts for both 1% and 5% VaR Taken together the results of the thesis show that kurtosis appears not to exhibit predictable time variation, whereas there is found some predictability in the skewness. However, the dynamic properties of the skewness are not completely captured by any of the models.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Myrkyllisten aineiden jakaumat ja vaikutusmallit jätealueiden ympäristöriskien analyysissä.