30 results for PREDICTIVE MODELING

in Helda - Digital Repository of University of Helsinki


Abstract:

Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, a more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling to various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox developed here, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly to justify the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can also be applied under sampling from a finite population. The main emphasis here is on probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example.
The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.
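
The statement that maximum likelihood corresponds to a Gaussian approximation at the posterior mode under a flat prior can be illustrated with a minimal sketch. The binomial model, the function name `flat_prior_summary`, and the example counts are assumptions chosen for illustration; they do not come from the thesis.

```python
import math


def flat_prior_summary(successes: int, trials: int):
    """Compare the binomial MLE with the posterior under a flat Beta(1, 1) prior.

    Assumes 0 < successes < trials so that the posterior mode is interior.
    """
    # Maximum likelihood estimate of the proportion.
    mle = successes / trials
    # A flat prior gives the posterior Beta(successes + 1, trials - successes + 1).
    a, b = successes + 1, trials - successes + 1
    # Mode of Beta(a, b) for a, b > 1.
    posterior_mode = (a - 1) / (a + b - 2)
    # Gaussian approximation at the mode: the variance is the inverse of the
    # negative second derivative of the log posterior, mode * (1 - mode) / trials.
    approx_sd = math.sqrt(posterior_mode * (1 - posterior_mode) / trials)
    return mle, posterior_mode, approx_sd


mle, mode, sd = flat_prior_summary(7, 20)
print(mle, mode, sd)
```

In this toy case the posterior mode coincides with the maximum likelihood estimate, and the approximate standard deviation equals the usual asymptotic standard error of the estimate.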

Abstract:

This thesis addresses modeling of financial time series, especially stock market returns and daily price ranges. Modeling data of this kind can be approached with so-called multiplicative error models (MEM). These models nest several well known time series models such as GARCH, ACD and CARR models. They are able to capture many well established features of financial time series including volatility clustering and leptokurtosis. In contrast to these phenomena, different kinds of asymmetries have received relatively little attention in the existing literature. In this thesis asymmetries arise from various sources. They are observed in both conditional and unconditional distributions, for variables with non-negative values and for variables that have values on the real line. In the multivariate context asymmetries can be observed in the marginal distributions as well as in the relationships of the variables modeled. New methods for all these cases are proposed. Chapter 2 considers GARCH models and modeling of returns of two stock market indices. The chapter introduces the so-called generalized hyperbolic (GH) GARCH model to account for asymmetries in both conditional and unconditional distribution. In particular, two special cases of the GARCH-GH model which describe the data most accurately are proposed. They are found to improve the fit of the model when compared to symmetric GARCH models. The advantages of accounting for asymmetries are also observed through Value-at-Risk applications. Both theoretical and empirical contributions are provided in Chapter 3 of the thesis. In this chapter the so-called mixture conditional autoregressive range (MCARR) model is introduced, examined and applied to daily price ranges of the Hang Seng Index. The conditions for the strict and weak stationarity of the model as well as an expression for the autocorrelation function are obtained by writing the MCARR model as a first order autoregressive process with random coefficients. 
The chapter also introduces the inverse gamma (IG) distribution to CARR models. The advantages of CARR-IG and MCARR-IG specifications over conventional CARR models are found in the empirical application both in- and out-of-sample. Chapter 4 discusses the simultaneous modeling of absolute returns and daily price ranges. In this part of the thesis a vector multiplicative error model (VMEM) with an asymmetric Gumbel copula is found to provide substantial benefits over existing VMEMs based on elliptical copulas. The proposed specification is able to capture the highly asymmetric dependence of the modeled variables, thereby improving the performance of the model considerably. The economic significance of the results obtained is established when the information content of the volatility forecasts derived is examined.
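
As a rough illustration of the multiplicative error idea, the sketch below generates a non-negative series as a conditional mean times a non-negative innovation. It uses the standard first-order MEM recursion, not the specific GH, MCARR or copula specifications proposed in the thesis. The function name `simulate_mem` and all parameter values are hypothetical.

```python
from typing import List, Sequence


def simulate_mem(omega: float, alpha: float, beta: float,
                 innovations: Sequence[float]) -> List[float]:
    """Generate x_t = mu_t * eps_t with mu_t = omega + alpha*x_{t-1} + beta*mu_{t-1}.

    The innovations eps_t are supplied by the caller and should be
    non-negative with unit mean (for example unit-mean exponential draws).
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    # Start the conditional mean at its unconditional value.
    mu = omega / (1.0 - alpha - beta)
    series = []
    for eps in innovations:
        if eps < 0:
            raise ValueError("innovations must be non-negative")
        x = mu * eps                        # multiplicative error structure
        series.append(x)
        mu = omega + alpha * x + beta * mu  # update the conditional mean
    return series
```

Passing draws such as `random.Random(0).expovariate(1.0)` as innovations produces a series with volatility-clustering-like persistence; passing a constant 1.0 keeps the series at its unconditional mean, which is a convenient sanity check.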

Abstract:

In visual object detection and recognition, classifiers have two interesting characteristics: accuracy and speed. Accuracy depends on the complexity of the image features and classifier decision surfaces. Speed depends on the hardware and the computational effort required to use the features and decision surfaces. When attempts to increase accuracy lead to increases in complexity and effort, it is necessary to ask how much we are willing to pay for increased accuracy. For example, if increased computational effort implies quickly diminishing returns in accuracy, then those designing inexpensive surveillance applications cannot aim for maximum accuracy at any cost. It becomes necessary to find trade-offs between accuracy and effort. We study efficient classification of images depicting real-world objects and scenes. Classification is efficient when a classifier can be controlled so that the desired trade-off between accuracy and effort (speed) is achieved and unnecessary computations are avoided on a per-input basis. A framework is proposed for understanding and modeling efficient classification of images. Classification is modeled as a tree-like process. In designing the framework, it is important to recognize what is essential and to avoid structures that are narrow in applicability. Earlier frameworks are lacking in this regard. The overall contribution is two-fold. First, the framework is presented, subjected to experiments, and shown to be satisfactory. Second, certain unconventional approaches are experimented with. This allows the separation of the essential from the conventional. To determine if the framework is satisfactory, three categories of questions are identified: trade-off optimization, classifier tree organization, and rules for delegation and confidence modeling. Questions and problems related to each category are addressed and empirical results are presented.
For example, related to trade-off optimization, we address the problem of computational bottlenecks that limit the range of trade-offs. We also ask if accuracy versus effort trade-offs can be controlled after training. For another example, regarding classifier tree organization, we first consider the task of organizing a tree in a problem-specific manner. We then ask if problem-specific organization is necessary.
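
The idea of delegation driven by confidence can be sketched as a simple chain of classifiers, the most degenerate form of a tree. This is an illustrative assumption rather than the framework from the thesis: the function `classify_with_delegation`, the `Stage` type, and the threshold rule are all hypothetical.

```python
from typing import Callable, List, Tuple

# Each stage maps an input to (label, confidence in [0, 1]).
Stage = Callable[[object], Tuple[str, float]]


def classify_with_delegation(x, stages: List[Stage],
                             threshold: float) -> Tuple[str, int]:
    """Run stages from cheapest to costliest and stop once confident enough.

    Returns the chosen label and the number of stages evaluated,
    which serves as a crude proxy for computational effort.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    label = ""
    for used, stage in enumerate(stages, start=1):
        label, confidence = stage(x)
        if confidence >= threshold:
            break  # confident enough: skip the remaining, costlier stages
    return label, used
```

Because `threshold` is an argument rather than something fixed during training, raising it sends more inputs to the later stages (more effort, potentially more accuracy) and lowering it does the opposite, which is one simple way a trade-off could be adjusted after training.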

Abstract:

Many species inhabit fragmented landscapes, resulting either from anthropogenic or from natural processes. The ecological and evolutionary dynamics of spatially structured populations are affected by a complex interplay between endogenous and exogenous factors. The metapopulation approach, simplifying the landscape to a discrete set of patches of breeding habitat surrounded by unsuitable matrix, has become a widely applied paradigm for the study of species inhabiting highly fragmented landscapes. In this thesis, I focus on the construction of biologically realistic models and their parameterization with empirical data, with the general objective of understanding how the interactions between individuals and their spatially structured environment affect ecological and evolutionary processes in fragmented landscapes. I study two hierarchically structured model systems: the Glanville fritillary butterfly in the Åland Islands, and a system of two interacting aphid species in the Tvärminne archipelago, both located in southwestern Finland. The interesting and challenging feature of both study systems is that the population dynamics occur over multiple spatial scales that are linked by various processes. My main emphasis is on the development of mathematical and statistical methodologies. For the Glanville fritillary case study, I first build a Bayesian framework for the estimation of death rates and capture probabilities from mark-recapture data, with the novelty of accounting for variation among individuals in capture probabilities and survival. I then characterize the dispersal phase of the butterflies by deriving a mathematical approximation of a diffusion-based movement model applied to a network of patches. I use the movement model as a building block to construct an individual-based evolutionary model for the Glanville fritillary butterfly metapopulation.
I parameterize the evolutionary model using a pattern-oriented approach, and use it to study how the landscape structure affects the evolution of dispersal. For the aphid case study, I develop a Bayesian model of hierarchical multi-scale metapopulation dynamics, where the observed extinction and colonization rates are decomposed into intrinsic rates operating specifically at each spatial scale. In summary, I show how analytical approaches, hierarchical Bayesian methods and individual-based simulations can be used individually or in combination to tackle complex problems from many different viewpoints. In particular, hierarchical Bayesian methods provide a useful tool for decomposing ecological complexity into more tractable components.

Abstract:

Placental abruption, one of the most significant causes of perinatal mortality and maternal morbidity, occurs in 0.5-1% of pregnancies. Its etiology is unknown, but defective trophoblastic invasion of the spiral arteries and consequent poor vascularization may play a role. The aim of this study was to define the prepregnancy risk factors of placental abruption, to define the risk factors during the index pregnancy, and to describe the clinical presentation of placental abruption. We also wanted to find a biochemical marker for predicting placental abruption early in pregnancy. Among women delivering at the University Hospital of Helsinki in 1997-2001 (n=46,742), 198 women with placental abruption and 396 control women were identified. The overall incidence of placental abruption was 0.42%. The prepregnancy risk factors were smoking (OR 1.7; 95% CI 1.1, 2.7), uterine malformation (OR 8.1; 1.7, 40), previous cesarean section (OR 1.7; 1.1, 2.8), and history of placental abruption (OR 4.5; 1.1, 18). The risk factors during the index pregnancy were maternal (adjusted OR 1.8; 95% CI 1.1, 2.9) and paternal smoking (2.2; 1.3, 3.6), use of alcohol (2.2; 1.1, 4.4), placenta previa (5.7; 1.4, 23.1), preeclampsia (2.7; 1.3, 5.6) and chorioamnionitis (3.3; 1.0, 10.0). Vaginal bleeding (70%), abdominal pain (51%), bloody amniotic fluid (50%) and fetal heart rate abnormalities (69%) were the most common clinical manifestations of placental abruption. Retroplacental blood clot was seen by ultrasound in 15% of the cases. Neither bleeding nor pain was present in 19% of the cases. Overall, 59% went into preterm labor (OR 12.9; 95% CI 8.3, 19.8), and 91% were delivered by cesarean section (34.7; 20.0, 60.1). Of the newborns, 25% were growth restricted. The perinatal mortality rate was 9.2% (OR 10.1; 95% CI 3.4, 30.1). We then tested selected biochemical markers for prediction of placental abruption. 
The median of the maternal serum alpha-fetoprotein (MSAFP) multiples of median (MoM) (1.21) was significantly higher in the abruption group (n=57) than in the control group (n=108) (1.07) (p=0.004) at 15-16 gestational weeks. In multivariate analysis, elevated MSAFP remained an independent risk factor for placental abruption, adjusting for parity ≥ 3, smoking, previous placental abruption, preeclampsia, bleeding in the second or third trimester, and placenta previa. MSAFP ≥ 1.5 MoM had a sensitivity of 29% and a false positive rate of 10%. The levels of the maternal serum free beta human chorionic gonadotrophin MoM did not differ between the cases and the controls. None of the angiogenic factors (soluble endoglin, soluble fms-like tyrosine kinase 1, or placental growth factor) showed any difference between the cases (n=42) and the controls (n=50) in the second trimester. The levels of C-reactive protein (CRP) showed no difference between the cases (n=181) and the controls (n=261) (median 2.35 mg/l [interquartile range {IQR} 1.09-5.93] versus 2.28 mg/l [IQR 0.92-5.01], not significant) when tested in the first trimester (mean 10.4 gestational weeks). Chlamydia pneumoniae specific immunoglobulin G (IgG) and immunoglobulin A (IgA) as well as C. trachomatis specific IgG, IgA and chlamydial heat-shock protein 60 antibody rates were similar between the groups. In conclusion, although univariate analysis identified many prepregnancy risk factors for placental abruption, only smoking, uterine malformation, previous cesarean section and history of placental abruption remained significant by multivariate analysis. During the index pregnancy, maternal alcohol consumption and smoking, and smoking by the partner, turned out to be the major independent risk factors for placental abruption. Smoking by both partners multiplied the risk. The liberal use of ultrasound examination contributed little to the management of women with placental abruption.
Although second-trimester MSAFP levels were higher in women with subsequent placental abruption, clinical usefulness of this test is limited due to low sensitivity and high false positive rate. Similarly, angiogenic factors in early second trimester, or CRP levels, or chlamydial antibodies in the first trimester failed to predict placental abruption.
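
The conclusion that the MSAFP test has limited clinical usefulness can be made concrete with Bayes' rule. The sketch below takes the sensitivity (29%) and false positive rate (10%) quoted in the abstract and, as an assumption not made in the abstract itself, uses the reported overall incidence of 0.42% as the pre-test probability. The function name is hypothetical.

```python
def positive_predictive_value(sensitivity: float,
                              false_positive_rate: float,
                              prevalence: float) -> float:
    """Bayes' rule: probability of the condition given a positive test."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and false positive rate as quoted in the abstract; the 0.42%
# incidence is used as the pre-test probability, which is an assumption.
ppv = positive_predictive_value(0.29, 0.10, 0.0042)
print(f"PPV = {ppv:.1%}")  # prints "PPV = 1.2%"
```

Under these assumptions, only about one in eighty women with MSAFP ≥ 1.5 MoM would go on to have a placental abruption.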

Abstract:

Cyclosporine is an immunosuppressant drug with a narrow therapeutic index and large variability in pharmacokinetics. To improve cyclosporine dose individualization in children, we used population pharmacokinetic modeling to study the effects of developmental, clinical, and genetic factors on cyclosporine pharmacokinetics in a total of 176 subjects (age range: 0.36–20.2 years) before and up to 16 years after renal transplantation. Pre-transplantation test doses of cyclosporine were given intravenously (3 mg/kg) and orally (10 mg/kg), on separate occasions, followed by blood sampling for 24 hours (n=175). After transplantation, in a total of 137 patients, cyclosporine concentration was quantified at trough, two hours post-dose, or with dose-interval curves. Of the studied patients, 104 were genotyped for 17 putatively functionally significant sequence variations in the ABCB1, SLCO1B1, ABCC2, CYP3A4, CYP3A5, and NR1I2 genes. Pharmacokinetic modeling was performed with the nonlinear mixed effects modeling computer program, NONMEM. A 3-compartment population pharmacokinetic model with first order absorption without lag-time was used to describe the data. The most important covariate affecting systemic clearance and distribution volume was allometrically scaled body weight, i.e. body weight raised to the power of 3/4 for clearance and absolute body weight for volume of distribution. The clearance adjusted by absolute body weight declined with age, and pre-pubertal children (< 8 years) had an approximately 25% higher clearance/body weight (L/h/kg) than did older children. Adjustment of clearance for allometric body weight removed its relationship to age after the first year of life. This finding is consistent with a gradual reduction in relative liver size towards adult values, and a relatively constant CYP3A content in the liver from about 6–12 months of age to adulthood.
The other significant covariates affecting cyclosporine clearance and volume of distribution were hematocrit, plasma cholesterol, and serum creatinine, explaining up to 20%–30% of inter-individual differences before transplantation. After transplantation, their predictive role was smaller, as the variations in hematocrit, plasma cholesterol, and serum creatinine were also smaller. Before transplantation, no clinical or demographic covariates were found to affect oral bioavailability, and no systematic age-related changes in oral bioavailability were observed. After transplantation, older children receiving cyclosporine twice daily as the gelatine capsule microemulsion formulation had approximately 1.25–1.3 times higher bioavailability than did the younger children receiving the liquid microemulsion formulation thrice daily. Moreover, cyclosporine oral bioavailability increased over 1.5-fold in the first month after transplantation, returning thereafter gradually to its initial value in 1–1.5 years. The largest cyclosporine doses were administered in the first 3–6 months after transplantation, and thereafter the single doses of cyclosporine were often smaller than 3 mg/kg. Thus, the results suggest that cyclosporine displays dose-dependent, saturable pre-systemic metabolism even at low single doses, whereas complete saturation of CYP3A4 and MDR1 (P-glycoprotein) renders cyclosporine pharmacokinetics dose-linear at higher doses. No significant associations were found between genetic polymorphisms and cyclosporine pharmacokinetics before transplantation in the whole population for which genetic data were available (n=104). However, in children older than eight years (n=22), heterozygous and homozygous carriers of the ABCB1 c.2677T or c.1236T alleles had approximately 1.3 or 1.6 times higher oral bioavailability, respectively, than did non-carriers.
After transplantation, none of the ABCB1 SNPs or any other SNPs were found to be associated with cyclosporine clearance or oral bioavailability in the whole population, in the patients older than eight years, or in the patients younger than eight years. In the whole population, in those patients carrying the NR1I2 g.-25385C–g.-24381A–g.-205_-200GAGAAG–g.7635G–g.8055C haplotype, however, the bioavailability of cyclosporine was about one tenth lower, per allele, than in non-carriers. This effect was significant also in a subgroup of patients older than eight years. Furthermore, in patients carrying the NR1I2 g.-25385C–g.-24381A–g.-205_-200GAGAAG–g.7635G–g.8055T haplotype, the bioavailability was almost one fifth higher, per allele, than in non-carriers. It may be possible to improve individualization of cyclosporine dosing in children by accounting for the effects of developmental factors (body weight, liver size), time after transplantation, and cyclosporine dosing frequency/formulation. Further studies are required on the predictive value of genotyping for individualization of cyclosporine dosing in children.
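
The allometric weight scaling described in this abstract (clearance proportional to body weight to the power 3/4, volume proportional to body weight) can be sketched as follows. The 70 kg reference weight, the standard clearance of 20 L/h, and the function names are illustrative assumptions, not estimates from the thesis.

```python
def allometric_clearance(cl_std: float, weight_kg: float,
                         ref_weight_kg: float = 70.0) -> float:
    """Clearance (L/h) scaled with body weight to the power 3/4."""
    return cl_std * (weight_kg / ref_weight_kg) ** 0.75


def allometric_volume(v_std: float, weight_kg: float,
                      ref_weight_kg: float = 70.0) -> float:
    """Volume of distribution (L) scaled linearly with body weight."""
    return v_std * (weight_kg / ref_weight_kg)


CL_STD = 20.0  # hypothetical clearance for the reference weight, L/h

for weight in (10.0, 20.0, 50.0):
    per_kg = allometric_clearance(CL_STD, weight) / weight
    print(f"{weight:4.0f} kg: {per_kg:.3f} L/h/kg")
```

Under this scaling, clearance per kilogram falls as weight rises, in proportion to weight to the power -1/4. A 20 kg child therefore has roughly 26% higher clearance per kilogram than a 50 kg child, whatever value is chosen for the hypothetical `CL_STD`. This shows why weight-normalized clearance appears to decline with age even when the allometrically scaled clearance is constant.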