12 results for predictive regression

in Helda - Digital Repository of University of Helsinki


Relevance:

30.00%

Publisher:

Abstract:

The low predictive power of implied volatility in forecasting the subsequently realized volatility is a well-documented empirical puzzle. As suggested by e.g. Feinstein (1989), Jackwerth and Rubinstein (1996), and Bates (1997), we test whether unrealized expectations of jumps in volatility could explain this phenomenon. Our findings show that expectations of infrequently occurring jumps in volatility are indeed priced in implied volatility. This has two important consequences. First, implied volatility is actually expected to exceed realized volatility over long periods of time, only to fall far below realized volatility during infrequently occurring periods of very high volatility. Second, the slope coefficient in the classic forecasting regression of realized volatility on implied volatility is very sensitive to the discrepancy between ex ante expected and ex post realized jump frequencies. If the in-sample frequency of positive volatility jumps is lower than ex ante assessed by the market, the classic regression test tends to reject the hypothesis of informational efficiency even if markets are informationally efficient.
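
The forecasting regression mentioned in this abstract is usually written as realized = alpha + beta * implied + error, with alpha = 0 and beta = 1 commonly taken as the efficiency benchmark. That benchmark, the function name and the volatility figures below are assumptions made for illustration; none of them come from the study. A minimal sketch of estimating the intercept and slope by ordinary least squares:

```python
from statistics import mean


def forecasting_regression(implied, realized):
    """OLS fit of realized = alpha + beta * implied + error (one regressor)."""
    x_bar, y_bar = mean(implied), mean(realized)
    sxx = sum((x - x_bar) ** 2 for x in implied)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(implied, realized))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical annualized volatilities, invented for this example only.
implied = [0.18, 0.20, 0.22, 0.25, 0.30]
realized = [0.15, 0.17, 0.18, 0.21, 0.24]
alpha, beta = forecasting_regression(implied, realized)
print(f"alpha={alpha:.3f}, beta={beta:.3f}")
```

In a sample without volatility jumps, a slope estimate below 1 would be read by the classic test as evidence against efficiency, which is the situation the abstract warns about.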

Relevance:

20.00%

Publisher:

Abstract:

Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling to various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox developed here, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly to justify the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can also be applied under sampling from a finite population. The main emphasis here is on probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example. 
The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.

Relevance:

20.00%

Publisher:

Abstract:

The focus of this study is on statistical analysis of categorical responses, where the response values are dependent on each other. The most typical example of this kind of dependence is when repeated responses have been obtained from the same study unit. For example, in Paper I, the response of interest is pneumococcal nasopharyngeal carriage (yes/no) in 329 children. For each child, the carriage is measured nine times during the first 18 months of life, and thus repeated responses on each child cannot be assumed independent of each other. In the case of the above example, the interest typically lies in the carriage prevalence, and whether different risk factors affect the prevalence. Regression analysis is the established method for studying the effects of risk factors. In order to make correct inferences from the regression model, the associations between repeated responses need to be taken into account. The analysis of repeated categorical responses typically focuses on regression modelling. However, further insights can also be gained by investigating the structure of the association. The central theme in this study is the development of joint regression and association models. The analysis of repeated, or otherwise clustered, categorical responses is computationally difficult. Likelihood-based inference is often feasible only when the number of repeated responses for each study unit is small. In Paper IV, an algorithm is presented which substantially facilitates maximum likelihood fitting, especially when the number of repeated responses increases. In addition, a notable result arising from this work is the freely available software for likelihood-based estimation of clustered categorical responses.

Relevance:

20.00%

Publisher:

Abstract:

Placental abruption, one of the most significant causes of perinatal mortality and maternal morbidity, occurs in 0.5-1% of pregnancies. Its etiology is unknown, but defective trophoblastic invasion of the spiral arteries and consequent poor vascularization may play a role. The aim of this study was to define the prepregnancy risk factors of placental abruption, to define the risk factors during the index pregnancy, and to describe the clinical presentation of placental abruption. We also wanted to find a biochemical marker for predicting placental abruption early in pregnancy. Among women delivering at the University Hospital of Helsinki in 1997-2001 (n=46,742), 198 women with placental abruption and 396 control women were identified. The overall incidence of placental abruption was 0.42%. The prepregnancy risk factors were smoking (OR 1.7; 95% CI 1.1, 2.7), uterine malformation (OR 8.1; 1.7, 40), previous cesarean section (OR 1.7; 1.1, 2.8), and history of placental abruption (OR 4.5; 1.1, 18). The risk factors during the index pregnancy were maternal (adjusted OR 1.8; 95% CI 1.1, 2.9) and paternal smoking (2.2; 1.3, 3.6), use of alcohol (2.2; 1.1, 4.4), placenta previa (5.7; 1.4, 23.1), preeclampsia (2.7; 1.3, 5.6) and chorioamnionitis (3.3; 1.0, 10.0). Vaginal bleeding (70%), abdominal pain (51%), bloody amniotic fluid (50%) and fetal heart rate abnormalities (69%) were the most common clinical manifestations of placental abruption. Retroplacental blood clot was seen by ultrasound in 15% of the cases. Neither bleeding nor pain was present in 19% of the cases. Overall, 59% went into preterm labor (OR 12.9; 95% CI 8.3, 19.8), and 91% were delivered by cesarean section (34.7; 20.0, 60.1). Of the newborns, 25% were growth restricted. The perinatal mortality rate was 9.2% (OR 10.1; 95% CI 3.4, 30.1). We then tested selected biochemical markers for prediction of placental abruption. 
The median of the maternal serum alpha-fetoprotein (MSAFP) multiples of median (MoM) (1.21) was significantly higher in the abruption group (n=57) than in the control group (n=108) (1.07) (p=0.004) at 15-16 gestational weeks. In multivariate analysis, elevated MSAFP remained as an independent risk factor for placental abruption, adjusting for parity ≥ 3, smoking, previous placental abruption, preeclampsia, bleeding in II or III trimester, and placenta previa. MSAFP ≥ 1.5 MoM had a sensitivity of 29% and a false positive rate of 10%. The levels of the maternal serum free beta human chorionic gonadotrophin MoM did not differ between the cases and the controls. None of the angiogenic factors (soluble endoglin, soluble fms-like tyrosine kinase 1, or placental growth factor) showed any difference between the cases (n=42) and the controls (n=50) in the second trimester. The levels of C-reactive protein (CRP) showed no difference between the cases (n=181) and the controls (n=261) (median 2.35 mg/l [interquartile range {IQR} 1.09-5.93] versus 2.28 mg/l [IQR 0.92-5.01], not significant) when tested in the first trimester (mean 10.4 gestational weeks). Chlamydia pneumoniae specific immunoglobulin G (IgG) and immunoglobulin A (IgA) as well as C. trachomatis specific IgG, IgA and chlamydial heat-shock protein 60 antibody rates were similar between the groups. In conclusion, although univariate analysis identified many prepregnancy risk factors for placental abruption, only smoking, uterine malformation, previous cesarean section and history of placental abruption remained significant by multivariate analysis. During the index pregnancy maternal alcohol consumption and smoking and smoking by the partner turned out to be the major independent risk factors for placental abruption. Smoking by both partners multiplied the risk. The liberal use of ultrasound examination contributed little to the management of women with placental abruption. 
Although second-trimester MSAFP levels were higher in women with subsequent placental abruption, clinical usefulness of this test is limited due to low sensitivity and high false positive rate. Similarly, angiogenic factors in early second trimester, or CRP levels, or chlamydial antibodies in the first trimester failed to predict placental abruption.
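
The odds ratios, confidence intervals, sensitivity and false positive rate quoted in this abstract can be illustrated with a short sketch. It computes a crude (unadjusted) odds ratio with a Woolf-type log-scale confidence interval from a 2x2 table, which is an assumed method; the abstract reports adjusted estimates from multivariate models and does not give the underlying counts. All counts and function names below are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with an approximate 95% CI on the log scale."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    se_log = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                       + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def sensitivity_and_fpr(true_pos, false_neg, false_pos, true_neg):
    """Sensitivity = TP / (TP + FN); false positive rate = FP / (FP + TN)."""
    return true_pos / (true_pos + false_neg), false_pos / (false_pos + true_neg)


# Hypothetical counts chosen only to exercise the functions.
print(odds_ratio_ci(20, 10, 10, 20))
print(sensitivity_and_fpr(29, 71, 10, 90))
```

A screening marker with low sensitivity misses most cases even when few controls test positive, which is why the abstract judges the clinical usefulness of the test to be limited.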

Relevance:

20.00%

Publisher:

Abstract:

This study examines the properties of Generalised Regression (GREG) estimators for domain class frequencies and proportions. The family of GREG estimators forms the class of design-based model-assisted estimators. All GREG estimators utilise auxiliary information via modelling. The classic GREG estimator with a linear fixed effects assisting model (GREG-lin) is one example. But when estimating class frequencies, the study variable is binary or polytomous. Therefore logistic-type assisting models (e.g. logistic or probit model) should be preferred over the linear one. However, GREG estimators other than GREG-lin are rarely used, and knowledge about their properties is limited. This study examines the properties of L-GREG estimators, which are GREG estimators with fixed-effects logistic-type models. Three research questions are addressed. First, I study whether and when L-GREG estimators are more accurate than GREG-lin. Theoretical results and Monte Carlo experiments which cover both equal and unequal probability sampling designs and a wide variety of model formulations show that in standard situations, the difference between L-GREG and GREG-lin is small. But in the case of a strong assisting model, two interesting situations arise: if the domain sample size is reasonably large, L-GREG is more accurate than GREG-lin, and if the domain sample size is very small, estimation of assisting model parameters may be inaccurate, resulting in bias for L-GREG. Second, I study variance estimation for the L-GREG estimators. The standard variance estimator (S) for all GREG estimators resembles the Sen-Yates-Grundy variance estimator, but it is a double sum of prediction errors, not of the observed values of the study variable. Monte Carlo experiments show that S underestimates the variance of L-GREG, especially if the domain sample size is small or if the assisting model is strong. 
Third, since the standard variance estimator S often fails for the L-GREG estimators, I propose a new augmented variance estimator (A). The difference between S and the new estimator A is that the latter takes into account the difference between the sample fit model and the census fit model. In Monte Carlo experiments, the new estimator A outperformed the standard estimator S in terms of bias, root mean square error and coverage rate. Thus the new estimator provides a good alternative to the standard estimator.
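
For orientation, a GREG estimator of a domain total is often written in textbook form as the sum of model predictions over all population units in the domain plus the design-weighted sum of prediction errors over the sampled units in the domain. The sketch below assumes that form; it is not taken from the thesis, and the function name and numbers are hypothetical. For an L-GREG estimator, the predictions would be fitted probabilities from a logistic-type assisting model.

```python
def greg_domain_total(pred_domain_population, y_sample, pred_sample, incl_prob):
    """Textbook-style GREG estimate of a domain total.

    pred_domain_population: model predictions for every population unit in the domain
    y_sample, pred_sample:  observed values and predictions for sampled domain units
    incl_prob:              inclusion probabilities of the sampled units
    """
    synthetic_part = sum(pred_domain_population)
    # Design-weighted prediction errors correct the synthetic part.
    correction = sum((y - p) / pi
                     for y, p, pi in zip(y_sample, pred_sample, incl_prob))
    return synthetic_part + correction


# Hypothetical domain with four population units, two of them sampled.
estimate = greg_domain_total(
    pred_domain_population=[0.2, 0.5, 0.8, 0.5],
    y_sample=[1, 0],
    pred_sample=[0.8, 0.5],
    incl_prob=[0.5, 0.5],
)
print(estimate)
```

The correction term is built from prediction errors rather than observed values, which mirrors the abstract's remark that the standard variance estimator is a double sum of prediction errors.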

Relevance:

20.00%

Publisher:

Abstract:

This study investigates the role of social media as a form of organizational knowledge sharing. Social media is investigated in terms of the Web 2.0 technologies that organizations provide their employees as tools of internal communication. This study is anchored in the theoretical understanding of social media as technologies which enable both knowledge collection and knowledge donation. This study investigates the factors influencing employees’ use of social media in their working environment. The study presents the multidisciplinary research tradition concerning knowledge sharing. Social media is analyzed especially in relation to internal communication and knowledge sharing. Based on previous studies, it is assumed that personal, organizational, and technological factors influence employees’ use of social media in their working environment. The research is a case study focusing on the employees of the Finnish company Wärtsilä. Wärtsilä is a suitable case organization for this study because it uses several Web 2.0 tools in its intranet. The research is based on quantitative methods. In total, 343 answers were obtained through an online survey which was available in Wärtsilä’s intranet. The associations between the variables are analyzed using correlations. Finally, multiple linear regression analysis is used to test the causality between the assumed factors and the use of social media. The analysis demonstrates that personal, organizational and technological factors influence the respondents’ use of social media. The strongest predictive variables are the benefits that respondents expect to receive from using social media and respondents’ experience in using Web 2.0 in their private lives. Organizational factors such as managers’ and colleagues’ activeness and organizational guidelines for using social media also form a causal relationship with the use of social media. 
In addition, respondents’ understanding of their responsibilities affects their use of social media. The more social media is considered a part of individual responsibilities, the more frequently it is used. Finally, technological factors must be recognized. The more user-friendly the social media tools are considered to be and the better the respondents’ technical skills, the more frequently social media is used in the working environment. The central references in relation to knowledge sharing include Chun Wei Choo’s (2006) work Knowing Organization, Ikujiro Nonaka and Hirotaka Takeuchi’s (1995) work The Knowledge Creating Company and Linda Argote’s (1999) work Organizational Learning.

Relevance:

20.00%

Publisher:

Abstract:

In this thesis, I study the changing landscape and human environment of the Mätäjoki Valley, western Helsinki, using reconstructions and predictive modelling. The study is part of a larger project funded by the city of Helsinki aiming to map the past of the Mätäjoki Valley. The changes in landscape from an archipelago in the Ancylus Lake to a river valley are studied from 10000 to 2000 years ago. Alongside shore displacement, we look at the changing environment from a human perspective and predict the location of dwelling sites at various times. As a result, two map series were produced that show how the landscape changed and where habitation is predicted. To back them up, we have also looked at what previous research says about the history of the waterways, climate, vegetation and archaeology. The changing landscape of the river valley is reconstructed using GIS methods. For this purpose, a new laser point data set was used and at the same time tested in the context of landscape modelling. Dwelling sites were modelled with logistic regression analysis. The spatial predictive model combines data on the locations of the known dwelling sites, environmental factors and shore displacement data. The predictions were visualised as raster maps that show the predictions for habitation 3000 and 5000 years ago. The aim of these maps was to help archaeologists map potential spots for human activity. The produced landscape reconstructions clarified previous shore displacement studies of the Mätäjoki region and provided new information on the location of the shoreline. The following stages emerge from the shore displacement history of the Mätäjoki Valley: 1. The northernmost hills of the Mätäjoki Valley rose from Ancylus Lake approximately 10000 years ago. Shore displacement was fast during the following thousand years. 2. The area was an archipelago with a relatively steady shoreline 9000-7000 years ago. 
8000 years ago the shoreline drew back in the middle and southern parts of the river valley because of the transgression of the Litorina Sea. 3. Mätäjoki was a sheltered bay of the Litorina Sea 6000-5000 years ago. The Vantaanjoki River started to flow into the Mätäjoki Valley approximately 5000 years ago. 4. The sediment plains in the southern part of the river valley rose from the sea rather quickly 5000-3000 years ago. Salt water still pushed its way into the southernmost part of the valley 4000 years ago. 5. The shoreline proceeded to the Pitäjänmäki rapids, where it stayed for at least a thousand years, 3000-2000 years ago. The predictive models managed to predict the locations of dwelling sites moderately well. The most accurate predictions were found on the eastern shore and in the Malminkartano area. Of the environmental variables, sand and aspect of slope were found to have the best predictive power. From the results of this study we can conclude that the Mätäjoki Valley was a favorable location to live, especially 6000-5000 years ago when the climate was mild and vegetation lush. The laser point data set used here works best in shore displacement studies located in rural areas or if further specific palaeogeographic or hydrologic analysis in the research area is not needed.
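
A site-prediction model of the kind described in this abstract can be sketched as a logistic regression that turns environmental variables for each raster cell into a probability of a dwelling site. The sketch below is a generic gradient-descent fit, not the GIS workflow used in the thesis; the two binary features (sandy soil, south-facing slope), the training cells and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, n_iter=2000):
    """Fit logistic regression weights by batch gradient descent.

    Returns a list whose first element is the intercept.
    """
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - y
            grad[0] += err
            for j, xj in enumerate(x, start=1):
                grad[j] += err * xj
        w = [wj - lr * g / len(rows) for wj, g in zip(w, grad)]
    return w


def predict(w, x):
    """Predicted probability of a dwelling site for one raster cell."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Hypothetical cells: [sandy soil (0/1), south-facing slope (0/1)]; 1 = known site.
cells = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0], [1, 0], [0, 1]]
sites = [1, 1, 0, 0, 1, 0, 0, 1]
weights = fit_logistic(cells, sites)
print(predict(weights, [1, 1]), predict(weights, [0, 0]))
```

Applying `predict` to every cell of a grid would give the kind of probability raster that the abstract describes as a map of predicted habitation.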

Relevance:

20.00%

Publisher:

Abstract:

The factors affecting the non-industrial, private forest landowners' (hereafter referred to using the acronym NIPF) strategic decisions in management planning are studied. A genetic algorithm is used to induce a set of rules predicting potential cut of the landowners' choices of preferred timber management strategies. The rules are based on variables describing the characteristics of the landowners and their forest holdings. The predictive ability of a genetic algorithm is compared to linear regression analysis using identical data sets. The data are cross-validated seven times applying both genetic algorithm and regression analyses in order to examine the data-sensitivity and robustness of the generated models. The optimal rule set derived from genetic algorithm analyses included the following variables: mean initial volume, landowner's positive price expectations for the next eight years, landowner being classified as farmer, and preference for the recreational use of forest property. When tested with previously unseen test data, the optimal rule set resulted in a relative root mean square error of 0.40. In the regression analyses, the optimal regression equation consisted of the following variables: mean initial volume, proportion of forestry income, intention to cut extensively in future, and positive price expectations for the next two years. The R2 of the optimal regression equation was 0.34 and the relative root mean square error obtained from the test data was 0.38. In both models, mean initial volume and positive stumpage price expectations were entered as significant predictors of potential cut of preferred timber management strategy. When tested with the complete data set of 201 observations, both the optimal rule set and the optimal regression model achieved the same level of accuracy.
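
The comparison in this abstract rests on cross-validated prediction error. The sketch below shows one way such a check could look for a one-variable linear regression. It assumes that "cross-validated seven times" means a seven-fold split and that relative root mean square error means RMSE divided by the mean of the observed values; the abstract confirms neither, and the data and function names are hypothetical.

```python
import math
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def relative_rmse(y_true, y_pred):
    """RMSE divided by the mean observed value (assumed definition)."""
    rmse = math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))
    return rmse / mean(y_true)


def kfold_relative_rmse(x, y, k=7):
    """Relative RMSE on each held-out fold; fold membership is index modulo k."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(x)) if i % k != fold]
        test = [i for i in range(len(x)) if i % k == fold]
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        preds = [intercept + slope * x[i] for i in test]
        scores.append(relative_rmse([y[i] for i in test], preds))
    return scores


# Hypothetical data: mean initial volume (x) against potential cut (y).
volume = [80, 95, 110, 120, 135, 150, 160, 175, 190, 205, 220, 240, 255, 270]
cut = [30, 41, 39, 52, 55, 61, 58, 70, 79, 77, 90, 93, 104, 101]
print(kfold_relative_rmse(volume, cut))
```

Scoring every model on the same held-out folds is what makes an error comparison between two methods, such as a rule set and a regression equation, meaningful.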

Relevance:

20.00%

Publisher:

Abstract:

Modelling of energy balance is part of the development work related to the KarjaKompassi project. The aim of this thesis was to develop mathematical models that predict the energy balance of a dairy cow in advance and that make use of data obtained during the lactation period. The explanatory variables were diet, feed, milk yield, test-day milking, live weight and body condition score data. The data were collected from 12 feeding experiments conducted in Finland, lasting 8-28 lactation weeks and starting immediately after calving. Of the 344 dairy cows included, one quarter were Friesian and the rest Ayrshire. The main data file for older cows contained 2647 observations (experiment * cow * lactation week) and that for first-calf heifers 1070. The data were processed using the Mixed procedure of the SAS software, and outlying observations were removed using Tukey's method. Correlation analysis was used to examine the relationships between energy balance and the explanatory variables. Energy balance was modelled using regression analysis. The effect of day of lactation on energy balance was described using five different functions. Cow within experiment was the random factor in the model. The fit of the model to the data was assessed using the residual error, the coefficient of determination and the Bayesian information criterion. The best models were tested on an independent data set. The effect of day of lactation on energy balance was described well by the Ali-Schaeffer function, which was used as the base model. In all energy balance models, variation increased from lactation week 12 onwards, as the number of observations decreased and energy balance turned positive. Of the variables available before calving, the proportion of concentrate in the diet and the concentrate intake index improved the coefficient of determination and reduced the residual error. The success of feeding can be monitored with models that include milk yield, milk fat content and fat-protein ratio, or energy-corrected milk (ECM). Standardizing ECM reduced the residual error of the model. Live weight and body condition score were weak explanatory variables. 
The models can be used in planning and monitoring feeding at herd level, but they are not suitable for predicting the energy balance of an individual cow.
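
The Ali-Schaeffer function named in this abstract is commonly cited as a five-parameter curve in days in milk t: y(t) = a + b*(t/305) + c*(t/305)^2 + d*ln(305/t) + e*ln(305/t)^2. The sketch below assumes this commonly cited form and a 305-day reference lactation; the abstract itself gives neither the formula nor any coefficients, so the parameter values are hypothetical.

```python
import math


def ali_schaeffer_terms(t, lactation_length=305.0):
    """Regression terms [1, g, g^2, w, w^2] for day of lactation t (t > 0)."""
    g = t / lactation_length
    w = math.log(lactation_length / t)
    return [1.0, g, g ** 2, w, w ** 2]


def ali_schaeffer(t, a, b, c, d, e, lactation_length=305.0):
    """Value of the curve at day t for coefficients a..e (assumed form)."""
    terms = ali_schaeffer_terms(t, lactation_length)
    return sum(coef * term for coef, term in zip((a, b, c, d, e), terms))


# Hypothetical coefficients, used only to show how the curve is evaluated.
for day in (7, 30, 84, 196):
    print(day, round(ali_schaeffer(day, 10.0, 5.0, -2.0, -8.0, 1.0), 2))
```

Because the curve is linear in its coefficients, the terms returned by `ali_schaeffer_terms` can serve as columns of a design matrix in a regression or mixed model, alongside covariates such as diet or milk yield.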