9 results for HA Statistics

in Glasgow Theses Service


Relevance:

70.00%

Publisher:

Abstract:

Undoubtedly, statistics has become one of the most important subjects in the modern world, where its applications are ubiquitous. The importance of statistics is not limited to statisticians; it also extends to non-statisticians who have to use statistics within their own disciplines. Several studies have indicated that most academic departments around the world have realized the importance of statistics to non-specialist students. Therefore, the number of students enrolled in statistics courses, coming from a variety of disciplines, has vastly increased. Consequently, research in statistics education has developed considerably over the last few years. One important issue is how statistics is best taught to, and learned by, non-specialist students. This issue is shaped by several factors that affect the learning and teaching of statistics to non-specialist students, such as the use of technology, the role of the English language (especially for those whose first language is not English), the effectiveness of statistics teachers and their approach to teaching statistics courses, students' motivation to learn statistics, and the relevance of statistics courses to the main subjects of non-specialist students. Several studies focused on aspects of learning and teaching statistics have been conducted in different countries around the world, particularly in Western countries. Conversely, the situation in Arab countries, especially in Saudi Arabia, is different: there is very little research in this area, and what there is does not meet those countries' needs for developing the learning and teaching of statistics to non-specialist students. This research was undertaken in order to develop the field of statistics education. The purpose of this mixed methods study was to generate new insights into this subject by investigating how statistics courses are currently taught to non-specialist students in Saudi universities. Hence, this study will contribute towards filling the knowledge gap that exists in Saudi Arabia. This study used multiple data collection approaches, including questionnaire surveys of 1053 non-specialist students who had completed at least one statistics course in different colleges of the universities in Saudi Arabia. These surveys were followed up with qualitative data collected via semi-structured interviews with 16 teachers of statistics from colleges within all six universities where statistics is taught to non-specialist students in Saudi Arabia's Eastern Region. The questionnaire data were of several types, so different techniques were used in the analysis. Descriptive statistics were used to identify the demographic characteristics of the participants. The chi-square test was used to determine associations between variables. Based on the main issues raised in the literature review, the questions (item scales) were grouped into five key groups: 1) Effectiveness of Teachers; 2) English Language; 3) Relevance of Course; 4) Student Engagement; 5) Using Technology. Exploratory data analysis was used to explore these issues in more detail. Furthermore, given the clustering in the data (students within departments, within colleges, within universities), multilevel generalized linear models for dichotomous outcomes were used to clarify the effects of clustering at those levels. Factor analysis was conducted to confirm the dimension reduction of the variables (item scales). The data from teachers' interviews were analysed on an individual basis. The responses were assigned to one of eight themes that emerged from the data: 1) the lack of students' motivation to learn statistics; 2) students' participation; 3) students' assessment; 4) the effective use of technology; 5) the level of previous mathematical and statistical skills of non-specialist students; 6) the English language ability of non-specialist students; 7) the need for extra time for teaching and learning statistics; and 8) the role of administrators. All the data from students and teachers indicated that the situation of learning and teaching statistics to non-specialist students in Saudi universities needs to be improved in order to meet the needs of those students. The findings suggested a weakness in the use of statistical software applications in these courses: there is a lack of application of technology, such as statistical software programs, that would allow non-specialist students to consolidate their knowledge. The results also indicated that the English language is considered one of the main challenges in learning and teaching statistics, particularly in institutions where English is not used as the main language. Moreover, students' weak mathematical skills are considered another major challenge. Additionally, the results indicated that there is a need to tailor statistics courses to the needs of non-specialist students based on their main subjects. The findings also indicate that statistics teachers need to choose appropriate methods when teaching statistics courses.
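As an illustration of the kind of chi-square association test described above, the sketch below uses invented counts (not the thesis data) to test whether students' reported course relevance is associated with their college; scipy is assumed to be available.

```python
# Illustrative only: invented counts, not the thesis data.
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = colleges,
# columns = (course seen as relevant, not relevant).
table = np.array([[120,  80],
                  [ 95, 105],
                  [140,  60]])

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
```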

Relevance:

60.00%

Publisher:

Abstract:

This PhD thesis contains three main chapters on macro finance, with a focus on the term structure of interest rates and the applications of state-of-the-art Bayesian econometrics. Except for Chapter 1 and Chapter 5, which set out the general introduction and conclusion, each of the chapters can be considered as a standalone piece of work. In Chapter 2, we model and predict the term structure of US interest rates in a data-rich environment. We allow the model dimension and parameters to change over time, accounting for model uncertainty and sudden structural changes. The proposed time-varying parameter Nelson-Siegel Dynamic Model Averaging (DMA) predicts yields better than standard benchmarks. DMA performs better because it incorporates more macro-finance information during recessions. The proposed method allows us to estimate plausible real-time term premia, whose countercyclicality weakened during the financial crisis. Chapter 3 investigates global term structure dynamics using a Bayesian hierarchical factor model augmented with macroeconomic fundamentals. More than half of the variation in the bond yields of seven advanced economies is due to global co-movement. Our results suggest that global inflation is the most important factor among global macro fundamentals. Non-fundamental factors are essential in driving global co-movements, and are closely related to sentiment and economic uncertainty. Lastly, we analyze asymmetric spillovers in global bond markets connected to diverging monetary policies. Chapter 4 proposes a no-arbitrage framework of term structure modeling with learning and model uncertainty. The representative agent considers parameter instability, as well as the uncertainty in learning speed and model restrictions. The empirical evidence shows that, apart from observational variance, parameter instability is the dominant source of predictive variance when compared with uncertainty in learning speed or model restrictions. When accounting for ambiguity aversion, the out-of-sample predictability of excess returns implied by the learning model can be translated into significant and consistent economic gains over the Expectations Hypothesis benchmark.
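For context, the Nelson-Siegel curve underlying the DMA model of Chapter 2 expresses the yield at maturity tau as a combination of level, slope and curvature factors. The sketch below is a generic illustration with an assumed decay parameter (a commonly used value, not a thesis estimate).

```python
import numpy as np

def nelson_siegel_yield(tau, beta1, beta2, beta3, lam=0.0609):
    """Nelson-Siegel yield at maturity tau (in months): level, slope and
    curvature factors. lam = 0.0609 is a commonly used decay value, not a
    thesis estimate."""
    slope_loading = (1 - np.exp(-lam * tau)) / (lam * tau)
    curvature_loading = slope_loading - np.exp(-lam * tau)
    return beta1 + beta2 * slope_loading + beta3 * curvature_loading

# Example: level 4%, slope -2%, curvature 1% at selected maturities (months).
maturities = np.array([3.0, 12.0, 60.0, 120.0])
print(nelson_siegel_yield(maturities, 4.0, -2.0, 1.0))
```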

Relevance:

60.00%

Publisher:

Abstract:

Many exchange rate papers articulate the view that instabilities constitute a major impediment to exchange rate predictability. In this thesis we implement Bayesian and other techniques to account for such instabilities, and examine some of the main obstacles to exchange rate models' predictive ability. We first consider in Chapter 2 a time-varying parameter model in which fluctuations in exchange rates are related to short-term nominal interest rates ensuing from monetary policy rules, such as Taylor rules. Unlike existing exchange rate studies, the parameters of our Taylor rules are allowed to change over time, in light of the widespread evidence of shifts in fundamentals - for example in the aftermath of the Global Financial Crisis. Focusing on quarterly data from the crisis onwards, we detect forecast improvements upon a random walk (RW) benchmark for at least half, and for as many as seven out of 10, of the currencies considered. Results are stronger when we allow the time-varying parameters of the Taylor rules to differ between countries. In Chapter 3 we look closely at the role of time-variation in parameters and other sources of uncertainty in hindering exchange rate models' predictive power. We apply a Bayesian setup that incorporates the notion that the relevant set of exchange rate determinants, and their corresponding coefficients, change over time. Using statistical and economic measures of performance, we first find that predictive models which allow for sudden, rather than smooth, changes in the coefficients yield significant forecast improvements and economic gains at horizons beyond 1 month. At shorter horizons, however, our methods fail to forecast better than the RW, and we identify uncertainty in the estimation of the coefficients, and uncertainty about the precise degree of coefficient variability to incorporate in the models, as the main factors obstructing predictive ability. Chapter 4 focuses on the problem of the time-varying predictive ability of economic fundamentals for exchange rates. It uses bootstrap-based methods to uncover the time-specific conditioning information for predicting fluctuations in exchange rates. Employing several metrics for the statistical and economic evaluation of forecasting performance, we find that our approach, based on pre-selecting and validating fundamentals across bootstrap replications, generates more accurate forecasts than the RW. The approach, known as bumping, robustly reveals parsimonious models with out-of-sample predictive power at the 1-month horizon, and outperforms alternative methods, including Bayesian, bagging, and standard forecast combinations. Chapter 5 exploits the predictive content of daily commodity prices for monthly commodity-currency exchange rates. It builds on the idea that the effect of daily commodity price fluctuations on commodity currencies is short-lived, and therefore harder to pin down at low frequencies. Using MIxed DAta Sampling (MIDAS) models, and Bayesian estimation methods to account for time-variation in predictive ability, the chapter demonstrates the usefulness of suitably exploiting such short-lived effects in improving exchange rate forecasts. It further shows that the usual low-frequency predictors, such as money supplies and interest rate differentials, typically receive little support from the data at the monthly frequency, whereas MIDAS models featuring daily commodity prices receive strong support. The chapter also introduces the random walk Metropolis-Hastings technique as a new tool to estimate MIDAS regressions.
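As a generic illustration of the random walk Metropolis-Hastings technique mentioned above (a toy target density, not the MIDAS regression posterior from the thesis), each iteration perturbs the current parameter vector and accepts the proposal with probability min(1, posterior ratio).

```python
import numpy as np

def rw_metropolis_hastings(log_post, theta0, n_iter=5000, step=0.1, seed=0):
    """Generic random walk Metropolis-Hastings sampler (toy illustration)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    samples = np.empty((n_iter, theta.size))
    current_lp = log_post(theta)
    for i in range(n_iter):
        proposal = theta + step * rng.standard_normal(theta.size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, exp(proposal_lp - current_lp)).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            theta, current_lp = proposal, proposal_lp
        samples[i] = theta
    return samples

# Toy target: a standard bivariate normal log-density (up to a constant).
draws = rw_metropolis_hastings(lambda t: -0.5 * np.sum(t ** 2), theta0=[0.0, 0.0])
print(draws.mean(axis=0), draws.std(axis=0))
```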

Relevance:

60.00%

Publisher:

Abstract:

Critical infrastructures are based on complex systems that provide vital services to the nation. The complexities of the interconnected networks, each managed by an individual organisation, could, if not properly secured, offer vulnerabilities that threaten other organisations' systems that depend on their services. This thesis argues that awareness of interdependencies among critical sectors needs to be increased. Managing and securing critical infrastructure is not the isolated responsibility of a government or an individual organisation. There is a need for strong collaboration among critical service providers of public and private organisations in protecting critical information infrastructure. Cyber exercises have been incorporated in national cyber security strategies as part of critical information infrastructure protection. However, organising a cyber exercise involving multiple sectors is challenging due to the diversity of participants' backgrounds, working environments and incident response policies. How well the lessons learned from a cyber exercise transfer to the participating organisations is still an open question. In order to understand the implications of cyber exercises for what participants have learnt and how they benefit participants' organisations, a Cyber Exercise Post Assessment (CEPA) framework was proposed in this research. The CEPA framework consists of two parts. The first part aims to investigate the lessons learnt by participants from a cyber exercise, using the four levels of the Kirkpatrick Training Model to identify their perceptions of the reaction, learning, behaviour and results of the exercise. The second part investigates the Organisation Cyber Resilience (OCR) of the participating sectors. The framework was used to study the impact of the cyber exercise called X Maya in Malaysia. Data collected through interviews with X Maya 5 participants were coded and categorised into four levels according to the Kirkpatrick Training Model, while online surveys were distributed to ten Critical National Information Infrastructure (CNII) sectors that participated in the exercise. The survey used the C-Suite Executive Checklist developed by the World Economic Forum in 2012. To ensure the suitability of the tool used to investigate the OCR, a reliability test was conducted on the survey items and showed high internal consistency. Finally, individual OCR scores were used to develop the OCR Maturity Model, which provides an organisation cyber resilience perspective on the ten CNII sectors.
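The internal-consistency check mentioned above is typically Cronbach's alpha; the sketch below computes it on invented Likert responses (a minimal illustration, not the C-Suite Executive Checklist data).

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 5-point Likert responses: six respondents, four survey items.
responses = np.array([[4, 5, 4, 4],
                      [3, 3, 2, 3],
                      [5, 5, 5, 4],
                      [2, 2, 3, 2],
                      [4, 4, 4, 5],
                      [3, 2, 3, 3]])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```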

Relevance:

60.00%

Publisher:

Abstract:

The aim of this thesis is to review and augment the theory and methods of optimal experimental design. In Chapter 1 the scene is set by considering the possible aims of an experimenter prior to an experiment, the statistical methods one might use to achieve those aims, and how experimental design might aid this procedure. It is indicated that, given a design criterion, a priori optimal design will only be possible in certain instances; otherwise, some form of sequential procedure seems to be required. In Chapter 2 an exact experimental design problem is formulated mathematically and is compared with its continuous analogue. Motivation is provided for the solution of this continuous problem, and the remainder of the chapter concerns this problem. A necessary and sufficient condition for the optimality of a design measure is given. Problems which might arise in testing this condition are discussed, in particular with respect to possible non-differentiability of the criterion function at the design being tested. Several examples are given of optimal designs which may be found analytically and which illustrate the points discussed earlier in the chapter. In Chapter 3 numerical methods of solution of the continuous optimal design problem are reviewed. A new algorithm is presented with illustrations of how it should be used in practice. It is shown that, for reasonably large sample sizes, continuously optimal designs may be approximated well by an exact design. In situations where this is not satisfactory, algorithms for improving the design are reviewed. Chapter 4 consists of a discussion of sequentially designed experiments, with regard both to the underlying philosophies of statistical inference and to the application of its methods. In Chapter 5 we constructively criticise previous suggestions for fully sequential design procedures. Alternative suggestions are made, along with conjectures as to how these might improve performance. Chapter 6 presents a simulation study, the aim of which is to investigate the conjectures of Chapter 5. The results of this study provide empirical support for these conjectures. In Chapter 7 examples are analysed. These suggest ways of aiding sequential experimentation by reducing the dimension of the design space and by experimenting semi-sequentially. Further examples are considered which stress the importance of the use of prior information in situations of this type. Finally, we consider the design of experiments when semi-sequential experimentation is mandatory because of the necessity of taking batches of observations at the same time. In Chapter 8 we look at some of the assumptions which have been made and indicate what may go wrong when these assumptions no longer hold.
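For D-optimality, a necessary and sufficient condition of the kind referred to in Chapter 2 is commonly stated via the Kiefer-Wolfowitz equivalence theorem: a design measure xi is optimal exactly when the standardised variance d(x, xi) = f(x)' M(xi)^{-1} f(x) is at most p (the number of parameters) over the whole design space. The sketch below checks this numerically for straight-line regression on [-1, 1], an illustrative example rather than one taken from the thesis.

```python
import numpy as np

def standardised_variance(x, support, weights):
    """d(x, xi) = f(x)' M(xi)^{-1} f(x) for the straight-line model f(x) = (1, x)."""
    M = sum(w * np.outer([1.0, s], [1.0, s]) for s, w in zip(support, weights))
    f = np.array([1.0, x])
    return f @ np.linalg.inv(M) @ f

# Candidate design measure: mass 1/2 at each of -1 and +1.
support, weights = [-1.0, 1.0], [0.5, 0.5]
grid = np.linspace(-1.0, 1.0, 201)
d_max = max(standardised_variance(x, support, weights) for x in grid)
# d(x, xi) <= p = 2 everywhere, with equality at the support points,
# so this design is D-optimal by the equivalence theorem.
print(d_max)
```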

Relevance:

60.00%

Publisher:

Abstract:

The long-term adverse effects on health associated with air pollution exposure can be estimated using either cohort or spatio-temporal ecological designs. In a cohort study, the health status of a cohort of people is assessed periodically over a number of years, and then related to estimated ambient pollution concentrations in the cities in which they live. However, such cohort studies are expensive and time consuming to implement, due to the long-term follow up required for the cohort. Therefore, spatio-temporal ecological studies are also being used to estimate the long-term health effects of air pollution, as they are easy to implement due to the routine availability of the required data. Spatio-temporal ecological studies estimate the health impact of air pollution by utilising geographical and temporal contrasts in air pollution and disease risk across n contiguous small areas, such as census tracts or electoral wards, for multiple time periods. The disease data are counts of the numbers of disease cases occurring in each areal unit and time period, and thus Poisson log-linear models are typically used for the analysis. The linear predictor includes pollutant concentrations and known confounders such as socio-economic deprivation. However, as the disease data typically contain residual spatial or spatio-temporal autocorrelation after the covariate effects have been accounted for, these known covariates are augmented by a set of random effects. One key problem in these studies is obtaining spatially representative pollution concentrations for each areal unit; these are typically estimated by applying Kriging to data from a sparse monitoring network, or by computing averages over modelled (grid-level) concentrations from an atmospheric dispersion model. The aim of this thesis is to investigate the health effects of long-term exposure to nitrogen dioxide (NO2) and particulate matter (PM10) in mainland Scotland, UK. In order to gain an initial impression of the air pollution health effects in mainland Scotland, chapter 3 presents a standard epidemiological study using a benchmark method. The remaining main chapters (4, 5, 6) cover the main methodological focus of this thesis, which is threefold: (i) how to better estimate pollution by developing a multivariate spatio-temporal fusion model that relates monitored and modelled pollution data over space, time and pollutant; (ii) how to simultaneously estimate the joint effects of multiple pollutants; and (iii) how to allow for the uncertainty in the estimated pollution concentrations when estimating their health effects. Specifically, chapters 4 and 5 address (i), while chapter 6 focuses on (ii) and (iii). In chapter 4, I propose an integrated model for estimating the long-term health effects of NO2 that fuses modelled and measured pollution data to provide improved predictions of areal level pollution concentrations and hence health effects. The proposed air pollution fusion model is a Bayesian space-time linear regression model relating the measured concentrations to the modelled concentrations for a single pollutant, whilst allowing for additional covariate information such as site type (e.g. roadside, rural) and temperature. However, it is known that some pollutants may be correlated because they are generated by common processes or driven by similar factors such as meteorology. This correlation between pollutants can help to predict one pollutant by borrowing strength from the others. Therefore, in chapter 5, I propose a multi-pollutant model: a multivariate spatio-temporal fusion model that extends the single-pollutant model of chapter 4 and relates monitored and modelled pollution data over space, time and pollutant to predict pollution across mainland Scotland. Because the air we breathe contains a complex mixture of particle- and gas-phase pollutants, and we are therefore exposed to multiple pollutants simultaneously, chapter 6 investigates the health effects of exposure to multiple pollutants, a natural extension of the single-pollutant health analysis in chapter 4. Given that NO2 and PM10 are highly correlated in my data (a multicollinearity issue), I first propose a temporally-varying linear model to regress one pollutant (e.g. NO2) against another (e.g. PM10) and then use the residuals in the disease model alongside PM10, thus investigating the health effects of exposure to both pollutants simultaneously. Another issue considered in chapter 6 is allowing for the uncertainty in the estimated pollution concentrations when estimating their health effects; in total, four approaches are developed to adjust for this exposure uncertainty. Finally, chapter 7 summarises the work contained within this thesis and discusses the implications for future research.
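A minimal sketch of the Poisson log-linear disease model described above, on invented data and omitting the spatio-temporal random effects and pollution fusion (so it illustrates only the basic form), assuming statsmodels is available:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200                                    # areal-unit x time-period observations
no2 = rng.normal(20.0, 5.0, n)             # invented NO2 concentrations
deprivation = rng.normal(0.0, 1.0, n)      # invented deprivation scores
expected = rng.uniform(50.0, 150.0, n)     # expected disease counts
risk = np.exp(0.01 * no2 + 0.1 * deprivation)
y = rng.poisson(expected * risk)           # simulated observed counts

X = sm.add_constant(np.column_stack([no2, deprivation]))
model = sm.GLM(y, X, family=sm.families.Poisson(), offset=np.log(expected))
result = model.fit()
# exp(coefficient on NO2) approximates the relative risk per unit increase in NO2.
print(result.params)
```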

Relevance:

60.00%

Publisher:

Abstract:

Understanding how virus strains offer protection against closely related emerging strains is vital for creating effective vaccines. For many viruses, including Foot-and-Mouth Disease Virus (FMDV) and the Influenza virus, where multiple serotypes often co-circulate, in vitro testing of large numbers of vaccines can be infeasible. Therefore the development of an in silico predictor of cross-protection between strains is important to help optimise vaccine choice. Vaccines will offer cross-protection against closely related strains, but not against those that are antigenically distinct. To be able to predict cross-protection we must understand the antigenic variability within a virus serotype and within distinct lineages of a virus, and identify the antigenic residues and evolutionary changes that cause the variability. In this thesis we present a family of sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution (SABRE), as well as an extended version of the method, the extended SABRE (eSABRE) method, which better takes into account the data collection process. The SABRE methods use spike and slab priors to identify sites in the viral protein which are important for the neutralisation of the virus. In this thesis we demonstrate how the SABRE methods can be used to identify antigenic residues within different serotypes, and show how the SABRE method outperforms established methods (mixed-effects models based on forward variable selection or L1 regularisation) on both synthetic and viral datasets. In addition, we test a number of different versions of the SABRE method, compare conjugate and semi-conjugate prior specifications, and consider an alternative to the spike and slab prior: the binary mask model. We also propose novel proposal mechanisms for the Markov chain Monte Carlo (MCMC) simulations, which improve mixing and convergence over the established component-wise Gibbs sampler. The SABRE method is then applied to datasets from FMDV and the Influenza virus in order to identify a number of known antigenic residues and to provide hypotheses about other potentially antigenic residues. We also demonstrate how the SABRE methods can be used to create accurate predictions of the important evolutionary changes of the FMDV serotypes. We then provide an extended version of the SABRE method, the eSABRE method, based on a latent variable model. The eSABRE method takes the structure of the FMDV and Influenza datasets further into account through the latent variable model and improves the modelling of the error. We show how the eSABRE method outperforms the SABRE methods in simulation studies, and propose a new information criterion for selecting the random effects factors that should be included in the eSABRE method: the block integrated Widely Applicable Information Criterion (biWAIC). We demonstrate that biWAIC performs comparably to two other methods for selecting the random effects factors, and combine it with the eSABRE method to apply it to two large Influenza datasets. Inference in these large datasets is computationally infeasible with the SABRE methods, but as a result of the improved structure of the likelihood, the eSABRE method offers a computational improvement, allowing it to be used on these datasets. The results of the eSABRE method show that we can use the method in a fully automatic manner to identify a large number of antigenic residues on a variety of the antigenic sites of two Influenza serotypes, as well as making predictions of a number of nearby sites that may also be antigenic and are worthy of further experimental investigation.
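In generic form (not necessarily the exact parameterisation used in the thesis), a spike and slab prior on the effect of residue j can be written as:

```latex
\beta_j \mid \gamma_j \;\sim\; \gamma_j \, \mathcal{N}(0, \sigma_\beta^2) \;+\; (1 - \gamma_j)\, \delta_0,
\qquad \gamma_j \sim \mathrm{Bernoulli}(\pi)
```

Under such a prior, the posterior probability that gamma_j = 1 quantifies the evidence that residue j affects the neutralisation of the virus, which is how priors of this kind support the identification of antigenic residues.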

Relevance:

60.00%

Publisher:

Abstract:

Cardiovascular disease is one of the leading causes of death around the world. Resting heart rate has been shown to be a strong and independent risk marker for adverse cardiovascular events and mortality, and yet its role as a predictor of risk is somewhat overlooked in clinical practice. With the aim of highlighting its prognostic value, the role of resting heart rate as a risk marker for death and other adverse outcomes was further examined in a number of different patient populations. A systematic review of studies that previously assessed the prognostic value of resting heart rate for mortality and other adverse cardiovascular outcomes was presented. New analyses of nine clinical trials were carried out. Both the original Cox model and the extended Cox model, which allows for the analysis of time-dependent covariates, were used to evaluate and compare the predictive value of baseline and time-updated heart rate measurements for adverse outcomes in the CAPRICORN, EUROPA, PROSPER, PERFORM, BEAUTIFUL and SHIFT populations. Pooled individual patient meta-analyses of the CAPRICORN, EPHESUS, OPTIMAAL and VALIANT trials, and of the BEAUTIFUL and SHIFT trials, were also performed. The discrimination and calibration of the models applied were evaluated using Harrell's C-statistic and likelihood ratio tests, respectively. Finally, following on from the systematic review, meta-analyses of the relation between baseline and time-updated heart rate and the risk of death from any cause and from cardiovascular causes were conducted. Both elevated baseline and elevated time-updated resting heart rates were found to be associated with an increase in the risk of mortality and other adverse cardiovascular events in all of the populations analysed. In some cases, elevated time-updated heart rate was associated with risk of events where baseline heart rate was not. Time-updated heart rate also contributed additional information about the risk of certain events despite knowledge of baseline heart rate or previous heart rate measurements. The addition of resting heart rate to the models where resting heart rate was found to be associated with risk of outcome improved both discrimination and calibration, and in general, the models including time-updated heart rate along with the baseline or previous heart rate measurement had the highest (and similar) C-statistics, and thus the greatest discriminative ability. The meta-analyses demonstrated that a 5 bpm higher baseline heart rate was associated with a 7.9% and an 8.0% increase in the risk of all-cause and cardiovascular death, respectively (both p < 0.001). Additionally, a 5 bpm higher time-updated heart rate (adjusted for baseline heart rate in eight of the ten studies included in the analyses) was associated with a 12.8% (p < 0.001) and a 10.9% (p < 0.001) increase in the risk of all-cause and cardiovascular death, respectively. These findings may motivate health care professionals to routinely assess resting heart rate in order to identify individuals at a higher risk of adverse events. The fact that the addition of time-updated resting heart rate improved the discrimination and calibration of models for certain outcomes, even if only modestly, strengthens the case for adding it to traditional risk models. The findings, however, are of particular importance, and have greater implications, for the clinical management of patients with pre-existing disease. An elevated, or increasing, heart rate over time could be used as a tool, potentially alongside other established risk scores, to help doctors identify patient deterioration or those at higher risk, who might benefit from more intensive monitoring or treatment re-evaluation. Further exploration of the role of continuous recording of resting heart rate, say when patients are at home, would be informative. In addition, investigation into the cost-effectiveness and optimal frequency of resting heart rate measurement is required. One of the most vital areas for future research is the definition of an objective cut-off value for a high resting heart rate.
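As a hedged sketch of an extended Cox model with a time-updated heart rate covariate (invented long-format data, not the trial datasets analysed above, and assuming the Python lifelines package's CoxTimeVaryingFitter):

```python
import numpy as np
import pandas as pd
from lifelines import CoxTimeVaryingFitter

# Invented long-format data: one row per patient per 6-month interval, with the
# heart rate re-measured at the start of each interval.
rng = np.random.default_rng(2)
rows = []
for pid in range(1, 201):
    hr = rng.normal(70.0, 10.0)
    for start in (0, 6, 12):
        hr = hr + rng.normal(0.0, 3.0)                        # time-updated heart rate
        event = rng.random() < 0.05 * np.exp(0.03 * (hr - 70.0))
        rows.append({"id": pid, "start": start, "stop": start + 6,
                     "heart_rate": hr, "event": int(event)})
        if event:
            break

df = pd.DataFrame(rows)
ctv = CoxTimeVaryingFitter()
ctv.fit(df, id_col="id", event_col="event", start_col="start", stop_col="stop")
# exp(coef) on heart_rate is the hazard ratio per 1 bpm; raising it to the fifth
# power gives a per-5 bpm hazard ratio comparable to the meta-analysis figures.
print(ctv.summary[["coef", "exp(coef)"]])
```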