5 resultados para Mixed linear models

em Glasgow Theses Service


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Undoubtedly, statistics has become one of the most important subjects in the modern world, where its applications are ubiquitous. The importance of statistics is not limited to statisticians, but also impacts upon non-statisticians who have to use statistics within their own disciplines. Several studies have indicated that most of the academic departments around the world have realized the importance of statistics to non-specialist students. Therefore, the number of students enrolled in statistics courses has vastly increased, coming from a variety of disciplines. Consequently, research within the scope of statistics education has been able to develop throughout the last few years. One important issue is how statistics is best taught to, and learned by, non-specialist students. This issue is controlled by several factors that affect the learning and teaching of statistics to non-specialist students, such as the use of technology, the role of the English language (especially for those whose first language is not English), the effectiveness of statistics teachers and their approach towards teaching statistics courses, students’ motivation to learn statistics and the relevance of statistics courses to the main subjects of non-specialist students. Several studies, focused on aspects of learning and teaching statistics, have been conducted in different countries around the world, particularly in Western countries. Conversely, the situation in Arab countries, especially in Saudi Arabia, is different; here, there is very little research in this scope, and what there is does not meet the needs of those countries towards the development of learning and teaching statistics to non-specialist students. This research was instituted in order to develop the field of statistics education. The purpose of this mixed methods study was to generate new insights into this subject by investigating how statistics courses are currently taught to non-specialist students in Saudi universities. Hence, this study will contribute towards filling the knowledge gap that exists in Saudi Arabia. This study used multiple data collection approaches, including questionnaire surveys from 1053 non-specialist students who had completed at least one statistics course in different colleges of the universities in Saudi Arabia. These surveys were followed up with qualitative data collected via semi-structured interviews with 16 teachers of statistics from colleges within all six universities where statistics is taught to non-specialist students in Saudi Arabia’s Eastern Region. The data from questionnaires included several types, so different techniques were used in analysis. Descriptive statistics were used to identify the demographic characteristics of the participants. The chi-square test was used to determine associations between variables. Based on the main issues that are raised from literature review, the questions (items scales) were grouped and five key groups of questions were obtained which are: 1) Effectiveness of Teachers; 2) English Language; 3) Relevance of Course; 4) Student Engagement; 5) Using Technology. Exploratory data analysis was used to explore these issues in more detail. Furthermore, with the existence of clustering in the data (students within departments within colleges, within universities), multilevel generalized linear models for dichotomous analysis have been used to clarify the effects of clustering at those levels. Factor analysis was conducted confirming the dimension reduction of variables (items scales). The data from teachers’ interviews were analysed on an individual basis. The responses were assigned to one of the eight themes that emerged from within the data: 1) the lack of students’ motivation to learn statistics; 2) students' participation; 3) students’ assessment; 4) the effective use of technology; 5) the level of previous mathematical and statistical skills of non-specialist students; 6) the English language ability of non-specialist students; 7) the need for extra time for teaching and learning statistics; and 8) the role of administrators. All the data from students and teachers indicated that the situation of learning and teaching statistics to non-specialist students in Saudi universities needs to be improved in order to meet the needs of those students. The findings of this study suggested a weakness in the use of statistical software applications in these courses. This study showed that there is lack of application of technology such as statistical software programs in these courses, which would allow non-specialist students to consolidate their knowledge. The results also indicated that English language is considered one of the main challenges in learning and teaching statistics, particularly in institutions where English is not used as the main language. Moreover, the weakness of mathematical skills of students is considered another major challenge. Additionally, the results indicated that there was a need to tailor statistics courses to the needs of non-specialist students based on their main subjects. The findings indicate that statistics teachers need to choose appropriate methods when teaching statistics courses.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Understanding how virus strains offer protection against closely related emerging strains is vital for creating effective vaccines. For many viruses, including Foot-and-Mouth Disease Virus (FMDV) and the Influenza virus where multiple serotypes often co-circulate, in vitro testing of large numbers of vaccines can be infeasible. Therefore the development of an in silico predictor of cross-protection between strains is important to help optimise vaccine choice. Vaccines will offer cross-protection against closely related strains, but not against those that are antigenically distinct. To be able to predict cross-protection we must understand the antigenic variability within a virus serotype, distinct lineages of a virus, and identify the antigenic residues and evolutionary changes that cause the variability. In this thesis we present a family of sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution (SABRE), as well as an extended version of the method, the extended SABRE (eSABRE) method, which better takes into account the data collection process. The SABRE methods are a family of sparse Bayesian hierarchical models that use spike and slab priors to identify sites in the viral protein which are important for the neutralisation of the virus. In this thesis we demonstrate how the SABRE methods can be used to identify antigenic residues within different serotypes and show how the SABRE method outperforms established methods, mixed-effects models based on forward variable selection or l1 regularisation, on both synthetic and viral datasets. In addition we also test a number of different versions of the SABRE method, compare conjugate and semi-conjugate prior specifications and an alternative to the spike and slab prior; the binary mask model. We also propose novel proposal mechanisms for the Markov chain Monte Carlo (MCMC) simulations, which improve mixing and convergence over that of the established component-wise Gibbs sampler. The SABRE method is then applied to datasets from FMDV and the Influenza virus in order to identify a number of known antigenic residue and to provide hypotheses of other potentially antigenic residues. We also demonstrate how the SABRE methods can be used to create accurate predictions of the important evolutionary changes of the FMDV serotypes. In this thesis we provide an extended version of the SABRE method, the eSABRE method, based on a latent variable model. The eSABRE method takes further into account the structure of the datasets for FMDV and the Influenza virus through the latent variable model and gives an improvement in the modelling of the error. We show how the eSABRE method outperforms the SABRE methods in simulation studies and propose a new information criterion for selecting the random effects factors that should be included in the eSABRE method; block integrated Widely Applicable Information Criterion (biWAIC). We demonstrate how biWAIC performs equally to two other methods for selecting the random effects factors and combine it with the eSABRE method to apply it to two large Influenza datasets. Inference in these large datasets is computationally infeasible with the SABRE methods, but as a result of the improved structure of the likelihood, we are able to show how the eSABRE method offers a computational improvement, leading it to be used on these datasets. The results of the eSABRE method show that we can use the method in a fully automatic manner to identify a large number of antigenic residues on a variety of the antigenic sites of two Influenza serotypes, as well as making predictions of a number of nearby sites that may also be antigenic and are worthy of further experiment investigation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Trypanosomiasis has been identified as a neglected tropical disease in both humans and animals in many regions of sub-Saharan Africa. Whilst assessments of the biology of trypanosomes, vectors, vertebrate hosts and the environment have provided useful information about life cycles, transmission, and pathogenesis of the parasites that could be used for treatment and control, less information is available about the effects of interactions among multiple intrinsic factors on trypanosome presence in tsetse flies from different sites. It is known that multiple species of tsetse flies can transmit trypanosomes but differences in their vector competence has normally been studied in relation to individual factors in isolation, such as: intrinsic factors of the flies (e.g. age, sex); habitat characteristics; presence of endosymbionts (e.g. Wigglesworthia glossinidia, Sodalis glossinidius); feeding pattern; host communities that the flies feed on; and which species of trypanosomes are transmitted. The purpose of this study was to take a more integrated approach to investigate trypanosome prevalence in tsetse flies. In chapter 2, techniques were optimised for using the Polymerase Chain Reaction (PCR) to identify species of trypanosomes (Trypanosoma vivax, T. congolense, T. brucei, T. simiae, and T. godfreyi) present in four species of tsetse flies (Glossina austeni, G. brevipalpis, G. longipennis and G. pallidipes) from two regions of eastern Kenya (the Shimba Hills and Nguruman). Based on universal primers targeting the internal transcribed spacer 1 region (ITS-1), T. vivax was the predominant pathogenic species detected in flies, both singly and in combination with other species of trypanosomes. Using Generalised Linear Models (GLMs) and likelihood ratio tests to choose the best-fitting models, presence of T. vivax was significantly associated with an interaction between subpopulation (a combination between collection sites and species of Glossina) and sex of the flies (X2 = 7.52, df = 21, P-value = 0.0061); prevalence in females overall was higher than in males but this was not consistent across subpopulations. Similarly, T. congolense was significantly associated only with subpopulation (X2 = 18.77, df = 1, P-value = 0.0046); prevalence was higher overall in the Shimba Hills than in Nguruman but this pattern varied by species of tsetse fly. When associations were analysed in individual species of tsetse flies, there were no consistent associations between trypanosome prevalence and any single factor (site, sex, age) and different combinations of interactions were found to be significant for each. The results thus demonstrated complex interactions between vectors and trypanosome prevalence related to both the distribution and intrinsic factors of tsetse flies. The potential influence of the presence of S. glossinidius on trypanosome presence in tsetse flies was studied in chapter 3. A high number of Sodalis positive flies was found in the Shimba Hills, while there were only two positive flies from Nguruman. Presence or absence of Sodalis was significantly associated with subpopulation while trypanosome presence showed a significant association with age (X2 = 4.65, df = 14, P-value = 0.0310) and an interaction between subpopulation and sex (X2 = 18.94, df = 10, P-value = 0.0043). However, the specific associations that were significant varied across species of trypanosomes, with T. congolense and T. brucei but not T. vivax showing significant interactions involving Sodalis. Although it has previously been concluded that presence of Sodalis increases susceptibility to trypanosomes, the results presented here suggest a more complicated relationship, which may be biased by differences in the distribution and intrinsic factors of tsetse flies, as well as which trypanosome species are considered. In chapter 4 trypanosome status was studied in relation to blood meal sources, feeding status and feeding patterns of G. pallidipes (which was the predominant fly species collected for this study) as determined by sequencing the mitochondrial cytochrome B gene using DNA extracted from abdomen samples. African buffalo and African elephants were the main sources of blood meals but antelopes, warthogs, humans, giraffes and hyenas were also identified. Feeding on multiple hosts was common in flies sampled from the Shimba Hills but most flies from Nguruman had fed on single host species. Based on Multiple Correspondence Analysis (MCA), host-feeding patterns showed a correlation with site of sample collection and Sodalis status, while trypanosome status was correlated with sex and age of the flies, suggesting that recent host-feeding patterns from blood meal analysis cannot predict trypanosome status. In conclusion, the complexity of interactions found suggests that strategies of tsetse fly control should be specific to particular epidemic areas. Future studies should include laboratory experiments that use local colonies of tsetse flies, local strains of trypanosomes and local S. glossinidius under controlled environmental conditions to tease out the factors that affect vector competence and the relative influence of external environmental factors on the dynamics of these interactions.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The long-term adverse effects on health associated with air pollution exposure can be estimated using either cohort or spatio-temporal ecological designs. In a cohort study, the health status of a cohort of people are assessed periodically over a number of years, and then related to estimated ambient pollution concentrations in the cities in which they live. However, such cohort studies are expensive and time consuming to implement, due to the long-term follow up required for the cohort. Therefore, spatio-temporal ecological studies are also being used to estimate the long-term health effects of air pollution as they are easy to implement due to the routine availability of the required data. Spatio-temporal ecological studies estimate the health impact of air pollution by utilising geographical and temporal contrasts in air pollution and disease risk across $n$ contiguous small-areas, such as census tracts or electoral wards, for multiple time periods. The disease data are counts of the numbers of disease cases occurring in each areal unit and time period, and thus Poisson log-linear models are typically used for the analysis. The linear predictor includes pollutant concentrations and known confounders such as socio-economic deprivation. However, as the disease data typically contain residual spatial or spatio-temporal autocorrelation after the covariate effects have been accounted for, these known covariates are augmented by a set of random effects. One key problem in these studies is estimating spatially representative pollution concentrations in each areal which are typically estimated by applying Kriging to data from a sparse monitoring network, or by computing averages over modelled concentrations (grid level) from an atmospheric dispersion model. The aim of this thesis is to investigate the health effects of long-term exposure to Nitrogen Dioxide (NO2) and Particular matter (PM10) in mainland Scotland, UK. In order to have an initial impression about the air pollution health effects in mainland Scotland, chapter 3 presents a standard epidemiological study using a benchmark method. The remaining main chapters (4, 5, 6) cover the main methodological focus in this thesis which has been threefold: (i) how to better estimate pollution by developing a multivariate spatio-temporal fusion model that relates monitored and modelled pollution data over space, time and pollutant; (ii) how to simultaneously estimate the joint effects of multiple pollutants; and (iii) how to allow for the uncertainty in the estimated pollution concentrations when estimating their health effects. Specifically, chapters 4 and 5 are developed to achieve (i), while chapter 6 focuses on (ii) and (iii). In chapter 4, I propose an integrated model for estimating the long-term health effects of NO2, that fuses modelled and measured pollution data to provide improved predictions of areal level pollution concentrations and hence health effects. The air pollution fusion model proposed is a Bayesian space-time linear regression model for relating the measured concentrations to the modelled concentrations for a single pollutant, whilst allowing for additional covariate information such as site type (e.g. roadside, rural, etc) and temperature. However, it is known that some pollutants might be correlated because they may be generated by common processes or be driven by similar factors such as meteorology. The correlation between pollutants can help to predict one pollutant by borrowing strength from the others. Therefore, in chapter 5, I propose a multi-pollutant model which is a multivariate spatio-temporal fusion model that extends the single pollutant model in chapter 4, which relates monitored and modelled pollution data over space, time and pollutant to predict pollution across mainland Scotland. Considering that we are exposed to multiple pollutants simultaneously because the air we breathe contains a complex mixture of particle and gas phase pollutants, the health effects of exposure to multiple pollutants have been investigated in chapter 6. Therefore, this is a natural extension to the single pollutant health effects in chapter 4. Given NO2 and PM10 are highly correlated (multicollinearity issue) in my data, I first propose a temporally-varying linear model to regress one pollutant (e.g. NO2) against another (e.g. PM10) and then use the residuals in the disease model as well as PM10, thus investigating the health effects of exposure to both pollutants simultaneously. Another issue considered in chapter 6 is to allow for the uncertainty in the estimated pollution concentrations when estimating their health effects. There are in total four approaches being developed to adjust the exposure uncertainty. Finally, chapter 7 summarises the work contained within this thesis and discusses the implications for future research.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The use of chemical control measures to reduce the impact of parasite and pest species has frequently resulted in the development of resistance. Thus, resistance management has become a key concern in human and veterinary medicine, and in agricultural production. Although it is known that factors such as gene flow between susceptible and resistant populations, drug type, application methods, and costs of resistance can affect the rate of resistance evolution, less is known about the impacts of density-dependent eco-evolutionary processes that could be altered by drug-induced mortality. The overall aim of this thesis was to take an experimental evolution approach to assess how life history traits respond to drug selection, using a free-living dioecious worm (Caenorhabditis remanei) as a model. In Chapter 2, I defined the relationship between C. remanei survival and Ivermectin dose over a range of concentrations, in order to control the intensity of selection used in the selection experiment described in Chapter 4. The dose-response data were also used to appraise curve-fitting methods, using Akaike Information Criterion (AIC) model selection to compare a series of nonlinear models. The type of model fitted to the dose response data had a significant effect on the estimates of LD50 and LD99, suggesting that failure to fit an appropriate model could give misleading estimates of resistance status. In addition, simulated data were used to establish that a potential cost of resistance could be predicted by comparing survival at the upper asymptote of dose-response curves for resistant and susceptible populations, even when differences were as low as 4%. This approach to dose-response modeling ensures that the maximum amount of useful information relating to resistance is gathered in one study. In Chapter 3, I asked how simulations could be used to inform important design choices used in selection experiments. Specifically, I focused on the effects of both within- and between-line variation on estimated power, when detecting small, medium and large effect sizes. Using mixed-effect models on simulated data, I demonstrated that commonly used designs with realistic levels of variation could be underpowered for substantial effect sizes. Thus, use of simulation-based power analysis provides an effective way to avoid under or overpowering a study designs incorporating variation due to random effects. In Chapter 4, I 3 investigated how Ivermectin dosage and changes in population density affect the rate of resistance evolution. I exposed replicate lines of C. remanei to two doses of Ivermectin (high and low) to assess relative survival of lines selected in drug-treated environments compared to untreated controls over 10 generations. Additionally, I maintained lines where mortality was imposed randomly to control for differences in density between drug treatments and to distinguish between the evolutionary consequences of drug treatment versus ecological processes affected by changes in density-dependent feedback. Intriguingly, both drug-selected and random-mortality lines showed an increase in survivorship when challenged with Ivermectin; the magnitude of this increase varied with the intensity of selection and life-history stage. The results suggest that interactions between density-dependent processes and life history may mediate evolved changes in susceptibility to control measures, which could result in misleading conclusions about the evolution of heritable resistance following drug treatment. In Chapter 5, I investigated whether the apparent changes in drug susceptibility found in Chapter 4 were related to evolved changes in life-history of C. remanei populations after selection in drug-treated and random-mortality environments. Rapid passage of lines in the drug-free environment had no effect on the measured life-history traits. In the drug-free environment, adult size and fecundity of drug-selected lines increased compared to the controls but drug selection did not affect lifespan. In the treated environment, drug-selected lines showed increased lifespan and fecundity relative to controls. Adult size of randomly culled lines responded in a similar way to drug-selected lines in the drug-free environment, but no change in fecundity or lifespan was observed in either environment. The results suggest that life histories of nematodes can respond to selection as a result of the application of control measures. Failure to take these responses into account when applying control measures could result in adverse outcomes, such as larger and more fecund parasites, as well as over-estimation of the development of genetically controlled resistance. In conclusion, my thesis shows that there may be a complex relationship between drug selection, density-dependent regulatory processes and life history of populations challenged with control measures. This relationship could have implications for how resistance is monitored and managed if life histories of parasitic species show such eco-evolutionary responses to drug application.