932 results for HMM, Nosocomial Pathogens, Genotyping, Statistical Modelling, VRE
Abstract:
An educational priority of many nations is to enhance mathematical learning in early childhood. One area in need of special attention is that of statistics. This paper argues for a renewed focus on statistical reasoning in the beginning school years, with opportunities for children to engage in data modelling activities. Such modelling involves investigations of meaningful phenomena, deciding what is worthy of attention (i.e., identifying complex attributes), and then progressing to organising, structuring, visualising, and representing data. Results are reported from the first year of a three-year longitudinal study in which three classes of first-grade children and their teachers engaged in activities that required the creation of data models. The theme of “Looking after our Environment,” a component of the children’s science curriculum at the time, provided the context for the activities. Findings focus on how the children dealt with given complex attributes and how they generated their own attributes in classifying broad data sets, and the nature of the models the children created in organising, structuring, and representing their data.
Abstract:
This thesis applies Monte Carlo techniques to the study of X-ray absorptiometric methods of bone mineral measurement. These studies seek to obtain information that can be used in efforts to improve the accuracy of the bone mineral measurements. A Monte Carlo computer code for X-ray photon transport at diagnostic energies has been developed from first principles. This development was undertaken as there was no readily available code which included electron binding energy corrections for incoherent scattering, and one of the objectives of the project was to study the effects of inclusion of these corrections in Monte Carlo models. The code includes the main Monte Carlo program plus utilities for dealing with input data. A number of geometrical subroutines which can be used to construct complex geometries have also been written. The accuracy of the Monte Carlo code has been evaluated against the predictions of theory and the results of experiments. The results show a high correlation with theoretical predictions. In comparisons of model results with those of direct experimental measurements, agreement to within the model and experimental variances is obtained. The code is an accurate and valid modelling tool. A study of the significance of inclusion of electron binding energy corrections for incoherent scatter in the Monte Carlo code has been made. The results show this significance to be very dependent upon the type of application. The most significant effect is a reduction of low angle scatter flux for high atomic number scatterers. To effectively apply the Monte Carlo code to the study of bone mineral density measurement by photon absorptiometry, the results must be considered in the context of a theoretical framework for the extraction of energy dependent information from planar X-ray beams. Such a theoretical framework is developed and the two-dimensional nature of tissue decomposition based on attenuation measurements alone is explained. This theoretical framework forms the basis for analytical models of bone mineral measurement by dual energy X-ray photon absorptiometry techniques. Monte Carlo models of dual energy X-ray absorptiometry (DEXA) have been established. These models have been used to study the contribution of scattered radiation to the measurements. It has been demonstrated that the measurement geometry has a significant effect upon the scatter contribution to the detected signal. For the geometry of the models studied in this work, the scatter has no significant effect upon the results of the measurements. The model has also been used to study a proposed technique which involves dual energy X-ray transmission measurements plus a linear measurement of the distance along the ray path. This is designated as the DPA(+) technique. The addition of the linear measurement enables the tissue decomposition to be extended to three components. Bone mineral, fat and lean soft tissue are the components considered here. The results of the model demonstrate that the measurement of bone mineral using this technique is stable over a wide range of soft tissue compositions and hence would indicate the potential to overcome a major problem of the two component DEXA technique. However, the results also show that the accuracy of the DPA(+) technique is highly dependent upon the composition of the non-mineral components of bone and has poorer precision (approximately twice the coefficient of variation) than the standard DEXA measurements. These factors may limit the usefulness of the technique.
These studies illustrate the value of Monte Carlo computer modelling of quantitative X-ray measurement techniques. The Monte Carlo models of bone densitometry measurement have: (1) demonstrated the significant effects of the measurement geometry upon the contribution of scattered radiation to the measurements; (2) demonstrated that the statistical precision of the proposed DPA(+) three tissue component technique is poorer than that of the standard DEXA two tissue component technique; (3) demonstrated that the proposed DPA(+) technique has difficulty providing accurate simultaneous measurement of body composition in terms of a three component model of fat, lean soft tissue and bone mineral; and (4) provided a knowledge base for input to decisions about development (or otherwise) of a physical prototype DPA(+) imaging system. The Monte Carlo computer code, data, utilities and associated models represent a set of significant, accurate and valid modelling tools for quantitative studies of physical problems in the fields of diagnostic radiology and radiography.
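As an illustration of the two-component tissue decomposition described above: a minimal sketch, assuming hypothetical mass attenuation coefficients (illustrative values only, not taken from the thesis), of how log-attenuation measurements at two energies are solved for bone mineral and soft tissue areal densities.

```python
import numpy as np

# Hypothetical mass attenuation coefficients (cm^2/g) for bone mineral and
# soft tissue at a low and a high photon energy; illustrative values only.
mu = np.array([[0.60, 0.25],    # low energy:  [bone, soft]
               [0.30, 0.20]])   # high energy: [bone, soft]

def dexa_decompose(log_attenuation):
    """Solve mu @ t = ln(I0/I) for areal densities t = [bone, soft] in g/cm^2."""
    return np.linalg.solve(mu, log_attenuation)

# Example: measured log-attenuations ln(I0/I) at the two energies.
t_bone, t_soft = dexa_decompose(np.array([1.10, 0.62]))
print(f"bone mineral: {t_bone:.2f} g/cm^2, soft tissue: {t_soft:.2f} g/cm^2")
```

Extending this system to a third component, as in the DPA(+) technique, requires the extra path-length measurement because two attenuation measurements alone only determine two unknowns.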
Abstract:
Background, aim, and scope Urban motor vehicle fleets are a major source of particulate matter pollution, especially of ultrafine particles (diameters < 0.1 µm), and exposure to particulate matter has known serious health effects. A considerable body of literature is available on vehicle particle emission factors derived using a wide range of different measurement methods for different particle sizes, conducted in different parts of the world. Choosing the most suitable particle emission factors to use in transport modelling and health impact assessments is therefore a very difficult task. The aim of this study was to derive a comprehensive set of tailpipe particle emission factors for different vehicle and road type combinations, covering the full size range of particles emitted, which are suitable for modelling urban fleet emissions. Materials and methods A large body of data available in the international literature on particle emission factors for motor vehicles derived from measurement studies was compiled and subjected to advanced statistical analysis, to determine the most suitable emission factors to use in modelling urban fleet emissions. Results This analysis resulted in the development of five statistical models which explained 86%, 93%, 87%, 65% and 47% of the variation in published emission factors for particle number, particle volume, PM1, PM2.5 and PM10 respectively. A sixth model for total particle mass was proposed but no significant explanatory variables were identified in the analysis. From the outputs of these statistical models, the most suitable particle emission factors were selected. This selection was based on examination of the statistical robustness of the model outputs, including consideration of conservative average particle emission factors with the lowest standard errors, narrowest 95% confidence intervals and largest sample sizes, and on the explanatory model variables, which were Vehicle Type (all particle metrics), Instrumentation (particle number and PM2.5), Road Type (PM10), and Size Range Measured and Speed Limit on the Road (particle volume). Discussion A multiplicity of factors needs to be considered in determining emission factors that are suitable for modelling motor vehicle emissions, and this study derived a set of average emission factors suitable for quantifying motor vehicle tailpipe particle emissions in developed countries. Conclusions The comprehensive set of tailpipe particle emission factors presented in this study for different vehicle and road type combinations enables the full size range of particles generated by fleets to be quantified, including ultrafine particles (measured in terms of particle number). These emission factors have particular application for regions which lack the funding to undertake measurements, or which have insufficient measurement data from which to derive emission factors for their region. Recommendations and perspectives In urban areas, motor vehicles continue to be a major source of particulate matter pollution and of ultrafine particles. To manage this major pollution source, it is critical that methods are available to quantify the full size range of particles emitted, for use in traffic modelling and health impact assessments.
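The following is a minimal sketch of the kind of analysis described, regressing published emission factors on categorical explanatory variables; the column names and values are hypothetical placeholders, not the study's compiled dataset.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical compilation of published particle-number emission factors;
# illustrative values only, not the data analysed in the study.
df = pd.DataFrame({
    "log_ef":       [33.1, 34.0, 32.5, 33.8, 34.6, 33.2],   # log(particles per km)
    "vehicle_type": ["car", "car", "car", "hdv", "hdv", "hdv"],
    "road_type":    ["urban", "freeway", "urban", "freeway", "urban", "freeway"],
})

# Linear model with categorical explanatory variables, analogous in spirit to
# explaining variation in published emission factors by Vehicle Type and Road Type.
model = smf.ols("log_ef ~ C(vehicle_type) + C(road_type)", data=df).fit()
print(model.params)   # average effects per category
print(model.bse)      # standard errors, relevant when selecting robust averages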
Abstract:
A SNP genotyping method was developed for E. faecalis and E. faecium using the 'Minimum SNPs' program. SNP sets were interrogated using allele-specific real-time PCR. SNP-typing sub-divided clonal complexes 2 and 9 of E. faecalis and 17 of E. faecium, members of which cause the majority of nosocomial infections globally.
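The resolving power of a SNP set is commonly summarised by Simpson's index of diversity; its use here is an assumption for illustration, not a detail stated in the abstract. A minimal sketch with hypothetical SNP profiles follows.

```python
from collections import Counter

def simpsons_diversity(profiles):
    """Simpson's index of diversity D for a set of genotyping profiles:
    the probability that two isolates drawn at random have different profiles."""
    n = len(profiles)
    counts = Counter(profiles).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

# Hypothetical isolates typed at a three-SNP set (one allele per position).
isolates = ["AGT", "AGT", "ACT", "GCT", "GCT", "GGT", "AGT", "ACT"]
print(f"D = {simpsons_diversity(isolates):.3f}")
```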
Abstract:
Objective Theoretical models of post-traumatic growth (PTG) have been derived in the general trauma literature to describe the post-trauma experience that facilitates the perception of positive life changes. To develop a statistical model identifying factors that are associated with PTG, structural equation modelling (SEM) was used in the current study to assess the relationships between perception of diagnosis severity, rumination, social support, distress, and PTG. Method A statistical model of PTG was tested in a sample of participants diagnosed with a variety of cancers (N=313). Results An initial principal components analysis of the measure used to assess rumination revealed three components: intrusive rumination, deliberate rumination of benefits, and life purpose rumination. SEM results indicated that the model fit the data well and that 30% of the variance in PTG was explained by the variables. Trauma severity was directly related to distress, but not to PTG. Deliberately ruminating on benefits and social support were directly related to PTG. Life purpose rumination and intrusive rumination were associated with distress. Conclusions The model showed that, in addition to having unique correlating factors, distress was not related to PTG, supporting the notion that these are discrete constructs in the post-diagnosis experience. The statistical model supports the view that the post-diagnosis experience is simultaneously shaped by positive and negative life changes, and that either outcome may predominate or both may occur concurrently. As such, an implication for practice is the need for supportive care that is holistic in nature.
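The principal components step mentioned in the Results can be sketched as follows; the item responses below are random placeholders standing in for the rumination measure, and the retained components would then enter the structural equation model.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical item responses for a rumination questionnaire (respondents x items);
# placeholder data only, not the study's measure.
items = rng.integers(0, 4, size=(313, 15)).astype(float)

pca = PCA(n_components=5)
scores = pca.fit_transform(items)
print(pca.explained_variance_ratio_)   # inspect how many components to retain
print(pca.components_[:3].round(2))    # loadings of the first three components
```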
Abstract:
This article explores the use of probabilistic classification, namely finite mixture modelling, for identification of complex disease phenotypes, given cross-sectional data. In particular, it focuses on posterior probabilities of subgroup membership, a standard output of finite mixture modelling, and how the quantification of uncertainty in these probabilities can lead to more detailed analyses. Using a Bayesian approach, we describe two practical uses of this uncertainty: (i) as a means of describing a person’s membership to a single or multiple latent subgroups and (ii) as a means of describing identified subgroups by patient-centred covariates not included in model estimation. These proposed uses are demonstrated on a case study in Parkinson’s disease (PD), where latent subgroups are identified using multiple symptoms from the Unified Parkinson’s Disease Rating Scale (UPDRS).
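A minimal sketch of the posterior membership probabilities discussed above, using a maximum-likelihood Gaussian mixture from scikit-learn on placeholder symptom scores; the article's fully Bayesian treatment of the uncertainty in these probabilities would additionally require posterior sampling, which this sketch omits.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Hypothetical cross-sectional symptom scores (patients x items); placeholder data
# standing in for UPDRS-style measurements.
symptoms = np.vstack([
    rng.normal(1.0, 0.5, size=(100, 4)),   # milder latent subgroup
    rng.normal(3.0, 0.8, size=(60, 4)),    # more affected latent subgroup
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(symptoms)
posterior = gmm.predict_proba(symptoms)    # per-patient subgroup membership probabilities

# Patients with ambiguous membership (no subgroup dominates) can be described as
# belonging to multiple subgroups rather than forced into a single cluster.
ambiguous = np.where(posterior.max(axis=1) < 0.8)[0]
print(f"{len(ambiguous)} patients have maximum membership probability below 0.8")
```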
Abstract:
Over recent years a significant amount of research has been undertaken to develop prognostic models that can be used to predict the remaining useful life of engineering assets. Implementations by industry have only had limited success. By design, models are subject to specific assumptions and approximations, some of which are mathematical, while others relate to practical implementation issues such as the amount of data required to validate and verify a proposed model. Therefore, appropriate model selection for successful practical implementation requires not only a mathematical understanding of each model type, but also an appreciation of how a particular business intends to utilise a model and its outputs. This paper discusses business issues that need to be considered when selecting an appropriate modelling approach for trial. It also presents classification tables and process flow diagrams to assist industry and research personnel in selecting appropriate prognostic models for predicting the remaining useful life of engineering assets within their specific business environment. The paper then explores the strengths and weaknesses of the main prognostic model classes to establish what makes them better suited to certain applications than to others, and summarises how each has been applied to engineering prognostics. Consequently, this paper should provide a starting point for young researchers first considering options for remaining useful life prediction. The models described in this paper are Knowledge-based (expert and fuzzy), Life expectancy (stochastic and statistical), Artificial Neural Networks, and Physical models.
Abstract:
This paper discusses the statistical analyses used to derive bridge live load models for Hong Kong from 10 years of weigh-in-motion (WIM) data. The statistical concepts required and the terminologies adopted in the development of bridge live load models are introduced. This paper includes studies for representative vehicles from the large amount of WIM data in Hong Kong. Different load-affecting parameters, such as gross vehicle weights, axle weights, axle spacings, and average daily number of trucks, are first analysed by various stochastic processes in order to obtain the mathematical distributions of these parameters. As a prerequisite to determining accurate bridge design loadings in Hong Kong, this study not only takes advantage of code formulation methods used internationally but also presents a new method for modelling collected WIM data using a statistical approach.
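A minimal sketch of fitting a parametric distribution to one load-affecting parameter, assuming placeholder gross vehicle weights rather than the Hong Kong WIM records; the choice of a lognormal candidate is illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical gross vehicle weights (tonnes) from WIM records; placeholder data.
gvw = rng.lognormal(mean=2.8, sigma=0.35, size=5000)

# Fit a candidate distribution and inspect its upper tail, as one might when
# deriving characteristic loads for bridge design.
shape, loc, scale = stats.lognorm.fit(gvw, floc=0)
p95 = stats.lognorm.ppf(0.95, shape, loc=loc, scale=scale)
print(f"fitted lognormal: sigma={shape:.3f}, median={scale:.1f} t, 95th percentile={p95:.1f} t")
```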
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals’ seldom studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it becomes habitual and involves minimal intention or decision making. Key variables investigated are the activity initialised timestamp and cell tower location as well as the activity type and usage quantity (e.g., voice call with duration in seconds); and the research focus is on customers’ spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs) which are fitted with the use of the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals’ highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this essentially captures their likely lifestyle and occupational traits. Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high dimensional data based on the use of VB-GMM.
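A minimal sketch of variational Bayesian fitting of a GMM using scikit-learn's BayesianGaussianMixture, which prunes surplus components through a Dirichlet process prior; this stands in for, but is not, the thesis's split-based VB-GMM algorithm, and the location data are placeholders.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(3)
# Hypothetical "spiky" spatial usage data: a few frequent locations (e.g. home,
# work, favourite cell towers) with differing amounts of traffic; placeholder data.
locations = np.vstack([
    rng.normal([0.0, 0.0], 0.05, size=(400, 2)),
    rng.normal([2.0, 1.0], 0.05, size=(150, 2)),
    rng.normal([5.0, 4.0], 0.30, size=(50, 2)),
])

# Variational Bayes fit of a GMM; surplus components receive weights near zero,
# so the effective number of components is inferred from the data.
vb_gmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(locations)
print(np.round(vb_gmm.weights_, 3))   # most of the 10 components collapse to ~0 weight
```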
Abstract:
This paper argues for a renewed focus on statistical reasoning in the beginning school years, with opportunities for children to engage in data modelling. Some of the core components of data modelling are addressed. A selection of results from the first data modelling activity implemented during the second year (2010; second grade) of a current longitudinal study is reported. Data modelling involves investigations of meaningful phenomena, deciding what is worthy of attention (identifying complex attributes), and then progressing to organising, structuring, visualising, and representing data. Reported here are children's abilities to identify diverse and complex attributes, sort and classify data in different ways, and create and interpret models to represent their data.
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants, thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations. This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals, for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models are used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth.
Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals. These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
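A minimal sketch of the general idea behind the first objective, propagating elicited uncertainty on a tiny hypothetical two-node risk network by plain Monte Carlo to obtain a credible interval; the Beta parameters are made up, and the thesis's actual models and MCMC implementations in WinBUGS and pyMCMC are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
n_draws = 100_000

# Tiny hypothetical risk network: P(exposure) and P(illness | exposure), with
# elicited uncertainty encoded as Beta distributions (parameters are illustrative).
p_exposure = rng.beta(2, 18, n_draws)          # uncertain probability of exposure
p_ill_given_exp = rng.beta(3, 12, n_draws)     # uncertain conditional probability

# Propagate the uncertainty through the network to the quantity of interest.
p_illness = p_exposure * p_ill_given_exp

lo, med, hi = np.percentile(p_illness, [2.5, 50, 97.5])
print(f"P(illness): median={med:.4f}, 95% credible interval=({lo:.4f}, {hi:.4f})")
```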
Abstract:
The world’s increasing complexity, competitiveness, interconnectivity, and dependence on technology generate new challenges for nations and individuals that cannot be met by continuing education as usual (Katehi, Pearson, & Feder, 2009). With the proliferation of complex systems have come new technologies for communication, collaboration, and conceptualisation. These technologies have led to significant changes in the forms of mathematical and scientific thinking that are required beyond the classroom. Modelling, in its various forms, can develop and broaden children’s mathematical and scientific thinking beyond the standard curriculum. This paper first considers future competencies in the mathematical sciences within an increasingly complex world. Next, consideration is given to interdisciplinary problem solving and models and modelling. Examples of complex, interdisciplinary modelling activities across grades are presented, with data modelling in 1st grade, model-eliciting in 4th grade, and engineering-based modelling in 7th-9th grades.