420 results for Goodness
Abstract:
Advances in algorithms for approximate sampling from a multivariable target function have led to solutions to challenging statistical inference problems that would otherwise not be considered by the applied scientist. Such sampling algorithms are particularly relevant to Bayesian statistics, since the target function is the posterior distribution of the unobservables given the observables. In this thesis we develop, adapt and apply Bayesian algorithms, whilst addressing substantive applied problems in biology and medicine as well as other applications. For an increasing number of high-impact research problems, the primary models of interest are often sufficiently complex that the likelihood function is computationally intractable. Rather than discard these models in favour of inferior alternatives, a class of Bayesian "likelihood-free" techniques (often termed approximate Bayesian computation (ABC)) has emerged in the last few years, which avoids direct likelihood computation through repeated sampling of data from the model and comparing observed and simulated summary statistics. In Part I of this thesis we utilise sequential Monte Carlo (SMC) methodology to develop new algorithms for ABC that are more efficient in terms of the number of model simulations required and are almost black-box since very little algorithmic tuning is required. In addition, we address the issue of deriving appropriate summary statistics to use within ABC via a goodness-of-fit statistic and indirect inference. Another important problem in statistics is the design of experiments, that is, how one should select the values of the controllable variables in order to achieve some design goal. The presence of parameter and/or model uncertainty is a computational obstacle when designing experiments, but can lead to inefficient designs if not accounted for correctly. The Bayesian framework accommodates such uncertainties in a coherent way.
If the amount of uncertainty is substantial, it can be of interest to perform adaptive designs in order to accrue information to make better decisions about future design points. This is of particular interest if the data can be collected sequentially. In a sense, the current posterior distribution becomes the new prior distribution for the next design decision. Part II of this thesis creates new algorithms for Bayesian sequential design to accommodate parameter and model uncertainty using SMC. The algorithms are substantially faster than previous approaches, allowing the simulation properties of various design utilities to be investigated in a more timely manner. Furthermore, the approach offers convenient estimation of Bayesian utilities and other quantities that are particularly relevant in the presence of model uncertainty. Finally, Part III of this thesis tackles a substantive medical problem. A neurological disorder known as motor neuron disease (MND) progressively causes motor neurons to lose the ability to innervate the muscle fibres, causing the muscles to eventually waste away. When this occurs the motor unit effectively ‘dies’. There is no cure for MND, and fatality often results from a lack of muscle strength to breathe. The prognosis for many forms of MND (particularly amyotrophic lateral sclerosis (ALS)) is particularly poor, with patients usually only surviving a small number of years after the initial onset of disease. Measuring the progress of diseases of the motor units, such as ALS, is a challenge for clinical neurologists. Motor unit number estimation (MUNE) is an attempt to directly assess underlying motor unit loss rather than relying on indirect techniques such as muscle strength assessment, which is generally unable to detect progression due to the body’s natural attempts at compensation.
Part III of this thesis builds upon a previous Bayesian technique, which develops a sophisticated statistical model that takes into account physiological information about motor unit activation and various sources of uncertainties. More specifically, we develop a more reliable MUNE method by applying marginalisation over latent variables in order to improve the performance of a previously developed reversible jump Markov chain Monte Carlo sampler. We make other subtle changes to the model and algorithm to improve the robustness of the approach.
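The likelihood-free idea summarised in Part I of this abstract (simulate data from the model, then compare observed and simulated summary statistics) can be illustrated with a plain ABC rejection sampler. This is only a minimal sketch and is far simpler than the SMC-based algorithms the thesis develops. The toy Normal model, the uniform prior, the sample mean as summary statistic, the tolerance and all function names are assumptions made for illustration.

```python
import random
import statistics


def abc_rejection(observed, n_draws, tolerance, rng):
    """Plain ABC rejection sampling for the mean of a Normal(mu, 1) toy model.

    Assumptions (not from the thesis): Uniform(-10, 10) prior on mu and the
    sample mean as the summary statistic.
    """
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-10.0, 10.0)  # draw a candidate from the assumed prior
        simulated = [rng.gauss(mu, 1.0) for _ in range(len(observed))]
        # Keep the candidate only if the simulated summary is close to the
        # observed one; the likelihood itself is never evaluated.
        if abs(statistics.fmean(simulated) - obs_summary) <= tolerance:
            accepted.append(mu)
    return accepted


rng = random.Random(0)
observed = [rng.gauss(3.0, 1.0) for _ in range(50)]  # synthetic "observed" data
posterior_sample = abc_rejection(observed, n_draws=20000, tolerance=0.1, rng=rng)
```

The accepted values of `mu` form an approximate posterior sample. Shrinking `tolerance` improves the approximation but lowers the acceptance rate, which is the inefficiency that sequential schemes aim to reduce.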
Abstract:
Ingredients: - 1 cup Vision - 100ml ‘Real World’ Application - 100ml Unit Structure/Organisation - 100ml Student-centric Approach [optional: Add Social Media/Popular Culture for extra goodness] - Large Dollop of Passion + Enthusiasm - Sprinkle of Approachability Mix all ingredients well. Cover and leave to rise in a Lecture Theatre for 1.5 hours. Cook in a Classroom for 1.5 hours. Garnish with a dash of Humour before serving. Serves 170 Students
Abstract:
Process mining encompasses the research area which is concerned with knowledge discovery from information system event logs. Within the process mining research area, two prominent tasks can be discerned. First of all, process discovery deals with the automatic construction of a process model out of an event log. Secondly, conformance checking focuses on the assessment of the quality of a discovered or designed process model with respect to the actual behavior as captured in event logs. To this end, multiple techniques and metrics have been developed and described in the literature. However, the process mining domain still lacks a comprehensive framework for assessing the goodness of a process model from a quantitative perspective. In this study, we describe the architecture of an extensible framework within ProM, allowing for the consistent, comparative and repeatable calculation of conformance metrics. For the development and assessment of both process discovery and conformance techniques, such a framework is considered greatly valuable.
Abstract:
This work identifies the limitations of n-way data analysis techniques in multidimensional stream data, such as Internet chat room communications data, and establishes a link between data collection and performance of these techniques. Its contributions are twofold. First, it extends data analysis to multiple dimensions by constructing n-way data arrays known as high order tensors. Chat room tensors are generated by a simulator which collects and models actual communication data. The accuracy of the model is determined by the Kolmogorov-Smirnov goodness-of-fit test, which compares the simulation data with the observed (real) data. Second, a detailed computational comparison is performed to test several data analysis techniques including SVD [1], and multi-way techniques including Tucker1, Tucker3 [2], and Parafac [3].
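As an illustration of the comparison between simulated and observed data mentioned above, the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the two empirical distribution functions. The sketch below computes only that statistic, not a p-value, and the function name is hypothetical. It is not taken from the work being summarised.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Return D = max |F_a(x) - F_b(x)| over all observed points x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Empirical CDF value = fraction of the sample that is <= x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

A value near 0 indicates that the two samples have similar empirical distributions; a value of 1 means the samples do not overlap at all.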
Abstract:
This work investigates the accuracy and efficiency tradeoffs between centralized and collective (distributed) algorithms for (i) sampling, and (ii) n-way data analysis techniques in multidimensional stream data, such as Internet chatroom communications. Its contributions are threefold. First, we use the Kolmogorov-Smirnov goodness-of-fit test to show that the statistical differences between real data obtained by collective sampling in the time dimension from multiple servers and data obtained from a single server are insignificant. Second, we show using the real data that collective data analysis of 3-way data arrays (users x keywords x time) known as high order tensors is more efficient than centralized algorithms with respect to both space and computational cost. Furthermore, we show that this gain is obtained without loss of accuracy. Third, we examine the sensitivity of collective constructions and analysis of high order data tensors to the choice of server selection and sampling window size. We construct 4-way tensors (users x keywords x time x servers) and analyze them to show the impact of server and window size selections on the results.
Abstract:
Process mining encompasses the research area which is concerned with knowledge discovery from event logs. One common process mining task focuses on conformance checking, comparing discovered or designed process models with actual real-life behavior as captured in event logs in order to assess the “goodness” of the process model. This paper introduces a novel conformance checking method to measure how well a process model performs in terms of precision and generalization with respect to the actual executions of a process as recorded in an event log. Our approach differs from related work in the sense that we apply the concept of so-called weighted artificial negative events towards conformance checking, leading to more robust results, especially when dealing with less complete event logs that only contain a subset of all possible process execution behavior. In addition, our technique offers a novel way to estimate a process model’s ability to generalize. Existing literature has focused mainly on the fitness (recall) and precision (appropriateness) of process models, whereas generalization has been much more difficult to estimate. The described algorithms are implemented in a number of ProM plugins, and a Petri net conformance checking tool was developed to inspect process model conformance in a visual manner.
Abstract:
The purpose of the study was to undertake rigorous psychometric testing of the Caring Efficacy Scale in a sample of Registered Nurses. A cross-sectional survey of 2000 registered nurses was undertaken. Responses to the Caring Efficacy Scale were used to examine the psychometric properties of its selected items. Cronbach’s alpha was used to assess the reliability of the data. Exploratory Factor Analysis and Confirmatory Factor Analysis were undertaken to validate the factors. Confirmatory factor analysis confirmed two factors: Confidence to Care, and Doubts and Concerns. The Caring Efficacy Scale has undergone rigorous psychometric testing, affording evidence of internal consistency and goodness-of-fit indices within satisfactory ranges. The Caring Efficacy Scale is valid for use in an Australian population of registered nurses. The scale can be used as a subscale or total score reflective of self-efficacy in nursing. This scale may assist nursing educators to predict levels of caring efficacy.
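For readers unfamiliar with the internal-consistency measure named above, Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below applies this standard formula to a respondents-by-items table. The function name and data layout are assumptions, and the study's own software and settings are not stated in the abstract.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item (column) across respondents.
    item_vars = [statistics.variance(col) for col in zip(*item_scores)]
    # Sample variance of each respondent's total score.
    total_var = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

When every item ranks respondents identically, the total-score variance is dominated by shared variance and alpha approaches 1.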
Abstract:
This paper presents a novel framework for the modelling of passenger facilitation in a complex environment. The research is motivated by the challenges in the airport complex system, where there are multiple stakeholders, differing operational objectives and complex interactions and interdependencies between different parts of the airport system. Traditional methods for airport terminal modelling do not explicitly address the need for understanding causal relationships in a dynamic environment. Additionally, existing Bayesian Network (BN) models, which provide a means for capturing causal relationships, only present a static snapshot of a system. A method to integrate a BN complex systems model with stochastic queuing theory is developed based on the properties of the Poisson and Exponential distributions. The resultant Hybrid Queue-based Bayesian Network (HQBN) framework enables the simulation of arbitrary factors, their relationships, and their effects on passenger flow and vice versa. A case study implementation of the framework is demonstrated on the inbound passenger facilitation process at Brisbane International Airport. The predicted outputs of the model, in terms of cumulative passenger flow at intermediary and end points in the inbound process, are found to have an $R^2$ goodness of fit of 0.9994 and 0.9982 respectively over a 10-hour test period. The utility of the framework is demonstrated on a number of usage scenarios including real-time monitoring and ‘what-if’ analysis. This framework provides the ability to analyse and simulate a dynamic complex system, and can be applied to other socio-technical systems such as hospitals.
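The abstract does not say how its $R^2$ values were computed. Assuming the usual coefficient of determination, which is one minus the ratio of the residual sum of squares to the total sum of squares, a minimal sketch looks as follows. The function name and the example series are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted series."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions match the observations exactly, while a value of 0 means they do no better than predicting the observed mean everywhere.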
Abstract:
A generalised gamma bidding model is presented, which incorporates many previous models. The log likelihood equations are provided. Using a new method of testing, variants of the model are fitted to some real data for construction contract auctions to find the best fitting models for groupings of bidders. The results are examined for simplifying assumptions, including all those in the main literature. These indicate no one model to be best for all datasets. However, some models do appear to perform significantly better than others and it is suggested that future research would benefit from a closer examination of these.
Abstract:
Background The application of theoretical frameworks for modeling predictors of drug risk among male street laborers remains limited. The objective of this study was to test a modified version of the IMB (Information-Motivation-Behavioral Skills Model), which includes psychosocial stress, and compare this modified version with the original IMB model in terms of goodness-of-fit to predict risky drug use behavior among this population. Methods In a cross-sectional study, a social mapping technique was used to recruit 450 male street laborers from 135 street venues across 13 districts of Hanoi city, Vietnam, for face-to-face interviews. Structural equation modeling (SEM) was used to analyze data from interviews. Results Overall measures of fit via SEM indicated that the original IMB model provided a better fit to the data than the modified version. Although the former model explained less variance than the latter (55% vs. 62%), it fit the data better. The findings suggest that men who are better informed and motivated for HIV prevention are more likely to report higher behavioral skills and, in turn, are less likely to engage in risky drug use behavior. Conclusions This was the first application of the modified IMB model for drug use in men who were unskilled, unregistered laborers in urban settings. An AIDS prevention program for these men should not only distribute information and enhance motivations for HIV prevention, but also consider interventions that could improve self-efficacy for preventing HIV infection. Future public health research and action may also consider broader factors such as structural social capital and social policy to alter the conditions that drive risky drug use among these men.
Abstract:
This study considered the problem of predicting survival, based on three alternative models: a single Weibull, a mixture of Weibulls and a cure model. Instead of the common procedure of choosing a single “best” model, where “best” is defined in terms of goodness of fit to the data, a Bayesian model averaging (BMA) approach was adopted to account for model uncertainty. This was illustrated using a case study in which the aim was the description of lymphoma cancer survival with covariates given by phenotypes and gene expression. The results of this study indicate that if the sample size is sufficiently large, one of the three models emerges as having the highest probability given the data, as indicated by the goodness-of-fit measure, the Bayesian information criterion (BIC). However, when the sample size was reduced, no single model was revealed as “best”, suggesting that a BMA approach would be appropriate. Although a BMA approach can compromise on goodness of fit to the data (when compared to the true model), it can provide robust predictions and facilitate more detailed investigation of the relationships between gene expression and patient survival. Keywords: Bayesian modelling; Bayesian model averaging; Cure model; Markov Chain Monte Carlo; Mixture model; Survival analysis; Weibull distribution
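To make the link between the BIC and model probabilities concrete, a common approximation takes the posterior probability of each candidate model to be proportional to exp(-BIC/2) when all models have equal prior probability. The sketch below uses that approximation. It is not necessarily the computation used in the study, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def bic_model_weights(bic_values):
    """Approximate posterior model probabilities, assuming equal model priors."""
    best = min(bic_values)
    # Subtract the smallest BIC before exponentiating for numerical stability.
    raw = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(raw)
    return [r / total for r in raw]
```

When one weight is close to 1, a single model dominates. When the weights are spread across models, averaging predictions with these weights is the basic idea behind BMA.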
Abstract:
Background: The overuse of antibiotics is becoming an increasing concern. Antibiotic resistance, which increases both the burden of disease and the cost of health services, is perhaps the most profound impact of antibiotic overuse. Attempts have been made to develop instruments to measure the psychosocial constructs underlying antibiotic use; however, none of these instruments has undergone thorough psychometric validation. This study evaluates the psychometric properties of the Parental Perceptions on Antibiotics (PAPA) scales. The PAPA scales attempt to measure the factors influencing parental use of antibiotics in children. Methods: 1111 parents of children younger than 12 years old were recruited from primary schools’ parental meetings in the Eastern Province of Saudi Arabia from September 2012 to January 2013. The structure of the PAPA instrument was validated using Confirmatory Factor Analysis (CFA) with measurement model fit evaluated using the raw and scaled χ2, Goodness of Fit Index, and Root Mean Square Error of Approximation. Results: A five-factor model was confirmed with the model showing good fit. Constructs in the model include: Knowledge and Beliefs, Behaviors, Sources of information, Adherence, and Awareness about antibiotics resistance. The instrument was shown to have good internal consistency, and good discriminant and convergent validity. Conclusion: The availability of an instrument able to measure the psychosocial factors underlying antibiotic usage allows the risk factors underlying antibiotic use and overuse to now be investigated.
Abstract:
The accident record of the repair, maintenance, minor alteration, and addition (RMAA) sector has been alarmingly high; however, research in the RMAA sector remains limited. Unsafe behavior is considered one of the key causes of accidents. Thus, the organizational factors that influence individual safety behavior at work continue to be the focus of many studies. The safety climate, which reflects the true priority of safety in an organization, has drawn much attention. Safety climate measurement helps to identify areas for safety improvement. The current study aims to identify safety climate factors in the RMAA sector. A questionnaire survey was conducted in the RMAA sector in Hong Kong. Data were randomly split into the calibration and the validation samples. The RMAA safety climate factors were determined by exploratory factor analysis on the calibration sample. Three safety climate factors of the RMAA works were identified: (1) management commitment to occupational health and safety (OHS) and employee involvement; (2) application of safety rules and work practices; and (3) responsibility for health and safety. Confirmatory factor analysis (CFA) was then conducted on the validation sample. The CFA model showed satisfactory goodness of fit, reliability, and validity. The suggested RMAA safety climate factors can be utilized by construction industry practitioners in developed economies to measure the safety climate of their RMAA projects, thereby enhancing the safety of RMAA works.
Abstract:
Background There are few theoretically derived questionnaires of physical activity determinants among youth, and the existing questionnaires have not been subjected to tests of factorial validity and invariance. The present study employed confirmatory factor analysis (CFA) to test the factorial validity and invariance of questionnaires designed to be unidimensional measures of attitudes, subjective norms, perceived behavioral control, and self-efficacy about physical activity. Methods Adolescent girls in eighth grade from two cohorts (N = 955 and 1,797) completed the questionnaires at baseline; participants from cohort 1 (N = 845) also completed the questionnaires in ninth grade (i.e., 1-year follow-up). Factorial validity and invariance were tested using CFA with full-information maximum likelihood estimation in AMOS 4.0. Initially, baseline data from cohort 1 were employed to test the fit and, when necessary, to modify the unidimensional models. The models were cross-validated using a multigroup analysis of factorial invariance on baseline data from cohorts 1 and 2. The models then were subjected to a longitudinal analysis of factorial invariance using baseline and follow-up data from cohort 1. Results The CFAs supported the fit of unidimensional models to the four questionnaires, and the models were cross-validated, as indicated by evidence of multigroup factorial invariance. The models also possessed evidence of longitudinal factorial invariance. Conclusions Evidence was provided for the factorial validity and the invariance of the questionnaires designed to be unidimensional measures of attitudes, subjective norms, perceived behavioral control, and self-efficacy about physical activity among adolescent girls. (C) 2000 American Health Foundation and Academic Press.
Abstract:
Spatial data are now prevalent in a wide range of fields including environmental and health science. This has led to the development of a range of approaches for analysing patterns in these data. In this paper, we compare several Bayesian hierarchical models for analysing point-based data based on the discretization of the study region, resulting in grid-based spatial data. The approaches considered include two parametric models and a semiparametric model. We highlight the methodology and computation for each approach. Two simulation studies are undertaken to compare the performance of these models for various structures of simulated point-based data which resemble environmental data. A case study of a real dataset is also conducted to demonstrate a practical application of the modelling approaches. Goodness-of-fit statistics are computed to compare estimates of the intensity functions. The deviance information criterion is also considered as an alternative model evaluation criterion. The results suggest that the adaptive Gaussian Markov random field model performs well for highly sparse point-based data where there are large variations or clustering across the space, whereas the discretized log Gaussian Cox process produces a good fit in dense and clustered point-based data. One should generally consider the nature and structure of the point-based data in order to choose the appropriate method for modelling discretized spatial point-based data.