925 resultados para Statistical packages


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Provides an accessible foundation to Bayesian analysis using real world models This book aims to present an introduction to Bayesian modelling and computation, by considering real case studies drawn from diverse fields spanning ecology, health, genetics and finance. Each chapter comprises a description of the problem, the corresponding model, the computational method, results and inferences as well as the issues that arise in the implementation of these approaches. Case Studies in Bayesian Statistical Modelling and Analysis: •Illustrates how to do Bayesian analysis in a clear and concise manner using real-world problems. •Each chapter focuses on a real-world problem and describes the way in which the problem may be analysed using Bayesian methods. •Features approaches that can be used in a wide area of application, such as, health, the environment, genetics, information science, medicine, biology, industry and remote sensing. Case Studies in Bayesian Statistical Modelling and Analysis is aimed at statisticians, researchers and practitioners who have some expertise in statistical modelling and analysis, and some understanding of the basics of Bayesian statistics, but little experience in its application. Graduate students of statistics and biostatistics will also find this book beneficial.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Objectives Demonstrate the application of decision trees – classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs) – to understand structure in missing data. Setting Data taken from employees at three different industry sites in Australia. Participants 7915 observations were included. Materials and Methods The approach was evaluated using an occupational health dataset comprising results of questionnaires, medical tests, and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results CART and BRT models were effective in highlighting a missingness structure in the data, related to the Type of data (medical or environmental), the site in which it was collected, the number of visits and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured compared to structured missingness. Discussion Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusion Researchers are encouraged to use CART and BRT models to explore and understand missing data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work examined a new method of detecting small water filled cracks in underground insulation ('water trees') using data from commecially available non-destructive testing equipment. A testing facility was constructed and a computer simulation of the insulation designed in order to test the proposed ageing factor - the degree of non-linearity. This was a large industry-backed project involving an ARC linkage grant, Ergon Energy and the University of Queensland, as well as the Queensland University of Technology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We defined a new statistical fluid registration method with Lagrangian mechanics. Although several authors have suggested that empirical statistics on brain variation should be incorporated into the registration problem, few algorithms have included this information and instead use regularizers that guarantee diffeomorphic mappings. Here we combine the advantages of a large-deformation fluid matching approach with empirical statistics on population variability in anatomy. We reformulated the Riemannian fluid algorithmdeveloped in [4], and used a Lagrangian framework to incorporate 0 th and 1st order statistics in the regularization process. 92 2D midline corpus callosum traces from a twin MRI database were fluidly registered using the non-statistical version of the algorithm (algorithm 0), giving initial vector fields and deformation tensors. Covariance matrices were computed for both distributions and incorporated either separately (algorithm 1 and algorithm 2) or together (algorithm 3) in the registration. We computed heritability maps and two vector and tensorbased distances to compare the power and the robustness of the algorithms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we used a nonconservative Lagrangian mechanics approach to formulate a new statistical algorithm for fluid registration of 3-D brain images. This algorithm is named SAFIRA, acronym for statistically-assisted fluid image registration algorithm. A nonstatistical version of this algorithm was implemented, where the deformation was regularized by penalizing deviations from a zero rate of strain. In, the terms regularizing the deformation included the covariance of the deformation matrices Σ and the vector fields (q). Here, we used a Lagrangian framework to reformulate this algorithm, showing that the regularizing terms essentially allow nonconservative work to occur during the flow. Given 3-D brain images from a group of subjects, vector fields and their corresponding deformation matrices are computed in a first round of registrations using the nonstatistical implementation. Covariance matrices for both the deformation matrices and the vector fields are then obtained and incorporated (separately or jointly) in the nonconservative terms, creating four versions of SAFIRA. We evaluated and compared our algorithms' performance on 92 3-D brain scans from healthy monozygotic and dizygotic twins; 2-D validations are also shown for corpus callosum shapes delineated at midline in the same subjects. After preliminary tests to demonstrate each method, we compared their detection power using tensor-based morphometry (TBM), a technique to analyze local volumetric differences in brain structure. We compared the accuracy of each algorithm variant using various statistical metrics derived from the images and deformation fields. All these tests were also run with a traditional fluid method, which has been quite widely used in TBM studies. The versions incorporating vector-based empirical statistics on brain variation were consistently more accurate than their counterparts, when used for automated volumetric quantification in new brain images. This suggests the advantages of this approach for large-scale neuroimaging studies.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This chapter addresses opportunities for problem posing in developing young children’s statistical literacy, with a focus on student-directed investigations. Although the notion of problem posing has broadened in recent years, there nevertheless remains limited research on how problem posing can be integrated within the regular mathematics curriculum, especially in the areas of statistics and probability. The chapter first reviews briefly aspects of problem posing that have featured in the literature over the years. Consideration is next given to the importance of developing children’s statistical literacy in which problem posing is an inherent feature. Some findings from a school playground investigation conducted in four, fourth-grade classes illustrate the different ways in which children posed investigative questions, how they made predictions about their outcomes and compared these with their findings, and the ways in which they chose to represent their findings.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

As statistical education becomes more firmly embedded in the school curriculum and its value across the curriculum is recognised, attention moves from knowing procedures, such as calculating a mean or drawing a graph, to understanding the purpose of a statistical investigation in decision making in many disciplines. As students learn to complete the stages of an investigation, the question of meaningful assessment of the process arises. This paper considers models for carrying out a statistical inquiry and, based on a four-phase model, creates a developmental squence that can be used for the assessment of outcomes from each of the four phases as well as for the complete inquiry. The developmental sequence is based on the SOLO model, focussing on the "observed" outcomes during the inquiry process.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article examines a social media assignment used to teach and practice statistical literacy with over 400 students each semester in large-lecture traditional, fully online, and flipped sections of an introductory-level statistics course. Following the social media assignment, students completed a survey on how they approached the assignment. Drawing from the authors’ experiences with the project and the survey results, this article offers recommendations for developing social media assignments in large courses that focus on the interplay between the social media tool and the implications of assignment prompts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Using surface charts at 0330GMT, the movement df the monsoon trough during the months June to September 1990 al two fixed longitudes, namely 79 degrees E and 85 degrees E, is studied. The probability distribution of trough position shows that the median, mean and mode occur at progressively more northern latitudes, especially at 85 degrees E, with a pronounced mode that is close to the northern-most limit reached by the trough. A spectral analysis of the fluctuating latitudinal position of the trough is carried out using FFT and the Maximum Entropy Method (MEM). Both methods show significant peaks around 7.5 and 2.6 days, and a less significant one around 40-50 days. The two peaks at the shorter period are more prominent at the eastern longitude. MEM shows an additional peak around 15 days. A study of the weather systems that occurred during the season shows them to have a duration around 3 days and an interval between systems of around 9 days, suggesting a possible correlation with the dominant short periods observed in the spectrum of trough position.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The main objective of statistical analysis of experi- mental investigations is to make predictions on the basis of mathematical equations so as the number of experiments. Abrasive jet machining (AJM) is an unconventional and novel machining process wherein microabrasive particles are propelled at high veloc- ities on to a workpiece. The resulting erosion can be used for cutting, etching, cleaning, deburring, drilling and polishing. In the study completed by the authors, statistical design of experiments was successfully employed to predict the rate of material removal by AJM. This paper discusses the details of such an approach and the findings.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The export of sediments from coastal catchments can have detrimental impacts on estuaries and near shore reef ecosystems such as the Great Barrier Reef. Catchment management approaches aimed at reducing sediment loads require monitoring to evaluate their effectiveness in reducing loads over time. However, load estimation is not a trivial task due to the complex behaviour of constituents in natural streams, the variability of water flows and often a limited amount of data. Regression is commonly used for load estimation and provides a fundamental tool for trend estimation by standardising the other time specific covariates such as flow. This study investigates whether load estimates and resultant power to detect trends can be enhanced by (i) modelling the error structure so that temporal correlation can be better quantified, (ii) making use of predictive variables, and (iii) by identifying an efficient and feasible sampling strategy that may be used to reduce sampling error. To achieve this, we propose a new regression model that includes an innovative compounding errors model structure and uses two additional predictive variables (average discounted flow and turbidity). By combining this modelling approach with a new, regularly optimised, sampling strategy, which adds uniformity to the event sampling strategy, the predictive power was increased to 90%. Using the enhanced regression model proposed here, it was possible to detect a trend of 20% over 20 years. This result is in stark contrast to previous conclusions presented in the literature. (C) 2014 Elsevier B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the development of statistical models for prediction of constituent concentration of riverine pollutants, which is a key step in load estimation from frequent flow rate data and less frequently collected concentration data. We consider how to capture the impacts of past flow patterns via the average discounted flow (ADF) which discounts the past flux based on the time lapsed - more recent fluxes are given more weight. However, the effectiveness of ADF depends critically on the choice of the discount factor which reflects the unknown environmental cumulating process of the concentration compounds. We propose to choose the discount factor by maximizing the adjusted R-2 values or the Nash-Sutcliffe model efficiency coefficient. The R2 values are also adjusted to take account of the number of parameters in the model fit. The resulting optimal discount factor can be interpreted as a measure of constituent exhaustion rate during flood events. To evaluate the performance of the proposed regression estimators, we examine two different sampling scenarios by resampling fortnightly and opportunistically from two real daily datasets, which come from two United States Geological Survey (USGS) gaging stations located in Des Plaines River and Illinois River basin. The generalized rating-curve approach produces biased estimates of the total sediment loads by -30% to 83%, whereas the new approaches produce relatively much lower biases, ranging from -24% to 35%. This substantial improvement in the estimates of the total load is due to the fact that predictability of concentration is greatly improved by the additional predictors.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach based on comparing the mean values may become statistically inappropriate and even invalid when substantial proportions of the response values are below the detection limits or censored because strong distributional assumptions have to be made on the censored observations when implementing the traditional procedures. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need of imputing the censored values. As a demonstration, we applied the methods to a nutrient monitoring project, which is a part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is, in fact, smaller than that by the traditional t-test, illustrating the merit of our method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The charge at which adsorption of orgamc compounds attains a maximum ( \sigma MAX M) at an electrochenucal interface is analysed using several multi-state models in a hierarchical manner The analysis is based on statistical mechamcal results for the following models (A) two-state site parity, (B) two-state muhl-slte, and (C) three-state site parity The coulombic interactions due to permanent and reduced dipole effects (using mean field approximation), electrostatic field effects and specific substrate interactions have been taken into account. The simplest model in the hierarchy (two-state site parity) yields the exphcit dependence of ( \sigma MAX M) on the permanent dipole moment, polarizability of the solvent and the adsorbate, lattice spacing, effective coordination number, etc Other models in the baerarchy bring to hght the influence of the solvent structure and the role of substrate interactions, etc As a result of this approach, the "composition" of oM.x m terms of the fundamental molecular constants becomes clear. With a view to use these molecular results to maxamum advantage, the derived results for ( \sigma MAX M) have been converted into those involving experimentally observable parameters lake Co, C 1, E N, etc Wherever possible, some of the earlier phenomenologlcal relations reported for ( \sigma MAX M), notably by Parsons, Damaskm and Frumkln, and Trasattl, are shown to have a certain molecular basis, vlz a simple two-state sate panty model.As a corollary to the hxerarcbacal modelling, \sigma MAX M and the potential corresponding to at (Emax) are shown to be constants independent of 0max or Corg for all models The lmphcatlon of our analysis f o r OmMa x with respect to that predicted by the generalized surface layer equation (which postulates Om~ and Ema x varlaUon with 0) is discussed in detail Finally we discuss an passing o M. and the electrosorptlon valency an this context.