363 resultados para Statistical method
em Queensland University of Technology - ePrints Archive
Resumo:
Reliable budget/cost estimates for road maintenance and rehabilitation are subjected to uncertainties and variability in road asset condition and characteristics of road users. The CRC CI research project 2003-029-C ‘Maintenance Cost Prediction for Road’ developed a method for assessing variation and reliability in budget/cost estimates for road maintenance and rehabilitation. The method is based on probability-based reliable theory and statistical method. The next stage of the current project is to apply the developed method to predict maintenance/rehabilitation budgets/costs of large networks for strategic investment. The first task is to assess the variability of road data. This report presents initial results of the analysis in assessing the variability of road data. A case study of the analysis for dry non reactive soil is presented to demonstrate the concept in analysing the variability of road data for large road networks. In assessing the variability of road data, large road networks were categorised into categories with common characteristics according to soil and climatic conditions, pavement conditions, pavement types, surface types and annual average daily traffic. The probability distributions, statistical means, and standard deviation values of asset conditions and annual average daily traffic for each type were quantified. The probability distributions and the statistical information obtained in this analysis will be used to asset the variation and reliability in budget/cost estimates in later stage. Generally, we usually used mean values of asset data of each category as input values for investment analysis. The variability of asset data in each category is not taken into account. This analysis method demonstrated that it can be used for practical application taking into account the variability of road data in analysing large road networks for maintenance/rehabilitation investment analysis.
Resumo:
The measurement error model is a well established statistical method for regression problems in medical sciences, although rarely used in ecological studies. While the situations in which it is appropriate may be less common in ecology, there are instances in which there may be benefits in its use for prediction and estimation of parameters of interest. We have chosen to explore this topic using a conditional independence model in a Bayesian framework using a Gibbs sampler, as this gives a great deal of flexibility, allowing us to analyse a number of different models without losing generality. Using simulations and two examples, we show how the conditional independence model can be used in ecology, and when it is appropriate.
Resumo:
The effect of sample geometry on the melting rates of burning iron rods was assessed. Promoted-ignition tests were conducted with rods having cylindrical, rectangular, and triangular cross-sectional shapes over a range of cross-sectional areas. The regression rate of the melting interface (RRMI) was assessed using a statistical approach which enabled the quantification of confidence levels for the observed differences in RRMI. Statistically significant differences in RRMI were observed for rods with the same cross-sectional area but different cross-sectional shape. The magnitude of the proportional difference in RRMI increased with the cross-sectional area. Triangular rods had the highest RRMI, followed by rectangular rods, and then cylindrical rods. The dependence of RRMI on rod shape is shown to relate to the action of molten metal at corners. The corners of the rectangular and triangular rods melted faster than the faces due to their locally higher surface area to volume ratios. This phenomenon altered the attachment geometry between liquid and solid phases, increasing the surface area available for heat transfer, causing faster melting. Findings relating to the application of standard flammability test results in industrial situations are also presented.
Resumo:
Approximate Bayesian computation has become an essential tool for the analysis of complex stochastic models when the likelihood function is numerically unavailable. However, the well-established statistical method of empirical likelihood provides another route to such settings that bypasses simulations from the model and the choices of the approximate Bayesian computation parameters (summary statistics, distance, tolerance), while being convergent in the number of observations. Furthermore, bypassing model simulations may lead to significant time savings in complex models, for instance those found in population genetics. The Bayesian computation with empirical likelihood algorithm we develop in this paper also provides an evaluation of its own performance through an associated effective sample size. The method is illustrated using several examples, including estimation of standard distributions, time series, and population genetics models.
Resumo:
This paper presents the application of a statistical method for model structure selection of lift-drag and viscous damping components in ship manoeuvring models. The damping model is posed as a family of linear stochastic models, which is postulated based on previous work in the literature. Then a nested test of hypothesis problem is considered. The testing reduces to a recursive comparison of two competing models, for which optimal tests in the Neyman sense exist. The method yields a preferred model structure and its initial parameter estimates. Alternatively, the method can give a reduced set of likely models. Using simulated data we study how the selection method performs when there is both uncorrelated and correlated noise in the measurements. The first case is related to instrumentation noise, whereas the second case is related to spurious wave-induced motion often present during sea trials. We then consider the model structure selection of a modern high-speed trimaran ferry from full scale trial data.
Resumo:
Introduction Natural product provenance is important in the food, beverage and pharmaceutical industries, for consumer confidence and with health implications. Raman spectroscopy has powerful molecular fingerprint abilities. Surface Enhanced Raman Spectroscopy’s (SERS) sharp peaks allow distinction between minimally different molecules, so it should be suitable for this purpose. Methods Naturally caffeinated beverages with Guarana extract, coffee and Red Bull energy drink as a synthetic caffeinated beverage for comparison (20 µL ea.) were reacted 1:1 with Gold nanoparticles functionalised with anti-caffeine antibody (ab15221) (10 minutes), air dried and analysed in a micro-Raman instrument. The spectral data was processed using Principle Component Analysis (PCA). Results The PCA showed Guarana sourced caffeine varied significantly from synthetic caffeine (Red Bull) on component 1 (containing 76.4% of the variance in the data). See figure 1. The coffee containing beverages, and in particular Robert Timms (instant coffee) were very similar on component 1, but the barista espresso showed minor variance on component 1. Both coffee sourced caffeine samples varied with red Bull on component 2, (20% of variance). ************************************************************ Figure 1 PCA comparing a naturally caffeinated beverage containing Guarana with coffee. ************************************************************ Discussion PCA is an unsupervised multivariate statistical method that determines patterns within data. Figure 1 shows Caffeine in Guarana is notably different to synthetic caffeine. Other researchers have revealed that caffeine in Guarana plants is complexed with tannins. Naturally sourced/ lightly processed caffeine (Monster Energy, Espresso) are more inherently different than synthetic (Red Bull) /highly processed (Robert Timms) caffeine, in figure 1, which is consistent with this finding and demonstrates this technique’s applicability. Guarana provenance is important because it is still largely hand produced and its demand is escalating with recognition of its benefits. This could be a powerful technique for Guarana provenance, and may extend to other industries where provenance / authentication are required, e.g. the wine or natural pharmaceuticals industries.
Resumo:
Objectives Directly measuring disease incidence in a population is difficult and not feasible to do routinely. We describe the development and application of a new method of estimating at a population level the number of incident genital chlamydia infections, and the corresponding incidence rates, by age and sex using routine surveillance data. Methods A Bayesian statistical approach was developed to calibrate the parameters of a decision-pathway tree against national data on numbers of notifications and tests conducted (2001-2013). Independent beta probability density functions were adopted for priors on the time-independent parameters; the shape parameters of these beta distributions were chosen to match prior estimates sourced from peer-reviewed literature or expert opinion. To best facilitate the calibration, multivariate Gaussian priors on (the logistic transforms of) the time-dependent parameters were adopted, using the Matérn covariance function to favour changes over consecutive years and across adjacent age cohorts. The model outcomes were validated by comparing them with other independent empirical epidemiological measures i.e. prevalence and incidence as reported by other studies. Results Model-based estimates suggest that the total number of people acquiring chlamydia per year in Australia has increased by ~120% over 12 years. Nationally, an estimated 356,000 people acquired chlamydia in 2013, which is 4.3 times the number of reported diagnoses. This corresponded to a chlamydia annual incidence estimate of 1.54% in 2013, increased from 0.81% in 2001 (~90% increase). Conclusions We developed a statistical method which uses routine surveillance (notifications and testing) data to produce estimates of the extent and trends in chlamydia incidence.
Resumo:
Some statistical procedures already available in literature are employed in developing the water quality index, WQI. The nature of complexity and interdependency that occur in physical and chemical processes of water could be easier explained if statistical approaches were applied to water quality indexing. The most popular statistical method used in developing WQI is the principal component analysis (PCA). In literature, the WQI development based on the classical PCA mostly used water quality data that have been transformed and normalized. Outliers may be considered in or eliminated from the analysis. However, the classical mean and sample covariance matrix used in classical PCA methodology is not reliable if the outliers exist in the data. Since the presence of outliers may affect the computation of the principal component, robust principal component analysis, RPCA should be used. Focusing in Langat River, the RPCA-WQI was introduced for the first time in this study to re-calculate the DOE-WQI. Results show that the RPCA-WQI is capable to capture similar distribution in the existing DOE-WQI.
Resumo:
Genome-wide association studies (GWASs) have been successful at identifying single-nucleotide polymorphisms (SNPs) highly associated with common traits; however, a great deal of the heritable variation associated with common traits remains unaccounted for within the genome. Genome-wide complex trait analysis (GCTA) is a statistical method that applies a linear mixed model to estimate phenotypic variance of complex traits explained by genome-wide SNPs, including those not associated with the trait in a GWAS. We applied GCTA to 8 cohorts containing 7096 case and 19 455 control individuals of European ancestry in order to examine the missing heritability present in Parkinson's disease (PD). We meta-analyzed our initial results to produce robust heritability estimates for PD types across cohorts. Our results identify 27% (95% CI 17-38, P = 8.08E - 08) phenotypic variance associated with all types of PD, 15% (95% CI -0.2 to 33, P = 0.09) phenotypic variance associated with early-onset PD and 31% (95% CI 17-44, P = 1.34E - 05) phenotypic variance associated with late-onset PD. This is a substantial increase from the genetic variance identified by top GWAS hits alone (between 3 and 5%) and indicates there are substantially more risk loci to be identified. Our results suggest that although GWASs are a useful tool in identifying the most common variants associated with complex disease, a great deal of common variants of small effect remain to be discovered. © Published by Oxford University Press 2012.
Resumo:
Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
Resumo:
Understanding the complexities that are involved in the genetics of multifactorial diseases is still a monumental task. In addition to environmental factors that can influence the risk of disease, there is also a number of other complicating factors. Genetic variants associated with age of disease onset may be different from those variants associated with overall risk of disease, and variants may be located in positions that are not consistent with the traditional protein coding genetic paradigm. Latent Variable Models are well suited for the analysis of genetic data. A latent variable is one that we do not directly observe, but which is believed to exist or is included for computational or analytic convenience in a model. This thesis presents a mixture of methodological developments utilising latent variables, and results from case studies in genetic epidemiology and comparative genomics. Epidemiological studies have identified a number of environmental risk factors for appendicitis, but the disease aetiology of this oft thought useless vestige remains largely a mystery. The effects of smoking on other gastrointestinal disorders are well documented, and in light of this, the thesis investigates the association between smoking and appendicitis through the use of latent variables. By utilising data from a large Australian twin study questionnaire as both cohort and case-control, evidence is found for the association between tobacco smoking and appendicitis. Twin and family studies have also found evidence for the role of heredity in the risk of appendicitis. Results from previous studies are extended here to estimate the heritability of age-at-onset and account for the eect of smoking. This thesis presents a novel approach for performing a genome-wide variance components linkage analysis on transformed residuals from a Cox regression. This method finds evidence for a dierent subset of genes responsible for variation in age at onset than those associated with overall risk of appendicitis. Motivated by increasing evidence of functional activity in regions of the genome once thought of as evolutionary graveyards, this thesis develops a generalisation to the Bayesian multiple changepoint model on aligned DNA sequences for more than two species. This sensitive technique is applied to evaluating the distributions of evolutionary rates, with the finding that they are much more complex than previously apparent. We show strong evidence for at least 9 well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least 7 classes in an alignment of four mammals, including human. A pattern of enrichment and depletion of genic regions in the profiled segments suggests they are functionally significant, and most likely consist of various functional classes. Furthermore, a method of incorporating alignment characteristics representative of function such as GC content and type of mutation into the segmentation model is developed within this thesis. Evidence of fine-structured segmental variation is presented.
Resumo:
We have developed a new experimental method for interrogating statistical theories of music perception by implementing these theories as generative music algorithms. We call this method Generation in Context. This method differs from most experimental techniques in music perception in that it incorporates aesthetic judgments. Generation In Context is designed to measure percepts for which the musical context is suspected to play an important role. In particular the method is suitable for the study of perceptual parameters which are temporally dynamic. We outline a use of this approach to investigate David Temperley’s (2007) probabilistic melody model, and provide some provisional insights as to what is revealed about the model. We suggest that Temperley’s model could be improved by dynamically modulating the probability distributions according to the changing musical context.