941 results for Statistical power
Abstract:
Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach based on comparing mean values may become statistically inappropriate, and even invalid, when a substantial proportion of the response values is below detection limits or otherwise censored, because the traditional procedures require strong distributional assumptions about the censored observations. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need to impute the censored values. As a demonstration, we applied the methods to a nutrient monitoring project that is part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is in fact smaller than that required by the traditional t-test, illustrating the merit of our method.
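The advantage of a quantile-based test with censored data can be sketched in a small simulation. This is a hedged illustration, not the paper's exact methodology: it uses a one-sample sign test on the median, which only needs to know whether each observation exceeds the threshold, so values censored below a detection limit pose no problem. All distributions and parameters (lognormal concentrations, the threshold, the true median) are invented for illustration.

```python
# Simulated power of a sign test on the median when values below a
# detection limit are censored. Parameters are illustrative only.
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)

def sign_test_power(n, threshold=1.0, true_median=1.5, n_sim=2000, alpha=0.05):
    """Estimate power of H0: median <= threshold via simulation."""
    rejections = 0
    for _ in range(n_sim):
        # Lognormal concentrations with the chosen true median.
        x = rng.lognormal(mean=np.log(true_median), sigma=0.8, size=n)
        # Only exceedance matters, so censoring below `threshold` is harmless.
        exceed = int((x > threshold).sum())
        # One-sided binomial (sign) test: under H0, P(X > threshold) <= 0.5.
        p = binomtest(exceed, n, p=0.5, alternative="greater").pvalue
        rejections += p < alpha
    return rejections / n_sim

power = sign_test_power(n=40)
```

Because the test never uses the numeric values of censored observations, no imputation step is needed, which is the design property the abstract highlights.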
Abstract:
Objectives. This paper seeks to assess the effect of regression model misspecification on statistical power in a variety of situations.

Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared to linear or categorical models, the fractional polynomial models, with higher correlations, provided a better approximation of the true relationship, as illustrated by LOESS regression. In the third section, we present the results of simulation studies demonstrating that misspecification in regression can produce marked decreases in power at small sample sizes. The categorical model, however, had the greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable. The power of the fractional polynomial model was close to that of the linear model, ranging from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model.

Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect of misspecification on statistical power when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate situations with an unknown or complex correct model specification. Simulation of power for misspecified models confirmed the results based on correlation methods and also illustrated the effect of model degrees of freedom on power.
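The kind of power simulation described above can be sketched minimally as follows. This is a hedged illustration with invented data, not the paper's NHANES analysis: the true relationship is quadratic, and we compare the power of a linear specification (a t-test on the slope) against a categorical specification (one-way ANOVA across quartile groups of the predictor).

```python
# Power to detect a nonlinear predictor under two misspecified models:
# linear vs. categorical (quartile groups). All parameters are invented.
import numpy as np
from scipy.stats import linregress, f_oneway

rng = np.random.default_rng(1)

def power_by_specification(n=100, n_sim=1000, alpha=0.05):
    rej_lin = rej_cat = 0
    for _ in range(n_sim):
        x = rng.uniform(-1, 1, n)
        y = 0.5 * x**2 + rng.normal(0, 0.3, n)   # true relation is quadratic
        # Linear specification: t-test on the regression slope.
        rej_lin += linregress(x, y).pvalue < alpha
        # Categorical specification: one-way ANOVA across quartile bins.
        bins = np.quantile(x, [0.25, 0.5, 0.75])
        g = np.digitize(x, bins)
        groups = [y[g == k] for k in range(4)]
        rej_cat += f_oneway(*groups).pvalue < alpha
    return rej_lin / n_sim, rej_cat / n_sim

p_lin, p_cat = power_by_specification()
```

With a symmetric quadratic truth, the linear model's slope is near zero, so its rejection rate stays near the nominal alpha, while the categorical model retains substantial power: an extreme case of the misspecification penalty the abstract quantifies.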
Abstract:
Statistical software is now commonly available to calculate power (P') and sample size (N) for most experimental designs. In many circumstances, however, sample size is constrained by lack of time, cost and, in research involving human subjects, the difficulty of recruiting suitable individuals. In addition, the calculation of N is often based on erroneous assumptions about variability, so such estimates are often inaccurate. At best, we would suggest that such calculations provide only a very rough guide to how to proceed in an experiment. Nevertheless, calculation of P' is very useful, especially in experiments that have failed to detect a difference the experimenter thought was present. We would recommend that P' always be calculated in these circumstances, to determine whether the experiment was simply too small to test the null hypothesis adequately.
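The P' calculation recommended above can be sketched for a two-sample t-test using the noncentral t distribution. This is a generic illustration, not taken from the paper; the effect size d and group sizes are invented to contrast an underpowered study with a conventionally powered one.

```python
# Power of a two-sided two-sample t-test from the noncentral t distribution.
# The numbers (d = 0.5, n = 12 vs. n = 64 per group) are illustrative only.
from scipy import stats

def t_test_power(d, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample t-test with equal group sizes."""
    df = 2 * n_per_group - 2
    nc = d * (n_per_group / 2) ** 0.5          # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # P(reject H0) under the alternative = both tails of the noncentral t.
    return stats.nct.sf(t_crit, df, nc) + stats.nct.cdf(-t_crit, df, nc)

p_small = t_test_power(d=0.5, n_per_group=12)   # a typically underpowered study
p_large = t_test_power(d=0.5, n_per_group=64)   # near the conventional 80% target
```

A non-significant result from the small design says little, since such an experiment would miss a medium effect most of the time, which is exactly the diagnosis the abstract argues P' provides.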
Abstract:
The concept of sample size and statistical power estimation is now something that optometrists who want to perform research, whether in practice or in an academic institution, can no longer avoid. Ethics committees, journal editors and grant-awarding bodies increasingly request that all research be backed up with sample size and statistical power estimation in order to justify a study and its findings. This article presents a step-by-step guide to the process of determining sample size and statistical power. It builds on statistical concepts presented in earlier articles in Optometry Today by Richard Armstrong and Frank Eperjesi.
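The core of such a step-by-step sample size determination is often the normal-approximation formula n = 2 ((z_{1-α/2} + z_{1-β}) / d)² per group for a two-group comparison. A minimal sketch, with an illustrative medium effect size (d = 0.5) and the conventional 80% power target; this is a generic textbook formula, not necessarily the article's exact procedure.

```python
# Per-group sample size for a two-group comparison, normal approximation.
# d = 0.5 and 80% power are conventional illustrative choices.
import math
from scipy.stats import norm

def sample_size_two_groups(d, alpha=0.05, power=0.80):
    z_alpha = norm.ppf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = norm.ppf(power)            # ~0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

n_per_group = sample_size_two_groups(d=0.5)   # ~63 per group
```

Halving the expected effect size quadruples the required n, which is why the assumed effect size is the single most consequential input in these calculations.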
Abstract:
Multivariate volatility forecasts are an important input to many financial applications, in particular portfolio optimisation problems. Given the number of models available and the range of loss functions used to discriminate between them, selecting the optimal forecasting model is clearly challenging. The aim of this thesis is to investigate thoroughly how effective many commonly used statistical (MSE and QLIKE) and economic (portfolio variance and portfolio utility) loss functions are at discriminating between competing multivariate volatility forecasts. An analytical investigation of the loss functions is performed to determine whether they identify the correct forecast as the best forecast. This is followed by an extensive simulation study that examines the ability of the loss functions to consistently rank forecasts, and their statistical power within tests of predictive ability. For the tests of predictive ability, the model confidence set (MCS) approach of Hansen, Lunde and Nason (2003, 2011) is employed. An empirical study then investigates whether the simulation findings hold in a realistic setting. In light of these earlier studies, a major empirical study seeks to identify the set of superior multivariate volatility forecasting models from 43 models that use either daily squared returns or realised volatility to generate forecasts. This study also assesses how the choice of volatility proxy affects the ability of the statistical loss functions to discriminate between forecasts. Analysis of the loss functions shows that QLIKE, MSE and portfolio variance can discriminate between multivariate volatility forecasts, while portfolio utility cannot. An examination of the effective loss functions shows that they can all identify the correct forecast at a point in time; however, their ability to discriminate between competing forecasts varies. QLIKE is identified as the most effective loss function, followed by portfolio variance and then MSE. The major empirical analysis reports that the optimal set of multivariate volatility forecasting models includes forecasts generated from both daily squared returns and realised volatility. Furthermore, it finds that the volatility proxy affects the statistical loss functions' ability to discriminate between forecasts in tests of predictive ability. These findings deepen our understanding of how to choose between competing multivariate volatility forecasts.
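The two statistical loss functions named above are standard in the volatility forecasting literature; a univariate sketch shows their contrasting behaviour. This is a hedged illustration with invented numbers (the thesis works with multivariate versions): MSE is symmetric in the forecast error, while QLIKE penalises under-prediction of variance more heavily than over-prediction.

```python
# Univariate MSE and QLIKE volatility loss functions, illustrative values only.
import numpy as np

def mse_loss(proxy, forecast):
    """Squared distance between a volatility proxy and the forecast."""
    return (proxy - forecast) ** 2

def qlike_loss(proxy, forecast):
    """QLIKE: asymmetric, penalising under-prediction of variance more."""
    return np.log(forecast) + proxy / forecast

proxy = 1.0             # e.g. realised variance for the day
under, over = 0.5, 1.5  # forecasts missing by the same absolute amount
# MSE scores both misses identically; QLIKE scores the under-prediction worse.
```

This asymmetry is one reason QLIKE is often preferred for risk applications, where under-predicting variance is the costlier mistake.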
Abstract:
Nitrous oxide emissions from soil are known to be spatially and temporally volatile. Reliable estimation of emissions over a given time and space depends on measuring with sufficient intensity, but deciding on the number of measuring stations and the frequency of observation can be vexing. The question also arises of whether low-frequency manual observations provide results comparable to high-frequency automated sampling. Data collected from a replicated field experiment were studied intensively with the intention of giving statistically robust guidance on these issues. The experiment monitored nitrous oxide soil-to-air flux within 10 m by 2.5 m plots, by automated closed chambers at a 3 h average sampling interval and by manual static chambers at a three-day average sampling interval, over sixty days. Trends in flux over time observed by the static chambers were mostly within the auto-chamber bounds of experimental error. Cumulative nitrous oxide emissions as measured by each system were also within error bounds. Under the temporal response pattern in this experiment, no significant loss of information was observed after culling the data to simulate various low-frequency scenarios. Within the confines of this experiment, observations from the manual chambers were not spatially correlated above distances of 1 m. Statistical power was therefore found to improve with increased replicates per treatment or chambers per replicate. Careful after-action review of experimental data can deliver savings for future work.
Abstract:
For complex disease genetics research in human populations, remarkable progress has been made in recent times with the publication of a number of genome-wide association scans (GWAS) and subsequent statistical replications. These studies have identified new genes and pathways implicated in disease, many of which were not known before. Given these early successes, more GWAS are being conducted and planned, both for disease and for quantitative phenotypes. Many researchers and clinicians have DNA samples available on collections of families, including both cases and controls. Twin registries around the world have facilitated the collection of large numbers of families, with DNA and multiple quantitative phenotypes collected on twin pairs and their relatives. In the design of a new GWAS with a fixed budget for the number of chips, the question arises whether to include or exclude related individuals. It is commonly believed to be preferable to use unrelated individuals in the first stage of a GWAS because relatives are 'over-matched' for genotypes. In this study, we show that, for a GWAS of a quantitative phenotype, surprisingly little power is lost when using relatives rather than a sample of unrelated individuals. The advantages of using relatives are manifold, including the ability to perform more quality control, the choice to perform within-family tests of association that are robust to population stratification, and the ability to perform joint linkage and association analysis. Therefore, the advantages of using relatives in GWAS for quantitative traits may well outweigh the small disadvantage in terms of statistical power.
Abstract:
Interest in development of offshore renewable energy facilities has led to a need for high-quality, statistically robust information on marine wildlife distributions. A practical approach is described to estimate the amount of sampling effort required to have sufficient statistical power to identify species-specific "hotspots" and "coldspots" of marine bird abundance and occurrence in an offshore environment divided into discrete spatial units (e.g., lease blocks), where "hotspots" and "coldspots" are defined relative to a reference (e.g., regional) mean abundance and/or occurrence probability for each species of interest. For example, a location with average abundance or occurrence that is three times larger than the mean (3x effect size) could be defined as a "hotspot," and a location that is three times smaller than the mean (1/3x effect size) as a "coldspot." The choice of the effect size used to define hotspots and coldspots will generally depend on a combination of ecological and regulatory considerations. A method is also developed for testing the statistical significance of possible hotspots and coldspots. Both methods are illustrated with historical seabird survey data from the USGS Avian Compendium Database.
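One simple way to formalise a hotspot significance test of this kind is an exact Poisson tail probability against the regional mean count. This is a hedged sketch, not the paper's exact procedure, and the counts used (a block with 15 birds against a regional mean of 4 per block) are invented.

```python
# Exact Poisson test of whether a block count exceeds the regional mean.
# The counts are illustrative only.
from scipy import stats

def hotspot_pvalue(observed, reference_mean):
    """One-sided p-value for 'this block's count is above the regional mean'."""
    # P(X >= observed) under X ~ Poisson(reference_mean).
    return stats.poisson.sf(observed - 1, reference_mean)

# A candidate hotspot: 15 birds where the regional mean is 4 (a 3.75x effect).
p = hotspot_pvalue(15, 4.0)
```

The same machinery gives the coldspot test from the lower tail, and the required effect size (e.g. 3x vs. 1/3x the mean) maps directly onto how extreme a count must be to reach significance at a given survey effort.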
Abstract:
There are now many reports of imaging experiments with small cohorts of typical participants that precede large-scale, often multicentre studies of psychiatric and neurological disorders. Data from these calibration experiments are sufficient to make estimates of statistical power and predictions of sample size and minimum observable effect sizes. In this technical note, we suggest how previously reported voxel-based power calculations can support decision making in the design, execution and analysis of cross-sectional multicentre imaging studies. The choice of MRI acquisition sequence, distribution of recruitment across acquisition centres, and changes to the registration method applied during data analysis are considered as examples. The consequences of modification are explored in quantitative terms by assessing the impact on sample size for a fixed effect size and detectable effect size for a fixed sample size. The calibration experiment dataset used for illustration was a precursor to the now complete Medical Research Council Autism Imaging Multicentre Study (MRC-AIMS). Validation of the voxel-based power calculations is made by comparing the predicted values from the calibration experiment with those observed in MRC-AIMS. The effect of non-linear mappings during image registration to a standard stereotactic space on the prediction is explored with reference to the amount of local deformation. In summary, power calculations offer a validated, quantitative means of making informed choices on important factors that influence the outcome of studies that consume significant resources.
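The trade-off described above, between sample size for a fixed effect size and detectable effect size for a fixed sample size, can be sketched with the normal-approximation inverse: for a fixed per-group n, the minimum detectable standardised effect is d = (z_{1-α/2} + z_{1-β}) √(2/n). This is a generic two-group approximation, not the paper's voxel-based calculation, and the group sizes are invented.

```python
# Minimum detectable effect size for a fixed per-group n (normal approximation).
# Group sizes are illustrative only.
from scipy.stats import norm

def detectable_effect_size(n_per_group, alpha=0.05, power=0.80):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z * (2 / n_per_group) ** 0.5

# Halving the usable sample (e.g. losing an acquisition centre) inflates the
# smallest effect detectable at 80% power by a factor of sqrt(2).
d_full = detectable_effect_size(40)
d_half = detectable_effect_size(20)
```

Applied voxel-wise (with appropriate multiple-comparison-adjusted alpha), this is the kind of quantitative comparison that lets design choices such as acquisition sequence or centre allocation be weighed before recruitment begins.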
Abstract:
In the 1980s, leukaemia clusters were discovered around nuclear fuel reprocessing plants at Sellafield and Dounreay in the United Kingdom. This raised public concern about the risk of childhood leukaemia near nuclear power plants (NPPs). Since then, the topic has been well studied, but methodological limitations make results difficult to interpret. Our review aims to: (1) summarise current evidence on the relationship between NPPs and risk of childhood leukaemia, with a focus on the Swiss CANUPIS (Childhood cancer and nuclear power plants in Switzerland) study; (2) discuss the limitations of previous research; and (3) suggest directions for future research. There are various reasons why previous studies produced inconclusive results. These include: inadequate study designs and limited statistical power due to the low prevalence of exposure (living near an NPP) and outcome (leukaemia); lack of accurate exposure estimates; limited knowledge of the aetiology of childhood leukaemia, particularly of vulnerable time windows and latent periods; use of residential location at time of diagnosis only and lack of data on address histories; and inability to adjust for potential confounders. We conclude that the risk of childhood leukaemia around NPPs should continue to be monitored and that study designs should be improved and standardised. Data should be pooled internationally to increase statistical power. More research needs to be done on other putative risk factors for childhood cancer, such as low-dose ionising radiation, exposure to certain chemicals and exposure to infections. Studies should be designed to allow examination of multiple exposures.
Abstract:
Prostate cancer (CaP) is the second leading cause of cancer-related deaths in North American males and the most common newly diagnosed cancer in men worldwide. Biomarkers are widely used for both early detection and prognostic tests for cancer. The current, commonly used biomarker for CaP is serum prostate-specific antigen (PSA). However, the specificity of this biomarker is low, as its serum level is increased not only in CaP but also in various other diseases, with age and even with body mass index. Human body fluids provide an excellent resource for the discovery of biomarkers, with the advantage over tissue/biopsy samples of their ease of access, due to the less invasive nature of collection. However, their analysis presents challenges in terms of variability and validation. Blood and urine are two human body fluids commonly used for CaP research, but their proteomic analyses are limited both by the large dynamic range of protein abundance, which makes detection of low-abundance proteins difficult, and, in the case of urine, by the high salt concentration. To overcome these challenges, different techniques for removal of high-abundance proteins and enrichment of low-abundance proteins are used. Their applications and limitations are discussed in this review. A number of innovative proteomic techniques have improved detection of biomarkers. They include two-dimensional differential gel electrophoresis (2D-DIGE), quantitative mass spectrometry (MS) and functional proteomic studies, i.e., investigating the association of post-translational modifications (PTMs) such as phosphorylation, glycosylation and protein degradation. The recent development of quantitative MS techniques such as stable isotope labeling with amino acids in cell culture (SILAC), isobaric tags for relative and absolute quantitation (iTRAQ) and multiple reaction monitoring (MRM) has allowed proteomic researchers to quantitatively compare data from different samples. 2D-DIGE has greatly improved the statistical power of classical 2D gel analysis by introducing an internal control. This chapter aims to review novel CaP biomarkers as well as to discuss current trends in biomarker research from two angles: the source of biomarkers (particularly human body fluids such as blood and urine), and emerging proteomic approaches for biomarker research.
Abstract:
Background: Despite important implications for the budgets, statistical power and generalisability of research findings, detailed reports of recruitment and retention in randomised controlled trials (RCTs) are rare. The NOURISH RCT evaluated a community-based intervention for first-time mothers that promoted protective infant feeding practices as a primary prevention strategy for childhood obesity. The aim of this paper is to provide a detailed description and evaluation of the recruitment and retention strategies used. Methods: A two stage recruitment process designed to provide a consecutive sampling framework was used. First time mothers delivering healthy term infants were initially approached in postnatal wards of the major maternity services in two Australian cities for consent to later contact (Stage 1). When infants were about four months old mothers were re-contacted by mail for enrolment (Stage 2), baseline measurements (Time 1) and subsequent random allocation to the intervention or control condition. Outcomes were assessed at infant ages 14 months (Time 2) and 24 months (Time 3). Results: At Stage 1, 86% of eligible mothers were approached and of these women, 76% consented to later contact. At Stage 2, 3% had become ineligible and 76% could be recontacted. Of the latter, 44% consented to full enrolment and were allocated. This represented 21% of mothers screened as eligible at Stage 1. Retention at Time 3 was 78%. Mothers who did not consent or discontinued the study were younger and less likely to have a university education. Conclusions: The consent and retention rates of our sample of first time mothers are comparable with or better than other similar studies. The recruitment strategy used allowed for detailed information from non-consenters to be collected; thus selection bias could be estimated. 
Recommendations for future studies include being able to contact participants via mobile phone (particularly text messaging), offering home visits to reduce participant burden, and considering the use of financial incentives to support participant retention.