989 results for statistical bias


Relevance:

100.00%

Publisher:

Abstract:

I describe an exploration criterion that attempts to minimize the error of a learner by minimizing its estimated squared bias. I describe experiments with locally-weighted regression on two simple kinematics problems, and observe that this "bias-only" approach outperforms the more common "variance-only" exploration approach, even in the presence of noise.

Relevance:

70.00%

Publisher:

Abstract:

Projections of Arctic sea ice thickness (SIT) have the potential to inform stakeholders about accessibility to the region, but are currently rather uncertain. The latest suite of CMIP5 Global Climate Models (GCMs) produce a wide range of simulated SIT in the historical period (1979–2014) and exhibit various biases when compared with the Pan-Arctic Ice Ocean Modelling and Assimilation System (PIOMAS) sea ice reanalysis. We present a new method to constrain such GCM simulations of SIT via a statistical bias correction technique. The bias correction successfully constrains the spatial SIT distribution and temporal variability in the CMIP5 projections whilst retaining the climatic fluctuations from individual ensemble members. The bias correction acts to reduce the spread in projections of SIT and reveals the significant contributions of climate internal variability in the first half of the century and of scenario uncertainty from mid-century onwards. The projected date of ice-free conditions in the Arctic under the RCP8.5 high emission scenario occurs in the 2050s, which is a decade earlier than without the bias correction, with potentially significant implications for stakeholders in the Arctic such as the shipping industry. The bias correction methodology developed could be similarly applied to other variables to reduce spread in climate projections more generally.
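As a rough illustration of what a statistical bias correction can look like, the sketch below rescales model output so that its historical mean and spread match a reference data set. This mean-and-variance form is an assumption chosen for simplicity, not necessarily the technique used in the study; the function name and all numbers are hypothetical.

```python
import statistics

def mean_variance_correct(model_hist, reference, model_values):
    """Shift and rescale model values so the historical period matches
    the reference mean and standard deviation (illustrative form only)."""
    mu_m = statistics.fmean(model_hist)
    sd_m = statistics.pstdev(model_hist)
    mu_r = statistics.fmean(reference)
    sd_r = statistics.pstdev(reference)
    # Fall back to a pure mean shift if the model shows no variability.
    scale = sd_r / sd_m if sd_m > 0 else 1.0
    return [mu_r + scale * (x - mu_m) for x in model_values]

# Hypothetical thickness values in metres (not data from the study).
model_hist = [3.0, 3.4, 2.8, 3.2]      # model, historical period
reference = [2.0, 2.2, 1.9, 2.1]       # reanalysis, same period
model_future = [2.5, 2.0, 1.5]         # model projection to be corrected

corrected = mean_variance_correct(model_hist, reference, model_future)
```

Because each series is corrected with its own historical statistics, the year-to-year fluctuations of that series are kept while its offset and spread are adjusted.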

Relevance:

60.00%

Publisher:

Abstract:

We address the issue of noise robustness of reconstruction techniques for frequency-domain optical-coherence tomography (FDOCT). We consider three reconstruction techniques: Fourier, iterative phase recovery, and cepstral techniques. We characterize the reconstructions in terms of their statistical bias and variance and obtain approximate analytical expressions under the assumption of small noise. We also perform Monte Carlo analyses and show that the experimental results are in agreement with the theoretical predictions. It turns out that the iterative and cepstral techniques yield reconstructions with a smaller bias than the Fourier method. The three techniques, however, have identical variance profiles, and their consistency increases linearly as a function of the signal-to-noise ratio.
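The Monte Carlo idea mentioned above (estimating the bias and variance of an estimator from repeated noisy trials) can be sketched generically. The example uses the divide-by-n sample variance as a stand-in estimator with a known bias; it is not an FDOCT reconstruction, and the helper names are hypothetical.

```python
import random
import statistics

def monte_carlo_bias_variance(estimator, true_value, sampler, trials=20000, seed=0):
    """Estimate bias and variance of `estimator` from repeated simulated samples."""
    rng = random.Random(seed)
    estimates = [estimator(sampler(rng)) for _ in range(trials)]
    bias = statistics.fmean(estimates) - true_value
    variance = statistics.pvariance(estimates)
    return bias, variance

def draw_sample(rng, n=5):
    """Five draws from a standard normal, whose true variance is 1."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# statistics.pvariance divides by n, so its expectation is (n - 1) / n = 0.8
# and the theoretical bias is -0.2 for n = 5.
bias, variance = monte_carlo_bias_variance(statistics.pvariance, 1.0, draw_sample)
```

Comparing the simulated bias with the theoretical value of -0.2 mirrors, in miniature, the check of analytical expressions against Monte Carlo results described in the abstract.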

Relevance:

60.00%

Publisher:

Abstract:

The study is intended to estimate the existing rate of participation of women beneficiaries in the development programmes of different organisations in Kerala. It would enable one to understand whether participation is at a satisfactory level or not. Given the rate of participation, the major thrust of the analysis is on the impact of governmental and non-governmental organisations on the rate of participation. This is undertaken under the assumption that NGOs, due to their proximity to people and their needs, ensure better participation rates. Besides the organisational differences, the other major determinants of women's participation, such as their socio-economic characteristics, psychological make-up, and the nature of the programme, are also highlighted. Since the ascribed status of women in society is inferior, the role of organisers, development personnel and local leaders is also pointed out. Thus the basic objective of the study is women's participation and its determinants in the development programmes.

Relevance:

60.00%

Publisher:

Abstract:

Background
It can be argued that adaptive designs are underused in clinical research. We have explored concerns related to inadequate reporting of such trials, which may influence their uptake. Through a careful examination of the literature, we evaluated the standards of reporting of group sequential (GS) randomised controlled trials, one form of a confirmatory adaptive design.

Methods
We undertook a systematic review by searching Ovid MEDLINE from 1 January 2001 to 23 September 2014, supplemented with trials from an audit study. We included parallel group, confirmatory, GS trials that were prospectively designed using a frequentist approach. Eligible trials were examined for compliance in their reporting against the CONSORT 2010 checklist. In addition, as part of our evaluation, we developed a supplementary checklist to explicitly capture group sequential specific reporting aspects, and investigated how these are currently being reported.

Results
Of the 284 screened trials, 68 (24%) were eligible. Most trials were published in “high impact” peer-reviewed journals. Examination of trials established that 46 (68%) were stopped early, predominantly either for futility or efficacy. Suboptimal reporting compliance was found in general items relating to: access to full trial protocols; methods to generate randomisation list(s); and details of randomisation concealment and its implementation. Benchmarked against the supplementary checklist, GS aspects were largely inadequately reported. Only 3 (7%) of the trials which stopped early reported use of statistical bias correction. Moreover, 52 (76%) trials failed to disclose methods used to minimise the risk of operational bias due to the knowledge or leakage of interim results. Occurrence of changes to trial methods and outcomes could not be determined in most trials, due to inaccessible protocols and amendments.

Discussion and Conclusions
There are issues with the reporting of GS trials, particularly those specific to the conduct of interim analyses. Suboptimal reporting of bias correction methods could potentially imply that most GS trials stopping early are giving biased estimates of treatment effects. As a result, research consumers may question the credibility of findings to change practice when trials are stopped early. These issues could be alleviated through a CONSORT extension. Assurance of scientific rigour through transparent, adequate reporting is paramount to the credibility of findings from adaptive trials. Our systematic literature search was restricted to one database due to resource constraints.
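To see why an uncorrected estimate from a trial stopped early for efficacy tends to be biased, a small simulation helps. The sketch below is a deliberately simplified one-sample setting with unit variance and a single interim look; the true effect, interim sample size and critical value are all assumed for illustration and are not taken from the review.

```python
import math
import random
import statistics

def simulate_early_stopping(true_effect=0.3, n_interim=50, z_boundary=2.797,
                            trials=20000, seed=0):
    """Return the interim effect estimates of simulated trials that
    stopped early because the z-statistic crossed the efficacy boundary."""
    rng = random.Random(seed)
    se = 1.0 / math.sqrt(n_interim)  # standard error of the interim mean
    stopped = []
    for _ in range(trials):
        estimate = rng.gauss(true_effect, se)  # simulated interim sample mean
        if estimate / se > z_boundary:         # stop for efficacy
            stopped.append(estimate)
    return stopped

stopped = simulate_early_stopping()
naive_mean = statistics.fmean(stopped)  # average uncorrected estimate among stopped trials
```

Every trial that stops must have an interim estimate above `z_boundary / sqrt(n_interim)`, which here exceeds the true effect of 0.3, so `naive_mean` overstates the effect by construction. Bias-adjusted estimators exist for this situation, and the review found they were rarely reported.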

Relevance:

60.00%

Publisher:

Abstract:

Objective
The use of then-test (retrospective pre-test) scores has frequently been proposed as a solution to potential confounding of change scores because of response shift, as it is assumed that then-test and post-test responses are provided from the same perspective. However, this assumption has not been formally tested using robust quantitative methods. The aim of this study was to compare the psychometric performance of then-test/post-test data with traditional pre-test/post-test data and to assess whether the resulting data structures support the application of the then-test for evaluations of chronic disease self-management interventions.

Study Design and Setting
Pre-test, post-test, and then-test data were collected from 314 participants of self-management courses using the Health Education Impact Questionnaire (heiQ). The derived change scores (pre-test/post-test; then-test/post-test) were examined for their psychometric performance using tests of measurement invariance.

Results
Few questionnaire items were noninvariant across pre-test/post-test, with four items identified and requiring removal to enable an unbiased comparison of factor means. In contrast, 12 items were identified and required removal in then-test/post-test data to avoid biased change score estimates.

Conclusion
Traditional pre-test/post-test data appear to be robust with little indication of response shift. In contrast, the weaker psychometric performance of then-test/post-test data suggests psychometric flaws that may be the result of implicit theory of change, social desirability, and recall bias.

Relevance:

60.00%

Publisher:

Abstract:

The development of new statistical and computational methods is increasingly making it possible to bridge the gap between hard sciences and humanities. In this study, we propose an approach based on a quantitative evaluation of attributes of objects in fields of humanities, from which concepts such as dialectics and opposition are formally defined mathematically. As case studies, we analyzed the temporal evolution of classical music and philosophy by obtaining data for 8 features characterizing the corresponding fields for 7 well-known composers and philosophers, which were treated with multivariate statistics and pattern recognition methods. A bootstrap method was applied to avoid statistical bias caused by the small sample data set, with which hundreds of artificial composers and philosophers were generated, influenced by the 7 names originally chosen. Upon defining indices for opposition, skewness and counter-dialectics, we confirmed the intuitive analysis of historians in that classical music evolved according to a master-apprentice tradition, while in philosophy changes were driven by opposition. Though these case studies were meant only to show the possibility of treating phenomena in humanities quantitatively, including a quantitative measure of concepts such as dialectics and opposition, the results are encouraging for further application of the approach presented here to many other areas, since it is entirely generic.
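For readers unfamiliar with resampling, a plain nonparametric bootstrap on a sample of seven values can be sketched as follows. This is the textbook resample-with-replacement form, shown only to illustrate the general idea; it is not the authors' procedure for generating artificial composers and philosophers, and the scores are hypothetical.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=1000, seed=0):
    """Resample `data` with replacement and return the mean of each resample."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

# Hypothetical feature scores for 7 individuals (not data from the study).
scores = [4.0, 6.5, 5.0, 7.5, 3.5, 6.0, 8.0]

means = bootstrap_means(scores)
std_error = statistics.stdev(means)  # bootstrap estimate of the standard error of the mean
```

The spread of `means` gives a sense of how much a statistic computed from only seven observations could vary under resampling.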

Relevance:

60.00%

Publisher:

Abstract:

Distributed Brillouin sensing of strain and temperature works by making spatially resolved measurements of the position of the measurand-dependent extremum of the resonance curve associated with the scattering process in the weakly nonlinear regime. Typically, measurements of backscattered Stokes intensity (the dependent variable) are made at a number of predetermined fixed frequencies covering the design measurand range of the apparatus and combined to yield an estimate of the position of the extremum. The measurand can then be found because its relationship to the position of the extremum is assumed known. We present analytical expressions relating the relative error in the extremum position to experimental errors in the dependent variable. This is done for two cases: (i) a simple non-parametric estimate of the mean based on moments and (ii) the case in which a least squares technique is used to fit a Lorentzian to the data. The question of statistical bias in the estimates is discussed and in the second case we go further and present for the first time a general method by which the probability density function (PDF) of errors in the fitted parameters can be obtained in closed form in terms of the PDFs of the errors in the noisy data.
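A moment-based estimate of the extremum position can be illustrated with an intensity-weighted mean over the scanned frequencies. The sketch below assumes a noise-free Lorentzian line and a hypothetical scan grid; it shows that such a centroid is exact when the scan is symmetric about the peak but is pulled towards the wider side of the scan otherwise, which is one simple source of statistical bias. The exact estimator used by the authors may differ.

```python
def lorentzian(f, f0, width, peak):
    """Lorentzian with centre f0, full width at half maximum `width`, height `peak`."""
    half = width / 2.0
    return peak * half ** 2 / ((f - f0) ** 2 + half ** 2)

def centroid_estimate(freqs, intensities):
    """First-moment (intensity-weighted mean) estimate of the peak position."""
    total = sum(intensities)
    return sum(f * y for f, y in zip(freqs, intensities)) / total

# Hypothetical scan: 31 fixed frequencies from 10.70 to 11.00 GHz.
freqs = [10.70 + 0.01 * k for k in range(31)]

# Peak in the middle of the scan: the centroid recovers it.
symmetric = centroid_estimate(freqs, [lorentzian(f, 10.85, 0.03, 1.0) for f in freqs])

# Peak off-centre: more of the upper tail is sampled, so the centroid shifts upward.
off_centre = centroid_estimate(freqs, [lorentzian(f, 10.80, 0.03, 1.0) for f in freqs])
```

A least squares Lorentzian fit, the second case in the abstract, does not depend on the scan being centred in this way, at the cost of a nonlinear fitting step.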

Relevance:

30.00%

Publisher:

Abstract:

Sequences of two chloroplast photosystem genes, psaA and psbB, together comprising about 3,500 bp, were obtained for all five major groups of extant seed plants and several outgroups among other vascular plants. Strongly supported, but significantly conflicting, phylogenetic signals were obtained in parsimony analyses from partitions of the data into first and second codon positions versus third positions. In the former, both genes agreed on monophyletic gymnosperms, with Gnetales closely related to certain conifers. In the latter, Gnetales are inferred to be the sister group of all other seed plants, with gymnosperms paraphyletic. None of the data supported the modern “anthophyte hypothesis,” which places Gnetales as the sister group of flowering plants. A series of simulation studies was undertaken to examine the error rate for parsimony inference. Three kinds of errors were examined: random error, systematic bias (both properties of finite data sets), and statistical inconsistency owing to long-branch attraction (an asymptotic property). Parsimony reconstructions were extremely biased for third-position data for psbB. Regardless of the true underlying tree, a tree in which Gnetales are sister to all other seed plants was likely to be reconstructed for these data. None of the combinations of genes or partitions permits the anthophyte tree to be reconstructed with high probability. Simulations of progressively larger data sets indicate the existence of long-branch attraction (statistical inconsistency) for third-position psbB data if either the anthophyte tree or the gymnosperm tree is correct. This is also true for the anthophyte tree using either psaA third positions or psbB first and second positions. A factor contributing to bias and inconsistency is extremely short branches at the base of the seed plant radiation, coupled with extremely high rates in Gnetales and nonseed plant outgroups.

Authors: M. J. Sanderson, M. F. Wojciechowski, J.-M. Hu, T. Sher Khan, and S. G. Brady

Relevance:

30.00%

Publisher:

Abstract:

The main objective of this PhD was to further develop Bayesian spatio-temporal models (specifically the Conditional Autoregressive (CAR) class of models), for the analysis of sparse disease outcomes such as birth defects. The motivation for the thesis arose from problems encountered when analyzing a large birth defect registry in New South Wales. The specific components and related research objectives of the thesis were developed from gaps in the literature on current formulations of the CAR model, and health service planning requirements. Data from a large probabilistically-linked database from 1990 to 2004, consisting of fields from two separate registries: the Birth Defect Registry (BDR) and Midwives Data Collection (MDC) were used in the analyses in this thesis. The main objective was split into smaller goals. The first goal was to determine how the specification of the neighbourhood weight matrix will affect the smoothing properties of the CAR model, and this is the focus of chapter 6. Secondly, I hoped to evaluate the usefulness of incorporating a zero-inflated Poisson (ZIP) component as well as a shared-component model in terms of modeling a sparse outcome, and this is carried out in chapter 7. The third goal was to identify optimal sampling and sample size schemes designed to select individual level data for a hybrid ecological spatial model, and this is done in chapter 8. Finally, I wanted to put together the earlier improvements to the CAR model, and along with demographic projections, provide forecasts for birth defects at the SLA level. Chapter 9 describes how this is done. For the first objective, I examined a series of neighbourhood weight matrices, and showed how smoothing the relative risk estimates according to similarity by an important covariate (i.e. maternal age) helped improve the model’s ability to recover the underlying risk, as compared to the traditional adjacency (specifically the Queen) method of applying weights. 
Next, to address the sparseness and excess zeros commonly encountered in the analysis of rare outcomes such as birth defects, I compared a few models, including an extension of the usual Poisson model to encompass excess zeros in the data. This was achieved via a mixture model, which also encompassed the shared component model to improve the estimation of sparse counts by borrowing strength across a shared component (e.g. latent risk factor/s) with the referent outcome (caesarean section was used in this example). Using the Deviance Information Criterion (DIC), I showed how the proposed model performed better than the usual models, but only when both outcomes shared a strong spatial correlation. The next objective involved identifying the optimal sampling and sample size strategy for incorporating individual-level data with areal covariates in a hybrid study design. I performed extensive simulation studies, evaluating thirteen different sampling schemes along with variations in sample size. This was done in the context of an ecological regression model that incorporated spatial correlation in the outcomes, as well as accommodating both individual and areal measures of covariates. Using the Average Mean Squared Error (AMSE), I showed how a simple random sample of 20% of the SLAs, followed by selecting all cases in the SLAs chosen, along with an equal number of controls, provided the lowest AMSE. The final objective involved combining the improved spatio-temporal CAR model with population (i.e. women) forecasts, to provide 30-year annual estimates of birth defects at the Statistical Local Area (SLA) level in New South Wales, Australia. The projections were illustrated using sixteen different SLAs, representing the various areal measures of socio-economic status and remoteness. A sensitivity analysis of the assumptions used in the projection was also undertaken.
By the end of the thesis, I will show how challenges in the spatial analysis of rare diseases such as birth defects can be addressed, by specifically formulating the neighbourhood weight matrix to smooth according to a key covariate (i.e. maternal age), incorporating a ZIP component to model excess zeros in outcomes and borrowing strength from a referent outcome (i.e. caesarean counts). An efficient strategy to sample individual-level data and sample size considerations for rare disease will also be presented. Finally, projections in birth defect categories at the SLA level will be made.
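The zero-inflated Poisson (ZIP) component mentioned above has a simple probability mass function: with probability `pi` the count is a structural zero, and otherwise it follows a Poisson distribution with mean `lam`. The sketch below shows only this standard mixture form; the parameter names are hypothetical, and the thesis embeds the component in a much richer spatial model.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with Poisson mean `lam`
    and structural-zero probability `pi`."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise from the structural part or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Probability of observing no cases when 30% of areas are structural zeros
# and the remaining areas have a Poisson mean of 2 (hypothetical values).
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
```

Setting `pi` to 0 recovers the ordinary Poisson model, which makes the two easy to compare on sparse count data.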

Relevance:

30.00%

Publisher:

Abstract:

Many traffic situations require drivers to cross or merge into a stream having higher priority. Gap acceptance theory enables us to model such processes to analyse traffic operation. This discussion demonstrated that numerical search fine-tuned by statistical analysis can be used to determine the most likely critical gap for a sample of drivers, based on their largest rejected gap and accepted gap. This method shares some common features with the Maximum Likelihood Estimation technique (Troutbeck 1992) but lends itself well to contemporary analysis tools such as spreadsheets and is particularly analytically transparent. This method is considered not to bias estimation of critical gap due to very small rejected gaps or very large rejected gaps. However, it requires a sufficiently large sample that there is reasonable representation of largest rejected gap/accepted gap pairs within a fairly narrow highest likelihood search band.