877 results for Variable sample size
Abstract:
Spatial data analysis has become increasingly important in ecology and economics over the last decade. One focus of spatial data analysis is how to select predictors, variance functions and correlation functions. In general, however, the true covariance function is unknown and the working covariance structure is often misspecified. In this paper, our target is to find a good strategy for identifying the best model from a candidate set using model selection criteria. We evaluate the ability of several information criteria (the corrected Akaike information criterion, the Bayesian information criterion (BIC) and the residual information criterion (RIC)) to choose the optimal model when the working correlation function, the working variance function and the working mean function are correct or misspecified. Simulations are carried out for small to moderate sample sizes. Four candidate covariance functions (exponential, Gaussian, Matern and rational quadratic) are used in the simulation studies. Summarizing the simulation results, we find that a misspecified working correlation structure can still capture some spatial correlation information in model fitting. When the sample size is large enough, BIC and RIC perform well even if the working covariance is misspecified. Moreover, the performance of these information criteria is related to the average level of model fit, as indicated by the average adjusted R-square, and overall RIC performs well.
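As a purely illustrative sketch of the kind of comparison described above (not the authors' simulation code), the following fits a Gaussian-process model with each of the four candidate correlation functions to hypothetical data and ranks them by BIC; the coordinates and response are invented for the example.

```python
# Minimal sketch (not the authors' code): rank candidate spatial correlation
# functions by BIC using a Gaussian-process fit. `coords` and `y` are hypothetical.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(100, 2))           # hypothetical spatial locations
y = np.sin(coords[:, 0]) + rng.normal(0, 0.3, 100)   # hypothetical response

candidates = {
    "exponential":        Matern(nu=0.5),            # exponential = Matern with nu = 1/2
    "Gaussian":           RBF(),
    "Matern (nu=1.5)":    Matern(nu=1.5),
    "rational quadratic": RationalQuadratic(),
}

n = len(y)
for name, kernel in candidates.items():
    gp = GaussianProcessRegressor(kernel=kernel, alpha=0.1, normalize_y=True).fit(coords, y)
    k = len(gp.kernel_.theta)                         # number of fitted hyperparameters
    bic = k * np.log(n) - 2 * gp.log_marginal_likelihood_value_
    print(f"{name:20s} BIC = {bic:.1f}")              # smaller BIC preferred
```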
Abstract:
Estimation of von Bertalanffy growth parameters has received considerable attention in fisheries research. Since Sainsbury (1980, Can. J. Fish. Aquat. Sci. 37: 241-247), much of this research effort has centered on accounting for individual variability in the growth parameters. In this paper we demonstrate that, in the analysis of tagging data, Sainsbury's method and its derivatives do not, in general, satisfactorily account for individual variability in growth, leading to inconsistent parameter estimates (the bias does not tend to zero as the sample size increases to infinity). The bias arises because these methods do not use appropriate conditional expectations as a basis for estimation, and it is found to be similar to that of the Fabens method. Such methods would be appropriate only under the assumption that the individual growth parameters that generate the growth increment are independent of the growth parameters that generated the initial length; such an assumption is unrealistic. The results are derived analytically and illustrated with a simulation study. Until techniques that take full account of the appropriate conditioning are developed, the effect of individual variability on growth will remain incompletely understood.
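For context, the Fabens growth-increment relation referred to above can be written in standard notation (not reproduced from the paper) as:

```latex
% Expected length increment for an individual of initial length L_1 recaptured
% after time \Delta t, under common parameters L_\infty and K (Fabens 1965):
\[
  \mathrm{E}[\Delta L \mid L_1] = (L_\infty - L_1)\left(1 - e^{-K \Delta t}\right).
\]
% The bias discussed above arises when L_\infty and K vary between individuals,
% because the initial length L_1 is then informative about that individual's
% own growth parameters, violating the conditioning implicit in this expectation.
```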
Abstract:
We propose an iterative estimating equations procedure for analysis of longitudinal data. We show that, under very mild conditions, the probability that the procedure converges at an exponential rate tends to one as the sample size increases to infinity. Furthermore, we show that the limiting estimator is consistent and asymptotically efficient, as expected. The method applies to semiparametric regression models with unspecified covariances among the observations. In the special case of linear models, the procedure reduces to iterative reweighted least squares. Finite sample performance of the procedure is studied by simulations, and compared with other methods. A numerical example from a medical study is considered to illustrate the application of the method.
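A minimal sketch of the linear-model special case mentioned above, iteratively reweighted least squares with a crude working variance function, on hypothetical data (not the authors' procedure):

```python
# Minimal sketch: iteratively reweighted (generalized) least squares for a
# linear model y = X beta + e with unknown error variances. Hypothetical data.
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1 + np.abs(X[:, 1]), size=n)  # heteroscedastic errors

beta = np.linalg.lstsq(X, y, rcond=None)[0]           # ordinary least squares start
for _ in range(20):
    resid = y - X @ beta
    # crude working variance: smooth squared residuals against a covariate
    var_hat = np.poly1d(np.polyfit(np.abs(X[:, 1]), resid**2, 2))(np.abs(X[:, 1]))
    W = 1.0 / np.maximum(var_hat, 1e-6)               # working weights
    XtW = X.T * W
    beta_new = np.linalg.solve(XtW @ X, XtW @ y)      # weighted least squares update
    if np.max(np.abs(beta_new - beta)) < 1e-8:        # stop when updates stabilise
        break
    beta = beta_new
print(beta)
```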
Abstract:
Interim analysis is important in a large clinical trial for ethical and cost considerations. Sometimes, an interim analysis needs to be performed at an earlier than planned time point. In that case, methods using stochastic curtailment are useful in examining the data for early stopping while controlling the inflation of type I and type II errors. We consider a three-arm randomized study of treatments to reduce perioperative blood loss following major surgery. Owing to slow accrual, an unplanned interim analysis was required by the study team to determine whether the study should be continued. We distinguish two different cases: when all treatments are under direct comparison and when one of the treatments is a control. We used simulations to study the operating characteristics of five different stochastic curtailment methods. We also considered the influence of the timing of the interim analyses on the type I error and power of the test. We found that type I error and power can differ considerably between methods. The analysis for the perioperative blood loss trial was carried out at approximately a quarter of the planned sample size. We found little evidence that the active treatments are better than placebo and recommended closure of the trial.
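For illustration only (not one of the five methods compared in the paper), a common stochastic-curtailment quantity is the conditional power at information fraction t; a minimal sketch under the current-trend assumption:

```python
# Minimal sketch: conditional power at an interim analysis, Brownian-motion
# formulation. z_t is the observed interim z-statistic, t the information
# fraction (e.g. ~0.25 of the planned sample size), alpha the one-sided level.
from scipy.stats import norm

def conditional_power(z_t, t, alpha=0.025, drift=None):
    """P(final z exceeds z_{1-alpha} | interim data) under a given drift.
    If drift is None, use the 'current trend' estimate z_t / sqrt(t)."""
    if drift is None:
        drift = z_t / t**0.5
    z_alpha = norm.ppf(1 - alpha)
    b_t = z_t * t**0.5                      # Brownian-motion value at time t
    return 1 - norm.cdf((z_alpha - b_t - drift * (1 - t)) / (1 - t)**0.5)

print(conditional_power(z_t=0.8, t=0.25))   # a low value would support stopping for futility
```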
Abstract:
Purpose: This research investigates whether application of a community-based social marketing principle, namely increasing the visibility of a target behaviour in the community, can change social norms surrounding the behaviour. Design/methodology/approach: A repeated-measures quasi-experimental design was employed to evaluate the Victorian Health Promotion Foundation’s Walk to School 2013 programme, which increases the visibility of walking to and from school through programme participation to promote active transportation for primary school children. The target population for the survey was caregivers of primary school children aged 5-12 years. The final sample size across the three online surveys administered was 102 respondents. Findings: The results suggest that the programme increased caregivers’ perceptions that children in their community walked to and from school and that walking to and from school is socially acceptable. Originality/value: The study contributes to addressing the recent call for research examining the relationship between community-based social marketing principles and programme outcomes. Further, the results provide insight for enhancing the social norms approach, which has traditionally relied on changing social norms exclusively through media campaigns.
Abstract:
Several articles in this journal have studied optimal designs for testing a series of treatments to identify promising ones for further study. These designs formulate testing as an ongoing process that continues until a promising treatment is identified. This formulation is considered more realistic but substantially increases the computational complexity. In this article, we show that these new designs, which control the error rates for a series of treatments, can be reformulated as conventional designs that control the error rates for each individual treatment. This reformulation leads to a more meaningful interpretation of the error rates and hence easier specification of the error rates in practice. The reformulation also allows us to use conventional designs from published tables or standard computer programs to design trials for a series of treatments. We illustrate these points using a study in soft tissue sarcoma.
Abstract:
Traditional comparisons of the capture efficiency of sampling devices have generally looked at the absolute differences between devices. We recommend that the signal-to-noise ratio be used when comparing the capture efficiency of benthic sampling devices. Using the signal-to-noise ratio rather than the absolute difference has several advantages: the variance is taken into account when judging how important a difference is, the hypothesis and minimum detectable difference can be made identical for all taxa, the measure is independent of the units of measurement, and the sample-size calculation is independent of the variance. This technique is illustrated by comparing the capture efficiency of a 0.05 m² van Veen grab and an airlift suction device, using samples taken from Heron and One Tree lagoons, Australia.
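The variance-free sample-size property noted above follows from expressing the minimum detectable difference in standard-deviation units; a standard two-sample approximation (general textbook formula, not taken from the paper) is:

```latex
% Approximate per-group sample size for a two-sided two-sample comparison, with
% the minimum detectable difference delta expressed in units of the common
% standard deviation (a signal-to-noise ratio), so that sigma cancels:
\[
  n \;\approx\; \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}}{\delta^{2}},
  \qquad \delta = \frac{|\mu_1 - \mu_2|}{\sigma}.
\]
```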
Abstract:
This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov chain Monte Carlo (MCMC) sampling techniques, and the related label-switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via the Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test-based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results reflect uncertainty in the final model and report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally lightweight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
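As a loose analogue of the overfitting idea (extra components shrunk toward zero weight by a sparse prior), and emphatically not the Zmix MCMC sampler itself, the following sketch uses scikit-learn's variational Bayesian mixture on hypothetical univariate data:

```python
# Hedged analogue only (not the Zmix algorithm): fit a deliberately overfitted
# univariate Gaussian mixture with a sparse weight prior and count the
# components that retain non-negligible weight.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-3, 1, 300), rng.normal(2, 0.5, 300)]).reshape(-1, 1)

bgm = BayesianGaussianMixture(
    n_components=10,                       # deliberately too many components
    weight_concentration_prior=1e-3,       # small prior pushes extra weights toward zero
    max_iter=500,
).fit(x)

kept = np.sum(bgm.weights_ > 0.01)
print("estimated number of components:", kept)
print("weights:", np.round(bgm.weights_, 3))
```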
Abstract:
Genetic mark–recapture requires efficient methods of uniquely identifying individuals. 'Shadows' (individuals with the same genotype at the selected loci) become more likely with increasing sample size, and bias harvest rate estimates. Finding loci is costly, but better loci reduce analysis costs and improve power. Optimal microsatellite panels minimize shadows, but panel design is a complex optimization process. locuseater and shadowboxer permit power and cost analysis of this process and automate some aspects, by simulating the entire experiment from panel design to harvest rate estimation.
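As a rough illustration of why shadows become more likely as samples grow (not the locuseater or shadowboxer code), the expected number of shadow pairs can be approximated from the multi-locus probability of identity; the allele frequencies below are hypothetical.

```python
# Minimal sketch: multi-locus probability of identity (PI) for unrelated
# individuals, and the expected number of 'shadow' pairs in a sample of n.
import numpy as np
from math import comb

def probability_of_identity(allele_freqs):
    pi = 1.0
    for p in allele_freqs:
        p = np.asarray(p)
        # per-locus PI for unrelated individuals: sum p_i^4 + sum_{i<j} (2 p_i p_j)^2
        homo = np.sum(p**4)
        het = np.sum(np.triu(4 * np.outer(p, p)**2, k=1))
        pi *= homo + het
    return pi

loci = [np.array([0.4, 0.3, 0.2, 0.1])] * 8          # 8 hypothetical microsatellite loci
pi = probability_of_identity(loci)
for n in (100, 1000, 10000):
    print(n, "individuals -> expected shadow pairs ~", comb(n, 2) * pi)
```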
Abstract:
A computationally efficient agglomerative clustering algorithm based on multilevel theory is presented. The data set is divided randomly into a number of partitions. The samples of each partition are clustered separately using a hierarchical agglomerative clustering algorithm to form sub-clusters. These sub-clusters are merged at higher levels to obtain the final classification. The algorithm yields the same classification as standard hierarchical agglomerative clustering when the clusters are well separated. Its advantages are a short run time and a small storage requirement, and the savings in storage space and computation time increase nonlinearly with the sample size.
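A minimal sketch of the multilevel scheme described above, random partitions clustered separately and their sub-cluster centroids merged at a second level, using SciPy's hierarchical clustering rather than the authors' implementation:

```python
# Minimal sketch (not the paper's implementation): multilevel agglomerative
# clustering. Level 1 clusters each random partition; level 2 merges the
# resulting sub-cluster centroids into the final classification.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(c, 0.3, size=(200, 2)) for c in (0, 3, 6)])  # 3 well-separated groups

n_partitions, k_sub, k_final = 4, 6, 3
parts = np.array_split(rng.permutation(len(X)), n_partitions)

centroids, members = [], []
for idx in parts:
    Z = linkage(X[idx], method="average")             # level 1: cluster each partition
    labels = fcluster(Z, t=k_sub, criterion="maxclust")
    for lab in np.unique(labels):
        sub = idx[labels == lab]
        centroids.append(X[sub].mean(axis=0))         # representative of each sub-cluster
        members.append(sub)

Z2 = linkage(np.array(centroids), method="average")   # level 2: merge sub-clusters
top = fcluster(Z2, t=k_final, criterion="maxclust")

final = np.empty(len(X), dtype=int)
for sub, lab in zip(members, top):
    final[sub] = lab                                   # propagate labels to original points
print(np.bincount(final)[1:])                          # cluster sizes
```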
Abstract:
This thesis utilises an evidence-based approach to critically evaluate and summarize effectiveness research on physiotherapy, physiotherapy-related motor-based interventions and orthotic devices in children and adolescents with cerebral palsy (CP). It aims to assess the methodological challenges of the systematic reviews and trials, to evaluate the effectiveness of interventions in current use, and to make suggestions for future trials. Methods: Systematic reviews were searched from computerized bibliographic databases up to August 2007 for physiotherapy and physiotherapy-related interventions, and up to May 2003 for orthotic devices. Two reviewers independently identified, selected, and assessed the quality of the reviews using the Overview Quality Assessment Questionnaire complemented with decision rules. From a sample of 14 randomized controlled trials (RCTs) published between January 1990 and June 2003 we analysed the methods of sampling, recruitment, and comparability of groups; defined the components of a complex intervention; identified outcome measures based on the International Classification of Functioning, Disability and Health (ICF); analysed the clinical interpretation of score changes; and analysed trial reporting using a modified 33-item CONSORT (Consolidated Standards of Reporting Trials) checklist. The effectiveness of physiotherapy and physiotherapy-related interventions in children with diagnosed CP was evaluated in a systematic review of randomised controlled trials that were searched from computerized databases from January 1990 up to February 2007. Two reviewers independently assessed the methodological quality, extracted the data, classified the outcomes using the ICF, and considered the level of evidence according to van Tulder et al. (2003). Results: We identified 21 reviews on physiotherapy and physiotherapy-related interventions and five on orthotic devices. These reviews summarized 23 or 5 randomised controlled trials and 104 or 27 observational studies, respectively. Only six reviews were of high quality. These found some evidence supporting strength training, constraint-induced movement therapy or hippotherapy, and insufficient evidence on comprehensive interventions. Based on the original studies included in the reviews on orthotic devices, we found some short-term effects of lower limb casting on passive range of movement, and of ankle-foot orthoses on equinus walk. Long-term effects of lower limb orthoses have not been studied. Evidence on upper limb casting or orthoses is conflicting. In the sample of 14 RCTs, most trials used simple randomisation, complemented with matching or stratification, but only three specified the concealed allocation. Numerous studies provided sufficient details on the components of a complex intervention, but the overlap of outcome measures across studies was poor and the clinical interpretation of observed score changes was mostly missing. Almost half (48%) of the applicable CONSORT-based items (range 28-32) were reported adequately. Most reporting inadequacies were in outcome measures, sample size determination, details of the sequence generation, allocation concealment and implementation of the randomization, success of assessor blinding, recruitment and follow-up dates, intention-to-treat analysis, precision of the effect size, co-interventions, and adverse events. The systematic review identified 22 trials on eight intervention categories. Four trials were of high quality.
Moderate evidence of effectiveness was established for upper extremity treatments on attained goals, active supination and developmental status, and for constraint-induced therapy on the amount and quality of hand use and new emerging behaviours. Moderate evidence of ineffectiveness was found for strength training's effect on walking speed and stride length. Conflicting evidence was found for strength training's effect on gross motor function. For the other intervention categories the evidence was limited owing to the low methodological quality and statistically non-significant results of the studies. Conclusions: The high-quality reviews provide both supportive and insufficient evidence on some physiotherapy interventions. The poor quality of most reviews calls for caution, although most reviews drew no conclusions on effectiveness because of the poor quality of the primary studies. A considerable number of RCTs of good to fair methodological and reporting quality indicate that informative and well-reported RCTs on complex interventions in children and adolescents with CP are feasible. Nevertheless, methodological improvement is needed in certain areas of trial design and conduct, and trial authors are encouraged to follow the CONSORT criteria. Based on RCTs, we established moderate evidence for some effectiveness of upper extremity training. Owing to limitations in methodological quality and variations in populations, interventions and outcomes, only limited evidence is available on the effectiveness of most physiotherapy interventions to guide clinical practice. Well-designed trials are needed, especially for focused physiotherapy interventions.
Abstract:
Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association, calculated from multiple single-nucleotide polymorphisms across the genome, is used to assess such effects, and 'genomic control' can be applied subsequently to adjust test statistics at individual loci by a genomic inflation factor. Published GWAS have clearly shown that there are many loci underlying genetic variation for a wide range of complex diseases and traits, implying that a substantial proportion of the genome should show inflation of the test statistic. Here, we show by theory, simulation and analysis of data that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected. Its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants. Our predictions are consistent with empirical observations on height in independent samples of ~4000 and ~133,000 individuals.
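For reference, the genomic inflation factor mentioned above is conventionally estimated as the median association chi-square divided by its expectation under the null; a generic sketch (not the authors' code), using hypothetical, mildly inflated test statistics:

```python
# Minimal sketch: estimate the genomic inflation factor (lambda_GC) from
# per-SNP association chi-square statistics (1 degree of freedom).
import numpy as np
from scipy.stats import chi2

def genomic_inflation(chisq_stats):
    # lambda_GC = median observed chi-square / median of chi-square(1) (~0.4549)
    return np.median(chisq_stats) / chi2.ppf(0.5, df=1)

rng = np.random.default_rng(4)
stats = rng.chisquare(df=1, size=100_000) * 1.05   # hypothetical statistics with 5% inflation
print(round(genomic_inflation(stats), 3))          # values > 1 indicate inflation
```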
Abstract:
Migraine is a common neurological disorder with a genetically complex background. This paper describes a meta-analysis of genome-wide association (GWA) studies on migraine, performed by the Dutch-Icelandic migraine genetics (DICE) consortium, which brings together six population-based European migraine cohorts with a total sample size of 10,980 individuals (2446 cases and 8534 controls). A total of 32 SNPs showed marginal evidence for association at a P-value < 10^-5. The best result was obtained for SNP rs9908234, which had a P-value of 8.00 × 10^-8. This top SNP is located in the nerve growth factor receptor (NGFR) gene. However, this SNP did not replicate in three cohorts from the Netherlands and Australia. Of the other 31 SNPs, 18 were tested in two replication cohorts, but none replicated. In addition, we explored previously identified candidate genes in the meta-analysis data set. This revealed a modest gene-based significant association between migraine and the metadherin (MTDH) gene, previously identified in the first clinic-based GWA study (GWAS) for migraine (Bonferroni-corrected gene-based P-value = 0.026). This finding is consistent with the involvement of the glutamate pathway in migraine. Additional research is necessary to further confirm the involvement of glutamate.
Abstract:
The detection and replication of schizophrenia risk loci can require substantial sample sizes, which has prompted various collaborative efforts for combining multiple samples. However, pooled samples may comprise sub-samples with substantial population genetic differences, including allele frequency differences. We investigated the impact of population differences via linkage reanalysis of Molecular Genetics of Schizophrenia 1 (MGS1) affected sibling-pair data, comprising two samples of distinct ancestral origin: European (EA: 263 pedigrees) and African-American (AA: 146 pedigrees). To exploit the linkage information contained within these distinct continental samples, we performed separate analyses of the individual samples, allowing for within-sample locus heterogeneity, and the pooled sample, allowing for both within-sample and between-sample heterogeneity. Significance levels, corrected for the multiple tests, were determined empirically. For all suggestive peaks, stronger linkage evidence was obtained in either the EA or AA sample than the combined sample, regardless of how heterogeneity was modeled for the latter. Notably, we report genomewide significant linkage of schizophrenia to 8p23.3 and evidence for a second, independent susceptibility locus, reaching suggestive linkage, 29 cM away on 8p21.3. We also detected suggestive linkage on chromosomes 5p13.3 and 7q36.2. Many regions showed pronounced differences in the extent of linkage between the EA and AA samples. This reanalysis highlights the potential impact of population differences upon linkage evidence in pooled data and demonstrates a useful approach for the analysis of samples drawn from distinct continental groups.
Abstract:
As for other complex diseases, linkage analyses of schizophrenia (SZ) have produced evidence for numerous chromosomal regions, with inconsistent results reported across studies. The presence of locus heterogeneity appears likely and may reduce the power of linkage analyses if homogeneity is assumed. In addition, when multiple heterogeneous datasets are pooled, inter-sample variation in the proportion of linked families (alpha) may diminish the power of the pooled sample to detect susceptibility loci, in spite of the larger sample size obtained. We compare the significance of linkage findings obtained using allele-sharing LOD scores (LOD(exp))-which assume homogeneity-and heterogeneity LOD scores (HLOD) in European American and African American NIMH SZ families. We also pool these two samples and evaluate the relative power of the LOD(exp) and two different heterogeneity statistics. One of these (HLOD-P) estimates the heterogeneity parameter alpha only in aggregate data, while the second (HLOD-S) determines alpha separately for each sample. In separate and combined data, we show consistently improved performance of HLOD scores over LOD(exp). Notably, genome-wide significant evidence for linkage is obtained at chromosome 10p in the European American sample using a recessive HLOD score. When the two samples are combined, linkage at the 10p locus also achieves genome-wide significance under HLOD-S, but not HLOD-P. Using HLOD-S, improved evidence for linkage was also obtained for a previously reported region on chromosome 15q. In linkage analyses of complex disease, power may be maximised by routinely modelling locus heterogeneity within individual datasets, even when multiple datasets are combined to form larger samples.
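For reference, the heterogeneity LOD score discussed above is conventionally based on Smith's admixture model, with a proportion α of linked families; in standard notation (not reproduced from the paper):

```latex
% Admixture (heterogeneity) LOD score over families i = 1..F, where alpha is
% the proportion of linked families and L_i the likelihood for family i:
\[
  \mathrm{HLOD}(\theta,\alpha)
  = \sum_{i=1}^{F} \log_{10}
    \!\left[\alpha\,\frac{L_i(\theta)}{L_i(\theta = 1/2)} + (1-\alpha)\right],
\]
% maximised over the recombination fraction theta and alpha. In the abstract's
% terms, HLOD-P estimates a single alpha for the pooled data, whereas HLOD-S
% estimates alpha separately within each sample.
```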