897 results for Sample size
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
Recent studies have shown that the X̄ chart with variable sampling intervals (VSI) and/or variable sample sizes (VSS) detects process shifts faster than the traditional X̄ chart. This article extends these studies to processes that are monitored by both the X̄ and R charts. A Markov chain model is used to determine the properties of the joint X̄ and R charts with variable sample sizes and sampling intervals (VSSI). The VSSI scheme improves the performance of the joint X̄ and R control charts in terms of the speed with which shifts in the process mean and/or variance are detected.
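The Markov-chain calculation behind such VSSI properties can be sketched generically: the expected time to signal from each transient state follows from a standard absorption-time computation. In the sketch below, the two-state layout, transition probabilities, and sampling intervals are hypothetical placeholders, not values from the article.

```python
import numpy as np

# Hypothetical VSSI scheme with two transient (in-control) states:
# state 0 = small sample / long interval, state 1 = large sample / short interval.
# Q[i, j] = assumed probability of moving from state i to state j without a
# signal; the signal itself is the absorbing state of the chain.
Q = np.array([[0.70, 0.25],
              [0.20, 0.65]])
h = np.array([2.0, 0.5])  # sampling interval (hours) attached to each state

# Expected time to absorption (average time to signal, ATS) starting from
# each transient state: solve (I - Q) t = h.
ats = np.linalg.solve(np.eye(2) - Q, h)
print(ats)  # ATS starting from the long- and short-interval states
```

With these placeholder probabilities the solve yields an ATS of 15 hours starting from the relaxed state and 10 hours from the tightened state; the article's actual transition probabilities depend on the control limits and the shift size.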
Abstract:
The 3PL model is a flexible and widely used tool in assessment. However, it suffers from limitations due to its need for large sample sizes. This study introduces and evaluates the efficacy of a new sample size augmentation technique called Duplicate, Erase, and Replace (DupER) Augmentation through a simulation study. Data are augmented using several variations of DupER Augmentation (based on different imputation methodologies, deletion rates, and duplication rates), analyzed in BILOG-MG 3, and the results are compared to those obtained from analyzing the raw data. Additional manipulated variables include test length and sample size. Estimates are compared using seven different evaluative criteria. Results are mixed and inconclusive. DupER-augmented data tend to result in larger root mean squared errors (RMSEs) and lower correlations between estimates and parameters for both item and ability parameters. However, some DupER variations produce estimates that are much less biased than those obtained from the raw data alone. One DupER variation produced better results for low-ability simulees and worse results for those with high abilities. Findings, limitations, and recommendations for future studies are discussed. Specific recommendations for future studies include applying DupER Augmentation (1) to empirical data and (2) with additional IRT models, and (3) analyzing the efficacy of the procedure for different item and ability parameter distributions.
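The duplicate-erase-replace idea can be illustrated with a small sketch. The function below is a hypothetical reading of the procedure as named (duplicate examinee response vectors, erase entries at a deletion rate, replace them by imputation); simple rounded-mean imputation stands in for the study's several imputation methodologies, and all data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

def duper_augment(responses, dup_rate=1.0, erase_rate=0.2):
    """Duplicate a 0/1 item-response matrix, erase a fraction of the
    duplicated entries, replace them by rounded-mean imputation, and
    append the result to the original data. Illustrative sketch only."""
    n, k = responses.shape
    n_dup = int(n * dup_rate)
    dup = responses[rng.choice(n, size=n_dup, replace=False)].astype(float)
    erase = rng.random(dup.shape) < erase_rate   # entries to erase
    dup[erase] = np.nan
    col_means = np.nanmean(dup, axis=0)          # replace: impute 0/1 per item
    for j in range(k):
        dup[np.isnan(dup[:, j]), j] = round(col_means[j])
    return np.vstack([responses, dup.astype(int)])

data = rng.integers(0, 2, size=(100, 10))  # 100 simulees, 10 items (simulated)
augmented = duper_augment(data)
print(augmented.shape)                     # original rows plus duplicated rows
```

The augmented matrix (here twice the original size) would then be calibrated as if it were real response data, which is the step the study performs in BILOG-MG 3.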
Abstract:
Effective population size is an important parameter for the assessment of genetic diversity within a livestock population and its development over time. If pedigree information is not available, linkage disequilibrium (LD) analysis may offer an alternative approach to estimating effective population size. In this study, 128 individuals of the Swiss Eringer breed were genotyped using the Illumina BovineSNP50 BeadChip. We set the bin size at 50 kb for LD analysis, assuming that LD between proximal single nucleotide polymorphism (SNP) pairs reflects distant breeding history while LD between distal SNP pairs reflects recent history. Recombination rates varied among different regions of the genome. The use of physical distances as an approximation of genetic distances (e.g. setting 1 Mb = 0.01 Morgan) led to an upward bias in LD-based estimates of effective population size for generations beyond 50, while estimates for recent history were unaffected. Correction for restricted sample size did not substantially affect these results. The LD-based estimate of actual effective population size was in the range of 87-149, whereas the pedigree-based effective population size was 321 individuals. For conservation purposes, which require knowledge of recent history (<50 generations), the approximation assuming a constant recombination rate seemed adequate.
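A commonly used relation for this kind of LD-based estimation is Sved's approximation E[r²] ≈ 1/(1 + 4Nₑc), with c the recombination distance in Morgans and an optional 1/(2n) correction for restricted sample size. The sketch below applies that textbook relation with illustrative numbers; it is not the estimator code from the study.

```python
def ne_from_ld(r2, c, n=None):
    """Effective population size from mean r^2 between SNP pairs at
    recombination distance c (Morgans), via E[r^2] ~ 1/(1 + 4*Ne*c).
    If a sample size n is given, the usual 1/(2n) sampling correction
    is subtracted from r^2 first."""
    r2_adj = r2 - (1.0 / (2 * n) if n else 0.0)
    return (1.0 / r2_adj - 1.0) / (4.0 * c)

# Illustrative: SNP pairs ~50 kb apart, approximated as 0.0005 Morgan
# (the 1 Mb = 0.01 Morgan rule of thumb), assumed mean r^2 of 0.6, n = 128.
print(round(ne_from_ld(r2=0.6, c=0.0005, n=128)))
```

Because t ≈ 1/(2c) generations in this framework, close SNP pairs probe distant history and distal pairs probe recent history, matching the binning logic the abstract describes.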
Abstract:
OBJECTIVES: To determine sample sizes in studies on diagnostic accuracy and the proportion of studies that report calculations of sample size. DESIGN: Literature survey. DATA SOURCES: All issues of eight leading journals published in 2002. METHODS: Sample sizes, number of subgroup analyses, and how often studies reported calculations of sample size were extracted. RESULTS: 43 of 8999 articles were non-screening studies on diagnostic accuracy. The median sample size was 118 (interquartile range 71-350) and the median prevalence of the target condition was 43% (27-61%). The median number of patients with the target condition--needed to calculate a test's sensitivity--was 49 (28-91). The median number of patients without the target condition--needed to determine a test's specificity--was 76 (27-209). Two of the 43 studies (5%) reported a priori calculations of sample size. Twenty articles (47%) reported results for patient subgroups. The number of subgroups ranged from two to 19 (median four). No studies reported that sample size was calculated on the basis of preplanned analyses of subgroups. CONCLUSION: Few studies on diagnostic accuracy report considerations of sample size. The number of participants in most studies on diagnostic accuracy is probably too small to analyse variability of measures of accuracy across patient subgroups.
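The counts reported above can be put in context with a standard precision-based calculation: the number of diseased subjects needed so that a Wald confidence interval for sensitivity has a chosen half-width, scaled up by prevalence to a total enrolment. This is an illustrative textbook formula, not one used by the survey.

```python
from math import ceil
from statistics import NormalDist

def n_for_sensitivity(sens, half_width, alpha=0.05, prevalence=None):
    """Diseased subjects needed so that a Wald (1 - alpha) CI for
    sensitivity has the given half-width; if a prevalence is supplied,
    scale up to the total number of patients to enrol."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_diseased = ceil(z**2 * sens * (1 - sens) / half_width**2)
    return n_diseased if prevalence is None else ceil(n_diseased / prevalence)

print(n_for_sensitivity(0.9, 0.10))                   # diseased subjects needed
print(n_for_sensitivity(0.9, 0.10, prevalence=0.43))  # total at 43% prevalence
```

Even for a modest target (sensitivity 0.9 estimated to within ±0.10), the required number of diseased subjects is close to the survey's observed median of 49, which illustrates why subgroup analyses in such studies are usually underpowered.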
Abstract:
Power calculations in a small sample comparative study, with a continuous outcome measure, are typically undertaken using the asymptotic distribution of the test statistic. When the sample size is small, this asymptotic result can be a poor approximation. An alternative approach, using a rank based test statistic, is an exact power calculation. When the number of groups is greater than two, the number of calculations required to perform an exact power calculation is prohibitive. To reduce the computational burden, a Monte Carlo resampling procedure is used to approximate the exact power function of a k-sample rank test statistic under the family of Lehmann alternative hypotheses. The motivating example for this approach is the design of animal studies, where the number of animals per group is typically small.
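The resampling idea can be sketched as follows. Because a rank test is distribution-free, the baseline distribution F can be taken uniform on [0, 1] without loss of generality, and a draw from the Lehmann alternative F^γ is obtained as U^(1/γ). The sketch below uses scipy's Kruskal-Wallis test with its asymptotic p-value as a stand-in for the exact rank-statistic evaluation the paper develops; group sizes and γ values are illustrative.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)

def mc_power(gammas, n_per_group, alpha=0.05, n_sim=2000):
    """Monte Carlo estimate of the power of the Kruskal-Wallis k-sample
    rank test under Lehmann alternatives G_j = F**gamma_j. Since the test
    is rank-based, take F uniform WLOG, so a draw is U**(1/gamma)."""
    rejections = 0
    for _ in range(n_sim):
        groups = [rng.random(n_per_group) ** (1.0 / g) for g in gammas]
        rejections += kruskal(*groups).pvalue < alpha
    return rejections / n_sim

# Illustrative animal-study design: 3 groups of 8 animals, one group shifted.
print(mc_power(gammas=(1.0, 1.0, 3.0), n_per_group=8))
```

Setting all γ equal to 1 recovers the null configuration, so the same routine also checks the attained size of the test for the small group sizes typical of animal studies.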
Abstract:
The distribution of the number of heterozygous loci in two randomly chosen gametes or in a random diploid zygote provides information regarding the nonrandom association of alleles among different genetic loci. Two alternative statistics may be employed for detection of nonrandom association of genes of different loci when observations are made on these distributions: the observed variance of the number of heterozygous loci (s²k) and a goodness-of-fit criterion (X²) to contrast the observed distribution with that expected under the hypothesis of random association of genes. It is shown, by simulation, that s²k is statistically more efficient than X² in detecting a given extent of nonrandom association. Asymptotic normality of s²k is justified, and X² is shown to follow a chi-square (χ²) distribution with partial loss of degrees of freedom arising from estimation of parameters from the marginal gene frequency data. Whenever direct evaluations of linkage disequilibrium values are possible, tests based on maximum likelihood estimators of linkage disequilibria require a smaller sample size (number of zygotes or gametes) to detect a given level of nonrandom association than tests conducted on the basis of s²k. Summarization of multilocus genotype (or haplotype) data into classes by number of heterozygous loci thus amounts to an appreciable loss of information.
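Computing s²k from genotype data is straightforward. In the sketch below the loci are simulated as unlinked, so the observed variance should be close to the value expected under random association, Σⱼ hⱼ(1 − hⱼ) with heterozygosities hⱼ = 2pⱼqⱼ; the data are simulated, not from the study.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated genotypes coded 0/1/2 copies of an allele; 1 = heterozygous.
# 200 zygotes at 5 unlinked loci with allele frequency 0.5 (illustrative).
genotypes = rng.binomial(2, 0.5, size=(200, 5))

k = (genotypes == 1).sum(axis=1)   # number of heterozygous loci per zygote
s2k = k.var(ddof=1)                # observed variance s^2_k

# Under random association (no LD) with HWE heterozygosities h_j = 2*p_j*q_j,
# the loci are independent Bernoulli trials and Var(K) = sum_j h_j*(1 - h_j).
p = genotypes.mean(axis=0) / 2
h = 2 * p * (1 - p)
expected = np.sum(h * (1 - h))
print(s2k, expected)   # the two should be close for unlinked loci
```

An s²k value well above the independence expectation is the signal of nonrandom association that the abstract's test is built on.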
Abstract:
Sample size calculations are advocated by the CONSORT group to justify sample sizes in randomized controlled trials (RCTs). The aim of this study was primarily to evaluate the reporting of sample size calculations, to establish the accuracy of these calculations in dental RCTs and to explore potential predictors associated with adequate reporting. Electronic searching was undertaken in eight leading specific and general dental journals. Replication of sample size calculations was undertaken where possible. Assumed variances or odds for control and intervention groups were also compared against those observed. The relationship between parameters including journal type, number of authors, trial design, involvement of methodologist, single-/multi-center study and region and year of publication, and the accuracy of sample size reporting was assessed using univariable and multivariable logistic regression. Of 413 RCTs identified, sufficient information to allow replication of sample size calculations was provided in only 121 studies (29.3%). Recalculations demonstrated an overall median overestimation of sample size of 15.2% after provisions for losses to follow-up. There was evidence that journal, methodologist involvement (OR = 1.97, CI: 1.10, 3.53), multi-center settings (OR = 1.86, CI: 1.01, 3.43) and time since publication (OR = 1.24, CI: 1.12, 1.38) were significant predictors of adequate description of sample size assumptions. Among journals JCP had the highest odds of adequately reporting sufficient data to permit sample size recalculation, followed by AJODO and JDR, with 61% (OR = 0.39, CI: 0.19, 0.80) and 66% (OR = 0.34, CI: 0.15, 0.75) lower odds, respectively. Both assumed variances and odds were found to underestimate the observed values. Presentation of sample size calculations in the dental literature is suboptimal; incorrect assumptions may have a bearing on the power of RCTs.
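The kind of recalculation performed in this review follows the standard two-sample formula for a continuous outcome, n per group = 2(σ/δ)²(z₁₋α/₂ + z₁₋β)², inflated for anticipated losses to follow-up. A minimal sketch with illustrative parameter values:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80, dropout=0.0):
    """Per-group sample size for a two-arm comparison of means
    (normal approximation), inflated for anticipated loss to follow-up."""
    z = NormalDist().inv_cdf
    n = 2 * (sigma / delta) ** 2 * (z(1 - alpha / 2) + z(power)) ** 2
    return ceil(n / (1 - dropout))

print(n_per_group(delta=0.5, sigma=1.0))                # 63 per group
print(n_per_group(delta=0.5, sigma=1.0, dropout=0.10))  # 70 after 10% attrition
```

A calculation is replicable only when a trial reports every input to this formula (δ, σ or the odds, α, power, and the attrition allowance), which is exactly the information most of the 413 RCTs failed to provide.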
Abstract:
This paper examines how the geospatial accuracy of samples and sample size influence conclusions from geospatial analyses. It does so using the example of a study investigating the global phenomenon of large-scale land acquisitions and the socio-ecological characteristics of the areas they target. First, we analysed land deal datasets of varying geospatial accuracy and varying sizes and compared the results in terms of land cover, population density, and two indicators of agricultural potential: yield gap and availability of uncultivated land that is suitable for rainfed agriculture. We found that an increase in geospatial accuracy led to a greater change in conclusions about the land cover types targeted than an increase in sample size did, suggesting that using a sample of higher geospatial accuracy does more to improve results than using a larger sample. The same finding emerged for population density, yield gap, and the availability of uncultivated land suitable for rainfed agriculture. Furthermore, the statistical median proved to be more consistent than the mean when comparing the descriptive statistics for datasets of different geospatial accuracy. Second, we analysed the effects of geospatial accuracy on estimations of the potential for advancing agricultural development in target contexts. Our results show that the target contexts of the majority of land deals in our sample whose geolocation is known with a high level of accuracy contain smaller amounts of suitable but uncultivated land than regional- and national-scale averages suggest. Consequently, the more target contexts vary within a country, the more detailed the spatial scale of analysis has to be in order to draw meaningful conclusions about the phenomena under investigation. We therefore advise against using national-scale statistics to approximate or characterize phenomena that have a local-scale impact, particularly if key indicators vary widely within a country.
Abstract:
The purpose of this research is to develop a new statistical method to determine the minimum set of rows (R) in an R × C contingency table of discrete data that explains the dependence of observations. The statistical power of the method will be empirically determined by computer simulation to judge its efficiency relative to existing methods. The method will be applied to data on DNA fragment length variation at six VNTR loci in over 72 populations from five major human racial groups (total sample size over 15,000 individuals, with each sample having at least 50 individuals). DNA fragment lengths grouped in bins will form the basis for studying inter-population DNA variation within the racial groups and, where such variation is significant, will provide a rigorous re-binning procedure for forensic computation of DNA profile frequencies that takes intra-racial DNA variation among populations into account.
Abstract:
Background: Reliable information on causes of death is a fundamental component of health development strategies, yet globally only about one-third of countries have access to such information. For countries currently without adequate mortality reporting systems there are useful models other than resource-intensive population-wide medical certification. Sample-based mortality surveillance is one such approach. This paper provides methods for addressing appropriate sample size considerations in relation to mortality surveillance, with particular reference to situations in which prior information on mortality is lacking. Methods: The feasibility of model-based approaches for predicting the expected mortality structure and cause composition is demonstrated for populations in which only limited empirical data are available. An algorithmic approach is then provided to derive the minimum person-years of observation needed to generate robust estimates for the rarest cause of interest in three hypothetical populations, each representing a different level of health development. Results: Modelled life expectancies at birth and cause of death structures were within expected ranges based on published estimates for countries at comparable levels of health development. Total person-years of observation required in each population could be more than halved by limiting the set of age, sex, and cause groups regarded as 'of interest'. Discussion: The methods proposed are consistent with the philosophy of establishing priorities across broad clusters of causes for which the public health response implications are similar. The examples provided illustrate the options available when considering the design of mortality surveillance for population health monitoring purposes.
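One simple way to derive such a minimum, consistent with the Poisson nature of death counts, is to require enough expected deaths from the rarest cause of interest that its rate is estimated with a chosen relative precision. The sketch below is an illustrative calculation in that spirit, not the paper's algorithm; the rate and precision target are hypothetical.

```python
from math import ceil

def person_years_needed(rate, rel_error=0.2, z=1.96):
    """Person-years of observation so that a cause-specific mortality rate
    (deaths per person-year) has a 95% CI with relative half-width at most
    rel_error. For Poisson counts the relative half-width is roughly
    z / sqrt(expected deaths), so we need (z / rel_error)^2 deaths."""
    deaths_needed = (z / rel_error) ** 2
    return ceil(deaths_needed / rate)

# Rarest cause of interest at ~5 deaths per 100,000 person-years:
print(person_years_needed(rate=5e-5))
```

Because the requirement scales inversely with the rate of the rarest cause, dropping the rarest age-sex-cause cells from the 'of interest' set shrinks the needed person-years dramatically, which is the halving effect the Results describe.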
Abstract:
Registration of births, recording deaths by age, sex and cause, and calculating mortality levels and differentials are fundamental to evidence-based health policy, monitoring and evaluation. Yet few of the countries with the greatest need for these data have functioning systems to produce them, despite legislation providing for the establishment and maintenance of vital registration. Sample vital registration (SVR), when applied in conjunction with validated verbal autopsy procedures and implemented in a nationally representative sample of population clusters, represents an affordable, cost-effective, and sustainable short- and medium-term solution to this problem. SVR complements other information sources by producing age-, sex-, and cause-specific mortality data that are more complete and continuous than those currently available. The tools and methods employed in an SVR system, however, are imperfect and require rigorous validation and continuous quality assurance; sampling strategies for SVR are also still evolving. Nonetheless, interest in establishing SVR is rapidly growing in Africa and Asia. Better systems for reporting and recording data on vital events will be sustainable only if developed hand-in-hand with existing health information strategies at the national and district levels, governance structures, and agendas for social research and development monitoring. If the global community wishes to have mortality measurements 5 or 10 years hence, the foundation stones of SVR must be laid today.
Abstract:
The Vapnik-Chervonenkis (VC) dimension is a combinatorial measure of a certain class of machine learning problems, which may be used to obtain upper and lower bounds on the number of training examples needed to learn to prescribed levels of accuracy. Most of the known bounds apply to the Probably Approximately Correct (PAC) framework, which is the framework within which we work in this paper. For a learning problem with some known VC dimension, much is known about the order of growth of the sample-size requirement of the problem as a function of the PAC parameters. The exact value of the sample-size requirement is, however, less well known, and depends heavily on the particular learning algorithm being used. This is a major obstacle to the practical application of the VC dimension. Hence it is important to know exactly how the sample-size requirement depends on the VC dimension, and with that in mind, we describe a general algorithm for learning problems having VC dimension 1. Its sample-size requirement is minimal (as a function of the PAC parameters), and turns out to be the same for all non-trivial learning problems having VC dimension 1. While the method used cannot be naively generalised to higher VC dimension, it suggests that optimal algorithm-dependent bounds may improve substantially on current upper bounds.
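For context, a standard sufficient sample-size bound in the PAC framework (Blumer, Ehrenfeucht, Haussler and Warmuth) for a class of VC dimension d is m ≥ max((4/ε)log₂(2/δ), (8d/ε)log₂(13/ε)). The point of the abstract is precisely that algorithm-dependent requirements can be far below such generic bounds; a quick evaluation for d = 1 shows how large the generic figure is.

```python
from math import ceil, log

def pac_sample_bound(d, epsilon, delta):
    """Blumer et al.'s sufficient PAC sample size for VC dimension d:
    enough examples to reach error <= epsilon with probability 1 - delta."""
    return ceil(max((4 / epsilon) * log(2 / delta, 2),
                    (8 * d / epsilon) * log(13 / epsilon, 2)))

# VC dimension 1, accuracy 0.1, confidence 0.95:
print(pac_sample_bound(d=1, epsilon=0.1, delta=0.05))
```

For these parameters the generic bound asks for several hundred examples, while lower bounds in the PAC literature are of order (d + log(1/δ))/ε, leaving exactly the gap the paper's VC-dimension-1 algorithm closes.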
Abstract:
The coming out process has been conceptualized as a developmental imperative for those who will eventually accept their same-sex attractions. It is widely accepted that homophobia, heterosexism, and homonegativity are cultural realities that may complicate this developmental process for gay men. The current study views coming out as an extra-developmental life task that is at best a stressful event, and at worst traumatic when coming out results in the rupture of salient relationships with parents, siblings, and/or close friends. To date, the minority stress model (Meyer, 1995, 2003) has been utilized as an organizing framework for empirically examining external stressors and mental health disparities for lesbians, gay men, and bisexual individuals in the United States. The current study builds on this literature by focusing on how gay men make sense of and represent the coming out process in a semi-structured interview, more specifically, by examining the legacy of the coming out process on indicators of wellness. In a two-part process, this study first employs the framework of narrative coherence well articulated in the adult attachment literature to explore both the variation and the implications of the coming out experience for a sample of gay men (n = 60) in romantic relationships (n = 30). In particular, this study employed constructs identified in the adult attachment literature, namely Preoccupied and Dismissing current state of mind, to code a Coming Out Interview (COI). In the present study, current state of mind refers to the degree of coherent discourse produced about coming out experiences as relayed during the COI.
Multilevel analyses tested the extent to which these COI dimensions, as revealed through an analysis of coming out narratives in the COI, were associated with relationship quality, including self-reported satisfaction and observed emotional tone in a standard laboratory interaction task, and with self-reported symptoms of psychopathology. In addition, multilevel analyses assessed Acceptance by primary relationship figures at the time of disclosure, as well as the degree of Outness at the time of the study. Results revealed that participants' narratives on the COI varied with regard to Preoccupied and Dismissing current state of mind, suggesting that the AAI coding system provides a viable organizing framework for extracting meaning from coming out narratives as related to attachment-relevant constructs. Multilevel modeling revealed construct validity of the attachment dimensions assessed via the COI: attachment (i.e., Preoccupied and Dismissing current state of mind) as assessed via the Adult Attachment Interview (AAI) was significantly correlated with the corresponding COI variables. These findings suggest both methodological and conceptual convergence between the two measures. With one exception, however, COI Preoccupied and Dismissing current state of mind did not predict relationship outcomes or self-reported internalizing and externalizing symptoms. Further analyses revealed that the degree to which one is out to others moderated the relationship between COI Preoccupied and internalizing symptoms. Specifically, for those who were less out to others, there was a significant and positive relationship between Preoccupied current state of mind towards coming out and internalizing symptoms. In addition, the degree of perceived acceptance of sexual orientation by salient relationship figures at the time of disclosure emerged as a predictor of mental health. In particular, Acceptance was significantly negatively related to internalizing symptoms.
Overall, the results offer preliminary support for the view that gay men's narratives reflect variation as assessed by attachment dimensions, and they highlight the role of Acceptance by salient relationship figures at the time of disclosure. Still, for the most part, current state of mind towards coming out in this study was not associated with relationship quality or self-reported indicators of mental health. This finding may be a function of low statistical power given the modest sample size. However, the relationship between Preoccupied current state of mind and mental health (i.e., internalizing) appears to depend on the degree of Outness. In addition, the response of primary relationship figures to coming out may be a relevant factor in shaping mental health outcomes for gay men. Limitations and suggestions for future research and clinical intervention are offered.
Abstract:
This study aimed at evaluating whether human papillomavirus (HPV) groups and E6/E7 mRNA of HPV 16, 18, 31, 33, and 45 are prognostic of cervical intraepithelial neoplasia (CIN) 2 outcome in women with a cervical smear showing a low-grade squamous intraepithelial lesion (LSIL). This cohort study included women with biopsy-confirmed CIN 2 who were followed up for 12 months, with cervical smear and colposcopy performed every three months. Women with a negative or low-risk HPV status showed 100% CIN 2 regression. The CIN 2 regression rate at the 12-month follow-up was 69.4% for women with alpha-9 HPV versus 91.7% for other HPV species or HPV-negative status (P < 0.05). For women with HPV 16, the CIN 2 regression rate at the 12-month follow-up was 61.4% versus 89.5% for other HPV types or HPV-negative status (P < 0.05). The CIN 2 regression rate was 68.3% for women who tested positive for HPV E6/E7 mRNA versus 82.0% for those with negative results, but this difference was not statistically significant. Expectant management of women with biopsy-confirmed CIN 2 and previous cytological tests showing LSIL was associated with a very high rate of spontaneous regression. HPV 16 is associated with a higher CIN 2 progression rate than other HPV infections. HPV E6/E7 mRNA is not a prognostic marker of the CIN 2 clinical outcome, although this analysis cannot be considered conclusive. Given the small sample size, this study could be considered a pilot for future larger studies on the role of predictive markers of CIN 2 evolution.