884 results for random sample
Abstract:
Designing an efficient sampling strategy is of crucial importance for habitat suitability modelling. This paper compares four such strategies, namely 'random', 'regular', 'proportional-stratified' and 'equal-stratified', to investigate (1) how they affect prediction accuracy and (2) how sensitive they are to sample size. In order to compare them, a virtual species approach (Ecol. Model. 145 (2001) 111) in a real landscape, based on reliable data, was chosen. The distribution of the virtual species was sampled 300 times using each of the four strategies in four sample sizes. The sampled data were then fed into a GLM to make two types of prediction: (1) habitat suitability and (2) presence/absence. Comparing the predictions to the known distribution of the virtual species allows model accuracy to be assessed. Habitat suitability predictions were assessed by Pearson's correlation coefficient and presence/absence predictions by Cohen's kappa agreement coefficient. The results show the 'regular' and 'equal-stratified' sampling strategies to be the most accurate and most robust. We propose the following characteristics to improve sample design: (1) increase sample size, (2) prefer systematic to random sampling and (3) include environmental information in the design.
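The four strategies and the agreement coefficient named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the flat list of cells, the `strata` mapping and all function names are assumptions made for illustration, and the 'regular' strategy is reduced to evenly spaced picks along a one-dimensional ordering.

```python
import random
from collections import defaultdict

def random_sample(cells, n, rng):
    # 'random': n cells drawn uniformly without replacement
    return rng.sample(cells, n)

def regular_sample(cells, n):
    # 'regular': evenly spaced cells along the (assumed) cell ordering
    step = len(cells) / n
    return [cells[int(i * step)] for i in range(n)]

def stratified_sample(cells, strata, n, rng, equal):
    # Group cells by environmental stratum, then draw a quota from each group.
    groups = defaultdict(list)
    for cell in cells:
        groups[strata[cell]].append(cell)
    picked = []
    for members in groups.values():
        if equal:   # 'equal-stratified': same quota for every stratum
            quota = n // len(groups)
        else:       # 'proportional-stratified': quota follows stratum area
            quota = round(n * len(members) / len(cells))
        picked.extend(rng.sample(members, min(quota, len(members))))
    return picked

def cohen_kappa(observed, predicted):
    # Agreement corrected for chance; undefined when chance agreement is 1.
    n = len(observed)
    p_obs = sum(o == p for o, p in zip(observed, predicted)) / n
    labels = set(observed) | set(predicted)
    p_chance = sum((observed.count(l) / n) * (predicted.count(l) / n)
                   for l in labels)
    return (p_obs - p_chance) / (1 - p_chance)
```

With a rare stratum covering 20% of the cells, the proportional variant draws about 20% of the sample from it, whereas the equal variant draws half, which is one way environmental information can enter the design.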
Abstract:
Recently, several anonymization algorithms have appeared for privacy preservation on graphs. Some of them are based on randomization techniques and on k-anonymity concepts. We can use both of them to obtain an anonymized graph with a given k-anonymity value. In this paper we compare algorithms based on both techniques in order to obtain an anonymized graph with a desired k-anonymity value. We want to analyze the complexity of these methods to generate anonymized graphs and the quality of the resulting graphs.
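The abstract does not say which notion of k-anonymity is used. Assuming the common degree-based one (every degree value is shared by at least k vertices), the k-anonymity value of a graph could be measured as in the sketch below; the function name and the edge-list representation are illustrative choices, not taken from the paper.

```python
from collections import Counter

def degree_anonymity(nodes, edges):
    # Assumption: k is the size of the smallest group of vertices
    # that share the same degree (k-degree anonymity).
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return min(Counter(degree.values()).values())

# A path 1-2-3-4 has degrees 1, 2, 2, 1, so each degree occurs twice.
path_k = degree_anonymity([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
# A star has a unique centre degree, so its centre is re-identifiable.
star_k = degree_anonymity([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
```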
Abstract:
OBJECTIVES: To conduct a national survey on adolescent health and lifestyles in Georgia and thus to set up a database on adolescents. METHODS: A two-stage cluster sample of around 8000-10000 in-school adolescents aged 15-18 years is being reached through a random selection of classes in Georgia. The sample has been stratified by age, region, type of school and language. A self-administered questionnaire of 87 questions has been developed and translated into the four main languages used in Georgia. RESULTS: Up to June 2004, the researchers have reached 511 classes (9306 pupils). In total, 8039 questionnaires have been considered valid. The main concerns encountered for this survey are linked with acceptance of the survey, cross-cultural issues, political and strategic problems as well as inadequate physical environmental support. CONCLUSION: Despite Georgia's unfavourable economic and political situation, it has been possible to run a national survey on the health of adolescents, according to the usual standards used in the field. This survey should allow for (1) the identification of priorities in the field of health care and health promotion and (2) the monitoring of adolescent health in the future.
Abstract:
Analysis of variance is commonly used in morphometry in order to ascertain differences in parameters between several populations. Failure to detect significant differences between populations (type II error) may be due to suboptimal sampling and lead to erroneous conclusions; the concept of statistical power allows one to avoid such failures by means of an adequate sampling. Several examples are given in the morphometry of the nervous system, showing the use of the power of a hierarchical analysis of variance test for the choice of appropriate sample and subsample sizes. In the first case chosen, neuronal densities in the human visual cortex, we find the number of observations to be of little effect. For dendritic spine densities in the visual cortex of mice and humans, the effect is somewhat larger. A substantial effect is shown in our last example, dendritic segmental lengths in monkey lateral geniculate nucleus. It is in the nature of the hierarchical model that sample size is always more important than subsample size. The relative weight to be attributed to subsample size thus depends on the relative magnitude of the between observations variance compared to the between individuals variance.
Abstract:
Cannabis use is highly prevalent among people with schizophrenia, and coupled with impaired cognition, is thought to heighten the risk of illness onset. However, while heavy cannabis use has been associated with cognitive deficits in long-term users, studies among patients with schizophrenia have been contradictory. This article consists of 2 studies. In Study I, a meta-analysis of 10 studies comprising 572 patients with established schizophrenia (with and without comorbid cannabis use) was conducted. Patients with a history of cannabis use were found to have superior neuropsychological functioning. This finding was largely driven by studies that included patients with a lifetime history of cannabis use rather than current or recent use. In Study II, we examined the neuropsychological performance of 85 patients with first-episode psychosis (FEP) and 43 healthy nonusing controls. Relative to controls, FEP patients with a history of cannabis use (FEP + CANN; n = 59) displayed only selective neuropsychological impairments while those without a history (FEP - CANN; n = 26) displayed generalized deficits. When directly compared, FEP + CANN patients performed better on tests of visual memory, working memory, and executive functioning. Patients with early onset cannabis use had less neuropsychological impairment than patients with later onset use. Together, these findings suggest that patients with schizophrenia or FEP with a history of cannabis use have superior neuropsychological functioning compared with nonusing patients. This association between better cognitive performance and cannabis use in schizophrenia may be driven by a subgroup of "neurocognitively less impaired" patients, who only developed psychosis after a relatively early initiation into cannabis use.
Abstract:
BACKGROUND/AIMS: Cannabis use is a growing challenge for public health, calling for adequate instruments to identify problematic consumption patterns. The Cannabis Use Disorders Identification Test (CUDIT) is a 10-item questionnaire used for screening cannabis abuse and dependency. The present study evaluated that screening instrument. METHODS: In a representative population sample of 5,025 Swiss adolescents and young adults, 593 current cannabis users replied to the CUDIT. Internal consistency was examined by means of Cronbach's alpha and confirmatory factor analysis. In addition, the CUDIT was compared to accepted concepts of problematic cannabis use (e.g. using cannabis and driving). ROC analyses were used to test the CUDIT's discriminative ability and to determine an appropriate cut-off. RESULTS: Two items ('injuries' and 'hours being stoned') had loadings below 0.5 on the unidimensional construct and correlated lower than 0.4 with the total CUDIT score. All concepts of problematic cannabis use were related to CUDIT scores. An ideal cut-off between six and eight points was found. CONCLUSIONS: Although the CUDIT seems to be a promising instrument to identify problematic cannabis use, there is a need to revise some of its items.
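Internal consistency as measured by Cronbach's alpha can be computed from a respondents-by-items score table. The sketch below uses the standard formula with sample variances; the function name and the toy responses are illustrative and are not CUDIT data.

```python
from statistics import variance

def cronbach_alpha(responses):
    # responses: one list of item scores per respondent (same length each)
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Items that move together perfectly give alpha close to 1;
# items that are unrelated give alpha close to 0.
consistent = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
unrelated = cronbach_alpha([[1, 1], [1, 2], [2, 1], [2, 2]])
```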
Batch effect confounding leads to strong bias in performance estimates obtained by cross-validation.
Abstract:
BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, is performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.
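The bias described here can be reproduced in a deliberately small simulation. This is a sketch under strong assumptions, not the study's setup: one feature instead of an expression matrix, a nearest-centroid rule swapped in for the classifiers listed above, plain k-fold instead of nested cross-validation, and a hypothetical batch shift of 3 standard deviations that is fully confounded with the group label.

```python
import random
from statistics import fmean

def make_data(n_per_group, shift_group2, rng):
    # No true group signal: group 1 differs only by the batch shift.
    data = [(rng.gauss(0, 1), 0) for _ in range(n_per_group)]
    data += [(rng.gauss(0, 1) + shift_group2, 1) for _ in range(n_per_group)]
    rng.shuffle(data)
    return data

def fit_centroids(train):
    # One mean per class; assumes both classes occur in the training rows.
    return [fmean([x for x, y in train if y == label]) for label in (0, 1)]

def accuracy(centroids, test):
    # Predict the class whose centroid is nearest to x.
    hits = sum(min((0, 1), key=lambda c: abs(x - centroids[c])) == y
               for x, y in test)
    return hits / len(test)

def cross_validated_accuracy(data, folds=5):
    scores = []
    for k in range(folds):
        test = data[k::folds]
        train = [row for i, row in enumerate(data) if i % folds != k]
        scores.append(accuracy(fit_centroids(train), test))
    return fmean(scores)

rng = random.Random(1)
confounded = make_data(200, 3.0, rng)    # batch fully confounded with group
independent = make_data(200, 0.0, rng)   # new data without the batch shift
cv_estimate = cross_validated_accuracy(confounded)
true_accuracy = accuracy(fit_centroids(confounded), independent)
```

In this setting the cross-validated estimate is high because the classifier learns the batch shift, while accuracy on the independent data stays near chance.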
Abstract:
A serological survey of hepatitis B virus (HBV) and hepatitis C virus (HCV) infections was carried out on a random sex- and age-stratified sample of 1006 individuals aged 25-64 years in the Seychelles islands. Anti-HBc and anti-HCV antibodies were detected using commercially available enzyme-linked immunosorbent assays (ELISA), followed by a Western blot assay in the case of a positive result for anti-HCV. The age-adjusted seroprevalence of anti-HBc antibodies was 8.0% (95% CI: 6.5-9.9%) and the percentage prevalence among males/females increased from 7.0/3.1 to 19.1/13.4 in the age groups 25-34 to 55-64 years, respectively. Two men and three women were positive for anti-HCV antibodies, with an age-adjusted seroprevalence of 0.34% (95% CI: 0.1-0.8%). Two out of these five subjects who were positive for anti-HCV also had anti-HBc antibodies. The seroprevalence of anti-HBc was significantly higher in unskilled workers, persons with low education, and heavy drinkers. The age-specific seroprevalence of anti-HBc in this population-based survey, which was conducted in 1994, was approximately three times lower than in a previous patient-based survey carried out in 1979. Although there are methodological differences between the two surveys, it is likely that the substantial decrease in anti-HBc prevalence during the last 15 years may be due to significant socioeconomic development and the systematic screening of blood donors since 1981. Because hepatitis C virus infections are serious and the cost of treatment is high, the fact that the prevalence of anti-HCV antibodies is at present low should not be an argument for not screening blood donors for anti-HCV and eliminating those who are positive.
Abstract:
We have devised a program that allows computation of the power of the F-test, and hence determination of appropriate sample and subsample sizes, in the context of the one-way hierarchical analysis of variance with fixed effects. The power at a fixed alternative is an increasing function of the sample size and of the subsample size. The program makes it easy to obtain the power of the F-test for a range of values of sample and subsample sizes, and therefore the appropriate sizes based on a desired power. The program can be used for the 'ordinary' case of the one-way analysis of variance, as well as for hierarchical analysis of variance with two stages of sampling. Examples are given of the practical use of the program.
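The program itself is not shown in this listing. As a rough stand-in, the dependence of power on sample size n (individuals per group) and subsample size m (observations per individual) can be explored by Monte Carlo simulation. The sketch below assumes a balanced design, in which the hierarchical F-test for groups equals a one-way F-test on individual means, and it estimates the critical value by simulation rather than from the F distribution; all names, effect sizes and variances are hypothetical.

```python
import random
from statistics import fmean

def f_statistic(groups):
    # groups: list of groups, each a list of individual means (balanced)
    a, n = len(groups), len(groups[0])
    grand = fmean([x for g in groups for x in g])
    means = [fmean(g) for g in groups]
    ms_between = n * sum((gm - grand) ** 2 for gm in means) / (a - 1)
    ms_within = sum((x - gm) ** 2
                    for g, gm in zip(groups, means) for x in g) / (a * (n - 1))
    return ms_between / ms_within

def simulate_f(effects, n, m, sd_ind, sd_obs, rng):
    groups = []
    for effect in effects:                         # fixed group effects
        individuals = []
        for _ in range(n):                         # sample: n individuals
            level = effect + rng.gauss(0, sd_ind)
            obs = [level + rng.gauss(0, sd_obs) for _ in range(m)]  # subsample
            individuals.append(fmean(obs))
        groups.append(individuals)
    return f_statistic(groups)

def estimated_power(effects, n, m, sd_ind=1.0, sd_obs=1.0,
                    alpha=0.05, runs=1000, seed=0):
    rng = random.Random(seed)
    null = sorted(simulate_f([0.0] * len(effects), n, m, sd_ind, sd_obs, rng)
                  for _ in range(runs))
    critical = null[min(int((1 - alpha) * runs), runs - 1)]
    hits = sum(simulate_f(effects, n, m, sd_ind, sd_obs, rng) > critical
               for _ in range(runs))
    return hits / runs

power_small = estimated_power([0.0, 1.0], n=4, m=2)
power_large = estimated_power([0.0, 1.0], n=16, m=2)
```

Calling `estimated_power` over a grid of n and m values gives a table from which sizes reaching a desired power can be read off, up to simulation error.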
Abstract:
The Constructive Thinking Inventory (CTI) measures cognitive coping strategies used in everyday problem solving. The main objective of this study was to assess the factorial structure, the internal consistency, the correspondence with the American normative values, and the discriminant validity of the French translation. A community sample of 777 students aged 12 to 26 years, recruited from schools, colleges and universities, answered the 108-item self-report CTI questionnaire during a class period. A sample of 60 male adolescent offenders aged 13 to 18 years, recruited from two institutions for juvenile offenders, answered the CTI during an individual interview. Results show that the French translation of the CTI has the same factorial structure as Epstein's American version in both adolescents and young adults, and that its internal consistency is satisfactory. Differences in Constructive Thinking profiles according to gender and age, and between Swiss and American samples, are discussed. Juvenile offenders differed from community youths on most of the scales, which indicates good discriminant validity of the CTI. In conclusion, the French translation of the CTI appears to preserve the original version's psychometric properties. The present study provides normative values from a community sample of Swiss adolescents and young adults.