958 results for Contingency tables
Abstract:
By using suitable parameters, we present a unified approach for describing four methods for representing categorical data in a contingency table. These methods are correspondence analysis (CA), the alternative approach using Hellinger distance (HD), the log-ratio (LR) alternative, which is appropriate for compositional data, and the so-called non-symmetrical correspondence analysis (NSCA). We then compare these four methods and give some illustrative examples. Some approaches based on cumulative frequencies are also linked and studied using matrices. Key words: Correspondence analysis, Hellinger distance, Non-symmetrical correspondence analysis, log-ratio analysis, Taguchi inertia
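As a small illustration of the Hellinger-distance alternative mentioned in this abstract, the sketch below computes the Hellinger distance between the profiles of two rows of a contingency table. The function name and the example rows are hypothetical, and the normalisation is an assumption: some authors include an extra factor of 1/sqrt(2), which the paper may or may not use.

```python
import math

def hellinger_distance(row_a, row_b):
    """Hellinger distance between the profiles of two rows of counts.

    Each row is first converted to a profile (counts divided by the row
    total). No 1/sqrt(2) factor is applied; that choice is an assumption.
    """
    total_a, total_b = sum(row_a), sum(row_b)
    squared = sum(
        (math.sqrt(x / total_a) - math.sqrt(y / total_b)) ** 2
        for x, y in zip(row_a, row_b)
    )
    return math.sqrt(squared)

# Hypothetical rows: proportional rows share a profile, so the distance is 0.
same_profile = hellinger_distance([1, 3], [2, 6])
# Rows concentrated on different columns are as far apart as possible.
disjoint = hellinger_distance([1, 0], [0, 1])
```

Because the distance depends only on the profiles, multiplying a row by a constant leaves it unchanged.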
Abstract:
A condition needed for testing nested hypotheses from a Bayesian viewpoint is that the prior for the alternative model concentrates mass around the small, or null, model. For testing independence in contingency tables, the intrinsic priors satisfy this requirement. Further, the degree of concentration of the priors is controlled by a discrete parameter m, the training sample size, which plays an important role in the resulting answer regardless of the sample size. In this paper we study the robustness of tests of independence in contingency tables with respect to intrinsic priors with different degrees of concentration around the null, and compare them with other “robust” results by Good and Crook. Consistency of the intrinsic Bayesian tests is established. We also discuss conditioning issues and sampling schemes, and argue that conditioning should be on either one margin or the table total, but not on both margins. Examples using real and simulated data are given.
Abstract:
When the data are counts or the frequencies of particular events and can be expressed as a contingency table, they can be analysed using the chi-square distribution. When applied to a 2 x 2 table, the test is approximate, and care needs to be taken in analysing tables when the expected frequencies are small, either by applying Yates’ correction or by using Fisher’s exact test. Larger contingency tables can also be analysed using this method. Note that it is a serious statistical error to use any of these tests on measurement data!
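The three options named in this abstract for a 2 x 2 table can be sketched as follows, using only the Python standard library. The function names and the example counts are hypothetical; the sketch assumes that all row and column totals are non-zero, and the two-sided Fisher p-value follows the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction: shrink |ad - bc| by n/2, not below 0.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / (r1 * r2 * c1 * c2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    r1, c1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return math.comb(c1, x) * math.comb(n - c1, r1 - x) / math.comb(n, r1)

    p_obs = prob(a)
    lo, hi = max(0, r1 + c1 - n), min(r1, c1)
    # Small tolerance so that ties with the observed table are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical small table, where expected frequencies are low.
plain_stat, plain_p = chi_square_2x2(3, 1, 1, 3)
yates_stat, yates_p = chi_square_2x2(3, 1, 1, 3, yates=True)
exact_p = fisher_exact_2x2(3, 1, 1, 3)
```

With counts this small the uncorrected chi-square p-value is noticeably smaller than the corrected and exact ones, which is the situation the abstract warns about.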
Abstract:
Omnibus tests of significance in contingency tables use statistics of the chi-square type. When the null is rejected, residual analyses are conducted to identify cells in which observed frequencies differ significantly from expected frequencies. Residual analyses are thus conditioned on a significant omnibus test. Conditional approaches have been shown to substantially alter type I error rates in cases involving t tests conditional on the results of a test of equality of variances, or tests of regression coefficients conditional on the results of tests of heteroscedasticity. We show that residual analyses conditional on a significant omnibus test are also affected by this problem, yielding type I error rates that can be up to 6 times larger than nominal rates, depending on the size of the table and the form of the marginal distributions. We explored several unconditional approaches in search of a method that maintains the nominal type I error rate and found that a bootstrap correction for multiple testing achieved this goal. The validity of this approach is documented for two-way contingency tables in the contexts of tests of independence, tests of homogeneity, and fitting psychometric functions. Computer code in MATLAB and R to conduct these analyses is provided as Supplementary Material.
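The cell-level residuals that such analyses examine can be sketched as below. This is a minimal illustration of adjusted standardized residuals under the independence model, not the authors' MATLAB or R code, and it does not implement the bootstrap correction for multiple testing that the abstract recommends. The function name and the example table are hypothetical.

```python
import math

def adjusted_residuals(table):
    """Adjusted standardized residuals for a two-way table of counts.

    Each residual is (observed - expected) / sqrt(variance), where the
    expected count comes from the independence model and the variance is
    expected * (1 - row share) * (1 - column share).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(out_row)
    return residuals

# Hypothetical 2 x 2 table of counts.
res = adjusted_residuals([[10, 20], [20, 10]])
```

In a 2 x 2 table the squared adjusted residual of any cell equals the Pearson chi-square statistic, which makes a convenient sanity check.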
Abstract:
Most statistical methods use hypothesis testing. Analysis of variance, regression, discrete choice models, contingency tables, and other analysis methods commonly used in transportation research share hypothesis testing as the means of making inferences about the population of interest. Despite the fact that hypothesis testing has been a cornerstone of empirical research for many years, various aspects of hypothesis tests are commonly applied incorrectly, misinterpreted, or ignored, by novices and expert researchers alike. At first glance, hypothesis testing appears straightforward: develop the null and alternative hypotheses, compute the test statistic to compare to a standard distribution, estimate the probability of rejecting the null hypothesis, and then make claims about the importance of the finding. This is an oversimplification of the process of hypothesis testing. Hypothesis testing as applied in empirical research is examined here. The reader is assumed to have a basic knowledge of the role of hypothesis testing in various statistical methods. Through the use of an example, the mechanics of hypothesis testing are first reviewed. Then, five precautions surrounding the use and interpretation of hypothesis tests are developed; examples of each are provided to demonstrate how errors are made, and solutions are identified so similar errors can be avoided. Remedies are provided for common errors, and conclusions are drawn on how to use the results of this paper to improve the conduct of empirical research in transportation.
Abstract:
Purpose: Paper-based nutrition screening tools can be challenging to implement in the ambulatory oncology setting. The aim of this study was to determine the validity of the Malnutrition Screening Tool (MST) and a novel, automated nutrition screening system compared to a ‘gold standard’ full nutrition assessment using the Patient-Generated Subjective Global Assessment (PG-SGA). Methods: An observational, cross-sectional study was conducted in an outpatient oncology day treatment unit (ODTU) within an Australian tertiary health service. Eligibility criteria were as follows: ≥18 years, receiving outpatient anticancer treatment and English literate. Patients self-administered the MST. A dietitian assessed nutritional status using the PG-SGA, blinded to the MST score. Automated screening system data were extracted from an electronic oncology prescribing system. This system used weight loss over 3 to 6 weeks prior to the most recent weight record or age-categorised body mass index (BMI) to identify nutritional risk. Sensitivity and specificity against PG-SGA (malnutrition) were calculated using contingency tables and receiver operating characteristic curves. Results: There were a total of 300 oncology outpatients (51.7 % male, 58.6±13.3 years). The area under the curve (AUC) for weight loss alone was 0.69, with a cut-off value of ≥1 % weight loss yielding 63 % sensitivity and 76.7 % specificity. MST (score ≥2) resulted in 70.6 % sensitivity and 69.5 % specificity, AUC 0.77. Conclusions: Both the MST and the automated method fell short of the accepted professional standard for sensitivity (~≥80 %) derived from the PG-SGA. Further investigation into other automated nutrition screening options and the most appropriate parameters available electronically is warranted to support targeted service provision.
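Sensitivity and specificity of a screening tool against a reference standard follow directly from the four cells of a 2 x 2 contingency table. The sketch below shows the arithmetic; the function name is hypothetical and the counts are invented for illustration only (they are not the study's data).

```python
def screening_accuracy(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity and specificity from a 2 x 2 screening table.

    Rows are the screening result (at risk / not at risk) and columns are
    the reference standard (malnourished / well nourished).
    """
    sensitivity = true_pos / (true_pos + false_neg)  # detected among the malnourished
    specificity = true_neg / (true_neg + false_pos)  # cleared among the well nourished
    return sensitivity, specificity

# Invented counts, not taken from the study.
sens, spec = screening_accuracy(true_pos=40, false_pos=30, false_neg=10, true_neg=120)
```

Repeating the calculation for every possible cut-off score and plotting sensitivity against 1 - specificity gives the receiver operating characteristic curve whose area is reported as the AUC.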
Abstract:
Objective: To identify genetic associations with severity of radiographic damage in ankylosing spondylitis (AS). Method: We studied 1537 AS cases of European descent; all fulfilled the modified New York Criteria. Radiographic severity was assessed from digitised lateral radiographs of the cervical and lumbar spine using the modified Stoke Ankylosing Spondylitis Spinal Score (mSASSS). A two-phase genotyping design was used. In phase 1, 498 single nucleotide polymorphisms (SNPs) were genotyped in 688 cases; these were selected to capture >90% of the common haplotypic variation in the exons, exon-intron boundaries, and 5 kb flanking DNA in the 5' and 3' UTR of 74 genes involved in anabolic or catabolic bone pathways. In phase 2, 15 SNPs exhibiting p<0.05 were genotyped in a further cohort of 830 AS cases; results were analysed both separately and in combination with the discovery phase data. Association was tested by contingency tables after separating the samples into 'mild' and 'severe' groups, defined as the bottom and top 40% by mSASSS, adjusted for gender and disease duration. Results: Experiment-wise association was observed with the SNP rs8092336 (combined OR 0.32, p=1.2×10⁻⁵), which lies within RANK (receptor activator of NF-κB), a gene involved in osteoclastogenesis and in the interaction between T cells and dendritic cells. Association was also found with the SNP rs1236913 in PTGS1 (prostaglandin-endoperoxide synthase 1, cyclooxygenase 1), giving an OR of 0.53 (p=2.6×10⁻³). There was no observed association between radiographic severity and HLA-B*27. Conclusions: These findings support roles for bone resorption and prostaglandin pathways in the osteoproliferative changes in AS.
Abstract:
Objectives. It has been shown previously that IL-23R variants are associated with AS. We conducted an extended analysis in the UK population and a meta-analysis with the previously published studies, in order to refine these IL-23R associations with AS. Methods. The UK case-control study included 730 new cases and 1331 healthy controls. In the extended study, the 730 cases were combined with 1088 published cases. Allelic associations were analysed using contingency tables. In the meta-analysis, 3482 cases and 3150 controls from four different published studies and the new UK cases were combined. The DerSimonian-Laird method was used to calculate random-effects pooled odds ratios (ORs). Results. In the UK case-control study with new cases, four of the eight SNPs showed significant associations, whereas in the extended UK study, seven of the eight IL-23R SNPs showed significant associations (P < 0.05) with AS, maximal with rs11209032 (P < 10⁻⁵, OR 1.3), when cases with IBD and/or psoriasis were excluded. The meta-analysis showed significant associations with all eight SNPs; the strongest associations were again seen not only with rs11209032 (P = 4.06 × 10⁻⁹, OR ∼1.2) but also with rs11209026 (P < 10⁻¹⁰, OR ∼0.6). Conclusions. IL-23R polymorphisms are clearly associated with AS, but the primary causal association(s) is(are) still not established. These polymorphisms could contribute to either increased or decreased susceptibility to AS; functional studies will be required for their full evaluation. Additionally, the stronger associations observed with SNPs rs11209026 and rs11465804 upon exclusion of IBD and/or psoriasis cases may represent an independent association with AS. © The Author 2009. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved.
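A random-effects pooled odds ratio of the DerSimonian-Laird kind can be sketched as follows. This is a minimal illustration of the standard moment estimator applied to log odds ratios from 2 x 2 allele-count tables, not the authors' analysis; the function names are hypothetical, the counts are invented, and all cells are assumed to be non-zero.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its Woolf variance for the table [[a, b], [c, d]]."""
    return math.log(a * d / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def dersimonian_laird(effects, variances):
    """Pooled effect, its standard error and tau^2 (between-study variance)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, math.sqrt(1 / sum(re_weights)), tau2

# Invented allele counts per study: (case minor, case major, control minor, control major).
studies = [(20, 80, 10, 90), (35, 65, 20, 80), (15, 85, 12, 88)]
effects, variances = zip(*(log_odds_ratio(*s) for s in studies))
pooled_log_or, se, tau2 = dersimonian_laird(effects, variances)
pooled_or = math.exp(pooled_log_or)
```

When the studies agree closely, tau^2 is estimated as zero and the result coincides with the fixed-effect (inverse-variance) estimate.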
Abstract:
Background. Rheumatoid arthritis (RA) is strongly associated with a series of HLA-DRB1 alleles that encode a conserved sequence of amino acids (70Q/R K/R R A A74) in the DRβ1 chain, known as the shared epitope (SE). However, 30% of patients are negative for DRB1*04 and 15% are SE-negative. Exposure to these alleles as non-inherited maternal antigens (NIMA) might explain this discrepancy. We undertook a family study to investigate the role of NIMA in RA. Methods. One hundred families, including the RA proband and both parents, were recruited. HLA-DRB1 genotyping was performed using an allele-specific polymerase chain reaction by standard methods. The frequencies of NIMA and non-inherited paternal antigens (NIPA) were compared using contingency tables and a two-tailed P test. We then reviewed four previously published studies of NIMA in RA and conducted an analysis of the combined data. Results. We identified 36 families in which the proband was DRB1*04-negative and 13 in which the proband lacked the SE. There was an excess of DRB1*04 and SE NIMA (P=0.05) compared with NIPA. Combined analysis with previous studies showed that 53/231 mothers (23%) versus 25/205 fathers (12%) had a non-inherited DRB1*04 (P=0.003) and 30/99 mothers versus 18/101 fathers had a non-inherited SE allele (P=0.03). Conclusion. A role for HLA NIMA in RA is suggested by these results.
Abstract:
A nonparametric, small-sample-size test for the homogeneity of two psychometric functions against the left- and right-shift alternatives has been developed. The test is designed to determine whether it is safe to amalgamate psychometric functions obtained in different experimental sessions. The sum of the lower and upper p-values of the exact (conditional) Fisher test for several 2 × 2 contingency tables (one for each point of the psychometric function) is employed as the test statistic. The probability distribution of the statistic under the null (homogeneity) hypothesis is evaluated to obtain corresponding p-values. Power functions of the test have been computed by randomly generating samples from Weibull psychometric functions. The test is free of any assumptions about the shape of the psychometric function; it requires only that all observations are statistically independent. © 2011 Psychonomic Society, Inc.
Abstract:
Objectives: We attempted to show how the implementation of the key elements of the World Health Organization Patient Safety Curriculum Guide Multi-professional Edition in an undergraduate curriculum affected the knowledge, skills, and attitudes towards patient safety in a graduate-entry Portuguese medical school. Methods: After receiving formal recognition by the WHO as a Complementary Test Site and approval of the organizational ethics committee, the validated pre-course questionnaires measuring the knowledge, skills, and attitudes to patient safety were administered to the 2nd and 3rd year students pursuing a four-year course (N = 46). The key modules of the curriculum were implemented over the academic year by employing a variety of learning strategies including expert lecturers, small group problem-based teaching sessions, and Simulation Laboratory sessions. The identical questionnaires were then administered and the impact was measured. The Curriculum Guide was evaluated as a health education tool in this context. Results: A significant number of the respondents, 47 % (n = 22), reported having received some form of prior patient safety training. The effect on Patient Safety Knowledge was assessed by using the percentage of correct pre- and post-course answers to construct 2 × 2 contingency tables and by applying Fisher’s exact test (two-tailed). No significant differences were detected at the 0.05 significance level. To assess the effect of the intervention on Patient Safety skills and attitudes, the mean and standard deviation were calculated for the pre- and post-course responses, and independent samples were subjected to the Mann-Whitney test. The attitudinal survey indicated a very high baseline incidence of desirable attitudes and skills toward patient safety. Significant changes were detected (p < 0.05) regarding what should happen if an error is made (p = 0.016), the role of healthcare organizations in error reporting (p = 0.006), and the extent of medical error (p = 0.005). Conclusions: The implementation of selected modules of the WHO Patient Safety Curriculum was associated with a number of positive changes regarding patient safety skills and attitudes, with a baseline incidence of highly desirable patient safety attitudes, but no measurable change in patient safety knowledge, at the University of Algarve Medical School. The significance of these results is discussed along with implications and suggestions for future research.
Abstract:
It has been well documented, within the field of landscape ecology, that terrestrial fragmentation contributes to increased heterogeneity at the landscape level. It has also been observed that elevated areas of edge habitat occur within fragmented landscapes. Spatial and temporal edge effects were investigated in four areas designated as Nature Reserve Zones within Short Hills Provincial Park, near St. Catharines, Ontario. Random sampling along exposed edges was performed on trees and saplings, at 5 and 25 m edge depths, using the point-centred quarter method. Diameter at breast height (dbh) and distance from point measurements were used to establish relative density, dominance, frequency and importance value. One-way analyses of variance were used on dbh measurements of tree species, and chi-square contingency tables were used on size class distributions of sapling species, to determine significant differences between 5 and 25 metres. Qualitative comparisons of importance values were also used to determine differences between 5 and 25 metres as well as between trees and saplings. These statistical and qualitative comparisons suggest that a significant overall spatial edge effect is currently exhibited by fragmented wooded islands within the park. The major species of the park, Acer saccharum, may be exhibiting a temporal edge effect. The heterogeneous nature of the park may be of importance in understanding this area as a complex ecological system. It is possible that the remaining forest tracts of the park have been affected, and continue to be affected, by previous disturbances. Based on these findings, recommendations are made to the Ontario Ministry of Natural Resources concerning the management of Short Hills Provincial Park in accordance with their 1990 proposed Management Plan.
Abstract:
A measure of association is row-size invariant if it is unaffected by the multiplication of all entries in a row of a cross-classification table by the same positive number. It is class-size invariant if it is unaffected by the multiplication of all entries in a class (i.e., a row or a column). We prove that every class-size invariant measure of association assigns to each m x n cross-classification table a number which depends only on the cross-product ratios of its 2 x 2 subtables. We propose a monotonicity axiom requiring that the degree of association should increase after shifting mass from cells of a table where this mass is below its expected value to cells where it is above, provided that total mass in each class remains constant. We prove that no continuous row-size invariant measure of association is monotonic if m ≥ 4. Keywords: association, contingency tables, margin-free measures, size invariance, monotonicity, transfer principle.
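The role that cross-product ratios play here can be illustrated numerically: multiplying a row or a column of a table by a positive constant leaves every cross-product ratio of its 2 x 2 subtables unchanged. The sketch below checks this on an invented table; the helper names are hypothetical, and all cells involved are assumed to be positive.

```python
def cross_product_ratio(table, i, k, j, l):
    """Cross-product ratio of the 2 x 2 subtable on rows i, k and columns j, l."""
    return (table[i][j] * table[k][l]) / (table[i][l] * table[k][j])

def scale_row(table, i, factor):
    """Copy of the table with every entry of row i multiplied by factor."""
    return [[x * factor if r == i else x for x in row] for r, row in enumerate(table)]

def scale_column(table, j, factor):
    """Copy of the table with every entry of column j multiplied by factor."""
    return [[x * factor if c == j else x for c, x in enumerate(row)] for row in table]

# Invented 2 x 3 cross-classification table.
table = [[10, 20, 30], [5, 40, 15]]
original = cross_product_ratio(table, 0, 1, 0, 1)
after_row = cross_product_ratio(scale_row(table, 0, 3), 0, 1, 0, 1)
after_col = cross_product_ratio(scale_column(table, 1, 7), 0, 1, 0, 1)
```

The scaling factor appears once in the numerator and once in the denominator, so it cancels; this is why a measure built only from these ratios is class-size invariant.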
Abstract:
This thesis proposes that a cyberoffender's origin is an explanatory factor in the phenomenon of cybercrime. It has two objectives: first, to describe the profiles of cybercriminals reported in the international media; second, to determine whether these profiles vary according to the cyberoffender's place of origin. A database of 90 cases of cybercriminals identified around the world was created. Fifteen (15) cybercriminals per territory were selected, with the following target regions: North America, Latin America, Australasia, Western Europe, Eurasia, and Africa/Arabian Peninsula. First, descriptive analyses were carried out to draw a portrait of the phenomenon. Second, contingency table analyses were performed among the study variables to see whether relationships existed. Finally, further contingency table analyses were carried out to establish how the parameters differ by origin. The results of these tests show that it is generally young men, aged 25 on average, who are likely to commit computer offences. A few typical profiles emerged from the analyses and can be explained by access to computer equipment, economic inequalities between social classes, and Internet speed, depending on the cyberoffender's origin.
Abstract:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998). Tirant lo Blanc, a chivalry book, is the main work in Catalan literature, and it was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers such as Vargas Llosa and Dámaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490. There has been an intense and long-lasting debate around its authorship, stemming from its first edition, where the introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last quarter of the book is by Galba (?-1490), after the death of Martorell. Some of the authors that support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990). Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author.
By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate that the stylistic boundary lies near chapter 383. Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, this leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations. Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models onto the sequence of chi-square distances between each row profile and the average profile; the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
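The sequence of chi-square distances between each row profile and the average profile, to which the abstract says gamma models are fitted, can be computed as in the sketch below. Only the distance computation is shown, not the change-point models; the function name and the example counts are hypothetical, and every row and column total is assumed to be positive.

```python
import math

def chi_square_distances(table):
    """Chi-square distance from each row profile to the average profile.

    A row profile is the row divided by its total; the average profile is
    the vector of column totals divided by the grand total. Squared
    differences are weighted by the inverse of the average profile.
    """
    grand_total = sum(sum(row) for row in table)
    average = [sum(col) / grand_total for col in zip(*table)]
    distances = []
    for row in table:
        row_total = sum(row)
        profile = [x / row_total for x in row]
        squared = sum((p - a) ** 2 / a for p, a in zip(profile, average))
        distances.append(math.sqrt(squared))
    return distances

# Invented counts: one row per chapter, one column per category
# (for example, short and long words).
chapters = [[10, 10], [10, 10], [30, 10]]
dists = chi_square_distances(chapters)
```

Plotting such distances in chapter order is one way to look for the kind of shift in distribution that a change-point model would then locate.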