Digital Library

854 results for Ordinal data

in Queensland University of Technology - ePrints Archive

Rank regression for analyzing ordinal qualitative data for treatment comparison

Relevance:

80.00%

Abstract:

Ordinal qualitative data are often collected for phenotypical measurements in plant pathology and other biological sciences. Statistical methods such as t tests or analysis of variance are usually used to analyze ordinal data when comparing two or more groups. However, the underlying assumptions, such as normality and homogeneous variances, are often violated for qualitative data. To address this, we investigated an alternative methodology, rank regression, for analyzing ordinal data. Rank-based methods are essentially based on pairwise comparisons and can therefore deal with qualitative data naturally. They require neither a normality assumption nor data transformation. Apart from robustness against outliers and high efficiency, rank regression can also incorporate covariate effects in the same way as ordinary regression. By reanalyzing a data set from a wheat Fusarium crown rot study, we illustrated the use of the rank regression methodology and demonstrated that rank regression models appear to be more appropriate and sensible for analyzing nonnormal data and data with outliers.
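
The abstract's remark that rank-based methods rest on pairwise comparisons can be illustrated with a minimal sketch. This is not the authors' rank regression; it only computes a Mann-Whitney-style pairwise statistic for two groups, and the function name and severity scores are hypothetical.

```python
from itertools import product

def pairwise_superiority(treated, control):
    """Estimate P(treated > control) + 0.5 * P(tie) over all cross-group pairs."""
    wins = ties = 0
    for t, c in product(treated, control):
        if t > c:
            wins += 1
        elif t == c:
            ties += 1
    n_pairs = len(treated) * len(control)
    return (wins + 0.5 * ties) / n_pairs

# Hypothetical severity scores on a 0-4 ordinal scale (not taken from the study).
treated = [1, 2, 2, 3]
control = [2, 3, 3, 4]
print(pairwise_superiority(treated, control))
```

Only the ordering of the scores enters the calculation, so any monotone relabeling of the categories leaves the result unchanged. A value of 0.5 would indicate no tendency for either group to score higher.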

Environments For Healthy Living (EFHL) Griffith birth cohort study : characteristics of sample and profile of antenatal exposures

Relevance:

60.00%

Abstract:

Background: The Environments for Healthy Living (EFHL) study is a repeated-sample, longitudinal birth cohort in South East Queensland, Australia. We describe the sample characteristics and profile of maternal, household, and antenatal exposures. Variation and data stability over recruitment years were examined. Methods: For four months each year from 2006, pregnant women were recruited to EFHL at routine antenatal visits at or after 24 weeks' gestation, from three public maternity hospitals. Participating mothers completed a baseline questionnaire on individual, familial, social and community exposure factors. Perinatal data were extracted from hospital birth records. Descriptive statistics and measures of association were calculated comparing the EFHL birth sample with regional and national reference populations. Data stability of antenatal exposure factors was assessed across five recruitment years (2006–2010 inclusive) using the Gamma statistic for ordinal data and chi-squared for nominal data. Results: Across five recruitment years, 2,879 pregnant women were recruited, which resulted in 2,904 live births with 29 sets of twins. EFHL has a lower representation of early gestational babies, fewer stillbirths and a lower percentage of low birth weight babies when compared to regional data. The majority of women (65%) took a multivitamin supplement during pregnancy, 47% consumed alcohol, and 26% reported having smoked cigarettes. There were no differences in rates of a range of antenatal exposures across five years of recruitment, with the exception of increasing maternal pre-pregnancy weight (p=0.0349), decreasing rates of high maternal distress (p=0.0191) and decreasing alcohol consumption (p<0.0001). Conclusions: The study sample is broadly representative of births in the region and almost all factors showed data stability over time. This study, with repeated sampling of birth cohorts over multiple years, has the potential to make important contributions to population health through evaluating longitudinal follow-up and within-cohort temporal effects.
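
The Gamma statistic for ordinal data mentioned in the Methods is commonly the Goodman-Kruskal gamma, computed from concordant and discordant pairs as (C - D) / (C + D). The sketch below assumes that definition; the function name and the table of counts are hypothetical and are not the study's data.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) for a table whose rows and columns are both ordered."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a later row.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:      # both variables increase together
                        concordant += table[i][j] * table[k][l]
                    elif l < j:    # one increases while the other decreases
                        discordant += table[i][j] * table[k][l]
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when there are no untied pairs")
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical counts: rows = recruitment period (early, late),
# columns = exposure level (low, medium, high).
table = [[30, 20, 10],
         [10, 20, 30]]
print(goodman_kruskal_gamma(table))
```

Gamma ranges from -1 to 1, and values near 0 are consistent with an exposure distribution that does not drift across ordered recruitment periods. Pairs tied on either variable are ignored.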

Combined analysis of categorical and numerical descriptors of Australian groundnut accessions using nonlinear principal component analysis

Relevance:

60.00%

Abstract:

For users of germplasm collections, the purpose of measuring characterization and evaluation descriptors, and subsequently using statistical methodology to summarize the data, is not only to interpret the relationships between the descriptors, but also to characterize the differences and similarities between accessions in relation to their phenotypic variability for each of the measured descriptors. The set of descriptors for the accessions of most germplasm collections consists of both numerical and categorical descriptors. This poses problems for a combined analysis of all descriptors because few statistical techniques deal with mixtures of measurement types. In this article, nonlinear principal component analysis was used to analyze the descriptors of the accessions in the Australian groundnut collection. It was demonstrated that the nonlinear variant of ordinary principal component analysis is an appropriate analytical tool because subspecies and botanical varieties could be identified on the basis of the analysis and characterized in terms of all descriptors. Moreover, outlying accessions could be easily spotted and their characteristics established. The statistical results and their interpretations provide users with a more efficient way to identify accessions of potential relevance for their plant improvement programs and encourage and improve the usefulness and utilization of germplasm collections.

On Ordinal VC-Dimension and Some Notions of Complexity

Relevance:

30.00%

Abstract:

We generalize the classical notion of Vapnik–Chervonenkis (VC) dimension to ordinal VC-dimension, in the context of logical learning paradigms. Logical learning paradigms encompass the numerical learning paradigms commonly studied in Inductive Inference. A logical learning paradigm is defined as a set W of structures over some vocabulary, and a set D of first-order formulas that represent data. The sets of models of ϕ in W, where ϕ varies over D, generate a natural topology W over W. We show that if D is closed under boolean operators, then the notion of ordinal VC-dimension offers a perfect characterization for the problem of predicting the truth of the members of D in a member of W, with an ordinal bound on the number of mistakes. This shows that the notion of VC-dimension has a natural interpretation in Inductive Inference, when cast into a logical setting. We also study the relationships between predictive complexity, selective complexity (a variation on predictive complexity), and mind change complexity. The assumptions that D is closed under boolean operators and that W is compact often play a crucial role in establishing connections between these concepts. We then consider a computable setting with effective versions of the complexity measures, and show that the equivalence between ordinal VC-dimension and predictive complexity fails. More precisely, we prove that the effective ordinal VC-dimension of a paradigm can be defined when all other effective notions of complexity are undefined. On a better note, when W is compact, all effective notions of complexity are defined, though they are not related as in the noncomputable version of the framework.
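
For orientation, the classical (finite) VC dimension that the abstract generalizes can be computed by brute force for a small hypothesis class. The sketch below covers only that classical notion, not the ordinal VC-dimension or the logical setting of the paper; the function names and the example classes are hypothetical.

```python
from itertools import combinations

def is_shattered(points, hypotheses):
    """True if the hypotheses realize all 2^|points| labelings of the points."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

def vc_dimension(domain, hypotheses):
    """Size of the largest subset of a finite domain shattered by a finite class."""
    best = 0
    for size in range(1, len(domain) + 1):
        if any(is_shattered(s, hypotheses) for s in combinations(domain, size)):
            best = size
        else:
            break  # subsets of shattered sets are shattered, so larger sizes fail too
    return best

domain = range(5)
# Threshold classifiers x >= t and closed-interval classifiers a <= x <= b.
thresholds = [lambda x, t=t: x >= t for t in range(6)]
intervals = [lambda x, a=a, b=b: a <= x <= b
             for a in range(5) for b in range(a, 5)]

print(vc_dimension(domain, thresholds), vc_dimension(domain, intervals))
```

Thresholds cannot label a smaller point positive and a larger one negative, and intervals cannot realize the pattern positive, negative, positive on three points, which is what limits the two classes.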

Model Consistency and Data Specification in Property DCF Studies

Relevance:

20.00%

A Conditional Autoregressive Gaussian Process for Irregularly Spaced Multivariate Data with Application to Modelling Large Sets of Binary Data

Relevance:

20.00%

Web Data Mining and Reasoning Model

Relevance:

20.00%

Airborne laser scanning : exploratory data analysis indicates potential variables for classification of individual trees or forest stands according to species

Relevance:

20.00%

Building And Querying E-Catalog Networks Using P2P And Data Summarisation Techniques

Relevance:

20.00%

High body mass index is not a barrier to physical activity: Analysis of international rugby players' anthropometric data

Relevance:

20.00%

Abstract:

Recent data indicate that levels of overweight and obesity are increasing at an alarming rate throughout the world. At a population level (and commonly to assess individual health risk), the prevalence of overweight and obesity is calculated using cut-offs of the Body Mass Index (BMI) derived from height and weight. Similarly, the BMI is also used to classify individuals and to provide a notional indication of potential health risk. It is likely that epidemiologic surveys that are reliant on BMI as a measure of adiposity will overestimate the number of individuals in the overweight (and slightly obese) categories. This tendency to misclassify individuals may be more pronounced in athletic populations or groups in which the proportion of more active individuals is higher. This differential is most pronounced in sports where it is advantageous to have a high BMI (but not necessarily high fatness). To illustrate this point we calculated the BMIs of international professional rugby players from the four teams involved in the semi-finals of the 2003 Rugby Union World Cup. According to the World Health Organisation (WHO) cut-offs for BMI, approximately 65% of the players were classified as overweight and approximately 25% as obese. These findings demonstrate that a high BMI is commonplace (and a potentially desirable attribute for sport performance) in professional rugby players. An unanswered question is what proportion of the wider population, classified as overweight (or obese) according to the BMI, is misclassified according to both fatness and health risk? It is evident that being overweight should not be an obstacle to a physically active lifestyle. Similarly, a reliance on BMI alone may misclassify a number of individuals who might otherwise have been automatically considered fat and/or unfit.
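
The BMI classification described in the abstract can be sketched as follows, assuming the standard adult formula (weight in kilograms divided by height in metres squared) and the commonly cited WHO adult cut-offs of 18.5, 25 and 30. The function names and the example measurements are hypothetical, not data from the study.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2

def who_category(bmi_value):
    """Classify an adult BMI using the assumed cut-offs of 18.5, 25 and 30."""
    if bmi_value >= 30:
        return "obese"
    if bmi_value >= 25:
        return "overweight"
    if bmi_value >= 18.5:
        return "normal"
    return "underweight"

# Hypothetical player measurements (weight in kg, height in m).
for weight, height in [(105, 1.88), (110, 1.85)]:
    value = bmi(weight, height)
    print(f"{weight} kg, {height} m: BMI {value:.1f} ({who_category(value)})")
```

The calculation uses only height and weight, so a heavily muscled athlete and a sedentary person of the same dimensions receive the same category, which is the misclassification risk the abstract raises.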

A Petrov-Galerkin method for a singularly perturbed ordinary differential equation with non-smooth data

Relevance:

20.00%

Abstract:

In this paper, a singularly perturbed ordinary differential equation with non-smooth data is considered. The numerical method is a Petrov-Galerkin finite element method with piecewise-exponential test functions and piecewise-linear trial functions. A special technique is used at the point of discontinuity of the coefficient. The method is shown to be first-order accurate and uniformly convergent with respect to the singular perturbation parameter. Finally, numerical results are presented that agree with the theoretical results.

Multifractal Characterization of Hong Kong Air Quality Data

Relevance:

20.00%

Security Review of Telecommunications Data Services for Rail Communications

Relevance:

20.00%

Missing Data and Interpolation in Dynamic Term Structure Models

Relevance:

20.00%

System Thevenin Impedance Estimation Using Signal Processing on Load Bus Data

Relevance:

20.00%
