156 resultados para binary data
em University of Queensland eSpace - Australia
Resumo:
We compare Bayesian methodology utilizing free-ware BUGS (Bayesian Inference Using Gibbs Sampling) with the traditional structural equation modelling approach based on another free-ware package, Mx. Dichotomous and ordinal (three category) twin data were simulated according to different additive genetic and common environment models for phenotypic variation. Practical issues are discussed in using Gibbs sampling as implemented by BUGS to fit subject-specific Bayesian generalized linear models, where the components of variation may be estimated directly. The simulation study (based on 2000 twin pairs) indicated that there is a consistent advantage in using the Bayesian method to detect a correct model under certain specifications of additive genetics and common environmental effects. For binary data, both methods had difficulty in detecting the correct model when the additive genetic effect was low (between 10 and 20%) or of moderate range (between 20 and 40%). Furthermore, neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large (50%). Power was significantly improved with ordinal data for most scenarios, except for the case of low heritability under a true ACE model. We illustrate and compare both methods using data from 1239 twin pairs over the age of 50 years, who were registered with the Australian National Health and Medical Research Council Twin Registry (ATR) and presented symptoms associated with osteoarthritis occurring in joints of the hand.
Resumo:
Standard factorial designs sometimes may be inadequate for experiments that aim to estimate a generalized linear model, for example, for describing a binary response in terms of several variables. A method is proposed for finding exact designs for such experiments that uses a criterion allowing for uncertainty in the link function, the linear predictor, or the model parameters, together with a design search. Designs are assessed and compared by simulation of the distribution of efficiencies relative to locally optimal designs over a space of possible models. Exact designs are investigated for two applications, and their advantages over factorial and central composite designs are demonstrated.
Resumo:
Pharmacodynamics (PD) is the study of the biochemical and physiological effects of drugs. The construction of optimal designs for dose-ranging trials with multiple periods is considered in this paper, where the outcome of the trial (the effect of the drug) is considered to be a binary response: the success or failure of a drug to bring about a particular change in the subject after a given amount of time. The carryover effect of each dose from one period to the next is assumed to be proportional to the direct effect. It is shown for a logistic regression model that the efficiency of optimal parallel (single-period) or crossover (two-period) design is substantially greater than a balanced design. The optimal designs are also shown to be robust to misspecification of the value of the parameters. Finally, the parallel and crossover designs are combined to provide the experimenter with greater flexibility.
Resumo:
Genetic research on risk of alcohol, tobacco or drug dependence must make allowance for the partial overlap of risk-factors for initiation of use, and risk-factors for dependence or other outcomes in users. Except in the extreme cases where genetic and environmental risk-factors for initiation and dependence overlap completely or are uncorrelated, there is no consensus about how best to estimate the magnitude of genetic or environmental correlations between Initiation and Dependence in twin and family data. We explore by computer simulation the biases to estimates of genetic and environmental parameters caused by model misspecification when Initiation can only be defined as a binary variable. For plausible simulated parameter values, the two-stage genetic models that we consider yield estimates of genetic and environmental variances for Dependence that, although biased, are not very discrepant from the true values. However, estimates of genetic (or environmental) correlations between Initiation and Dependence may be seriously biased, and may differ markedly under different two-stage models. Such estimates may have little credibility unless external data favor selection of one particular model. These problems can be avoided if Initiation can be assessed as a multiple-category variable (e.g. never versus early-onset versus later onset user), with at least two categories measurable in users at risk for dependence. Under these conditions, under certain distributional assumptions., recovery of simulated genetic and environmental correlations becomes possible, Illustrative application of the model to Australian twin data on smoking confirmed substantial heritability of smoking persistence (42%) with minimal overlap with genetic influences on initiation.
Resumo:
A heterogeneous modified vacancy solution model of adsorption developed is evaluated. The new model considers the adsorption process through a mass-action law and is thermodynamically consistent, while maintaining the simplicity in calculation of multicomponent adsorption equilibria, as in the original vacancy solution theory. It incorporates the adsorbent heterogeneity through a pore-width-related potential energy, represented by Steele's 10-4-3 potential expression. The experimental data of various hydrocarbons, CO2 and SO2 on four different activated carbons - Ajax, Norit, Nuxit, and BPL - at multiple temperatures over a wide range of pressures were studied by the heterogeneous modified VST model to obtain the isotherm parameters and micropore-size distribution of carbons. The model successfully correlates the single-component adsorption equilibrium data for all compounds studied on various carbons. The fitting results for the vacancy occupancy parameter are consistent with the pressure change on different carbons, and the effect of pore heterogeneity is important in adsorption at elevated pressure. It predicts binary adsorption equilibria better than the IAST scheme, reflecting the significance of molecular size nonideality.
Resumo:
The use of a fully parametric Bayesian method for analysing single patient trials based on the notion of treatment 'preference' is described. This Bayesian hierarchical modelling approach allows for full parameter uncertainty, use of prior information and the modelling of individual and patient sub-group structures. It provides updated probabilistic results for individual patients, and groups of patients with the same medical condition, as they are sequentially enrolled into individualized trials using the same medication alternatives. Two clinically interpretable criteria for determining a patient's response are detailed and illustrated using data from a previously published paper under two different prior information scenarios. Copyright (C) 2005 John Wiley & Sons, Ltd.
Resumo:
Computer modelling promises to. be an important tool for analysing and predicting interactions between trees within mixed species forest plantations. This study explored the use of an individual-based mechanistic model as a predictive tool for designing mixed species plantations of Australian tropical trees. The 'spatially explicit individually based-forest simulator' (SeXI-FS) modelling system was used to describe the spatial interaction of individual tree crowns within a binary mixed-species experiment. The three-dimensional model was developed and verified with field data from three forest tree species grown in tropical Australia. The model predicted the interactions within monocultures and binary mixtures of Flindersia brayleyana, Eucalyptus pellita and Elaeocarpus grandis, accounting for an average of 42% of the growth variation exhibited by species in different treatments. The model requires only structural dimensions and shade tolerance as species parameters. By modelling interactions in existing tree mixtures, the model predicted both increases and reductions in the growth of mixtures (up to +/- 50% of stem volume at 7 years) compared to monocultures. This modelling approach may be useful for designing mixed tree plantations. (c) 2006 Published by Elsevier B.V.
Resumo:
Hannenhalli and Pevzner developed the first polynomial-time algorithm for the combinatorial problem of sorting of signed genomic data. Their algorithm solves the minimum number of reversals required for rearranging a genome to another when gene duplication is nonexisting. In this paper, we show how to extend the Hannenhalli-Pevzner approach to genomes with multigene families. We propose a new heuristic algorithm to compute the reversal distance between two genomes with multigene families via the concept of binary integer programming without removing gene duplicates. The experimental results on simulated and real biological data demonstrate that the proposed algorithm is able to find the reversal distance accurately. ©2005 IEEE
Resumo:
In this paper we present an efficient k-Means clustering algorithm for two dimensional data. The proposed algorithm re-organizes dataset into a form of nested binary tree*. Data items are compared at each node with only two nearest means with respect to each dimension and assigned to the one that has the closer mean. The main intuition of our research is as follows: We build the nested binary tree. Then we scan the data in raster order by in-order traversal of the tree. Lastly we compare data item at each node to the only two nearest means to assign the value to the intendant cluster. In this way we are able to save the computational cost significantly by reducing the number of comparisons with means and also by the least use to Euclidian distance formula. Our results showed that our method can perform clustering operation much faster than the classical ones. © Springer-Verlag Berlin Heidelberg 2005
Resumo:
This document records the process of migrating eprints.org data to a Fez repository. Fez is a Web-based digital repository and workflow management system based on Fedora (http://www.fedora.info/). At the time of migration, the University of Queensland Library was using EPrints 2.2.1 [pepper] for its ePrintsUQ repository. Once we began to develop Fez, we did not upgrade to later versions of eprints.org software since we knew we would be migrating data from ePrintsUQ to the Fez-based UQ eSpace. Since this document records our experiences of migration from an earlier version of eprints.org, anyone seeking to migrate eprints.org data into a Fez repository might encounter some small differences. Moving UQ publication data from an eprints.org repository into a Fez repository (hereafter called UQ eSpace (http://espace.uq.edu.au/) was part of a plan to integrate metadata (and, in some cases, full texts) about all UQ research outputs, including theses, images, multimedia and datasets, in a single repository. This tied in with the plan to identify and capture the research output of a single institution, the main task of the eScholarshipUQ testbed for the Australian Partnership for Sustainable Repositories project (http://www.apsr.edu.au/). The migration could not occur at UQ until the functionality in Fez was at least equal to that of the existing ePrintsUQ repository. Accordingly, as Fez development occurred throughout 2006, a list of eprints.org functionality not currently supported in Fez was created so that programming of such development could be planned for and implemented.
Resumo:
The final-year project for Mechanical & Space Engineering students at UQ often involves the design and flight testing of an experiment. This report describes the design and use of a simple data logger that should be suitable for collecting data from the students' flight experiments. The exercise here was taken as far as the construction of a prototype device that is suitable for ground-based testing, say, the static firing of a hybrid rocket motor.
Resumo:
A combination of deductive reasoning, clustering, and inductive learning is given as an example of a hybrid system for exploratory data analysis. Visualization is replaced by a dialogue with the data.
Resumo:
This paper reports a comparative study of Australian and New Zealand leadership attributes, based on the GLOBE (Global Leadership and Organizational Behavior Effectiveness) program. Responses from 344 Australian managers and 184 New Zealand managers in three industries were analyzed using exploratory and confirmatory factor analysis. Results supported some of the etic leadership dimensions identified in the GLOBE study, but also found some emic dimensions of leadership for each country. An interesting finding of the study was that the New Zealand data fitted the Australian model, but not vice versa, suggesting asymmetric perceptions of leadership in the two countries.
Resumo:
In the context of cancer diagnosis and treatment, we consider the problem of constructing an accurate prediction rule on the basis of a relatively small number of tumor tissue samples of known type containing the expression data on very many (possibly thousands) genes. Recently, results have been presented in the literature suggesting that it is possible to construct a prediction rule from only a few genes such that it has a negligible prediction error rate. However, in these results the test error or the leave-one-out cross-validated error is calculated without allowance for the selection bias. There is no allowance because the rule is either tested on tissue samples that were used in the first instance to select the genes being used in the rule or because the cross-validation of the rule is not external to the selection process; that is, gene selection is not performed in training the rule at each stage of the cross-validation process. We describe how in practice the selection bias can be assessed and corrected for by either performing a cross-validation or applying the bootstrap external to the selection process. We recommend using 10-fold rather than leave-one-out cross-validation, and concerning the bootstrap, we suggest using the so-called. 632+ bootstrap error estimate designed to handle overfitted prediction rules. Using two published data sets, we demonstrate that when correction is made for the selection bias, the cross-validated error is no longer zero for a subset of only a few genes.