71 results for Count data models
in University of Queensland eSpace - Australia
Abstract:
To account for the preponderance of zero counts and the simultaneous correlation of observations, a class of zero-inflated Poisson mixed regression models can be used to accommodate within-cluster dependence. In this paper, a score test for zero-inflation is developed for assessing correlated count data with excess zeros. The sampling distribution and the power of the test statistic are evaluated by simulation studies. The results show that the test statistic performs satisfactorily under a wide range of conditions. The test procedure is further illustrated using a data set on recurrent urinary tract infections. Copyright (c) 2005 John Wiley & Sons, Ltd.
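For context, the zero-inflated Poisson distribution that underlies such a score test is conventionally written as follows; this is a standard textbook formulation with illustrative notation, not an excerpt from the paper:

  P(Y_{ij} = 0) = \pi_{ij} + (1 - \pi_{ij}) e^{-\lambda_{ij}}
  P(Y_{ij} = y) = (1 - \pi_{ij}) \, \lambda_{ij}^{y} e^{-\lambda_{ij}} / y!, \qquad y = 1, 2, \ldots

The score test evaluates the null hypothesis of no zero-inflation (all \pi_{ij} = 0, i.e. an ordinary Poisson mixed model) against the zero-inflated alternative, with the practical advantage that only the simpler null model needs to be fitted.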
Abstract:
The schema of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. Quickly obtaining the appropriate data increases the likelihood that an organization will make good decisions and respond adeptly to challenges. This research presents and validates a methodology for evaluating, ex ante, the relative desirability of alternative instantiations of a model of data. In contrast to prior research, each instantiation is based on a different formal theory. This research theorizes that the instantiation that yields the lowest weighted average query complexity for a representative sample of information requests is the most desirable instantiation for end-user queries. The theory was validated by an experiment that compared end-user performance using an instantiation of a data structure based on the relational model of data with performance using the corresponding instantiation of the data structure based on the object-relational model of data. Complexity was measured using three different Halstead metrics: program length, difficulty, and effort. For a representative sample of queries, the average complexity using each instantiation was calculated. As theorized, end users querying the instantiation with the lower average complexity made fewer semantic errors, i.e., were more effective at composing queries. (c) 2005 Elsevier B.V. All rights reserved.
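For reference, the three Halstead measures named above are defined from counts of distinct and total operators and operands. The following minimal Python sketch (my own illustration, not part of the study's materials) shows the standard formulas:

import math

def halstead_metrics(n1, n2, N1, N2):
    # n1 = distinct operators, n2 = distinct operands,
    # N1 = total operators,    N2 = total operands.
    vocabulary = n1 + n2
    length = N1 + N2                          # program length N
    volume = length * math.log2(vocabulary)   # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)         # D = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume              # E = D * V
    return {"length": length, "difficulty": difficulty, "effort": effort}

# Hypothetical operator/operand counts for a single query, for illustration:
print(halstead_metrics(n1=9, n2=12, N1=20, N2=24))

In the study, lower average values of these measures over a representative query sample are taken to indicate the more desirable instantiation.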
Abstract:
Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which renders the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression models with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.
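A common way of writing such a multi-level ZIP model, shown here only as an illustrative sketch (the paper's exact parameterisation may differ), is:

  Y_{ij} \sim \mathrm{ZIP}(\pi_{ij}, \lambda_{ij})
  \mathrm{logit}(\pi_{ij}) = \mathbf{x}_{ij}' \boldsymbol{\alpha} + u_i
  \log(\lambda_{ij}) = \mathbf{w}_{ij}' \boldsymbol{\beta} + v_i
  u_i \sim N(0, \sigma_u^2), \quad v_i \sim N(0, \sigma_v^2)

for observation j in cluster i, where the cluster-level random effects u_i and v_i induce the within-cluster dependence; an EM algorithm can then treat the latent zero-state indicators as missing data.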
Abstract:
Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns, with transitional gradients from one vegetation community to another. Arbitrary, though often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including site sampling, variable selection, model selection, model implementation, internal model assessment, assessment of model predictions, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, model validation against an independent data set, and scale assessments of the model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly studied areas. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
Conceptual modelling forms an important part of systems analysis. If this is done incorrectly or incompletely, there can be serious implications for the resultant system, specifically in terms of rework and usability. One approach to improving the conceptual modelling process is to evaluate how well the model represents reality. The emergence of the Bunge-Wand-Weber (BWW) ontological model introduced a platform for classifying and comparing the grammars of conceptual modelling languages. This work applies the BWW theory to a real-world example in the health arena. The General Practice Computing Group (GPCG) data model was developed using the Barker Entity Relationship Modelling technique. We describe an experiment, grounded in ontological theory, which evaluates how well the GPCG data model is understood by domain experts. The results show that, with the exception of the use of entities to represent events, the raw model is better understood by domain experts.
Abstract:
Even when data repositories exhibit near perfect data quality, users may formulate queries that do not correspond to the information requested. Users' poor information retrieval performance may arise either from problems understanding the data models that represent the real-world systems, or from their query skills. This research focuses on users' understanding of the data structures, i.e., their ability to map the information request to the data model. The Bunge-Wand-Weber ontology was used to formulate three sets of hypotheses. Two laboratory experiments (one using a small data model and one using a larger data model) tested the effect of ontological clarity on users' performance when undertaking component, record, and aggregate level tasks. For the hypotheses comparing different representations with equivalent semantics, the results indicate that participants using the parsimonious data model performed better on component level tasks, whereas participants using the ontologically clearer data model performed better on record and aggregate level tasks.
Abstract:
Urbanization and the ability to manage for a sustainable future present numerous challenges for geographers and planners in metropolitan regions. Remotely sensed data are inherently suited to provide information on urban land cover characteristics, and their change over time, at various spatial and temporal scales. Data models for establishing the range of urban land cover types and their biophysical composition (vegetation, soil, and impervious surfaces) are integrated to provide a hierarchical approach to classifying land cover within urban environments. These data also provide an essential component for current simulation models of urban growth patterns, as both calibration and validation data. The first stages of the approach have been applied to examine urban growth between 1988 and 1995 for a rapidly developing area in southeast Queensland, Australia. Landsat Thematic Mapper image data provided accurate (83% adjusted overall accuracy) classification of broad land cover types and their change over time. The combination of commonly available remotely sensed data, image processing methods, and emerging urban growth models highlights an important application for current and next generation moderate spatial resolution image data in studies of urban environments.
Abstract:
Many studies on birds focus on the collection of data through an experimental design suitable for investigation in a classical analysis of variance (ANOVA) framework. Although many findings are confirmed by one or more experts, expert information is rarely used in conjunction with the survey data to enhance the explanatory and predictive power of the model. We explore this neglected aspect of ecological modelling through a study on Australian woodland birds, focusing on the potential impact of different intensities of commercial cattle grazing on bird density in woodland habitat. Using WinBUGS, we examine a number of Bayesian hierarchical random effects models that cater for overdispersion and a high frequency of zeros in the data, and explore the variation between and within different grazing regimes and species. The impact and value of expert information is investigated through the inclusion of priors that reflect the experience of 20 experts in the field of bird responses to disturbance. Results indicate that expert information moderates the survey data, especially in situations where there are little or no data. When experts agreed, credible intervals for predictions were tightened considerably. When experts failed to agree, results were similar to those evaluated in the absence of expert information. Overall, we found that without expert opinion our knowledge was quite weak. The fact that the survey data are, in general, quite consistent with expert opinion shows that we do know something about birds and grazing, and that we could learn much faster if we used this approach more widely in ecology, where data are scarce. Copyright (c) 2005 John Wiley & Sons, Ltd.
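As a purely illustrative sketch of the kind of hierarchical formulation described (hypothetical notation, not the authors' WinBUGS code), a zero-inflated count model with grazing-regime effects and expert-informed priors might take the form:

  Y_{ij} \sim \mathrm{ZIP}(\pi, \lambda_{ij}), \quad \log(\lambda_{ij}) = \beta_{g(i)} + b_i
  b_i \sim N(0, \sigma_b^2), \quad \beta_g \sim N(\mu_g^{\mathrm{expert}}, \tau_g^2)

where i indexes sites, j repeated counts at a site, and g(i) is the grazing regime of site i. The elicited prior mean \mu_g^{\mathrm{expert}} and variance \tau_g^2 encode the pooled expert opinion: close agreement among experts gives a small \tau_g^2 and hence the markedly tighter credible intervals reported above, while disagreement gives a diffuse prior and results close to the data-only analysis.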
Abstract:
The Edinburgh-Cape Blue Object Survey is a major survey to discover blue stellar objects brighter than B ≈ 18 in the southern sky. It is planned to cover an area of sky of 10 000 deg² with |b| > 30° and δ < 0°. The blue stellar objects are selected by automatic techniques from U and B pairs of UK Schmidt Telescope plates scanned with the COSMOS measuring machine. Follow-up photometry and spectroscopy are being obtained with the SAAO telescopes to classify objects brighter than B = 16.5. This paper describes the survey, the techniques used to extract the blue stellar objects, the photometric methods and accuracy, the spectroscopic classification, and the limits and completeness of the survey.
Abstract:
An important feature of some conceptual modelling grammars is the constructs they provide to allow database designers to show that real-world things may or may not possess a particular attribute or relationship. In the entity-relationship model, for example, the fact that a thing may not possess an attribute can be represented by using a special symbol to indicate that the attribute is optional. Similarly, the fact that a thing may or may not be involved in a relationship can be represented by showing the minimum cardinality of the relationship as zero. Whether these practices should be followed, however, is a contentious issue. An alternative approach is to eliminate optional attributes and relationships from conceptual schema diagrams by using subtypes that have only mandatory attributes and relationships. In this paper, we first present a theory that led us to predict that optional attributes and relationships should be used in conceptual schema diagrams only when users of the diagrams require a surface-level understanding of the domain being represented by the diagrams. When users require a deep-level understanding, however, optional attributes and relationships should not be used because they undermine users' abilities to grasp important domain semantics. We then describe three experiments that we undertook to test our predictions. The results of the experiments support our predictions.
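As a loose illustration of the two design alternatives being compared (a hypothetical example of mine, not drawn from the paper's experimental materials), consider an employee entity where only some employees earn commission:

from dataclasses import dataclass
from typing import Optional

# Alternative 1: a single entity type with an optional attribute.
@dataclass
class Employee:
    name: str
    commission_rate: Optional[float] = None  # absent for non-commissioned staff

# Alternative 2: optionality removed by introducing subtypes whose
# attributes are all mandatory.
@dataclass
class SalariedEmployee:
    name: str

@dataclass
class CommissionedEmployee:
    name: str
    commission_rate: float  # always present for this subtype

Restated in these terms, the paper's prediction is that the first form is adequate when only a surface-level reading is needed, whereas the second makes the domain rule (only commissioned employees carry a rate) explicit and so supports deep-level understanding.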
Abstract:
Refinement in software engineering allows a specification to be developed in stages, with design decisions taken at earlier stages constraining the design at later stages. Refinement in complex data models is difficult due to the lack of a way of defining constraints that can be progressively maintained over increasingly detailed refinements. Category theory provides a way of stating wide-scale constraints. These constraints lead to a set of design guidelines that maintain the wide-scale constraints under increasing detail. Previous methods of refinement are essentially local, and the proposed method does not interfere significantly with these local methods. The result is particularly applicable to semantic web applications, where ontologies provide systems of more or less abstract constraints on systems, which must be implemented and therefore refined by participating systems. With the approach of this paper, the concept of committing to an ontology carries much more force. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Chambers and Quiggin (2000) use state-contingent representations of risky production technologies to establish important theoretical results concerning producer behavior under uncertainty. Unfortunately, perceived problems in the estimation of state-contingent models have limited the usefulness of the approach in policy formulation. We show that fixed and random effects state-contingent production frontiers can be conveniently estimated in a finite mixtures framework. An empirical example is provided. Compared to conventional estimation approaches, we find that estimating production frontiers in a state-contingent framework produces significantly different estimates of elasticities, firm technical efficiencies, and other quantities of economic interest.
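An illustrative way of writing the finite-mixture, state-contingent frontier idea (my own sketch of a generic formulation; the authors' specification may differ) is:

  y_i = \mathbf{x}_i' \boldsymbol{\beta}_s + v_{is} - u_{is}, \quad v_{is} \sim N(0, \sigma_v^2), \quad u_{is} \ge 0
  f(y_i \mid \mathbf{x}_i) = \sum_{s=1}^{S} \pi_s \, f_s(y_i \mid \mathbf{x}_i)

where s indexes the unobserved state of nature, \pi_s is the probability that state s obtains, and u_{is} is a one-sided inefficiency term; fixed or random firm effects enter the frontier in the usual way, and the mixture weights allow estimation without observing which state was realised.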
Abstract:
Background: The 2001 Australian census revealed that adults aged 65 years and over constituted 12.6% of the population, up from 12.1% in 1996. It is projected that this figure will rise to 21%, or 5.1 million Australians, by 2031. In 1998, 6% (134 000) of adults in Australia aged 65 years and over were residing in nursing homes or hostels, and this number is also expected to rise. As skin ages, there is a decreased turnover and replacement of epidermal skin cells, a thinning subcutaneous fat layer and a reduced production of protective oils. These changes can affect the normal functions of the skin, such as its role as a barrier to irritants and pathogens and in temperature and water regulation. Generally, placement in a long-term care facility indicates an inability of the older person to perform all of the activities of daily living, such as skin care. Therefore, skin care management protocols should be available to reduce the likelihood of skin irritation and breakdown and ultimately promote the comfort of the older person.

Objectives: The objective of this review was to determine the best available evidence for the effectiveness and safety of topical skin care regimens for older adults residing in long-term aged care facilities. The primary outcome was the incidence of adverse skin conditions, with patient satisfaction considered as a secondary outcome.

Search strategy: A literature search was performed using the following databases: PubMed (NLM) (1966–4/2003), Embase (1966–4/2003), CINAHL (1966–4/2003), Current Contents (1993–4/2003), Cochrane Library (1966–2/2003), Web of Science (1995–12/2002), Science Citation Index Expanded and ProceedingsFirst (1993–12/2002). Health Technology Assessment websites were also searched. No language restrictions were applied.

Selection criteria: Systematic reviews of randomised controlled trials, and randomised and non-randomised controlled trials evaluating any non-medical intervention or program that aimed to maintain or improve the integrity of skin in older adults were considered for inclusion. Participants were 65 years of age or over and residing in an aged care facility, hospital or long-term care in the community. Studies were excluded if they evaluated pressure-relieving techniques for the prevention of skin breakdown.

Data collection and analysis: Two independent reviewers assessed study eligibility for inclusion. Study design and quality were tabulated, and relative risks, odds ratios, mean differences and associated 95% confidence intervals were calculated from individual comparative studies containing count data.

Results: The resulting evidence of the effectiveness of topical skin care interventions was variable and dependent upon the skin condition outcome being assessed. The strongest evidence for maintenance of skin condition in incontinent patients found that disposable bodyworn incontinence protection reduced the odds of deterioration of skin condition compared with non-disposable bodyworns. The best evidence for non-pressure-relieving topical skin care interventions on pressure sore formation found the no-rinse cleanser Clinisan to be more effective than soap and water at maintaining healthy skin (no ulcers) in elderly incontinent patients in long-term care. The quality of studies examining the effectiveness of topical skin care interventions on the incidence of skin tears was very poor and inconclusive. For the prevention of dermatitis, Sudocrem could reduce the redness of skin compared with zinc cream if applied regularly after each pad change, but not the number of lesions. For dry skin, the Bag Bath/Travel Bath no-rinse skin care cleanser was more effective at preventing overall skin dryness, and most specifically flaking and scaling, than the traditional soap and water washing method in residents of a long-term care facility. Information on the safety of topical skin care interventions is lacking. Therefore, because of the lack of evidence, no recommendation on the safety of any intervention included in this review can be made.
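To illustrate the kind of calculation involved in deriving an odds ratio and its 95% confidence interval from the count data of a comparative study, here is a minimal Python sketch using the standard large-sample formulas; the counts shown are hypothetical and not taken from the review:

import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    # 2x2 table of counts:
    #   a = events in intervention group, b = non-events in intervention group
    #   c = events in control group,      d = non-events in control group
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, (lower, upper)

# Hypothetical counts only, for illustration:
print(odds_ratio_ci(12, 88, 25, 75))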
Abstract:
Background: The identification and characterization of genes that influence the risk of common, complex multifactorial disease primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. We have previously introduced a genetic programming optimized neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene combinations associated with disease risk. The goal of this study was to evaluate the power of GPNN for identifying high-order gene-gene interactions. We were also interested in applying GPNN to a real data analysis in Parkinson's disease. Results: We show that GPNN has high power to detect even relatively small genetic effects (2–3% heritability) in simulated data models involving two and three locus interactions. The limits of detection were reached under conditions with very small heritability (