71 resultados para Count data models
Resumo:
Semantic data models provide a map of the components of an information system. The characteristics of these models affect their usefulness for various tasks (e.g., information retrieval). The quality of information retrieval has obvious important consequences, both economic and otherwise. Traditionally, data base designers have produced parsimonious logical data models. In spite of their increased size, ontologically clearer conceptual models have been shown to facilitate better performance for both problem solving and information retrieval tasks in experimental settings. The experiments producing evidence of enhanced performance for ontologically clearer models have, however, used application domains of modest size. Data models in organizational settings are likely to be substantially larger than those used in these experiments. This research used an experiment to investigate whether the benefits of improved information retrieval performance associated with ontologically clearer models are robust as the size of the application domains increase. The experiment used an application domain of approximately twice the size as tested in prior experiments. The results indicate that, relative to the users of the parsimonious implementation, end users of the ontologically clearer implementation made significantly more semantic errors, took significantly more time to compose their queries, and were significantly less confident in the accuracy of their queries.
Resumo:
This paper discusses a multi-layer feedforward (MLF) neural network incident detection model that was developed and evaluated using field data. In contrast to published neural network incident detection models which relied on simulated or limited field data for model development and testing, the model described in this paper was trained and tested on a real-world data set of 100 incidents. The model uses speed, flow and occupancy data measured at dual stations, averaged across all lanes and only from time interval t. The off-line performance of the model is reported under both incident and non-incident conditions. The incident detection performance of the model is reported based on a validation-test data set of 40 incidents that were independent of the 60 incidents used for training. The false alarm rates of the model are evaluated based on non-incident data that were collected from a freeway section which was video-taped for a period of 33 days. A comparative evaluation between the neural network model and the incident detection model in operation on Melbourne's freeways is also presented. The results of the comparative performance evaluation clearly demonstrate the substantial improvement in incident detection performance obtained by the neural network model. The paper also presents additional results that demonstrate how improvements in model performance can be achieved using variable decision thresholds. Finally, the model's fault-tolerance under conditions of corrupt or missing data is investigated and the impact of loop detector failure/malfunction on the performance of the trained model is evaluated and discussed. The results presented in this paper provide a comprehensive evaluation of the developed model and confirm that neural network models can provide fast and reliable incident detection on freeways. (C) 1997 Elsevier Science Ltd. All rights reserved.
Resumo:
The Eysenck Personality Questionnaire-Revised (EPQ-R), the Eysenck Personality Profiler Short Version (EPP-S), and the Big Five Inventory (BFI-V4a) were administered to 135 postgraduate students of business in Pakistan. Whilst Extraversion and Neuroticism scales from the three questionnaires were highly correlated, it was found that Agreeableness was most highly correlated with Psychoticism in the EPQ-R and Conscientiousness was most highly correlated with Psychoticism in the EPP-S. Principal component analyses with varimax rotation were carried out. The analyses generally suggested that the five factor model rather than the three-factor model was more robust and better for interpretation of all the higher order scales of the EPQ-R, EPP-S, and BFI-V4a in the Pakistani data. Results show that the superiority of the five factor solution results from the inclusion of a broader variety of personality scales in the input data, whereas Eysenck's three factor solution seems to be best when a less complete but possibly more important set of variables are input. (C) 2001 Elsevier Science Ltd. All rights reserved.
Resumo:
Genetic research on risk of alcohol, tobacco or drug dependence must make allowance for the partial overlap of risk-factors for initiation of use, and risk-factors for dependence or other outcomes in users. Except in the extreme cases where genetic and environmental risk-factors for initiation and dependence overlap completely or are uncorrelated, there is no consensus about how best to estimate the magnitude of genetic or environmental correlations between Initiation and Dependence in twin and family data. We explore by computer simulation the biases to estimates of genetic and environmental parameters caused by model misspecification when Initiation can only be defined as a binary variable. For plausible simulated parameter values, the two-stage genetic models that we consider yield estimates of genetic and environmental variances for Dependence that, although biased, are not very discrepant from the true values. However, estimates of genetic (or environmental) correlations between Initiation and Dependence may be seriously biased, and may differ markedly under different two-stage models. Such estimates may have little credibility unless external data favor selection of one particular model. These problems can be avoided if Initiation can be assessed as a multiple-category variable (e.g. never versus early-onset versus later onset user), with at least two categories measurable in users at risk for dependence. Under these conditions, under certain distributional assumptions., recovery of simulated genetic and environmental correlations becomes possible, Illustrative application of the model to Australian twin data on smoking confirmed substantial heritability of smoking persistence (42%) with minimal overlap with genetic influences on initiation.
Resumo:
We compare Bayesian methodology utilizing free-ware BUGS (Bayesian Inference Using Gibbs Sampling) with the traditional structural equation modelling approach based on another free-ware package, Mx. Dichotomous and ordinal (three category) twin data were simulated according to different additive genetic and common environment models for phenotypic variation. Practical issues are discussed in using Gibbs sampling as implemented by BUGS to fit subject-specific Bayesian generalized linear models, where the components of variation may be estimated directly. The simulation study (based on 2000 twin pairs) indicated that there is a consistent advantage in using the Bayesian method to detect a correct model under certain specifications of additive genetics and common environmental effects. For binary data, both methods had difficulty in detecting the correct model when the additive genetic effect was low (between 10 and 20%) or of moderate range (between 20 and 40%). Furthermore, neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large (50%). Power was significantly improved with ordinal data for most scenarios, except for the case of low heritability under a true ACE model. We illustrate and compare both methods using data from 1239 twin pairs over the age of 50 years, who were registered with the Australian National Health and Medical Research Council Twin Registry (ATR) and presented symptoms associated with osteoarthritis occurring in joints of the hand.
Resumo:
Remotely sensed data have been used extensively for environmental monitoring and modeling at a number of spatial scales; however, a limited range of satellite imaging systems often. constrained the scales of these analyses. A wider variety of data sets is now available, allowing image data to be selected to match the scale of environmental structure(s) or process(es) being examined. A framework is presented for use by environmental scientists and managers, enabling their spatial data collection needs to be linked to a suitable form of remotely sensed data. A six-step approach is used, combining image spatial analysis and scaling tools, within the context of hierarchy theory. The main steps involved are: (1) identification of information requirements for the monitoring or management problem; (2) development of ideal image dimensions (scene model), (3) exploratory analysis of existing remotely sensed data using scaling techniques, (4) selection and evaluation of suitable remotely sensed data based on the scene model, (5) selection of suitable spatial analytic techniques to meet information requirements, and (6) cost-benefit analysis. Results from a case study show that the framework provided an objective mechanism to identify relevant aspects of the monitoring problem and environmental characteristics for selecting remotely sensed data and analysis techniques.
Resumo:
Traditionally the basal ganglia have been implicated in motor behavior, as they are involved in both the execution of automatic actions and the modification of ongoing actions in novel contexts. Corresponding to cognition, the role of the basal ganglia has not been defined as explicitly. Relative to linguistic processes, contemporary theories of subcortical participation in language have endorsed a role for the globus pallidus internus (GPi) in the control of lexical-semantic operations. However, attempts to empirically validate these postulates have been largely limited to neuropsychological investigations of verbal fluency abilities subsequent to pallidotomy. We evaluated the impact of bilateral posteroventral pallidotomy (BPVP) on language function across a range of general and high-level linguistic abilities, and validated/extended working theories of pallidal participation in language. Comprehensive linguistic profiles were compiled up to 1 month before and 3 months after BPVP in 6 subjects with Parkinson's disease (PD). Commensurate linguistic profiles were also gathered over a 3-month period for a nonsurgical control cohort of 16 subjects with PD and a group of 16 non-neurologically impaired controls (NC). Nonparametric between-groups comparisons were conducted and reliable change indices calculated, relative to baseline/3-month follow-up difference scores. Group-wise statistical comparisons between the three groups failed to reveal significant postoperative changes in language performance. Case-by-case data analysis relative to clinically consequential change indices revealed reliable alterations in performance across several language variables as a consequence of BPVP. These findings lend support to models of subcortical participation in language, which promote a role for the GPi in lexical-semantic manipulation mechanisms. Concomitant improvements and decrements in postoperative performance were interpreted within the context of additive and subtractive postlesional effects. Relative to parkinsonian cohorts, clinically reliable versus statistically significant changes on a case by case basis may provide the most accurate method of characterizing the way in which pathophysiologically divergent basal ganglia linguistic circuits respond to BPVP.
Resumo:
Many images consist of two or more 'phases', where a phase is a collection of homogeneous zones. For example, the phases may represent the presence of different sulphides in an ore sample. Frequently, these phases exhibit very little structure, though all connected components of a given phase may be similar in some sense. As a consequence, random set models are commonly used to model such images. The Boolean model and models derived from the Boolean model are often chosen. An alternative approach to modelling such images is to use the excursion sets of random fields to model each phase. In this paper, the properties of excursion sets will be firstly discussed in terms of modelling binary images. Ways of extending these models to multi-phase images will then be explored. A desirable feature of any model is to be able to fit it to data reasonably well. Different methods for fitting random set models based on excursion sets will be presented and some of the difficulties with these methods will be discussed.
Resumo:
Polytomous Item Response Theory Models provides a unified, comprehensive introduction to the range of polytomous models available within item response theory (IRT). It begins by outlining the primary structural distinction between the two major types of polytomous IRT models. This focuses on the two types of response probability that are unique to polytomous models and their associated response functions, which are modeled differently by the different types of IRT model. It describes, both conceptually and mathematically, the major specific polytomous models, including the Nominal Response Model, the Partial Credit Model, the Rating Scale model, and the Graded Response Model. Important variations, such as the Generalized Partial Credit Model are also described as are less common variations, such as the Rating Scale version of the Graded Response Model. Relationships among the models are also investigated and the operation of measurement information is described for each major model. Practical examples of major models using real data are provided, as is a chapter on choosing an appropriate model. Figures are used throughout to illustrate important elements as they are described.