876 results for Models Of Data
Abstract:
Vintage-based vector autoregressive models of a single macroeconomic variable are shown to be a useful vehicle for obtaining forecasts of different maturities of future and past observations, including estimates of post-revision values. The forecasting performance of models which include information on annual revisions is superior to that of models which only include the first two data releases. However, the empirical results indicate that a model which reflects the seasonal nature of data releases more closely does not offer much improvement over an unrestricted vintage-based model which includes three rounds of annual revisions.
Abstract:
A problem frequently encountered in Data Envelopment Analysis (DEA) is that the total number of inputs and outputs included tends to be too large relative to the sample size. One way to counter this problem is to combine several inputs (or outputs) into (meaningful) aggregate variables, thereby reducing the dimension of the input (or output) vector. A direct effect of input aggregation is to reduce the number of constraints. This, in turn, alters the optimal value of the objective function. In this paper, we show how a statistical test proposed by Banker (1993) may be applied to test the validity of a specific way of aggregating several inputs. An empirical application using data from Indian manufacturing for the year 2002-03 is included as an example of the proposed test.
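For orientation, the link between inputs and constraints can be seen in a standard input-oriented, constant-returns DEA program. The abstract does not say which DEA variant the paper uses, so the formulation below and its symbols (the efficiency score \theta, intensity weights \lambda_j, inputs x_{ij}, outputs y_{rj}, and the evaluated unit indexed 0) are assumptions made for illustration only.

```latex
\begin{aligned}
\min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad
  & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{i0}, && i = 1,\dots,m \quad \text{(one constraint per input)} \\
  & \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0},          && r = 1,\dots,s \quad \text{(one constraint per output)} \\
  & \lambda_j \ge 0,                                     && j = 1,\dots,n
\end{aligned}
```

Under this assumed formulation, replacing k of the inputs by a single aggregate \sum_i w_i x_{ij} with fixed weights w_i \ge 0 replaces k constraints by one that they already imply. The feasible set can therefore only grow, and the optimal \theta either stays the same or falls.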
Abstract:
The schema of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. Quickly obtaining the appropriate data increases the likelihood that an organization will make good decisions and respond adeptly to challenges. This research presents and validates a methodology for evaluating, ex ante, the relative desirability of alternative instantiations of a model of data. In contrast to prior research, each instantiation is based on a different formal theory. This research theorizes that the instantiation that yields the lowest weighted average query complexity for a representative sample of information requests is the most desirable instantiation for end-user queries. The theory was validated by an experiment that compared end-user performance using an instantiation of a data structure based on the relational model of data with performance using the corresponding instantiation of the data structure based on the object-relational model of data. Complexity was measured using three different Halstead metrics: program length, difficulty, and effort. For a representative sample of queries, the average complexity using each instantiation was calculated. As theorized, end users querying the instantiation with the lower average complexity made fewer semantic errors, i.e., were more effective at composing queries. (c) 2005 Elsevier B.V. All rights reserved.
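The three Halstead metrics named above can be computed from operator and operand counts. The sketch below uses the textbook Halstead definitions; how the paper tokenizes queries and how it weights the sample of requests is not stated in the abstract, so the token lists, the weights, and the function names `halstead` and `weighted_average` are hypothetical.

```python
import math

def halstead(operators, operands):
    """Return Halstead length, difficulty, and effort for one query.

    operators and operands are lists of token occurrences (repeats included).
    Assumes at least one operand, otherwise difficulty is undefined.
    """
    n1, n2 = len(set(operators)), len(set(operands))  # distinct tokens
    N1, N2 = len(operators), len(operands)            # total occurrences
    length = N1 + N2                                  # program length N
    volume = length * math.log2(n1 + n2)              # V = N * log2(n1 + n2)
    difficulty = (n1 / 2) * (N2 / n2)                 # D = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume                      # E = D * V
    return {"length": length, "difficulty": difficulty, "effort": effort}

def weighted_average(values, weights):
    """Weighted mean of per-query complexity values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical tokenization of: SELECT name FROM emp WHERE dept = 'X'
metrics = halstead(["SELECT", "FROM", "WHERE", "="],
                   ["name", "emp", "dept", "'X'"])
print(metrics)  # {'length': 8, 'difficulty': 2.0, 'effort': 48.0}
```

Running `halstead` over the same set of information requests written against each schema, then combining the results with `weighted_average`, gives one complexity figure per instantiation to compare.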
Abstract:
With the increasing number of XML documents in varied domains, it has become essential to identify ways of finding interesting information from these documents. Data mining techniques have been used to derive this interesting information. Mining of XML documents is affected by the underlying document model because of the semi-structured nature of these documents. Hence, in this chapter we present an overview of the various models of XML documents, how these models have been used for mining, and some of the issues and challenges in these models. In addition, this chapter provides some insights into future models of XML documents for effectively capturing the two important features of XML documents for mining, namely structure and content.
Abstract:
The motion response of marine structures in waves can be studied using finite-dimensional linear-time-invariant approximating models. These models, obtained using system identification with data computed by hydrodynamic codes, find application in offshore training simulators, hardware-in-the-loop simulators for positioning control testing, and also in initial designs of wave-energy conversion devices. Different proposals have appeared in the literature to address the identification problem in both time and frequency domains, and recent work has highlighted the superiority of the frequency-domain methods. This paper summarises practical frequency-domain estimation algorithms that use constraints on model structure and parameters to refine the search of approximating parametric models. Practical issues associated with the identification are discussed, including the influence of radiation model accuracy in force-to-motion models, which are usually the ultimate modelling objective. The illustration examples in the paper are obtained using a freely available MATLAB toolbox developed by the authors, which implements the estimation algorithms described.
Abstract:
Animal models of critical illness are vital in biomedical research. They provide possibilities for the investigation of pathophysiological processes that may not otherwise be possible in humans. In order to be clinically applicable, the model should simulate the critical care situation realistically, including anaesthesia, monitoring, sampling, utilising appropriate personnel skill mix, and therapeutic interventions. There are limited data documenting the constitution of ideal technologically advanced large animal critical care practices and all the processes of the animal model. In this paper, we describe the procedure of animal preparation, anaesthesia induction and maintenance, physiologic monitoring, data capture, point-of-care technology, and animal aftercare that has been successfully used to study several novel ovine models of critical illness. The relevant investigations are on respiratory failure due to smoke inhalation, transfusion related acute lung injury, endotoxin-induced proteogenomic alterations, haemorrhagic shock, septic shock, brain death, cerebral microcirculation, and artificial heart studies. We have demonstrated the functionality of monitoring practices during anaesthesia required to provide a platform for undertaking systematic investigations in complex ovine models of critical illness.
Abstract:
Understanding the effects of different types and quality of data on bioclimatic modeling predictions is vital to ascertaining the value of existing models, and to improving future models. Bioclimatic models were constructed using the CLIMEX program, using different data types – seasonal dynamics, geographic (overseas) distribution, and a combination of the two – for two biological control agents for the major weed Lantana camara L. in Australia. The models for one agent, Teleonemia scrupulosa Stål (Hemiptera: Tingidae), were based on a higher quality and quantity of data than the models for the other agent, Octotoma scabripennis Guérin-Méneville (Coleoptera: Chrysomelidae). Predictions of the geographic distribution for Australia showed that T. scrupulosa models exhibited greater accuracy with a progressive improvement from seasonal dynamics data, to the model based on overseas distribution, and finally the model combining the two data types. In contrast, O. scabripennis models were of low accuracy, and showed no clear trends across the various model types. These case studies demonstrate the importance of high quality data for developing models, and of supplementing distributional data with species seasonal dynamics data wherever possible. Seasonal dynamics data allows the modeller to focus on the species response to climatic trends, while distributional data enables easier fitting of stress parameters by restricting the species envelope to the described distribution. It is apparent that CLIMEX models based on low quality seasonal dynamics data, together with a small quantity of distributional data, are of minimal value in predicting the spatial extent of species distribution.
Abstract:
A systematic assessment of the submodels of conditional moment closure (CMC) formalism for the autoignition problem is carried out using direct numerical simulation (DNS) data. An initially non-premixed, n-heptane/air system, subjected to a three-dimensional, homogeneous, isotropic, and decaying turbulence, is considered. Two kinetic schemes, (1) a one-step and (2) a reduced four-step reaction mechanism, are considered for chemistry. An alternative formulation is developed for closure of the mean chemical source term
Abstract:
Data mining is an active research area with a wide variety of applications in everyday life. It is concerned with finding interesting hidden patterns in large historical databases. For example, from a sales database one can discover a pattern such as “people who buy magazines tend to buy newspapers as well”. From a sales point of view, the advantage is that these items can be placed together in the shop to increase sales. In this research work, data mining is applied to the domain of placement chance prediction, since making a wise career decision is crucial for anyone. In India, technical manpower analysis is carried out by the National Technical Manpower Information System (NTMIS), established in 1983-84 by India's Ministry of Education & Culture. The NTMIS comprises a lead centre in the IAMR, New Delhi, and 21 nodal centres located in different parts of the country. The Kerala State Nodal Centre is located at Cochin University of Science and Technology. The Nodal Centre collects placement information by regularly sending postal questionnaires to graduates. From the raw data available in the nodal centre, a historical database was prepared. Each record in this database includes entrance rank range, reservation, sector, sex, and a particular engineering branch. For each combination of these attributes in the historical database of student records, the corresponding placement chance is computed and stored in the database. From this data, various popular data mining models are built and tested. These models can be used to predict the most suitable branch for a new student with a given combination of the above criteria. A detailed performance comparison of the various data mining models is also carried out. This research work proposes to use a combination of data mining models, namely a hybrid stacking ensemble, for better predictions. A strategy to predict the overall absorption rate for various branches, as well as the time it takes for all the students of a particular branch to get placed, is also proposed. Finally, this research work puts forward a new data mining algorithm, C 4.5 * stat, for numeric data sets, which has been shown to have competitive accuracy on the standard benchmark data sets known as the UCI data sets. It also proposes an optimization strategy called parameter tuning to improve the standard C 4.5 algorithm. In summary, this research work covers all four dimensions of a typical data mining research project: application to a domain, development of classifier models, optimization, and ensemble methods.
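The step of computing a placement chance for each attribute combination can be sketched as a simple grouped frequency. This is only an illustrative lookup table, not the data mining models or the C 4.5 * stat algorithm described in the abstract; the field names (`rank_range`, `reservation`, `sector`, `sex`, `branch`, `placed`), the sample records, and both function names are hypothetical.

```python
from collections import defaultdict

PROFILE_FIELDS = ("rank_range", "reservation", "sector", "sex")

def placement_chances(records):
    """Map each (profile..., branch) combination to the fraction placed."""
    totals = defaultdict(int)
    placed = defaultdict(int)
    for rec in records:
        key = tuple(rec[f] for f in PROFILE_FIELDS) + (rec["branch"],)
        totals[key] += 1
        if rec["placed"]:
            placed[key] += 1
    return {key: placed[key] / totals[key] for key in totals}

def best_branch(chances, profile):
    """Return the branch with the highest stored chance for a profile."""
    candidates = {key[-1]: p for key, p in chances.items()
                  if key[:-1] == tuple(profile)}
    return max(candidates, key=candidates.get) if candidates else None

# Hypothetical history records for one student profile.
base = {"rank_range": "1-1000", "reservation": "general",
        "sector": "urban", "sex": "F"}
history = [
    {**base, "branch": "CS", "placed": True},
    {**base, "branch": "CS", "placed": True},
    {**base, "branch": "EE", "placed": True},
    {**base, "branch": "EE", "placed": False},
]
chances = placement_chances(history)
print(best_branch(chances, ("1-1000", "general", "urban", "F")))  # CS
```

A table like this only answers for combinations already present in the history, which is one reason to fit classifier models on top of it, as the abstract describes.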