878 results for regression discrete models
Abstract:
Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights that can help in developing powerful models, but they need a modelling framework that lets them use these insights. Data visualisation is an effective technique for presenting data and eliciting feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better, since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective and does not involve the domain experts in guiding a meaningful segmentation of the input space. In this paper we address this issue by allowing domain experts to interactively segment the input space using data visualisation. The resulting segmentation is then used to develop effective local regression models.
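The local-model idea can be sketched in a few lines: partition the input space, then fit an independent regression per segment. The segmentation below is a hand-coded stand-in for the expert-provided, visualisation-driven partition described in the abstract; the data and models are illustrative, not the paper's.

```python
import numpy as np

def fit_local_models(X, y, segment_ids):
    # One OLS model per segment of the input space; segment_ids stands in
    # for an expert-provided partition from the visualisation step.
    models = {}
    for s in np.unique(segment_ids):
        mask = segment_ids == s
        Xs = np.column_stack([np.ones(mask.sum()), X[mask]])  # add intercept
        models[s], *_ = np.linalg.lstsq(Xs, y[mask], rcond=None)
    return models

# Piecewise-linear toy data: slope +2 below x = 0, slope -1 above.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = np.where(x < 0, 2.0 * x, -x) + rng.normal(0.0, 0.01, 200)
seg = (x >= 0).astype(int)                 # the "expert" segmentation
models = fit_local_models(x.reshape(-1, 1), y, seg)
```

Each local model recovers the slope of its own region, which a single global linear model could not do.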
Abstract:
Recently, we have developed the hierarchical Generative Topographic Mapping (HGTM), an interactive method for visualization of large high-dimensional real-valued data sets. In this paper, we propose a more general visualization system by extending HGTM in three ways, allowing the user to visualize a wider range of data sets and better supporting the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM). This enables us to visualize data of an inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode, the user selects "regions of interest," whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool for improving our understanding of the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets. © 2005 IEEE.
Abstract:
2000 Mathematics Subject Classification: 62G08, 62P30.
Abstract:
As more of the economy moves from traditional manufacturing to the service sector, the nature of work is becoming less tangible and, thus, the representation of human behaviour in models is becoming more important. Representing human behaviour and decision making in models is challenging, both in terms of capturing the essence of the processes and in terms of the way that those behaviours and decisions are, or can be, represented in the models themselves. In order to advance understanding in this area, a useful first step is to evaluate and start to classify the various types of behaviour and decision making that need to be modelled. This talk will attempt to set out and provide an initial classification of the different types of behaviour and decision making that a modeller might want to represent in a model. It will then be useful to start to assess the main methods of simulation in terms of their capability to represent these various aspects. The three main simulation methods (System Dynamics, Agent Based Modelling and Discrete Event Simulation) all achieve this to varying degrees, and there is some evidence that all three can, within limits, represent the key aspects of the system being modelled. The three simulation approaches are then assessed for their suitability in modelling these various aspects. Illustrations of behavioural modelling will be provided from cases in supply chain management, evacuation modelling and rail disruption.
Abstract:
2000 Mathematics Subject Classification: 62J12, 62K15, 91B42, 62H99.
Abstract:
In non-linear random effects models, some attention has very recently been devoted to the analysis of suitable transformations of the response variables, either separately from (Taylor 1996) or together with (Oberg and Davidian 2000) transformations of the covariates, and, as far as we know, no investigation has been carried out on the choice of link function in such models. In our study we consider the use of a random effects model when a parameterized family of links (Aranda-Ordaz 1981, Prentice 1996, Pregibon 1980, Stukel 1988 and Czado 1997) is introduced. We point out the advantages and the drawbacks associated with this data-driven kind of modelling. Difficulties in the interpretation of regression parameters, and therefore in understanding the influence of covariates, as well as problems related to loss of efficiency of estimates and overfitting, are discussed. A case study on radiotherapy usage in breast cancer treatment is presented.
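To make the idea of a parameterized link family concrete, the Aranda-Ordaz (1981) asymmetric family can be written in a few lines. This is a sketch of the link family only, not of the random-effects model studied in the abstract.

```python
import math

def aranda_ordaz(p, lam):
    # Aranda-Ordaz (1981) asymmetric family of links for a probability p:
    #   eta = log(((1 - p)**(-lam) - 1) / lam)
    # lam = 1 recovers the logit link; lam -> 0 approaches the
    # complementary log-log link, so lam indexes a continuum of links
    # that can be estimated from the data.
    if lam == 0.0:
        return math.log(-math.log(1.0 - p))       # limiting cloglog case
    return math.log(((1.0 - p) ** (-lam) - 1.0) / lam)

def logit(p):
    return math.log(p / (1.0 - p))
```

Treating `lam` as a free parameter to be estimated is exactly the data-driven modelling whose pros and cons (interpretability, efficiency, overfitting) the abstract discusses.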
Abstract:
2000 Mathematics Subject Classification: 62H12, 62P99
Abstract:
2000 Mathematics Subject Classification: 94A29, 94B70
Abstract:
2000 Mathematics Subject Classification: 60J80
Abstract:
2010 Mathematics Subject Classification: 68T50, 62H30, 62J05.
Abstract:
There has been increasing interest in the use of agent-based simulation, and some discussion of the relative merits of this approach compared to discrete-event simulation. There are differing views on whether agent-based simulation offers capabilities that discrete-event simulation cannot provide, or whether all agent-based applications can, at least in theory, be undertaken using a discrete-event approach. This paper presents a simple agent-based NetLogo model and corresponding discrete-event versions implemented in the widely used ARENA software. The two versions of the discrete-event model use, respectively, the traditional process-flow approach normally adopted in discrete-event simulation software and an agent-based approach to the model build. In addition, a real-time spatial visual display facility is provided, using a spreadsheet platform controlled by VBA code embedded within the ARENA model. Initial findings from this investigation are that discrete-event simulation can indeed be used to implement agent-based models and, with suitable integration elements such as VBA, can provide the spatial displays associated with agent-based software.
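The core of the argument, that agent behaviour can be driven by a discrete-event engine, can be sketched without any specific package. The toy model below (a stand-in, not the paper's NetLogo/ARENA models) advances agents on a line purely through a future-event list, the central data structure of discrete-event simulation.

```python
import heapq
import random

def run_des_agents(n_agents=5, horizon=10.0, seed=1):
    # Each agent is advanced by its own "move" events, drawn in time order
    # from a discrete-event future-event list (FEL): agent behaviour
    # implemented with event scheduling rather than an agent framework.
    rng = random.Random(seed)
    pos = [0.0] * n_agents
    fel = [(rng.expovariate(1.0), i) for i in range(n_agents)]  # (time, agent)
    heapq.heapify(fel)
    while fel:
        t, i = heapq.heappop(fel)              # next scheduled event
        if t > horizon:
            break
        pos[i] += rng.choice([-1.0, 1.0])      # the agent's behaviour rule
        heapq.heappush(fel, (t + rng.expovariate(1.0), i))
    return pos

positions = run_des_agents()
```

Richer behaviour rules (state, interaction with neighbours) slot into the same event loop, which is the sense in which a discrete-event engine can host an agent-based model.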
Abstract:
The solution of a TU cooperative game can be a distribution of the value of the grand coalition, i.e. a distribution of the payoff (utility) that all the players together achieve. In a regression model, the evaluation of the explanatory variables can be a distribution of the overall fit, i.e. the fit of the model in which every regressor variable is involved. Furthermore, we can regard regression models as TU cooperative games in which the explanatory (regressor) variables are the players. In this paper we introduce the class of regression games, characterize it, and apply the Shapley value to evaluating the explanatory variables in regression models. In order to support our approach we consider Young's (1985) axiomatization of the Shapley value, and conclude that the Shapley value is a reasonable tool for evaluating the explanatory variables of regression models.
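A minimal sketch of such a regression game, under the natural assumption that the characteristic function of a coalition of regressors is the R-squared of the OLS fit on those regressors (the data below are simulated for illustration):

```python
import numpy as np
from itertools import combinations
from math import factorial

def r2(X, y, cols):
    # R^2 of an OLS fit of y on the columns in cols (intercept included);
    # the empty coalition gives the intercept-only model with R^2 = 0.
    Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def shapley_r2(X, y):
    # Shapley value of each regressor in the game v(S) = R^2(S):
    # the weighted average of its marginal contribution to every coalition.
    p = X.shape[1]
    phi = np.zeros(p)
    for i in range(p):
        others = [j for j in range(p) if j != i]
        for k in range(p):
            for S in combinations(others, k):
                w = factorial(k) * factorial(p - k - 1) / factorial(p)
                phi[i] += w * (r2(X, y, list(S) + [i]) - r2(X, y, list(S)))
    return phi

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X[:, 0] + X[:, 1] + rng.normal(0.0, 0.1, 200)
phi = shapley_r2(X, y)   # attribution of the full-model R^2 to each regressor
```

By the efficiency axiom, the attributions sum exactly to the full-model fit, which is what makes the Shapley value a coherent way to split the overall R-squared among the regressors.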
Abstract:
Duality can be viewed as the soul of each von Neumann growth model. This is not at all surprising, because von Neumann (1955), a mathematical genius, extensively studied quantum mechanics, which involves a “dual nature” (electromagnetic waves and discrete corpuscles, or light quanta). This may have had some influence on the development of his own economic duality concept. The main object of this paper is to restore the spirit of economic duality in the investigation of the multiple von Neumann equilibria. By means of the (ir)reducibility taxonomy in Móczár (1995), the author transforms the primal canonical decomposition given by Bromek (1974) in the von Neumann growth model into the synergistic primal and dual canonical decomposition. This enables us to obtain all the information about the steadily maintainable states of growth sustained by the compatible price-constellations at each distinct expansion factor.
Abstract:
In 2010, a household survey was carried out in Hungary among 1037 respondents to study consumer preferences and willingness to pay for health care services. In this paper, we use the data from the discrete choice experiments included in the survey to elicit the preferences of health care consumers regarding the choice of health care providers. Regression analysis is used to estimate the effect of improvements in service attributes (quality, access, and price) on patients’ choice, as well as the differences among socio-demographic groups. We also estimate the marginal willingness to pay for improvements in attribute levels by calculating marginal rates of substitution. The results show that respondents from a village or the capital, with low education and bad health status, are more driven by changes in the price attribute when choosing between health care providers. Respondents value the good skills and reputation of the physician and the attitude of the personnel most, followed by modern equipment and maintenance of the office/hospital. Access attributes (travelling and waiting time) are less important. The method of discrete choice experiments is useful for revealing patients’ preferences, and might support the development of evidence-based and sustainable health policy on patient payments.
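The marginal-rate-of-substitution calculation behind such willingness-to-pay estimates is simple once choice-model coefficients are in hand. The coefficient values below are made up for illustration and are not the survey's estimates:

```python
def marginal_wtp(beta_attr, beta_price):
    # Marginal rate of substitution between an attribute and price:
    # the price change that exactly offsets a one-unit attribute
    # improvement, holding utility constant.
    return -beta_attr / beta_price

# Hypothetical conditional-logit coefficients (illustrative only);
# price enters utility negatively, as usual.
betas = {"physician_skills": 0.9, "modern_equipment": 0.5, "waiting_time": -0.2}
beta_price = -0.003
wtp = {name: marginal_wtp(b, beta_price) for name, b in betas.items()}
```

A positive ratio is the amount a respondent would pay for a one-unit improvement; a negative one (e.g. for waiting time) is the compensation needed to accept a one-unit worsening.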
Abstract:
This paper explains how Poisson regression can be used in studies in which the dependent variable describes the number of occurrences of some rare event such as suicide. After pointing out why ordinary linear regression is inappropriate for treating dependent variables of this sort, we go on to present the basic Poisson regression model and show how it fits in the broad class of generalized linear models. Then we turn to discussing a major problem of Poisson regression known as overdispersion and suggest possible solutions, including the correction of standard errors and negative binomial regression. The paper ends with a detailed empirical example, drawn from our own research on suicide.
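The model the abstract describes is a standard GLM, so it can be sketched directly: a Poisson regression with log link fitted by iteratively reweighted least squares, followed by the Pearson dispersion statistic used to check for overdispersion. This is a textbook sketch on simulated counts, not the paper's software or suicide data:

```python
import numpy as np

def poisson_irls(X, y, n_iter=25):
    # Poisson regression with log link, fitted by iteratively
    # reweighted least squares (the standard GLM algorithm).
    Xa = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xa.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xa @ beta, -20, 20)   # linear predictor (clipped for safety)
        mu = np.exp(eta)                    # fitted means
        z = eta + (y - mu) / mu             # working response
        W = mu                              # Poisson working weights
        beta = np.linalg.solve(Xa.T @ (W[:, None] * Xa), Xa.T @ (W * z))
    return beta

def pearson_dispersion(X, y, beta):
    # Pearson chi-square divided by residual df; values well above 1
    # signal overdispersion, pointing towards corrected standard errors
    # or a negative binomial model, as the abstract suggests.
    Xa = np.column_stack([np.ones(len(y)), X])
    mu = np.exp(Xa @ beta)
    return float(np.sum((y - mu) ** 2 / mu) / (len(y) - Xa.shape[1]))

# Simulated count data with true coefficients (0.5, 0.8):
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = rng.poisson(np.exp(0.5 + 0.8 * x))
beta = poisson_irls(x.reshape(-1, 1), y)
```

On truly Poisson data the dispersion statistic sits near 1; on overdispersed counts such as suicide data it is typically well above 1, which is the diagnostic the paper builds on.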