898 resultados para Discrete Regression and Qualitative Choice Models
Resumo:
Several methods and approaches for measuring parameters to determine fecal sources of pollution in water have been developed in recent years. No single microbial or chemical parameter has proved sufficient to determine the source of fecal pollution. Combinations of parameters involving at least one discriminating indicator and one universal fecal indicator offer the most promising solutions for qualitative and quantitative analyses. The universal (nondiscriminating) fecal indicator provides quantitative information regarding the fecal load. The discriminating indicator contributes to the identification of a specific source. The relative values of the parameters derived from both kinds of indicators could provide information regarding the contribution to the total fecal load from each origin. It is also essential that both parameters characteristically persist in the environment for similar periods. Numerical analysis, such as inductive learning methods, could be used to select the most suitable and the lowest number of parameters to develop predictive models. These combinations of parameters provide information on factors affecting the models, such as dilution, specific types of animal source, persistence of microbial tracers, and complex mixtures from different sources. The combined use of the enumeration of somatic coliphages and the enumeration of Bacteroides-phages using different host specific strains (one from humans and another from pigs), both selected using the suggested approach, provides a feasible model for quantitative and qualitative analyses of fecal source identification.
Resumo:
Many European states apply score systems to evaluate the disability severity of non-fatal motor victims under the law of third-party liability. The score is a non-negative integer with an upper bound at 100 that increases with severity. It may be automatically converted into financial terms and thus also reflects the compensation cost for disability. In this paper, discrete regression models are applied to analyze the factors that influence the disability severity score of victims. Standard and zero-altered regression models are compared from two perspectives: an interpretation of the data generating process and the level of statistical fit. The results have implications for traffic safety policy decisions aimed at reducing accident severity. An application using data from Spain is provided.
Resumo:
In general, models of ecological systems can be broadly categorized as ’top-down’ or ’bottom-up’ models, based on the hierarchical level that the model processes are formulated on. The structure of a top-down, also known as phenomenological, population model can be interpreted in terms of population characteristics, but it typically lacks an interpretation on a more basic level. In contrast, bottom-up, also known as mechanistic, population models are derived from assumptions and processes on a more basic level, which allows interpretation of the model parameters in terms of individual behavior. Both approaches, phenomenological and mechanistic modelling, can have their advantages and disadvantages in different situations. However, mechanistically derived models might be better at capturing the properties of the system at hand, and thus give more accurate predictions. In particular, when models are used for evolutionary studies, mechanistic models are more appropriate, since natural selection takes place on the individual level, and in mechanistic models the direct connection between model parameters and individual properties has already been established. The purpose of this thesis is twofold. Firstly, a systematical way to derive mechanistic discrete-time population models is presented. The derivation is based on combining explicitly modelled, continuous processes on the individual level within a reproductive period with a discrete-time maturation process between reproductive periods. Secondly, as an example of how evolutionary studies can be carried out in mechanistic models, the evolution of the timing of reproduction is investigated. Thus, these two lines of research, derivation of mechanistic population models and evolutionary studies, are complementary to each other.
Resumo:
Systems biology is a new, emerging and rapidly developing, multidisciplinary research field that aims to study biochemical and biological systems from a holistic perspective, with the goal of providing a comprehensive, system- level understanding of cellular behaviour. In this way, it addresses one of the greatest challenges faced by contemporary biology, which is to compre- hend the function of complex biological systems. Systems biology combines various methods that originate from scientific disciplines such as molecu- lar biology, chemistry, engineering sciences, mathematics, computer science and systems theory. Systems biology, unlike “traditional” biology, focuses on high-level concepts such as: network, component, robustness, efficiency, control, regulation, hierarchical design, synchronization, concurrency, and many others. The very terminology of systems biology is “foreign” to “tra- ditional” biology, marks its drastic shift in the research paradigm and it indicates close linkage of systems biology to computer science. One of the basic tools utilized in systems biology is the mathematical modelling of life processes tightly linked to experimental practice. The stud- ies contained in this thesis revolve around a number of challenges commonly encountered in the computational modelling in systems biology. The re- search comprises of the development and application of a broad range of methods originating in the fields of computer science and mathematics for construction and analysis of computational models in systems biology. In particular, the performed research is setup in the context of two biolog- ical phenomena chosen as modelling case studies: 1) the eukaryotic heat shock response and 2) the in vitro self-assembly of intermediate filaments, one of the main constituents of the cytoskeleton. The range of presented approaches spans from heuristic, through numerical and statistical to ana- lytical methods applied in the effort to formally describe and analyse the two biological processes. We notice however, that although applied to cer- tain case studies, the presented methods are not limited to them and can be utilized in the analysis of other biological mechanisms as well as com- plex systems in general. The full range of developed and applied modelling techniques as well as model analysis methodologies constitutes a rich mod- elling framework. Moreover, the presentation of the developed methods, their application to the two case studies and the discussions concerning their potentials and limitations point to the difficulties and challenges one encounters in computational modelling of biological systems. The problems of model identifiability, model comparison, model refinement, model inte- gration and extension, choice of the proper modelling framework and level of abstraction, or the choice of the proper scope of the model run through this thesis.
Resumo:
It Has Been Argued That in the Construction and Simulation Process of Computable General Equilibrium (Cge) Models, the Choice of the Proper Macroclosure Remains a Fundamental Problem. in This Study, with a Standard Cge Model, We Simulate Disturbances Stemming From the Supply Or Demand Side of the Economy, Under Alternative Macroclosures. According to Our Results, the Choice of a Particular Closure Rule, for a Given Disturbance, May Have Different Quantitative and Qualitative Impacts. This Seems to Confirm the Imiportance of Simulating Cge Models Under Alternative Closure Rules and Eventually Choosing the Closure Which Best Applies to the Economy Under Study.
Resumo:
We study the problem of testing the error distribution in a multivariate linear regression (MLR) model. The tests are functions of appropriately standardized multivariate least squares residuals whose distribution is invariant to the unknown cross-equation error covariance matrix. Empirical multivariate skewness and kurtosis criteria are then compared to simulation-based estimate of their expected value under the hypothesized distribution. Special cases considered include testing multivariate normal, Student t; normal mixtures and stable error models. In the Gaussian case, finite-sample versions of the standard multivariate skewness and kurtosis tests are derived. To do this, we exploit simple, double and multi-stage Monte Carlo test methods. For non-Gaussian distribution families involving nuisance parameters, confidence sets are derived for the the nuisance parameters and the error distribution. The procedures considered are evaluated in a small simulation experi-ment. Finally, the tests are applied to an asset pricing model with observable risk-free rates, using monthly returns on New York Stock Exchange (NYSE) portfolios over five-year subperiods from 1926-1995.
Resumo:
Dans cette thèse, je me suis interessé à l’identification partielle des effets de traitements dans différents modèles de choix discrets avec traitements endogènes. Les modèles d’effets de traitement ont pour but de mesurer l’impact de certaines interventions sur certaines variables d’intérêt. Le type de traitement et la variable d’intérêt peuvent être défini de manière générale afin de pouvoir être appliqué à plusieurs différents contextes. Il y a plusieurs exemples de traitement en économie du travail, de la santé, de l’éducation, ou en organisation industrielle telle que les programmes de formation à l’emploi, les techniques médicales, l’investissement en recherche et développement, ou l’appartenance à un syndicat. La décision d’être traité ou pas n’est généralement pas aléatoire mais est basée sur des choix et des préférences individuelles. Dans un tel contexte, mesurer l’effet du traitement devient problématique car il faut tenir compte du biais de sélection. Plusieurs versions paramétriques de ces modèles ont été largement étudiées dans la littérature, cependant dans les modèles à variation discrète, la paramétrisation est une source importante d’identification. Dans un tel contexte, il est donc difficile de savoir si les résultats empiriques obtenus sont guidés par les données ou par la paramétrisation imposée au modèle. Etant donné, que les formes paramétriques proposées pour ces types de modèles n’ont généralement pas de fondement économique, je propose dans cette thèse de regarder la version nonparamétrique de ces modèles. Ceci permettra donc de proposer des politiques économiques plus robustes. La principale difficulté dans l’identification nonparamétrique de fonctions structurelles, est le fait que la structure suggérée ne permet pas d’identifier un unique processus générateur des données et ceci peut être du soit à la présence d’équilibres multiples ou soit à des contraintes sur les observables. Dans de telles situations, les méthodes d’identifications traditionnelles deviennent inapplicable d’où le récent développement de la littérature sur l’identification dans les modèles incomplets. Cette littérature porte une attention particuliere à l’identification de l’ensemble des fonctions structurelles d’intérêt qui sont compatibles avec la vraie distribution des données, cet ensemble est appelé : l’ensemble identifié. Par conséquent, dans le premier chapitre de la thèse, je caractérise l’ensemble identifié pour les effets de traitements dans le modèle triangulaire binaire. Dans le second chapitre, je considère le modèle de Roy discret. Je caractérise l’ensemble identifié pour les effets de traitements dans un modèle de choix de secteur lorsque la variable d’intérêt est discrète. Les hypothèses de sélection du secteur comprennent le choix de sélection simple, étendu et généralisé de Roy. Dans le dernier chapitre, je considère un modèle à variable dépendante binaire avec plusieurs dimensions d’hétérogéneité, tels que les jeux d’entrées ou de participation. je caractérise l’ensemble identifié pour les fonctions de profits des firmes dans un jeux avec deux firmes et à information complète. Dans tout les chapitres, l’ensemble identifié des fonctions d’intérêt sont écrites sous formes de bornes et assez simple pour être estimées à partir des méthodes d’inférence existantes.
Resumo:
Interaction effects are usually modeled by means of moderated regression analysis. Structural equation models with non-linear constraints make it possible to estimate interaction effects while correcting for measurement error. From the various specifications, Jöreskog and Yang's (1996, 1998), likely the most parsimonious, has been chosen and further simplified. Up to now, only direct effects have been specified, thus wasting much of the capability of the structural equation approach. This paper presents and discusses an extension of Jöreskog and Yang's specification that can handle direct, indirect and interaction effects simultaneously. The model is illustrated by a study of the effects of an interactive style of use of budgets on both company innovation and performance
Resumo:
We study the role of natural resource windfalls in explaining the efficiency of public expenditures. Using a rich dataset of expenditures and public good provision for 1,836 municipalities in Peru for period 2001-2010, we estimate a non-monotonic relationship between the efficiency of public good provision and the level of natural resource transfers. Local governments that were extremely favored by the boom of mineral prices were more efficient in using fiscal windfalls whereas those benefited with modest transfers were more inefficient. These results can be explained by the increase in political competition associated with the boom. However, the fact that increases in efficiency were related to reductions in public good provision casts doubts about the beneficial effects of political competition in promoting efficiency.
Resumo:
Experiments with CO2 instantaneously quadrupled and then held constant are used to show that the relationship between the global-mean net heat input to the climate system and the global-mean surface-air-temperature change is nonlinear in Coupled Model Intercomparison Project phase 5 (CMIP5) Atmosphere-Ocean General Circulation Models (AOGCMs). The nonlinearity is shown to arise from a change in strength of climate feedbacks driven by an evolving pattern of surface warming. In 23 out of the 27 AOGCMs examined the climate feedback parameter becomes significantly (95% confidence) less negative – i.e. the effective climate sensitivity increases – as time passes. Cloud feedback parameters show the largest changes. In the AOGCM-mean approximately 60% of the change in feedback parameter comes from the topics (30N-30S). An important region involved is the tropical Pacific where the surface warming intensifies in the east after a few decades. The dependence of climate feedbacks on an evolving pattern of surface warming is confirmed using the HadGEM2 and HadCM3 atmosphere GCMs (AGCMs). With monthly evolving sea-surface-temperatures and sea-ice prescribed from its AOGCM counterpart each AGCM reproduces the time-varying feedbacks, but when a fixed pattern of warming is prescribed the radiative response is linear with global temperature change or nearly so. We also demonstrate that the regression and fixed-SST methods for evaluating effective radiative forcing are in principle different, because rapid SST adjustment when CO2 is changed can produce a pattern of surface temperature change with zero global mean but non-zero change in net radiation at the top of the atmosphere (~ -0.5 Wm-2 in HadCM3).
Resumo:
Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform equally or better than non-linear models with or without the frequently used wind-to-power transformation.
Resumo:
Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm, clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad scale phylogeny and in an attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve we conducted a series of experiments using different data partitions and heterogeneous substitution models. Usually more complex modeling schemes are favored regardless of the partitions recognized but model choice had little effect on topology or support values. In contrast there are consistent but weakly supported differences in the topologies recovered from coding and non-coding matrices. These conflicts directly correspond to relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest incongruent signal has contributed to our inability to confidently resolve these problem areas. (c) 2007 Elsevier Inc. All rights reserved.
Resumo:
We introduce in this paper a new class of discrete generalized nonlinear models to extend the binomial, Poisson and negative binomial models to cope with count data. This class of models includes some important models such as log-nonlinear models, logit, probit and negative binomial nonlinear models, generalized Poisson and generalized negative binomial regression models, among other models, which enables the fitting of a wide range of models to count data. We derive an iterative process for fitting these models by maximum likelihood and discuss inference on the parameters. The usefulness of the new class of models is illustrated with an application to a real data set. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
The class of symmetric linear regression models has the normal linear regression model as a special case and includes several models that assume that the errors follow a symmetric distribution with longer-than-normal tails. An important member of this class is the t linear regression model, which is commonly used as an alternative to the usual normal regression model when the data contain extreme or outlying observations. In this article, we develop second-order asymptotic theory for score tests in this class of models. We obtain Bartlett-corrected score statistics for testing hypotheses on the regression and the dispersion parameters. The corrected statistics have chi-squared distributions with errors of order O(n(-3/2)), n being the sample size. The corrections represent an improvement over the corresponding original Rao`s score statistics, which are chi-squared distributed up to errors of order O(n(-1)). Simulation results show that the corrected score tests perform much better than their uncorrected counterparts in samples of small or moderate size.
Resumo:
This article presents important properties of standard discrete distributions and its conjugate densities. The Bernoulli and Poisson processes are described as generators of such discrete models. A characterization of distributions by mixtures is also introduced. This article adopts a novel singular notation and representation. Singular representations are unusual in statistical texts. Nevertheless, the singular notation makes it simpler to extend and generalize theoretical results and greatly facilitates numerical and computational implementation.