955 results for NON-LINEAR MODELS
Abstract:
Generalized linear mixed models (GLMMs) provide an elegant framework for the analysis of correlated data. Because the likelihood has no closed form, GLMMs are often fit by computational procedures such as penalized quasi-likelihood (PQL). Special cases of these models are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory constraints often make it difficult to apply these iterative procedures to data sets with a very large number of cases. This paper proposes a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits sub-models of the GLMM to subsetted versions of the data. Additional gains in efficiency are achieved for Poisson models, commonly used in disease mapping problems, because of their special collapsibility property, which allows data reduction through summaries. Convergence of the proposed iterative procedure is guaranteed for canonical link functions. The strategy is applied to investigate the relationship between ischemic heart disease, socioeconomic status and age/gender category in New South Wales, Australia, based on outcome data consisting of approximately 33 million records. A simulation study demonstrates the algorithm's reliability in analyzing a data set with 12 million records for a (non-collapsible) logistic regression model.
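The blockwise idea can be sketched quickly. Below is a minimal, hypothetical Python illustration of Gauss-Seidel fitting for a plain GLM (a sketch of the general idea, not the paper's full GLMM procedure; the function name and blocking scheme are ours): cycle over blocks of coefficients, refitting a small sub-model for each block while the linear predictor of the remaining blocks is held fixed as an offset.

```python
# Hypothetical sketch: Gauss-Seidel over coefficient blocks of a GLM.
# Each sub-fit sees only one block's columns; the other blocks enter
# through a fixed offset, so memory use stays small per iteration.
import numpy as np
import statsmodels.api as sm

def gauss_seidel_glm(y, X_blocks, family=sm.families.Poisson(), n_sweeps=20):
    """X_blocks: list of design-matrix blocks [X1, X2, ...]."""
    betas = [np.zeros(X.shape[1]) for X in X_blocks]
    for _ in range(n_sweeps):
        for j, Xj in enumerate(X_blocks):
            # Linear predictor contributed by all *other* blocks, held fixed.
            offset = sum(X @ b
                         for k, (X, b) in enumerate(zip(X_blocks, betas))
                         if k != j)
            fit = sm.GLM(y, Xj, family=family, offset=offset).fit()
            betas[j] = fit.params
    return betas
```

Because each sweep only ever loads one block of the design matrix, the same scheme extends to data that cannot be held in memory all at once, which is the regime the paper targets.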
Abstract:
This paper proposes Poisson log-linear multilevel models to investigate population variability in sleep state transition rates. We specifically propose a Bayesian Poisson regression model that is more flexible, more scalable to larger studies, and more easily fit than other attempts in the literature. We further use hierarchical random effects to account for pairings of individuals and repeated measures within those individuals, as comparing diseased to non-diseased subjects while minimizing bias is of epidemiologic importance. We estimate essentially non-parametric piecewise constant hazards and smooth them, and allow for time-varying covariates and segment-of-the-night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence between Poisson regression with a log(time) offset and survival regression assuming piecewise constant hazards. This relationship allows us to synthesize two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed.
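The equivalence invoked here is the standard one and is worth writing out (notation ours: d_ij is the event indicator and t_ij the time at risk for subject i in hazard interval j). The piecewise-constant-hazard log-likelihood is

```latex
\ell_{\mathrm{surv}} = \sum_{i,j}\left[d_{ij}\log\lambda_{ij} - \lambda_{ij}t_{ij}\right],
\qquad \lambda_{ij} = \exp\!\left(\alpha_j + x_i^{\top}\beta\right),
```

which matches, up to terms free of the parameters, the log-likelihood of d_ij ~ Poisson(mu_ij) with log mu_ij = log t_ij + alpha_j + x_i^T beta, i.e. exactly a Poisson regression with a log(time) offset.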
Abstract:
In non-linear random effects models, some attention has recently been devoted to the analysis of suitable transformations of the response variables, either separately from (Taylor 1996) or jointly with (Oberg and Davidian 2000) transformations of the covariates; as far as we know, no investigation has been carried out on the choice of link function in such models. In our study we consider the use of a random effects model in which a parameterized family of links (Aranda-Ordaz 1981, Prentice 1996, Pregibon 1980, Stukel 1988 and Czado 1997) is introduced. We point out the advantages and drawbacks associated with this data-driven kind of modeling. Difficulties in the interpretation of regression parameters, and therefore in understanding the influence of covariates, as well as problems related to loss of efficiency of estimates and overfitting, are discussed. A case study on radiotherapy usage in breast cancer treatment is presented.
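A concrete member of the families cited is the Aranda-Ordaz (1981) asymmetric family, which nests standard binary links through a single shape parameter estimated from the data:

```latex
g(\mu;\lambda)=\log\!\left(\frac{(1-\mu)^{-\lambda}-1}{\lambda}\right),\qquad \lambda>0,
```

with lambda = 1 recovering the logit link and lambda -> 0 the complementary log-log. The interpretability and efficiency issues raised above arise precisely because the regression parameters and lambda must be estimated jointly.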
Abstract:
In this work we report a comparison of some theoretical models commonly used to fit the temperature dependence of the fundamental energy gap of semiconductor materials. In our investigation we used the theoretical models of Viña, Pässler-p and Pässler-ρ to fit several sets of experimental data available in the literature for the energy gap of GaAs in the temperature range from 12 to 974 K. Performing several fittings for different values of the upper limit of the analyzed temperature range (Tmax), we were able to follow in a systematic way the evolution of the fitting parameters up to the limit of high temperatures, and to compare the zero-point values obtained from the different models, by extrapolating the linear dependence of the gaps at high T to T = 0 K, with the value determined from the dependence of the gap on isotope mass. Using experimental data measured by absorption spectroscopy, we observed the non-linear behavior of Eg(T) of GaAs for T > ΘD.
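For reference, the two fitting expressions usually associated with these model names, as commonly written in the literature (the symbols follow that convention and are not taken from this abstract):

```latex
% Viña (Bose-Einstein) model:
E(T)=E_B-a_B\left[1+\frac{2}{e^{\Theta/T}-1}\right]
% Pässler p-model:
E(T)=E(0)-\frac{\alpha\,\Theta_p}{2}\left[\left(1+\left(\frac{2T}{\Theta_p}\right)^{p}\right)^{1/p}-1\right]
```

Both reduce to a linear decrease in T at high temperature, which is what makes extrapolating the high-T asymptote back to T = 0 K a meaningful zero-point comparison.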
Abstract:
This communication proposes a simple way to introduce fibers into finite element modelling. This is a promising formulation for dealing with fiber-reinforced composites by the finite element method (FEM), as it allows the consideration of short or long fibers placed arbitrarily inside a continuum domain (matrix). The most important feature of the formulation is that no additional degree of freedom is introduced into the pre-existing finite element numerical system to account for any distribution of fiber inclusions. In other words, the system of equations used to solve a non-reinforced medium has the same size as the one used to solve the reinforced counterpart. Another important characteristic is the reduced work required from the user to introduce fibers, avoiding `rebar` elements, node-by-node geometrical definitions or complex mesh generation. An additional characteristic of the technique is the possibility of representing unbounded stresses at the ends of fibers using a finite number of degrees of freedom. Further studies are required for non-linear applications in which localization may occur. Throughout the text the linear formulation is presented and a bonded connection between fibers and continuum is assumed. Four examples are presented, including non-linear analysis, to validate and show the capabilities of the formulation.
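Why no new unknowns are needed can be sketched in one identity (our notation, not the paper's): the displacement of each fiber node is interpolated from the continuum element that contains it, so the fiber stiffness condenses onto the element's existing nodal degrees of freedom,

```latex
u_f = N(\xi_f)\,u_e
\quad\Longrightarrow\quad
K_e^{\mathrm{reinf}} = K_e + \sum_{f} C_f^{\top} k_f\, C_f,
\qquad C_f = N(\xi_f),
```

where N(xi_f) evaluates the matrix shape functions at the fiber node's parent coordinates and k_f is the fiber element stiffness.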
Abstract:
In this work a new probabilistic and dynamical approach to an extension of the Gompertz law is proposed. A generalized family of probability density functions, designated Beta*(p, q), which is proportional to the right-hand side of the Tsoularis-Wallace model, is studied. In particular, for p = 2, the investigation is extended to extreme value models of Weibull and Fréchet type. These models, described by differential equations, are proportional to the hyper-Gompertz growth model. It is proved that the Beta*(2, q) densities are a mixture of powers of betas, and that their dynamics are determined by a non-linear coupling of probabilities. The dynamical analysis is performed using techniques of symbolic dynamics, and system complexity is measured using topological entropy. Generally, the natural history of a malignant tumour is reflected in bifurcation diagrams, in which regions of regression, stability, bifurcation, chaos and terminus are identified.
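The Tsoularis-Wallace generalized growth law referred to is usually written as

```latex
\frac{dN}{dt}=r\,N^{\alpha}\left[1-\left(\frac{N}{K}\right)^{\beta}\right]^{\gamma},
```

so a Beta*(p, q) density proportional to this right-hand side is, up to normalization, a generalized beta kernel in N/K; particular and limiting choices of (alpha, beta, gamma) recover the logistic, Richards, Gompertz and hyper-Gompertz models.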
Abstract:
Interaction effects are usually modeled by means of moderated regression analysis. Structural equation models with non-linear constraints make it possible to estimate interaction effects while correcting for measurement error. Of the various specifications, Jöreskog and Yang's (1996, 1998), likely the most parsimonious, has been chosen and further simplified. Up to now, only direct effects have been specified, thus wasting much of the capability of the structural equation approach. This paper presents and discusses an extension of Jöreskog and Yang's specification that can handle direct, indirect and interaction effects simultaneously. The model is illustrated by a study of the effects of an interactive style of use of budgets on both company innovation and performance.
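In structural form, the kind of model being extended can be sketched as follows (notation ours, not the paper's):

```latex
\eta=\alpha+\gamma_1\xi_1+\gamma_2\xi_2+\gamma_3\,\xi_1\xi_2+\zeta,
```

where xi_1 and xi_2 are latent predictors measured with error; the non-linear constraints tie the loadings and variances of the product indicators for xi_1 xi_2 to those of the indicators of xi_1 and xi_2, which is what corrects the interaction estimate for measurement error.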
Abstract:
An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled 'Advances in GLMs/GAMs modeling: from species distribution to environmental management', held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, and provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression (an alternative to stepwise selection of predictors) and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance the application of GLMs and GAMs to ecological modeling.
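As a minimal illustration of the GLM end of this toolkit, a hedged Python sketch on synthetic presence/absence data (the predictors, effect sizes and sample size are invented for illustration; a GAM would replace the linear terms with smooth functions of the predictors):

```python
# Minimal GLM illustration: logistic regression for species
# presence/absence against two hypothetical environmental predictors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
temp = rng.normal(10, 3, 500)          # hypothetical temperature (deg C)
elev = rng.normal(800, 200, 500)       # hypothetical elevation (m)
logit = -2 + 0.4 * temp - 0.002 * elev # invented "true" relationship
presence = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([temp, elev]))
fit = sm.GLM(presence, X, family=sm.families.Binomial()).fit()
print(fit.summary())                   # coefficients, deviance, diagnostics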
Abstract:
The paper proposes an approach aimed at detecting optimal model parameter combinations to achieve the most representative description of uncertainty in the model performance. A classification problem is posed to find the regions of good-fitting models according to the values of a cost function. Support Vector Machine (SVM) classification in the parameter space is applied to decide whether a forward model simulation should be computed for a particular generated model. SVM is particularly well suited to classification problems in high-dimensional spaces, handled in a non-parametric and non-linear way. SVM decision boundaries determine the regions that are subject to the largest uncertainty in the cost function classification and therefore provide guidelines for further iterative exploration of the model space. The proposed approach is illustrated by a synthetic example of fluid flow through porous media, which features a highly variable response due to the combination of parameter values.
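A minimal sketch of the screening idea with scikit-learn (the forward model, cost threshold and two-parameter space are placeholders, not taken from the paper): train an SVM on already-evaluated parameter sets labeled good/bad, then direct new expensive simulations toward the decision boundary, where the classification is most uncertain.

```python
# Sketch: SVM screening of a model parameter space (all names hypothetical).
import numpy as np
from sklearn.svm import SVC

def forward_cost(theta):
    # Placeholder for the expensive forward simulation + cost function.
    return np.sum((theta - 0.3) ** 2)

rng = np.random.default_rng(1)
evaluated = rng.uniform(0, 1, (200, 2))             # already-simulated models
labels = np.array([forward_cost(t) < 0.1 for t in evaluated])

clf = SVC(kernel="rbf", probability=True).fit(evaluated, labels)

candidates = rng.uniform(0, 1, (10_000, 2))         # cheap to generate
p_good = clf.predict_proba(candidates)[:, 1]
# Run the forward model only near the decision boundary, where the
# classification of the cost function is most uncertain.
to_simulate = candidates[np.abs(p_good - 0.5) < 0.1]
```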
Abstract:
In dynamic models of energy allocation, assimilated energy is allocated to reproduction, somatic growth, maintenance or storage, and the allocation pattern can change with age. The expected evolutionary outcome is an optimal allocation pattern, but this depends on the environment experienced during the evolutionary process and on the fitness costs and benefits incurred by allocating resources in different ways. Here we review existing treatments which encompass some of the possibilities regarding constant or variable environments and their predictability or unpredictability, and the ways in which production rates and mortality rates depend on body size, composition, age and the pattern of energy allocation. The optimal policy is to allocate resources where selection pressures are highest, and simultaneous allocation to several body subsystems and reproduction can be optimal if these pressures are equal. This may explain the balanced growth commonly observed during ontogeny. Growth ceases at maturity in many models; factors favouring growth after maturity include non-linear trade-offs, variable season length, and production and mortality rates that are both increasing (or decreasing) functions of body size. We cannot yet say whether these factors are sufficient to account for the many known cases of growth after maturity, and not all reasonable models have yet been explored. Factors favouring storage are also reviewed.
Abstract:
The comparison of radiotherapy techniques with regard to secondary cancer risk has yielded contradictory results, possibly stemming from the many different approaches used to estimate risk. The purpose of this study was to make a comprehensive evaluation of the different available risk models applied to detailed whole-body dose distributions computed by Monte Carlo for various breast radiotherapy techniques, including conventional open tangents, 3D conformal wedged tangents and hybrid intensity modulated radiation therapy (IMRT). First, organ-specific linear risk models developed by the International Commission on Radiological Protection (ICRP) and the Biological Effects of Ionizing Radiation (BEIR) VII committee were applied to mean doses for remote organs only and for all solid organs. Then, different general non-linear risk models were applied to the whole-body dose distribution. Finally, organ-specific non-linear risk models for the lung and breast were used to assess the secondary cancer risk for these two specific organs. The 32 different calculated absolute risks span a broad range of values (between 0.1% and 48.5%), underlining the large uncertainties in absolute risk calculation. The ratio of risk between two techniques has often been proposed as a more robust assessment than absolute risk. We found that this ratio could also vary substantially across the different approaches to risk estimation; sometimes the ratio between two techniques ranged over values both smaller and larger than one, which translates into inconsistent conclusions about which technique carries the higher risk. We found, however, that the hybrid IMRT technique resulted in a systematic reduction of risk compared to the other techniques investigated, even though the magnitude of this reduction varied substantially with the approach used. Based on the epidemiological data available, a reasonable approach to risk estimation would be to use organ-specific non-linear risk models applied to the dose distributions of organs within or near the treatment fields (lungs and contralateral breast in the case of breast radiotherapy), as the majority of radiation-induced secondary cancers are found in the beam-bordering regions.
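The divergence between approaches can be made concrete with a hedged sketch (our notation, following Schneider-type organ-equivalent-dose models, not a formula taken from this study): linear models scale risk with mean organ dose, while organ-specific non-linear models typically include a cell-kill competition term,

```latex
R_{\mathrm{lin}} \propto \bar{D}_{\mathrm{organ}},
\qquad
R_{\mathrm{non\text{-}lin}} \propto \frac{1}{V}\sum_{v=1}^{V} D_v\, e^{-\alpha D_v},
```

where the sum runs over the V dose voxels of the organ. Near the treatment fields the local doses are high, the exponential term suppresses risk strongly, and the two model classes can disagree by large factors, consistent with the wide range of absolute risks reported.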
Abstract:
As modern molecular biology moves towards the analysis of biological systems as opposed to their individual components, the need for appropriate mathematical and computational techniques for understanding the dynamics and structure of such systems is becoming more pressing. For example, the modeling of biochemical systems using ordinary differential equations (ODEs) based on high-throughput, time-dense profiles is becoming more commonplace, which is necessitating the development of improved techniques to estimate model parameters from such data. Due to the high dimensionality of this estimation problem, straightforward optimization strategies rarely produce correct parameter values, and hence current methods tend to utilize genetic/evolutionary algorithms to perform non-linear parameter fitting. Here, we describe a completely deterministic approach based on interval analysis. This allows us to examine entire sets of parameters, and thus to exhaust the global search within a finite number of steps. In particular, we show how our method may be applied to a generic class of ODEs used for modeling biochemical systems, called Generalized Mass Action models (GMAs). In addition, we show that for GMAs our method is amenable to the interval-arithmetic technique called constraint propagation, which greatly improves its efficiency. To illustrate the applicability of our method we apply it to some networks of biochemical reactions appearing in the literature, showing in particular that, in addition to estimating system parameters in the absence of noise, our method may also be used to recover the topology of these networks.
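The class of ODEs in question has a fixed power-law structure, which is what makes interval constraint propagation effective; in its generic form a GMA system reads

```latex
\frac{dX_i}{dt}=\sum_{k} a_{ik}\prod_{j} X_j^{\,g_{ijk}},\qquad i=1,\dots,n,
```

with rate constants a_ik and kinetic orders g_ijk to be estimated. For positive states, each term is monotone in each individual parameter over a parameter box, so interval bounds on the right-hand side can be computed tightly and propagated through the constraints.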
Abstract:
This work presents new, efficient Markov chain Monte Carlo (MCMC) simulation methods for statistical analysis in various modelling applications. When using MCMC methods, the model is simulated repeatedly to explore the probability distribution describing the uncertainties in model parameters and predictions. In adaptive MCMC methods based on the Metropolis-Hastings algorithm, the proposal distribution needed by the algorithm learns from the target distribution as the simulation proceeds. Adaptive MCMC methods have been the subject of intensive research lately, as they open the way for substantially easier use of the methodology; the lack of user-friendly computer programs has been a main obstacle to wider acceptance of the methods. This work provides two new adaptive MCMC methods: DRAM and AARJ. The DRAM method has been built especially to work in high-dimensional and non-linear problems. The AARJ method is an extension of DRAM for model selection problems, where the mathematical formulation of the model is uncertain and we want to fit several different models to the same observations simultaneously. The methods were developed with the needs of modelling applications typical of environmental sciences in mind, and the development work was pursued in several application projects. The applications presented in this work are: a wintertime oxygen concentration model for Lake Tuusulanjärvi and adaptive control of the aerator; a nutrition model for Lake Pyhäjärvi and lake management planning; and validation of the algorithms of the GOMOS ozone remote sensing instrument on board the Envisat satellite of the European Space Agency, together with a study of the effects of aerosol model selection on the GOMOS algorithm.
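The adaptation idea at the core of DRAM can be illustrated by its Adaptive-Metropolis ingredient (Haario et al.); a minimal Python sketch, omitting the delayed-rejection stage and all practical safeguards (`log_post` is a user-supplied log-posterior; the tuning constants here are illustrative):

```python
# Minimal Adaptive Metropolis sketch (the AM ingredient of DRAM;
# delayed rejection omitted).
import numpy as np

def adaptive_metropolis(log_post, theta0, n_iter=5000, adapt_start=500):
    d = len(theta0)
    chain = np.empty((n_iter, d))
    theta, lp = np.asarray(theta0, float), log_post(theta0)
    cov = np.eye(d) * 0.1                      # initial proposal covariance
    rng = np.random.default_rng()
    for i in range(n_iter):
        prop = rng.multivariate_normal(theta, cov)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept step
            theta, lp = prop, lp_prop
        chain[i] = theta
        if i >= adapt_start:
            # Learn the proposal covariance from the chain so far
            # (scaled as in Haario et al., with a small jitter term).
            cov = 2.4**2 / d * (np.cov(chain[:i + 1].T) + 1e-8 * np.eye(d))
    return chain
```

The point of the adaptation is that the proposal covariance is learned from the chain itself, removing the manual tuning step that has limited wider acceptance of the methodology.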
Abstract:
This thesis is concerned with state and parameter estimation in state space models. The estimation of states and parameters is an important task when mathematical modeling is applied to many different application areas such as global positioning systems, target tracking, navigation, brain imaging, the spread of infectious diseases, biological processes, telecommunications, audio signal processing, stochastic optimal control, machine learning, and physical systems. In Bayesian settings, the estimation of states or parameters amounts to computation of the posterior probability density function. Except for a very restricted number of models, it is impossible to compute this density function in closed form, and hence approximation methods are needed. A state estimation problem involves estimating the states (latent variables) that are not directly observed in the output of the system. In this thesis, we use the Kalman filter, extended Kalman filter, Gauss–Hermite filters, and particle filters to estimate the states based on available measurements. Among these filters, particle filters are numerical methods that approximate the filtering distributions of non-linear non-Gaussian state space models via Monte Carlo. The performance of a particle filter depends heavily on the chosen importance distribution; an inappropriate choice can lead to failure of convergence of the particle filter algorithm. In this thesis, we analyze the theoretical Lᵖ convergence of the particle filter with general importance distributions, where p ≥ 2 is an integer. A parameter estimation problem involves inferring the model parameters from measurements. For high-dimensional complex models, estimation of parameters can be done by Markov chain Monte Carlo (MCMC) methods. In its operation, an MCMC method requires the unnormalized posterior distribution of the parameters and a proposal distribution. In this thesis, we show how the posterior density function of the parameters of a state space model can be computed by filtering-based methods, where the states are integrated out. This type of computation is then applied to estimate the parameters of stochastic differential equations. Furthermore, we compute the partial derivatives of the log-posterior density function and use the hybrid Monte Carlo and scaled conjugate gradient methods to infer the parameters of stochastic differential equations. The computational efficiency of MCMC methods depends strongly on the chosen proposal distribution. A commonly used proposal distribution is Gaussian, in which case the covariance matrix must be well tuned; to tune it, adaptive MCMC methods can be used. In this thesis, we propose a new way of updating the covariance matrix using the variational Bayesian adaptive Kalman filter algorithm.
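For concreteness, a bootstrap particle filter, the simplest member of the family studied here, uses the state transition prior as the importance distribution; a minimal sketch for a generic scalar state space model (`propagate` and `loglik` are user-supplied model functions, not taken from the thesis):

```python
# Bootstrap particle filter sketch: importance distribution taken as the
# state transition prior, so weights reduce to the observation likelihood.
import numpy as np

def bootstrap_pf(y, propagate, loglik, n_particles=1000, rng=None):
    rng = rng or np.random.default_rng()
    x = rng.normal(0.0, 1.0, n_particles)      # draw x_0 from a prior
    means = []
    for yt in y:
        x = propagate(x, rng)                  # sample x_t | x_{t-1}
        logw = loglik(yt, x)                   # log p(y_t | x_t) per particle
        w = np.exp(logw - logw.max())          # stabilized weights
        w /= w.sum()
        means.append(np.sum(w * x))            # filtering mean estimate
        x = x[rng.choice(n_particles, n_particles, p=w)]  # resample
    return np.array(means)
```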
Abstract:
The objective of this thesis is to present multivariate time series models involving random vectors whose components are all non-negative. We consider the vMEM models (vector multiplicative error models with non-negative errors) presented by Cipollini, Engle and Gallo (2006) and Cipollini and Gallo (2010). These models generalize to the multivariate case the MEM models introduced by Engle (2002), and find applications notably in financial time series. vMEM models can be used to model series of asset volumes, durations and conditional variances, to cite only these applications; it is also possible to model several series jointly and to study the dynamics between the time series forming the system under study. In order to model multivariate time series with non-negative components, several specifications of the vector error term have been proposed in the literature. A first approach considers random vectors whose error-term distribution is such that each component is non-negative. However, finding a sufficiently flexible multivariate distribution defined on the positive support is rather difficult, at least for the applications cited above. As noted by Cipollini, Engle and Gallo (2006), one possible candidate is a multivariate gamma distribution, which however imposes severe restrictions on the contemporaneous correlations between the variables. Given these limited possibilities, one approach is to use copula theory: marginal distributions with non-negative supports are specified, and a copula function accounts for the dependence between the components. One possible estimation technique is maximum likelihood; an alternative is the generalized method of moments (GMM). The latter has the advantage of being semi-parametric in the sense that, unlike the approach imposing a multivariate law, it is not necessary to specify a multivariate distribution for the error term. In general, the estimation of vMEM models is complicated: existing algorithms must account for the large number of parameters and the elaborate form of the likelihood function, and GMM estimation requires solvers for non-linear systems. In this thesis, considerable effort was devoted to developing computer code (in the R language) to estimate the various parameters of the model. In the first chapter, we define stationary processes, autoregressive processes, autoregressive conditionally heteroscedastic (ARCH) processes and generalized ARCH (GARCH) processes; we also present the ACD duration models and the MEM models. In the second chapter, we present the copula theory needed for our work in the setting of vector multiplicative error models (vMEM), and we discuss possible estimation methods. In the third chapter, we discuss simulation results for several estimation methods.
In the last chapter, applications to financial series are presented. The R code is provided in an appendix. A conclusion completes the thesis.
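In its simplest first-order form, a vMEM for a K-vector of non-negative series can be sketched as follows (our transcription of the standard multivariate extension of Engle's MEM):

```latex
x_t=\mu_t\odot\varepsilon_t,\qquad
\mu_t=\omega+A\,x_{t-1}+B\,\mu_{t-1},\qquad
\varepsilon_t\ge 0,\ \ \mathbb{E}[\varepsilon_t]=\mathbf{1},
```

where the product is elementwise; the copula approach described above enters through the joint distribution of the components of the error vector, with marginals defined on the non-negative support.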