884 results for Generalized linear mixed model


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an effective decision-making system for leak detection based on multiple generalized linear models and clustering techniques. The training data for the proposed decision system are obtained from a fully operational experimental pipeline distribution system. The system is equipped with data logging for three variables: inlet pressure, outlet pressure, and outlet flow. The experimental setup is designed so that multiple operational conditions of the distribution system, including multiple pressures and multiple flows, can be obtained. We then test statistically and show that the pressure and flow variables can be used as a signature of a leak under the designed multi-operational conditions. It is then shown that leak detection based on training and testing the proposed multi-model decision system with prior data clustering, under multi-operational conditions, produces better recognition rates than training based on the single-model approach. The decision system is then equipped with estimated confidence limits, and a method is proposed for using these confidence limits to obtain more robust leakage recognition results.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

2002 Mathematics Subject Classification: 62M10.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Two-stage data envelopment analysis (DEA) efficiency models identify the efficient frontier of a two-stage production process. In some two-stage processes, the inputs to the first stage are also used by the second stage; these are known as shared inputs. This paper proposes a new relational linear DEA model for measuring the efficiency score of two-stage processes with shared inputs under the constant returns-to-scale assumption. Two case studies, one from the banking industry and one from university operations, illustrate the potential applications of the proposed approach.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A class of multi-process models is developed for collections of time indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Pólya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the Pólya-Gamma random variable is developed.

Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.

The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.

The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age specific latent natural ability class and a performance enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.

All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the Pólya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.
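One way to picture a Gaussian approximation to a Pólya-Gamma variable is moment matching: replace PG(b, c) with a normal distribution that has the same mean and variance. The sketch below uses the standard closed-form moments of PG(b, c). It is only an illustration under that assumption and is not necessarily the approximation developed in the dissertation; the function names are hypothetical.

```python
import math
import random


def polya_gamma_moments(b, c):
    """Mean and variance of a Polya-Gamma PG(b, c) random variable.

    mean = b / (2c) * tanh(c / 2)
    var  = b / (4c^3) * (sinh(c) - c) / cosh(c / 2)^2
    """
    if b <= 0:
        raise ValueError("b must be positive")
    c = abs(c)  # PG(b, c) depends on c only through |c|
    if c < 1e-3:
        # Limits as c -> 0; avoids cancellation in the variance formula.
        return b / 4.0, b / 24.0
    mean = b / (2.0 * c) * math.tanh(c / 2.0)
    # (sinh(c) - c) / cosh(c/2)^2 rewritten with t = exp(-c) so that
    # large values of c do not overflow.
    t = math.exp(-c)
    ratio = 2.0 * ((1.0 - t * t) - 2.0 * c * t) / (1.0 + t) ** 2
    var = b / (4.0 * c ** 3) * ratio
    return mean, var


def gaussian_pg_draw(b, c, rng):
    """Moment-matched normal draw standing in for PG(b, c).

    Intended for large b (high counts), where the normal shape is a
    reasonable stand-in and negative draws are very unlikely.
    """
    mean, var = polya_gamma_moments(b, c)
    return rng.gauss(mean, math.sqrt(var))


if __name__ == "__main__":
    rng = random.Random(0)
    print(polya_gamma_moments(500.0, 1.2))
    print(gaussian_pg_draw(500.0, 1.2, rng))
```

Both moments scale linearly in b, so the relative spread shrinks like 1/sqrt(b), which is why a normal stand-in becomes more plausible as the counts grow.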

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2016-08

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this work, the relationship between diameter at breast height (d) and total height (h) of individual trees was modeled with the aim of establishing provisional height-diameter (h-d) equations for maritime pine (Pinus pinaster Ait.) stands in the Lomba ZIF, Northeast Portugal. Using data collected locally, several local and generalized h-d equations from the literature were tested, and adaptations were also considered. Model fitting was conducted using the usual nonlinear least squares (nls) methods. The best local and generalized models selected were also tested as mixed models, applying a first-order conditional expectation (FOCE) approximation procedure and maximum likelihood methods to estimate fixed and random effects. For the calibration of the mixed models, and in order to be consistent with the fitting procedure, the FOCE method was also used to test different sampling designs. The results showed that the local h-d equations with two parameters performed better than the analogous models with three parameters. However, a unique set of parameter values for the local model cannot be applied to all maritime pine stands in Lomba ZIF, and thus a generalized model including covariates from the stand, in addition to d, was necessary to obtain adequate predictive performance. No evident superiority of the generalized mixed model over the generalized model with nonlinear least squares parameter estimates was observed. On the other hand, in the case of the local model, the predictive performance greatly improved when random effects were included. The results showed that the mixed model based on the selected local h-d equation is a viable alternative for estimating h if variables from the stand are not available. Moreover, it was observed that an adequately calibrated response can be obtained using only 2 to 5 additional h-d measurements in quantile (or random) trees from the distribution of d in the plot (stand).
Balancing sampling effort, accuracy and straightforwardness in practical applications, the generalized model from the nls fit is recommended. Examples of applications of the selected generalized equation to forest management are presented, namely how to use it to complete missing information from forest inventory and how such an equation can be incorporated into a stand-level decision support system that aims to optimize forest management for the maximization of wood volume production in Lomba ZIF maritime pine stands.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The roles of weather variability and sunspots in the occurrence of cyanobacteria blooms were investigated using cyanobacteria cell data collected from the Fred Haigh Dam, Queensland, Australia. A time series generalized linear model and a classification and regression tree (CART) model were used in the analysis. Data on notified cell numbers of cyanobacteria and weather variables between 2001 and 2005 were provided by the Australian Department of Natural Resources and Water, and the Australian Bureau of Meteorology, respectively. The results indicate that monthly minimum temperature (relative risk [RR]: 1.13; 95% confidence interval [CI]: 1.02-1.25) and rainfall (RR: 1.11; 95% CI: 1.03-1.20) had a positive association, but relative humidity (RR: 0.94; 95% CI: 0.91-0.98) and wind speed (RR: 0.90; 95% CI: 0.82-0.98) were negatively associated with the cyanobacterial numbers, after adjustment for seasonality and auto-correlation. The CART model showed that the cyanobacteria numbers were best described by an interaction between minimum temperature, relative humidity, and sunspot numbers. When minimum temperature exceeded 18°C and relative humidity was under 66%, the number of cyanobacterial cells rose by 2.15-fold. We conclude that weather variability and sunspot activity may affect cyanobacterial blooms in dams.
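The relative risks quoted in this abstract can be related to regression coefficients with a short calculation. The sketch below assumes a log link and a symmetric Wald interval on the log scale, neither of which the abstract states, so the recovered coefficient and standard error are only approximate. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def relative_risk(beta, se, z=Z_95):
    """Relative risk and Wald interval from a log-link coefficient."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


def log_scale_from_rr(rr, lower, upper, z=Z_95):
    """Approximate coefficient and standard error behind a reported RR.

    Assumes the interval was built as exp(beta +/- z * se).
    """
    beta = math.log(rr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


if __name__ == "__main__":
    # Minimum temperature as reported in the abstract:
    # RR 1.13, 95% CI 1.02-1.25.
    beta, se = log_scale_from_rr(1.13, 1.02, 1.25)
    print(beta, se)
    print(relative_risk(beta, se))
```

Because the published figures are rounded to two decimals, converting back with `relative_risk` reproduces the reported interval only to within rounding.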

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts, and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy.
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Vigilance declines when people are exposed to highly predictable and uneventful tasks. Monotonous tasks provide little cognitive and motor stimulation and contribute to human errors. This paper aims to model and detect vigilance decline in real time through participants' reaction times during a monotonous task. A lab-based experiment adapting the Sustained Attention to Response Task (SART) is conducted to quantify the effect of monotony on overall performance. Relevant parameters are then used to build a model that detects hypovigilance throughout the experiment. The accuracy of different mathematical models is compared for detecting lapses in vigilance in real time (minute by minute) during the task. We show that monotonous tasks can lead to an average decline in performance of 45%. Furthermore, vigilance modelling makes it possible to detect vigilance decline through reaction times with an accuracy of 72% and a 29% false alarm rate. Bayesian models are identified as better at detecting lapses in vigilance than Neural Networks and Generalised Linear Mixed Models. This modelling could be used as a framework to detect vigilance decline in any human performing monotonous tasks.
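The two figures reported above, accuracy and false alarm rate, are commonly computed from minute-by-minute detections as in the sketch below. The definitions used here (overall agreement for accuracy, false positives over all lapse-free minutes for the false alarm rate) are the usual ones and are an assumption about how the paper computed them. The labels in the example are made up for illustration.

```python
def detection_metrics(predicted, actual):
    """Accuracy and false alarm rate for binary lapse detection.

    predicted, actual: equal-length sequences of 0/1 flags, one per
    minute, where 1 means a vigilance lapse.
    """
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must not be empty")
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif not p and a:
            fn += 1
        else:
            tn += 1
    accuracy = (tp + tn) / len(actual)
    # Share of lapse-free minutes that were wrongly flagged.
    negatives = fp + tn
    false_alarm_rate = fp / negatives if negatives else float("nan")
    return accuracy, false_alarm_rate


if __name__ == "__main__":
    # Hypothetical flags for eight one-minute windows.
    predicted = [1, 1, 0, 0, 1, 0, 0, 1]
    actual = [1, 0, 0, 0, 1, 1, 0, 0]
    print(detection_metrics(predicted, actual))
```

Reporting both numbers matters because a detector can reach high accuracy on mostly lapse-free data while still raising many false alarms.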

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work is focussed on developing a commissioning procedure so that a Monte Carlo model, which uses BEAMnrc’s standard VARMLC component module, can be adapted to match a specific BrainLAB m3 micro-multileaf collimator (μMLC). A set of measurements is recommended for use as a reference against which the model can be tested and optimised. These include radiochromic film measurements of dose from small and offset fields, as well as measurements of μMLC transmission and interleaf leakage. Simulations and measurements to obtain μMLC scatter factors are shown to be insensitive to relevant model parameters and are therefore not recommended, unless the output of the linear accelerator model is in doubt. Ultimately, this note provides detailed instructions for those intending to optimise a VARMLC model to match the dose delivered by their local BrainLAB m3 μMLC device.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Current estimates of soil C storage potential are based on models or factors that assume linearity between C input levels and C stocks at steady-state, implying that SOC stocks could increase without limit as C input levels increase. However, some soils show little or no increase in steady-state SOC stock with increasing C input levels suggesting that SOC can become saturated with respect to C input. We used long-term field experiment data to assess alternative hypotheses of soil carbon storage by three simple models: a linear model (no saturation), a one-pool whole-soil C saturation model, and a two-pool mixed model with C saturation of a single C pool, but not the whole soil. The one-pool C saturation model best fit the combined data from 14 sites, four individual sites were best-fit with the linear model, and no sites were best fit by the mixed model. These results indicate that existing agricultural field experiments generally have too small a range in C input levels to show saturation behavior, and verify the accepted linear relationship between soil C and C input used to model SOM dynamics. However, all sites combined and the site with the widest range in C input levels were best fit with the C-saturation model. Nevertheless, the same site produced distinct effective stabilization capacity curves rather than an absolute C saturation level. We conclude that the saturation of soil C does occur and therefore the greatest efficiency in soil C sequestration will be in soils further from C saturation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Optimal design for generalized linear models has primarily focused on univariate data. Often experiments are performed that have multiple dependent responses described by regression type models, and it is of interest and of value to design the experiment for all these responses. This requires a multivariate distribution underlying a pre-chosen model for the data. Here, we consider the design of experiments for bivariate binary data which are dependent. We explore Copula functions which provide a rich and flexible class of structures to derive joint distributions for bivariate binary data. We present methods for deriving optimal experimental designs for dependent bivariate binary data using Copulas, and demonstrate that, by including the dependence between responses in the design process, more efficient parameter estimates are obtained than by the usual practice of simply designing for a single variable only. Further, we investigate the robustness of designs with respect to initial parameter estimates and Copula function, and also show the performance of compound criteria within this bivariate binary setting.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: Catheter ablation for atrial fibrillation (AF) is more efficacious than antiarrhythmic therapy. Post-ablation recurrences reduce ablation effectiveness, and lesion discontinuity in the fibrotic linear ablation lesions contributes to them. The anti-fibrotic role of statins in reducing AF is being assessed in current trials. By reducing the chronic pathological fibrosis that occurs in AF, they may reduce AF. However, if statins also have an effect on the acute therapeutic fibrosis of an ablation, this could exacerbate lesion discontinuity and AF recurrence. We tested the hypothesis that statins attenuate ablation lesion continuity in a recognised pig atrial linear ablation model. Aims: To assess whether Atorvastatin diminishes the bi-directional conduction block produced by a linear atrial ablation lesion. Methods: Sixteen pigs were randomised to statin (n=8) or placebo (n=8) with drug pre-treatment for 3 days and a further 4 weeks. At initial electrophysiological study (EPS1), 3D right atrium (RA) mapping and a vertical linear ablation lesion in the posterior RA with bidirectional conduction block were completed (Gepstein Circ 1999). Follow-up electrophysiological assessment (EPS2) at 28 days assessed bidirectional conduction block maintenance. Results: Data from 15/16 (statin=7) pigs were analysed. Mean lesion length was 3.7 ± 0.8 cm with a mean of 17.9 ± 5.7 lesion applications. Bi-directional conduction block was confirmed in 15/15 pigs (100%) at EPS1 and EPS2. Conclusions: Atorvastatin did not affect ablation lesion continuity in this pig atrial linear ablation model. If patients are on long-term statins for AF reduction, periablation cessation is probably not necessary.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Traditional crash prediction models, such as generalized linear regression models, are incapable of taking into account the multilevel data structure that is widespread in crash data. Disregarding the possible within-group correlations can lead to models that give unreliable and biased estimates of unknowns. This study innovatively proposes a -level hierarchy, viz. (Geographic region level – Traffic site level – Traffic crash level – Driver-vehicle unit level – Vehicle-occupant level) Time level, to establish a general form of multilevel data structure in traffic safety analysis. To properly model the potential cross-group heterogeneity due to the multilevel data structure, a framework of Bayesian hierarchical models that explicitly specify the multilevel structure and correctly yield parameter estimates is introduced and recommended. The proposed method is illustrated in an individual-severity analysis of intersection crashes using the Singapore crash records. This study demonstrated the importance of accounting for the within-group correlations, as well as the flexibility and effectiveness of the Bayesian hierarchical method in modeling the multilevel structure of traffic crash data.