926 results for maximum-likelihood approach
Abstract:
The article considers screening human populations with two screening tests. If either of the two tests is positive, a full evaluation of the disease status is undertaken; however, if both diagnostic tests are negative, the disease status remains unknown. This procedure leads to a data constellation in which, for each disease status, the 2 × 2 table associated with the two diagnostic tests used in screening has exactly one empty, unknown cell. To estimate the unobserved cell counts, previous approaches assume independence of the two diagnostic tests and use specific models, including the special mixture model of Walter or unconstrained capture–recapture estimates. Often, as is also demonstrated in this article by means of a simple test, the independence of the two screening tests is not supported by the data. Two new estimators are suggested that allow association between the screening tests, although the form of association must be assumed to be homogeneous over disease status. These estimators are modifications of the simple capture–recapture estimator and are easy to construct. The estimators are investigated for several screening studies with fully evaluated disease status, in which the new estimators can be shown to outperform the previous, conventional ones. Finally, the performance of the new estimators is compared with that of maximum likelihood estimators, which are more difficult to obtain in these models. The results indicate that the loss of efficiency is minor.
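A minimal sketch of the estimators discussed above, with hypothetical screening counts: under independence (psi = 1) the missing double-negative cell is the classical capture-recapture estimate n10·n01/n11, and an assumed homogeneous odds ratio psi rescales it, in the spirit of the article's association-allowing modifications.

```python
import numpy as np

def missing_cell_estimate(n11, n10, n01, psi=1.0):
    """Estimate the unobserved double-negative cell of a 2x2 screening table.

    n11: positive on both tests; n10/n01: positive on exactly one test.
    psi: assumed odds ratio between the two tests (psi = 1 recovers the
    classical independence-based capture-recapture estimator).
    """
    # Under odds ratio psi = (n11 * n00) / (n10 * n01), solve for n00.
    return psi * n10 * n01 / n11

# Illustration with hypothetical counts for one disease status:
n00_indep = missing_cell_estimate(n11=40, n10=25, n01=30)           # independence
n00_assoc = missing_cell_estimate(n11=40, n10=25, n01=30, psi=1.8)  # positive association
print(n00_indep, n00_assoc)
```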
Abstract:
We introduce a procedure for association-based analysis of nuclear families that allows for dichotomous and more general measurements of phenotype and the inclusion of covariate information. Standard generalized linear models are used to relate the phenotype and its predictors. Our test procedure, based on the likelihood ratio, unifies the estimation of all parameters through the likelihood itself and yields maximum likelihood estimates of the genetic relative risk and interaction parameters. Our method has advantages in modelling the covariate and gene-covariate interaction terms over recently proposed conditional score tests, which include covariate information via a two-stage modelling approach. We apply our method in a study of human systemic lupus erythematosus and the C-reactive protein that includes sex as a covariate.
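A hedged sketch of the likelihood-ratio machinery: this uses an ordinary (unconditional) logistic GLM in statsmodels rather than the paper's family-based likelihood, and the variable names ('affected', 'geno', 'sex') and simulated data are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

# Hypothetical data: binary phenotype, allele count, and sex as covariate.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "geno": rng.integers(0, 3, n),
    "sex": rng.integers(0, 2, n),
})
logit_p = -1.0 + 0.4 * df.geno + 0.3 * df.sex + 0.2 * df.geno * df.sex
df["affected"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Null model: covariate only; full model adds gene and gene-covariate interaction.
m0 = smf.logit("affected ~ sex", df).fit(disp=0)
m1 = smf.logit("affected ~ geno * sex", df).fit(disp=0)

lr = 2 * (m1.llf - m0.llf)                     # likelihood ratio statistic
p = chi2.sf(lr, df=m1.df_model - m0.df_model)  # df = number of added parameters
print(f"LR = {lr:.2f}, p = {p:.4f}")
```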
Abstract:
Estimation of population size with a missing zero-class is an important problem encountered in epidemiological assessment studies. Fitting a Poisson model to the observed data by maximum likelihood and estimating the population size from this fit is a widely used approach for this purpose. In practice, however, the Poisson assumption is seldom satisfied. Zelterman (1988) has proposed a robust estimator for unclustered data that works well in a wide class of distributions applicable to count data. In the work presented here, we extend this estimator to clustered data. The estimator requires fitting a zero-truncated homogeneous Poisson model by maximum likelihood and then using a Horvitz-Thompson estimator of population size. This was found to work well when the data follow the hypothesized homogeneous Poisson model; when the true distribution deviates from the hypothesized model, however, the population size was found to be underestimated. In search of a more robust estimator, we focused on three models that use all clusters with exactly one case, with exactly two cases, and with exactly three cases to estimate the probability of the zero-class, thereby using the data collected on all clusters in the Horvitz-Thompson estimator of population size. The loss in efficiency associated with the gain in robustness was examined in a simulation study. As a trade-off between gain in robustness and loss in efficiency, the model that uses data collected on clusters with at most three cases to estimate the probability of the zero-class was found to be preferred in general. In applications, we recommend obtaining estimates from all three models and choosing among them in light of the estimates, robustness, and the loss in efficiency.
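Zelterman's estimator for the unclustered case, combined with the Horvitz-Thompson correction described above, can be sketched as follows (the clustered extension and the two-case/three-case variants are not shown); the sample counts are hypothetical.

```python
import numpy as np
from collections import Counter

def zelterman_population_size(counts):
    """Estimate total population size from zero-truncated count data.

    counts: observed per-unit case counts (zeros are unobservable).
    Uses Zelterman's robust rate estimate lambda = 2*f2/f1 and the
    Horvitz-Thompson correction N = n / (1 - P(zero)).
    """
    freq = Counter(counts)
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    lam = 2.0 * f2 / f1    # Zelterman's estimator of the Poisson rate
    p0 = np.exp(-lam)      # implied probability of the unseen zero-class
    n = len(counts)
    return n / (1.0 - p0)

# Hypothetical zero-truncated sample of case counts:
sample = [1] * 60 + [2] * 25 + [3] * 10 + [4] * 3
print(zelterman_population_size(sample))
```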
Abstract:
Objectives: To assess the potential source of variation that surgeons may add to patient outcome in a clinical trial of surgical procedures. Methods: Two large (n = 1380) parallel multicentre randomized surgical trials, involving 43 surgeons, were undertaken to compare laparoscopically assisted hysterectomy with conventional methods of abdominal and vaginal hysterectomy. The primary end point of the trial was the occurrence of at least one major complication. Patients were nested within surgeons, giving the data set a hierarchical structure. A total of 10% of patients had at least one major complication, that is, a sparse binary outcome variable. A mixed logistic regression model (with logit link function) was used to model the probability of a major complication, with surgeon fitted as a random effect. Models were fitted by maximum likelihood in SAS®. Results: There were many convergence problems. These were resolved using a variety of approaches, including treating all effects as fixed during initial model building, modelling the variance of a parameter on a logarithmic scale, and centring continuous covariates. The initial model-building process indicated no significant 'type of operation' by surgeon interaction effect in either trial; the 'type of operation' term was highly significant in the abdominal trial, and the 'surgeon' term was not significant in either trial. Conclusions: The analysis did not find a surgeon effect, but it is difficult to conclude that there was no difference between surgeons: the statistical test may have lacked sufficient power, and the variance estimates were small with large standard errors, indicating that their precision may be questionable.
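A self-contained sketch of a random-intercept logistic model fitted by maximum likelihood, using Gauss-Hermite quadrature for the marginal likelihood rather than SAS; 'op_type', 'surgeon', and the simulated trial data are hypothetical stand-ins.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss
from scipy.optimize import minimize
from scipy.special import expit

# Simulated sparse binary outcomes for 43 surgeons with a binary covariate.
rng = np.random.default_rng(1)
n_surgeons, per_surgeon = 43, 30
surgeon = np.repeat(np.arange(n_surgeons), per_surgeon)
op_type = rng.integers(0, 2, surgeon.size)
b = rng.normal(0, 0.3, n_surgeons)  # true random surgeon effects
y = rng.binomial(1, expit(-2.2 + 0.6 * op_type + b[surgeon]))

nodes, weights = hermgauss(15)  # Gauss-Hermite quadrature rule

def neg_marginal_loglik(theta):
    beta0, beta1, log_sigma = theta
    sigma = np.exp(log_sigma)  # variance parameter on log scale, as above
    ll = 0.0
    for j in range(n_surgeons):
        yj, xj = y[surgeon == j], op_type[surgeon == j]
        # Integrate the conditional likelihood over b_j ~ N(0, sigma^2):
        # substitute b = sqrt(2)*sigma*t so the weights absorb the Gaussian.
        eta = beta0 + beta1 * xj[:, None] + np.sqrt(2) * sigma * nodes[None, :]
        p = expit(eta)
        lik_j = np.prod(np.where(yj[:, None] == 1, p, 1 - p), axis=0)
        ll += np.log(np.dot(weights, lik_j) / np.sqrt(np.pi))
    return -ll

fit = minimize(neg_marginal_loglik, x0=[0.0, 0.0, -1.0], method="Nelder-Mead")
print(fit.x)  # estimates of beta0, beta1, log(sigma)
```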
Abstract:
The disturbance vicariance hypothesis (DV) has been proposed to explain speciation in Amazonia, especially in its edge regions, e.g. in eastern Guiana Shield harlequin frogs (Atelopus), which are suggested to have derived from a cool-adapted Andean ancestor. In concordance with DV predictions, we studied whether (i) these amphibians display a natural distribution gap in central Amazonia; (ii) east of this gap they constitute a monophyletic lineage which is nested within a pre-Andean/western clade; (iii) climate envelopes of Atelopus west and east of the distribution gap show some macroclimatic divergence due to a regional climate envelope shift; and (iv) geographic distributions of climate envelopes of western and eastern Atelopus range into central Amazonia but with limited spatial overlap. We tested whether presence and apparent absence data points of Atelopus were homogeneously distributed using Ripley's K function. A molecular phylogeny (mitochondrial 16S rRNA gene) was reconstructed using Maximum Likelihood and Bayesian Inference to study whether Guianan Atelopus constitute a clade nested within a larger genus phylogeny. We focused on climate envelope divergence and geographic distribution by computing climate envelope models with MaxEnt based on macroscale bioclimatic parameters and testing them using Schoener's index and a modified Hellinger distance. We corroborated existing DV predictions and, for the first time, formulated new DV predictions aimed at species' climate envelope change. Our results suggest that cool-adapted Andean Atelopus ancestors dispersed into the Amazon basin and further onto the eastern Guiana Shield where, under warm conditions, they were forced to change climate envelopes.
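The two overlap measures named above are simple functions of the normalized suitability surfaces; a sketch assuming the MaxEnt outputs are available as grids (the grids below are random placeholders).

```python
import numpy as np

def schoeners_d(p1, p2):
    """Schoener's index of niche overlap between two suitability surfaces.

    p1, p2: habitat-suitability grids (e.g. MaxEnt outputs), normalized so
    each sums to one; D ranges from 0 (no overlap) to 1 (identical).
    """
    p1, p2 = p1 / p1.sum(), p2 / p2.sum()
    return 1.0 - 0.5 * np.abs(p1 - p2).sum()

def hellinger_based_i(p1, p2):
    """Hellinger-distance-based overlap (Warren et al.'s I statistic)."""
    p1, p2 = p1 / p1.sum(), p2 / p2.sum()
    h2 = np.sum((np.sqrt(p1) - np.sqrt(p2)) ** 2)  # squared Hellinger distance
    return 1.0 - 0.5 * h2

# Hypothetical suitability grids for western and eastern Atelopus:
rng = np.random.default_rng(2)
west, east = rng.random((50, 50)), rng.random((50, 50))
print(schoeners_d(west, east), hellinger_based_i(west, east))
```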
Abstract:
Questions: We assess gap size and shape distributions, two important descriptors of the forest disturbance regime, by asking: which statistical model best describes gap size distribution; can simple geometric forms adequately describe gap shape; does gap size or shape vary with forest type, gap age or the method used for gap delimitation; and how similar are the studied forests to other tropical and temperate forests? Location: Southeastern Atlantic Forest, Brazil. Methods: Analysing over 150 gaps in two distinct forest types (seasonal and rain forests), we used a model selection framework to choose appropriate probability distributions and functions to describe gap size and gap shape. The former was described using univariate probability distributions, whereas the latter was assessed through the gap area-perimeter relationship. Comparisons of gap size and shape between sites, as well as between size and age classes, were then made based on the likelihood of models with different assumptions for the values of their parameters. Results: The log-normal distribution was the best descriptor of gap size distribution, independently of the forest type or gap delimitation method. Because gaps became more irregular as they increased in size, all geometric forms (triangle, rectangle and ellipse) were poor descriptors of gap shape. Only when small and large gaps (> 100 or 400 m², depending on the delimitation method) were treated separately did the rectangle and isosceles triangle become accurate predictors of gap shape. Ellipsoidal shapes were poor descriptors. At both sites, gaps were at least 50% longer than they were wide, a finding with important implications for gap microclimate (e.g. light entrance regime) and, consequently, for gap regeneration. Conclusions: In addition to more appropriate descriptions of gap size and shape, the model selection framework used here efficiently provided a means by which to compare the patterns of two different types of forest. With this framework we were able to recommend the log-normal parameters μ and σ for future comparisons of gap size distribution, and to propose possible mechanisms related to random rates of gap expansion and closure. We also showed that gap shape varied highly and that no single geometric form was able to predict the shape of all gaps; in particular, the ellipse should no longer be used as a standard gap shape.
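The distribution-selection step can be sketched as follows, assuming hypothetical gap areas: candidate distributions are fitted by maximum likelihood and ranked by AIC, with the log-normal expected to win for data like these.

```python
import numpy as np
from scipy import stats

# Hypothetical gap areas (m^2); floc=0 keeps each fit two-parameter,
# since areas are strictly positive.
rng = np.random.default_rng(3)
gap_areas = rng.lognormal(mean=3.5, sigma=0.9, size=150)

candidates = {
    "lognormal": stats.lognorm,
    "gamma": stats.gamma,
    "weibull": stats.weibull_min,
}
for name, dist in candidates.items():
    params = dist.fit(gap_areas, floc=0)             # maximum likelihood fit
    loglik = np.sum(dist.logpdf(gap_areas, *params))
    k = len(params) - 1                              # location fixed, not estimated
    aic = 2 * k - 2 * loglik
    print(f"{name}: AIC = {aic:.1f}")
```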
Abstract:
A demographic model is developed based on interbirth intervals and is applied to estimate the population growth rate of humpback whales (Megaptera novaeangliae) in the Gulf of Maine. Fecundity rates in this model are based on the probabilities of giving birth at time t after a previous birth and on the probabilities of giving birth first at age x. Maximum likelihood methods are used to estimate these probabilities using sighting data collected for individually identified whales. Female survival rates are estimated from these same sighting data using a modified Jolly–Seber method. The youngest age at first parturition is 5 yr, the estimated mean birth interval is 2.38 yr (SE = 0.10 yr), the estimated noncalf survival rate is 0.960 (SE = 0.008), and the estimated calf survival rate is 0.875 (SE = 0.047). The population growth rate (λ) is estimated to be 1.065; its standard error is estimated as 0.012 using a Monte Carlo approach, which simulated sampling from a hypothetical population of whales. The simulation is also used to investigate the bias in estimating birth intervals by previous methods. The approach developed here is applicable to studies of other populations for which individual interbirth intervals can be measured.
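As a rough, hedged illustration only: plugging the abstract's point estimates into a simplified Leslie-matrix calculation (a stand-in for the paper's interbirth-interval model, with fecundity approximated as 0.5 female calves per birth divided by the mean interval) yields a growth rate in the neighborhood of the reported value.

```python
import numpy as np

# Point estimates from the abstract; matrix construction is an assumption.
s_calf, s_adult, interval, age_first = 0.875, 0.960, 2.38, 5
n_ages = 40                                  # truncate the life table
fecundity = 0.5 / interval                   # female calves per adult-year

L = np.zeros((n_ages, n_ages))
L[0, age_first - 1:] = fecundity             # reproduction from age 5 onward
L[1, 0] = s_calf                             # first-year (calf) survival
for a in range(1, n_ages - 1):
    L[a + 1, a] = s_adult                    # survival at older ages

lam = np.max(np.real(np.linalg.eigvals(L)))  # dominant eigenvalue = growth rate
print(f"lambda ~ {lam:.3f}")                 # near the reported 1.065
```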
Abstract:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models, owing to the unobserved nature of the latent outcome variable. This paper presents graphical diagnostic tools to evaluate whether latent class regression models adhere to the standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov chain Monte Carlo (MCMC) estimation procedure. Unlike standard maximum likelihood implementations of latent class regression estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any function of the parameters, and it is this convenience that enables the diagnostic methods we introduce. As a motivating example, we present a latent class regression analysis of the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study: the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the parameter estimates are found to be invalid due to the violation of model assumptions, a violation clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
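The convenience highlighted above: given MCMC output, the posterior of any function of the parameters is obtained by applying the function draw by draw. A sketch with simulated stand-in draws for two class-membership regression coefficients:

```python
import numpy as np

# Hypothetical posterior draws for education and income coefficients,
# standing in for output from an actual MCMC run.
rng = np.random.default_rng(4)
beta_edu = rng.normal(-0.35, 0.08, 5000)
beta_inc = rng.normal(-0.20, 0.10, 5000)

# Posterior of a derived quantity (difference of odds ratios), which has
# no ready-made standard error under a plain maximum likelihood fit:
or_diff = np.exp(beta_edu) - np.exp(beta_inc)
est = np.mean(or_diff)
ci = np.percentile(or_diff, [2.5, 97.5])
print(f"posterior mean {est:.3f}, 95% credible interval {ci}")
```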
Abstract:
Introduction: According to the ecological view, coordination is established by virtue of social context. Affordances, thought of as situational opportunities to interact, are assumed to represent the guiding principles underlying decisions involved in interpersonal coordination. It is generally agreed that affordances are not an objective part of the (social) environment but that they depend on the constructive perception of the subjects involved. Theory and empirical data hold that cognitive operations enabling domain-specific efficacy beliefs are involved in the perception of affordances. The aim of the present study was to test the effects of these cognitive concepts on the subjective construction of local affordances and their influence on decision making in football. Methods: 71 football players (M = 24.3 years, SD = 3.3, 21% women) from different divisions participated in the study. Participants were presented with scenarios of offensive game situations. They were asked to take the perspective of the person on the ball and to indicate where they would pass the ball from within each situation. The participants stated their decisions in two conditions with different game scores (1:0 vs. 0:1). The playing fields of all scenarios were then divided into ten zones. For each zone, participants were asked to rate their confidence in being able to pass the ball there (self-efficacy), the likelihood of the group staying in ball possession if the ball were passed into the zone (group-efficacy I), the likelihood of the ball being received safely by a team member (pass control / group-efficacy II), and whether a pass would establish a better initial position to attack the opponents' goal (offensive convenience). Answers were reported on visual analogue scales ranging from 1 to 10. Data were analyzed by specifying generalized linear models for binomially distributed data (Mplus); maximum likelihood with non-normality-robust standard errors was chosen to estimate parameters. Results: Analyses showed that zone- and domain-specific efficacy beliefs significantly affected passing decisions. Because of collinearity with self-efficacy and group-efficacy I, group-efficacy II was excluded from the models to ease interpretation of the results. Generally, zones with high values in the subjective ratings had a higher probability of being chosen as the passing destination (β_self-efficacy = 0.133, p < .001, OR = 1.142; β_group-efficacy I = 0.128, p < .001, OR = 1.137; β_offensive convenience = 0.057, p < .01, OR = 1.059). There were, however, characteristic differences between the two score conditions: while group-efficacy I was the only significant predictor in condition 1 (β_group-efficacy I = 0.379, p < .001), only self-efficacy and offensive convenience contributed to passing decisions in condition 2 (β_self-efficacy = 0.135, p < .01; β_offensive convenience = 0.120, p < .001). Discussion: The results indicate that subjectively distinct attributes projected onto playfield zones affect passing decisions. The study proposes a probabilistic alternative to Lewin's (1951) hodological and deterministic field theory and offers insight into how dimensions of the psychological landscape afford passing behavior. For members of a team, this psychological landscape is constituted not only by probabilities referring to the potential and consequences of individual behavior, but also by those of the group system of which the individuals are part. Hence, in regulating action decisions in group settings, the informational basis extends to aspects of the group level.
References: Lewin, K. (1951). Field theory in social science: Selected theoretical papers (D. Cartwright, Ed.). New York: Harper & Brothers.
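To make the reported coefficients concrete: odds ratios are exp(β), and a zone's choice probability rises with its subjective ratings. The sketch below uses the abstract's pooled estimates with a hypothetical intercept (none is reported).

```python
import numpy as np
from scipy.special import expit

# Pooled coefficients from the abstract; the intercept is assumed.
b_self, b_group, b_off = 0.133, 0.128, 0.057
intercept = -3.0  # hypothetical, for illustration only

for r_self, r_group, r_off in [(3, 3, 3), (8, 8, 8)]:
    eta = intercept + b_self * r_self + b_group * r_group + b_off * r_off
    print(f"ratings {(r_self, r_group, r_off)} -> P(pass) = {expit(eta):.3f}")

# Odds ratios match the reported 1.142, 1.137 and 1.059:
print("odds ratios:", np.exp([b_self, b_group, b_off]))
```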
Abstract:
Analysis of recurrent events has been widely discussed in the medical, health services, insurance, and engineering fields in recent years. This research proposes a nonhomogeneous Yule process with a proportional intensity assumption to model the hazard function of recurrent-events data and the associated risk factors. The method assumes that repeated events occur for each individual, with given covariates, according to a nonhomogeneous Yule process with intensity function λ_x(t) = λ_0(t) · exp(x′β). One advantage of using a nonhomogeneous Yule process for recurrent events is that it assumes the recurrence rate is proportional to the number of events that have occurred up to time t. Maximum likelihood estimation is used to provide estimates of the parameters in the model, and a generalized scoring iterative procedure is applied in the numerical computation. Model comparisons between the proposed method and other existing recurrent-event models are addressed by simulation. An example concerning recurrent myocardial infarction events, compared between two distinct populations (Mexican-Americans and non-Hispanic Whites) in the Corpus Christi Heart Project, is examined.
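One reading of the model can be sketched by simulating event times via Ogata-style thinning, taking the conditional intensity as (N(t) + 1) · λ_0(t) · exp(x′β) so that the rate grows with the count so far; the baseline intensity and covariate values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_yule_recurrences(lam0, lam0_max, beta, x, t_end):
    """Simulate event times of a nonhomogeneous Yule-type process by thinning.

    Conditional intensity assumed as (N(t) + 1) * lam0(t) * exp(x'beta),
    i.e. proportional to the event count so far. lam0_max must bound
    lam0 on [0, t_end].
    """
    risk = float(np.exp(np.dot(x, beta)))
    t, events = 0.0, []
    while t < t_end:
        bound = (len(events) + 1) * lam0_max * risk  # dominating rate
        t += rng.exponential(1.0 / bound)
        if t >= t_end:
            break
        rate = (len(events) + 1) * lam0(t) * risk
        if rng.random() < rate / bound:              # accept w.p. rate/bound
            events.append(t)
    return events

# Hypothetical baseline intensity rising over follow-up, one binary covariate:
lam0 = lambda t: 0.05 * (1 + 0.1 * t)
times = simulate_yule_recurrences(lam0, lam0_max=0.1, beta=[0.5], x=[1], t_end=5.0)
print(np.round(times, 2))
```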
Abstract:
A Bayesian approach to estimating the intraclass correlation coefficient was used for this research project. The background of the intraclass correlation coefficient, a summary of its standard estimators, and a review of basic Bayesian terminology and methodology are presented. The conditional posterior density of the intraclass correlation coefficient is then derived, and estimation procedures related to this derivation are shown in detail. Three examples of applications of the conditional posterior density to specific data sets are also included. Two sets of simulation experiments were performed to compare the mean and mode of the conditional posterior density of the intraclass correlation coefficient to more traditional estimators. The non-Bayesian methods of estimation used were the methods of analysis of variance and maximum likelihood for balanced data, and the methods of MIVQUE (Minimum Variance Quadratic Unbiased Estimation) and maximum likelihood for unbalanced data. The overall conclusion of this research project was that Bayesian estimates of the intraclass correlation coefficient can be appropriate, useful and practical alternatives to traditional methods of estimation.
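One of the non-Bayesian comparators, the one-way ANOVA estimator of the intraclass correlation for balanced data, is easy to sketch (the data below are simulated with a true ICC of about 0.5):

```python
import numpy as np

def icc_anova(groups):
    """One-way ANOVA estimator of the intraclass correlation (balanced data).

    groups: 2-D array, rows = groups, columns = replicate measurements.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW) for k observations per group.
    """
    groups = np.asarray(groups, dtype=float)
    n, k = groups.shape
    grand = groups.mean()
    msb = k * np.sum((groups.mean(axis=1) - grand) ** 2) / (n - 1)
    msw = np.sum((groups - groups.mean(axis=1, keepdims=True)) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical balanced data: 20 groups, 4 measurements each.
rng = np.random.default_rng(6)
effects = rng.normal(0, 1, (20, 1))          # between-group variance 1
data = effects + rng.normal(0, 1, (20, 4))   # within-group variance 1
print(icc_anova(data))                       # ~0.5
```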
Abstract:
A Bayesian approach to estimation of the regression coefficients of a multinomial logit model with ordinal-scale response categories is presented. A Monte Carlo method is used to construct the posterior distribution of the link function, which is treated as an arbitrary scalar function. The Gauss-Markov theorem is then used to determine a function of the link that produces a random vector of coefficients, and the posterior distribution of this random vector is used to estimate the regression coefficients. The method described is referred to as Bayesian generalized least squares (BGLS) analysis. Two cases involving multinomial logit models are described: Case I involves a cumulative logit model and Case II a proportional-odds model. All inferences about the coefficients for both cases are described in terms of the posterior distribution of the regression coefficients. The results from the BGLS method are compared to maximum likelihood estimates of the regression coefficients. The BGLS method avoids the nonlinear problems encountered when estimating the regression coefficients of a generalized linear model, and it is neither complex nor computationally intensive. The BGLS method offers several advantages over other Bayesian approaches.
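For reference, the proportional-odds model of Case II implies category probabilities of the following form; the cutpoints and coefficients below are hypothetical.

```python
import numpy as np
from scipy.special import expit

def cumulative_logit_probs(x, cutpoints, beta):
    """Category probabilities under a proportional-odds (cumulative logit) model.

    P(Y <= j | x) = expit(alpha_j - x'beta); the same beta applies at every
    cutpoint, which is the proportional-odds assumption.
    """
    eta = np.dot(x, beta)
    cum = expit(np.asarray(cutpoints) - eta)   # P(Y <= j) for j = 1..J-1
    cum = np.concatenate([[0.0], cum, [1.0]])
    return np.diff(cum)                        # per-category probabilities

# Hypothetical 4-category response with two covariates:
print(cumulative_logit_probs(x=[1.0, 0.5], cutpoints=[-1.0, 0.2, 1.5],
                             beta=[0.8, -0.4]))
```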
Abstract:
In this dissertation, we propose a continuous-time Markov chain model to examine longitudinal data whose outcome variable has three categories. The advantage of this model is that it permits a different number of measurements for each subject, and the duration between two consecutive measurement occasions can be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, the model can also estimate the transition probability for each subject. A Monte Carlo simulation method will be used to investigate the goodness of fit of this model compared with that obtained from other models. A public health example will be used to demonstrate the application of the method.
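The core identity behind the irregular-spacing claim: for a chain with generator Q, the transition matrix over a gap of length t is P(t) = expm(Qt), so unequal gaps pose no difficulty. A sketch with a hypothetical three-state generator (covariates could enter, for example, by scaling the off-diagonal rates by exp(x′β)):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical three-state generator: off-diagonal entries are transition
# rates, and each row sums to zero.
Q = np.array([[-0.30,  0.20,  0.10],
              [ 0.15, -0.40,  0.25],
              [ 0.05,  0.10, -0.15]])

for t in (0.5, 2.0):                 # two different inter-visit gaps
    P = expm(Q * t)                  # transition probabilities over gap t
    print(f"t = {t}:\n{P.round(3)}") # each row sums to one
```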
Abstract:
This paper presents the Expectation-Maximization (EM) algorithm applied to the operational modal analysis of structures. The EM algorithm is a general-purpose method for maximum likelihood estimation (MLE) that in this work is used to estimate state-space models. As is well known, the MLE enjoys some optimal statistical properties, which make it very attractive in practice. However, the EM algorithm has two main drawbacks: its slow convergence and the dependence of the solution on the initial values used. This paper proposes two strategies for choosing initial values for the EM algorithm when used for operational modal analysis: starting from the parameters estimated by the Stochastic Subspace Identification (SSI) method, and starting from random points. The effectiveness of the proposed identification method has been evaluated through numerical simulation and measured vibration data in the context of a benchmark problem. Modal parameters (natural frequencies, damping ratios and mode shapes) of the benchmark structure have been estimated using SSI and the EM algorithm. On the whole, the results show that applying the EM algorithm starting from the SSI solution is very useful for identifying the vibration modes of a structure, discarding the spurious modes that appear in high-order models and discovering other hidden modes. Similar results are obtained using random starting values, although this strategy allows the solutions from several starting points to be analyzed, which overcomes the dependence on the initial values used.
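Once a state-space model has been estimated (whether by SSI or EM), natural frequencies and damping ratios follow from the eigenvalues of the state matrix; a sketch with a hypothetical single-mode system.

```python
import numpy as np
from scipy.linalg import expm

def modal_parameters(A, dt):
    """Frequencies and damping ratios from a discrete-time state matrix.

    A: state matrix of an identified state-space model; dt: sampling interval.
    Discrete eigenvalues map to continuous-time poles via mu = ln(lambda)/dt;
    then f = |mu| / (2*pi) and zeta = -Re(mu) / |mu|.
    """
    lam = np.linalg.eigvals(A)
    mu = np.log(lam) / dt              # continuous-time poles
    freqs = np.abs(mu) / (2 * np.pi)   # natural frequencies [Hz]
    zetas = -np.real(mu) / np.abs(mu)  # damping ratios
    keep = np.imag(mu) > 0             # one pole of each conjugate pair
    return freqs[keep], zetas[keep]

# Hypothetical 1-mode system: 2 Hz, 2% damping, sampled at 100 Hz.
wn, z, dt = 2 * np.pi * 2.0, 0.02, 0.01
Ac = np.array([[0.0, 1.0], [-wn**2, -2 * z * wn]])  # continuous-time dynamics
A = expm(Ac * dt)                                   # exact discretization
f, zeta = modal_parameters(A, dt)
print(f, zeta)                                      # ~[2.0], ~[0.02]
```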