947 resultados para Maximum Likelihood Estimation
Resumo:
We study in detail the so-called beta-modified Weibull distribution, motivated by the wide use of the Weibull distribution in practice, and also for the fact that the generalization provides a continuous crossover towards cases with different shapes. The new distribution is important since it contains as special sub-models some widely-known distributions, such as the generalized modified Weibull, beta Weibull, exponentiated Weibull, beta exponential, modified Weibull and Weibull distributions, among several others. It also provides more flexibility to analyse complex real data. Various mathematical properties of this distribution are derived, including its moments and moment generating function. We examine the asymptotic distributions of the extreme values. Explicit expressions are also derived for the chf, mean deviations, Bonferroni and Lorenz curves, reliability and entropies. The estimation of parameters is approached by two methods: moments and maximum likelihood. We compare by simulation the performances of the estimates from these methods. We obtain the expected information matrix. Two applications are presented to illustrate the proposed distribution.
Resumo:
Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
Resumo:
This article deals with the efficiency of fractional integration parameter estimators. This study was based on Monte Carlo experiments involving simulated stochastic processes with integration orders in the range]-1,1[. The evaluated estimation methods were classified into two groups: heuristics and semiparametric/maximum likelihood (ML). The study revealed that the comparative efficiency of the estimators, measured by the lesser mean squared error, depends on the stationary/non-stationary and persistency/anti-persistency conditions of the series. The ML estimator was shown to be superior for stationary persistent processes; the wavelet spectrum-based estimators were better for non-stationary mean reversible and invertible anti-persistent processes; the weighted periodogram-based estimator was shown to be superior for non-invertible anti-persistent processes.
Resumo:
A two-component survival mixture model is proposed to analyse a set of ischaemic stroke-specific mortality data. The survival experience of stroke patients after index stroke may be described by a subpopulation of patients in the acute condition and another subpopulation of patients in the chronic phase. To adjust for the inherent correlation of observations due to random hospital effects, a mixture model of two survival functions with random effects is formulated. Assuming a Weibull hazard in both components, an EM algorithm is developed for the estimation of fixed effect parameters and variance components. A simulation study is conducted to assess the performance of the two-component survival mixture model estimators. Simulation results confirm the applicability of the proposed model in a small sample setting. Copyright (C) 2004 John Wiley Sons, Ltd.
Resumo:
In this paper use consider the problem of providing standard errors of the component means in normal mixture models fitted to univariate or multivariate data by maximum likelihood via the EM algorithm. Two methods of estimation of the standard errors are considered: the standard information-based method and the computationally-intensive bootstrap method. They are compared empirically by their application to three real data sets and by a small-scale Monte Carlo experiment.
Resumo:
Analysis of a major multi-site epidemiologic study of heart disease has required estimation of the pairwise correlation of several measurements across sub-populations. Because the measurements from each sub-population were subject to sampling variability, the Pearson product moment estimator of these correlations produces biased estimates. This paper proposes a model that takes into account within and between sub-population variation, provides algorithms for obtaining maximum likelihood estimates of these correlations and discusses several approaches for obtaining interval estimates. (C) 1997 by John Wiley & Sons, Ltd.
Resumo:
HE PROBIT MODEL IS A POPULAR DEVICE for explaining binary choice decisions in econometrics. It has been used to describe choices such as labor force participation, travel mode, home ownership, and type of education. These and many more examples can be found in papers by Amemiya (1981) and Maddala (1983). Given the contribution of economics towards explaining such choices, and given the nature of data that are collected, prior information on the relationship between a choice probability and several explanatory variables frequently exists. Bayesian inference is a convenient vehicle for including such prior information. Given the increasing popularity of Bayesian inference it is useful to ask whether inferences from a probit model are sensitive to a choice between Bayesian and sampling theory techniques. Of interest is the sensitivity of inference on coefficients, probabilities, and elasticities. We consider these issues in a model designed to explain choice between fixed and variable interest rate mortgages. Two Bayesian priors are employed: a uniform prior on the coefficients, designed to be noninformative for the coefficients, and an inequality restricted prior on the signs of the coefficients. We often know, a priori, whether increasing the value of a particular explanatory variable will have a positive or negative effect on a choice probability. This knowledge can be captured by using a prior probability density function (pdf) that is truncated to be positive or negative. Thus, three sets of results are compared:those from maximum likelihood (ML) estimation, those from Bayesian estimation with an unrestricted uniform prior on the coefficients, and those from Bayesian estimation with a uniform prior truncated to accommodate inequality restrictions on the coefficients.
Resumo:
We present a novel maximum-likelihood-based algorithm for estimating the distribution of alignment scores from the scores of unrelated sequences in a database search. Using a new method for measuring the accuracy of p-values, we show that our maximum-likelihood-based algorithm is more accurate than existing regression-based and lookup table methods. We explore a more sophisticated way of modeling and estimating the score distributions (using a two-component mixture model and expectation maximization), but conclude that this does not improve significantly over simply ignoring scores with small E-values during estimation. Finally, we measure the classification accuracy of p-values estimated in different ways and observe that inaccurate p-values can, somewhat paradoxically, lead to higher classification accuracy. We explain this paradox and argue that statistical accuracy, not classification accuracy, should be the primary criterion in comparisons of similarity search methods that return p-values that adjust for target sequence length.
Resumo:
Objectives: To compare the population modelling programs NONMEM and P-PHARM during investigation of the pharmacokinetics of tacrolimus in paediatric liver-transplant recipients. Methods: Population pharmacokinetic analysis was performed using NONMEM and P-PHARM on retrospective data from 35 paediatric liver-transplant patients receiving tacrolimus therapy. The same data were presented to both programs. Maximum likelihood estimates were sought for apparent clearance (CL/F) and apparent volume of distribution (V/F). Covariates screened for influence on these parameters were weight, age, gender, post-operative day, days of tacrolimus therapy, transplant type, biliary reconstructive procedure, liver function tests, creatinine clearance, haematocrit, corticosteroid dose, and potential interacting drugs. Results: A satisfactory model was developed in both programs with a single categorical covariate - transplant type - providing stable parameter estimates and small, normally distributed (weighted) residuals. In NONMEM, the continuous covariates - age and liver function tests - improved modelling further. Mean parameter estimates were CL/F (whole liver) = 16.3 1/h, CL/F (cut-down liver) = 8.5 1/h and V/F = 565 1 in NONMEM, and CL/F = 8.3 1/h and V/F = 155 1 in P-PHARM. Individual Bayesian parameter estimates were CL/F (whole liver) = 17.9 +/- 8.8 1/h, CL/F (cutdown liver) = 11.6 +/- 18.8 1/h and V/F = 712 792 1 in NONMEM, and CL/F (whole liver) = 12.8 +/- 3.5 1/h, CL/F (cut-down liver) = 8.2 +/- 3.4 1/h and V/F = 221 1641 in P-PHARM. Marked interindividual kinetic variability (38-108%) and residual random error (approximately 3 ng/ml) were observed. P-PHARM was more user friendly and readily provided informative graphical presentation of results. NONMEM allowed a wider choice of errors for statistical modelling and coped better with complex covariate data sets. Conclusion: Results from parametric modelling programs can vary due to different algorithms employed to estimate parameters, alternative methods of covariate analysis and variations and limitations in the software itself.
Resumo:
Objectives: The aims of this study were to investigate the population pharmacokinetics of tacrolimus in adult kidney transplant recipients and to identify factors that explain variability. Methods: Population analysis was performed on retrospective data from 70 patients who received oral tacrolimus twice daily. Morning blood trough concentrations were measured by liquid chromatography-tandem mass spectrometry. Maximum likelihood estimates were sought for apparent clearance (CL/F) and apparent volume of distribution (V/F), with the use of NONMEM (GloboMax LLC, Hanover, Md). Factors screened for influence on these parameters were weight, age, gender, postoperative day, days of tacrolimus therapy, liver function tests, creatinine clearance, hematocrit fraction, corticosteroid dose, and potential interacting drugs. Results. CL/F was greater in patients with abnormally low hematocrit fraction (data from 21 patients only), and it decreased with increasing days of therapy and AST concentrations (P
Resumo:
We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. By working in this reduced space, it allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We shall illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley Sons, Ltd.
Resumo:
Extreme value models are widely used in different areas. The Birnbaum–Saunders distribution is receiving considerable attention due to its physical arguments and its good properties. We propose a methodology based on extreme value Birnbaum–Saunders regression models, which includes model formulation, estimation, inference and checking. We further conduct a simulation study for evaluating its performance. A statistical analysis with real-world extreme value environmental data using the methodology is provided as illustration.
Resumo:
This paper shows how a high level matrix programming language may be used to perform Monte Carlo simulation, bootstrapping, estimation by maximum likelihood and GMM, and kernel regression in parallel on symmetric multiprocessor computers or clusters of workstations. The implementation of parallelization is done in a way such that an investigator may use the programs without any knowledge of parallel programming. A bootable CD that allows rapid creation of a cluster for parallel computing is introduced. Examples show that parallelization can lead to important reductions in computational time. Detailed discussion of how the Monte Carlo problem was parallelized is included as an example for learning to write parallel programs for Octave.
Resumo:
We use a threshold seemingly unrelated regressions specification to assess whether the Central and East European countries (CEECs) are synchronized in their business cycles to the Euro-area. This specification is useful in two ways: First, it takes into account the common institutional factors and the similarities across CEECs in their process of economic transition. Second, it captures business cycle asymmetries by allowing for the presence of two distinct regimes for the CEECs. As the CEECs are strongly affected by the Euro-area these regimes may be associated with Euro-area expansions and contractions. We discuss representation, estimation by maximum likelihood and inference. The methodology is illustrated by using monthly industrial production in 8 CEECs. The results show that apart from Lithuania the rest of the CEECs experience “normal” growth when the Euro-area contracts and “high” growth when the Euro-area expands. Given that the CEECs are “catching up” with the Euro-area this result shows that most CEECs seem synchronized to the Euro-area cycle. Keywords: Threshold SURE; asymmetry; business cycles; CEECs. JEL classification: C33; C50; E32.