980 resultados para Modified signed likelihood ratio statistic
Resumo:
Resumen tomado de la publicaci??n
Resumo:
We introduce a procedure for association based analysis of nuclear families that allows for dichotomous and more general measurements of phenotype and inclusion of covariate information. Standard generalized linear models are used to relate phenotype and its predictors. Our test procedure, based on the likelihood ratio, unifies the estimation of all parameters through the likelihood itself and yields maximum likelihood estimates of the genetic relative risk and interaction parameters. Our method has advantages in modelling the covariate and gene-covariate interaction terms over recently proposed conditional score tests that include covariate information via a two-stage modelling approach. We apply our method in a study of human systemic lupus erythematosus and the C-reactive protein that includes sex as a covariate.
Resumo:
We consider the issue of performing accurate small-sample likelihood-based inference in beta regression models, which are useful for modelling continuous proportions that are affected by independent variables. We derive small-sample adjustments to the likelihood ratio statistic in this class of models. The adjusted statistics can be easily implemented from standard statistical software. We present Monte Carlo simulations showing that inference based on the adjusted statistics we propose is much more reliable than that based on the usual likelihood ratio statistic. A real data example is presented.
Resumo:
Although the asymptotic distributions of the likelihood ratio for testing hypotheses of null variance components in linear mixed models derived by Stram and Lee [1994. Variance components testing in longitudinal mixed effects model. Biometrics 50, 1171-1177] are valid, their proof is based on the work of Self and Liang [1987. Asymptotic properties of maximum likelihood estimators and likelihood tests under nonstandard conditions. J. Amer. Statist. Assoc. 82, 605-610] which requires identically distributed random variables, an assumption not always valid in longitudinal data problems. We use the less restrictive results of Vu and Zhou [1997. Generalization of likelihood ratio tests under nonstandard conditions. Ann. Statist. 25, 897-916] to prove that the proposed mixture of chi-squared distributions is the actual asymptotic distribution of such likelihood ratios used as test statistics for null variance components in models with one or two random effects. We also consider a limited simulation study to evaluate the appropriateness of the asymptotic distribution of such likelihood ratios in moderately sized samples. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
We propose a likelihood ratio test ( LRT) with Bartlett correction in order to identify Granger causality between sets of time series gene expression data. The performance of the proposed test is compared to a previously published bootstrapbased approach. LRT is shown to be significantly faster and statistically powerful even within non- Normal distributions. An R package named gGranger containing an implementation for both Granger causality identification tests is also provided.
Resumo:
The main purpose of this work is to study the behaviour of Skovgaard`s [Skovgaard, I.M., 2001. Likelihood asymptotics. Scandinavian journal of Statistics 28, 3-32] adjusted likelihood ratio statistic in testing simple hypothesis in a new class of regression models proposed here. The proposed class of regression models considers Dirichlet distributed observations, and the parameters that index the Dirichlet distributions are related to covariates and unknown regression coefficients. This class is useful for modelling data consisting of multivariate positive observations summing to one and generalizes the beta regression model described in Vasconcellos and Cribari-Neto [Vasconcellos, K.L.P., Cribari-Neto, F., 2005. Improved maximum likelihood estimation in a new class of beta regression models. Brazilian journal of Probability and Statistics 19,13-31]. We show that, for our model, Skovgaard`s adjusted likelihood ratio statistics have a simple compact form that can be easily implemented in standard statistical software. The adjusted statistic is approximately chi-squared distributed with a high degree of accuracy. Some numerical simulations show that the modified test is more reliable in finite samples than the usual likelihood ratio procedure. An empirical application is also presented and discussed. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
As more and more open-source software components become available on the internet we need automatic ways to label and compare them. For example, a developer who searches for reusable software must be able to quickly gain an understanding of retrieved components. This understanding cannot be gained at the level of source code due to the semantic gap between source code and the domain model. In this paper we present a lexical approach that uses the log-likelihood ratios of word frequencies to automatically provide labels for software components. We present a prototype implementation of our labeling/comparison algorithm and provide examples of its application. In particular, we apply the approach to detect trends in the evolution of a software system.
Resumo:
This paper presents new techniques with relevant improvements added to the primary system presented by our group to the Albayzin 2012 LRE competition, where the use of any additional corpora for training or optimizing the models was forbidden. In this work, we present the incorporation of an additional phonotactic subsystem based on the use of phone log-likelihood ratio features (PLLR) extracted from different phonotactic recognizers that contributes to improve the accuracy of the system in a 21.4% in terms of Cavg (we also present results for the official metric during the evaluation, Fact). We will present how using these features at the phone state level provides significant improvements, when used together with dimensionality reduction techniques, especially PCA. We have also experimented with applying alternative SDC-like configurations on these PLLR features with additional improvements. Also, we will describe some modifications to the MFCC-based acoustic i-vector system which have also contributed to additional improvements. The final fused system outperformed the baseline in 27.4% in Cavg.
Resumo:
We present a novel method, called the transform likelihood ratio (TLR) method, for estimation of rare event probabilities with heavy-tailed distributions. Via a simple transformation ( change of variables) technique the TLR method reduces the original rare event probability estimation with heavy tail distributions to an equivalent one with light tail distributions. Once this transformation has been established we estimate the rare event probability via importance sampling, using the classical exponential change of measure or the standard likelihood ratio change of measure. In the latter case the importance sampling distribution is chosen from the same parametric family as the transformed distribution. We estimate the optimal parameter vector of the importance sampling distribution using the cross-entropy method. We prove the polynomial complexity of the TLR method for certain heavy-tailed models and demonstrate numerically its high efficiency for various heavy-tailed models previously thought to be intractable. We also show that the TLR method can be viewed as a universal tool in the sense that not only it provides a unified view for heavy-tailed simulation but also can be efficiently used in simulation with light-tailed distributions. We present extensive simulation results which support the efficiency of the TLR method.
Resumo:
PURPOSE: Many guidelines advocate measurement of total or low density lipoprotein cholesterol (LDL), high density lipoprotein cholesterol (HDL), and triglycerides (TG) to determine treatment recommendations for preventing coronary heart disease (CHD) and cardiovascular disease (CVD). This analysis is a comparison of lipid variables as predictors of cardiovascular disease. METHODS: Hazard ratios for coronary and cardiovascular deaths by fourths of total cholesterol (TC), LDL, HDL, TG, non-HDL, TC/HDL, and TG/HDL values, and for a one standard deviation change in these variables, were derived in an individual participant data meta-analysis of 32 cohort studies conducted in the Asia-Pacific region. The predictive value of each lipid variable was assessed using the likelihood ratio statistic. RESULTS: Adjusting for confounders and regression dilution, each lipid variable had a positive (negative for HDL) log-linear association with fatal CHD and CVD. Individuals in the highest fourth of each lipid variable had approximately twice the risk of CHD compared with those with lowest levels. TG and HDL were each better predictors of CHD and CVD risk compared with TC alone, with test statistics similar to TC/HDL and TG/HDL ratios. Calculated LDL was a relatively poor predictor. CONCLUSIONS: While LDL reduction remains the main target of intervention for lipid-lowering, these data support the potential use of TG or lipid ratios for CHD risk prediction. (c) 2005 Elsevier Inc. All rights reserved.
Resumo:
Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
Resumo:
The two-parameter Birnbaum-Saunders distribution has been used successfully to model fatigue failure times. Although censoring is typical in reliability and survival studies, little work has been published on the analysis of censored data for this distribution. In this paper, we address the issue of performing testing inference on the two parameters of the Birnbaum-Saunders distribution under type-II right censored samples. The likelihood ratio statistic and a recently proposed statistic, the gradient statistic, provide a convenient framework for statistical inference in such a case, since they do not require to obtain, estimate or invert an information matrix, which is an advantage in problems involving censored data. An extensive Monte Carlo simulation study is carried out in order to investigate and compare the finite sample performance of the likelihood ratio and the gradient tests. Our numerical results show evidence that the gradient test should be preferred. Further, we also consider the generalized Birnbaum-Saunders distribution under type-II right censored samples and present some Monte Carlo simulations for testing the parameters in this class of models using the likelihood ratio and gradient tests. Three empirical applications are presented. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question on the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.
Resumo:
Survival models deals with the modelling of time to event data. In certain situations, a share of the population can no longer be subjected to the event occurrence. In this context, the cure fraction models emerged. Among the models that incorporate a fraction of cured one of the most known is the promotion time model. In the present study we discuss hypothesis testing in the promotion time model with Weibull distribution for the failure times of susceptible individuals. Hypothesis testing in this model may be performed based on likelihood ratio, gradient, score or Wald statistics. The critical values are obtained from asymptotic approximations, which may result in size distortions in nite sample sizes. This study proposes bootstrap corrections to the aforementioned tests and Bartlett bootstrap to the likelihood ratio statistic in Weibull promotion time model. Using Monte Carlo simulations we compared the nite sample performances of the proposed corrections in contrast with the usual tests. The numerical evidence favors the proposed corrected tests. At the end of the work an empirical application is presented.
Resumo:
Detecting change points in epidemic models has been studied by many scholars. Yao (1993) summarized five existing test statistics in the literature. Out of those test statistics, it was observed that the likelihood ratio statistic showed its standout power. However, all of the existing test statistics are based on an assumption that population variance is known, which is an unrealistic assumption in practice. To avoid assuming known population variance, a new test statistic for detecting epidemic models is studied in this thesis. The new test statistic is a parameter-free test statistic which is more powerful compared to the existing test statistics. Different sample sizes and lengths of epidemic durations are used for the power comparison purpose. Monte Carlo simulation is used to find the critical values of the new test statistic and to perform the power comparison. Based on the Monte Carlo simulation result, it can be concluded that the sample size and the length of the duration have some effect on the power of the tests. It can also be observed that the new test statistic studied in this thesis has higher power than the existing test statistics do in all of cases.