978 results for Pooled-regression model


Relevance:

90.00%

Publisher:

Abstract:

Model trees are a particular case of decision trees employed to solve regression problems. They have the advantage of presenting an interpretable output, helping the end-user to gain confidence in the prediction and providing the basis for the end-user to have new insight about the data, confirming or rejecting hypotheses previously formed. Moreover, model trees present an acceptable level of predictive performance in comparison to most techniques used for solving regression problems. Since generating the optimal model tree is an NP-complete problem, traditional model tree induction algorithms make use of a greedy top-down divide-and-conquer strategy, which may not converge to the globally optimal solution. In this paper, we propose a novel algorithm based on the use of the evolutionary algorithms paradigm as an alternative heuristic to generate model trees in order to improve the convergence to globally near-optimal solutions. We call our new approach evolutionary model tree induction (E-Motion). We test its predictive performance using public UCI data sets, and we compare the results to traditional greedy regression/model tree induction algorithms, as well as to other evolutionary approaches. Results show that our method presents a good trade-off between predictive performance and model comprehensibility, which may be crucial in many machine learning applications. (C) 2010 Elsevier Inc. All rights reserved.
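The greedy top-down strategy mentioned in this abstract can be illustrated with a minimal sketch. It is not the E-Motion algorithm, and it is not taken from the paper: it only shows a single greedy split of a one-feature model tree, where each leaf holds a least-squares line and the split is chosen to minimize the summed residual sum of squares. The function names (`fit_line`, `best_greedy_split`), the `min_leaf` parameter and the toy data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line through (xs, ys); returns (intercept, slope, rss)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Fall back to a constant leaf if all x values in the leaf coincide.
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss


def best_greedy_split(xs, ys, min_leaf=2):
    """Try every threshold between sorted x values and keep the one whose
    two linear leaf models give the smallest total RSS (one greedy step)."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for i in range(min_leaf, len(sx) - min_leaf + 1):
        if sx[i - 1] == sx[i]:
            continue  # no threshold can separate identical x values
        left = fit_line(sx[:i], sy[:i])
        right = fit_line(sx[i:], sy[i:])
        total_rss = left[2] + right[2]
        if best is None or total_rss < best[0]:
            threshold = (sx[i - 1] + sx[i]) / 2
            best = (total_rss, threshold, left[:2], right[:2])
    return best


# Made-up toy data: two exact line segments that meet between x = 4 and x = 5.
toy_x = list(range(10))
toy_y = [x if x < 5 else 30 - 3 * x for x in toy_x]
rss, threshold, left_model, right_model = best_greedy_split(toy_x, toy_y)
```

A full greedy inducer would apply this step recursively to each leaf. Because every step only looks at the locally best split, the resulting tree can miss the global optimum, which is the limitation the abstract attributes to traditional algorithms.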

Relevance:

90.00%

Publisher:

Abstract:

We obtain adjustments to the profile likelihood function in Weibull regression models with and without censoring. Specifically, we consider two different modified profile likelihoods: (i) the one proposed by Cox and Reid [Cox, D.R. and Reid, N., 1987, Parameter orthogonality and approximate conditional inference. Journal of the Royal Statistical Society B, 49, 1-39.], and (ii) an approximation to the one proposed by Barndorff-Nielsen [Barndorff-Nielsen, O.E., 1983, On a formula for the distribution of the maximum likelihood estimator. Biometrika, 70, 343-365.], the approximation having been obtained using the results by Fraser and Reid [Fraser, D.A.S. and Reid, N., 1995, Ancillaries and third-order significance. Utilitas Mathematica, 47, 33-53.] and by Fraser et al. [Fraser, D.A.S., Reid, N. and Wu, J., 1999, A simple formula for tail probabilities for frequentist and Bayesian inference. Biometrika, 86, 655-661.]. We focus on point estimation and likelihood ratio tests on the shape parameter in the class of Weibull regression models. We derive some distributional properties of the different maximum likelihood estimators and likelihood ratio tests. The numerical evidence presented in the paper favors the approximation to Barndorff-Nielsen's adjustment.

Relevance:

90.00%

Publisher:

Abstract:

The class of symmetric linear regression models has the normal linear regression model as a special case and includes several models that assume that the errors follow a symmetric distribution with longer-than-normal tails. An important member of this class is the t linear regression model, which is commonly used as an alternative to the usual normal regression model when the data contain extreme or outlying observations. In this article, we develop second-order asymptotic theory for score tests in this class of models. We obtain Bartlett-corrected score statistics for testing hypotheses on the regression and the dispersion parameters. The corrected statistics have chi-squared distributions with errors of order O(n^{-3/2}), n being the sample size. The corrections represent an improvement over the corresponding original Rao's score statistics, which are chi-squared distributed up to errors of order O(n^{-1}). Simulation results show that the corrected score tests perform much better than their uncorrected counterparts in samples of small or moderate size.

Relevance:

90.00%

Publisher:

Abstract:

This paper deals with asymptotic results on a multivariate ultrastructural errors-in-variables regression model with equation errors. Sufficient conditions for attaining consistent estimators for model parameters are presented. Asymptotic distributions for the line regression estimators are derived. Applications to the elliptical class of distributions with two error assumptions are presented. The model generalizes previous results aimed at univariate scenarios. (C) 2010 Elsevier Inc. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

The main object of this paper is to discuss the Bayes estimation of the regression coefficients in the elliptically distributed simple regression model with measurement errors. The posterior distribution for the line parameters is obtained in a closed form, considering the following: the ratio of the error variances is known, informative prior distribution for the error variance, and non-informative prior distributions for the regression coefficients and for the incidental parameters. We proved that the posterior distribution of the regression coefficients has at most two real modes. Situations with a single mode are more likely than those with two modes, especially in large samples. The precision of the modal estimators is studied by deriving the Hessian matrix, which although complicated can be computed numerically. The posterior mean is estimated by using the Gibbs sampling algorithm and approximations by normal distributions. The results are applied to a real data set and connections with results in the literature are reported. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

We review several asymmetrical links for binary regression models and present a unified approach for two skew-probit links proposed in the literature. Moreover, under the skew-probit link, conditions for the existence of the ML estimators and of the posterior distribution under improper priors are established. The framework proposed here considers two sets of latent variables which are helpful to implement the Bayesian MCMC approach. A simulation study of model comparison criteria is conducted and two applications are presented. Using different Bayesian criteria we show that, for these data sets, the skew-probit links are better than alternative links proposed in the literature.

Relevance:

90.00%

Publisher:

Abstract:

Regression models for the mean quality-adjusted survival time are specified from hazard functions of transitions between two states and the mean quality-adjusted survival time may be a complex function of covariates. We discuss a regression model for the mean quality-adjusted survival (QAS) time based on pseudo-observations, which has the advantage of directly modeling the effect of covariates in the QAS time. Both Monte Carlo Simulations and a real data set are studied. Copyright (C) 2009 John Wiley & Sons, Ltd.

Relevance:

90.00%

Publisher:

Abstract:

Birnbaum-Saunders models have largely been applied in material fatigue studies and reliability analyses to relate the total time until failure with some type of cumulative damage. In many problems related to the medical field, such as chronic cardiac diseases and different types of cancer, a cumulative damage caused by several risk factors might cause some degradation that leads to a fatigue process. In these cases, BS models can be suitable for describing the propagation lifetime. However, since the cumulative damage is assumed to be normally distributed in the BS distribution, the parameter estimates from this model can be sensitive to outlying observations. In order to attenuate this influence, we present in this paper BS models, in which a Student-t distribution is assumed to explain the cumulative damage. In particular, we show that the maximum likelihood estimates of the Student-t log-BS models attribute smaller weights to outlying observations, which produce robust parameter estimates. Also, some inferential results are presented. In addition, based on local influence and deviance component and martingale-type residuals, a diagnostics analysis is derived. Finally, a motivating example from the medical field is analyzed using log-BS regression models. Since the parameter estimates appear to be very sensitive to outlying and influential observations, the Student-t log-BS regression model should attenuate such influences. The model checking methodologies developed in this paper are used to compare the fitted models.

Relevance:

90.00%

Publisher:

Abstract:

The Birnbaum-Saunders regression model is commonly used in reliability studies. We derive a simple matrix formula for second-order covariances of maximum-likelihood estimators in this class of models. The formula is quite suitable for computer implementation, since it involves only simple operations on matrices and vectors. Some simulation results show that the second-order covariances can be quite pronounced in small to moderate sample sizes. We also present empirical applications.

Relevance:

90.00%

Publisher:

Abstract:

The main purpose of this work is to study the behaviour of Skovgaard's [Skovgaard, I.M., 2001. Likelihood asymptotics. Scandinavian Journal of Statistics 28, 3-32] adjusted likelihood ratio statistic in testing simple hypotheses in a new class of regression models proposed here. The proposed class of regression models considers Dirichlet distributed observations, and the parameters that index the Dirichlet distributions are related to covariates and unknown regression coefficients. This class is useful for modelling data consisting of multivariate positive observations summing to one and generalizes the beta regression model described in Vasconcellos and Cribari-Neto [Vasconcellos, K.L.P., Cribari-Neto, F., 2005. Improved maximum likelihood estimation in a new class of beta regression models. Brazilian Journal of Probability and Statistics 19, 13-31]. We show that, for our model, Skovgaard's adjusted likelihood ratio statistic has a simple compact form that can be easily implemented in standard statistical software. The adjusted statistic is approximately chi-squared distributed with a high degree of accuracy. Some numerical simulations show that the modified test is more reliable in finite samples than the usual likelihood ratio procedure. An empirical application is also presented and discussed. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

We introduce, for the first time, a new class of Birnbaum-Saunders nonlinear regression models potentially useful in lifetime data analysis. The class generalizes the regression model described by Rieck and Nedelman [Rieck, J.R., Nedelman, J.R., 1991. A log-linear model for the Birnbaum-Saunders distribution. Technometrics 33, 51-60]. We discuss maximum-likelihood estimation for the parameters of the model, and derive closed-form expressions for the second-order biases of these estimates. Our formulae are easily computed as ordinary linear regressions and are then used to define bias corrected maximum-likelihood estimates. Some simulation results show that the bias correction scheme yields nearly unbiased estimates without increasing the mean squared errors. Two empirical applications are analysed and discussed. Crown Copyright (C) 2009 Published by Elsevier B.V. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

In this paper, we study the factors that influence the National Telecom Business Volume, using the 2008 data published in the China Statistical Yearbook. We illustrate the procedure of modeling “National Telecom Business Volume” on the following eight variables: GDP, Consumption Levels, Retail Sales of Social Consumer Goods, Total Renovation Investment, Local Telephone Exchange Capacity, Mobile Telephone Exchange Capacity, Mobile Phone End Users, and Local Telephone End Users. Tests for heteroscedasticity and multicollinearity are included for model evaluation. We also use the AIC and BIC criteria to select independent variables, identify the factors that make up the optimal regression model for telecommunications business volume, and describe the relation between the independent variables and the dependent variable. Based on the final results, we propose several recommendations on how to improve telecommunication services and promote economic development.
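The AIC/BIC selection step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses the Gaussian-likelihood forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n), with additive constants dropped and k counting the estimated regression coefficients, and it compares only an intercept-only model with a one-predictor model. The function names and the numbers are made up for illustration and are not taken from the yearbook data.

```python
import math


def fit_simple_ols(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss


def gaussian_aic_bic(rss, n, k):
    """AIC and BIC for a Gaussian linear model, up to an additive constant.
    k is the number of estimated regression coefficients."""
    fit_term = n * math.log(rss / n)
    return fit_term + 2 * k, fit_term + k * math.log(n)


# Made-up toy data (hypothetical, for illustration only).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
n = len(y)

# Candidate 1: intercept only (k = 1).
mean_y = sum(y) / n
rss_null = sum((v - mean_y) ** 2 for v in y)
aic_null, bic_null = gaussian_aic_bic(rss_null, n, k=1)

# Candidate 2: intercept plus one predictor (k = 2).
_, _, rss_lin = fit_simple_ols(x, y)
aic_lin, bic_lin = gaussian_aic_bic(rss_lin, n, k=2)

# The candidate with the smaller criterion value is preferred.
preferred = "linear" if aic_lin < aic_null else "intercept-only"
```

With eight candidate predictors, the same comparison would be repeated over subsets of variables. BIC penalizes each extra coefficient by ln(n) rather than 2, so it tends to choose smaller models once n exceeds about 7.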

Relevance:

90.00%

Publisher:

Abstract:

The pathogen Phytophthora cinnamomi causes extensive 'dieback' of Australian native vegetation. This study investigated the distribution of infection in an area of significant sclerophyll vegetation in Australia. It aimed to determine the relationship of infection to site variables and to develop a predictive model of infection. Site variables recorded at 50 study sites included aspect, slope, altitude, proximity to road and road characteristics, soil profile characteristics and vegetation attributes. Soil and plant tissues were assayed for the presence of the pathogen. A geographical information system (GIS) was employed to provide accurate estimations of spatial variables and develop a predictive model for the distribution of P. cinnamomi. The pathogen was isolated from 76% of the study sites. Of the 17 site variables initially investigated during the study, a logistic regression model identified only two, elevation and sun-index, as significant in determining the probability of infection. The presence of P. cinnamomi infection was negatively associated with elevation and positively associated with sun-index. The model predicted that up to 74% of the study area (11 875 ha) had a high probability of being affected by P. cinnamomi. However, the present areas of infection were small, providing an opportunity for management to minimize spread into highly susceptible uninvaded areas.
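The two-variable logistic model described in this abstract would presumably take the standard form below. This is a sketch of the generic model structure only: the symbols \beta_0, \beta_1, \beta_2 are introduced here for illustration, the abstract does not report coefficient values, and only the signs follow from the stated associations.

```latex
\log\frac{P(\text{infection})}{1 - P(\text{infection})}
  = \beta_0 + \beta_1\,\text{elevation} + \beta_2\,\text{sun-index},
\qquad \beta_1 < 0,\quad \beta_2 > 0 .
```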

Relevance:

90.00%

Publisher:

Abstract:

This paper adopts a logistic regression model to examine the relationship between the level of managerial ownership concentration and agency conflict, which is proxied by the level of risk, firm leverage and firm dividend policy. The study covers a period of 5 years, from 1997 through 2001. It is based on 100 blue-chip stocks, the majority of which are drawn from CI components. The findings suggest a positive and significant association between risk at lower levels and managerial ownership, while a negative and significant association is evidenced between risk at higher levels and managerial ownership concentration. Debt policy, which serves as a positive monitoring substitute for agency conflict, is found to be positive and significant in explaining the level of ownership concentration. Furthermore, dividend policies, which also serve as a monitoring substitute to reduce agency conflict between managers and external shareholders, do not appear to have any significant impact on managerial ownership. On the other hand, the level of institutional ownership, which serves as an external monitoring force, is found to have an inverse impact on the level of managerial ownership concentration. This is marginally significant at the 10% level (p=.12). The findings in part support the argument that managerial ownership helps reduce agency conflict between outside equity holders and managers.

Relevance:

90.00%

Publisher:

Abstract:

We consider a random design model based on independent and identically distributed (iid) pairs of observations (X_i, Y_i), where the regression function m(x) is given by m(x) = E(Y_i | X_i = x) with one independent variable. In a nonparametric setting, the aim is to produce a reasonable approximation to the unknown function m(x) when we have no precise information about the form of the true density f(x) of X. We describe a procedure for estimating the nonparametric regression function at a given point by an appropriately constructed fixed-width (2d) confidence interval with a confidence coefficient of at least 1 − α. Here, d (> 0) and α ∈ (0, 1) are two preassigned values. Fixed-width confidence intervals are developed using both Nadaraya-Watson and local linear kernel estimators of nonparametric regression with data-driven bandwidths.

The sample size was optimized using purely sequential and two-stage procedures together with asymptotic properties of the Nadaraya-Watson and local linear estimators. A large-scale simulation study was performed to compare their coverage accuracy. The numerical results indicate that the confidence bands based on the local linear estimator perform better than those constructed using the Nadaraya-Watson estimator. However, both estimators are shown to have asymptotically correct coverage properties.
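The two point estimators named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' sequential procedure: it assumes a Gaussian kernel and a fixed, user-supplied bandwidth `h`, whereas the abstract uses data-driven bandwidths, and it does not construct the fixed-width confidence intervals. The function names and the toy data are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys around x0 (local constant fit)."""
    weights = [gaussian_kernel((x - x0) / h) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def local_linear(x0, xs, ys, h):
    """Intercept of the kernel-weighted least-squares line centred at x0,
    i.e. the minimizer in a of sum_i w_i * (y_i - a - b*(x_i - x0))**2."""
    w = [gaussian_kernel((x - x0) / h) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    # Solve the 2x2 weighted normal equations for the intercept a.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


# Made-up toy data on an equally spaced grid, with m(x) = 2x + 1 exactly.
grid = [i / 10 for i in range(11)]
values = [2 * x + 1 for x in grid]

nw_mid = nadaraya_watson(0.5, grid, values, h=0.2)   # centre of the design
ll_inner = local_linear(0.3, grid, values, h=0.2)    # off-centre point
nw_edge = nadaraya_watson(0.0, grid, values, h=0.2)  # boundary point
ll_edge = local_linear(0.0, grid, values, h=0.2)     # boundary point
```

On this exactly linear toy example the local linear fit reproduces m(x) everywhere, while the Nadaraya-Watson average is pulled upward at the left boundary because all of its neighbours lie on one side. This boundary behaviour is one commonly cited reason for preferring the local linear estimator.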