888 results for Atheoretical regression trees
Abstract:
The family of distributions proposed by Birnbaum and Saunders (1969) can be used to model lifetime data and is widely applicable to modeling the failure times of fatiguing materials. We give a simple matrix formula of order n^{-1/2}, where n is the sample size, for the skewness of the distributions of the maximum likelihood estimates of the parameters in Birnbaum-Saunders nonlinear regression models, recently introduced by Lemonte and Cordeiro (2009). The formula is quite suitable for computer implementation, since it involves only simple operations on matrices and vectors, in order to obtain closed-form skewness in a wide range of nonlinear regression models. Empirical and real applications are analyzed and discussed. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
The main purpose of this work is to study the behaviour of Skovgaard's [Skovgaard, I.M., 2001. Likelihood asymptotics. Scandinavian Journal of Statistics 28, 3-32] adjusted likelihood ratio statistic in testing simple hypotheses in a new class of regression models proposed here. The proposed class of regression models considers Dirichlet distributed observations, and the parameters that index the Dirichlet distributions are related to covariates and unknown regression coefficients. This class is useful for modelling data consisting of multivariate positive observations summing to one and generalizes the beta regression model described in Vasconcellos and Cribari-Neto [Vasconcellos, K.L.P., Cribari-Neto, F., 2005. Improved maximum likelihood estimation in a new class of beta regression models. Brazilian Journal of Probability and Statistics 19, 13-31]. We show that, for our model, Skovgaard's adjusted likelihood ratio statistic has a simple compact form that can be easily implemented in standard statistical software. The adjusted statistic is approximately chi-squared distributed with a high degree of accuracy. Some numerical simulations show that the modified test is more reliable in finite samples than the usual likelihood ratio procedure. An empirical application is also presented and discussed. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
We introduce, for the first time, a new class of Birnbaum-Saunders nonlinear regression models potentially useful in lifetime data analysis. The class generalizes the regression model described by Rieck and Nedelman [Rieck, J.R., Nedelman, J.R., 1991. A log-linear model for the Birnbaum-Saunders distribution. Technometrics 33, 51-60]. We discuss maximum-likelihood estimation for the parameters of the model, and derive closed-form expressions for the second-order biases of these estimates. Our formulae are easily computed as ordinary linear regressions and are then used to define bias corrected maximum-likelihood estimates. Some simulation results show that the bias correction scheme yields nearly unbiased estimates without increasing the mean squared errors. Two empirical applications are analysed and discussed. Crown Copyright (C) 2009 Published by Elsevier B.V. All rights reserved.
Abstract:
This paper derives the second-order biases of maximum likelihood estimates from a multivariate normal model where the mean vector and the covariance matrix have parameters in common. We show that the second-order bias can always be obtained by means of ordinary weighted least-squares regressions. We conduct simulation studies which indicate that the bias correction scheme yields nearly unbiased estimators. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
We construct some examples using trees. Some of them are consistent counterexamples for the discrete reflection of certain topological properties. All the properties dealt with here were already known to be non-discretely reflexive if we assume CH, and we show that the same is true assuming the existence of a Suslin tree. In some cases we actually get some ZFC results. We also construct, using a Suslin tree, a compact space that is pseudo-radial but not discretely generated. With a similar construction, but using an Aronszajn tree, we present a ZFC space that is first countable and omega-bounded but not strongly omega-bounded, answering a question of Peter Nyikos. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Citrus sudden death (CSD) is a new disease of sweet orange and mandarin trees grafted on Rangpur lime and Citrus volkameriana rootstocks. It was first seen in Brazil in 1999, and has since been detected in more than four million trees. The CSD causal agent is unknown and the current hypothesis involves a virus similar to Citrus tristeza virus or a new virus named Citrus sudden death-associated virus. CSD symptoms include generalized foliar discoloration, defoliation and root death, and, in most cases, it can cause tree death. One of the unique characteristics of CSD disease is the presence of a yellow stain in the rootstock bark near the bud union. This region also undergoes profound anatomical changes. In this study, we analyse the metabolic disorder caused by CSD in the bark of sweet orange grafted on Rangpur lime by nuclear magnetic resonance (NMR) spectroscopy and imaging. The imaging results show the presence of a large amount of non-functional phloem in the rootstock bark of affected plants. The spectroscopic analysis shows a high content of triacylglyceride and sucrose, which may be related to phloem blockage close to the bud union. We also propose that, without knowing the causal CSD agent, the determination of oil content in rootstock bark by low-resolution NMR can be used as a complementary method for CSD diagnosis, screening about 300 samples per hour.
Abstract:
In this paper, we study the factors that influence the National Telecom Business Volume, using the 2008 data published in the China Statistical Yearbook. We illustrate the procedure of modeling “National Telecom Business Volume” on the following eight variables: GDP, Consumption Levels, Retail Sales of Social Consumer Goods, Total Renovation Investment, Local Telephone Exchange Capacity, Mobile Telephone Exchange Capacity, Mobile Phone End Users, and Local Telephone End Users. Tests for heteroscedasticity and multicollinearity are included for model evaluation. We also use the AIC and BIC criteria to select independent variables, and we report which factors enter the optimal regression model for telecommunications business volume, along with the relation between the independent variables and the dependent variable. Based on the final results, we propose several recommendations on how to improve telecommunication services and promote economic development.
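The abstract does not say how the AIC/BIC selection was carried out. As a rough illustration only, the sketch below assumes an exhaustive search over predictor subsets for a Gaussian linear regression, which is feasible for eight candidate variables (255 subsets). The data are synthetic and the column names `x0` to `x3` are hypothetical; nothing here reproduces the paper's data or results.

```python
import itertools

import numpy as np


def gaussian_ic(y, X):
    """Return (AIC, BIC), up to an additive constant, for an OLS fit with intercept."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus the error variance
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic


def best_subset(y, X, names, criterion="bic"):
    """Exhaustively search non-empty predictor subsets; return the lowest-scoring one."""
    pick = 0 if criterion == "aic" else 1
    best_score, best_names = None, None
    for size in range(1, len(names) + 1):
        for cols in itertools.combinations(range(len(names)), size):
            score = gaussian_ic(y, X[:, list(cols)])[pick]
            if best_score is None or score < best_score:
                best_score, best_names = score, [names[c] for c in cols]
    return best_names, best_score


# Synthetic data (an assumption for illustration): only x0 and x2 drive y.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 2.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=n)
names = ["x0", "x1", "x2", "x3"]

chosen, score = best_subset(y, X, names, criterion="bic")
print("BIC-selected predictors:", chosen)
```

BIC penalizes each extra parameter by log(n) rather than 2, so for samples larger than about seven observations it tends to select smaller models than AIC.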
Abstract:
Variation in wood properties for Picea abies trees and logs of different dimensions has been studied at two sites in southern Sweden of different site quality class. Trees have been classified as dominant or sub-dominant, according to their height. Log and board grades were classified, and the strength grade of boards, basic density and annual ring width were measured. A similar study made on four northern sites was used as reference material. Sub-dominant trees were of superior quality in comparison to dominant trees, when classified by log and board grades or strength grading. Differences were accentuated for the second log, where the sub-dominant trees had superior strength and a low number of boards with coarse branches. The results correspond well to those from the northern region, Jämtland. The classification of boards as well as bending strength indicated superior properties of timber from northern sites, even though the basic density was similar.
Abstract:
This is a note about proxy variables and instruments for identification of structural parameters in regression models. In our experience, econometric textbooks treat these two issues separately, although in practice the two concepts are very often combined. Usually, proxy variables are inserted in instrumental variable regressions with the motivation that they are exogenous, implicitly meaning that they are exogenous in a reduced form model and not in a structural model. Actually, if these variables are exogenous, they should be redundant in the structural model, e.g. IQ as a proxy for ability. Valid proxies reduce unexplained variation and increase the efficiency of the estimator of the structural parameter of interest. This is especially important in situations when the instrument is weak. With a simple example we demonstrate what is required of a proxy and an instrument when they are combined. It turns out that when a researcher has a valid instrument, the requirements on the proxy variable are weaker than if no such instrument exists.
Predictive models for chronic renal disease using decision trees, naïve bayes and case-based methods
Abstract:
Data mining can be used in the healthcare industry to “mine” clinical data and discover hidden information for intelligent and effective decision making. The discovery of hidden patterns and relationships often goes unexploited, yet advanced data mining techniques can help remedy this. This thesis mainly deals with Intelligent Prediction of Chronic Renal Disease (IPCRD). The data cover blood tests, urine tests, and external symptoms, and are used to predict chronic renal disease. Data from the database are first transformed for use in Weka (3.6), and the Chi-Square method is used for feature selection. After normalizing the data, three classifiers were applied and the efficiency of their output was evaluated: Decision Tree, Naïve Bayes, and the K-Nearest Neighbour (KNN) algorithm. Results show that each technique has its unique strength in realizing the objectives of the defined mining goals. The efficiency of Decision Tree and KNN was almost the same, but Naïve Bayes showed a comparative edge over the others. Further, sensitivity and specificity tests are used as statistical measures to examine the performance of the binary classification. Sensitivity (also called recall rate in some fields) measures the proportion of actual positives which are correctly identified, while specificity measures the proportion of actual negatives which are correctly identified. The CRISP-DM methodology is applied to build the mining models. It consists of six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
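The two measures defined above can be computed directly from the counts of a binary confusion matrix. The following is a minimal sketch in plain Python; the function name and the toy labels are assumptions made for illustration and are not taken from the thesis or from Weka.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Guard against a class being absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example, 3 of the 4 actual positives and 4 of the 6 actual negatives are classified correctly, so sensitivity is 3/4 and specificity is 4/6.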