866 results for Boosted regression trees
Abstract:
An extension of some standard likelihood based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover both symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash, skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. In order to examine the performance of the proposed methods, some simulation studies are presented to show the robust aspect of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm do provide good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
This paper presents a survey of evolutionary algorithms that are designed for decision-tree induction. In this context, most of the paper focuses on approaches that evolve decision trees as an alternative heuristic to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that make use of evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original contributions are the following. First, it provides an up-to-date overview that is fully focused on evolutionary algorithms and decision trees and does not concentrate on any specific evolutionary approach. Second, it provides a taxonomy, which addresses works that evolve decision trees and works that design decision-tree components by the use of evolutionary algorithms. Finally, a number of references are provided that describe applications of evolutionary algorithms for decision-tree induction in different domains. At the end of this paper, we address some important issues and open questions that can be the subject of future research.
Abstract:
In this paper, we propose a cure rate survival model by assuming that the number of competing causes of the event of interest follows the geometric distribution and the time to event follows a Birnbaum-Saunders distribution. We consider a frequentist analysis for parameter estimation of the geometric Birnbaum-Saunders model with cure rate. Finally, we analyze a data set from the medical area. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
The use of a photodegradable tape was evaluated on 'Valencia' sweet orange nursery trees budded on both Rangpur lime and Swingle citrumelo in a greenhouse in Bebedouro-SP, Brazil, from September to November 2009. On both rootstocks three wrapping procedures were evaluated: i) conventional polyethylene tape wrapped around the bud eye; ii) photodegradable tape wrapped around the bud eye; and iii) photodegradable tape wrapped around the graft junction without covering the bud eye. The following variables were measured: time spent on wrapping, percentage of bud sprouting, length and stem diameter of the scion shoot, and percentage of commercially valuable nursery trees. The trial followed a randomized complete block design, with six treatments, four replications and 12 trees per plot. The use of photodegradable tape, with or without covering the bud eye, hastened bud sprouting, despite the longer time spent on wrapping when the photodegradable tape was used. Plants grafted onto the less vigorous Swingle citrumelo rootstock showed lower bud sprout percentages when the bud eye was covered with the photodegradable tape.
Abstract:
The log-Burr XII regression model for grouped survival data is evaluated in the presence of many ties. The methodology for grouped survival data is based on life tables, where the times are grouped in k intervals, and we fit discrete lifetime regression models to the data. The model parameters are estimated by maximum likelihood and jackknife methods. To detect influential observations in the proposed model, diagnostic measures based on case deletion, so-called global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to these measures, the total local influence and influential estimates are also used. We conduct Monte Carlo simulation studies to assess the finite sample behavior of the maximum likelihood estimators of the proposed model for grouped survival. A real data set is analyzed using a regression model for grouped data.
Abstract:
Habitat use by Sharp-tailed Tyrant (Culicivora caudacuta), and Cock-tailed Tyrant (Alectrurus tricolor) in the Cerrado of Southeastern Brazil. Obligatory grassland birds are dependent on a limited set of native habitats that are disappearing almost everywhere. We examined the use of macrohabitat and microhabitat by two threatened species of flycatchers, the Sharp-tailed Tyrant, Culicivora caudacuta and the Cock-tailed Tyrant, Alectrurus tricolor in a preserved area of cerrado. We generated logistic regression models to explain the presence of these species through variables of microhabitat. Both flycatchers occurred mainly in grassland areas and favored areas with a low density of palms (Attalea geraensis) and trees. The Sharp-tailed Tyrant also favored areas with a high density of low shrubs (< 1 m) and less exposed soil. The positive relationship found between the presence of Sharp-tailed Tyrant and soil cover may indicate the importance of litter and understory vegetation for shelter and food. The conservation of both flycatcher species in the study area should benefit from controlling palm density and the maintenance of grasslands with low shrubs.
Abstract:
In this paper we obtain asymptotic expansions, up to order n^(-1/2) and under a sequence of Pitman alternatives, for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of symmetric linear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than-normal tails. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Estimates of evapotranspiration on a local scale are important information for agricultural and hydrological practices. However, equations that estimate potential evapotranspiration from temperature data alone, which are simple to use, are usually less trustworthy than the Food and Agriculture Organization (FAO) Penman-Monteith standard method. The present work describes two correction procedures for temperature-based potential evapotranspiration estimates that make the results more reliable. Initially, the standard FAO Penman-Monteith method was evaluated with a complete climatologic data set for the period between 2002 and 2006. Then temperature-based estimates by the Camargo and Jensen-Haise methods were adjusted by error autocorrelation evaluated over biweekly and monthly periods. In a second adjustment, simple linear regression was applied. The adjusted equations were validated with climatic data available for the year 2001. Both proposed methodologies showed good agreement with the standard method, indicating that they can be used for local potential evapotranspiration estimates.
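The second adjustment mentioned in this abstract, a simple linear regression of the standard estimates on the temperature-based ones, can be sketched as follows. This is only a minimal illustration, not the authors' procedure: the function names and the sample values (in mm/day) are hypothetical, and the autocorrelation-based adjustment is not covered.

```python
from statistics import mean


def fit_linear_correction(temp_based, reference):
    """Ordinary least squares fit of: reference = intercept + slope * temp_based."""
    if len(temp_based) != len(reference) or len(temp_based) < 2:
        raise ValueError("need two equally long series with at least two points")
    x_bar, y_bar = mean(temp_based), mean(reference)
    sxx = sum((x - x_bar) ** 2 for x in temp_based)
    if sxx == 0:
        raise ValueError("temperature-based estimates must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temp_based, reference))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def apply_correction(temp_based, intercept, slope):
    """Map temperature-based estimates onto the scale of the reference method."""
    return [intercept + slope * x for x in temp_based]


# Hypothetical values, made up for illustration only (not data from the study).
temperature_based = [2.0, 3.0, 4.0, 5.0]
penman_monteith = [2.5, 4.0, 5.5, 7.0]

intercept, slope = fit_linear_correction(temperature_based, penman_monteith)
adjusted = apply_correction(temperature_based, intercept, slope)
```

In practice the coefficients would be fitted on one period and checked on a separate one, as the abstract describes for the 2002-2006 and 2001 data.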
Abstract:
In this paper, a new family of survival distributions is presented. It is derived by considering that the latent number of failure causes follows a Poisson distribution and the time for these causes to be activated follows an exponential distribution. Three different activation schemes are also considered. Moreover, we propose the inclusion of covariates in the model formulation in order to study their effect on the expected value of the number of causes and on the failure rate function. An inferential procedure based on the maximum likelihood method is discussed and evaluated via simulation. The developed methodology is illustrated on a real data set on ovarian cancer.
Abstract:
Background: We aimed to investigate the performance of five different trend analysis criteria for the detection of glaucomatous progression and to determine the most frequently and rapidly progressing locations of the visual field. Design: Retrospective cohort. Participants or Samples: Treated glaucoma patients with ≥8 Swedish Interactive Thresholding Algorithm (SITA)-standard 24-2 visual field tests. Methods: Progression was determined using trend analysis. Five different criteria were used: (A) ≥1 significantly progressing point; (B) ≥2 significantly progressing points; (C) ≥2 progressing points located in the same hemifield; (D) at least two adjacent progressing points located in the same hemifield; (E) ≥2 progressing points in the same Garway-Heath map sector. Main Outcome Measures: Number of progressing eyes and false-positive results. Results: We included 587 patients. The number of eyes reaching a progression endpoint using each criterion was: A = 300 (51%); B = 212 (36%); C = 194 (33%); D = 170 (29%); and E = 186 (31%) (P = 0.03). The numbers of eyes with positive slopes were: A = 13 (4.3%); B = 3 (1.4%); C = 3 (1.5%); D = 2 (1.1%); and E = 3 (1.6%) (P = 0.06). The global slopes for progressing eyes were more negative in Groups B, C and D than in Group A (P = 0.004). The visual field locations that progressed more often were those in the nasal field adjacent to the horizontal midline. Conclusions: Pointwise linear regression criteria that take into account the retinal nerve fibre layer anatomy enhance the specificity of trend analysis for the detection of glaucomatous visual field progression.
Abstract:
This work aimed to evaluate the incidence and severity of scab in prune trees under different fungicide management, with two time patterns of application, one at early fruit formation, up to pit hardening, and another starting after pit hardening, and to compare the reduction in the number of fungicide applications with the management adopted by the producer. Four experiments, with different treatments, were carried out during the 2004-2005 and 2005-2006 seasons (two experiments) and the 2008-2009 season, using the Harry Pickstone and Reubennel cultivars. The most efficient control of the disease was achieved with the combination of the fungicides metiram, pyraclostrobin and dithianon from late bloom to pit hardening. Efficient scab control in prune was dependent on the combination of the fungicides used and the application timing. Reduced fungicide management is possible, while spraying initiated after pit hardening was not efficient for scab control.
Abstract:
Lemonte and Cordeiro [Birnbaum-Saunders nonlinear regression models, Comput. Stat. Data Anal. 53 (2009), pp. 4441-4452] introduced a class of Birnbaum-Saunders (BS) nonlinear regression models potentially useful in lifetime data analysis. We give a general matrix Bartlett correction formula to improve the likelihood ratio (LR) tests in these models. The formula is simple enough to be used analytically to obtain several closed-form expressions in special cases. Our results generalize those in Lemonte et al. [Improved likelihood inference in Birnbaum-Saunders regressions, Comput. Stat. Data Anal. 54 (2010), pp. 1307-1316], which hold only for the BS linear regression models. We consider Monte Carlo simulations to show that the corrected tests work better than the usual LR tests.
Abstract:
To understand the effect of summer and winter on the relationships between leaf carbohydrate and photosynthesis in citrus trees growing in subtropical conditions, 'Valencia' orange trees were subjected to external manipulation of their carbohydrate concentration by exposing them to darkness and evaluating the maximal photosynthetic capacity. In addition, the relationships between carbohydrate and photosynthesis in the citrus leaves were studied under natural conditions. Exposing the leaves to dark conditions decreased the carbohydrate concentration and increased photosynthesis in both seasons, which is in accordance with the current model of carbohydrate regulation. Significant negative correlations were found between total non-structural carbohydrates and photosynthesis in both seasons. However, non-reducing sugars were the most important carbohydrate that apparently regulated photosynthesis on a typical summer day, whereas starch was important on a typical winter day. As a novelty, photosynthesis stimulation by carbohydrate consumption was approximately three times higher during the summer, i.e. the growing season. Under subtropical conditions, citrus leaves exhibited relatively high photosynthesis and high carbohydrate levels on the summer day, as well as a high nocturnal consumption of starch and soluble sugars. A positive association was determined between photosynthesis and photoassimilate consumption/exportation, even in leaves showing a high carbohydrate concentration. This paper provides evidence that photosynthesis in citrus leaves is regulated by an increase in sink demand rather than by the absolute carbohydrate concentration in leaves.
Abstract:
Background: Tuberculosis (TB) remains a public health issue worldwide. The lack of specific clinical symptoms to diagnose TB makes the correct decision to admit patients to respiratory isolation a difficult task for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs, without the risk of hospital transmission. We present a predictive model for pulmonary TB in hospitalized patients in a high prevalence area in order to contribute to a more rational use of isolation rooms without increasing the risk of transmission. Methods: Cross-sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART) model was generated and validated. The area under the ROC curve (AUC), sensitivity, specificity, positive and negative predictive values were used to evaluate the performance of the model. Validation of the model was performed with a different sample of patients admitted to the same hospital from January to December 2005. Results: We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear) and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%. Conclusions: The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results. Prospective validation is still necessary, but our model offers an alternative for decision making in whether to isolate patients with clinical suspicion of TB in tertiary health facilities in countries with limited resources.
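For readers unfamiliar with the performance measures reported in this abstract, the sketch below shows how sensitivity, specificity and the two predictive values follow from a 2x2 confusion table. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data, and the AUC is not computed here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard diagnostic accuracy measures from confusion-table counts.

    tp: diseased patients classified as positive
    fp: non-diseased patients classified as positive
    fn: diseased patients classified as negative
    tn: non-diseased patients classified as negative
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0 or tn + fn == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased patients detected
        "specificity": tn / (tn + fp),  # share of non-diseased patients cleared
        "ppv": tp / (tp + fp),          # probability of disease given a positive result
        "npv": tn / (tn + fn),          # probability of no disease given a negative result
    }


# Hypothetical counts, made up for illustration only.
metrics = diagnostic_metrics(tp=40, fp=30, fn=10, tn=120)
```

Unlike sensitivity and specificity, the predictive values depend on disease prevalence in the sample, so they carry over to other settings only when prevalence is similar.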
Abstract:
Site-specific height-diameter models may be used to improve biomass estimates for forest inventories where only diameter at breast height (DBH) measurements are available. In this study, we fit height-diameter models for vegetation types of a tropical Atlantic forest using field measurements of height across plots along an altitudinal gradient. To fit height-diameter models, we sampled trees by DBH class and measured tree height within 13 one-hectare permanent plots established at four altitude classes. To select the best model we tested the performance of 11 height-diameter models using the Akaike Information Criterion (AIC). The Weibull and Chapman-Richards height-diameter models performed better than other models, and regional site-specific models performed better than the general model. In addition, there is a slight variation of height-diameter relationships across the altitudinal gradient and an extensive difference in the stature between the Atlantic and Amazon forests. The results showed the effect of altitude on tree height estimates and emphasize the need for altitude-specific models that produce more accurate results than a general model that encompasses all altitudes. To improve biomass estimation, the development of regional height-diameter models that estimate tree height using a subset of randomly sampled trees presents an approach to supplement surveys where only diameter has been measured.
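The model comparison described in this abstract can be sketched as below. The abstract does not give the equations or the exact AIC formula used, so this sketch assumes common three-parameter forms of the Chapman-Richards and Weibull height-diameter curves and the least-squares form of AIC, n ln(RSS/n) + 2k, with the residual variance counted as one extra parameter. All function names, parameter values and measurements are hypothetical, and the parameters are not fitted here.

```python
import math


def chapman_richards(dbh, a, b, c):
    """Assumed form: H = a * (1 - exp(-b * DBH)) ** c."""
    return a * (1.0 - math.exp(-b * dbh)) ** c


def weibull(dbh, a, b, c):
    """Assumed form: H = a * (1 - exp(-b * DBH ** c))."""
    return a * (1.0 - math.exp(-b * dbh ** c))


def aic_least_squares(observed, predicted, n_params):
    """AIC for a least-squares fit with Gaussian errors, constants dropped."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    if rss <= 0:
        raise ValueError("residual sum of squares must be positive")
    k = n_params + 1  # curve parameters plus the residual variance
    return n * math.log(rss / n) + 2 * k


# Hypothetical DBH (cm) and height (m) pairs, made up for illustration only.
dbh_cm = [5.0, 10.0, 20.0, 40.0, 80.0]
height_m = [5.5, 9.0, 14.5, 20.0, 24.0]

# Hypothetical parameter values; a real analysis would estimate them from data.
candidates = {
    "chapman_richards": [chapman_richards(d, 26.0, 0.04, 0.8) for d in dbh_cm],
    "weibull": [weibull(d, 26.0, 0.06, 0.85) for d in dbh_cm],
}
aic_by_model = {
    name: aic_least_squares(height_m, preds, n_params=3)
    for name, preds in candidates.items()
}
best_model = min(aic_by_model, key=aic_by_model.get)  # lower AIC is preferred
```

Because both candidate curves here have the same number of parameters, the AIC ranking reduces to a comparison of residual sums of squares; the penalty term matters when models of different complexity are compared.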