889 results for Regression Trees
Abstract:
Classical regression methods take vectors as covariates and estimate the corresponding vectors of regression parameters. When addressing regression problems on covariates of more complex form, such as multi-dimensional arrays (i.e. tensors), traditional computational models can be severely compromised by ultrahigh dimensionality as well as complex structure. By exploiting the special structure of tensor covariates, the tensor regression model provides a promising way to reduce the model's dimensionality to a manageable level, thus leading to efficient estimation. Most existing tensor-based methods estimate each individual regression problem independently, based on a tensor decomposition that allows the simultaneous projection of an input tensor onto more than one direction along each mode. In practice, multi-dimensional data are collected under the same or very similar conditions, so the data share some common latent components but can also have their own independent parameters for each regression task. It is therefore beneficial to analyse the regression parameters of all the regressions in a linked way. In this paper, we propose a tensor regression model based on the Tucker decomposition, which simultaneously identifies both the common components of the parameters across all regression tasks and the independent factors contributing to each particular task. Under this paradigm, the number of independent parameters along each mode is constrained by a sparsity-preserving regulariser. Linked multiway parameter analysis and sparsity modelling further reduce the total number of parameters, with a lower memory cost than their tensor-based counterparts. The effectiveness of the new method is demonstrated on real data sets.
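As a rough illustration of the dimensionality reduction described above, the sketch below (plain NumPy, with made-up dimensions and ranks; it is not the proposed estimator) builds a Tucker-structured coefficient tensor and compares its free-parameter count with the full coefficient tensor.

```python
# A rough sketch (invented dimensions and ranks; not the paper's estimator) of how a
# Tucker-structured coefficient tensor keeps the parameter count manageable.
import numpy as np

p1, p2, p3 = 20, 20, 10          # dimensions of each tensor covariate X
r1, r2, r3 = 3, 3, 2             # Tucker ranks along each mode

rng = np.random.default_rng(0)
G  = rng.standard_normal((r1, r2, r3))   # small core tensor
U1 = rng.standard_normal((p1, r1))       # factor matrix for mode 1
U2 = rng.standard_normal((p2, r2))       # factor matrix for mode 2
U3 = rng.standard_normal((p3, r3))       # factor matrix for mode 3

# Full coefficient tensor B = G x_1 U1 x_2 U2 x_3 U3 (formed here only for illustration).
B = np.einsum('abc,ia,jb,kc->ijk', G, U1, U2, U3)

# Linear predictor for one tensor covariate: the inner product <X, B>.
X = rng.standard_normal((p1, p2, p3))
y_hat = float(np.sum(X * B))

full_params   = p1 * p2 * p3                               # 4000 free parameters
tucker_params = r1*r2*r3 + p1*r1 + p2*r2 + p3*r3           # 158 free parameters
print(y_hat, full_params, tucker_params)
```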
Abstract:
The evergreen Quercus ilex L. is one of the most common trees in Italian urban environments and is considered effective in the uptake of particulate and gaseous atmospheric pollutants. However, the few available estimates of O3 and NO2 removal by urban Q. ilex originate from model-based studies (which indicate an NO2/O3 removal capacity of Q. ilex) and not from direct measurements of air pollutant concentrations. Thus, in the urban area of Siena (central Italy) we began long-term monitoring of O3/NO2 concentrations using passive samplers at distances of 1, 5 and 10 m from a busy road, under the canopies of Q. ilex and in a nearby open field. Measurements performed in the period June 2011-October 2013 always showed a greater decrease of NO2 concentrations under the Q. ilex canopy than in the open-field transect. Conversely, a decrease of average O3 concentrations under the tree canopy was found only in autumn, after the typical Mediterranean post-summer rainfalls. Our results indicate that interactions between O3/NO2 concentrations and trees in Mediterranean urban ecosystems are affected by temporal variations in climatic conditions. We therefore argue that direct measurement of atmospheric pollutant concentrations should be chosen to describe local changes in air pollution.
Abstract:
The growth of mining activities in Africa in the last decade has coincided with increased attention on the fate of the continent’s forests, specifically in the contexts of livelihoods and climate change. Although mining has serious environmental impacts, scant attention has been paid to the processes which shape decision-making in contexts where minerals and forests overlap. Focussing on the illustrative case of Ghana, this paper articulates the dynamics of power, authority and legitimacy of private companies, traditional authorities and key state institutions in governing mining activities in forests. The analysis highlights how mining companies and donors promote a neoliberal model of resource management which entrenches their ability to benefit from mineral exploitation and marginalises the role of state institutions and traditional authorities in decision-making. This subsequently erodes state authority and legitimacy and compounds the contested nature of traditional authorities’ legitimacy. A more nuanced examination of foundational governance questions concerning the relative role of the state, traditional authorities and private interests is needed.
Abstract:
An evaluation of a surviving stretch of the Abbot's Way, in the Somerset Levels and Moors, was undertaken to assess the consequences of the previous management regime and inform future management of the site. The scheduled site appeared to have been dewatered and desiccated as a consequence of tree planting and the effects of a deep, adjacent drainage ditch during the previous decade. The evaluation considered the condition of the Neolithic timbers and associated palaeoenvironmental record from three trenches and, where possible, compared the results with those obtained from the 1974 excavation (Girling, 1976). The results of this analysis suggest that the hydrological consequences of tree planting and colonization had a detrimental effect on both the condition of the timbers and the insect remains. However, pollen and plant macro-fossils survived well, although there was modern contamination. A trench opened outside the scheduled area, where the ground was waterlogged and supported a wet acid grassland flora, revealed similar problems of survival and condition. This almost certainly reflects a period of peat extraction and an associated seasonally fluctuating water table in the 1950s and 1960s; in fact, pollen survived better in the scheduled dewatered area. These results are compared with those recovered from the Sweet Track, which was evaluated in 1996. Both sites have been subject to recent tree growth, but the Sweet Track has been positively managed in terms of hydrology. The most notable difference between the two sites is that insects and wood survived better at the Sweet Track sites than at the Abbot's Way. Insects seem to be a more sensitive indicator of site desiccation than plant remains. It is recommended that any programme of management of wetland for archaeology should avoid deliberate tree planting and natural scrub and woodland regeneration. It should also take into account past as well as present land use.
Abstract:
Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform equally or better than non-linear models with or without the frequently used wind-to-power transformation.
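The transformation idea can be sketched as follows. Everything here is illustrative: the piecewise-linear power curve, the synthetic data and the simple Tobit-type likelihood are assumptions, not the models evaluated in the study.

```python
# A minimal sketch of the abstract's idea: invert the manufacturer's power curve to map
# observed power back to an "equivalent wind speed", then fit a censored (Tobit-type)
# linear model on that scale. Power curve and data are invented for illustration.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Hypothetical power curve (wind speed in m/s -> power in kW), piecewise linear.
ws_grid = np.array([0, 3, 6, 9, 12, 25], dtype=float)
pw_grid = np.array([0, 0, 500, 1500, 2000, 2000], dtype=float)   # nominal power 2000 kW

def inverse_power_curve(power):
    # Invert the monotone part of the curve (3-12 m/s).
    return np.interp(power, pw_grid[1:5], ws_grid[1:5])

# Toy data: forecast wind speed x, observed power y bounded in [0, 2000].
rng = np.random.default_rng(0)
x = rng.uniform(2, 15, 500)
y = np.clip(np.interp(x + rng.normal(0, 1, x.size), ws_grid, pw_grid), 0, 2000)

# Transform observed power to equivalent wind speed; the censoring bounds follow suit.
z = inverse_power_curve(y)
lower, upper = inverse_power_curve(0.0), inverse_power_curve(2000.0)

def tobit_negloglik(theta):
    a, b, log_s = theta
    s = np.exp(log_s)
    mu = a + b * x
    ll = np.where(z <= lower, norm.logcdf((lower - mu) / s),
         np.where(z >= upper, norm.logsf((upper - mu) / s),
                  norm.logpdf((z - mu) / s) - np.log(s)))
    return -ll.sum()

fit = minimize(tobit_negloglik, x0=[0.0, 1.0, 0.0], method="Nelder-Mead")
print(fit.x)   # intercept, slope and log sigma of the censored linear model
```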
Abstract:
We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups R_B above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower-acuity observer would have seen. The synthesised annual means of R_B are then re-scaled to the full observed RGO group number R_A using a variety of regression techniques. It is found that a very high correlation between R_A and R_B (r_AB > 0.98) does not prevent large errors in the intercalibration (for example, sunspot maximum values can be over 30 % too large even for such levels of r_AB). In generating the backbone sunspot number (R_BB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin, which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of quantile-quantile ("Q-Q") plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least-squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
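A small synthetic illustration of the diagnostic advocated above: fit the same data with and without forcing the regression through the origin, and inspect Q-Q plots of the residuals. The data and parameter values are invented, not the RGO series.

```python
# Sketch: compare a fit forced through the origin with ordinary least squares, and use
# Q-Q plots of the residuals to check consistency with a normal distribution.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
R_B = np.linspace(1, 10, 80)
R_A = 5.0 + 1.3 * R_B + rng.normal(0, 1.0, R_B.size)   # true relation has an offset

# Fit forced through the origin vs. ordinary least squares with an intercept.
slope_origin = np.sum(R_A * R_B) / np.sum(R_B ** 2)
slope, intercept = np.polyfit(R_B, R_A, 1)

res_origin = R_A - slope_origin * R_B
res_ols = R_A - (intercept + slope * R_B)

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
stats.probplot(res_origin, dist="norm", plot=axes[0])
axes[0].set_title("Residuals, fit forced through origin")
stats.probplot(res_ols, dist="norm", plot=axes[1])
axes[1].set_title("Residuals, OLS with intercept")
plt.tight_layout()
plt.show()
```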
Abstract:
Knowledge of juvenile tree growth is crucial to understand how trees reach the canopy in tropical forests. However, long-term data on juvenile tree growth are usually unavailable. Annual tree rings provide growth information for the entire life of trees, and their analysis has become more popular in tropical forest regions over the past decades. Nonetheless, tree-ring studies mainly deal with adult rings, as the annual character of juvenile rings has been questioned. We evaluated whether juvenile tree rings can be used for three Bolivian rainforest species. First, we characterized the rings of juvenile and adult trees anatomically. We then evaluated the annual nature of tree rings by a combination of three indirect methods: evaluation of synchronous growth patterns in the tree-ring series, ¹⁴C bomb peak dating and correlations with rainfall. Our results indicate that rings of juvenile and adult trees are defined by similar ring-boundary elements. We built juvenile tree-ring chronologies and verified the ring age of several samples using ¹⁴C bomb peak dating. We found that ring width was correlated with rainfall in all species, but in different ways. In all, the chronology, rainfall correlations and ¹⁴C dating suggest that rings in our study species are formed annually.
Abstract:
In this article, we present a generalization of the Bayesian methodology introduced by Cepeda and Gamerman (2001) for modelling variance heterogeneity in normal regression models, in which mean and variance parameters are orthogonal, to the general case of both linear and highly nonlinear regression models. Under the Bayesian paradigm, we use MCMC methods to simulate samples from the joint posterior distribution. We illustrate this algorithm on a simulated data set and on a real data set on school attendance rates for children in Colombia. Finally, we present some extensions of the proposed MCMC algorithm.
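As a rough sketch of what joint mean-variance modelling involves, the following random-walk Metropolis sampler (not the Cepeda-Gamerman algorithm itself; data, proposal scale and priors are illustrative) targets the posterior of a normal model with a log-linear variance function.

```python
# Sketch: MCMC for y_i ~ N(x_i' beta, exp(z_i' gamma)), i.e. joint modelling of the
# mean and of variance heterogeneity. All settings below are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n = 300
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])      # design for the mean
Z = X                                     # design for the log-variance
beta_true = np.array([1.0, 2.0])
gamma_true = np.array([-2.0, 1.5])
y = X @ beta_true + rng.normal(0.0, np.exp(0.5 * (Z @ gamma_true)))

def log_post(theta):
    beta, gamma = theta[:2], theta[2:]
    var = np.exp(Z @ gamma)                               # heterogeneous variances
    loglik = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (y - X @ beta) ** 2 / var)
    logprior = -0.5 * np.sum(theta ** 2) / 100.0          # vague N(0, 100) priors
    return loglik + logprior

theta = np.zeros(4)
current = log_post(theta)
draws = []
for it in range(20000):
    prop = theta + rng.normal(0.0, 0.05, 4)               # random-walk proposal
    cand = log_post(prop)
    if np.log(rng.uniform()) < cand - current:
        theta, current = prop, cand
    if it >= 10000:                                        # discard burn-in
        draws.append(theta.copy())

print(np.mean(draws, axis=0))   # posterior means of (beta0, beta1, gamma0, gamma1)
```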
Abstract:
In this paper, we compare the performance of two statistical approaches for the analysis of data from social research. In the first approach, we use normal models with joint regression modelling of the mean and of the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included as explanatory variables in the regression models for the mean and for the variance, while in the second case the variance at level 1 of the hierarchical model depends on the individuals (their age), and at level 2 the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyse a Colombian height data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly model the mean and the variance heterogeneity in all cases. We also observe that convergence of the Gibbs sampling chain used in the Markov chain Monte Carlo method for joint modelling of the mean and variance heterogeneity is achieved quickly.
Abstract:
In this article, we are interested in evaluating different parameter estimation strategies for a multiple linear regression model. To estimate the model parameters, we used data from a clinical trial whose aim was to verify whether the mechanical test of the maximum force property (EM-FM) is associated with femoral mass, femoral diameter and the experimental group of ovariectomized rats of the species Rattus norvegicus albinus, Wistar variety. Three methodologies are compared for estimating the model parameters: the classical methodology, based on the least squares method; the Bayesian methodology, based on Bayes' theorem; and the Bootstrap method, based on resampling procedures.
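A minimal illustration of two of the three strategies (least squares and the case-resampling bootstrap) on synthetic data; the covariate names merely echo the trial described above and all numbers are invented.

```python
# Sketch: OLS estimation of a multiple linear regression and a case-resampling bootstrap
# to approximate the sampling distribution of the estimates. Data are synthetic.
import numpy as np

rng = np.random.default_rng(3)
n = 60
femoral_mass = rng.normal(10.0, 1.0, n)        # illustrative covariates
femoral_diameter = rng.normal(4.0, 0.3, n)
X = np.column_stack([np.ones(n), femoral_mass, femoral_diameter])
y = X @ np.array([5.0, 2.0, 1.0]) + rng.normal(0.0, 1.0, n)   # maximum-force response

# Classical estimate: ordinary least squares.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Bootstrap: refit on resampled cases.
boot = np.empty((2000, X.shape[1]))
for b in range(boot.shape[0]):
    idx = rng.integers(0, n, n)
    boot[b], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)

print(beta_ols)
print(np.percentile(boot, [2.5, 97.5], axis=0))   # bootstrap 95% intervals
```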
Abstract:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of symmetrical nonlinear regression models, since the error distributions cover both skewed and heavy-tailed distributions such as the skew-t, skew-slash and skew-contaminated normal distributions. The main advantage of this class of distributions is that they have a convenient hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robustness of this flexible class against outlying and influential observations, we present Bayesian case deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussion of model selection criteria is given. The newly developed procedures are illustrated with two simulation studies and a real data set previously analyzed under normal and skew-normal nonlinear regression models.
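The hierarchical representation mentioned above can be illustrated by simulation. The parametrisation below (a half-normal latent term plus a normal term, scaled by a Gamma mixing variable, which yields a skew-t) is one common choice and may differ in detail from the paper's formulation.

```python
# Sketch of a hierarchical (stochastic) representation of a scale mixture of skew-normal
# errors: conditional on the latent variables, the observation is normal, which is what
# makes Gibbs/MCMC updates convenient. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(4)
n = 5000
mu, sigma, delta, nu = 0.0, 1.0, 2.0, 4.0   # location, scale, skewness, degrees of freedom

u = rng.gamma(nu / 2.0, 2.0 / nu, n)        # mixing variable: U ~ Gamma(nu/2, rate nu/2)
t = np.abs(rng.normal(0.0, 1.0, n))         # half-normal latent variable
e = rng.normal(0.0, 1.0, n)

# Skew-t draw: given (U, T) the observation is normal with shifted mean and scaled variance.
y = mu + (delta * t + sigma * e) / np.sqrt(u)

print(y.mean(), np.mean((y - y.mean()) ** 3))   # positive skewness for delta > 0
```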
A bivariate regression model for matched paired survival data: local influence and residual analysis
Abstract:
The use of bivariate distributions plays a fundamental role in survival and reliability studies. In this paper, we consider a location-scale model for bivariate survival times, based on a copula used to model the dependence of bivariate survival data. For the proposed model, we consider inferential procedures based on maximum likelihood. Gains in efficiency from bivariate models are also examined in the censored-data setting. Various simulation studies, covering different parameter settings, sample sizes and censoring percentages, are performed to assess the performance of the bivariate regression model for matched paired survival data. Sensitivity analysis methods such as local and total influence are presented and derived under three perturbation schemes. The marginal martingale and marginal deviance residuals are used to check the adequacy of the model. Furthermore, we propose a new measure, which we call the modified deviance component residual. The methodology is illustrated on a lifetime data set for kidney patients.
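To make the copula construction concrete, the sketch below simulates matched-pair survival times with dependence induced by a Clayton copula and Weibull margins; both choices are illustrative, since the abstract does not specify the copula or the marginal distributions.

```python
# Sketch: simulate dependent matched-pair survival times via a copula, with fixed-time
# administrative censoring. Clayton copula and Weibull margins are illustrative choices.
import numpy as np

rng = np.random.default_rng(5)
n, theta = 1000, 2.0                          # number of pairs, Clayton dependence parameter

# Conditional-inversion sampling from the Clayton copula.
u1 = rng.uniform(size=n)
w = rng.uniform(size=n)
u2 = ((w ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)

# Weibull margins for the two members of each pair (inverse-CDF transform).
t1 = 10.0 * (-np.log(1.0 - u1)) ** (1.0 / 1.5)
t2 = 12.0 * (-np.log(1.0 - u2)) ** (1.0 / 1.5)

# Administrative censoring at a fixed follow-up time.
c = 15.0
obs1, d1 = np.minimum(t1, c), (t1 <= c).astype(int)
obs2, d2 = np.minimum(t2, c), (t2 <= c).astype(int)
print(np.corrcoef(t1, t2)[0, 1], d1.mean(), d2.mean())
```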
Abstract:
In this paper we discuss inference aspects of skew-normal nonlinear regression models following both a classical and a Bayesian approach, extending the usual normal nonlinear regression models. The univariate skew-normal distribution used in this work was introduced by Sahu et al. (Can J Stat 29:129-150, 2003); it is attractive because estimation of the skewness parameter does not present the same degree of difficulty as in the case of the Azzalini (Scand J Stat 12:171-178, 1985) distribution and, moreover, it allows easy implementation of the EM algorithm. As an illustration of the proposed methodology, we consider a data set previously analyzed in the literature under normality.
Abstract:
The purpose of this paper is to develop a Bayesian approach for log-Birnbaum-Saunders Student-t regression models with right-censored survival data. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the considered model. In order to attenuate the influence of outlying observations on the parameter estimates, we present Birnbaum-Saunders models in which a Student-t distribution is assumed to explain the cumulative damage. Some discussion of model selection criteria for comparing the fitted models is also given, and case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback-Leibler divergence. The developed procedures are illustrated with a real data set.
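A small simulation sketch of the role of the Student-t kernel: lifetimes are generated through the standard Birnbaum-Saunders transformation with either a normal or a t kernel. Parameter values are illustrative, not estimates from the paper.

```python
# Sketch: the Birnbaum-Saunders transformation T = beta*(alpha*W/2 + sqrt((alpha*W/2)^2 + 1))^2
# applied to a normal kernel (classical BS) and a Student-t kernel (heavier-tailed BS-t).
import numpy as np

rng = np.random.default_rng(6)
n, alpha, beta, nu = 2000, 0.5, 2.0, 4.0      # sample size, shape, scale, degrees of freedom

def birnbaum_saunders(kernel):
    w = alpha * kernel / 2.0
    return beta * (w + np.sqrt(w ** 2 + 1.0)) ** 2

t_normal = birnbaum_saunders(rng.standard_normal(n))    # classical BS lifetimes
t_student = birnbaum_saunders(rng.standard_t(nu, n))    # BS-t lifetimes with heavier tails

# The log-lifetimes enter the log-BS regression; the t kernel produces more extreme values.
print(np.log(t_normal).std(), np.log(t_student).std())
```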
Abstract:
In this work we introduce a new hierarchical surface decomposition method for multiscale analysis of surface meshes. In contrast to other multiresolution methods, our approach relies on spectral properties of the surface to build a binary hierarchical decomposition. Namely, we utilize the first nontrivial eigenfunction of the Laplace-Beltrami operator to recursively decompose the surface. For this reason we coin our surface decomposition the Fiedler tree. Using the Fiedler tree ensures a number of attractive properties, including: mesh-independent decomposition, well-formed and nearly equi-areal surface patches, and noise robustness. We show how the evenly distributed patches can be exploited for generating multiresolution high quality uniform meshes. Additionally, our decomposition permits a natural means for carrying out wavelet methods, resulting in an intuitive method for producing feature-sensitive meshes at multiple scales.
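A toy sketch of the recursive construction: compute the Fiedler vector (second-smallest eigenvector of a graph Laplacian) and split the vertices by its sign, recursing on each half. A plain vertex-adjacency Laplacian stands in for the paper's Laplace-Beltrami discretisation, and the toy graph is a path rather than a mesh.

```python
# Sketch of a Fiedler-style binary hierarchy on a generic graph (illustrative only).
import numpy as np

def fiedler_vector(adj):
    # Graph Laplacian L = D - A; its second-smallest eigenvector is the Fiedler vector.
    lap = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(lap)          # eigenvalues returned in ascending order
    return vecs[:, 1]

def fiedler_tree(adj, vertices, max_size=4):
    if len(vertices) <= max_size:
        return vertices.tolist()
    f = fiedler_vector(adj[np.ix_(vertices, vertices)])
    left, right = vertices[f < 0], vertices[f >= 0]
    if len(left) == 0 or len(right) == 0:  # guard against degenerate splits
        return vertices.tolist()
    return [fiedler_tree(adj, left, max_size), fiedler_tree(adj, right, max_size)]

# Toy example: a path graph of 16 vertices splits into contiguous segments.
n = 16
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
print(fiedler_tree(adj, np.arange(n)))
```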