866 results for Boosted regression trees


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an approach to predicting the operating conditions of a machine based on classification and regression trees (CART) and an adaptive neuro-fuzzy inference system (ANFIS), combined with a direct prediction strategy for multi-step-ahead time series prediction. In this study, the number of available observations and the number of predicted steps are first determined using the false nearest neighbor method and the auto mutual information technique, respectively. These values are then used as inputs to the prediction models to forecast future values of the machines’ operating conditions. The performance of the proposed approach is evaluated using real trending data from a low methane compressor. A comparative study of the predictions obtained from the CART and ANFIS models is also carried out to appraise their prediction capability. The results show that the ANFIS prediction model can track changes in machine conditions and has potential as a tool for machine fault prognosis.
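
The direct strategy mentioned in this abstract trains one model per forecast horizon instead of feeding predictions back in recursively. The following is a minimal sketch of that idea, not the paper's implementation: the function names are hypothetical, and a 1-nearest-neighbour lookup stands in for the CART and ANFIS models, which are not reproduced here.

```python
def make_direct_dataset(series, n_lags, horizon):
    """Pair each window of n_lags past values with the value `horizon` steps ahead."""
    X, y = [], []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        X.append(list(series[start:start + n_lags]))
        y.append(series[start + n_lags + horizon - 1])
    return X, y


def nearest_neighbour_predict(X, y, query):
    """Stand-in model (assumption): return the target of the closest training window."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(X[i], query))
    best = min(range(len(X)), key=squared_distance)
    return y[best]


def direct_forecast(series, n_lags, max_horizon):
    """Direct strategy: fit a separate predictor for every horizon 1..max_horizon."""
    query = list(series[-n_lags:])
    forecasts = []
    for horizon in range(1, max_horizon + 1):
        X, y = make_direct_dataset(series, n_lags, horizon)
        forecasts.append(nearest_neighbour_predict(X, y, query))
    return forecasts


if __name__ == "__main__":
    periodic = [0, 1, 2, 3] * 5
    print(direct_forecast(periodic, n_lags=2, max_horizon=3))
```

In the abstract, the window length and the number of predicted steps come from the false nearest neighbor and auto mutual information analyses; here `n_lags` and `max_horizon` are simply passed in by hand.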

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objectives: Demonstrate the application of decision trees – classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs) – to understand structure in missing data. Setting: Data taken from employees at three different industry sites in Australia. Participants: 7915 observations were included. Materials and Methods: The approach was evaluated using an occupational health dataset comprising results of questionnaires, medical tests, and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results: CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured compared to structured missingness. Discussion: Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusion: Researchers are encouraged to use CART and BRT models to explore and understand missing data.
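
The core idea of this abstract, treating "is this value missing?" as the response and letting a tree find the variables that explain it, can be illustrated with a single split. This is a hedged sketch in plain Python rather than the study's `rpart` or `gbm` analysis; the function names, the record layout, and the example variables are assumptions made for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_missingness_split(rows, target, predictors):
    """Find the one-variable split that best separates missing from observed targets.

    rows: list of dicts; a missing target is stored as None (assumption).
    Predictors are assumed numeric and fully observed.
    Returns (weighted impurity, predictor name, threshold).
    """
    missing = [1 if row[target] is None else 0 for row in rows]
    best = None
    for name in predictors:
        values = sorted({row[name] for row in rows})
        # Candidate thresholds lie halfway between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [m for row, m in zip(rows, missing) if row[name] <= threshold]
            right = [m for row, m in zip(rows, missing) if row[name] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best


if __name__ == "__main__":
    # Hypothetical records: the measurement is missing only at site 3.
    records = [
        {"site": 1, "visits": 1, "lung_function": 3.9},
        {"site": 1, "visits": 4, "lung_function": 4.2},
        {"site": 2, "visits": 2, "lung_function": 3.5},
        {"site": 2, "visits": 5, "lung_function": 4.0},
        {"site": 3, "visits": 1, "lung_function": None},
        {"site": 3, "visits": 3, "lung_function": None},
    ]
    print(best_missingness_split(records, "lung_function", ["site", "visits"]))
```

A full CART grows this split recursively and prunes the result; the single split is only meant to show how a missingness indicator becomes an ordinary classification target.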

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We analyzed catches per unit of effort (CPUE) from the Japanese longline fishery for bigeye tuna (Thunnus obesus) in the central and eastern Pacific Ocean (EPO) with regression tree methods. Regression trees have not previously been used to estimate time series of abundance indices from CPUE data. The "optimally sized" tree had 139 parameters; year, month, latitude, and longitude interacted to affect bigeye CPUE. The trend in tree-based abundance indices for the EPO was similar to trends estimated from a generalized linear model and from an empirical model that combines oceanographic data with information on the distribution of fish relative to environmental conditions. The regression tree was more parsimonious and would be easier to implement than the other two models, but the tree provided no information about the mechanisms that caused bigeye CPUEs to vary in time and space. Bigeye CPUEs increased sharply during the mid-1980s and were more variable at the northern and southern edges of the fishing grounds. Both of these results can be explained by changes in actual abundance and changes in catchability. Results from a regression tree that was fitted to a subset of the data indicated that, in the EPO, bigeye are about equally catchable with regular and deep longlines. This is not consistent with observations that bigeye are more abundant at depth and indicates that classification by gear type (regular or deep longline) may not provide a good measure of capture depth. A simulated annealing algorithm was used to summarize the tree-based results by partitioning the fishing grounds into regions where trends in bigeye CPUE were similar. Simulated annealing can be useful for designing spatial strata in future sampling programs. (PDF contains 45 pages.)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The mesoscale (10^0–10^2 m) of river habitats has been identified as the scale that simultaneously offers insights into ecological structure and falls within the practical bounds of river management. Mesoscale habitat (mesohabitat) classifications for relatively large rivers, however, are underdeveloped compared with those produced for smaller streams. Approaches to habitat modelling have traditionally focused on individual species or proceeded on a species-by-species basis. This is particularly problematic in larger rivers where the effects of biological interactions are more complex and intense. Community-level approaches can rapidly model many species simultaneously, thereby integrating the effects of biological interactions while providing information on the relative importance of environmental variables in structuring the community. One such community-level approach, multivariate regression trees, was applied in order to determine the relative influences of abiotic factors on fish assemblages within shoreline mesohabitats of San Pedro River, Chile, and to define reference communities prior to the planned construction of a hydroelectric power plant. Flow depth, bank materials and the availability of riparian and instream cover, including woody debris, were the main variables driving differences between the assemblages. Species strongly indicative of distinctive mesohabitat types included the endemic Galaxias platei. Among other outcomes, the results provide information on the impact of non-native salmonids on river-dwelling Galaxias platei, suggesting a degree of habitat segregation between these taxa based on flow depth. The results support the use of the mesohabitat concept in large, relatively pristine river systems, and they represent a basis for assessing the impact of any future hydroelectric power plant construction and operation. By combining community classifications with simple sets of environmental rules, the multivariate regression trees produced can be used to predict the community structure of any mesohabitat along the reach.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Spatially explicit information on local perceptions of ecosystem services is needed to inform land use planning within rapidly changing landscapes. In this paper we spatially modelled local people's use and perceptions of benefits from forest ecosystem services in Borneo, from interviews of 1837 people in 185 villages. Questions related to provisioning, cultural/spiritual, regulating and supporting ecosystem services derived from forest, and attitudes towards forest conversion. We used boosted regression trees (BRTs) to combine interview data with social and environmental predictors to understand spatial variation of perceptions across Borneo. Our results show that people use a variety of products from intact and highly degraded forests. Perceptions of benefits from forests were strongest: in human-altered forest landscapes for cultural and spiritual benefits; in human-altered and intact forest landscapes for health benefits; in intact forests for direct health benefits, such as medicinal plants; and in regions with little forest and extensive plantations for environmental benefits, such as climatic impacts from deforestation. Forest clearing for small-scale agriculture was predicted to be widely supported, yet less so for large-scale agriculture. Understanding perceptions of rural communities in dynamic, multi-use landscapes is important where people are often directly affected by the decline in ecosystem services.
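
For readers unfamiliar with the method named in this abstract, boosting builds a prediction as a sum of many small trees, each fitted to the residuals left by the trees before it. The sketch below is a deliberately reduced illustration under stated assumptions: one predictor, single-split trees (stumps), squared-error loss, and hypothetical function names. It is not the study's model, which would typically use many predictors, deeper trees, and random subsampling.

```python
def fit_stump(x, residuals):
    """Best single split of one predictor under squared error.

    Assumes x contains at least two distinct values.
    """
    best = None
    values = sorted(set(x))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted_stumps(x, y, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    baseline = sum(y) / len(y)
    predictions = [baseline] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Shrink each stump's contribution by the learning rate.
        predictions = [
            p + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, p in zip(x, predictions)
        ]
    return baseline, learning_rate, stumps


def predict(model, xi):
    """Sum the baseline and the shrunken contribution of every stump."""
    baseline, learning_rate, stumps = model
    total = baseline
    for threshold, left_mean, right_mean in stumps:
        total += learning_rate * (left_mean if xi <= threshold else right_mean)
    return total
```

The learning rate trades speed for stability: smaller values need more trees but make each tree's contribution less decisive, which is the usual reason it is tuned together with the number of trees.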

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Aim: Determining how ecological processes vary across space is a major focus in ecology. Current methods that investigate such effects remain constrained by important limiting assumptions. Here we provide an extension to geographically weighted regression in which local regression and spatial weighting are used in combination. This method can be used to investigate non-stationarity and spatial-scale effects using any regression technique that can accommodate uneven weighting of observations, including machine learning. Innovation: We extend the use of spatial weights to generalized linear models and boosted regression trees by using simulated data for which the results are known, and compare these local approaches with existing alternatives such as geographically weighted regression (GWR). The spatial weighting procedure (1) explained up to 80% deviance in simulated species richness, (2) optimized the normal distribution of model residuals when applied to generalized linear models versus GWR, and (3) detected nonlinear relationships and interactions between response variables and their predictors when applied to boosted regression trees. Predictor ranking changed with spatial scale, highlighting the scales at which different species–environment relationships need to be considered. Main conclusions: GWR is useful for investigating spatially varying species–environment relationships. However, the use of local weights implemented in alternative modelling techniques can help detect nonlinear relationships and high-order interactions that were previously unassessed. Therefore, this method not only informs us how location and scale influence our perception of patterns and processes, it also offers a way to deal with different ecological interpretations that can emerge as different areas of spatial influence are considered during model fitting.
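
The combination of spatial weighting and local regression described in this abstract can be sketched with the simplest possible learner. The example below assumes a Gaussian distance kernel and a one-predictor weighted least-squares fit; the kernel choice, the bandwidth, and the function names are illustrative assumptions rather than the paper's procedure. The same weights could be passed to any learner that accepts per-observation weights.

```python
import math


def gaussian_weights(coords, focal, bandwidth):
    """Weight each observation by its distance to the focal location."""
    weights = []
    for x, y in coords:
        distance = math.hypot(x - focal[0], y - focal[1])
        weights.append(math.exp(-0.5 * (distance / bandwidth) ** 2))
    return weights


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    # Hypothetical sites: the response depends on the predictor with slope 1
    # in the west and slope 3 in the east (a non-stationary relationship).
    coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
    env = [1, 2, 3, 1, 2, 3]
    richness = [1, 2, 3, 3, 6, 9]
    for focal in [(1, 0), (101, 0)]:
        w = gaussian_weights(coords, focal, bandwidth=5.0)
        print(focal, weighted_linear_fit(env, richness, w))
```

Varying `bandwidth` changes the area of spatial influence around each focal point, which is how a scale effect of the kind the abstract discusses would show up in a fit like this one.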

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Given the limited resources available for weed management, a strategic approach is required to give the best bang for your buck. The current study incorporates: (1) a model ensemble approach to identify areas of uncertainty and commonality regarding a species' invasive potential, (2) current distribution of the invaded species, and (3) connectivity of systems to identify target regions and focus efforts for more effective management. Uncertainty in the prediction of suitable habitat for H. amplexicaulis (study species) in Australia was addressed in an ensemble-forecasting approach to compare distributional scenarios from four models (CLIMATCH; CLIMEX; boosted regression trees [BRT]; maximum entropy [Maxent]). Models were built using subsets of occurrence and environmental data. Catchment risk was determined through incorporating habitat suitability, the current abundance and distribution of H. amplexicaulis, and catchment connectivity. Our results indicate geographic differences between predictions of different approaches. Despite these differences, a number of catchments in northern, central, and southern Australia were identified as at high risk of invasion or further spread by all models, suggesting they should be given priority for the management of H. amplexicaulis. The study also highlighted the utility of ensemble approaches in identifying areas of uncertainty and commonality regarding the species' invasive potential.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Modeling the distributions of species, especially of invasive species in non-native ranges, involves multiple challenges. Here, we developed some novel approaches to species distribution modeling aimed at reducing the influences of such challenges and improving the realism of projections. We estimated species-environment relationships with four modeling methods run with multiple scenarios of (1) sources of occurrences and geographically isolated background ranges for absences, (2) approaches to drawing background (absence) points, and (3) alternate sets of predictor variables. We further tested various quantitative metrics of model evaluation against biological insight. Model projections were very sensitive to the choice of training dataset. Model accuracy was much improved by using a global dataset for model training, rather than restricting data input to the species’ native range. AUC score was a poor metric for model evaluation and, if used alone, was not a useful criterion for assessing model performance. Projections away from the sampled space (i.e. into areas of potential future invasion) were very different depending on the modeling methods used, raising questions about the reliability of ensemble projections. Generalized linear models gave very unrealistic projections far away from the training region. Models that efficiently fit the dominant pattern, but exclude highly local patterns in the dataset and capture interactions as they appear in data (e.g. boosted regression trees), improved generalization of the models. Biological knowledge of the species and its distribution was important in refining choices about the best set of projections. A post-hoc test conducted on a new Parthenium dataset from Nepal validated excellent predictive performance of our “best” model. We showed that vast stretches of currently uninvaded geographic areas on multiple continents harbor highly suitable habitats for Parthenium hysterophorus L. (Asteraceae; parthenium). 
However, discrepancies between model predictions and parthenium invasion in Australia indicate successful management for this globally significant weed.