883 results for Regression Trees
Abstract:
Objectives: The objectives of this study were to specifically investigate the differences in culture, attitudes and social networks between Australian and Taiwanese men and women and identify the factors that predict midlife men and women’s quality of life in both countries. Methods: A stratified random sample strategy based on probability proportional sampling (PPS) was conducted to investigate 278 Australian and 398 Taiwanese midlife men and women’s quality of life. Multiple regression modelling and classification and regression trees (CARTs) were performed to examine the potential differences on culture, attitude, social networks, social demographic factors and religion/spirituality in midlife men and women’s quality of life in both Australia and Taiwan. Results: The results of this study suggest that culture involves multiple functions and interacts with attitudes, social networks and individual factors to influence a person’s quality of life. Significant relationships were found between the interaction between cultural circumstances and a person’s internal and external factors. The research found that good social support networks and a healthy optimistic disposition may significantly enhance midlife men and women’s quality of life. Conclusion: The study indicated that there is a significant relationship between culture, attitude, social networks and quality of life in midlife Australian and Taiwanese men and women. People who had higher levels of horizontal individualism and collectivism, positive attitudes and better social support had better psychological, social, physical and environmental health, while it emerged that vertical individualists with competitive characteristics would experience a lower quality of life. 
This study has highlighted areas where opportunities exist to further reflect upon contemporary social health policies for Australian and Taiwanese societies and also within the global perspective, in order to provide enhanced quality care for growing midlife populations.
Abstract:
Habitat models are widely used in ecology; however, there are relatively few studies of rare species, primarily because of a paucity of survey records and the lack of robust means of assessing the accuracy of modelled spatial predictions. We investigated the potential of compiled ecological data in developing habitat models for Macadamia integrifolia, a vulnerable mid-stratum tree endemic to lowland subtropical rainforests of southeast Queensland, Australia. We compared the performance of two binomial models, Classification and Regression Trees (CART) and Generalised Additive Models (GAM), with Maximum Entropy (MAXENT) models developed from (i) presence records and available absence data and (ii) presence records and background data. The GAM model was the best performer across the range of evaluation measures employed; however, all models were assessed as potentially useful for informing in situ conservation of M. integrifolia. A significant loss in the amount of M. integrifolia habitat has occurred (p < 0.05), with only 37% of former habitat (pre-clearing) remaining in 2003. Remnant patches are significantly smaller, have larger edge-to-area ratios and are more isolated from each other compared to pre-clearing configurations (p < 0.05). Whilst the network of suitable habitat patches is still largely intact, there are numerous smaller patches that are more isolated in the contemporary landscape compared with their connectedness before clearing. These results suggest that in situ conservation of M. integrifolia may be best achieved through a landscape approach that considers the relative contribution of small remnant habitat fragments to the species as a whole, as well as facilitating connectivity among the entire network of habitat patches.
Abstract:
This study examined the distribution of major mosquito species and their roles in the transmission of Ross River virus (RRV) infection for coastline and inland areas in Brisbane, Australia (27°28′ S, 153°2′ E). We obtained data on the monthly counts of RRV cases in Brisbane between November 1998 and December 2001 by statistical local areas from the Queensland Department of Health and the monthly mosquito abundance from the Brisbane City Council. Correlation analysis was used to assess the pairwise relationships between mosquito density and the incidence of RRV disease. This study showed that the abundances of Aedes vigilax (Skuse), Culex annulirostris (Skuse), and Aedes vittiger (Skuse) were significantly associated with the monthly incidence of RRV in the coastline area, whereas those of Aedes vigilax, Culex annulirostris, and Aedes notoscriptus (Skuse) were significantly associated with the monthly incidence of RRV in the inland area. The results of the classification and regression tree (CART) analysis show that both occurrence and incidence of RRV were influenced by interactions between species in both coastal and inland regions. We found that there was an 89% chance of an occurrence of RRV if the abundance of Ae. vigilax was between 64 and 90 in the coastline region. There was an 80% chance of an occurrence of RRV if the density of Cx. annulirostris was between 53 and 74 in the inland area. The results of this study may have applications as a decision support tool in planning disease control of RRV and other mosquito-borne diseases.
Abstract:
Road crashes cost the world and Australian society a significant proportion of GDP, affecting productivity and causing significant suffering for communities and individuals. This paper presents a case study that generates data mining models that contribute to the understanding of road crashes by allowing examination of the role of skid resistance (F60) and other road attributes in road crashes. Predictive data mining algorithms, primarily regression trees, were used to produce road segment crash count models from the road and traffic attributes of crash scenarios. The rules derived from the regression trees provide evidence of the significance of road attributes in contributing to crashes, with a focus on the evaluation of skid resistance.
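As a rough illustration of how a regression tree derives a rule from a road attribute, the sketch below finds the single threshold on one predictor that minimises the summed squared error of the response on each side of the split. The function name `best_split` and the F60 and crash-count values are hypothetical and are not taken from the paper; a full regression tree would apply this search recursively over many attributes.

```python
def best_split(xs, ys):
    """Return (threshold, sse) for the one split on xs that minimises the
    summed squared error of ys around the mean of each side."""
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))  # no split at all is the baseline
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between identical predictor values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = ((lo + hi) / 2, total)
    return best


# Hypothetical road segments: skid resistance (F60) and crash counts.
f60 = [0.25, 0.30, 0.35, 0.50, 0.55, 0.60]
crashes = [9, 8, 10, 2, 3, 1]
threshold, error = best_split(f60, crashes)
print(f"rule: F60 < {threshold:.3f} -> higher crash count (SSE {error:.1f})")
```

The resulting rule has the form "if F60 is below the threshold, predict the mean crash count of the low-F60 segments", which is the kind of readable rule the abstract refers to.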
Abstract:
Protocols for bioassessment often relate changes in summary metrics that describe aspects of biotic assemblage structure and function to environmental stress. Biotic assessment using multimetric indices now forms the basis for setting regulatory standards for stream quality and a range of other goals related to water resource management in the USA and elsewhere. Biotic metrics are typically interpreted with reference to the expected natural state to evaluate whether a site is degraded. It is critical that natural variation in biotic metrics along environmental gradients is adequately accounted for, in order to quantify human disturbance-induced change. A common approach used in the IBI is to examine scatter plots of variation in a given metric along a single stream size surrogate and fit a line (drawn by eye) to form the upper bound, and hence define the maximum likely value of a given metric in a site of a given environmental characteristic (termed the 'maximum species richness line' - MSRL). In this paper we examine whether the use of a single environmental descriptor and the MSRL is appropriate for defining the reference condition for a biotic metric (fish species richness) and for detecting human disturbance gradients in rivers of south-eastern Queensland, Australia. We compare the accuracy and precision of the MSRL approach based on single environmental predictors with three regression-based prediction methods (Simple Linear Regression, Generalised Linear Modelling and Regression Tree modelling) that use (either singly or in combination) a set of landscape and local scale environmental variables as predictors of species richness. We compared the frequency of classification errors from each method against set biocriteria and contrasted the ability of each method to accurately reflect human disturbance gradients at a large set of test sites.
The results of this study suggest that the MSRL based upon variation in a single environmental descriptor could not accurately predict species richness at minimally disturbed sites when compared with simple linear regressions (SLRs) based on equivalent environmental variables. Regression-based modelling incorporating multiple environmental variables as predictors more accurately explained natural variation in species richness than did simple models using single environmental predictors. Prediction error arising from the MSRL was substantially higher than for the regression methods and led to an increased frequency of Type I errors (incorrectly classing a site as disturbed). We suggest that problems with the MSRL arise from the inherent scoring procedure used and that it is limited to predicting variation in the dependent variable along a single environmental gradient.
Abstract:
Spatially explicit information on local perceptions of ecosystem services is needed to inform land use planning within rapidly changing landscapes. In this paper we spatially modelled local people's use and perceptions of benefits from forest ecosystem services in Borneo, from interviews of 1837 people in 185 villages. Questions related to provisioning, cultural/spiritual, regulating and supporting ecosystem services derived from forest, and attitudes towards forest conversion. We used boosted regression trees (BRTs) to combine interview data with social and environmental predictors to understand spatial variation of perceptions across Borneo. Our results show that people use a variety of products from intact and highly degraded forests. Perceptions of benefits from forests were strongest: in human-altered forest landscapes for cultural and spiritual benefits; in human-altered and intact forest landscapes for health benefits; in intact forest for direct health benefits, such as medicinal plants; and in regions with little forest and extensive plantations, for environmental benefits, such as climatic impacts from deforestation. Forest clearing for small-scale agriculture was predicted to be widely supported, yet less so for large-scale agriculture. Understanding perceptions of rural communities in dynamic, multi-use landscapes is important where people are often directly affected by the decline in ecosystem services.
Abstract:
Aim: Determining how ecological processes vary across space is a major focus in ecology. Current methods that investigate such effects remain constrained by important limiting assumptions. Here we provide an extension to geographically weighted regression in which local regression and spatial weighting are used in combination. This method can be used to investigate non-stationarity and spatial-scale effects using any regression technique that can accommodate uneven weighting of observations, including machine learning. Innovation: We extend the use of spatial weights to generalized linear models and boosted regression trees by using simulated data for which the results are known, and compare these local approaches with existing alternatives such as geographically weighted regression (GWR). The spatial weighting procedure (1) explained up to 80% deviance in simulated species richness, (2) optimized the normal distribution of model residuals when applied to generalized linear models versus GWR, and (3) detected nonlinear relationships and interactions between response variables and their predictors when applied to boosted regression trees. Predictor ranking changed with spatial scale, highlighting the scales at which different species–environment relationships need to be considered. Main conclusions: GWR is useful for investigating spatially varying species–environment relationships. However, the use of local weights implemented in alternative modelling techniques can help detect nonlinear relationships and high-order interactions that were previously unassessed. Therefore, this method not only informs us how location and scale influence our perception of patterns and processes, it also offers a way to deal with different ecological interpretations that can emerge as different areas of spatial influence are considered during model fitting.
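The idea of fitting a local model with spatial weights can be sketched as follows. This is a minimal illustration that assumes a Gaussian distance-decay kernel and a one-predictor weighted least-squares fit; the kernel choice, the function names and the toy transect data are assumptions for illustration and are not the authors' procedure. Any learner that accepts per-observation weights could take the place of the linear fit.

```python
import math


def gaussian_weights(coords, focal, bandwidth):
    """Weight each observation by its distance to the focal location."""
    return [math.exp(-0.5 * (math.dist(c, focal) / bandwidth) ** 2)
            for c in coords]


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Toy transect: the response rises with the predictor in the west
# and falls with it in the east (a non-stationary relationship).
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [2, 4, 6, -1, -2, -3]

_, global_slope = weighted_linear_fit(x, y, [1.0] * len(x))
_, west_slope = weighted_linear_fit(x, y, gaussian_weights(coords, (1, 0), 1.0))
_, east_slope = weighted_linear_fit(x, y, gaussian_weights(coords, (11, 0), 1.0))
print(global_slope, west_slope, east_slope)
```

The unweighted fit averages the two regimes into a single slope, while the two local fits recover the opposing relationships. Widening the bandwidth moves the local fits back towards the global one, which is how the spatial scale of the analysis is varied.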
Abstract:
The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques have sought to optimize predictive performance (e.g. Elith et al., 2006). In general, SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models (Nix, 1986) or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables. Examples include generalized linear or additive models with variable selection (Hastie et al., 2002), or classification trees with complexity or model-based pruning (Breiman et al., 1984; Zeileis, 2008). A third approach uses model averaging to summarize the overall contribution of a variable, without considering particular combinations. Examples include neural networks, boosted or bagged regression trees and Maximum Entropy, as compared in Elith et al. (2006). Typically, users of SDMs will either consider a small number of variable sets, via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approaches. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see review in Low Choy et al., 2010). However, few methods have been published for informative variable selection; one example is Bayesian trees (O’Leary, 2008). Here we report an elicitation protocol that helps make explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection described above, Bayesian or otherwise.
We demonstrate how this information can be obtained then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.
Abstract:
Given the limited resources available for weed management, a strategic approach is required to achieve the greatest benefit for the effort invested. The current study incorporates: (1) a model ensemble approach to identify areas of uncertainty and commonality regarding a species' invasive potential, (2) the current distribution of the invasive species, and (3) connectivity of systems to identify target regions and focus efforts for more effective management. Uncertainty in the prediction of suitable habitat for H. amplexicaulis (study species) in Australia was addressed in an ensemble-forecasting approach to compare distributional scenarios from four models (CLIMATCH; CLIMEX; boosted regression trees [BRT]; maximum entropy [Maxent]). Models were built using subsets of occurrence and environmental data. Catchment risk was determined by incorporating habitat suitability, the current abundance and distribution of H. amplexicaulis, and catchment connectivity. Our results indicate geographic differences between the predictions of different approaches. Despite these differences, a number of catchments in northern, central, and southern Australia were identified by all models as being at high risk of invasion or further spread, suggesting they should be given priority for the management of H. amplexicaulis. The study also highlighted the utility of ensemble approaches in identifying areas of uncertainty and commonality regarding the species' invasive potential.
Abstract:
Modeling the distributions of species, especially of invasive species in non-native ranges, involves multiple challenges. Here, we developed some novel approaches to species distribution modeling aimed at reducing the influences of such challenges and improving the realism of projections. We estimated species-environment relationships with four modeling methods run with multiple scenarios of (1) sources of occurrences and geographically isolated background ranges for absences, (2) approaches to drawing background (absence) points, and (3) alternate sets of predictor variables. We further tested various quantitative metrics of model evaluation against biological insight. Model projections were very sensitive to the choice of training dataset. Model accuracy was much improved by using a global dataset for model training, rather than restricting data input to the species’ native range. AUC score was a poor metric for model evaluation and, if used alone, was not a useful criterion for assessing model performance. Projections away from the sampled space (i.e. into areas of potential future invasion) were very different depending on the modeling methods used, raising questions about the reliability of ensemble projections. Generalized linear models gave very unrealistic projections far away from the training region. Models that efficiently fit the dominant pattern, but exclude highly local patterns in the dataset and capture interactions as they appear in data (e.g. boosted regression trees), improved generalization of the models. Biological knowledge of the species and its distribution was important in refining choices about the best set of projections. A post-hoc test conducted on a new Parthenium dataset from Nepal validated excellent predictive performance of our “best” model. We showed that vast stretches of currently uninvaded geographic areas on multiple continents harbor highly suitable habitats for Parthenium hysterophorus L. (Asteraceae; parthenium).
However, discrepancies between model predictions and parthenium invasion in Australia indicate successful management for this globally significant weed.
Abstract:
The aim of this study was to evaluate and test methods which could improve local estimates of a general model fitted to a large area. In the first three studies, the intention was to divide the study area into sub-areas that were as homogeneous as possible according to the residuals of the general model, and in the fourth study, the localization was based on the local neighbourhood. According to spatial autocorrelation (SA), points closer together in space are more likely to be similar than those that are farther apart. Local indicators of SA (LISAs) test the similarity of data clusters. A LISA was calculated for every observation in the dataset, and together with the spatial position and residual of the global model, the data were segmented using two different methods: classification and regression trees (CART) and the multiresolution segmentation algorithm (MS) of the eCognition software. The general model was then re-fitted (localized) to the formed sub-areas. In kriging, the SA is modelled with a variogram, and the spatial correlation is a function of the distance (and direction) between the observation and the point of calculation. A general trend is corrected with the residual information of the neighbourhood, whose size is controlled by the number of nearest neighbours. Nearness is measured as Euclidean distance. With all methods, the root mean square errors (RMSEs) were lower, but with the methods that segmented the study area, the deviance in single localized RMSEs was wide. Therefore, an element capable of controlling the division or localization should be included in the segmentation-localization process. Kriging, on the other hand, provided stable estimates when the number of neighbours was sufficient (over 30), thus offering the best potential for further studies. Even CART could be combined with kriging or non-parametric methods, such as most similar neighbours (MSN).
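The notion of correcting a general trend with residuals from the local neighbourhood can be sketched in a deliberately simplified form. The example below adds the unweighted mean residual of the k nearest observations (by Euclidean distance) to the global prediction. It is not kriging, since no variogram is fitted and all neighbours count equally; the function name and the data are hypothetical.

```python
import math


def localized_prediction(global_pred, target_xy, obs_xy, obs_residuals, k):
    """Correct a global-model prediction with the mean residual of the
    k observations nearest to the target point (Euclidean distance)."""
    order = sorted(range(len(obs_xy)),
                   key=lambda i: math.dist(obs_xy[i], target_xy))
    nearest = order[:k]
    correction = sum(obs_residuals[i] for i in nearest) / len(nearest)
    return global_pred + correction


# Hypothetical observations: locations and residuals (observed - predicted)
# of the general model fitted to the whole area.
obs_xy = [(0, 0), (1, 0), (0, 1), (10, 10)]
obs_residuals = [2.0, 4.0, 3.0, -20.0]

estimate = localized_prediction(100.0, (0.2, 0.2), obs_xy, obs_residuals, k=3)
print(estimate)
```

With k = 3 the distant observation is ignored and the global prediction is shifted by the local mean residual. Kriging refines this by weighting each neighbour according to the modelled spatial correlation rather than equally.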
Abstract:
The amount and location of dead wood are of interest not only for habitat diversity but also for the storage of atmospheric carbon. The aim of the study was to develop an area-based model using laser scanning data to locate dead wood sites and estimate the amount of dead wood. The study also examined how the explanatory power of the model changed as the size of the modelled grid cell was increased. The study area was located in Sonkajärvi, eastern Finland, and consisted mainly of young managed commercial forests. The study used low-pulse-density laser scanning data and field data on dead woody material measured along strips. The data were split so that one quarter was used for modelling and the rest was reserved for testing the finished models. Both a parametric and a non-parametric method were used to model dead wood. Logistic regression was used to predict the probability of dead wood occurrence for grid cells of different sizes (0.04, 0.20, 0.32, 0.52 and 1.00 ha). The explanatory variables of the models were selected from among 80 laser features and their transformations. The explanatory variables were selected in three stages. First, the variables were examined visually by plotting them against the amount of dead wood. In the second stage, the explanatory power of the variables judged most suitable in the first stage was tested with single-variable models. In the final multivariable model, the criterion for explanatory variables was statistical significance at the 5% risk level. The model created for the 0.20 ha cell size was re-parameterised for the other cell sizes. In addition to the parametric modelling with logistic regression, the data for the 0.04 and 1.0 ha cell sizes were classified using non-parametric CART (Classification and Regression Trees) modelling. The CART method was used to search the data for hard-to-detect nonlinear dependencies between the laser features and the amount of dead wood.
CART classification was carried out with respect to both dead wood presence and dead wood volume. CART classification gave better results than logistic regression in classifying cells by dead wood presence. The classification made with the logistic model improved as cell size increased from 0.04 ha (kappa 0.19) up to 0.32 ha (kappa 0.38). At the 0.52 ha cell size the kappa value began to fall (kappa 0.32) and continued to fall up to the one-hectare cell size (kappa 0.26). CART classification improved as cell size increased. The classification results were better than those of logistic modelling at both the 0.04 ha (kappa 0.24) and 1.0 ha (kappa 0.52) cell sizes. The relative RMSE of the cell-level dead wood volumes determined with the CART models decreased as cell size increased. At the 0.04 ha cell size the relative RMSE of the amount of dead wood over the whole dataset was 197.1%, whereas at the one-hectare cell size the corresponding figure was 120.3%. On the basis of these results, the relationship between the field-measured amount of dead wood and the laser features used in the study is very weak at a small cell size but strengthens slightly as cell size increases. However, as the cell size used in modelling grows, detecting small-area concentrations of dead wood becomes more difficult. In this study, the presence of dead wood could be mapped reasonably well at a large cell size, but mapping small-area sites did not succeed with the methods used. Locating small-area sites by laser scanning requires further research, particularly on the use of high-pulse-density laser data in dead wood inventories.
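For readers unfamiliar with the two accuracy measures reported in this abstract, the sketch below computes Cohen's kappa from a confusion matrix and a relative RMSE. It assumes that relative RMSE means the RMSE expressed as a percentage of the mean observed value, which is a common convention but is not confirmed by the abstract. The confusion matrix and volumes are invented for illustration and are unrelated to the study's data.

```python
import math


def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: observed,
    columns: predicted)."""
    size = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(size)) / n
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / n ** 2
    return (observed - expected) / (1 - expected)


def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100 * math.sqrt(mse) / (sum(observed) / len(observed))


# Invented example: cells with/without dead wood, observed vs. predicted.
kappa = cohen_kappa([[40, 10], [20, 30]])
rel_rmse = relative_rmse([2.0, 4.0, 6.0], [3.0, 4.0, 5.0])
print(round(kappa, 2), round(rel_rmse, 1))
```

Kappa discounts the agreement expected by chance, so a classifier with 70% raw agreement in this example scores only 0.4. Under the assumed definition, a relative RMSE above 100% means the typical error exceeds the mean observed value.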
Mapping reef fish and the seascape: using acoustics and spatial modeling to guide coastal management
Abstract:
Reef fish distributions are patchy in time and space, with some coral reef habitats supporting higher densities (i.e., aggregations) of fish than others. Identifying and quantifying fish aggregations (particularly during spawning events) are often top priorities for coastal managers. However, the rapid mapping of these aggregations using conventional survey methods (e.g., non-technical SCUBA diving and remotely operated cameras) is limited by depth, visibility and time. Acoustic sensors (i.e., splitbeam and multibeam echosounders) are not constrained by these same limitations, and were used to concurrently map and quantify the location, density and size of reef fish along with seafloor structure in two separate locations in the U.S. Virgin Islands. Reef fish aggregations were documented along the shelf edge, an ecologically important ecotone in the region. Fish were grouped into three classes according to body size, and relationships with the benthic seascape were modeled in one area using Boosted Regression Trees. These models were validated in a second area to test their predictive performance in locations where fish have not been mapped. Models predicting the density of large fish (≥29 cm) performed well (i.e., AUC = 0.77). Water depth and standard deviation of depth were the most influential predictors at two spatial scales (100 and 300 m). Models of small (≤11 cm) and medium (12–28 cm) fish performed poorly (i.e., AUC = 0.49 to 0.68) due to the high prevalence (45–79%) of smaller fish in both locations, and the unequal prevalence of smaller fish in the training and validation areas. Integrating acoustic sensors with spatial modeling offers a new and reliable approach to rapidly identify fish aggregations and to predict the density of large fish in un-surveyed locations. This integrative approach will help coastal managers to prioritize sites, and focus their limited resources on areas that may be of higher conservation value.
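The AUC values quoted in this abstract can be read as the probability that a randomly chosen presence site receives a higher model score than a randomly chosen absence site. A minimal sketch of that pairwise definition follows; the labels and scores are invented and the function name is hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison: the share of
    (presence, absence) pairs in which the presence scores higher,
    counting ties as half."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented validation cells: 1 = large fish present, 0 = absent.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
value = auc(labels, scores)
print(round(value, 3))
```

A value of 0.5 corresponds to ranking no better than chance, which is why the AUC of 0.49 reported for one of the smaller size classes indicates essentially no discriminating ability.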