97 resultados para Boosted regression trees
Resumo:
This study examined the distribution of major mosquito species and their roles in the transmission of Ross River virus (RRV) infection for coastline and inland areas in Brisbane, Australia (27°28′ S, 153°2′ E). We obtained data on the monthly counts of RRV cases in Brisbane between November 1998 and December 2001 by statistical local areas from the Queensland Department of Health and the monthly mosquito abundance from the Brisbane City Council. Correlation analysis was used to assess the pairwise relationships between mosquito density and the incidence of RRV disease. This study showed that the mosquito abundance of Aedes vigilax (Skuse), Culex annulirostris (Skuse), and Aedes vittiger (Skuse) were significantly associated with the monthly incidence of RRV in the coastline area, whereas Aedes vigilax, Culex annulirostris, and Aedes notoscriptus (Skuse) were significantly associated with the monthly incidence of RRV in the inland area. The results of the classification and regression tree (CART) analysis show that both occurrence and incidence of RRV were influenced by interactions between species in both coastal and inland regions. We found that there was an 89% chance for an occurrence of RRV if the abundance of Ae. vigifax was between 64 and 90 in the coastline region. There was an 80% chance for an occurrence of RRV if the density of Cx. annulirostris was between 53 and 74 in the inland area. The results of this study may have applications as a decision support tool in planning disease control of RRV and other mosquito-borne diseases.
Resumo:
Road crashes cost world and Australian society a significant proportion of GDP, affecting productivity and causing significant suffering for communities and individuals. This paper presents a case study that generates data mining models that contribute to understanding of road crashes by allowing examination of the role of skid resistance (F60) and other road attributes in road crashes. Predictive data mining algorithms, primarily regression trees, were used to produce road segment crash count models from the road and traffic attributes of crash scenarios. The rules derived from the regression trees provide evidence of the significance of road attributes in contributing to crash, with a focus on the evaluation of skid resistance.
Resumo:
Protocols for bioassessment often relate changes in summary metrics that describe aspects of biotic assemblage structure and function to environmental stress. Biotic assessment using multimetric indices now forms the basis for setting regulatory standards for stream quality and a range of other goals related to water resource management in the USA and elsewhere. Biotic metrics are typically interpreted with reference to the expected natural state to evaluate whether a site is degraded. It is critical that natural variation in biotic metrics along environmental gradients is adequately accounted for, in order to quantify human disturbance-induced change. A common approach used in the IBI is to examine scatter plots of variation in a given metric along a single stream size surrogate and a fit a line (drawn by eye) to form the upper bound, and hence define the maximum likely value of a given metric in a site of a given environmental characteristic (termed the 'maximum species richness line' - MSRL). In this paper we examine whether the use of a single environmental descriptor and the MSRL is appropriate for defining the reference condition for a biotic metric (fish species richness) and for detecting human disturbance gradients in rivers of south-eastern Queensland, Australia. We compare the accuracy and precision of the MSRL approach based on single environmental predictors, with three regression-based prediction methods (Simple Linear Regression, Generalised Linear Modelling and Regression Tree modelling) that use (either singly or in combination) a set of landscape and local scale environmental variables as predictors of species richness. We compared the frequency of classification errors from each method against set biocriteria and contrast the ability of each method to accurately reflect human disturbance gradients at a large set of test sites. The results of this study suggest that the MSRL based upon variation in a single environmental descriptor could not accurately predict species richness at minimally disturbed sites when compared with SLR's based on equivalent environmental variables. Regression-based modelling incorporating multiple environmental variables as predictors more accurately explained natural variation in species richness than did simple models using single environmental predictors. Prediction error arising from the MSRL was substantially higher than for the regression methods and led to an increased frequency of Type I errors (incorrectly classing a site as disturbed). We suggest that problems with the MSRL arise from the inherent scoring procedure used and that it is limited to predicting variation in the dependent variable along a single environmental gradient.
Resumo:
This paper presents a fault diagnosis method based on adaptive neuro-fuzzy inference system (ANFIS) in combination with decision trees. Classification and regression tree (CART) which is one of the decision tree methods is used as a feature selection procedure to select pertinent features from data set. The crisp rules obtained from the decision tree are then converted to fuzzy if-then rules that are employed to identify the structure of ANFIS classifier. The hybrid of back-propagation and least squares algorithm are utilized to tune the parameters of the membership functions. In order to evaluate the proposed algorithm, the data sets obtained from vibration signals and current signals of the induction motors are used. The results indicate that the CART–ANFIS model has potential for fault diagnosis of induction motors.
Resumo:
Expert elicitation is the process of retrieving and quantifying expert knowledge in a particular domain. Such information is of particular value when the empirical data is expensive, limited, or unreliable. This paper describes a new software tool, called Elicitator, which assists in quantifying expert knowledge in a form suitable for use as a prior model in Bayesian regression. Potential environmental domains for applying this elicitation tool include habitat modeling, assessing detectability or eradication, ecological condition assessments, risk analysis, and quantifying inputs to complex models of ecological processes. The tool has been developed to be user-friendly, extensible, and facilitate consistent and repeatable elicitation of expert knowledge across these various domains. We demonstrate its application to elicitation for logistic regression in a geographically based ecological context. The underlying statistical methodology is also novel, utilizing an indirect elicitation approach to target expert knowledge on a case-by-case basis. For several elicitation sites (or cases), experts are asked simply to quantify their estimated ecological response (e.g. probability of presence), and its range of plausible values, after inspecting (habitat) covariates via GIS.
Resumo:
Numerous expert elicitation methods have been suggested for generalised linear models (GLMs). This paper compares three relatively new approaches to eliciting expert knowledge in a form suitable for Bayesian logistic regression. These methods were trialled on two experts in order to model the habitat suitability of the threatened Australian brush-tailed rock-wallaby (Petrogale penicillata). The first elicitation approach is a geographically assisted indirect predictive method with a geographic information system (GIS) interface. The second approach is a predictive indirect method which uses an interactive graphical tool. The third method uses a questionnaire to elicit expert knowledge directly about the impact of a habitat variable on the response. Two variables (slope and aspect) are used to examine prior and posterior distributions of the three methods. The results indicate that there are some similarities and dissimilarities between the expert informed priors of the two experts formulated from the different approaches. The choice of elicitation method depends on the statistical knowledge of the expert, their mapping skills, time constraints, accessibility to experts and funding available. This trial reveals that expert knowledge can be important when modelling rare event data, such as threatened species, because experts can provide additional information that may not be represented in the dataset. However care must be taken with the way in which this information is elicited and formulated.
Resumo:
The work was both conceived and constructed in-situ within Gnombup Swamp a seasonal water body at Bremer Bay, Western Australia. The work interacts with site-specific conditions including wind patterns and a datum of seasonal water levels marks. The work is the result of collaboration between soil scientist Paula Deegan and Ian Weir. The installation was documented with a series of 30 still digital photographs, later animated in Microsoft Powerpoint.