27 results for Atheoretical regression trees
in University of Queensland eSpace - Australia
Abstract:
This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long-standing idea of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, a relatively new idea that allows individual assessment of predictions. The integrated framework we present comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two-step modelling process, where the use of non-parametric methods such as decision trees and generalized additive models is promoted to identify important variables and their modelling relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. This paper is motivated by a medical problem where interest focuses on developing a risk stratification system for morbidity of 1,710 cardiac patients given a suite of demographic, clinical and preoperative variables. Although the methods we use are applied specifically to this case study, they can be applied across any field, irrespective of the type of response.
Abstract:
Areas of the landscape that are priorities for conservation should be those that are both vulnerable to threatening processes and that, if lost or degraded, will result in conservation targets being compromised. While much attention is directed towards understanding the patterns of biodiversity, much less is given to determining the areas of the landscape most vulnerable to threats. We assessed the relative vulnerability of remaining areas of native forest to conversion to plantations in the ecologically significant temperate rainforest region of south central Chile. The area of the study region is 4.2 million ha and the extent of plantations is approximately 200000 ha. First, the spatial distribution of native forest conversion to plantations was determined. The variables related to the spatial distribution of this threatening process were identified through the development of a classification tree and the generation of a multivariate, spatially explicit, statistical model. The model of native forest conversion explained 43% of the deviance and the discrimination ability of the model was high. Predictions were made of where native forest conversion is likely to occur in the future. Due to patterns of climate, topography, soils and proximity to infrastructure and towns, remaining forest areas differ in their relative risk of being converted to plantations. Another factor that may increase the vulnerability of remaining native forest in a subset of the study region is the proposed construction of a highway. We found that 90% of the area of existing plantations within this region is within 2.5 km of roads. When the predictions of native forest conversion were recalculated accounting for the construction of this highway, it was found that approximately 27000 ha of native forest had an increased probability of conversion. The areas of native forest identified to be vulnerable to conversion are outside of the existing reserve network. (C) 2004 Elsevier Ltd.
All rights reserved.
Abstract:
Traditional vegetation mapping methods use high cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns with transitional gradients from one vegetation community to another. Arbitrary, though often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of Northeastern Australia. The paper presents the full cycle of this vegetation modelling approach including sampling sites, variable selection, model selection, model implementation, internal model assessment, model prediction assessments, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, independent data set model validation and assessment of the scale of model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly-studied areas. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
Most of the modern developments with classification trees are aimed at improving their predictive capacity. This article considers a curiously neglected aspect of classification trees, namely the reliability of predictions that come from a given classification tree. In the sense that a node of a tree represents a point in the predictor space in the limit, the aim of this article is the development of localized assessment of the reliability of prediction rules. A classification tree may be used either to provide a probability forecast, where for each node the membership probabilities for each class constitute the prediction, or a true classification where each new observation is predictively assigned to a unique class. Correspondingly, two types of reliability measure will be derived, namely prediction reliability and classification reliability. We use bootstrapping methods as the main tool to construct these measures. We also provide a suite of graphical displays by which they may be easily appreciated. In addition to providing some estimate of the reliability of specific forecasts of each type, these measures can also be used to guide future data collection to improve the effectiveness of the tree model. The motivating example we give has a binary response, namely the presence or absence of a species of Eucalypt, Eucalyptus cloeziana, at a given sampling location in response to a suite of environmental covariates (although the methods are not restricted to binary response data).
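The bootstrap idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the one-split tree (`fit_stump`), the agreement and spread summaries, and the presence/absence data are all assumptions made for illustration. For a new location, the sketch refits the tree on bootstrap resamples and reports how often the resampled trees agree with the original class (a stand-in for classification reliability) and how much the predicted probability varies (a stand-in for prediction reliability).

```python
import random
from statistics import pstdev


def fit_stump(xs, ys):
    """Fit a one-split classification tree on a single predictor.

    Returns (cut, p_left, p_right), where p_left and p_right are the
    proportions of class 1 in the two leaf nodes.
    """
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= cut]
        right = [y for x, y in zip(xs, ys) if x > cut]
        p_l = sum(left) / len(left)
        p_r = sum(right) / len(right)
        # Size-weighted impurity, proportional to Gini (lower is better).
        impurity = len(left) * p_l * (1 - p_l) + len(right) * p_r * (1 - p_r)
        if best is None or impurity < best[0]:
            best = (impurity, cut, p_l, p_r)
    if best is None:  # only one distinct x value: no split is possible
        p = sum(ys) / len(ys)
        return values[0], p, p
    return best[1:]


def predict_prob(stump, x):
    """Probability of class 1 for x: the class-1 proportion in its leaf."""
    cut, p_left, p_right = stump
    return p_left if x <= cut else p_right


def bootstrap_reliability(xs, ys, x_new, n_boot=200, seed=0):
    """Return (agreement, spread) for the prediction at x_new.

    agreement: share of bootstrap trees assigning the same class as the
               tree fitted to the full data.
    spread:    standard deviation of the bootstrap class-1 probabilities.
    """
    rng = random.Random(seed)
    base_class = predict_prob(fit_stump(xs, ys), x_new) >= 0.5
    n = len(xs)
    probs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        probs.append(predict_prob(stump, x_new))
    agreement = sum((p >= 0.5) == base_class for p in probs) / n_boot
    return agreement, pstdev(probs)


# Hypothetical data: one environmental covariate, 1 = species present.
xs = list(range(1, 13))
ys = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
for x_new in (1, 6.5):
    print(x_new, bootstrap_reliability(xs, ys, x_new))
```

Locations where agreement is low or spread is large are candidates for further sampling, which matches the abstract's remark that such measures can guide future data collection.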
Abstract:
Risk assessment systems for introduced species are being developed and applied globally, but methods for rigorously evaluating them are still in their infancy. We explore classification and regression tree models as an alternative to the current Australian Weed Risk Assessment system, and demonstrate how the performance of screening tests for unwanted alien species may be quantitatively compared using receiver operating characteristic (ROC) curve analysis. The optimal classification tree model for predicting weediness included just four out of a possible 44 attributes of introduced plants examined, namely: (i) intentional human dispersal of propagules; (ii) evidence of naturalization beyond native range; (iii) evidence of being a weed elsewhere; and (iv) a high level of domestication. Intentional human dispersal of propagules in combination with evidence of naturalization beyond a plant's native range led to the strongest prediction of weediness. A high level of domestication in combination with no evidence of naturalization mitigated the likelihood of an introduced plant becoming a weed resulting from intentional human dispersal of propagules. Unlikely intentional human dispersal of propagules combined with no evidence of being a weed elsewhere led to the lowest predicted probability of weediness. The failure to include intrinsic plant attributes in the model suggests that either these attributes are not useful general predictors of weediness, or data and analysis were inadequate to elucidate the underlying relationship(s). This concurs with the historical pessimism that we will ever be able to accurately predict invasive plants. Given the apparent importance of propagule pressure (the number of individuals of a species released), future attempts at evaluating screening model performance for identifying unwanted plants need to account for propagule pressure when collating and/or analysing datasets.
The classification tree had a cross-validated sensitivity of 93.6% and specificity of 36.7%. Based on the area under the ROC curve, the performance of the classification tree in correctly classifying plants as weeds or non-weeds was slightly inferior (Area under ROC curve = 0.83 +/- 0.021 (+/- SE)) to that of the current risk assessment system in use (Area under ROC curve = 0.89 +/- 0.018 (+/- SE)), although it requires many fewer questions to be answered.
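The quantities used to compare the two screening systems (sensitivity, specificity and area under the ROC curve) can be computed as in the sketch below. The labels, scores and the 0.5 threshold are hypothetical and are not taken from the study; the area is computed from its rank interpretation (the probability that a randomly chosen weed scores higher than a randomly chosen non-weed), which is one standard way to obtain it.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold is called a weed."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (weed, non-weed) pairs, how often the weed has the
    higher score; ties count one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = weed, 0 = non-weed; scores are predicted weediness.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(sens, spec, auc(labels, scores))  # prints 0.75 0.75 0.9375
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarises performance across all thresholds, which is why it suits a comparison between two different screening systems.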
Abstract:
This pilot project at Cotton Tree, Maroochydore, on two adjacent, linear parcels of land has one of the properties privately owned while the other is owned by the public housing authority. Both owners commissioned Lindsay and Kerry Clare to design housing for their separate needs which enabled the two projects to be governed by a single planning and design strategy. This entailed the realignment of the dividing boundary to form two approximately square blocks which made possible the retention of an important stand of mature paperbark trees and gave each block a more useful street frontage. The scheme provides seven two-bedroom units and one single-bedroom unit as the private component, with six single-bedroom units, three two-bedroom units and two three-bedroom units forming the public housing. The dwellings are deployed as an interlaced mat of freestanding blocks, car courts, courtyard gardens, patios and decks. The key distinction between the public and private parts of the scheme is the pooling of the car parking spaces in the public housing to create a shared courtyard. The housing climbs to three storeys on its southern edge and falls to a single storey on the north-western corner. This enables all units and the principal private outdoor spaces to have a northern orientation. The interiors of both the public and private units are skilfully arranged to take full advantage of views, light and breeze.
Abstract:
This study examined the relationship between isokinetic hip extensor/hip flexor strength, 1-RM squat strength, and sprint running performance for both a sprint-trained and non-sprint-trained group. Eleven male sprinters and 8 male controls volunteered for the study. On the same day subjects ran 20-m sprints from both a stationary start and with a 50-m acceleration distance, completed isokinetic hip extension/flexion exercises at 1.05, 4.74, and 8.42 rad·s⁻¹, and had their squat strength estimated. Stepwise multiple regression analysis showed that equations for predicting both 20-m maximum velocity run time and 20-m acceleration time may be calculated with an error of less than 0.05 s using only isokinetic and squat strength data. However, a single regression equation for predicting both 20-m acceleration and maximum velocity run times from isokinetic or squat tests was not found. The regression analysis indicated that hip flexor strength at all test velocities was a better predictor of sprint running performance than hip extensor strength.
Abstract:
Xylem sap from woody species in the wet/dry tropics of northern Australia was analyzed for N compounds. At the peak of the dry season, arginine was the main N compound in sap of most species of woodlands and deciduous monsoon forest. In the wet season, a marked change occurred with amides becoming the main sap N constituents of most species. Species from an evergreen monsoon forest, with a permanent water source, transported amides in the dry season. In the dry season, nitrate accounted for 7 and 12% of total xylem sap N in species of deciduous and evergreen monsoon forests, respectively. In the wet season, the proportion of N present as nitrate increased to 22% in deciduous monsoon forest species. These results suggest that N is taken up and assimilated mainly in the wet season and that this newly assimilated N is mostly transported as amide-N (woodland species, monsoon forest species) and nitrate (monsoon forest species). Arginine is the form in which stored N is remobilized and transported by woodland and deciduous monsoon forest species in the dry season. Several proteins, which may represent bark storage proteins, were detected in inner bark tissue from a range of trees in the dry season, indicating that, although N uptake appears to be limited in the dry season, the many tree and shrub species that produce flowers, fruit or leaves in the dry season use stored N to support growth. Nitrogen characteristics of the studied species are discussed in relation to the tropical environment.
Abstract:
A significant problem in the collection of responses to potentially sensitive questions, such as those relating to illegal, immoral or embarrassing activities, is non-sampling error due to refusal to respond or false responses. Eichhorn & Hayre (1983) suggested the use of scrambled responses to reduce this form of bias. This paper considers a linear regression model in which the dependent variable is unobserved but for which the sum or product with a scrambling random variable of known distribution is known. The performance of two likelihood-based estimators is investigated, namely a Bayesian estimator achieved through a Markov chain Monte Carlo (MCMC) sampling scheme, and a classical maximum-likelihood estimator. These two estimators and an estimator suggested by Singh, Joarder & King (1996) are compared. Monte Carlo results show that the Bayesian estimator outperforms the classical estimators in almost all cases, and the relative performance of the Bayesian estimator improves as the responses become more scrambled.
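To make the multiplicative case of the scrambled-response setup concrete, the following is a hedged sketch rather than a statement of the paper's estimators. The notation is assumed for illustration: y_i is the unobserved response, z_i the reported scrambled value, s_i the scrambling variable with known mean θ ≠ 0, and s_i is taken to be independent of the covariates and the regression error. Under those assumptions a simple moment-based estimator follows from rescaling the reported values; the likelihood-based estimators studied in the paper use the full known distribution of s_i rather than only its mean.

```latex
\begin{align}
  y_i &= \mathbf{x}_i^\top \boldsymbol{\beta} + \varepsilon_i,
      \qquad \mathbb{E}[\varepsilon_i \mid \mathbf{x}_i] = 0, \\
  z_i &= y_i \, s_i,
      \qquad \mathbb{E}[s_i] = \theta \ \text{(known)},
      \quad s_i \ \text{independent of}\ (\mathbf{x}_i, \varepsilon_i), \\
  \mathbb{E}[z_i \mid \mathbf{x}_i]
      &= \mathbb{E}[s_i]\,\mathbb{E}[y_i \mid \mathbf{x}_i]
       = \theta \, \mathbf{x}_i^\top \boldsymbol{\beta}, \\
  \hat{\boldsymbol{\beta}}_{\mathrm{mom}}
      &= (X^\top X)^{-1} X^\top \left( \mathbf{z} / \theta \right).
\end{align}
```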
Abstract:
The degree and distribution of parasitisation in relation to densities of pink wax scale, Ceroplastes rubens Maskell, on umbrella trees, Schefflera actinophylla (Endl.), in south-eastern Queensland were investigated to determine whether scale outbreaks could be attributed, in part, to low levels of parasitisation. Rates of parasitisation were independent of or inversely dependent on host density, and highly variable, especially at low densities. The absence of density dependent parasitisation may occur as a result of: (i) non-aggregation by parasitoids; (ii) aggregation by parasitoids where parasitisation is limited by intrinsic or extrinsic factors; and/or (iii) high rates of hyperparasitisation.
Abstract:
This paper describes the construction of Australia-wide soil property predictions from a compiled national soils point database. Those properties considered include pH, organic carbon, total phosphorus, total nitrogen, thickness, texture, and clay content. Many of these soil properties are used directly in environmental process modelling including global climate change models. Models are constructed at the 250-m resolution using decision trees. These relate the soil property to the environment through a suite of environmental predictors at the locations where measurements are observed. These models are then used to extend predictions to the continental extent by applying the rules derived to the exhaustively available environmental predictors. The methodology and performance are described in detail for pH and summarized for other properties. Environmental variables are found to be important predictors, even at the 250-m resolution at which they are available here, as they can describe the broad changes in soil property.
Abstract:
The spatial pattern of outbreaks of pink wax scale, Ceroplastes rubens Maskell, within and among umbrella trees, Schefflera actinophylla (Endl.), in southeastern Queensland was investigated. Pink wax scale was common on S. actinophylla, with approximately 84% of trees positive for scale and 14% of trees recording outbreak densities exceeding 0.4 adults per leaflet. Highly aggregated distributions of C. rubens occur within and among umbrella trees. Clumped distributions within trees appear to result from variable birth and death rates and limited movement of first instar crawlers. The patchy distribution of pink wax scale among trees is probably a consequence of variation in dispersal success of scale, host and environmental suitability for establishment and rates of biological control. Pink wax scale was more prevalent on trees in roadside positions and in exposed situations, indicating that such trees are more suitable and/or susceptible to scale colonisation.
Abstract:
1. Chrysophtharta bimaculata is a native chrysomelid species that can cause chronic defoliation of plantation and regrowth Eucalyptus forests in Tasmania, Australia. Knowledge of the dispersion pattern of C. bimaculata was needed in order to assess the efficiency of an integrated pest management (IPM) programme currently used for its control. 2. Using data from yellow flight traps, local populations of C. bimaculata adults were monitored over a season at spatial scales relevant to commercial forestry: within a 50-ha operational management unit (a forestry 'coupe') and between coupes. In addition, oviposition was monitored over a season at a subset of the between-coupe sites. 3. Dispersion indices (Taylor's Power Law and Iwao's Mean Crowding regression method) demonstrated that C. bimaculata adults were spatially aggregated within and between coupes, although the number of egg-batches laid at the between-coupe scale was uniform. Spatial autocorrelation analysis showed that trap-catches at the within-coupe level were similar (positively autocorrelated) to a radius distance of approximately 110 m, and then dissimilar (negatively autocorrelated) at approximately 250 m. At the between-coupe scale, no repeatable spatial autocorrelation patterns were observed. 4. For any individual site, rapid changes in beetle density were observed to be associated with loosely aggregated flights of beetles into and out of that site. Peak adult catches (> the weekly mean plus standard deviation trap-catch) for a site occurred for a period of 2.0 +/- 0.22 weeks at a time (n = 37), with normally only one or two peaks per site per season. Peak oviposition events for a site occurred on average 1.4 +/- 0.11 times per season and lasted 1.5 +/- 0.12 weeks. 5. Analysis of an extensive data set (n = 417) demonstrated that adult abundance at a site was positively correlated with egg density, but negatively correlated with tree damage (caused by conspecifics) and the presence of conspecific larvae. 
There was no relationship between adult abundance and a visual estimate of the amount of young foliage on trees. 6. Adults of C. bimaculata are shown to occur in relatively small, mobile aggregations. This means that pest surveys must be both regular (less than 2 weeks apart) and intensive (with sampling points no more than 150 m apart) if beetle populations are to be monitored with confidence. Further refinement of the current IPM strategy must recognize the problems posed by this temporal and spatial patchiness, particularly with regard to the use of biological insecticides, such as Bacillus thuringiensis, for which only a very short operational window exists.
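One of the dispersion indices named in this abstract, Taylor's Power Law, relates the variance of counts to their mean as variance = a × mean^b. The exponent b is usually estimated by regressing log variance on log mean across sets of samples, with b > 1 read as aggregation, b near 1 as a random pattern and b < 1 as a uniform pattern. The sketch below assumes that standard log-log fit; the trap catches are hypothetical and the function name is illustrative, not taken from the study.

```python
import math
from statistics import mean, variance


def taylor_power_law(samples):
    """Fit log(variance) = log(a) + b * log(mean) by least squares.

    `samples` is a list of count lists (e.g. trap catches per sampling
    occasion); each list needs a positive mean and a positive variance.
    Returns (a, b).
    """
    xs = [math.log(mean(s)) for s in samples]
    ys = [math.log(variance(s)) for s in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope: the aggregation exponent
    a = math.exp(y_bar - b * x_bar)    # intercept back-transformed from logs
    return a, b


# Hypothetical trap catches on four sampling occasions, six traps each.
catches = [
    [0, 1, 0, 5, 0, 2],
    [3, 0, 12, 1, 0, 8],
    [10, 2, 40, 0, 5, 27],
    [30, 4, 95, 1, 12, 62],
]
a, b = taylor_power_law(catches)
print(f"a = {a:.2f}, b = {b:.2f}")
if b > 1:
    print("variance rises faster than the mean: counts look aggregated")
```

An exponent well above 1, as these made-up counts give, is the kind of result the abstract summarises as adults being spatially aggregated.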
Abstract:
We investigated some of the factors that may lead to outbreaks of pink wax scale, Ceroplastes rubens Maskell, on umbrella trees, Schefflera actinophylla (Endl.). Estimates of birth and death rates of pink wax scale were high and variable within and among trees; variation in these rates was not related to scale density. Adult fecundity correlated significantly but weakly with adult test length; mean fecundity was 292 eggs per female with a range of 5-1178. Adult test length and its variance decreased weakly with increasing density. Field experiments showed that mortality of C. rubens is greatest during the first 24 hours after hatching when approximately half disappear. The rate of loss decreases over time with 0.3% of initial motile first-instar nymphs surviving to maturity. Rates of loss varied significantly between trees, indicating that some trees are more suitable for scale colonisation and survival.