126 results for species distribution modelling
in Queensland University of Technology - ePrints Archive
Abstract:
1. Species' distribution modelling relies on adequate data sets to build reliable statistical models with high predictive ability. However, the money spent collecting empirical data might be better spent on management. A less expensive source of species' distribution information is expert opinion. This study evaluates expert knowledge and its source. In particular, we determine whether models built on expert knowledge apply over multiple regions or only within the region where the knowledge was derived. 2. The case study focuses on the distribution of the brush-tailed rock-wallaby Petrogale penicillata in eastern Australia. We brought together from two biogeographically different regions substantial and well-designed field data and knowledge from nine experts. We used a novel elicitation tool within a geographical information system to systematically collect expert opinions. The tool utilized an indirect approach to elicitation, asking experts simpler questions about observable rather than abstract quantities, with measures in place to identify uncertainty and offer feedback. Bayesian analysis was used to combine field data and expert knowledge in each region to determine: (i) how expert opinion affected models based on field data and (ii) how similar expert-informed models were within regions and across regions. 3. The elicitation tool effectively captured the experts' opinions and their uncertainties. Experts were comfortable with the map-based elicitation approach used, especially with graphical feedback. Experts tended to predict lower values of species occurrence compared with field data. 4. Across experts, consensus on effect sizes occurred for several habitat variables. Expert opinion generally influenced predictions from field data. However, south-east Queensland and north-east New South Wales experts had different opinions on the influence of elevation and geology, with these differences attributable to geological differences between these regions. 5. 
Synthesis and applications. When formulated as priors in Bayesian analysis, expert opinion is useful for modifying or strengthening patterns exhibited by empirical data sets that are limited in size or scope. Nevertheless, the ability of an expert to extrapolate beyond their region of knowledge may be poor. Hence there is significant merit in obtaining information from local experts when compiling species' distribution models across several regions.
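As a rough illustration of how expert opinion, formulated as a prior, can modify a pattern seen in a small field data set, the sketch below uses a conjugate Beta-Binomial update for the probability of occurrence in a single habitat class. This is a deliberate simplification and not the study's elicitation tool or regression model; the function name, the prior pseudo-counts and the survey counts are all hypothetical.

```python
def posterior_occupancy(prior_a, prior_b, presences, absences):
    """Update a Beta(prior_a, prior_b) expert prior with survey counts.

    Returns the posterior Beta parameters and the posterior mean
    probability of occurrence.
    """
    a = prior_a + presences
    b = prior_b + absences
    return a, b, a / (a + b)


# Hypothetical expert prior: Beta(2, 8), i.e. a prior mean occurrence of 0.2.
# Hypothetical field data: the species was found at 6 of 10 surveyed sites.
a, b, mean = posterior_occupancy(2.0, 8.0, presences=6, absences=4)
print(a, b, mean)
```

With only ten sites, the posterior mean sits between the expert's prior mean and the raw field proportion; as the number of surveyed sites grows, the field data dominate.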
Abstract:
Species distribution models (SDMs) are considered to exemplify pattern-based rather than process-based models of a species' response to its environment. Hence, when used to map species distribution, the purpose of SDMs can be viewed as interpolation, since species response is measured at a few sites in the study region, and the aim is to interpolate species response at intermediate sites. Increasingly, however, SDMs are also being used to extrapolate species-environment relationships beyond the limits of the study region as represented by the training data. Regardless of whether SDMs are to be used for interpolation or extrapolation, the debate over how to implement SDMs focusses on evaluating the quality of the SDM, both ecologically and mathematically. This paper proposes a framework that includes useful tools previously employed to address uncertainty in habitat modelling. Together with existing frameworks for addressing uncertainty more generally when modelling, we then outline how these existing tools help inform development of a broader framework for addressing uncertainty, specifically when building habitat models. As discussed earlier we focus on extrapolation rather than interpolation, where the emphasis on predictive performance is diluted by the concerns for robustness and ecological relevance. We are cognisant of the dangers of excessively propagating uncertainty. Thus, although the framework provides a smorgasbord of approaches, it is intended that the exact menu selected for a particular application is small in size and targets the most important sources of uncertainty. We conclude with some guidance on a strategic approach to identifying these important sources of uncertainty. Whilst various aspects of uncertainty in SDMs have previously been addressed, either as the main aim of a study or as a necessary element of constructing SDMs, this is the first paper to provide a more holistic view.
Abstract:
Psittacine beak and feather disease (PBFD) has a broad host range and is widespread in wild and captive psittacine populations in Asia, Africa, the Americas, Europe and Australasia. Beak and feather disease circovirus (BFDV) is the causative agent. BFDV has an ~2 kb single-stranded circular DNA genome encoding just two proteins (Rep and CP). In this study we provide support for demarcation of BFDV strains by phylogenetic analysis of 65 complete genomes from databases and 22 new BFDV sequences isolated from infected psittacines in South Africa. We propose 94% genome-wide sequence identity as a strain demarcation threshold, with isolates sharing >94% identity belonging to the same strain, and strain subtypes sharing >98% identity. Currently, BFDV diversity falls within 14 strains, with five highly divergent isolates from budgerigars probably representing a new species of circovirus with three strains (budgerigar circovirus; BCV-A, -B and -C). The geographical distribution of BFDV and BCV strains is strongly linked to the international trade in exotic birds; strains with more than one host are generally located in the same geographical area. Lastly, we examined BFDV and BCV sequences for evidence of recombination, and determined that recombination had occurred in most BFDV and BCV strains. We established that there were two globally significant recombination hotspots in the viral genome: the first is along the entire intergenic region and the second is in the C-terminal portion of the CP ORF. The implications of our results for the taxonomy and classification of circoviruses are discussed. © 2011 SGM.
Abstract:
Aim: Determining how ecological processes vary across space is a major focus in ecology. Current methods that investigate such effects remain constrained by important limiting assumptions. Here we provide an extension to geographically weighted regression in which local regression and spatial weighting are used in combination. This method can be used to investigate non-stationarity and spatial-scale effects using any regression technique that can accommodate uneven weighting of observations, including machine learning. Innovation: We extend the use of spatial weights to generalized linear models and boosted regression trees by using simulated data for which the results are known, and compare these local approaches with existing alternatives such as geographically weighted regression (GWR). The spatial weighting procedure (1) explained up to 80% deviance in simulated species richness, (2) optimized the normal distribution of model residuals when applied to generalized linear models versus GWR, and (3) detected nonlinear relationships and interactions between response variables and their predictors when applied to boosted regression trees. Predictor ranking changed with spatial scale, highlighting the scales at which different species–environment relationships need to be considered. Main conclusions: GWR is useful for investigating spatially varying species–environment relationships. However, the use of local weights implemented in alternative modelling techniques can help detect nonlinear relationships and high-order interactions that were previously unassessed. Therefore, this method not only informs us how location and scale influence our perception of patterns and processes, it also offers a way to deal with different ecological interpretations that can emerge as different areas of spatial influence are considered during model fitting.
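The idea of passing spatial weights to a regression that accepts uneven observation weights can be sketched with a very small example. The code below is only an assumed illustration, not the authors' implementation: it uses a Gaussian distance kernel and a weighted simple linear regression, and the function names, bandwidth and simulated sites are hypothetical.

```python
import math


def gaussian_weights(coords, focal, bandwidth):
    """Spatial kernel weights that decay with distance from the focal site."""
    return [math.exp(-0.5 * (math.dist(c, focal) / bandwidth) ** 2) for c in coords]


def weighted_slope(x, y, w):
    """Slope of a weighted least-squares line of y on x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return s_xy / s_xx


# Hypothetical non-stationary data: the response follows y = x at five western
# sites and y = 3x at five eastern sites located about 100 distance units away.
coords = [(float(i), 0.0) for i in range(5)] + [(100.0 + i, 0.0) for i in range(5)]
x = [1.0, 2.0, 3.0, 4.0, 5.0] * 2
y = x[:5] + [3.0 * xi for xi in x[5:]]

west_slope = weighted_slope(x, y, gaussian_weights(coords, (2.0, 0.0), bandwidth=5.0))
east_slope = weighted_slope(x, y, gaussian_weights(coords, (102.0, 0.0), bandwidth=5.0))
global_slope = weighted_slope(x, y, [1.0] * len(x))
print(west_slope, east_slope, global_slope)
```

The global fit averages the two regimes, whereas the locally weighted fits recover a different slope at each focal site. The same weights could in principle be handed to any model that accepts observation weights, which is the property the abstract relies on.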
Abstract:
Species distribution modelling (SDM) typically analyses species’ presence together with some form of absence information. Ideally absences comprise observations or are inferred from comprehensive sampling. When such information is not available, then pseudo-absences are often generated from the background locations within the study region of interest containing the presences, or else absence is implied through the comparison of presences to the whole study region, e.g. as is the case in Maximum Entropy (MaxEnt) or Poisson point process modelling. However, the choice of which absence information to include can be both challenging and highly influential on SDM predictions (e.g. Oksanen and Minchin, 2002). In practice, the use of pseudo- or implied absences often leads to an imbalance where absences far outnumber presences. This leaves analysis highly susceptible to ‘naughty-noughts’: absences that occur beyond the envelope of the species, which can exert strong influence on the model and its predictions (Austin and Meyers, 1996). Also known as ‘excess zeros’, naughty noughts can be estimated via an overall proportion in simple hurdle or mixture models (Martin et al., 2005). However, absences, especially those that occur beyond the species envelope, can often be more diverse than presences. Here we consider an extension to excess zero models. The two-staged approach first exploits the compartmentalisation provided by classification trees (CTs) (as in O’Leary, 2008) to identify multiple sources of naughty noughts and simultaneously delineate several species envelopes. Then SDMs can be fit separately within each envelope, and for this stage, we examine both CTs (as in Falk et al., 2014) and the popular MaxEnt (Elith et al., 2006). We introduce a wider range of model performance measures to improve treatment of naughty noughts in SDM. 
We retain an overall measure of model performance, the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, but focus on its constituent measures of false negative rate (FNR) and false positive rate (FPR), and how these relate to the threshold in the predicted probability of presence that delimits predicted presence from absence. We also propose error rates more relevant to users of predictions: false omission rate (FOR), the chance that a predicted absence corresponds to (and hence wastes) an observed presence, and the false discovery rate (FDR), reflecting those predicted (or potential) presences that correspond to absence. A high FDR may be desirable since it could help target future search efforts, whereas zero or low FOR is desirable since it indicates none of the (often valuable) presences have been ignored in the SDM. For illustration, we chose Bradypus variegatus, a species previously published as an exemplar for MaxEnt by Phillips et al. (2006). We used CTs to increasingly refine the species envelope, starting with the whole study region (E0) and eliminating more and more potential naughty noughts (E1–E3). When combined with an SDM fit within the species envelope, the best CT SDM had similar AUC and FPR to the best MaxEnt SDM, but otherwise performed better. The FNR and FOR were greatly reduced, suggesting that CTs handle absences better. Interestingly, MaxEnt predictions showed low discriminatory performance, with the most common predicted probability of presence being in the same range (0.00–0.20) for both true absences and presences. In summary, this example shows that SDMs can be improved by introducing an initial hurdle to identify naughty noughts and partition the envelope before applying SDMs. This improvement was barely detectable via AUC and FPR yet visible in FOR, FNR, and the comparison of the distributions of predicted probability of presence for presences and absences.
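The four error rates discussed above follow directly from a confusion matrix at a chosen threshold. The sketch below is a minimal illustration with hypothetical observations and predicted probabilities; the function name and the convention that a probability at or above the threshold counts as a predicted presence are assumptions, not details taken from the paper.

```python
def error_rates(observed, probabilities, threshold):
    """Compute FNR, FPR, FOR and FDR for presence (1) / absence (0) data.

    Assumes every denominator is non-zero, i.e. there is at least one
    observed presence, observed absence, predicted presence and
    predicted absence.
    """
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    return {
        "FNR": fn / (fn + tp),  # observed presences predicted absent
        "FPR": fp / (fp + tn),  # observed absences predicted present
        "FOR": fn / (fn + tn),  # predicted absences that were presences
        "FDR": fp / (fp + tp),  # predicted presences that were absences
    }


# Hypothetical data: three presences and five absences.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.6, 0.1, 0.7, 0.2, 0.1, 0.3, 0.05]
rates = error_rates(observed, probabilities, threshold=0.5)
print(rates)
```

FNR and FPR condition on what was observed, while FOR and FDR condition on what was predicted, which is why the latter pair speaks more directly to someone acting on a map of predictions.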
Abstract:
The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques have sought to optimize predictive performance (e.g. Elith et al., 2006). In general, SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models (Nix, 1986) or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables. Examples include generalized linear or additive models with variable selection (Hastie et al., 2002), or classification trees with complexity or model-based pruning (Breiman et al., 1984; Zeileis, 2008). A third approach uses model averaging to summarize the overall contribution of a variable, without considering particular combinations. Examples include neural networks, boosted or bagged regression trees and Maximum Entropy, as compared in Elith et al. (2006). Typically, users of SDMs will either consider a small number of variable sets, via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approaches. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see review in Low Choy et al., 2010). However, few methods have been published for informative variable selection; one example is Bayesian trees (O’Leary, 2008). Here we report an elicitation protocol that helps make explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection described above, Bayesian or otherwise.
We demonstrate how this information can be obtained then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.
Abstract:
Determining the ecologically relevant spatial scales for predicting species occurrences is an important concept when determining species–environment relationships. Therefore species distribution modelling should consider all ecologically relevant spatial scales. While several recent studies have addressed this problem in artificially fragmented landscapes, few studies have researched relevant ecological scales for organisms that also live in naturally fragmented landscapes. This situation is exemplified by the Australian rock-wallabies’ preference for rugged terrain and we addressed the issue of scale using the threatened brush-tailed rock-wallaby (Petrogale penicillata) in eastern Australia. We surveyed for brush-tailed rock-wallabies at 200 sites in southeast Queensland, collecting potentially influential site level and landscape level variables. We applied classification trees at either scale to capture a hierarchy of relationships between the explanatory variables and brush-tailed rock-wallaby presence/absence. Habitat complexity at the site level and geology at the landscape level were the best predictors of where we observed brush-tailed rock-wallabies. Our study showed that the distribution of the species is affected by both site scale and landscape scale factors, reinforcing the need for a multi-scale approach to understanding the relationship between a species and its environment. We demonstrate that careful design of data collection, using coarse scale spatial datasets and finer scale field data, can provide useful information for identifying the ecologically relevant scales for studying species–environment relationships. Our study highlights the need to determine patterns of environmental influence at multiple scales to conserve specialist species such as the brush-tailed rock-wallaby in naturally fragmented landscapes.
Abstract:
In this thesis, the issue of incorporating uncertainty for environmental modelling informed by imagery is explored by considering uncertainty in deterministic modelling, measurement uncertainty and uncertainty in image composition. Incorporating uncertainty in deterministic modelling is extended for use with imagery using the Bayesian melding approach. In the application presented, slope steepness is shown to be the main contributor to total uncertainty in the Revised Universal Soil Loss Equation. A spatial sampling procedure is also proposed to assist in implementing Bayesian melding given the increased data size with models informed by imagery. Measurement error models are another approach to incorporating uncertainty when data is informed by imagery. These models for measurement uncertainty, considered in a Bayesian conditional independence framework, are applied to ecological data generated from imagery. The models are shown to be appropriate and useful in certain situations. Measurement uncertainty is also considered in the context of change detection when two images are not co-registered. An approach for detecting change in two successive images is proposed that is not affected by registration. The procedure uses the Kolmogorov-Smirnov test on homogeneous segments of an image to detect change, with the homogeneous segments determined using a Bayesian mixture model of pixel values. Using the mixture model to segment an image also allows for uncertainty in the composition of an image. This thesis concludes by comparing several different Bayesian image segmentation approaches that allow for uncertainty regarding the allocation of pixels to different ground components. Each segmentation approach is applied to a data set of chlorophyll values and shown to have different benefits and drawbacks depending on the aims of the analysis.
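The change-detection step can be illustrated with a minimal sketch of the two-sample Kolmogorov-Smirnov statistic, which compares the distribution of pixel values in a segment from each image rather than comparing pixels one to one. Only the test statistic is computed here; the p-value calculation and the Bayesian mixture segmentation are not reproduced, and the function name and pixel values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical distribution functions."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The supremum of the gap is attained at one of the observed values.
    for value in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, value) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, value) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical pixel values for one homogeneous segment at two dates.
segment_before = [0.21, 0.25, 0.22, 0.24, 0.23, 0.26]
segment_unchanged = [0.24, 0.21, 0.26, 0.23, 0.22, 0.25]  # same values, reordered
segment_changed = [0.41, 0.45, 0.43, 0.44, 0.42, 0.46]

print(ks_statistic(segment_before, segment_unchanged))
print(ks_statistic(segment_before, segment_changed))
```

Because the statistic depends only on the distribution of values within a segment, it is unaffected by where individual pixels sit, which is consistent with the claim that the procedure is not affected by registration.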
Abstract:
The measurement error model is a well established statistical method for regression problems in medical sciences, although rarely used in ecological studies. While the situations in which it is appropriate may be less common in ecology, there are instances in which there may be benefits in its use for prediction and estimation of parameters of interest. We have chosen to explore this topic using a conditional independence model in a Bayesian framework using a Gibbs sampler, as this gives a great deal of flexibility, allowing us to analyse a number of different models without losing generality. Using simulations and two examples, we show how the conditional independence model can be used in ecology, and when it is appropriate.
Abstract:
Aim: To quantify the consequences of major threats to biodiversity, such as climate and land-use change, it is important to use explicit measures of species persistence, such as extinction risk. The extinction risk of metapopulations can be approximated through simple models, providing a regional snapshot of the extinction probability of a species. We evaluated the extinction risk of three species under different climate change scenarios in three different regions of the Mexican cloud forest, a highly fragmented habitat that is particularly vulnerable to climate change. Location: Cloud forests in Mexico. Methods: Using Maxent, we estimated the potential distribution of cloud forest for three different time horizons (2030, 2050 and 2080) and their overlap with protected areas. Then, we calculated the extinction risk of three contrasting vertebrate species for two scenarios: (1) climate change only (all suitable areas of cloud forest through time) and (2) climate and land-use change (only suitable areas within a currently protected area), using an explicit patch-occupancy approximation model and calculating the joint probability of all populations becoming extinct when the number of remaining patches was less than five. Results: Our results show that the extent of environmentally suitable areas for cloud forest in Mexico will sharply decline in the next 70 years. We discovered that if all habitat outside protected areas is transformed, then only species with small area requirements are likely to persist. With habitat loss through climate change only, high dispersal rates are sufficient for persistence, but this requires protection of all remaining cloud forest areas. Main conclusions: Even if high dispersal rates mitigate the extinction risk of species due to climate change, the synergistic impacts of changing climate and land use further threaten the persistence of species with higher area requirements. 
Our approach for assessing the impacts of threats on biodiversity is particularly useful when there is little time or data for detailed population viability analyses. © 2013 John Wiley & Sons Ltd.
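The joint probability of all populations becoming extinct can be sketched under a strong simplifying assumption that patches go extinct independently, in which case it is the product of the per-patch extinction probabilities. This is not the patch-occupancy approximation used in the study; the function names and the per-patch probabilities are hypothetical.

```python
import math


def joint_extinction_probability(patch_extinction_probs):
    """Probability that every patch goes extinct, assuming independence."""
    return math.prod(patch_extinction_probs)


def persistence_probability(patch_extinction_probs):
    """Probability that at least one patch population survives."""
    return 1.0 - joint_extinction_probability(patch_extinction_probs)


# Hypothetical per-patch extinction probabilities for three remaining patches.
patches = [0.5, 0.4, 0.5]
print(joint_extinction_probability(patches))
print(persistence_probability(patches))
```

Under this assumption, each additional patch multiplies the joint extinction probability by a factor below one, so regional risk rises quickly once few patches remain.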
Abstract:
This study considered the problem of predicting survival, based on three alternative models: a single Weibull, a mixture of Weibulls and a cure model. Instead of the common procedure of choosing a single “best” model, where “best” is defined in terms of goodness of fit to the data, a Bayesian model averaging (BMA) approach was adopted to account for model uncertainty. This was illustrated using a case study in which the aim was the description of lymphoma cancer survival with covariates given by phenotypes and gene expression. The results of this study indicate that if the sample size is sufficiently large, one of the three models emerges as having the highest probability given the data, as indicated by the goodness-of-fit measure, the Bayesian information criterion (BIC). However, when the sample size was reduced, no single model was revealed as “best”, suggesting that a BMA approach would be appropriate. Although a BMA approach can compromise on goodness of fit to the data (when compared to the true model), it can provide robust predictions and facilitate more detailed investigation of the relationships between gene expression and patient survival. Keywords: Bayesian modelling; Bayesian model averaging; Cure model; Markov Chain Monte Carlo; Mixture model; Survival analysis; Weibull distribution
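One common way to turn BIC values into approximate posterior model probabilities, assuming equal prior probabilities for the models, is to weight each model by exp(-BIC/2) and normalise. The sketch below applies that approximation; it is not necessarily the computation used in the study, and the BIC values and survival predictions are hypothetical.

```python
import math


def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values."""
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (bic - best)) for bic in bics]
    total = sum(raw)
    return [r / total for r in raw]


def bma_prediction(predictions, weights):
    """Model-averaged prediction: weighted sum of per-model predictions."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical BIC values for the three candidate models.
bics = {"single Weibull": 100.0, "Weibull mixture": 102.0, "cure model": 110.0}
weights = bma_weights(list(bics.values()))

# Hypothetical predicted survival probabilities at a fixed time, one per model.
survival = [0.60, 0.70, 0.80]
averaged = bma_prediction(survival, weights)
print(dict(zip(bics, weights)), averaged)
```

When one model's BIC is much lower than the others, its weight approaches one and averaging changes little; when the BIC values are close, the weights are spread and the averaged prediction reflects the model uncertainty.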
Abstract:
Intelligible and accurate risk-based decision-making requires a complex balance of information from different sources, appropriate statistical analysis of this information and consequent intelligent inference and decisions made on the basis of these analyses. Importantly, this requires an explicit acknowledgement of uncertainty in the inputs and outputs of the statistical model. The aim of this paper is to progress a discussion of these issues in the context of several motivating problems related to the wider scope of agricultural production. These problems include biosecurity surveillance design, pest incursion, environmental monitoring and import risk assessment. The information to be integrated includes observational and experimental data, remotely sensed data and expert information. We describe our efforts in addressing these problems using Bayesian models and Bayesian networks. These approaches provide a coherent and transparent framework for modelling complex systems, combining the different information sources, and allowing for uncertainty in inputs and outputs. While the theory underlying Bayesian modelling has a long and well established history, its application is only now becoming more possible for complex problems, due to increased availability of methodological and computational tools. Of course, there are still hurdles and constraints, which we also address through sharing our endeavours and experiences.
Abstract:
Regrowing forests on cleared land is a key strategy to achieve both biodiversity conservation and climate change mitigation globally. Maximizing these co-benefits, however, remains theoretically and technically challenging because of the complex relationship between carbon sequestration and biodiversity in forests, the strong influence of climate variability and landscape position on forest development, the large number of restoration strategies possible, and the long time-frames needed to declare success. Through the synthesis of three decades of knowledge on forest dynamics and plant functional traits combined with decision science, we demonstrate that we cannot always maximize carbon sequestration by simply increasing the functional trait diversity of trees planted. The relationships between plant functional diversity and carbon sequestration rates above-ground and in the soil are dependent on climate and landscape position. We show how to manage ‘identities’ and ‘complementarities’ between plant functional traits in order to systematically achieve maximal co-benefits in various climate and landscape contexts. We provide examples of optimal planting and thinning rules that satisfy this ecological strategy and guide the restoration of forests that are rich in both carbon and plant functional diversity. Our framework provides the first mechanistic approach for generating decision-making rules that can be used to manage forests for multiple objectives, and supports joint carbon credit and biodiversity conservation initiatives, such as Reducing Emissions from Deforestation and forest Degradation (REDD+). The decision framework can also be linked to species distribution models and socio-economic models in order to find restoration solutions that simultaneously maximize biodiversity, carbon stocks and other ecosystem services across landscapes.
Our study provides the foundation for developing and testing cost-effective and adaptable forest management rules to achieve biodiversity, carbon sequestration and other socio-economic co-benefits under global change.
Abstract:
1. The ability of many introduced fish species to thrive in degraded aquatic habitats and their potential to impact on aquatic ecosystem structure and function suggest that introduced fish may represent both a symptom and a cause of decline in river health and the integrity of native aquatic communities. 2. The varying sensitivities of many commonly introduced fish species to degraded stream conditions, the mechanism and reason for their introduction and the differential susceptibility of local stream habitats to invasion because of the environmental and biological characteristics of the receiving water body, are all confounding factors that may obscure the interpretation of patterns of introduced fish species distribution and abundance and therefore their reliability as indicators of river health. 3. In the present study, we address the question of whether alien fish (i.e. those species introduced from other countries) are a reliable indicator of the health of streams and rivers in south-eastern Queensland, Australia. We examine the relationships of alien fish species distributions and indices of abundance and biomass with the natural environmental features, the biotic characteristics of the local native fish assemblages and indicators of anthropogenic disturbance at a large number of sites subject to varying sources and intensities of human impact. 4. Alien fish species were found to be widespread and often abundant in south-eastern Queensland rivers and streams, and the five species collected were considered to be relatively tolerant to river degradation, making them good candidate indicators of river health. Variation in alien species indices was unrelated to the size of the study sites, the sampling effort expended or natural environmental gradients. The biological resistance of the native fish fauna was not concluded to be an important factor mediating invasion success by alien species. 
Variation in alien fish indices was, however, strongly related to indicators of disturbance intensity describing local in-stream habitat and riparian degradation, water quality and surrounding land use, particularly the amount of urban development in the catchment. 5. Potential confounding factors that may influence the likelihood of introduction and successful establishment of an alien species and the implications of these factors for river bioassessment are discussed. We conclude that the potentially strong impact that many alien fish species can have on the biological integrity of natural aquatic ecosystems, together with their potential to be used as an initial basis for detecting other forms of human disturbance, suggest that some alien species (particularly species from the family Poeciliidae) can represent a reliable 'first cut' indicator of river health.