960 results for pseudo-absence data
Abstract:
This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. 
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.
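The path sampling idea mentioned above can be illustrated on a toy problem. For a one-parameter binary MRF with p(x | beta) proportional to exp(beta * s(x)), the derivative of log Z(beta) equals the mean of the canonical statistic s(x), so a log NC ratio is the integral of that mean along a path in beta. The sketch below is only an illustration under stated assumptions: it uses a hypothetical 3 x 3 lattice with a single interaction parameter, and it computes the means by exact enumeration where a real application would use MCMC estimates. It is not the thesis's IMCS implementation, and all function names are invented here.

```python
import itertools
import math

def pair_statistic(x, n):
    """Canonical statistic: number of adjacent (horizontal or vertical)
    pairs with equal values on an n x n lattice stored row by row."""
    s = 0
    for i in range(n):
        for j in range(n):
            if j + 1 < n and x[i * n + j] == x[i * n + j + 1]:
                s += 1
            if i + 1 < n and x[i * n + j] == x[(i + 1) * n + j]:
                s += 1
    return s

def log_z_and_mean(beta, stats):
    """Exact log normalization constant and E_beta[s(x)] by enumeration.
    (Assumption: a tiny lattice; large lattices need MCMC estimates.)"""
    weights = [math.exp(beta * s) for s in stats]
    z = sum(weights)
    mean_s = sum(w * s for w, s in zip(weights, stats)) / z
    return math.log(z), mean_s

def path_sampling_log_ratio(beta0, beta1, stats, steps=200):
    """Trapezoid-rule integral of E_beta[s(x)] over [beta0, beta1],
    which approximates log Z(beta1) - log Z(beta0)."""
    h = (beta1 - beta0) / steps
    means = [log_z_and_mean(beta0 + k * h, stats)[1] for k in range(steps + 1)]
    return h * (sum(means) - 0.5 * (means[0] + means[-1]))

n = 3  # hypothetical lattice size, small enough to enumerate all 2**9 states
stats = [pair_statistic(x, n) for x in itertools.product((0, 1), repeat=n * n)]

estimate = path_sampling_log_ratio(0.0, 0.8, stats)
exact = log_z_and_mean(0.8, stats)[0] - log_z_and_mean(0.0, stats)[0]
print(f"path-integral estimate: {estimate:.6f}, exact: {exact:.6f}")
```

On a lattice this small the exact log ratio is available for comparison, which is the only reason enumeration is used; the integration step is the part that carries over to realistic lattices.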
Abstract:
The development of microfinance in Vietnam since the 1990s has coincided with remarkable progress in poverty reduction. Numerous descriptive studies have illustrated that microfinance is an effective tool to eradicate poverty in Vietnam, but evidence from quantitative studies is mixed. This study contributes to the literature by providing new evidence on the impact of microfinance on poverty reduction in Vietnam using repeated cross-sectional data from the Vietnam Living Standards Survey (VLSS) for the period 1992-2010. Our results show that micro-loans contribute significantly to household consumption.
Abstract:
1. Little consensus has been reached as to general features of spatial variation in beta diversity, a fundamental component of species diversity. This could reflect a genuine lack of simple gradients in beta diversity, or a lack of agreement as to just what constitutes beta diversity. Unfortunately, a large number of approaches have been applied to the investigation of variation in beta diversity, which potentially makes comparisons of the findings difficult.
2. We review 24 measures of beta diversity for presence/absence data (the most frequent form of data to which such measures are applied) that have been employed in the literature, express many of them for the first time in common terms, and compare some of their basic properties.
3. Four groups of measures are distinguished, with a fundamental distinction arising between 'broad sense' measures incorporating differences in composition attributable to species richness gradients, and 'narrow sense' measures that focus on compositional differences independent of such gradients. On a number of occasions on which the former have been employed in the literature the latter may have been more appropriate, and there are many situations in which consideration of both kinds of measures would be valuable.
4. We particularly recommend (i) considering beta diversity measures in terms of matching/mismatching components (usually denoted a, b and c) and thereby identifying the contribution of different sources of variation in species composition, and (ii) the use of ternary plots to express the relationship between the values of these measures and of the components, and as a way of understanding patterns in beta diversity.
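The matching/mismatching components can be made concrete with a short sketch. It assumes that a counts species shared by two sites and that b and c count species found in only one site or the other; which site b and c refer to is a convention chosen here, not taken from the abstract. Jaccard and Sørensen dissimilarities are used purely as two familiar examples of measures expressible in these terms, and the species lists are hypothetical.

```python
def matching_components(site1, site2):
    """Return (a, b, c): species shared by both sites, species only in
    site1, and species only in site2 (assignment of b and c is assumed)."""
    s1, s2 = set(site1), set(site2)
    return len(s1 & s2), len(s1 - s2), len(s2 - s1)

def jaccard_dissimilarity(a, b, c):
    """Share of all recorded species that are not common to both sites."""
    return (b + c) / (a + b + c)

def sorensen_dissimilarity(a, b, c):
    """Like Jaccard, but shared species are weighted twice."""
    return (b + c) / (2 * a + b + c)

def ternary_coordinates(a, b, c):
    """Proportions a', b', c' that sum to 1, usable as ternary-plot axes."""
    total = a + b + c
    return a / total, b / total, c / total

# Hypothetical presence lists for two sites.
site1 = ["A", "B", "C", "D"]
site2 = ["C", "D", "E"]

a, b, c = matching_components(site1, site2)
print(a, b, c)                          # 2 2 1
print(jaccard_dissimilarity(a, b, c))   # 0.6
print(ternary_coordinates(a, b, c))     # (0.4, 0.4, 0.2)
```

Working from a, b and c rather than from a single index value keeps the contributions of shared and unshared species visible, which is the point of recommendation (i).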
Abstract:
Several alpine vertebrates share a distribution pattern that extends across the South-western Palearctic but is limited to the main mountain massifs. Although they are usually regarded as cold-adapted species, the range of many alpine vertebrates also includes relatively warm areas, suggesting that factors beyond climatic conditions may be driving their distribution. In this work we first recognize the species belonging to the mentioned biogeographic group and, based on the environmental niche analysis of Plecotus macrobullaris, we identify and characterize the environmental factors constraining their ranges. Distribution overlap analysis of 504 European vertebrates was done using the Sørensen Similarity Index, and we identified four birds and one mammal that share the distribution with P. macrobullaris. We generated 135 environmental niche models including different variable combinations and regularization values for P. macrobullaris at two different scales and resolutions. After selecting the best models, we observed that topographic variables outperformed climatic predictors, and the abruptness of the landscape showed better predictive ability than elevation. The best explanatory climatic variable was mean summer temperature, which showed that P. macrobullaris is able to cope with mean temperature ranges spanning up to 16 °C. The models showed that the distribution of P. macrobullaris is mainly shaped by topographic factors that provide rock-abundant and open-space habitats rather than climatic determinants, and that the species is not a cold-adapted, but rather a cold-tolerant eurythermic organism. P. macrobullaris shares its distribution pattern as well as several ecological features with five other alpine vertebrates, suggesting that the conclusions obtained from this study might be extensible to them. 
We concluded that rock-dwelling and open-space foraging vertebrates with broad temperature tolerance are the best candidates to show wide alpine distribution in the Western Palearctic.
Abstract:
Across North America, Bald Eagle (Haliaeetus leucocephalus) populations appear to be recovering following bans of DDT. A limited number of studies from across North America have recorded a surplus of nonbreeding adult Bald Eagles in dense populations when optimal habitat and food become limited. Placentia Bay, Newfoundland is one of these. The area has one of the highest densities of Bald Eagles in eastern North America, and has recently experienced an increase in the proportion of nonbreeding adults within the population. We tested whether the observed Bald Eagle population trends in Placentia Bay, Newfoundland during the breeding seasons 1990-2009 are due to habitat saturation. We found no significant differences in habitat or food resource characteristics between occupied territories and pseudo-absence data or between nest sites with high vs. low nest activity/occupancy rates. Therefore there is no evidence for habitat saturation for Bald Eagles in Placentia Bay and alternative hypotheses for the high proportion of nonbreeding adults should be considered. The Newfoundland population provides an interesting case for examination because it did not historically appear to be affected by pollution. An understanding of Bald Eagle population dynamics in a relatively pristine area with a high density can be informative for restoration and conservation of Bald Eagle populations elsewhere.
Abstract:
The objectives of this study were to predict the potential distribution, relative abundance and probability of habitat use by feral camels in southern Northern Territory. Aerial survey data were used to model habitat association. The characteristics of ‘used’ (where camels were observed) v. ‘unused’ (pseudo-absence) sites were compared. Habitat association and abundance were modelled using generalised additive model (GAM) methods. The models predicted habitat suitability and the relative abundance of camels in southern Northern Territory. The habitat suitability maps derived in the present study indicate that camels have suitable habitat in most areas of southern Northern Territory. The index of abundance model identified areas of relatively high camel abundance. Identifying preferred habitats and areas of high abundance can help focus control efforts.
Abstract:
Modelling species distributions with presence data from atlases, museum collections and databases is challenging. In this paper, we compare seven procedures to generate pseudo-absence data, which in turn are used to fit logistic regression (GLM) models when reliable absence data are not available. We use pseudo-absences selected randomly or by means of presence-only methods (ENFA and MDE) to model the distribution of a threatened endemic Iberian moth species (Graellsia isabelae). The results show that the pseudo-absence selection method greatly influences the percentage of explained variability, the scores of the accuracy measures and, most importantly, the degree of constraint in the distribution estimated. As we extract pseudo-absences from environmental regions further from the optimum established by presence data, the models generated obtain better accuracy scores, and over-prediction increases. When variables other than environmental ones influence the distribution of the species (i.e., non-equilibrium state) and precise information on absences is non-existent, the random selection of pseudo-absences or their selection from environmental localities similar to those of species presence data generates the most constrained predictive distribution maps, because pseudo-absences can be located within environmentally suitable areas. This study shows that if we do not have reliable absence data, the method of pseudo-absence selection strongly conditions the obtained model, generating different model predictions in the gradient between potential and realized distributions.
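Two of the selection strategies contrasted above can be sketched in a few lines. The first draws pseudo-absences at random from cells without presence records; the second draws them only from cells that are environmentally far from the presence records. The second function is a deliberately crude stand-in that uses a single environmental variable and a distance from the presence mean; it is not ENFA or MDE, and the grid, values and function names are all hypothetical.

```python
import random

def random_pseudo_absences(all_cells, presence_cells, n, seed=0):
    """Draw n cells uniformly at random from cells with no presence record."""
    presence = set(presence_cells)
    candidates = sorted(cell for cell in all_cells if cell not in presence)
    if n > len(candidates):
        raise ValueError("not enough non-presence cells to sample from")
    return random.Random(seed).sample(candidates, n)

def distant_pseudo_absences(env, presence_cells, n, min_distance, seed=0):
    """Draw n cells whose environmental value lies at least min_distance
    from the mean value at presence cells (one variable, illustration only)."""
    presence = set(presence_cells)
    mean_presence = sum(env[cell] for cell in presence) / len(presence)
    far_cells = [cell for cell in env
                 if cell not in presence
                 and abs(env[cell] - mean_presence) >= min_distance]
    return random_pseudo_absences(far_cells, [], n, seed)

# Hypothetical 5 x 5 study area; the environmental value is the column index.
cells = [(row, col) for row in range(5) for col in range(5)]
env = {cell: float(cell[1]) for cell in cells}
presences = [(0, 0), (1, 0), (2, 1)]

random_pa = random_pseudo_absences(cells, presences, n=6)
distant_pa = distant_pseudo_absences(env, presences, n=6, min_distance=2.0)
print(random_pa)
print(distant_pa)
```

Either list would then be combined with the presence records as the zeros of a binary response for a logistic regression. The abstract's finding is that this choice alone shifts the fitted model between the realized and potential distribution.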
Abstract:
Effective detection of population trend is crucial for managing threatened species. Little theory exists, however, to assist managers in choosing the most cost-effective monitoring techniques for diagnosing trend. We present a framework for determining the optimal monitoring strategy by simulating a manager collecting data on a declining species, the Chestnut-rumped Hylacola (Hylacola pyrrhopygia parkeri), to determine whether the species should be listed under the IUCN (World Conservation Union) Red List. We compared the efficiencies of two strategies for detecting trend, abundance surveys and presence-absence surveys, under financial constraints. One might expect the abundance surveys to be superior under all circumstances because more information is collected at each site. Nevertheless, the presence-absence data can be collected at more sites because the surveyor is not obliged to spend a fixed amount of time at each site. The optimal strategy for monitoring was very dependent on the budget available. Under some circumstances, presence-absence surveys outperformed abundance surveys for diagnosing the IUCN Red List categories cost-effectively. Abundance surveys were best if the species was expected to be recorded more than 16 times/year; otherwise, presence-absence surveys were best. The relationship between the strategies we investigated is likely to be relevant for many comparisons of presence-absence or abundance data. Managers of any cryptic or low-density species who hope to maximize their success of estimating trend should find an application for our results.
Abstract:
Habitat models are widely used in ecology; however, there are relatively few studies of rare species, primarily because of a paucity of survey records and lack of robust means of assessing accuracy of modelled spatial predictions. We investigated the potential of compiled ecological data in developing habitat models for Macadamia integrifolia, a vulnerable mid-stratum tree endemic to lowland subtropical rainforests of southeast Queensland, Australia. We compared performance of two binomial models, Classification and Regression Trees (CART) and Generalised Additive Models (GAM), with Maximum Entropy (MAXENT) models developed from (i) presence records and available absence data and (ii) presence records and background data. The GAM model was the best performer across the range of evaluation measures employed; however, all models were assessed as potentially useful for informing in situ conservation of M. integrifolia. A significant loss in the amount of M. integrifolia habitat has occurred (p < 0.05), with only 37% of former habitat (pre-clearing) remaining in 2003. Remnant patches are significantly smaller, have larger edge-to-area ratios and are more isolated from each other compared to pre-clearing configurations (p < 0.05). Whilst the network of suitable habitat patches is still largely intact, there are numerous smaller patches that are more isolated in the contemporary landscape compared with their connectedness before clearing. These results suggest that in situ conservation of M. integrifolia may be best achieved through a landscape approach that considers the relative contribution of small remnant habitat fragments to the species as a whole, as facilitating connectivity among the entire network of habitat patches.
Abstract:
Geospatial modeling is one of the most powerful tools available to conservation biologists for estimating current species ranges of Earth's biodiversity. Now, with the advantage of predictive climate models, these methods can be deployed for understanding future impacts on threatened biota. Here, we employ predictive modeling under a conservative estimate of future climate change to examine impacts on the future abundance and geographic distributions of Malagasy lemurs. Using distribution data from the primary literature, we employed ensemble species distribution models and geospatial analyses to predict future changes in species distributions. Current species distribution models (SDMs) were created within the BIOMOD2 framework that capitalizes on ten widely used modeling techniques. Future and current SDMs were then subtracted from each other, and areas of contraction, expansion, and stability were calculated. Model overprediction is a common issue associated with Malagasy taxa. Accordingly, we introduce novel methods for incorporating biological data on dispersal potential to better inform the selection of pseudo-absence points. This study predicts that 60% of the 57 species examined will experience considerable range reductions in the next seventy years entirely due to future climate change. Of these species, range sizes are predicted to decrease by an average of 59.6%. Nine lemur species (16%) are predicted to expand their ranges, and the distribution sizes of 13 species (22.8%) were predicted to be stable through time. Species ranges will experience severe shifts, typically contractions, and for the majority of lemur species, geographic distributions will be considerably altered. We identify three areas in dire need of protection, concluding that strategically managed forest corridors must be a key component of lemur and other biodiversity conservation strategies. 
This recommendation is all the more urgent given that the results presented here do not take into account patterns of ongoing habitat destruction relating to human activities.
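The map-subtraction step described above can be sketched as follows. The sketch assumes the current and future SDMs have already been thresholded to binary presence (1) / absence (0) grids of equal size, with every cell covering the same area; the abstract does not state these details. BIOMOD2 is an R package and this is not its API; the grids and function names are hypothetical.

```python
def range_change(current, future):
    """Classify each cell by subtracting the current map from the future map:
    -1 means contraction, +1 expansion, and 0 on an occupied cell stability."""
    counts = {"contraction": 0, "expansion": 0, "stable": 0}
    for row_current, row_future in zip(current, future):
        for c, f in zip(row_current, row_future):
            diff = f - c
            if diff == -1:
                counts["contraction"] += 1
            elif diff == 1:
                counts["expansion"] += 1
            elif c == 1:  # occupied now and in the future
                counts["stable"] += 1
    return counts

def percent_range_change(counts):
    """Change in occupied area relative to the current range, in percent."""
    current_area = counts["contraction"] + counts["stable"]
    future_area = counts["expansion"] + counts["stable"]
    return 100.0 * (future_area - current_area) / current_area

# Hypothetical 2 x 3 binary maps for one species.
current = [[1, 1, 0],
           [1, 0, 0]]
future = [[0, 1, 1],
          [0, 0, 0]]

counts = range_change(current, future)
print(counts)                        # {'contraction': 2, 'expansion': 1, 'stable': 1}
print(percent_range_change(counts))  # about -33.3
```

Cells that are unoccupied in both maps are ignored, so the three categories partition only the area that is suitable at some point in time.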
Abstract:
This article outlines the approaches to modeling the distribution of threatened invertebrates using data from atlases, museums and databases. Species Distribution Models (SDMs) are useful for estimating species’ ranges, identifying suitable habitats, and identifying the primary factors affecting species’ distributions. The study tackles the strategies used to obtain SDMs without reliable absence data while exploring their applications for conservation. I examine the conservation status of Copris species and Graellsia isabelae by delimiting their populations and exploring the effectiveness of protected areas. I show that the method of pseudo‐absence selection strongly determines the model obtained, generating different model predictions along the gradient between potential and realized distributions. After assessing the effects of species’ traits and data characteristics on accuracy, I found that species are modeled more accurately when sample sizes are larger, no matter the technique used.
Abstract:
Background. EAP programs for airline pilots in companies with a well-developed recovery management program are known to reduce pilot absenteeism following treatment. Given the costs and safety consequences to society, it is important to identify pilots who may be experiencing an AOD disorder to get them into treatment. Hypotheses. This study investigated the predictive power of workplace absenteeism in identifying alcohol or drug disorders (AOD). The first hypothesis was that higher absenteeism in a 12-month period is associated with higher risk that an employee is experiencing AOD. The second hypothesis was that AOD treatment would reduce subsequent absence rates and the costs of replacing pilots on missed flights. Methods. A case control design using eight years (time period) of monthly archival absence data (53,000 pay records) was conducted with a sample of (N = 76) employees having an AOD diagnosis (cases) matched 1:4 with (N = 304) non-diagnosed employees (controls) of the same profession and company (male commercial airline pilots). Cases and controls were matched on the variables age, rank and date of hire. Absence rate was defined as sick time hours used over the sum of the minimum guarantee pay hours annualized using the months the pilot worked for the year. Conditional logistic regression was used to determine if absence predicts employees experiencing an AOD disorder, starting 3 years prior to the cases receiving the AOD diagnosis. A repeated measures ANOVA, t tests and rate ratios (with 95% confidence intervals) were conducted to determine differences between cases and controls in absence usage for 3 years pre and 5 years post treatment. Mean replacement costs were calculated for sick leave usage 3 years pre and 5 years post treatment to estimate the cost of sick leave from the perspective of the company. Results. 
Sick leave, as measured by absence rate, predicted the risk of being diagnosed with an AOD disorder (OR 1.10, 95% CI = 1.06, 1.15) during the 12 months prior to receiving the diagnosis. Mean absence rates for diagnosed employees increased over the three years before treatment, particularly in the year before treatment, whereas the controls’ did not (three years, x = 6.80 vs. 5.52; two years, x = 7.81 vs. 6.30; and one year, x = 11.00 for cases vs. 5.51 for controls). In the first year post treatment compared to the year prior to treatment, rate ratios indicated a significant (60%) post treatment reduction in absence rates (OR = 0.40, CI = 0.28, 0.57). Absence rates for cases remained lower than controls for the first three years after completion of treatment. Upon discharge from the FAA and company’s three year AOD monitoring program, cases’ absence rates increased slightly during the fourth year (controls, x = 0.09, SD = 0.14, cases, x = 0.12, SD = 0.21). However, the following year, their mean absence rates were again below those of the controls (controls, x = 0.08, SD = 0.12, cases, x = 0.06, SD = 0.07). Costs associated with replacing pilots calling in sick were significantly lower, by 60%, between the year of diagnosis for the cases and the first year after returning to work. A reduction in replacement costs continued over the next two years for the treated employees. Conclusions. This research demonstrates the potential for workplace absences as an active organizational surveillance mechanism to assist managers and supervisors in identifying employees who may be experiencing or at risk of experiencing an alcohol/drug disorder. Currently, many workplaces use only performance problems and ignore the employee’s absence record. 
A referral to an EAP or alcohol/drug evaluation based on the employee’s absence/sick leave record as incorporated into company policy can provide another useful indicator that may also carry less stigma, thus reducing barriers to seeking help. This research also confirms two conclusions heretofore based only on cross-sectional studies: (1) higher absence rates are associated with employees experiencing an AOD disorder; (2) treatment is associated with lower costs for replacing absent pilots. Due to the uniqueness of the employee population studied (commercial airline pilots) and the organizational documentation of absence, the generalizability of this study to other professions and occupations should be considered limited. Transition to Practice. The odds ratios for the relationship between absence rates and an AOD diagnosis are precise; the OR for year of diagnosis indicates the likelihood of being diagnosed increases 10% for every hour change in sick leave taken. In practice, however, a pilot uses approximately 20 hours of sick leave for one trip, because the replacement will have to be paid the guaranteed minimum of 20 hours. Thus, the rate based on hourly changes is precise but not practical. To provide the organization with practical recommendations the yearly mean absence rates were used. A pilot flies, on average, 90 hours a month, or 1080 annually. Cases used almost twice the mean rate of sick time the year prior to diagnosis (T-1) compared to controls (cases, x = .11, controls, x = .06). Cases are expected to use on average 119 hours annually (total annual hours*mean annual absence rate), while controls will use 60 hours. The cases’ 60 hours could translate to 3 trips of 20 hours each. Management could use a standard of 80 hours or more of sick time claimed in a year as the threshold for unacceptable absence, a 25% increase over the controls (a cost to the company of approximately $4000). 
At the 80-hour mark, the Chief Pilot would be able to call the pilot in for a routine check as to the nature of the pilot’s excessive absence. This management action would be based on a company standard, rather than a behavioral or performance issue. Using absence data in this fashion would make it an active surveillance mechanism.
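The arithmetic in the Transition to Practice section can be reproduced with a short sketch. It uses only the figures quoted in the abstract (90 flight hours a month, mean absence rates of .11 and .06, an odds ratio of 1.10 per hour, an 80-hour threshold). Compounding the odds ratio over a 20-hour trip assumes the usual log-linear form of a logistic model, which the abstract does not spell out; the function names are hypothetical.

```python
HOURS_PER_MONTH = 90                  # average flight hours quoted in the abstract
ANNUAL_HOURS = HOURS_PER_MONTH * 12   # 1080 hours a year

def expected_sick_hours(mean_absence_rate):
    """Expected annual sick hours: total annual hours * mean annual absence rate."""
    return ANNUAL_HOURS * mean_absence_rate

def odds_multiplier(odds_ratio_per_hour, hours):
    """Odds multiplier for a change of `hours`, assuming the per-hour odds
    ratio compounds multiplicatively (standard for logistic regression)."""
    return odds_ratio_per_hour ** hours

def exceeds_threshold(sick_hours, threshold=80):
    """True when sick time reaches the proposed company standard."""
    return sick_hours >= threshold

case_hours = expected_sick_hours(0.11)
control_hours = expected_sick_hours(0.06)

print(round(case_hours))      # 119, matching the figure in the abstract
print(round(control_hours))   # 65 from the rounded rate; the abstract quotes 60
print(exceeds_threshold(case_hours), exceeds_threshold(control_hours))
print(round(odds_multiplier(1.10, 20), 2))  # odds multiplier for one 20-hour trip
```

The gap between 65 and the quoted 60 hours for controls may reflect rounding of the published rate. The last line shows why a per-hour odds ratio is awkward in practice: a single 20-hour trip already corresponds to a much larger change in odds.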
Abstract:
Plant biosecurity requires statistical tools to interpret field surveillance data in order to manage pest incursions that threaten crop production and trade. Ultimately, management decisions need to be based on the probability that an area is infested or free of a pest. Current informal approaches to delimiting pest extent rely upon expert ecological interpretation of presence/absence data over space and time. Hierarchical Bayesian models provide a cohesive statistical framework that can formally integrate the available information on both pest ecology and data. The overarching method involves constructing an observation model for the surveillance data, conditional on the hidden extent of the pest and uncertain detection sensitivity. The extent of the pest is then modelled as a dynamic invasion process that includes uncertainty in ecological parameters. Modelling approaches to assimilate this information are explored through case studies on spiralling whitefly, Aleurodicus dispersus and red banded mango caterpillar, Deanolis sublimbalis. Markov chain Monte Carlo simulation is used to estimate the probable extent of pests, given the observation and process model conditioned by surveillance data. Statistical methods, based on time-to-event models, are developed to apply hierarchical Bayesian models to early detection programs and to demonstrate area freedom from pests. The value of early detection surveillance programs is demonstrated through an application to interpret surveillance data for exotic plant pests with uncertain spread rates. The model suggests that typical early detection programs provide a moderate reduction in the probability of an area being infested but a dramatic reduction in the expected area of incursions at a given time. Estimates of spiralling whitefly extent are examined at local, district and state-wide scales. 
The local model estimates the rate of natural spread and the influence of host architecture, host suitability and inspector efficiency. These parameter estimates can support the development of robust surveillance programs. Hierarchical Bayesian models for the human-mediated spread of spiralling whitefly are developed for the colonisation of discrete cells connected by a modified gravity model. By estimating dispersal parameters, the model can be used to predict the extent of the pest over time. An extended model predicts the climate restricted distribution of the pest in Queensland. These novel human-mediated movement models are well suited to demonstrating area freedom at coarse spatio-temporal scales. At finer scales, and in the presence of ecological complexity, exploratory models are developed to investigate the capacity for surveillance information to estimate the extent of red banded mango caterpillar. It is apparent that excessive uncertainty about observation and ecological parameters can impose limits on inference at the scales required for effective management of response programs. The thesis contributes novel statistical approaches to estimating the extent of pests and develops applications to assist decision-making across a range of plant biosecurity surveillance activities. Hierarchical Bayesian modelling is demonstrated as both a useful analytical tool for estimating pest extent and a natural investigative paradigm for developing and focussing biosecurity programs.
Abstract:
Early detection surveillance programs aim to find invasions of exotic plant pests and diseases before they are too widespread to eradicate. However, the value of these programs can be difficult to justify when no positive detections are made. To demonstrate the value of pest absence information provided by these programs, we use a hierarchical Bayesian framework to model estimates of incursion extent with and without surveillance. A model for the latent invasion process provides the baseline against which surveillance data are assessed. Ecological knowledge and pest management criteria are introduced into the model using informative priors for invasion parameters. Observation models assimilate information from spatio-temporal presence/absence data to accommodate imperfect detection and generate posterior estimates of pest extent. When applied to an early detection program operating in Queensland, Australia, the framework demonstrates that this typical surveillance regime provides a modest reduction in the estimate that a surveyed district is infested. More importantly, the model suggests that early detection surveillance programs can provide a dramatic reduction in the putative area of incursion and therefore offer a substantial benefit to incursion management. By mapping spatial estimates of the point probability of infestation, the model identifies where future surveillance resources can be most effectively deployed.
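The way absence records lower the estimated probability of infestation under imperfect detection can be shown with a single-site Bayes update. This is a deliberately reduced sketch, not the hierarchical spatio-temporal model described above: it assumes one site, a fixed prior, independent surveys with the same detection sensitivity, and no false positives. The parameter values and the function name are hypothetical.

```python
def posterior_infested(prior, sensitivity, n_negative):
    """Probability a site is infested after n_negative surveys found nothing.

    prior       -- probability of infestation before any survey
    sensitivity -- chance one survey detects the pest when it is present
    Assumes independent surveys and no false positives.
    """
    # Probability that every survey misses a pest that is actually present.
    all_missed = (1.0 - sensitivity) ** n_negative
    return prior * all_missed / (prior * all_missed + (1.0 - prior))

# Hypothetical values: 20% prior, each survey detects the pest 30% of the time.
for n in (0, 1, 5, 10):
    print(n, round(posterior_infested(0.2, 0.3, n), 4))
```

With no surveys the posterior equals the prior, and each further negative survey shrinks it. In the full framework the prior would itself come from the latent invasion model rather than being fixed.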