867 results for species distribution models
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but such approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks, on the other hand, optimise the parameters of speech enhancement algorithms based on state sequences generated by a speech recogniser for utterances of known transcriptions. Previous applications of LIMA frameworks have generated a set of global enhancement parameters for all model states without taking into account the distribution of model occurrence, making optimisation susceptible to favouring frequently occurring models, in particular silence. In this paper, we demonstrate the existence of highly disproportionate phonetic distributions on two corpora with distinct speech tasks, and propose to normalise the influence of each phone based on a priori occurrence probabilities. Likelihood analysis and speech recognition experiments verify this approach for improving ASR performance in noisy environments.
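The normalisation idea can be illustrated with a minimal sketch. It assumes (this is not stated in the abstract) that each frame's log-likelihood is weighted by the inverse of its phone's a priori probability, so that every phone contributes equally to the objective. The function names, phone labels and log-likelihood values are hypothetical and serve only as an illustration.

```python
from collections import Counter

def phone_priors(labels):
    """Estimate a priori occurrence probabilities from frame-level phone labels."""
    counts = Counter(labels)
    total = len(labels)
    return {phone: count / total for phone, count in counts.items()}

def unnormalised_objective(frame_loglik):
    """Plain mean over frames: frequent phones (e.g. silence) dominate."""
    return sum(frame_loglik) / len(frame_loglik)

def normalised_objective(frame_loglik, labels, priors):
    """Weight each frame by 1 / (K * prior), K = number of distinct phones.

    This equals the average of the per-phone mean log-likelihoods,
    so each phone has the same influence regardless of how often it occurs.
    """
    n_frames = len(labels)
    n_phones = len(priors)
    return sum(
        ll / (n_frames * n_phones * priors[phone])
        for ll, phone in zip(frame_loglik, labels)
    )

# Hypothetical utterance: 8 silence frames, 1 frame each of two other phones.
labels = ["sil"] * 8 + ["ah"] + ["t"]
frame_loglik = [-1.0] * 8 + [-5.0] + [-3.0]

priors = phone_priors(labels)
plain = unnormalised_objective(frame_loglik)
balanced = normalised_objective(frame_loglik, labels, priors)
print(plain, balanced)
```

In this toy case the plain mean is pulled towards the silence frames, while the normalised objective treats the three phones equally.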
Abstract:
This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy.
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.
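The path sampling idea behind log NC ratios can be checked numerically on a lattice small enough to enumerate. For a binary MRF with density proportional to exp(θ·s(x)), the derivative of log Z(θ) is the mean of the canonical statistic s(x), so the log ratio of two normalization constants equals the integral of that mean along a path between the two parameter values. The sketch below is an illustration under stated assumptions, not the thesis's IMCS estimator: it uses a simplified two-parameter autologistic model on a 3×3 lattice, replaces MCMC estimates of the mean statistic with exact enumeration, and integrates along a straight-line path with the trapezoid rule. All names and parameter values are hypothetical.

```python
import itertools
import math

N = 3  # lattice side; 2**9 = 512 configurations, small enough to enumerate

def canonical_stats(x):
    """Return (sum of x_i, sum over neighbouring pairs of x_i * x_j)."""
    total = sum(x)
    pairs = 0
    for r in range(N):
        for c in range(N):
            v = x[r * N + c]
            if c + 1 < N:
                pairs += v * x[r * N + c + 1]    # right-hand neighbour
            if r + 1 < N:
                pairs += v * x[(r + 1) * N + c]  # neighbour below
    return (total, pairs)

# Canonical statistics of every configuration (exact enumeration).
STATS = [canonical_stats(x) for x in itertools.product((0, 1), repeat=N * N)]

def log_z(theta):
    """Exact log normalization constant."""
    return math.log(sum(math.exp(theta[0] * s[0] + theta[1] * s[1]) for s in STATS))

def mean_stats(theta):
    """Exact mean canonical statistic; in practice this would come from MCMC."""
    weights = [math.exp(theta[0] * s[0] + theta[1] * s[1]) for s in STATS]
    z = sum(weights)
    return tuple(sum(w * s[k] for w, s in zip(weights, STATS)) / z for k in range(2))

def path_estimate(theta0, theta1, steps=200):
    """Trapezoid-rule integral of (theta1 - theta0) . E[s] along a straight path."""
    delta = (theta1[0] - theta0[0], theta1[1] - theta0[1])

    def integrand(t):
        theta = (theta0[0] + t * delta[0], theta0[1] + t * delta[1])
        m = mean_stats(theta)
        return delta[0] * m[0] + delta[1] * m[1]

    h = 1.0 / steps
    total = 0.5 * (integrand(0.0) + integrand(1.0))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h

theta0 = (0.0, 0.0)   # independence model, where Z = 2**(N*N)
theta1 = (-0.5, 0.8)  # hypothetical target parameters
exact = log_z(theta1) - log_z(theta0)
approx = path_estimate(theta0, theta1)
print(exact, approx)
```

On realistic lattices enumeration is impossible, which is why the mean statistic at each point on the path has to be estimated by simulation and why the approach is computationally intensive.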
Abstract:
Patterns of connectivity among local populations influence the dynamics of regional systems, but most ecological models have concentrated on explaining the effect of connectivity on local population structure using dynamic processes covering short spatial and temporal scales. In this study, a model was developed in an extended spatial system to examine the hypothesis that long term connectivity levels among local populations are influenced by the spatial distribution of resources and other habitat factors. The habitat heterogeneity model was applied to local wild rabbit populations in the semi-arid Mitchell region of southern central Queensland (the Eastern system). Species-specific population parameters which were appropriate for the rabbit in this region were used. The model predicted a wide range of long term connectivity levels among sites, ranging from the extreme isolation of some sites to relatively high interaction probabilities for others. The validity of model assumptions was assessed by regressing model output against independent population genetic data, and the model explained over 80% of the variation in the highly structured genetic data set. Furthermore, the model was robust, explaining a significant proportion of the variation in the genetic data over a wide range of parameters. The performance of the habitat heterogeneity model was further assessed by simulating the widely reported recent range expansion of the wild rabbit into the Mitchell region from the adjacent, panmictic Western rabbit population system. The model explained well the independently determined genetic characteristics of the Eastern system at different hierarchic levels, from site specific differences (for example, fixation of a single allele in the population at one site), to differences between population systems (absence of an allele in the Eastern system which is present in all Western system sites).
The model therefore explained the past and long term processes which have led to the formation and maintenance of the highly structured Eastern rabbit population system. Most animals exhibit sex biased dispersal which may influence long term connectivity levels among local populations, and thus the dynamics of regional systems. When appropriate sex specific dispersal characteristics were used, the habitat heterogeneity model predicted substantially different interaction patterns between female-only and combined male and female dispersal scenarios. In the latter case, model output was validated using data from a bi-parentally inherited genetic marker. Again, the model explained over 80% of the variation in the genetic data. The fact that such a large proportion of variability is explained in two genetic data sets provides very good evidence that habitat heterogeneity influences long term connectivity levels among local rabbit populations in the Mitchell region for both males and females. The habitat heterogeneity model thus provides a powerful approach for understanding the large scale processes that shape regional population systems in general. Therefore the model has the potential to be useful as a tool to aid in the management of those systems, whether it be for pest management or conservation purposes.
Abstract:
There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions with each modeling approach, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states: perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of “excess” zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriately model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data.
A simulation experiment is then conducted to demonstrate how crash data give rise to “excess” zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed, and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales and not an underlying dual state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros.
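The excess-zeros argument can be illustrated with a small simulation. This is a minimal sketch and not the authors' experiment: the number of sites, the exposure levels and the per-trial crash probabilities are all hypothetical. Every site follows the same single-state process of Poisson trials (independent Bernoulli trials with unequal probabilities), yet mixing low-exposure and high-exposure sites yields more zeros than a single Poisson distribution with the same overall mean would predict.

```python
import math
import random

def simulate_site(rng, n_trials):
    """Count crashes at one site: independent Bernoulli trials with unequal probabilities."""
    crashes = 0
    for _ in range(n_trials):
        p = rng.uniform(0.0005, 0.0015)  # hypothetical per-trial crash probability
        if rng.random() < p:
            crashes += 1
    return crashes

rng = random.Random(42)

# Hypothetical exposures: half the sites see 20 trials, half see 2000.
exposures = [20] * 500 + [2000] * 500
counts = [simulate_site(rng, n) for n in exposures]

mean_count = sum(counts) / len(counts)
observed_zero_fraction = sum(1 for c in counts if c == 0) / len(counts)

# Zero fraction implied by a single Poisson with the same overall mean.
poisson_zero_fraction = math.exp(-mean_count)

print(f"mean count per site:     {mean_count:.3f}")
print(f"observed zero fraction:  {observed_zero_fraction:.3f}")
print(f"Poisson-implied zeros:   {poisson_zero_fraction:.3f}")
```

No site in this simulation is "perfectly safe"; the surplus of zeros comes entirely from the low-exposure sites, which is the mechanism the abstract attributes to low exposure and poorly chosen time/space scales.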
Abstract:
Much debate in media and communication studies is based on exaggerated opposition between the digital sublime and the digital abject: overly enthusiastic optimism versus determined pessimism over the potential of new technologies. This inhibits the discipline's claims to provide rigorous insight into industry and social change which is, after all, continuous. Instead of having to decide one way or the other, we need to ask how we study the process of change. This article examines the impact of online distribution in the film industry, particularly addressing the question of rates of change. Are there genuinely new players disrupting the established oligopoly, and if so with what effect? Is there evidence of disruption to, and innovation in, business models? Has cultural change been forced on the incumbents? Outside mainstream Hollywood, where are the new opportunities and the new players? What is the situation in Australia?
Abstract:
This study examined the distribution of major mosquito species and their roles in the transmission of Ross River virus (RRV) infection for coastline and inland areas in Brisbane, Australia (27°28′ S, 153°2′ E). We obtained data on the monthly counts of RRV cases in Brisbane between November 1998 and December 2001 by statistical local areas from the Queensland Department of Health and the monthly mosquito abundance from the Brisbane City Council. Correlation analysis was used to assess the pairwise relationships between mosquito density and the incidence of RRV disease. This study showed that the abundances of Aedes vigilax (Skuse), Culex annulirostris (Skuse), and Aedes vittiger (Skuse) were significantly associated with the monthly incidence of RRV in the coastline area, whereas those of Aedes vigilax, Culex annulirostris, and Aedes notoscriptus (Skuse) were significantly associated with the monthly incidence of RRV in the inland area. The results of the classification and regression tree (CART) analysis show that both occurrence and incidence of RRV were influenced by interactions between species in both coastal and inland regions. We found that there was an 89% chance for an occurrence of RRV if the abundance of Ae. vigilax was between 64 and 90 in the coastline region. There was an 80% chance for an occurrence of RRV if the density of Cx. annulirostris was between 53 and 74 in the inland area. The results of this study may have applications as a decision support tool in planning disease control of RRV and other mosquito-borne diseases.
Abstract:
From 19 authoritative lists with 164 entries of ‘endangered’ Australian mammal species, 39 species have been reported as extinct. When examined in the light of field conditions, the 18 of these species thought to be from Queensland consist of (a) species described from fragmentary museum material collected in the earliest days of exploration, (b) populations inferred to exist in Queensland by extrapolation from distribution records in neighbouring States or countries, (c) inhabitants of remote and harsh locations where search effort is extraordinarily difficult (especially in circumstances of drought or flooding), and/or (d) individuals that are clearly transitory or peripheral in distribution. ‘Rediscovery’ of such scarce species - a not infrequent occurrence - is nowadays attracting increasing attention. Management in respect of any scarce wildlife in Queensland presently derives from such official lists. The analyses here indicate that this method of prioritizing action needs review. This is especially so because action then tends to be centred on species chosen out of the lists for populist reasons and that mostly addresses Crown lands. There is reason to believe that the preferred management may lie on private lands where casual observation has provided for rediscovery and where management is most desirable and practicable.
Abstract:
In many product categories of durable goods such as TV, PC, and DVD player, the largest component of sales is generated by consumers replacing existing units. Aggregate sales models proposed by diffusion of innovation researchers for the replacement component of sales have incorporated several different replacement distributions such as Rayleigh, Weibull, Truncated Normal and Gamma. Although these alternative replacement distributions have been tested using both time series sales data and individual-level actuarial “life-tables” of replacement ages, there is no consensus on which distributions are more appropriate to model replacement behaviour. In the current study we are motivated to develop a new “modified gamma” distribution for two reasons. First, we recognise that replacements have two fundamentally different drivers – those forced by failure and early, discretionary replacements. The replacement distribution for each of these drivers is expected to be quite different. Second, we observed a poor fit of other distributions to our empirical data. We conducted a survey of 8,077 households to empirically examine models of replacement sales for six electronic consumer durables – TVs, VCRs, DVD players, digital cameras, personal and notebook computers. These data allow us to construct individual-level “life-tables” for replacement ages. We demonstrate that the new modified gamma model fits the empirical data better than existing models for all six products using both a primary and a hold-out sample.
Abstract:
Complex surveillance problems are common in biosecurity, such as prioritizing detection among multiple invasive species, specifying risk over a heterogeneous landscape, combining multiple sources of surveillance data, designing for specified power to detect, resource management, and collateral effects on the environment. Moreover, when designing for multiple target species, inherent biological differences among species result in different ecological models underpinning the individual surveillance systems for each. Species are likely to have different habitat requirements, different introduction mechanisms and locations, require different methods of detection, have different levels of detectability, and vary in rates of movement and spread. Often there is a further challenge of a lack of knowledge, literature, or data, for any number of the above problems. Even so, governments and industry need to proceed with surveillance programs which aim to detect incursions in order to meet environmental, social and political requirements. We present an approach taken to meet these challenges in one comprehensive and statistically powerful surveillance design for non-indigenous terrestrial vertebrates on Barrow Island, a high conservation nature reserve off the Western Australian coast. Here, the possibility of incursions is increased due to construction and expanding industry on the island. The design, which includes mammals, amphibians and reptiles, provides a complete surveillance program for most potential terrestrial vertebrate invaders. Individual surveillance systems were developed for various potential invaders, and then integrated into an overall surveillance system which meets the above challenges using a statistical model and expert elicitation. 
We discuss the ecological basis for the design, the flexibility of the surveillance scheme, how it meets the above challenges, design limitations, and how it can be updated as data are collected as a basis for adaptive management.