961 results for STATISTICAL METHODOLOGY
Abstract:
Expert elicitation is the process of retrieving and quantifying expert knowledge in a particular domain. Such information is of particular value when the empirical data are expensive, limited, or unreliable. This paper describes a new software tool, called Elicitator, which assists in quantifying expert knowledge in a form suitable for use as a prior model in Bayesian regression. Potential environmental domains for applying this elicitation tool include habitat modeling, assessing detectability or eradication, ecological condition assessments, risk analysis, and quantifying inputs to complex models of ecological processes. The tool has been developed to be user-friendly and extensible, and to facilitate consistent and repeatable elicitation of expert knowledge across these various domains. We demonstrate its application to elicitation for logistic regression in a geographically based ecological context. The underlying statistical methodology is also novel, utilizing an indirect elicitation approach to target expert knowledge on a case-by-case basis. For several elicitation sites (or cases), experts are asked simply to quantify their estimated ecological response (e.g. probability of presence), and its range of plausible values, after inspecting (habitat) covariates via GIS.
Abstract:
Survival probability prediction using a covariate-based hazard approach is a known statistical methodology in engineering asset health management. We have previously reported the semi-parametric Explicit Hazard Model (EHM), which incorporates three types of information for hazard prediction: population characteristics, condition indicators, and operating environment indicators. This model assumes that the baseline hazard has the form of the Weibull distribution. To avoid this assumption, this paper presents the non-parametric EHM, which is a distribution-free covariate-based hazard model. An application of the non-parametric EHM is demonstrated via a case study, in which survival probabilities of a set of resistance elements obtained from the non-parametric EHM are compared with those from the Weibull proportional hazard model and the traditional Weibull model. The results show that the non-parametric EHM can effectively predict asset life using the condition indicator, operating environment indicator, and failure history.
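For orientation, the two Weibull comparison models named in this abstract can be sketched as follows. This is a minimal illustration of the textbook Weibull survival function and the Weibull proportional hazard form, not the EHM itself, whose equations the abstract does not give. The function names, parameter names, and the example values are assumptions made for illustration only.

```python
import math


def weibull_survival(t, shape, scale):
    """Traditional Weibull model: S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_ph_survival(t, shape, scale, coefs, covariates):
    """Weibull proportional hazard model.

    The baseline cumulative hazard (t / scale) ** shape is multiplied by
    exp(sum(coef * covariate)), so covariates scale the hazard up or down
    without changing its shape over time.
    """
    linear_predictor = sum(b * z for b, z in zip(coefs, covariates))
    cumulative_hazard = ((t / scale) ** shape) * math.exp(linear_predictor)
    return math.exp(-cumulative_hazard)


# Hypothetical values: one condition indicator and one environment indicator.
baseline = weibull_survival(500.0, shape=1.5, scale=1000.0)
stressed = weibull_ph_survival(500.0, 1.5, 1000.0, coefs=[0.8, 0.3], covariates=[1.0, 2.0])
```

With all covariates at zero, the proportional hazard model reduces to the traditional Weibull model, which is why the two are natural baselines for a covariate-based alternative.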
Abstract:
The emergence of highly chloroquine (CQ) resistant P. vivax in Southeast Asia has created an urgent need for an improved understanding of the mechanisms of drug resistance in these parasites, the development of robust tools for defining the spread of resistance, and the discovery of new antimalarial agents. The ex vivo Schizont Maturation Test (SMT), originally developed for the study of P. falciparum, has been modified for P. vivax. We retrospectively analysed the results from 760 parasite isolates assessed by the modified SMT to investigate the relationship between parasite growth dynamics and parasite susceptibility to antimalarial drugs. Previous observations of the stage-specific activity of CQ against P. vivax were confirmed, and shown to have profound consequences for interpretation of the assay. Using a nonlinear model, we show that increased duration of the assay and a higher proportion of ring stages in the initial blood sample were associated with decreased effective concentration (EC50) values of CQ, and we identify a threshold at which these associations no longer hold. Thus, the starting composition of parasites in the SMT and the duration of the assay can have a profound effect on the calculated EC50 for CQ. Our findings indicate that EC50 values from assays with a duration of less than 34 hours do not truly reflect the sensitivity of the parasite to CQ, nor do values from assays in which the proportion of ring-stage parasites at the start does not exceed 66%. Application of this threshold modelling approach suggests that similar issues may occur for susceptibility testing of amodiaquine and mefloquine. The statistical methodology which has been developed also provides a novel means of detecting stage-specific drug activity for new antimalarials.
Abstract:
This is a discussion of the journal article: "Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation". The article and discussion have appeared in the Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Abstract:
For users of germplasm collections, the purpose of measuring characterization and evaluation descriptors, and subsequently using statistical methodology to summarize the data, is not only to interpret the relationships between the descriptors, but also to characterize the differences and similarities between accessions in relation to their phenotypic variability for each of the measured descriptors. The set of descriptors for the accessions of most germplasm collections consists of both numerical and categorical descriptors. This poses problems for a combined analysis of all descriptors because few statistical techniques deal with mixtures of measurement types. In this article, nonlinear principal component analysis was used to analyze the descriptors of the accessions in the Australian groundnut collection. It was demonstrated that the nonlinear variant of ordinary principal component analysis is an appropriate analytical tool because subspecies and botanical varieties could be identified on the basis of the analysis and characterized in terms of all descriptors. Moreover, outlying accessions could be easily spotted and their characteristics established. The statistical results and their interpretations provide users with a more efficient way to identify accessions of potential relevance for their plant improvement programs and encourage and improve the usefulness and utilization of germplasm collections.
Abstract:
With a growing population and fast urbanization in Australia, maintaining our water quality is a challenging task. It is essential to develop an appropriate statistical methodology for analyzing water quality data in order to draw valid conclusions and hence provide useful advice for water management. This paper develops robust rank-based procedures for analyzing nonnormally distributed data collected over time at different sites. To take account of temporal correlations of the observations within sites, we consider the optimally combined estimating functions proposed by Wang and Zhu (Biometrika, 93:459-464, 2006), which lead to more efficient parameter estimation. Furthermore, we apply the induced smoothing method to reduce the computational burden. Smoothing leads to easy calculation of the parameter estimates and their variance-covariance matrix. Analysis of water quality data on Total Iron and Total Cyanophytes shows the differences between the traditional generalized linear mixed models and rank regression models. Our analysis also demonstrates the advantages of the rank regression models for analyzing nonnormal data.
Abstract:
Environmental data usually include measurements, such as water quality data, which fall below detection limits because of limitations of the instruments or of certain analytical methods used. The fact that some responses are not detected needs to be properly taken into account in statistical analysis of such data. However, it is well known that analyzing a data set with detection limits is challenging, and we often have to rely on traditional parametric methods or simple imputation methods. Distributional assumptions can lead to biased inference, and justification of distributions is often not possible when the data are correlated and a large proportion of the data lie below detection limits. The extent of bias is usually unknown. To draw valid conclusions and hence provide useful advice for environmental management authorities, it is essential to develop and apply an appropriate statistical methodology. This paper proposes rank-based procedures for analyzing non-normally distributed data collected at different sites over a period of time in the presence of multiple detection limits. To take account of temporal correlations within each site, we propose an optimal linear combination of estimating functions and apply the induced smoothing method to reduce the computational burden. Finally, we apply the proposed method to water quality data collected in the Susquehanna River Basin in the United States of America, which clearly demonstrates the advantages of the rank regression models.
Abstract:
This study investigated whether mixed-species designs can increase the growth of a tropical eucalypt when compared to monocultures. Monocultures of Eucalyptus pellita (E) and Acacia peregrina (A) and mixtures in various proportions (75E:25A, 50E:50A, 25E:75A) were planted in a replacement series design on the Atherton Tablelands of north Queensland, Australia. High mortality in the establishment phase due to repeated damage by tropical cyclones altered the trial design. Effects of experimental designs on tree growth were estimated using a linear mixed-effects model with restricted maximum likelihood analysis (REML). Volume growth of individual eucalypt trees was positively affected by the presence of acacia trees at age 5 years, and this effect generally increased with time up to age 10 years. However, the stand volume and basal area increased with increasing proportions of E. pellita, due to its larger individual tree size. Conventional analysis did not offer convincing support for mixed-species designs. Preliminary individual-based modelling using a modified Hegyi competition index offered a solution and an equation that indicates acacias have positive ecological interactions (facilitation or competitive reduction) and definitely do not cause competition like a eucalypt. These results suggest that significantly increased growth rates could be achieved with mixed-species designs. This statistical methodology could enable a better understanding of species interactions in similarly altered experiments, or undesigned mixed-species plantations.
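As background to the competition index mentioned in this abstract, the sketch below computes the standard distance-weighted Hegyi index, in which each neighbour within a search radius contributes its diameter relative to the subject tree, divided by the distance between them. The abstract does not say how the authors modified the index, so this is the unmodified textbook form only. The tree tuple layout, the function name, and the example stand are assumptions made for illustration.

```python
import math


def hegyi_index(trees, subject, radius):
    """Standard Hegyi competition index for one subject tree.

    trees   : list of (x, y, diameter) tuples, coordinates in metres
    subject : position in `trees` of the tree being assessed
    radius  : search radius in metres; neighbours farther away are ignored

    CI = sum over neighbours j of (d_j / d_subject) / distance_j
    """
    sx, sy, sd = trees[subject]
    total = 0.0
    for k, (x, y, d) in enumerate(trees):
        if k == subject:
            continue
        dist = math.hypot(x - sx, y - sy)
        # Skip neighbours outside the radius or at the same coordinates.
        if 0.0 < dist <= radius:
            total += (d / sd) / dist
    return total


# Hypothetical stand: subject tree at the origin with two neighbours.
stand = [(0.0, 0.0, 20.0), (3.0, 4.0, 10.0), (0.0, 10.0, 40.0)]
near_only = hegyi_index(stand, subject=0, radius=6.0)
both = hegyi_index(stand, subject=0, radius=10.0)
```

A larger value means more competitive pressure on the subject tree. A species-aware variant could, for example, weight neighbours differently by species, but any such weighting would be a guess about the modification the abstract alludes to.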
Abstract:
We present a statistical methodology for estimating leakage power, due to subthreshold and gate tunneling leakage, in the presence of process variations for 65 nm CMOS. The circuit leakage power variation is analyzed by Monte Carlo (MC) simulations, by characterizing a NAND gate library. A statistical “hybrid model” is proposed to extend this methodology to a generic library. We demonstrate that hybrid model based statistical design results in up to 95% improvement in the prediction of worst to best corner leakage spread, with an error of less than 0.5%, with respect to worst case design.
Abstract:
BACKGROUND: In a time-course microarray experiment, the expression level for each gene is observed across a number of time-points in order to characterize the temporal trajectories of the gene-expression profiles. For many of these experiments, the scientific aim is the identification of genes for which the trajectories depend on an experimental or phenotypic factor. There is an extensive recent body of literature on statistical methodology for addressing this analytical problem. Most of the existing methods are based on estimating the time-course trajectories using parametric or non-parametric mean regression methods. The sensitivity of these regression methods to outliers, an issue that is well documented in the statistical literature, should be of concern when analyzing microarray data. RESULTS: In this paper, we propose a robust testing method for identifying genes whose expression time profiles depend on a factor. Furthermore, we propose a multiple testing procedure to adjust for multiplicity. CONCLUSIONS: Through an extensive simulation study, we illustrate the performance of our method. Finally, we report the results from applying our method to a case study and discuss potential extensions.
Abstract:
The main interest in the assessment of forest species diversity for conservation purposes is in the rare species. The main problem in the tropical rain forests is that most of the species are rare. Assessment of species diversity in the tropical rain forests is therefore often concerned with estimating that which is not observed in recorded samples. Statistical methodology is therefore required to estimate the truncated tail of the species frequency distribution, or to estimate the asymptote of species/diversity-area curves. A Horvitz-Thompson estimator of the number of unobserved (“virtual”) species in each species intensity class is proposed. The approach allows an extended definition of diversity (or generalised Rényi entropy). The paper presents a case study using data collected in Jambi, Sumatra, and the “extended diversity measure” is applied to the species data.
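The Horvitz-Thompson idea referred to here is to weight each observed unit by the inverse of its probability of being observed. The sketch below applies that idea to species counts per abundance class. The abstract does not state the detection model, so the probability used here, 1 - exp(-k) for a species seen k times, is an assumed Poisson-style choice made purely for illustration, as are the function name and the example counts.

```python
import math


def horvitz_thompson_unseen(class_counts):
    """Estimate unobserved species per abundance class.

    class_counts maps k (individuals recorded per species) to the number
    of observed species with that abundance.

    ASSUMPTION: a species with abundance k is detected with probability
    1 - exp(-k). Each observed species then stands for 1 / p species in
    total, of which 1 / p - 1 are unobserved.
    """
    unseen = {}
    for k, n_species in class_counts.items():
        p_detect = 1.0 - math.exp(-k)
        unseen[k] = n_species * (1.0 / p_detect - 1.0)
    return unseen


# Hypothetical sample: 10 singletons, 6 doubletons, 2 species seen five times.
estimates = horvitz_thompson_unseen({1: 10, 2: 6, 5: 2})
total_unseen = sum(estimates.values())
```

Under this assumed model the rarest classes dominate the estimate, which matches the abstract's point that the unobserved tail of the frequency distribution is where most of the uncertainty lies.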
Abstract:
Trends in sample extremes are of interest in many contexts, an example being environmental statistics. Parametric models are often used to model trends in such data, but they may not be suitable for exploratory data analysis. This paper outlines a semiparametric approach to smoothing sample extremes, based on local polynomial fitting of the generalized extreme value distribution and related models. The uncertainty of fits is assessed by using resampling methods. The methods are applied to data on extreme temperatures and on record times for the women's 3000 m race.
Abstract:
Thesis submitted in partial fulfilment of the requirements for the degree of Doctor in Statistics and Information Management at the Instituto Superior de Estatística e Gestão de Informação of the Universidade Nova de Lisboa