Digital Library

713 results for mistimed covariates

Incomplete pregnancy and risk of ovarian cancer : results from two Australian case-control studies and systematic review

Relevance: 10.00%

Abstract:

Although full-term pregnancies reduce the risk of ovarian cancer, it has not been conclusively established whether incomplete pregnancies also influence risk. We investigated the relationship between a history of incomplete pregnancy and incident epithelial ovarian cancer among over 4,500 women who participated in two large Australian population-based case-control studies in 1990-1993 and 2002-2005. They provided responses to detailed questions about their reproductive histories and other personal factors. Summary odds ratios (OR) and confidence intervals (CI) derived from each study using the same covariates were aggregated. We found no significant associations between the number of incomplete pregnancies and ovarian cancer, for parous (OR = 0.98, 95% CI: 0.89, 1.08) or nulliparous (OR = 1.06, 95% CI: 0.75, 1.48) women, nor for the number of spontaneous or induced abortions and ovarian cancer for parous women (OR = 0.95, 95% CI 0.82, 1.09; OR = 1.08, 95% CI: 0.86, 1.36) or nulliparous women (OR = 1.2, 95% CI: 0.6, 2.4; OR = 0.8, 95% CI: 0.47, 1.38), respectively. A systematic review of 37 previous studies of the topic confirmed our findings that a history of incomplete pregnancy does not influence a woman’s risk of epithelial ovarian cancer.
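
The abstract states that study-specific odds ratios were aggregated but does not name the pooling method. The sketch below assumes a fixed-effect, inverse-variance pooling on the log-odds scale, which is one common choice; the function name `pool_odds_ratios` and the two input triples are hypothetical illustrations, not values from the paper.

```python
import math

def pool_odds_ratios(estimates, z=1.96):
    """Fixed-effect inverse-variance pooling of odds ratios (assumed method).

    estimates: list of (odds_ratio, ci_low, ci_high) tuples with 95% CIs.
    Returns (pooled_or, ci_low, ci_high).
    """
    weights = []
    log_ors = []
    for or_, lo, hi in estimates:
        # Recover the standard error of log(OR) from the CI width.
        se = (math.log(hi) - math.log(lo)) / (2 * z)
        weights.append(1.0 / se ** 2)
        log_ors.append(math.log(or_))
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total
    se_pooled = math.sqrt(1.0 / total)
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled))

# Hypothetical per-study inputs, NOT taken from the paper.
study_a = (0.95, 0.80, 1.13)
study_b = (1.02, 0.88, 1.18)
pooled_or, pooled_lo, pooled_hi = pool_odds_ratios([study_a, study_b])
```

The pooled estimate always lies between the study-specific estimates, and its interval is narrower than either input interval because the weights add.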

A fully Bayesian approach to inference for Coxian phase-type distributions with covariate dependent mean

Relevance: 10.00%

Abstract:

Phase-type distributions represent the time to absorption for a finite state Markov chain in continuous time, generalising the exponential distribution and providing a flexible and useful modelling tool. We present a new reversible jump Markov chain Monte Carlo scheme for performing a fully Bayesian analysis of the popular Coxian subclass of phase-type models; the convenient Coxian representation involves fewer parameters than a more general phase-type model. The key novelty of our approach is that we model covariate dependence in the mean whilst using the Coxian phase-type model as a very general residual distribution. Such incorporation of covariates into the model has not previously been attempted in the Bayesian literature. A further novelty is that we also propose a reversible jump scheme for investigating structural changes to the model brought about by the introduction of Erlang phases. Our approach addresses more questions of inference than previous Bayesian treatments of this model and is automatic in nature. We analyse an example dataset comprising lengths of hospital stays of a sample of patients collected from two Australian hospitals to produce a model for a patient's expected length of stay which incorporates the effects of several covariates. This leads to interesting conclusions about what contributes to length of hospital stay with implications for hospital planning. We compare our results with an alternative classical analysis of these data.
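
As a small illustration of the Coxian structure described above, the sketch below computes the mean time to absorption of a Coxian phase-type distribution. It assumes the usual parameterisation (the chain starts in phase 1, leaves phase i at rate `rates[i]`, and either moves on to the next phase with probability `continue_probs[i]` or is absorbed); the function name is hypothetical, and this is not the paper's reversible jump scheme or its covariate model.

```python
def coxian_mean(rates, continue_probs):
    """Mean time to absorption of a Coxian phase-type distribution.

    rates[i]          : exit rate of phase i (must be > 0).
    continue_probs[i] : probability of moving from phase i to phase i+1
                        instead of absorbing; length is len(rates) - 1.
    """
    if len(continue_probs) != len(rates) - 1:
        raise ValueError("need one continuation probability per transition")
    mean = 0.0
    reach = 1.0  # probability that the chain reaches the current phase
    for i, rate in enumerate(rates):
        mean += reach / rate  # expected sojourn time, weighted by reach
        if i < len(continue_probs):
            reach *= continue_probs[i]
    return mean
```

With a single phase this reduces to the exponential mean 1/rate, and with equal rates and all continuation probabilities equal to 1 it gives the Erlang mean.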

The explicit hazard model – part 1: theoretical development

Relevance: 10.00%

Abstract:

Modern Engineering Asset Management (EAM) requires accurate assessment of current asset health condition and prediction of its future condition. Appropriate mathematical models that are capable of estimating times to failure and the probability of future failures are essential in EAM. In most real-life situations, the lifetime of an engineering asset is influenced and/or indicated by different factors that are termed covariates. Hazard prediction with covariates is a fundamental notion in reliability theory: it estimates the tendency of an engineering asset to fail instantaneously beyond the current time, given that it has already survived up to the current time. A number of statistical covariate-based hazard models have been developed. However, none of them has explicitly incorporated both external and internal covariates into one model. This paper introduces a novel covariate-based hazard model to address this concern. This model is named the Explicit Hazard Model (EHM). Both the semi-parametric and non-parametric forms of this model are presented in the paper. The major purpose of this paper is to illustrate the theoretical development of EHM. Due to page limitations, a case study with reliability field data is presented in the applications part of this study.

The Explicit Hazard Model – Part 2 : Applications

Relevance: 10.00%

Abstract:

Hazard and reliability prediction of an engineering asset is one of the significant fields of research in Engineering Asset Health Management (EAHM). In real-life situations where an engineering asset operates under dynamic operational and environmental conditions, the lifetime of an engineering asset can be influenced and/or indicated by different factors that are termed covariates. The Explicit Hazard Model (EHM), a covariate-based hazard model, is a new approach to hazard prediction which explicitly incorporates both internal and external covariates into one model. EHM is an appropriate model for the analysis of lifetime data in the presence of both internal and external covariates in the reliability field. This paper presents applications of the methodology introduced and illustrated in the theory part of this study. In this paper, the semi-parametric EHM is applied to a case study to predict the hazard and reliability of resistance elements on a Resistance Corrosion Sensor Board (RCSB).

Elicitator : an expert elicitation tool for regression in ecology

Relevance: 10.00%

Abstract:

Expert elicitation is the process of retrieving and quantifying expert knowledge in a particular domain. Such information is of particular value when the empirical data is expensive, limited, or unreliable. This paper describes a new software tool, called Elicitator, which assists in quantifying expert knowledge in a form suitable for use as a prior model in Bayesian regression. Potential environmental domains for applying this elicitation tool include habitat modeling, assessing detectability or eradication, ecological condition assessments, risk analysis, and quantifying inputs to complex models of ecological processes. The tool has been developed to be user-friendly, extensible, and facilitate consistent and repeatable elicitation of expert knowledge across these various domains. We demonstrate its application to elicitation for logistic regression in a geographically based ecological context. The underlying statistical methodology is also novel, utilizing an indirect elicitation approach to target expert knowledge on a case-by-case basis. For several elicitation sites (or cases), experts are asked simply to quantify their estimated ecological response (e.g. probability of presence), and its range of plausible values, after inspecting (habitat) covariates via GIS.

Motivation, development and validation of a new spectral greenness index : a spectral dimension related to foliage projective cover

Relevance: 10.00%

Abstract:

A method is presented for the development of a regional Landsat-5 Thematic Mapper (TM) and Landsat-7 Enhanced Thematic Mapper plus (ETM+) spectral greenness index, coherent with a six-dimensional index set, based on a single ETM+ spectral image of a reference landscape. The first three indices of the set are determined by a polar transformation of the first three principal components of the reference image and relate to scene brightness, percent foliage projective cover (FPC) and water-related features. The remaining three principal components, of diminishing significance with respect to the reference image, complete the set. The reference landscape, a 2200 km² area containing a mix of cattle pasture, native woodland and forest, is located near Injune in South East Queensland, Australia. The indices developed from the reference image were tested using TM spectral images from 19 regionally dispersed areas in Queensland, representative of dissimilar landscapes containing woody vegetation ranging from tall closed forest to low open woodland. Examples of image transformations and two-dimensional feature space plots are used to demonstrate image interpretations related to the first three indices. Coherent, sensible interpretations of landscape features in images composed of the first three indices can be made in terms of brightness (red), foliage cover (green) and water (blue). A limited comparison is made with similar existing indices. The proposed greenness index was found to be very strongly related to FPC and insensitive to smoke. A novel Bayesian bounded-space modelling method was used to validate the greenness index as a good predictor of FPC. Airborne LiDAR (Light Detection and Ranging) estimates of FPC along transects of the 19 sites provided the training and validation data. Other spectral indices from the set were found to be useful as model covariates that could improve FPC predictions. 
They act to adjust the greenness/FPC relationship to suit different spectral backgrounds. The inclusion of an external meteorological covariate showed that further improvements to regional-scale predictions of FPC could be gained over those based on spectral indices alone.

Addressing issues in sparseness, ecological bias and formulation of the adjacency matrix in Bayesian spatio-temporal analysis of disease counts

Relevance: 10.00%

Abstract:

The main objective of this PhD was to further develop Bayesian spatio-temporal models (specifically the Conditional Autoregressive (CAR) class of models), for the analysis of sparse disease outcomes such as birth defects. The motivation for the thesis arose from problems encountered when analyzing a large birth defect registry in New South Wales. The specific components and related research objectives of the thesis were developed from gaps in the literature on current formulations of the CAR model, and health service planning requirements. Data from a large probabilistically-linked database from 1990 to 2004, consisting of fields from two separate registries: the Birth Defect Registry (BDR) and Midwives Data Collection (MDC) were used in the analyses in this thesis. The main objective was split into smaller goals. The first goal was to determine how the specification of the neighbourhood weight matrix will affect the smoothing properties of the CAR model, and this is the focus of chapter 6. Secondly, I hoped to evaluate the usefulness of incorporating a zero-inflated Poisson (ZIP) component as well as a shared-component model in terms of modeling a sparse outcome, and this is carried out in chapter 7. The third goal was to identify optimal sampling and sample size schemes designed to select individual level data for a hybrid ecological spatial model, and this is done in chapter 8. Finally, I wanted to put together the earlier improvements to the CAR model, and along with demographic projections, provide forecasts for birth defects at the SLA level. Chapter 9 describes how this is done. For the first objective, I examined a series of neighbourhood weight matrices, and showed how smoothing the relative risk estimates according to similarity by an important covariate (i.e. maternal age) helped improve the model’s ability to recover the underlying risk, as compared to the traditional adjacency (specifically the Queen) method of applying weights. 
Next, to address the sparseness and excess zeros commonly encountered in the analysis of rare outcomes such as birth defects, I compared a few models, including an extension of the usual Poisson model to encompass excess zeros in the data. This was achieved via a mixture model, which also encompassed the shared component model to improve on the estimation of sparse counts through borrowing strength across a shared component (e.g. latent risk factor/s) with the referent outcome (caesarean section was used in this example). Using the Deviance Information Criteria (DIC), I showed how the proposed model performed better than the usual models, but only when both outcomes shared a strong spatial correlation. The next objective involved identifying the optimal sampling and sample size strategy for incorporating individual-level data with areal covariates in a hybrid study design. I performed extensive simulation studies, evaluating thirteen different sampling schemes along with variations in sample size. This was done in the context of an ecological regression model that incorporated spatial correlation in the outcomes, as well as accommodating both individual and areal measures of covariates. Using the Average Mean Squared Error (AMSE), I showed how a simple random sample of 20% of the SLAs, followed by selecting all cases in the SLAs chosen, along with an equal number of controls, provided the lowest AMSE. The final objective involved combining the improved spatio-temporal CAR model with population (i.e. women) forecasts, to provide 30-year annual estimates of birth defects at the Statistical Local Area (SLA) level in New South Wales, Australia. The projections were illustrated using sixteen different SLAs, representing the various areal measures of socio-economic status and remoteness. A sensitivity analysis of the assumptions used in the projection was also undertaken. 
By the end of the thesis, I will show how challenges in the spatial analysis of rare diseases such as birth defects can be addressed, by specifically formulating the neighbourhood weight matrix to smooth according to a key covariate (i.e. maternal age), incorporating a ZIP component to model excess zeros in outcomes and borrowing strength from a referent outcome (i.e. caesarean counts). An efficient strategy to sample individual-level data and sample size considerations for rare disease will also be presented. Finally, projections in birth defect categories at the SLA level will be made.
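
To make the zero-inflated Poisson (ZIP) component mentioned above concrete, the sketch below evaluates the standard ZIP probability mass function: a mixture of a point mass at zero and an ordinary Poisson count. The function name and parameter names (`lam` for the Poisson rate, `pi` for the structural-zero probability) are illustrative assumptions; the thesis embeds this component in a spatio-temporal CAR model, which is not reproduced here.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and
    structural-zero probability pi (0 <= pi <= 1)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from the structural-zero part or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

Setting `pi = 0` recovers the ordinary Poisson model, so the extra parameter only adds probability mass at zero, which is what sparse outcomes with excess zeros call for.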

Brain-derived neurotrophic factor (BDNF) gene : no major impact on antidepressant treatment response

Relevance: 10.00%

Abstract:

The brain-derived neurotrophic factor (BDNF) has been suggested to play a pivotal role in the aetiology of affective disorders. In order to further clarify the impact of BDNF gene variation on major depression as well as antidepressant treatment response, association of three BDNF polymorphisms [rs7103411, Val66Met (rs6265) and rs7124442] with major depression and antidepressant treatment response was investigated in an overall sample of 268 German patients with major depression and 424 healthy controls. False discovery rate (FDR) was applied to control for multiple testing. Additionally, ten markers in BDNF were tested for association with citalopram outcome in the STAR*D sample. While BDNF was not associated with major depression as a categorical diagnosis, the BDNF rs7124442 TT genotype was significantly related to worse treatment outcome over 6 wk in major depression (p=0.01) particularly in anxious depression (p=0.003) in the German sample. However, BDNF rs7103411 and rs6265 similarly predicted worse treatment response over 6 wk in clinical subtypes of depression such as melancholic depression only (rs7103411: TTcovariates. The STAR*D analyses did not yield significant results at any of the ten BDNF markers. Our results do not support an association between genetic variation in BDNF and antidepressant treatment response or remission. Post-hoc analyses provide some preliminary support for a potential minor role of genetic variation in BDNF and antidepressant treatment outcome in the context of melancholic depression.

Bayesian models for longitudinal data

Relevance: 10.00%

Abstract:

Longitudinal data, where data are repeatedly observed or measured over time or age, provide the foundation for the analysis of processes which evolve over time, and the resulting models can be referred to as growth or trajectory models. One of the traditional ways of looking at growth models is to employ either linear or polynomial functional forms to model trajectory shape, and to account for variation around an overall mean trend with the inclusion of random effects or individual variation on the functional shape parameters. The identification of distinct subgroups or sub-classes (latent classes) within these trajectory models which are not based on some pre-existing individual classification provides an important methodology with substantive implications. The identification of subgroups or classes has wide application in the medical arena, where responder/non-responder identification based on distinctly differing trajectories delivers further information for clinical processes. This thesis develops Bayesian statistical models and techniques for the identification of subgroups in the analysis of longitudinal data where the number of time intervals is limited. These models are then applied to a single case study which investigates the neuropsychological cognition of early stage breast cancer patients undergoing adjuvant chemotherapy treatment, from the Cognition in Breast Cancer Study undertaken by the Wesley Research Institute of Brisbane, Queensland. Alternative formulations to the linear or polynomial approach are taken which use piecewise linear models with a single turning point, change-point or knot at a known time point, and latent basis models for the non-linear trajectories found for the verbal memory domain of cognitive function before and after chemotherapy treatment. 
Hierarchical Bayesian random effects models are used as a starting point for the latent class modelling process and are extended with the incorporation of covariates in the trajectory profiles and as predictors of class membership. The Bayesian latent basis models enable the degree of recovery post-chemotherapy to be estimated for short- and long-term follow-up occasions, and the distinct class trajectories assist in the identification of breast cancer patients who may be at risk of long-term verbal memory impairment.

Hierarchical models for 2D presence/absence data having ambiguous zeroes: With a biogeographical case study on dingo behaviour

Relevance: 10.00%

Abstract:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. 
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Development of accident modification factors on intersection crash maneuvers

Relevance: 10.00%

Abstract:

In order to estimate the safety impact of roadway interventions, engineers need to collect, analyze, and interpret the results of carefully implemented data collection efforts. The intent of these studies is to develop Accident Modification Factors (AMFs), which are used to predict the safety impact of various road safety features at other locations or upon future enhancements. Models are typically estimated to obtain AMFs for total crashes, but can and should be estimated for crash outcomes as well. This paper first describes data collected with the intent to estimate AMFs for rural intersections in the state of Georgia in the United States. Modeling results of crash prediction models for the crash outcomes angle, head-on, rear-end, sideswipe (same direction and opposite direction) and pedestrian-involved crashes are then presented and discussed. The analysis reveals that factors such as the Annual Average Daily Traffic (AADT), the presence of turning lanes, and the number of driveways have a positive association with each type of crash, while the median width and the presence of lighting are negatively associated with crashes. The model covariates are related to crash outcome in different ways, suggesting that crash outcomes are associated with different pre-crash conditions.

On the nature of over-dispersion in motor vehicle crash prediction models

Relevance: 10.00%

Abstract:

Statistical modeling of traffic crashes has been of interest to researchers for decades. Over the most recent decade many crash models have accounted for extra-variation in crash counts—variation over and above that accounted for by the Poisson density. The extra-variation – or dispersion – is theorized to capture unaccounted for variation in crashes across sites. The majority of studies have assumed fixed dispersion parameters in over-dispersed crash models—tantamount to assuming that unaccounted for variation is proportional to the expected crash count. Miaou and Lord [Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40] challenged the fixed dispersion parameter assumption, and examined various dispersion parameter relationships when modeling urban signalized intersection accidents in Toronto. They suggested that further work is needed to determine the appropriateness of the findings for rural as well as other intersection types, to corroborate their findings, and to explore alternative dispersion functions. This study builds upon the work of Miaou and Lord, with exploration of additional dispersion functions, the use of an independent data set, and presents an opportunity to corroborate their findings. Data from Georgia are used in this study. A Bayesian modeling approach with non-informative priors is adopted, using sampling-based estimation via Markov Chain Monte Carlo (MCMC) and the Gibbs sampler. A total of eight model specifications were developed; four of them employed traffic flows as explanatory factors in mean structure while the remainder of them included geometric factors in addition to major and minor road traffic flows. The models were compared and contrasted using the significance of coefficients, standard deviance, chi-square goodness-of-fit, and deviance information criteria (DIC) statistics. 
The findings indicate that the modeling of the dispersion parameter, which essentially explains the extra-variance structure, depends greatly on how the mean structure is modeled. In the presence of a well-defined mean function, the extra-variance structure generally becomes insignificant, i.e. the variance structure is a simple function of the mean. It appears that extra-variation is a function of covariates when the mean structure (expected crash count) is poorly specified and suffers from omitted variables. In contrast, when sufficient explanatory variables are used to model the mean (expected crash count), extra-Poisson variation is not significantly related to these variables. If these results are generalizable, they suggest that model specification may be improved by testing extra-variation functions for significance. They also suggest that known influences on expected crash counts are likely to be different from factors that might help to explain unaccounted-for variation in crashes across sites.

Effects of transportation accessibility on residential property values: Application of spatial hedonic price model in Seoul, South Korea, metropolitan area

Relevance: 10.00%

Abstract:

A number of studies have focused on estimating the effects of accessibility on housing values by using the hedonic price model. In the majority of studies, estimation results have revealed that housing values increase as accessibility improves, although the magnitude of estimates has varied across studies. Adequately estimating the relationship between transportation accessibility and housing values is challenging for at least two reasons. First, the monocentric city assumption applied in location theory is no longer valid for many large or growing cities. Second, rather than being randomly distributed in space, housing values are clustered in space—often exhibiting spatial dependence. Recognizing these challenges, a study was undertaken to develop a spatial lag hedonic price model in the Seoul, South Korea, metropolitan region, which includes a measure of local accessibility as well as systemwide accessibility, in addition to other model covariates. Although the accessibility measures can be improved, the modeling results suggest that the spatial interactions of apartment sales prices occur across and within traffic analysis zones, and the sales prices for apartment communities are devalued as accessibility deteriorates. Consistent with findings in other cities, this study revealed that the distance to the central business district is still a significant determinant of sales price.

Remaining useful life prediction using elliptical basis function network and Markov Chain

Relevance: 10.00%

Abstract:

This paper presents a novel method for remaining useful life prediction using the Elliptical Basis Function (EBF) network and a Markov chain. The EBF structure is trained by a modified Expectation-Maximization (EM) algorithm in order to take into account the missing covariate set. No explicit extrapolation is needed for internal covariates, while a Markov chain is constructed to represent the evolution of external covariates in the study. The estimated external and the unknown internal covariates constitute an incomplete covariate set, which is then used and analyzed by the EBF network to provide survival information for the asset. It is shown in the case study that the method slightly underestimates the remaining useful life of an asset, which is a desirable result for early maintenance decisions and resource planning.

Probabilistic subgroup identification using Bayesian finite mixture modelling : a case study in Parkinson’s disease phenotype identification

Relevance: 10.00%

Abstract:

This article explores the use of probabilistic classification, namely finite mixture modelling, for identification of complex disease phenotypes, given cross-sectional data. In particular, it focuses on posterior probabilities of subgroup membership, a standard output of finite mixture modelling, and how the quantification of uncertainty in these probabilities can lead to more detailed analyses. Using a Bayesian approach, we describe two practical uses of this uncertainty: (i) as a means of describing a person’s membership of a single or multiple latent subgroups and (ii) as a means of describing identified subgroups by patient-centred covariates not included in model estimation. These proposed uses are demonstrated on a case study in Parkinson’s disease (PD), where latent subgroups are identified using multiple symptoms from the Unified Parkinson’s Disease Rating Scale (UPDRS).
