545 resultados para models, genetic


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Patterns of connectivity among local populations influence the dynamics of regional systems, but most ecological models have concentrated on explaining the effect of connectivity on local population structure using dynamic processes covering short spatial and temporal scales. In this study, a model was developed in an extended spatial system to examine the hypothesis that long term connectivity levels among local populations are influenced by the spatial distribution of resources and other habitat factors. The habitat heterogeneity model was applied to local wild rabbit populations in the semi-arid Mitchell region of southern central Queensland (the Eastern system). Species' specific population parameters which were appropriate for the rabbit in this region were used. The model predicted a wide range of long term connectivity levels among sites, ranging from the extreme isolation of some sites to relatively high interaction probabilities for others. The validity of model assumptions was assessed by regressing model output against independent population genetic data, and explained over 80% of the variation in the highly structured genetic data set. Furthermore, the model was robust, explaining a significant proportion of the variation in the genetic data over a wide range of parameters. The performance of the habitat heterogeneity model was further assessed by simulating the widely reported recent range expansion of the wild rabbit into the Mitchell region from the adjacent, panmictic Western rabbit population system. The model explained well the independently determined genetic characteristics of the Eastern system at different hierarchic levels, from site specific differences (for example, fixation of a single allele in the population at one site), to differences between population systems (absence of an allele in the Eastern system which is present in all Western system sites). The model therefore explained the past and long term processes which have led to the formation and maintenance of the highly structured Eastern rabbit population system. Most animals exhibit sex biased dispersal which may influence long term connectivity levels among local populations, and thus the dynamics of regional systems. When appropriate sex specific dispersal characteristics were used, the habitat heterogeneity model predicted substantially different interaction patterns between female-only and combined male and female dispersal scenarios. In the latter case, model output was validated using data from a bi-parentally inherited genetic marker. Again, the model explained over 80% of the variation in the genetic data. The fact that such a large proportion of variability is explained in two genetic data sets provides very good evidence that habitat heterogeneity influences long term connectivity levels among local rabbit populations in the Mitchell region for both males and females. The habitat heterogeneity model thus provides a powerful approach for understanding the large scale processes that shape regional population systems in general. Therefore the model has the potential to be useful as a tool to aid in the management of those systems, whether it be for pest management or conservation purposes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Balimau Putih [an Indonesian cultivar tolerant to rice tungro bacilliform virus (RTBV)] was crossed with IR64 (RTBV, susceptible variety) to produce the three filial generations F1, F2 and F3. Agroinoculation was used to introduce RTBV into the test plants. RTBV tolerance was based on the RTBV level in plants by analysis of coat protein using enzyme-linked immunosorbent assay. The level of RTBV in cv. Balimau Putih was significantly lower than that of IR64 and the susceptible control, Taichung Native 1. Mean RTBV levels of the F1, F2 and F3 populations were comparable with one another and with the average of the parents. Results indicate that there was no dominance and an additive gene action may control the expression of tolerance to RTBV. Tolerance based on the level of RTBV coat protein was highly heritable (0.67) as estimated using the mean values of F3 lines, suggesting that selection for tolerance to RTBV can be performed in the early selfing generations using the technique employed in this study. The RTBV level had a negative correlation with plant height, but positive relationship with disease index value

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A national-level safety analysis tool is needed to complement existing analytical tools for assessment of the safety impacts of roadway design alternatives. FHWA has sponsored the development of the Interactive Highway Safety Design Model (IHSDM), which is roadway design and redesign software that estimates the safety effects of alternative designs. Considering the importance of IHSDM in shaping the future of safety-related transportation investment decisions, FHWA justifiably sponsored research with the sole intent of independently validating some of the statistical models and algorithms in IHSDM. Statistical model validation aims to accomplish many important tasks, including (a) assessment of the logical defensibility of proposed models, (b) assessment of the transferability of models over future time periods and across different geographic locations, and (c) identification of areas in which future model improvements should be made. These three activities are reported for five proposed types of rural intersection crash prediction models. The internal validation of the model revealed that the crash models potentially suffer from omitted variables that affect safety, site selection and countermeasure selection bias, poorly measured and surrogate variables, and misspecification of model functional forms. The external validation indicated the inability of models to perform on par with model estimation performance. Recommendations for improving the state of the practice from this research include the systematic conduct of carefully designed before-and-after studies, improvements in data standardization and collection practices, and the development of analytical methods to combine the results of before-and-after studies with cross-sectional studies in a meaningful and useful way.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the optimization of conductor size and the voltage regulator location & magnitude of long rural distribution lines. The optimization minimizes the lifetime cost of the lines, including capital costs and losses while observing voltage drop and operational constraints using a Genetic Algorithm (GA). The GA optimization is applied to a real Single Wire Earth Return (SWER) network in regional Queensland and results are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper reports on the study of passenger experiences and how passengers interact with services, technology and processes at an airport. As part of our research, we have followed people through the airport from check-in to security and from security to boarding. Data was collected by approaching passengers in the departures concourse of the airport and asking for their consent to be videotaped. Data was collected and coded and the analysis focused on both discretionary and process related passenger activities. Our findings show the interdependence between activities and passenger experiences. Within all activities, passengers interact with processes, domain dependent technology, services, personnel and artifacts. These levels of interaction impact on passenger experiences and are interdependent. The emerging taxonomy of activities consists of (i) ownership related activities, (ii) group activities, (iii) individual activities (such as activities at the domain interfaces) and (iv) concurrent activities. This classification is contributing to the development of descriptive models of passenger experiences and how these activities affect the facilitation and design of future airports.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Advances in safety research—trying to improve the collective understanding of motor vehicle crash causation—rests upon the pursuit of numerous lines of inquiry. The research community has focused on analytical methods development (negative binomial specifications, simultaneous equations, etc.), on better experimental designs (before-after studies, comparison sites, etc.), on improving exposure measures, and on model specification improvements (additive terms, non-linear relations, etc.). One might think of different lines of inquiry in terms of ‘low lying fruit’—areas of inquiry that might provide significant improvements in understanding crash causation. It is the contention of this research that omitted variable bias caused by the exclusion of important variables is an important line of inquiry in safety research. In particular, spatially related variables are often difficult to collect and omitted from crash models—but offer significant ability to better understand contributing factors to crashes. This study—believed to represent a unique contribution to the safety literature—develops and examines the role of a sizeable set of spatial variables in intersection crash occurrence. In addition to commonly considered traffic and geometric variables, examined spatial factors include local influences of weather, sun glare, proximity to drinking establishments, and proximity to schools. The results indicate that inclusion of these factors results in significant improvement in model explanatory power, and the results also generally agree with expectation. The research illuminates the importance of spatial variables in safety research and also the negative consequences of their omissions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Crash prediction models are used for a variety of purposes including forecasting the expected future performance of various transportation system segments with similar traits. The influence of intersection features on safety have been examined extensively because intersections experience a relatively large proportion of motor vehicle conflicts and crashes compared to other segments in the transportation system. The effects of left-turn lanes at intersections in particular have seen mixed results in the literature. Some researchers have found that left-turn lanes are beneficial to safety while others have reported detrimental effects on safety. This inconsistency is not surprising given that the installation of left-turn lanes is often endogenous, that is, influenced by crash counts and/or traffic volumes. Endogeneity creates problems in econometric and statistical models and is likely to account for the inconsistencies reported in the literature. This paper reports on a limited-information maximum likelihood (LIML) estimation approach to compensate for endogeneity between left-turn lane presence and angle crashes. The effects of endogeneity are mitigated using the approach, revealing the unbiased effect of left-turn lanes on crash frequency for a dataset of Georgia intersections. The research shows that without accounting for endogeneity, left-turn lanes ‘appear’ to contribute to crashes; however, when endogeneity is accounted for in the model, left-turn lanes reduce angle crash frequencies as expected by engineering judgment. Other endogenous variables may lurk in crash models as well, suggesting that the method may be used to correct simultaneity problems with other variables and in other transportation modeling contexts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Statistical modeling of traffic crashes has been of interest to researchers for decades. Over the most recent decade many crash models have accounted for extra-variation in crash counts—variation over and above that accounted for by the Poisson density. The extra-variation – or dispersion – is theorized to capture unaccounted for variation in crashes across sites. The majority of studies have assumed fixed dispersion parameters in over-dispersed crash models—tantamount to assuming that unaccounted for variation is proportional to the expected crash count. Miaou and Lord [Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40] challenged the fixed dispersion parameter assumption, and examined various dispersion parameter relationships when modeling urban signalized intersection accidents in Toronto. They suggested that further work is needed to determine the appropriateness of the findings for rural as well as other intersection types, to corroborate their findings, and to explore alternative dispersion functions. This study builds upon the work of Miaou and Lord, with exploration of additional dispersion functions, the use of an independent data set, and presents an opportunity to corroborate their findings. Data from Georgia are used in this study. A Bayesian modeling approach with non-informative priors is adopted, using sampling-based estimation via Markov Chain Monte Carlo (MCMC) and the Gibbs sampler. A total of eight model specifications were developed; four of them employed traffic flows as explanatory factors in mean structure while the remainder of them included geometric factors in addition to major and minor road traffic flows. The models were compared and contrasted using the significance of coefficients, standard deviance, chi-square goodness-of-fit, and deviance information criteria (DIC) statistics. The findings indicate that the modeling of the dispersion parameter, which essentially explains the extra-variance structure, depends greatly on how the mean structure is modeled. In the presence of a well-defined mean function, the extra-variance structure generally becomes insignificant, i.e. the variance structure is a simple function of the mean. It appears that extra-variation is a function of covariates when the mean structure (expected crash count) is poorly specified and suffers from omitted variables. In contrast, when sufficient explanatory variables are used to model the mean (expected crash count), extra-Poisson variation is not significantly related to these variables. If these results are generalizable, they suggest that model specification may be improved by testing extra-variation functions for significance. They also suggest that known influences of expected crash counts are likely to be different than factors that might help to explain unaccounted for variation in crashes across sites

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Predicting safety on roadways is standard practice for road safety professionals and has a corresponding extensive literature. The majority of safety prediction models are estimated using roadway segment and intersection (microscale) data, while more recently efforts have been undertaken to predict safety at the planning level (macroscale). Safety prediction models typically include roadway, operations, and exposure variables—factors known to affect safety in fundamental ways. Environmental variables, in particular variables attempting to capture the effect of rain on road safety, are difficult to obtain and have rarely been considered. In the few cases weather variables have been included, historical averages rather than actual weather conditions during which crashes are observed have been used. Without the inclusion of weather related variables researchers have had difficulty explaining regional differences in the safety performance of various entities (e.g. intersections, road segments, highways, etc.) As part of the NCHRP 8-44 research effort, researchers developed PLANSAFE, or planning level safety prediction models. These models make use of socio-economic, demographic, and roadway variables for predicting planning level safety. Accounting for regional differences - similar to the experience for microscale safety models - has been problematic during the development of planning level safety prediction models. More specifically, without weather related variables there is an insufficient set of variables for explaining safety differences across regions and states. Furthermore, omitted variable bias resulting from excluding these important variables may adversely impact the coefficients of included variables, thus contributing to difficulty in model interpretation and accuracy. This paper summarizes the results of an effort to include weather related variables, particularly various measures of rainfall, into accident frequency prediction and the prediction of the frequency of fatal and/or injury degree of severity crash models. The purpose of the study was to determine whether these variables do in fact improve overall goodness of fit of the models, whether these variables may explain some or all of observed regional differences, and identifying the estimated effects of rainfall on safety. The models are based on Traffic Analysis Zone level datasets from Michigan, and Pima and Maricopa Counties in Arizona. Numerous rain-related variables were found to be statistically significant, selected rain related variables improved the overall goodness of fit, and inclusion of these variables reduced the portion of the model explained by the constant in the base models without weather variables. Rain tends to diminish safety, as expected, in fairly complex ways, depending on rain frequency and intensity.