113 results for genetics, statistical genetics, variable models


Relevance:

40.00%

Publisher:

Abstract:

We propose a model-based approach to unify clustering and network modeling using time-course gene expression data. Specifically, our approach uses a mixture model to cluster genes. Genes within the same cluster share a similar expression profile. The network is built over cluster-specific expression profiles using state-space models. We discuss the application of our model to simulated data as well as to time-course gene expression data arising from animal models of prostate cancer progression. The latter application shows that, with combined statistical and bioinformatics analyses, we are able to extract gene-to-gene relationships supported by the literature as well as new plausible relationships.

Relevance:

40.00%

Publisher:

Abstract:

There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions with each modeling approach, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states: perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of “excess” zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriately model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data.
A simulation experiment is then conducted to demonstrate how crash data give rise to the “excess” zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed, and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales and not an underlying dual-state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros.
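
The point about low exposure can be illustrated with a minimal sketch. It is not the authors' simulation experiment: it simply draws Poisson trials (independent Bernoulli trials with unequal probabilities) for a set of sites and compares the share of zero counts under low and high exposure. The function names, the number of sites, the number of trials, and the range of crash probabilities are all assumptions made for illustration.

```python
import numpy as np

def simulate_site_counts(n_sites, n_trials, p_low, p_high, seed=0):
    """Crash counts per site from Poisson trials (hypothetical setup).

    Every trial (e.g. one vehicle passage) gets its own crash probability,
    drawn uniformly from [p_low, p_high]; no dual-state process is involved.
    """
    rng = np.random.default_rng(seed)
    p = rng.uniform(p_low, p_high, size=(n_sites, n_trials))
    crashes = rng.random((n_sites, n_trials)) < p  # one Bernoulli draw per trial
    return crashes.sum(axis=1)

def zero_fraction(counts):
    """Share of sites that recorded no crash at all."""
    return float(np.mean(counts == 0))

# Same per-trial risk, different exposure (number of trials per site).
low_exposure = simulate_site_counts(500, 50, 0.0, 0.002)
high_exposure = simulate_site_counts(500, 2000, 0.0, 0.002)

print("zero share, low exposure: ", zero_fraction(low_exposure))
print("zero share, high exposure:", zero_fraction(high_exposure))
```

Under these assumed settings, most low-exposure sites record zero crashes even though every site follows the same single-state process, which is the pattern the abstract attributes to exposure and scale rather than to a "perfectly safe" state.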

Relevance:

40.00%

Publisher:

Abstract:

Continuum diffusion models are often used to represent the collective motion of cell populations. Most previous studies have simply used linear diffusion to represent collective cell spreading, while others found that degenerate nonlinear diffusion provides a better match to experimental cell density profiles. In the cell modeling literature there is no guidance available with regard to which approach is more appropriate for representing the spreading of cell populations. Furthermore, there is no knowledge of particular experimental measurements that can be made to distinguish between situations where these two models are appropriate. Here we provide a link between individual-based and continuum models using a multi-scale approach in which we analyze the collective motion of a population of interacting agents in a generalized lattice-based exclusion process. For round agents that occupy a single lattice site, we find that the relevant continuum description of the system is a linear diffusion equation, whereas for elongated rod-shaped agents that occupy L adjacent lattice sites we find that the relevant continuum description is connected to the porous media equation (PME). The exponent in the nonlinear diffusivity function is related to the aspect ratio of the agents. Our work provides a physical connection between modeling collective cell spreading and the use of either the linear diffusion equation or the PME to represent cell density profiles. Results suggest that when using continuum models to represent cell population spreading, we should take care to account for variations in the cell aspect ratio because different aspect ratios lead to different continuum models.
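
For reference, the two continuum descriptions contrasted above can be written in their standard one-dimensional textbook forms. The symbols are assumptions introduced here, not notation from the abstract: C(x,t) is cell density, D_0 a constant diffusivity, and m > 0 the exponent of the nonlinear diffusivity. The abstract states that the exponent is related to the agent aspect ratio but does not give the relation, so none is assumed.

```latex
% Linear diffusion (round agents occupying a single lattice site):
\frac{\partial C}{\partial t} = D_0 \, \frac{\partial^2 C}{\partial x^2}

% Porous media equation (degenerate nonlinear diffusion), with D(C) = D_0 C^{m}, m > 0:
\frac{\partial C}{\partial t}
  = \frac{\partial}{\partial x}\!\left( D_0 \, C^{m} \, \frac{\partial C}{\partial x} \right)
```

The second equation is called degenerate because D(0) = 0, which gives density profiles a sharp front. The linear equation has no such front.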

Relevance:

40.00%

Publisher:

Abstract:

Background: Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer, shows that the relative contribution of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction.
Results: Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to biogeographic patterns found in many plant and animal species: separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia.
Conclusion: We describe an Australian origin for B. pseudomallei, characterized by a single introduction event into Southeast Asia during a recent glacial period, and variable levels of lateral gene transfer within populations. These patterns provide insights into mechanisms of genetic diversification in B. pseudomallei and its closest relatives, and provide a framework for integrating the traditionally separate fields of population genetics and phylogenetics for other bacterial species with high levels of lateral gene transfer.

Relevance:

40.00%

Publisher:

Abstract:

Although germline mutations in CDKN2A are present in approximately 25% of large multicase melanoma families, germline mutations are much rarer in the smaller melanoma families that make up most individuals reporting a family history of this disease. In addition, only three families worldwide have been reported with germline mutations in a gene other than CDKN2A (i.e., CDK4). Accordingly, current genomewide scans underway at the National Human Genome Research Institute hope to reveal linkage to one or more chromosomal regions, and ultimately lead to the identification of novel genes involved in melanoma predisposition. Both CDKN2A and PTEN have been identified as genes involved in sporadic melanoma development; however, mutations are more common in cell lines than uncultured tumors. A combination of cytogenetic, molecular, and functional studies suggests that additional genes involved in melanoma development are located to chromosomal regions 1p, 6q, 7p, 11q, and possibly also 9p and 10q. With the near completion of the human genome sequencing effort, combined with the advent of high throughput mutation analyses and new techniques including cDNA and tissue microarrays, the identification and characterization of additional genes involved in melanoma pathogenesis seem likely in the near future.

Relevance:

40.00%

Publisher:

Abstract:

As family history has been established as a risk factor for prostate cancer, attempts have been made to isolate predisposing genetic variants that are related to hereditary prostate cancer. With many genetic variants still to be identified and investigated, it is not yet possible to fully understand the impact of genetic variants on prostate cancer development. The high survival rates among men with prostate cancer have meant that other issues, such as quality of life (QoL), have also become important. Through their effect on a person’s health, a range of inherited genetic variants may potentially influence QoL in men with prostate cancer, even prior to treatment. Until now, limited research has been conducted on the relationship between genetics and QoL. Thus, this study contributes to an emerging field by aiming to identify certain genetic variants related to the QoL found in men with prostate cancer. It is hoped that this study may lead to future research that will identify men who have an increased risk of a poor QoL following prostate cancer treatment, which will aid in developing treatments that are individually tailored to support them. Previous studies have established that genetic variants of Vascular Endothelial Growth Factor (VEGF) and Insulin-like Growth Factor 1 (IGF-1) may play a role in prostate cancer development. VEGF and IGF-1 have also been reported to be associated with QoL in people with ovarian cancer and colorectal cancer, respectively. This study completed a series of secondary analyses using two major data-sets (from 850 men newly diagnosed with prostate cancer, and approximately 550 men from the general Queensland population), in which genetic variants of VEGF and IGF-1 were investigated for associations with prostate cancer susceptibility and QoL. The first aim of this research was to investigate genetic variants in the VEGF and IGF-1 genes for an association with the risk of prostate cancer.
It was found that one IGF-1 genetic variant (rs35765) had a statistically significant association with prostate cancer (p = 0.04), and one VEGF genetic variant (rs2146323) had a statistically significant association with advanced prostate cancer (p = 0.02). The estimates suggest that carriers of the CA and AA genotypes for rs35765 may have a reduced risk of developing prostate cancer (CA: Odds Ratio (OR) = 0.72, 95% Confidence Interval (CI) 0.55 to 0.95; AA: OR = 0.60, 95% CI 0.26 to 1.39). Meanwhile, carriers of the CA and AA genotypes for rs2146323 may be at increased risk of advanced prostate cancer, which was determined by a Gleason score of above 7 (CA: OR = 1.72, 95% CI 1.12 to 2.63; AA: OR = 1.90, 95% CI 1.08 to 3.34). Utilising the widely used short-form health survey, the SF-36v2, the second aim of this study was to investigate the relationship between prostate cancer and QoL prior to treatment. Assessing QoL at this time-point was important as little research has been conducted to evaluate if prostate cancer affects QoL regardless of treatment. The analyses found that mean SF-36v2 scale scores related to physical health were higher by at least 0.3 Standard Deviations (SD) among men with prostate cancer than the general population comparison group. This difference was considered clinically significant (defined by group differences in mean SF-36v2 scores by at least 0.3 SD). These differences were also statistically significant (p<0.05). Mean QoL scale scores related to mental health were similar between men with prostate cancer and those from the general population comparison group. The third aim of this study was to investigate genetic variants in the VEGF and IGF-1 genes for an association with QoL in prostate cancer patients prior to their treatment. It was essential to evaluate these relationships prior to treatment, before the involvement of these genes was potentially interrupted by treatment.
The analyses found that some genetic variants had a small clinically significant association (0.3 SD) to some QoL domains experienced by these men. However, most relationships were not statistically significant (p>0.05). Most of the associations found identified that a small sub-group of men with prostate cancer (approximately 2%) reported, on average, a slightly better QoL than the majority of the prostate cancer patients. The fourth aim of this research was to investigate whether associations between genetic variants in VEGF and IGF-1 and QoL were specific to men with prostate cancer, or were also applicable to the general male population. It was found that twenty out of one-hundred relationships between the genetic variants of VEGF and IGF-1 and QoL health-measures and scales examined differed between these groups. In the majority of the relationships involving VEGF SNPs that differed, a clinically significant difference (0.3 or more SD) between mean scores among the genotype groups in prostate cancer patients was found, while mean scores among men from the general-population comparison group were similar. For example, prostate cancer participants who carried at least one T allele (CT or TT genotype) for rs3024994 had a clinically significant higher (0.3 SD) mean QoL score in terms of the role-physical scale, than participants who carried the CC genotype. This was not seen among men from the general population sample, as the mean score was similar between genotype groups. The opposite was seen in regards to the IGF-1 SNPs examined. Overall, these relationships were not considered to directly impact on the clinical options for men with prostate cancer. As this study utilised secondary data from two separate studies, there are a number of important limitations that should be acknowledged including issues of multiple comparisons, power, and missing or unavailable data. 
It is recommended that this study be replicated as a better-designed study that takes greater consideration of the many factors involved in prostate cancer and QoL. Investigation into other genetic variants of VEGF or IGF-1 is also warranted, as is consideration of other genes and their relationship with QoL. Through identifying certain genetic variants that have a modest association to prostate cancer, this project adds to the knowledge surrounding VEGF and IGF-1 and their role in prostate cancer susceptibility. Importantly, this project has also introduced the potential role genetics plays in QoL, through investigating the relationships between genetic variants of VEGF and IGF-1 and QoL.
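
The odds ratios and 95% confidence intervals quoted in this abstract follow a standard pattern. As a minimal sketch, assuming a plain 2x2 table and a Wald interval on the log scale (the thesis may well have used logistic regression with adjustment, which is not reproduced here), the arithmetic looks like this. The function name and the cell counts are hypothetical and are not data from the study.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a, b: carriers among cases and controls; c, d: non-carriers among
    cases and controls. All counts here are hypothetical.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Made-up counts, chosen only to show the calculation.
estimate, lower, upper = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1, as for the CA genotype of rs35765 above, corresponds to statistical significance at roughly the 5% level. An interval that spans 1, as for the AA genotype, does not.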

Relevance:

40.00%

Publisher:

Abstract:

The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed, acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals, for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead CAR models are used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term by term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first-stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.

Relevance:

40.00%

Publisher:

Abstract:

In recent years, a number of phylogenetic methods have been developed for estimating molecular rates and divergence dates under models that relax the molecular clock constraint by allowing rate change throughout the tree. These methods are being used with increasing frequency, but there have been few studies into their accuracy. We tested the accuracy of several relaxed-clock methods (penalized likelihood and Bayesian inference using various models of rate change) using nucleotide sequences simulated on a nine-taxon tree. When the sequences evolved with a constant rate, the methods were able to infer rates accurately, but estimates were more precise when a molecular clock was assumed. When the sequences evolved under a model of autocorrelated rate change, rates were accurately estimated using penalized likelihood and by Bayesian inference using lognormal and exponential models of rate change, while other models did not perform as well. When the sequences evolved under a model of uncorrelated rate change, only Bayesian inference using an exponential rate model performed well. Collectively, the results provide a strong recommendation for using the exponential model of rate change if a conservative approach to divergence time estimation is required. A case study is presented in which we use a simulation-based approach to examine the hypothesis of elevated rates in the Cambrian period, and it is found that these high rate estimates might be an artifact of the rate estimation method. If this bias is present, then the ages of metazoan divergences would be systematically underestimated. The results of this study have implications for studies of molecular rates and divergence dates.

Relevance:

40.00%

Publisher:

Abstract:

Ratites are large, flightless birds and include the ostrich, rheas, kiwi, emu, and cassowaries, along with extinct members, such as moa and elephant birds. Previous phylogenetic analyses of complete mitochondrial genome sequences have reinforced the traditional belief that ratites are monophyletic and tinamous are their sister group. However, in these studies ratite monophyly was enforced in the analyses that modeled rate heterogeneity among variable sites. Relaxing this topological constraint results in strong support for the tinamous (which fly) nesting within ratites. Furthermore, upon reducing base compositional bias and partitioning models of sequence evolution among protein codon positions and RNA structures, the tinamou–moa clade grouped with kiwi, emu, and cassowaries to the exclusion of the successively more divergent rheas and ostrich. These relationships are consistent with recent results from a large nuclear data set, whereas our strongly supported finding of a tinamou–moa grouping further resolves palaeognath phylogeny. We infer flight to have been lost among ratites multiple times in temporally close association with the Cretaceous–Tertiary extinction event. This circumvents requirements for transient microcontinents and island chains to explain discordance between ratite phylogeny and patterns of continental breakup. Ostriches may have dispersed to Africa from Eurasia, putting in question the status of ratites as an iconic Gondwanan relict taxon. [Base composition; flightless; Gondwana; mitochondrial genome; Palaeognathae; phylogeny; ratites.]

Relevance:

40.00%

Publisher:

Abstract:

Quality oriented management systems and methods have become the dominant business and governance paradigm. From this perspective, satisfying customers’ expectations by supplying reliable, good quality products and services is the key factor for an organization and even government. During recent decades, Statistical Quality Control (SQC) methods have been developed as the technical core of quality management and continuous improvement philosophy and now are being applied widely to improve the quality of products and services in industrial and business sectors. Recently SQC tools, in particular quality control charts, have been used in healthcare surveillance. In some cases, these tools have been modified and developed to better suit the health sector characteristics and needs. It seems that some of the work in the healthcare area has evolved independently of the development of industrial statistical process control methods. Therefore analysing and comparing paradigms and the characteristics of quality control charts and techniques across the different sectors presents some opportunities for transferring knowledge and future development in each sector. Meanwhile, considering the capabilities of the Bayesian approach, particularly Bayesian hierarchical models and computational techniques in which all uncertainty is expressed as a structure of probability, facilitates decision making and cost-effectiveness analyses. Therefore, this research investigates the use of the quality improvement cycle in a health setting using clinical data from a hospital. The need for clinical data for monitoring purposes is investigated in two aspects. A framework and appropriate tools from the industrial context are proposed and applied to evaluate and improve data quality in available datasets and data flow; then a data capturing algorithm using Bayesian decision making methods is developed to determine an economical sample size for statistical analyses within the quality improvement cycle.
After ensuring clinical data quality, some characteristics of control charts in the health context, including the necessity of monitoring attribute data and correlated quality characteristics, are considered. To this end, multivariate control charts from an industrial context are adapted to monitor radiation delivered to patients undergoing diagnostic coronary angiogram, and various risk-adjusted control charts are constructed and investigated in monitoring binary outcomes of clinical interventions as well as post-intervention survival time. Meanwhile, adoption of a Bayesian approach is proposed as a new framework for estimating the change point following a control chart’s signal. This estimate aims to facilitate root-cause efforts in the quality improvement cycle since it cuts the search for the potential causes of detected changes to a tighter time-frame prior to the signal. This approach enables us to obtain highly informative estimates for change point parameters since probability distribution based results are obtained. Using Bayesian hierarchical models and Markov chain Monte Carlo computational methods, Bayesian estimators of the time and the magnitude of various change scenarios, including step change, linear trend and multiple changes in a Poisson process, are developed and investigated. The benefits of change point investigation are revisited and promoted in monitoring hospital outcomes, where the developed Bayesian estimator reports the true time of the shifts, compared to a priori known causes, detected by control charts in monitoring the rate of excess usage of blood products and major adverse events during and after cardiac surgery in a local hospital. The development of the Bayesian change point estimators is then pursued in healthcare surveillance for processes in which pre-intervention characteristics of patients affect the outcomes.
In this setting, the Bayesian estimator is first extended to capture the patient mix (covariates) through the risk models underlying risk-adjusted control charts. Variations of the estimator are developed to estimate the true time of step changes and linear trends in the odds ratio of intensive care unit outcomes in a local hospital. Secondly, the Bayesian estimator is extended to identify the time of a shift in mean survival time after a clinical intervention which is being monitored by risk-adjusted survival time control charts. In this context, the survival time after a clinical intervention is also affected by patient mix, and the survival function is constructed using a survival prediction model. The simulation studies undertaken in each research component, and the results obtained, strongly recommend the developed Bayesian estimators as an alternative in change point estimation within the quality improvement cycle in healthcare surveillance as well as in industrial and business contexts. The superiority of the proposed Bayesian framework and estimators is enhanced when probability quantification, flexibility and generalizability of the developed model are also considered. The empirical results and simulations indicate that the Bayesian estimators are a strong alternative in change point estimation within the quality improvement cycle in healthcare surveillance. The advantages of the Bayesian approach seen in the general context of quality control may also be extended to the industrial and business domains where quality monitoring was initially developed.
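
The step-change scenario in a Poisson process described above can be sketched in a few lines. This is a simplified stand-in, not the thesis's estimator: the thesis uses Bayesian hierarchical models fitted by Markov chain Monte Carlo, whereas this sketch assumes a single step change, a uniform prior on the change time, and conjugate Gamma(a, b) priors on the two rates so that the posterior over the change time has a closed form. The function names, prior values, and simulated counts are all assumptions for illustration.

```python
import numpy as np
from math import lgamma, log

def log_marginal(total, n, a=1.0, b=1.0):
    """Log marginal likelihood of n Poisson counts summing to `total`,
    with the rate integrated out under a Gamma(a, b) prior (b is a rate).
    The product of factorials is omitted: it is the same for every split.
    """
    return a * log(b) - lgamma(a) + lgamma(a + total) - (a + total) * log(b + n)

def changepoint_posterior(counts, a=1.0, b=1.0):
    """Posterior over k, the number of observations before the step change,
    for k = 1..n-1, under a uniform prior on k."""
    counts = np.asarray(counts)
    n = len(counts)
    csum = np.cumsum(counts)
    log_post = np.empty(n - 1)
    for k in range(1, n):
        before = csum[k - 1]
        after = csum[-1] - before
        log_post[k - 1] = (log_marginal(before, k, a, b)
                           + log_marginal(after, n - k, a, b))
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Simulated data (hypothetical): the rate steps from 2 to 8 after 40 observations.
rng = np.random.default_rng(1)
counts = np.concatenate([rng.poisson(2.0, 40), rng.poisson(8.0, 40)])
post = changepoint_posterior(counts)
tau_hat = int(np.argmax(post)) + 1  # most probable number of pre-change observations
print("posterior mode of change time:", tau_hat)
```

Because the result is a full probability distribution over the change time, it can be summarised by a mode, a mean, or a credible set. This matches the abstract's remark that the Bayesian approach yields "probability distribution based results".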

Relevance:

40.00%

Publisher:

Abstract:

Animal models typically require a known genetic pedigree to estimate quantitative genetic parameters. Here we test whether animal models can alternatively be based on estimates of relatedness derived entirely from molecular marker data. Our case study is the morphology of a wild bird population, for which we report estimates of the genetic variance-covariance matrices (G) of six morphological traits using three methods: the traditional animal model; a molecular marker-based approach to estimate heritability based on Ritland's pairwise regression method; and a new approach using a molecular genealogy arranged in a relatedness matrix (R) to replace the pedigree in an animal model. Using the traditional animal model, we found significant genetic variance for all six traits and positive genetic covariance among traits. The pairwise regression method did not return reliable estimates of quantitative genetic parameters in this population, with estimates of genetic variance and covariance typically being very small or negative. In contrast, we found mixed evidence for the use of the pedigree-free animal model. Similar to the pairwise regression method, the pedigree-free approach performed poorly when the full-rank R matrix based on the molecular genealogy was employed. However, performance improved substantially when we reduced the dimensionality of the R matrix in order to maximize the signal to noise ratio. Using reduced-rank R matrices generated estimates of genetic variance that were much closer to those from the traditional model. Nevertheless, this method was less reliable at estimating covariances, which were often estimated to be negative. Taken together, these results suggest that pedigree-free animal models can recover quantitative genetic information, although the signal remains relatively weak. 
It remains to be determined whether this problem can be overcome by the use of a more powerful battery of molecular markers and improved methods for reconstructing genealogies.
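
The abstract does not say how the dimensionality of the R matrix was reduced. One common way to build a reduced-rank version of a symmetric relatedness matrix, assumed here purely for illustration, is to keep only its leading eigenvalues and eigenvectors. The function name, matrix size, rank, and noise level below are hypothetical.

```python
import numpy as np

def reduced_rank(R, k):
    """Rank-k approximation of a symmetric matrix from its top k eigenpairs."""
    R = (R + R.T) / 2                      # enforce exact symmetry
    vals, vecs = np.linalg.eigh(R)         # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]       # indices of the k largest eigenvalues
    return (vecs[:, top] * vals[top]) @ vecs[:, top].T

# Toy example: a rank-3 "true" relatedness signal plus symmetric noise.
rng = np.random.default_rng(0)
Z = rng.normal(size=(30, 3))
signal = Z @ Z.T / 3
noise = rng.normal(scale=0.05, size=(30, 30))
R = signal + (noise + noise.T) / 2

R_k = reduced_rank(R, 3)
print("error of full R:   ", np.linalg.norm(R - signal))
print("error of reduced R:", np.linalg.norm(R_k - signal))
```

In this toy setting, discarding the small eigenvalues removes much of the noise while keeping the dominant structure. That is the intuition behind improving the signal-to-noise ratio, but whether it matches the authors' procedure is an assumption.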

Relevance:

40.00%

Publisher:

Abstract:

Historical information can be used, in addition to pedigree, traits and genotypes, to map quantitative trait loci (QTL) in general populations via maximum likelihood estimation of variance components. This analysis is known as linkage disequilibrium (LD) and linkage mapping, because it exploits both linkage in families and LD at the population level. The search for QTL in the wild population of Soay sheep on St. Kilda is a proof of principle. We analysed the data from a previous study and confirmed some of the QTLs reported. The most striking result was the confirmation of a QTL affecting birth weight that had been reported using association tests but not when using linkage-based analyses. Copyright © Cambridge University Press 2010.

Relevance:

40.00%

Publisher:

Abstract:

Motivation: Unravelling the genetic architecture of complex traits requires large amounts of data, sophisticated models and large computational resources. The lack of user-friendly software incorporating all these requisites is delaying progress in the analysis of complex traits. Methods: Linkage disequilibrium and linkage analysis (LDLA) is a high-resolution gene mapping approach based on sophisticated mixed linear models, applicable to any population structure. LDLA can use population history information in addition to pedigree and molecular markers to decompose traits into genetic components. Analyses are distributed in parallel over a large public grid of computers in the UK. Results: We have proven the performance of LDLA with analyses of simulated data. There are real gains in statistical power to detect quantitative trait loci when using historical information compared with traditional linkage analysis. Moreover, the use of a grid of computers significantly increases computational speed, hence allowing analyses that would have been prohibitive on a single computer. © The Author 2009. Published by Oxford University Press. All rights reserved.