180 results for Genetic Variance-covariance Matrix

in Queensland University of Technology - ePrints Archive


Abstract:

Animal models typically require a known genetic pedigree to estimate quantitative genetic parameters. Here we test whether animal models can alternatively be based on estimates of relatedness derived entirely from molecular marker data. Our case study is the morphology of a wild bird population, for which we report estimates of the genetic variance-covariance matrices (G) of six morphological traits using three methods: the traditional animal model; a molecular marker-based approach to estimate heritability based on Ritland's pairwise regression method; and a new approach using a molecular genealogy arranged in a relatedness matrix (R) to replace the pedigree in an animal model. Using the traditional animal model, we found significant genetic variance for all six traits and positive genetic covariance among traits. The pairwise regression method did not return reliable estimates of quantitative genetic parameters in this population, with estimates of genetic variance and covariance typically being very small or negative. In contrast, we found mixed evidence for the use of the pedigree-free animal model. Similar to the pairwise regression method, the pedigree-free approach performed poorly when the full-rank R matrix based on the molecular genealogy was employed. However, performance improved substantially when we reduced the dimensionality of the R matrix in order to maximize the signal-to-noise ratio. The reduced-rank R matrices generated estimates of genetic variance that were much closer to those from the traditional model. Nevertheless, this method was less reliable at estimating covariances, which were often estimated to be negative. Taken together, these results suggest that pedigree-free animal models can recover quantitative genetic information, although the signal remains relatively weak.
It remains to be determined whether this problem can be overcome by the use of a more powerful battery of molecular markers and improved methods for reconstructing genealogies.
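A minimal numerical sketch of the reduced-rank idea, assuming the dimensionality of R is cut by keeping only its leading eigencomponents (the study's exact reduction procedure may differ); the matrix and target rank below are illustrative, not the bird data:

```python
import numpy as np

def reduce_rank(R, k):
    """Rank-k approximation of a symmetric relatedness matrix R,
    keeping only the k eigencomponents with the largest eigenvalues."""
    w, V = np.linalg.eigh(R)           # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]      # indices of the k largest
    return V[:, idx] @ np.diag(w[idx]) @ V[:, idx].T

# Toy 4x4 relatedness matrix: two pairs of close relatives plus weak links
R = np.array([[1.0, 0.5, 0.1, 0.0],
              [0.5, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.5],
              [0.0, 0.1, 0.5, 1.0]])
R2 = reduce_rank(R, 2)
print(np.linalg.matrix_rank(R2))   # 2
```

The reduced matrix keeps the strongest relatedness structure (here, the two sib-pairs) while discarding the noisiest directions, which is the trade-off the abstract describes.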

Abstract:

Objective To explore the characteristics of the regional distribution of cancer deaths in Shandong Province using principal components analysis. Methods Principal components analysis of the covariance matrix of age-adjusted mortality rates, and of the percentages, of 20 types of cancer in 22 counties (cities) during 2004-2006 was carried out using SAS software. Results Over 90% of the total information could be reflected by the top 3 principal components, and the first principal component alone represented more than half of the overall regional variance. The first component mainly reflected the area differences in esophageal cancer. The second component mainly reflected the area differences in lung cancer, stomach cancer and liver cancer. The first principal component scores showed a clear trend: western areas possessed higher values and eastern areas lower values. Based on the top two components, the 22 counties (cities) could be divided into several geographical clusters. Conclusion The overall difference in the regional distribution of cancers in Shandong is dominated by several major cancers, including esophageal cancer, lung cancer, stomach cancer and liver cancer; among them, esophageal cancer makes the largest contribution. If the range of counties (cities) analyzed could be further widened, the characteristics of the regional distribution of cancer mortality would be better examined.
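The covariance-matrix PCA used in the Methods can be sketched as follows. The data here are random stand-ins with one injected regional gradient, not the Shandong mortality rates, and the 22 × 20 shape simply mirrors the study's layout (rows = regions, columns = cancer types):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(22, 20))          # 22 regions x 20 cancer types (toy data)
X[:, 0] += np.linspace(0, 10, 22)      # inject a strong west-to-east gradient

C = np.cov(X, rowvar=False)            # 20x20 covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]      # components sorted by explained variance
explained = eigvals[order] / eigvals.sum()
scores = (X - X.mean(axis=0)) @ eigvecs[:, order]  # component scores per region

print(f"PC1 explains {explained[0]:.0%} of total variance")
```

Because the injected gradient dominates one column, the first component captures it, analogous to the esophageal-cancer gradient dominating PC1 in the study.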

Abstract:

Island races of passerine birds display repeated evolution towards larger body size compared with their continental ancestors. The Capricorn silvereye (Zosterops lateralis chlorocephalus) has become up to six phenotypic standard deviations bigger in several morphological measures since colonization of an island approximately 4000 years ago. We estimated the genetic variance-covariance (G) matrix using full-sib and 'animal model' analyses, and selection gradients, for six morphological traits under field conditions in three consecutive cohorts of nestlings. Significant levels of genetic variance were found for all traits. Significant directional selection was detected for wing and tail lengths in one year and quadratic selection on culmen depth in another year. Although selection gradients on many traits were negative, the predicted evolutionary response to selection of these traits for all cohorts was uniformly positive. These results indicate that the G matrix and predicted evolutionary responses are consistent with those of a population evolving in the manner observed in the island passerine trend, that is, towards larger body size.
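The predicted evolutionary response mentioned above comes from the multivariate breeder's equation, Δz̄ = Gβ. The toy numbers below (not the silvereye estimates) show how positive genetic covariances in G can make the predicted response uniformly positive even when some selection gradients are negative:

```python
import numpy as np

# Multivariate breeder's equation: delta_z = G @ beta.
# Illustrative values only, chosen to mimic the qualitative pattern
# reported in the abstract.
G = np.array([[1.0, 0.8, 0.7],
              [0.8, 1.0, 0.8],
              [0.7, 0.8, 1.0]])       # genetic variance-covariance matrix
beta = np.array([0.5, -0.1, -0.1])    # selection gradients, two of them negative

delta_z = G @ beta                    # predicted response per trait
print(delta_z)                        # [0.35 0.22 0.17] -- all positive
```

Strong positive genetic covariances mean that selection for a larger value of one trait drags the correlated traits upward with it, which is how an all-positive response (towards larger body size) can coexist with negative gradients on individual traits.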

Abstract:

In the context of ambiguity resolution (AR) for Global Navigation Satellite Systems (GNSS), decorrelation among entries of an ambiguity vector, integer ambiguity search and ambiguity validation are three standard procedures for solving integer least-squares problems. This paper contributes to AR issues from three aspects. Firstly, the orthogonality defect is introduced as a new measure of the performance of ambiguity decorrelation methods, and compared with the decorrelation number and the condition number, which are currently used as criteria for measuring the correlation of the ambiguity variance-covariance matrix. Numerically, the orthogonality defect demonstrates slightly better performance than the condition number as a measure of the correlation between decorrelation impact and computational efficiency. Secondly, the paper examines the relationship of the decorrelation number, the condition number, the orthogonality defect and the size of the ambiguity search space with the ambiguity search candidates and search nodes. The size of the ambiguity search space can be properly estimated if the ambiguity matrix is well decorrelated, and is shown to be a significant parameter in the ambiguity search process. Thirdly, a new ambiguity resolution scheme is proposed to improve ambiguity search efficiency through control of the size of the ambiguity search space. The new AR scheme combines the LAMBDA search and validation procedures, which results in a much smaller search space and higher computational efficiency while retaining the same AR validation outcomes. In fact, the new scheme can deal with the case in which there is only one candidate, while the existing search methods require at least two candidates. If there is more than one candidate, the new scheme reverts to the usual ratio-test procedure.
Experimental results indicate that this combined method can indeed improve ambiguity search efficiency for both single-constellation and dual-constellation cases, showing its potential for processing high-dimension integer parameters in multi-GNSS environments.
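The orthogonality defect introduced above can be sketched with its standard lattice-basis definition: the product of the basis-vector norms divided by the absolute determinant, which equals 1 for an orthogonal basis and grows as the basis becomes more correlated. The matrices below are illustrative, not GNSS ambiguity data:

```python
import numpy as np

def orthogonality_defect(B):
    """Orthogonality defect of a basis matrix B (columns = basis vectors):
    prod of column norms / |det B|.  Equals 1 iff the basis is orthogonal."""
    norms = np.linalg.norm(B, axis=0)
    return np.prod(norms) / abs(np.linalg.det(B))

B_orth = np.eye(3)                       # perfectly decorrelated basis
B_skew = np.array([[1.0, 0.9, 0.0],      # skewed, correlated basis
                   [0.0, 1.0, 0.9],
                   [0.0, 0.0, 1.0]])

print(orthogonality_defect(B_orth))      # 1.0
print(orthogonality_defect(B_skew))      # > 1, reflecting the correlation
```

In the AR setting the columns would come from a factor of the ambiguity variance-covariance matrix, so a decorrelation step that shrinks the defect towards 1 indicates a better-conditioned search space.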

Abstract:

Computer experiments, consisting of a number of runs of a computer model with different inputs, are now commonplace in scientific research. Using a simple fire model for illustration, some guidelines are given for the size of a computer experiment. A graph is provided relating the error of prediction to the sample size, which should be of use when designing computer experiments. Methods for augmenting computer experiments with extra runs are also described and illustrated. The simplest method involves adding one point at a time, choosing the point with the maximum prediction variance. Another method that appears to work well is to choose points from a candidate set with maximum determinant of the variance-covariance matrix of predictions.
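The simplest augmentation method described above can be sketched as follows, assuming a Gaussian-process (kriging) predictor, the usual surrogate for computer experiments; the inputs, squared-exponential kernel, length-scale and candidate grid are all made up for the example:

```python
import numpy as np

def rbf(a, b, ls=0.3):
    """Squared-exponential correlation between 1-D input sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

X_run = np.array([0.0, 0.5, 1.0])            # inputs already run
candidates = np.linspace(0.0, 1.0, 101)       # candidate set for the next run

K = rbf(X_run, X_run) + 1e-8 * np.eye(3)      # jitter for numerical stability
k_star = rbf(candidates, X_run)
# GP posterior variance at each candidate (prior variance 1):
var = 1.0 - np.einsum('ij,jk,ik->i', k_star, np.linalg.inv(K), k_star)
next_run = candidates[np.argmax(var)]         # point with max prediction variance
print(next_run)
```

The selected point lands near the middle of a gap between existing runs, where the emulator is least certain; repeating the loop adds one run at a time, as the abstract describes.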

Abstract:

With a growing population and fast urbanization in Australia, maintaining our water quality is a challenging task. It is essential to develop an appropriate statistical methodology for analyzing water quality data in order to draw valid conclusions and hence provide useful advice for water management. This paper develops robust rank-based procedures for analyzing non-normally distributed data collected over time at different sites. To take account of temporal correlations of the observations within sites, we consider the optimally combined estimating functions proposed by Wang and Zhu (Biometrika, 93:459-464, 2006), which lead to more efficient parameter estimation. Furthermore, we apply the induced smoothing method to reduce the computational burden. Smoothing leads to easy calculation of the parameter estimates and their variance-covariance matrix. Analysis of water quality data for Total Iron and Total Cyanophytes shows the differences between traditional generalized linear mixed models and rank regression models. Our analysis also demonstrates the advantages of rank regression models for analyzing non-normal data.
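As a small illustration of why rank-based procedures suit non-normal data, the sketch below fits a single slope by minimizing the Jaeckel rank dispersion with Wilcoxon scores on heavy-tailed toy data; it is a simplified stand-in, not the combined-estimating-function method of the paper, and the data and grid are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
y = 2.0 * x + rng.standard_t(df=2, size=50) * 0.1   # heavy-tailed (t_2) noise

def dispersion(b):
    """Jaeckel's rank dispersion of residuals with Wilcoxon scores."""
    e = y - b * x
    r = np.argsort(np.argsort(e)) + 1                # ranks of residuals
    return np.sum((r - (len(e) + 1) / 2) * e)

grid = np.linspace(0, 4, 4001)                       # crude grid search
b_hat = grid[np.argmin([dispersion(b) for b in grid])]
print(b_hat)                                         # close to the true slope 2
```

The dispersion depends on residuals only through their ranks, so the occasional huge t-distributed error barely moves the fit, which is the robustness property the abstract relies on.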

Abstract:

The population Monte Carlo algorithm is an iterative importance sampling scheme for solving static problems. We examine the population Monte Carlo algorithm in a simplified setting, a single step of the general algorithm, and study a fundamental problem that occurs in applying importance sampling to high-dimensional problems. The precision of the computed estimate from the simplified setting is measured by the asymptotic variance of the estimate under conditions on the importance function. We demonstrate the exponential growth of the asymptotic variance with the dimension and show that the optimal covariance matrix for the importance function can be estimated in special cases.
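The dimension problem described above can be illustrated numerically. The sketch below uses a deliberately mismatched Gaussian proposal (the choice sigma = 1.5 is arbitrary) and tracks the effective sample size of the normalized importance weights, a standard proxy for the variance of the estimate:

```python
import numpy as np

# Target N(0, I_d), proposal N(0, sigma^2 I_d): the ESS fraction of the
# importance weights decays geometrically in d, i.e. the variance of the
# importance-sampling estimate grows exponentially with dimension.
rng = np.random.default_rng(0)

def ess_fraction(d, sigma=1.5, n=20000):
    x = rng.normal(scale=sigma, size=(n, d))
    # log w = log p(x) - log q(x) for p = N(0, I), q = N(0, sigma^2 I)
    logw = (-0.5 * (x**2).sum(axis=1)
            + 0.5 * ((x / sigma)**2).sum(axis=1)
            + d * np.log(sigma))
    w = np.exp(logw - logw.max())            # stabilised weights
    return (w.sum()**2 / (w**2).sum()) / n   # ESS / n, in (0, 1]

fracs = {d: ess_fraction(d) for d in (1, 5, 20)}
print(fracs)   # shrinks rapidly as d grows
```

Even this mild proposal mismatch collapses the effective sample size by d = 20, which is the instability the abstract analyses through the asymptotic variance.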

Abstract:

This thesis addresses computational challenges arising from Bayesian analysis of complex real-world problems. Many of the models and algorithms designed for such analysis are ‘hybrid’ in nature, in that they are a composition of components for which their individual properties may be easily described but the performance of the model or algorithm as a whole is less well understood. The aim of this research project is to offer a better understanding of the performance of hybrid models and algorithms. The goal of this thesis is to analyse the computational aspects of hybrid models and hybrid algorithms in the Bayesian context. The first objective of the research focuses on computational aspects of hybrid models, notably a continuous finite mixture of t-distributions. In the mixture model, an inference of interest is the number of components, as this may relate to both the quality of model fit to data and the computational workload. The analysis of t-mixtures using Markov chain Monte Carlo (MCMC) is described and the model is compared to the Normal case based on the goodness of fit. Through simulation studies, it is demonstrated that the t-mixture model can be more flexible and more parsimonious in terms of number of components, particularly for skewed and heavy-tailed data. The study also reveals important computational issues associated with the use of t-mixtures, which have not been adequately considered in the literature. The second objective of the research focuses on computational aspects of hybrid algorithms for Bayesian analysis. Two approaches will be considered: a formal comparison of the performance of a range of hybrid algorithms and a theoretical investigation of the performance of one of these algorithms in high dimensions.
For the first approach, the delayed rejection algorithm, the pinball sampler, the Metropolis adjusted Langevin algorithm, and the hybrid version of the population Monte Carlo (PMC) algorithm are selected as a set of examples of hybrid algorithms. The statistical literature shows that statistical efficiency is often the only criterion for an efficient algorithm. In this thesis the algorithms are also considered and compared from a more practical perspective. This extends to the study of how individual algorithms contribute to the overall efficiency of hybrid algorithms, and highlights weaknesses that may be introduced by the process of combining these components in a single algorithm. The second approach to considering computational aspects of hybrid algorithms involves an investigation of the performance of the PMC in high dimensions. It is well known that as a model becomes more complex, computation may become increasingly difficult in real time. In particular, importance sampling based algorithms, including the PMC, are known to be unstable in high dimensions. This thesis examines the PMC algorithm in a simplified setting, a single step of the general algorithm, and explores a fundamental problem that occurs in applying importance sampling to a high-dimensional problem. The precision of the computed estimate from the simplified setting is measured by the asymptotic variance of the estimate under conditions on the importance function. Additionally, the exponential growth of the asymptotic variance with the dimension is demonstrated, and we illustrate that the optimal covariance matrix for the importance function can be estimated in a special case.

Abstract:

Reliable ambiguity resolution (AR) is essential to Real-Time Kinematic (RTK) positioning and its applications, since incorrect ambiguity fixing can lead to largely biased positioning solutions. A partial ambiguity fixing technique is developed to improve the reliability of AR, involving partial ambiguity decorrelation (PAD) and partial ambiguity resolution (PAR). Decorrelation transformation can substantially amplify the biases in the phase measurements, so the purpose of PAD is to find the optimum trade-off between decorrelation and worst-case bias amplification. The concept of PAR refers to the case where only a subset of the ambiguities can be fixed correctly to their integers in the integer least-squares (ILS) estimation system at high success rates. As a result, RTK solutions can be derived from these integer-fixed phase measurements. This is meaningful provided that the number of reliably resolved phase measurements is sufficiently large for least-squares estimation of RTK solutions as well. Considering the GPS constellation alone, partially fixed measurements are often insufficient for positioning. The AR reliability is usually characterised by the AR success rate. In this contribution, an AR validation decision matrix is first introduced to understand the impact of the success rate. Moreover, the AR risk probability is included for a more complete evaluation of the AR reliability. We use 16 ambiguity variance-covariance matrices with different levels of success rate to analyse the relation between success rate and AR risk probability. Next, the paper examines how, during the PAD process, a bias in one measurement is propagated and amplified onto many others, leading to more than one wrong integer and affecting the success probability. Furthermore, the paper proposes a partial ambiguity fixing procedure with a predefined success rate criterion and a ratio-test in the ambiguity validation process.
In this paper, Galileo constellation data are tested with simulated observations. Numerical results from our experiment clearly demonstrate that only when the computed success rate is very high can the AR validation provide decisions about the correctness of AR that are close to reality, with both low AR risk and low false alarm probabilities. The results also indicate that the PAR procedure can automatically choose an adequate number of ambiguities to fix, at a given high success rate, from the multiple constellations, instead of fixing all the ambiguities. This is a benefit that multiple GNSS constellations can offer.
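As one concrete notion of the success rate discussed above, the sketch below computes the integer-bootstrapping lower bound, a standard formula in the AR literature, from an ambiguity variance-covariance matrix; the matrices are illustrative, not the 16 used in the study:

```python
import numpy as np
from math import erf, sqrt

def bootstrapped_success_rate(Q):
    """Integer-bootstrapping success-rate lower bound:
    P = prod_i (2*Phi(1/(2*sigma_i)) - 1), where the sigma_i are the
    sequential conditional standard deviations from a triangular
    (Cholesky) decomposition of the ambiguity vc-matrix Q."""
    L = np.linalg.cholesky(Q)
    sigmas = np.diag(L)                      # conditional std deviations
    phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal CDF
    return float(np.prod([2.0 * phi(1.0 / (2.0 * s)) - 1.0 for s in sigmas]))

# Small, nearly diagonal variances -> success rate close to 1;
# inflating the variances drives it down.
Q = np.array([[0.01, 0.002],
              [0.002, 0.02]])
print(bootstrapped_success_rate(Q))          # close to 1
print(bootstrapped_success_rate(100 * Q))    # much lower
```

Ranking candidate ambiguity subsets by such a success-rate measure is the kind of criterion a PAR procedure can use to decide how many ambiguities to fix.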

Abstract:

Imaging genetics is a new field of neuroscience that blends methods from computational anatomy and quantitative genetics to identify genetic influences on brain structure and function. Here we analyzed brain MRI data from 372 young adult twins to identify cortical regions in which gray matter volume is influenced by genetic differences across subjects. Thickness maps, reconstructed from surface models of the cortical gray/white and gray/CSF interfaces, were smoothed with a 25 mm FWHM kernel and automatically parcellated into 34 regions of interest per hemisphere. In structural equation models fitted to volume values at each surface vertex, we computed components of variance due to additive genetic (A), shared (C) and unique (E) environmental factors, and tested their significance. Cortical regions in the vicinity of the perisylvian language cortex, and at the frontal and temporal poles, showed significant additive genetic variance, suggesting that volume measures from these regions may provide quantitative phenotypes to narrow the search for quantitative trait loci that influence brain structure.
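For intuition about the A/C/E decomposition fitted at each vertex, the sketch below uses the classical Falconer approximation from monozygotic and dizygotic twin correlations, rather than the structural equation models of the study; the correlations are illustrative:

```python
# Classical twin-design decomposition (Falconer's approximation):
# with MZ and DZ twin correlations r_mz and r_dz for a standardised trait,
#   A = 2*(r_mz - r_dz)   additive genetic share
#   C = 2*r_dz - r_mz     shared-environment share
#   E = 1 - r_mz          unique-environment share
def ace(r_mz, r_dz):
    a = 2.0 * (r_mz - r_dz)
    c = 2.0 * r_dz - r_mz
    e = 1.0 - r_mz
    return a, c, e

a, c, e = ace(r_mz=0.7, r_dz=0.4)   # illustrative correlations, not the MRI data
print(a, c, e)                      # approx 0.6, 0.1, 0.3; shares sum to 1
```

MZ twins share all their segregating genes and DZ twins about half, so a large MZ-DZ gap in trait correlation (as here) signals substantial additive genetic variance, the quantity mapped across the cortex in the study.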

Abstract:

Rank-based inference is widely used because of its robustness. This article provides optimal rank-based estimating functions in analysis of clustered data with random cluster effects. The extensive simulation studies carried out to evaluate the performance of the proposed method demonstrate that it is robust to outliers and is highly efficient given the existence of strong cluster correlations. The performance of the proposed method is satisfactory even when the correlation structure is misspecified, or when heteroscedasticity in variance is present. Finally, a real dataset is analyzed for illustration.

Abstract:

Adaptations of weighted rank regression to the accelerated failure time model for censored survival data have been successful in yielding asymptotically normal estimates and flexible weighting schemes to increase statistical efficiency. However, estimating equations are guaranteed to be monotone in parameter components for only one simple weighting scheme, Gehan or Wilcoxon weights, and even in this case they are step functions, requiring the equivalent of linear programming for computation. The lack of smoothness makes standard error or covariance matrix estimation even more difficult. An induced smoothing technique has overcome these difficulties in various problems involving monotone but pure-jump estimating equations, including conventional rank regression. The present paper applies induced smoothing to Gehan-Wilcoxon weighted rank regression for the accelerated failure time model, for the more difficult case of survival time data subject to censoring, where the inapplicability of permutation arguments necessitates a new method of estimating the null variance of the estimating functions. Smooth monotone parameter estimation and rapid, reliable standard error or covariance matrix estimation are obtained.

Abstract:

A 'pseudo-Bayesian' interpretation of standard errors yields a natural induced smoothing of statistical estimating functions. When applied to rank estimation, the lack of smoothness which prevents standard error estimation is remedied. Efficiency and robustness are preserved, while the smoothed estimation has excellent computational properties. In particular, convergence of the iterative equation for standard error is fast, and standard error calculation becomes asymptotically a one-step procedure. This property also extends to covariance matrix calculation for rank estimates in multi-parameter problems. Examples, and some simple explanations, are given.
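The induced smoothing idea can be sketched in one line: each indicator I(x > 0) inside a jump estimating function is replaced by its expectation under a small Gaussian perturbation, the normal CDF Φ(x/h). The bandwidth h below is illustrative:

```python
import numpy as np
from math import erf, sqrt

# Standard normal CDF, vectorised over an array
phi_cdf = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))

def step(x):
    """Original non-smooth ingredient of a rank estimating function."""
    return (x > 0).astype(float)

def smoothed(x, h=0.1):
    """Induced-smoothed version: E[I(x + h*Z > 0)] = Phi(x / h) for Z ~ N(0,1)."""
    return phi_cdf(x / h)

x = np.linspace(-1, 1, 5)
print(step(x))       # jumps abruptly at zero
print(smoothed(x))   # smooth, differentiable transition through zero
```

Because the smoothed function is differentiable, its derivative can feed a sandwich-type standard error calculation, which is exactly what the step function prevented.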

Abstract:

Multiphenotype genome-wide association studies (GWAS) may reveal pleiotropic genes, which would remain undetected using single-phenotype analyses. Analysis of large pedigrees offers the added advantage of more accurately assessing trait heritability, which can help prioritise genetically influenced phenotypes for GWAS analysis. In this study we performed a principal component analysis (PCA), heritability (h2) estimation and pedigree-based GWAS of 37 cardiovascular disease-related phenotypes in 330 related individuals forming a large pedigree from the Norfolk Island genetic isolate. PCA revealed 13 components explaining >75% of the total variance. Nine components yielded statistically significant h2 values ranging from 0.22 to 0.54 (P < 0.05). The most heritable component was loaded with 7 phenotypic measures reflecting metabolic and renal dysfunction. A GWAS of this composite phenotype revealed statistically significant associations for 3 adjacent SNPs on chromosome 1p22.2 (P < 1x10^-8). These SNPs form a 42 kb haplotype block and explain 11% of the genetic variance for this renal function phenotype. Replication analysis of the tagging SNP (rs1396315) in an independent US cohort supports the association (P = 0.000011). Blood transcript analysis showed 35 genes were associated with rs1396315 (P < 0.05). Gene set enrichment analysis of these genes revealed the most enriched pathway was purine metabolism (P = 0.0015). Overall, our findings provide convincing evidence for a major pleiotropic effect locus on chromosome 1p22.2 influencing risk of renal dysfunction via purine metabolism pathways in the Norfolk Island population. Further studies are now warranted to interrogate the functional relevance of this locus in terms of renal pathology and cardiovascular disease risk.

Abstract:

Migraine is a common neurovascular brain disorder that is manifested in recurrent episodes of disabling headache. The aim of the present study was to compare the prevalence and heritability of migraine across six of the countries that participate in the GenomEUtwin project, including a total of 29,717 twin pairs. Migraine was assessed by questionnaires that differed between most countries. It was most prevalent in Danish and Dutch females (32% and 34%, respectively), whereas the lowest prevalence was found in the younger and older Finnish cohorts (13% and 10%, respectively). The estimated genetic variance (heritability) was significant and the same between sexes in all countries. Heritability ranged from 34% to 57%, with the lowest estimates in Australia and the highest estimates in the older Finnish cohort, the Netherlands, and Denmark. There was some indication that part of the genetic variance was non-additive, but this was significant in Sweden only. In addition to genetic factors, environmental effects that are not shared between members of a twin pair contributed to the liability of migraine. After migraine definitions are homogenized among the participating countries, the GenomEUtwin project will provide a powerful resource to identify the genes involved in migraine.