Digital Library

981 results for Correlation matrix

Managing memory and reducing I/O cost for correlation matrix calculation in bioinformatics

Relevance: 100.00%

Abstract:

The generation of a correlation matrix from a large set of long gene sequences is a common requirement in many bioinformatics problems such as phylogenetic analysis. The generation is not only computationally intensive but also requires significant memory resources as, typically, few gene sequences can be simultaneously stored in primary memory. The standard practice in such computation is to use frequent input/output (I/O) operations. Therefore, minimizing the number of these operations will yield much faster run-times. This paper develops an approach for the faster and scalable computing of large-size correlation matrices through the full use of available memory and a reduced number of I/O operations. The approach is scalable in the sense that the same algorithms can be executed on different computing platforms with different amounts of memory and can be applied to different problems with different correlation matrix sizes. The significant performance improvement of the approach over the existing approaches is demonstrated through benchmark examples.

Optimizing I/O cost and managing memory for composition vector method based on correlation matrix calculation in bioinformatics

Relevance: 100.00%

Abstract:

The generation of a correlation matrix for a set of genomic sequences is a common requirement in many bioinformatics problems such as phylogenetic analysis. Each sequence may be millions of bases long and there may be thousands of such sequences to compare, so not all sequences may fit into main memory at the same time. Each sequence needs to be compared with every other sequence, so some sequences will generally need to be paged in and out more than once. To minimize execution time, this I/O must be minimized. This paper develops an approach for faster and scalable computing of large-size correlation matrices by maximally exploiting available memory and reducing the number of I/O operations. The approach is scalable in the sense that the same algorithms can be executed on different computing platforms with different amounts of memory and can be applied to different bioinformatics problems with different correlation matrix sizes. The significant performance improvement of the approach over previous work is demonstrated through benchmark examples.
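The paging problem described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each sequence has already been reduced to a numeric vector, uses plain Pearson correlation, and splits memory into a resident block and a streamed block. The names `pearson`, `blocked_correlation`, `load`, and `capacity` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def blocked_correlation(load, n_items, capacity):
    """Fill an n_items x n_items correlation matrix while holding at most
    `capacity` vectors in memory. `load(i)` stands in for one disk read
    (hypothetical). Returns (matrix, number_of_loads)."""
    if capacity < 2:
        raise ValueError("need room for at least two vectors")
    half = capacity // 2  # half for the resident block, half for the streamed block
    blocks = [range(s, min(s + half, n_items)) for s in range(0, n_items, half)]
    corr = [[1.0] * n_items for _ in range(n_items)]  # diagonal stays 1.0
    loads = 0
    for bi, rows in enumerate(blocks):
        row_data = {i: load(i) for i in rows}  # resident block, loaded once
        loads += len(rows)
        for i, j in combinations(rows, 2):  # pairs inside the resident block
            corr[i][j] = corr[j][i] = pearson(row_data[i], row_data[j])
        for cols in blocks[bi + 1:]:
            col_data = {j: load(j) for j in cols}  # streamed block
            loads += len(cols)
            for i in rows:
                for j in cols:
                    corr[i][j] = corr[j][i] = pearson(row_data[i], col_data[j])
    return corr, loads


# Toy data standing in for per-sequence feature vectors (illustrative only).
vectors = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1],
           [1, 3, 2, 4], [5, 1, 4, 2], [2, 2, 3, 5]]
matrix, loads = blocked_correlation(lambda i: vectors[i], len(vectors), capacity=4)
```

With six vectors and room for four, this schedule performs 12 loads, against 30 if both vectors were read afresh for each of the 15 pairs. Larger values of `capacity` reduce the count further, which is the effect the abstract attributes to making fuller use of available memory.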

Discussion of “Generalized estimating equations: Notes on the choice of the working correlation matrix”

Relevance: 100.00%

Abstract:

Objective: To discuss generalized estimating equations as an extension of generalized linear models by commenting on the paper of Ziegler and Vens, "Generalized Estimating Equations. Notes on the Choice of the Working Correlation Matrix". Methods: An international group of experts was invited to comment on this paper. Results: The discussants took several perspectives. Econometricians established parallels to the generalized method of moments (GMM). Statisticians discussed model assumptions and the aspect of missing data. Applied statisticians commented on practical aspects of data analysis. Conclusions: In general, careful modeling of correlation is encouraged when considering estimation efficiency and other implications, and a comparison of choosing instruments in GMM and generalized estimating equations (GEE) would be worthwhile. Some theoretical drawbacks of GEE need to be further addressed and require careful analysis of data. This particularly applies to the situation when data are missing at random.

(Table 5) Correlation matrix between past factors and the SST alkenones of ODP Site 1233

Relevance: 100.00%

(Table 2) Correlation matrix obtained between the factors from the present data set and the selected environmental variables of ODP Site 1233

Relevance: 100.00%

(Table 4) Correlation matrix between present and past factors of ODP Site 1233

Relevance: 100.00%

Tab.3: Spearman correlation matrix

Relevance: 100.00%

Test for Independence of the Variables with Missing Elements in One and the Same Column of the Empirical Correlation Matrix

Relevance: 100.00%

Abstract:

2000 Mathematics Subject Classification: 62H15, 62H12.

Hierarchical clustering of the genetic connectivity matrix reveals the network topology of gene action on brain microstructure: An N=531 twin study

Relevance: 70.00%

Abstract:

Genetic correlation (rg) analysis determines how much of the correlation between two measures is due to common genetic influences. In an analysis of 4 Tesla diffusion tensor images (DTI) from 531 healthy young adult twins and their siblings, we generalized the concept of genetic correlation to determine common genetic influences on white matter integrity, measured by fractional anisotropy (FA), at all points of the brain, yielding an N×N genetic correlation matrix rg(x,y) between FA values at all pairs of voxels in the brain. With hierarchical clustering, we identified brain regions with relatively homogeneous genetic determinants, to boost the power to identify causal single nucleotide polymorphisms (SNPs). We applied genome-wide association (GWA) to assess associations between 529,497 SNPs and FA in clusters defined by hubs of the clustered genetic correlation matrix. We identified a network of genes, with a scale-free topology, that influences white matter integrity over multiple brain regions.

Working correlation structure misspecification, estimation and covariate design: Implications for generalised estimating equations performance

Relevance: 70.00%

Abstract:

The method of generalised estimating equations for regression modelling of clustered outcomes allows for specification of a working matrix that is intended to approximate the true correlation matrix of the observations. We investigate the asymptotic relative efficiency of the generalised estimating equation for the mean parameters when the correlation parameters are estimated by various methods. The asymptotic relative efficiency depends on three features of the analysis, namely (i) the discrepancy between the working correlation structure and the unobservable true correlation structure, (ii) the method by which the correlation parameters are estimated and (iii) the 'design', by which we refer to both the structures of the predictor matrices within clusters and the distribution of cluster sizes. Analytical and numerical studies of realistic data-analysis scenarios show that the choice of working covariance model has a substantial impact on regression estimator efficiency. Protection against avoidable loss of efficiency associated with covariance misspecification is obtained when a 'Gaussian estimation' pseudolikelihood procedure is used with an AR(1) structure.
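For readers unfamiliar with working correlation structures, the two most common textbook forms can be written down directly. The sketch below only builds the matrices for a single cluster. It does not implement the 'Gaussian estimation' procedure mentioned in the abstract, and the function names and the parameter `rho` are illustrative assumptions.

```python
def ar1_correlation(size, rho):
    """AR(1) working correlation: entry (j, k) is rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(size)] for j in range(size)]


def exchangeable_correlation(size, rho):
    """Exchangeable working correlation: 1 on the diagonal, rho elsewhere."""
    return [[1.0 if j == k else rho for k in range(size)] for j in range(size)]


# Working correlation matrices for a cluster of three repeated measurements.
ar1 = ar1_correlation(3, 0.5)
exch = exchangeable_correlation(3, 0.5)
```

Under AR(1) the correlation decays with the distance between observations in a cluster, whereas the exchangeable structure treats every pair alike. The discrepancy between such a working matrix and the true correlation is feature (i) in the abstract.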

On the generalized eigenvalue method for energies and matrix elements in lattice field theory

Relevance: 70.00%

Abstract:

We discuss the generalized eigenvalue problem for computing energies and matrix elements in lattice gauge theory, including effective theories such as HQET. We analyze how the extracted effective energies and matrix elements converge when the time separations are made large. This suggests a particularly efficient application of the method for which we can prove that corrections vanish asymptotically as exp(-(E(N+1) - E(n))t). The gap E(N+1) - E(n) can be made large by increasing the number N of interpolating fields in the correlation matrix. We also show how excited state matrix elements can be extracted such that contaminations from all other states disappear exponentially in time. As a demonstration we present numerical results for the extraction of ground state and excited B-meson masses and decay constants in static approximation and to order 1/m(b) in HQET.
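For orientation, the generalized eigenvalue problem referred to in this abstract is usually written as follows. The notation is an assumption, since the abstract does not fix it: $C(t)$ is the $N \times N$ correlation matrix, $t_0$ a reference time, $a$ the lattice spacing, and $\lambda_n$, $v_n$ the generalized eigenvalues and eigenvectors.

```latex
C(t)\, v_n(t, t_0) = \lambda_n(t, t_0)\, C(t_0)\, v_n(t, t_0),
\qquad n = 1, \dots, N, \quad t > t_0

E_n^{\mathrm{eff}}(t, t_0) = \frac{1}{a}
  \log \frac{\lambda_n(t, t_0)}{\lambda_n(t + a, t_0)}
  = E_n + \mathcal{O}\!\left( e^{-(E_{N+1} - E_n)\, t} \right)
```

The second line restates the abstract's convergence claim in this notation. It holds only for the efficient choice of time separations that the abstract alludes to, which is not specified here.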

Empirical Analysis of Credit Risk Regime Switching and Temporal Conditional Default Correlation in Credit Default Swap Valuation: The Market liquidity effect

Relevance: 70.00%

Abstract:

In this paper, we extend the debate concerning Credit Default Swap valuation to include time-varying correlations and covariances. Traditional multivariate techniques treat the correlations between covariates as constant over time; however, this view is not supported by the data. Secondly, since financial data do not follow a normal distribution because of their heavy tails, modeling the data using a Generalized Linear Model (GLM) incorporating copulas emerges as a more robust technique than traditional approaches. This paper also includes an empirical analysis of the regime switching dynamics of credit risk in the presence of liquidity, following the general practice of assuming that credit and market risk follow a Markov process. The study was based on Credit Default Swap data obtained from Bloomberg that spanned the period January 1st 2004 to August 8th 2006. The empirical examination of the regime switching tendencies provided quantitative support for the anecdotal view that liquidity decreases as credit quality deteriorates. The analysis also examined the joint probability distribution of the credit risk determinants across credit quality through the use of a copula function, which disaggregates the behavior embedded in the marginal gamma distributions so as to isolate the level of dependence captured in the copula function. The results suggest that the time-varying joint correlation matrix performed far better than the constant correlation matrix, the centerpiece of linear regression models.

Detection of protein fold similarity based on correlation of amino acid properties

Relevance: 70.00%

Abstract:

An increasing number of proteins with weak sequence similarity have been found to assume a similar three-dimensional fold and often have similar or related biochemical or biophysical functions. We propose a method for detecting the fold similarity between two proteins with low sequence similarity based on their amino acid properties alone. The method, the proximity correlation matrix (PCM) method, is built on the observation that the physical properties of neighboring amino acid residues in sequence at structurally equivalent positions of two proteins of similar fold are often correlated even when the amino acid sequences are different. Hydrophobicity is shown to be the most strongly correlated property for all protein fold classes. The PCM method was tested on 420 proteins belonging to 64 different known folds, each having at least three proteins with little sequence similarity. The method was able to detect fold similarities for 40% of the 420 sequences. Compared with sequence comparison and several fold-recognition methods, the method demonstrates good performance in detecting fold similarities among proteins with low sequence identity. Applied to the complete genome of Methanococcus jannaschii, the method recognized the folds for 22 hypothetical proteins.

Measuring global economic interdependence: a hierarchical network approach

Relevance: 60.00%

Abstract:

This paper investigates the business cycle co-movement across countries and regions since 1950 as a measure for quantifying the economic interdependence in the ongoing globalisation process. Our methodological approach is based on analysis of a correlation matrix and the networks it contains. Such an approach summarises the interaction and interdependence of all elements, and it represents a more accurate measure of the global interdependence involved in an economic system. Our results show (1) the dynamics of interdependence has been driven more by synchronisation in regional growth patterns than by the synchronisation of the world economy, and (2) world crisis periods dramatically increase the global co-movement in the world economy.

A time-based quantitative approach for selecting lean strategies for manufacturing organisations

Relevance: 60.00%

Abstract:

Lean strategies have been developed to eliminate or reduce waste and thus improve operational efficiency in a manufacturing environment. In practice, however, manufacturers have difficulty selecting appropriate lean strategies within their resource constraints and quantitatively evaluating the perceived value of manufacturing waste reduction. This paper presents a methodology developed to quantitatively evaluate the contribution of the lean strategies selected to reduce manufacturing wastes within the manufacturers' resource (time) constraints. A mathematical model has been developed for evaluating the perceived value of lean strategies for manufacturing waste reduction, and a step-by-step methodology is provided for selecting appropriate lean strategies to improve manufacturing performance within those resource constraints. A computer program was developed in MATLAB to find the optimum solution. The proposed methodology and the developed model have been validated with the help of a case study. A 'lean strategy-wastes' correlation matrix has been proposed to establish the relationship between manufacturing wastes and lean strategies. Using the correlation matrix and applying the proposed methodology and mathematical model, the authors obtained an optimised perceived value of reduction of a manufacturer's wastes by implementing appropriate lean strategies within the manufacturer's resource constraints. The results also demonstrate that the perceived value of reduction of manufacturing wastes can change significantly depending on the policies and product strategy adopted by a manufacturer. The proposed methodology can also be used in dynamic situations by changing the input to the MATLAB programme. By identifying appropriate lean strategies for specific manufacturing wastes, a manufacturer can better prioritise implementation efforts and resources to maximise the success of implementing lean strategies in their organisation.
