951 results for Covariance matrix estimation
Abstract:
Vector Taylor Series (VTS) model-based compensation is a powerful approach for noise-robust speech recognition. An important extension to this approach is VTS adaptive training (VAT), which allows canonical models to be estimated on diverse noise-degraded training data. These canonical models can be estimated using EM-based approaches, allowing simple extensions to discriminative VAT (DVAT). However, to ensure a diagonal corrupted-speech covariance matrix, the Jacobian (loading matrix) relating the noise and clean speech is diagonalised. In this work an approach is proposed for yielding optimal diagonal loading matrices, based on minimising the expected KL divergence between the diagonalised and "correct" distributions. The performance of DVAT using the standard and optimal diagonalisation was evaluated on both in-car collected data and the Aurora4 task. © 2012 IEEE.
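A standard fact behind this kind of diagonalisation is that, for Gaussians, the diagonal covariance minimising KL(N(μ,Σ) ‖ N(μ,D)) over diagonal D is simply diag(Σ). The short sketch below is only a generic numerical check of that fact, not the paper's DVAT-specific criterion (which optimises the diagonal loading matrix against the "correct" distributions); all values are illustrative.

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(0)

# Random full covariance standing in for a corrupted-speech distribution.
dim = 4
sigma = wishart.rvs(df=dim + 2, scale=np.eye(dim), random_state=0)

def kl_gauss_diag(sigma_full, diag_vals):
    """KL( N(0, sigma_full) || N(0, diag(diag_vals)) )."""
    trace_term = np.sum(np.diag(sigma_full) / diag_vals)
    logdet_term = np.sum(np.log(diag_vals)) - np.linalg.slogdet(sigma_full)[1]
    return 0.5 * (trace_term - sigma_full.shape[0] + logdet_term)

# The diagonal of sigma is the KL-optimal diagonal approximation ...
kl_opt = kl_gauss_diag(sigma, np.diag(sigma))
# ... so any perturbed diagonal gives a KL divergence at least as large.
kl_perturbed = kl_gauss_diag(sigma, np.diag(sigma) * rng.uniform(0.5, 1.5, dim))

print(f"KL with diag(Sigma):        {kl_opt:.4f}")
print(f"KL with perturbed diagonal: {kl_perturbed:.4f}")
```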
Abstract:
Some amount of differential settlement occurs even in the most uniform soil deposit, but it is extremely difficult to estimate because of the natural heterogeneity of the soil. The compression response of the soil and its variability must be characterised in order to estimate the probability of the differential settlement exceeding a certain threshold value. The work presented in this paper introduces a probabilistic framework to address this issue in a rigorous manner, while preserving the format of a typical geotechnical settlement analysis. In order to avoid dealing with different approaches for each category of soil, a simplified unified compression model is used to characterise the nonlinear compression behavior of soils of varying gradation through a single constitutive law. The Bayesian updating rule is used to incorporate information from three different laboratory datasets in the computation of the statistics (estimates of the means and covariance matrix) of the compression model parameters, as well as of the uncertainty inherent in the model.
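To make the Bayesian updating step concrete, the sketch below shows the standard conjugate update of a Gaussian prior on a parameter vector with a Gaussian likelihood and known observation covariance, applied sequentially over several datasets. It is a generic illustration under those assumptions; the parameter values, dataset sizes and the function name update_gaussian are all hypothetical and not the paper's compression model or laboratory data.

```python
import numpy as np

def update_gaussian(mu_prior, cov_prior, data, obs_cov):
    """Conjugate update of a Gaussian prior N(mu_prior, cov_prior) on the
    parameter vector, given i.i.d. observations data ~ N(theta, obs_cov)."""
    n = data.shape[0]
    xbar = data.mean(axis=0)
    prec_prior = np.linalg.inv(cov_prior)
    prec_obs = np.linalg.inv(obs_cov)
    cov_post = np.linalg.inv(prec_prior + n * prec_obs)
    mu_post = cov_post @ (prec_prior @ mu_prior + n * prec_obs @ xbar)
    return mu_post, cov_post

rng = np.random.default_rng(1)
true_theta = np.array([0.12, 0.85])      # two hypothetical compression parameters
obs_cov = np.diag([0.02, 0.05])          # assumed measurement scatter

mu, cov = np.zeros(2), np.eye(2)         # vague prior
for n_obs in (15, 30, 50):               # three hypothetical laboratory datasets
    data = rng.multivariate_normal(true_theta, obs_cov, size=n_obs)
    mu, cov = update_gaussian(mu, cov, data, obs_cov)
    print(f"after {n_obs:2d} tests: mean={mu.round(3)}, var={np.diag(cov).round(4)}")
```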
Abstract:
We present the normal form of the covariance matrix for three-mode tripartite Gaussian states. By means of this result, the general form of a necessary and sufficient criterion for the possibility of a state transformation from one tripartite entangled Gaussian state to another with three modes is found. Moreover, we show that the conditions presented include not only inequalities but equalities as well.
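For reference, a three-mode covariance matrix has the generic 2×2-block structure shown below, one block per mode; the normal form derived in the paper is a particular simplification of these blocks obtained by local operations and is not reproduced here.

```latex
\[
\sigma =
\begin{pmatrix}
A_1 & C_{12} & C_{13} \\
C_{12}^{\mathsf T} & A_2 & C_{23} \\
C_{13}^{\mathsf T} & C_{23}^{\mathsf T} & A_3
\end{pmatrix},
\qquad A_k,\, C_{jk} \in \mathbb{R}^{2\times 2},
\]
% each A_k is the covariance matrix of mode k, and each C_{jk} collects the
% correlations between modes j and k.
```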
Abstract:
The fundamental matrix, as a powerful tool for analysing the epipolar geometry of two views, plays an important role in the computer vision field. This paper analyses the shortcomings of traditional robust methods for estimating the fundamental matrix, introduces the LQS method from robust regression analysis, and combines it with the bucketing technique to propose a new robust method for estimating the fundamental matrix that overcomes the defects of the RANSAC and LMedS methods. Experimental results on simulated data and real images show that the proposed method achieves higher robustness and accuracy.
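A minimal sketch of least-quantile-of-squares (LQS) estimation of the fundamental matrix: random minimal eight-point samples are scored by a quantile of the Sampson residuals, and the model with the smallest quantile is kept. The bucket-based spatial sampling used in the paper and Hartley normalisation are omitted, and all function names, the quantile level and iteration count are illustrative.

```python
import numpy as np

def eight_point(x1, x2):
    """Linear eight-point estimate of F from homogeneous points x1, x2 (N x 3),
    satisfying x2^T F x1 = 0. (Hartley normalisation omitted for brevity.)"""
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1)),
    ])
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)
    u, s, vt = np.linalg.svd(F)            # enforce rank 2
    s[2] = 0.0
    return u @ np.diag(s) @ vt

def sampson_sq(F, x1, x2):
    """Squared Sampson distance of each correspondence to the epipolar constraint."""
    Fx1 = x1 @ F.T                         # rows: F  @ x1_i
    Ftx2 = x2 @ F                          # rows: F^T @ x2_i
    num = np.einsum('ij,ij->i', x2, Fx1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

def lqs_fundamental(x1, x2, q=0.6, n_iter=500, seed=None):
    """Least-Quantile-of-Squares F: over random 8-point samples, keep the model
    whose q-quantile of Sampson residuals is smallest. (The paper additionally
    draws samples bucket-wise over the image; that stratification is omitted.)"""
    rng = np.random.default_rng(seed)
    best_F, best_score = None, np.inf
    for _ in range(n_iter):
        idx = rng.choice(len(x1), size=8, replace=False)
        F = eight_point(x1[idx], x2[idx])
        score = np.quantile(sampson_sq(F, x1, x2), q)
        if score < best_score:
            best_F, best_score = F, score
    return best_F, best_score
```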
Abstract:
This dissertation addresses the problems of signal reconstruction and data restoration in seismic data processing, taking signal representation methods as the main thread and seismic information reconstruction (signal separation and trace interpolation) as the core. For representation on natural bases, I present the fundamentals and algorithms of Independent Component Analysis (ICA) and its original applications to the separation of natural earthquake signals and of survey seismic signals. For representation on deterministic bases, the dissertation proposes least-squares inversion regularization methods for seismic data reconstruction, sparseness constraints and pre-conditioned conjugate gradient (PCG) methods, together with their applications to seismic deconvolution, Radon transformation and related problems. The core contribution is a de-aliased reconstruction algorithm for unevenly sampled seismic data and its application to seismic interpolation. Although the dissertation discusses two cases of signal representation, they can be integrated into one framework, because both deal with signal or information restoration: the former reconstructs original signals from mixed signals, the latter reconstructs complete data from sparse or irregular data. Their common goal is to provide pre-processing and post-processing methods for seismic pre-stack depth migration.

ICA can separate original signals from mixtures, or extract the basic structure of the analysed data. I survey the fundamentals, algorithms and applications of ICA. Compared with the KL transformation, I propose the concept of an independent components transformation (ICT). Based on the negentropy measure of independence, I implement FastICA and improve it using the covariance matrix. By analysing the characteristics of seismic signals, I introduce ICA into seismic signal processing, to my knowledge for the first time in the geophysical community, and implement the separation of noise from seismic signals. Synthetic and real data examples show that ICA is usable for seismic signal processing, and initial results are achieved. ICA is applied to separating earthquake converted waves from multiples in a sedimentary area, with good results, leading to a more reasonable interpretation of subsurface discontinuities. The results show the prospects of applying ICA to geophysical signal processing. Exploiting the relationship between ICA and blind deconvolution, I survey seismic blind deconvolution and discuss the prospects of applying ICA to it, with two possible solutions. The relationships among PCA, ICA and the wavelet transform are described, and it is shown that the reconstruction of wavelet prototype functions is a Lie group representation. In addition, an over-sampled wavelet transform is proposed to enhance seismic data resolution, which is validated by numerical examples.

The key to pre-stack depth migration is the regularization of pre-stack seismic data, for which seismic interpolation and missing-data reconstruction are necessary procedures. I first review seismic imaging methods in order to argue the critical role of regularization. A review of seismic interpolation algorithms shows that de-aliased reconstruction of unevenly sampled data remains a challenge. The fundamentals of seismic reconstruction are discussed first; then sparseness-constrained least-squares inversion and a pre-conditioned conjugate gradient solver are studied and implemented. Choosing a Cauchy-distributed constraint term, I program the PCG algorithm and implement sparse seismic deconvolution and high-resolution Radon transformation by PCG, in preparation for seismic data reconstruction. In seismic interpolation, de-aliased interpolation of evenly sampled data and reconstruction of unevenly sampled data each work well separately, but they could not previously be combined. In this dissertation, a novel Fourier-transform-based method and algorithm are proposed that can reconstruct seismic data that are both unevenly sampled and aliased. I formulate band-limited data reconstruction as a minimum-norm least-squares inversion problem with an adaptive DFT-weighted norm regularization term. The inverse problem is solved by the pre-conditioned conjugate gradient method, which makes the solution stable and quickly convergent. Under the assumption that seismic data consist of a finite number of linear events, and following the sampling theorem, aliased events can be attenuated via least-squares weights predicted linearly from the low frequencies. Three applications are discussed: interpolation across even gaps, filling of uneven gaps, and reconstruction of high-frequency traces from low-frequency data constrained by a few high-frequency traces. Both synthetic and real data examples show that the proposed method is valid, efficient and applicable. The research is valuable for seismic data regularization and cross-well seismics. To meet the data requirements of 3D shot-profile depth migration, the data must be made evenly sampled and consistent with the velocity dataset. The methods of this dissertation are used to interpolate and extrapolate shot gathers instead of simply inserting zero traces, so the migration aperture is enlarged and the migration result is improved. The results show the effectiveness and practicability of the approach.
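As a necessarily simplified illustration of sparseness-constrained least-squares inversion with a Cauchy penalty, the sketch below solves a small deconvolution problem by iteratively reweighted least squares, with each inner system solved by conjugate gradients. The operator, sizes and parameters are toy stand-ins, not the dissertation's actual implementation.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(2)

# Toy convolution operator A (wavelet-like kernel) and a sparse reflectivity x.
n = 200
t = np.arange(-20, 21)
wavelet = (1 - 0.1 * t**2) * np.exp(-0.05 * t**2)
col = np.zeros(n); col[:len(wavelet)] = wavelet
A = toeplitz(col, np.zeros(n))

x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = rng.normal(0, 1, 8)
d = A @ x_true + 0.02 * rng.normal(size=n)

# IRLS for  min ||A x - d||^2 + lam * sum log(1 + x_i^2 / sig^2)  (Cauchy prior):
# at each outer iteration solve (A^T A + lam * W(x)) x = A^T d with CG, where
# W(x) = diag(1 / (sig^2 + x_i^2))  (constant factors absorbed into lam).
lam, sig = 0.5, 0.05
x = np.zeros(n)
for _ in range(10):
    w = 1.0 / (sig**2 + x**2)
    op = LinearOperator(
        (n, n),
        matvec=lambda v, w=w: A.T @ (A @ np.ravel(v)) + lam * w * np.ravel(v),
    )
    x, _ = cg(op, A.T @ d, x0=x, maxiter=200)

print("spikes recovered near the true positions:",
      int(np.sum(np.abs(x[np.abs(x_true) > 0]) > 0.1)))
```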
Abstract:
Network traffic arises from the superposition of Origin-Destination (OD) flows. Hence, a thorough understanding of OD flows is essential for modeling network traffic, and for addressing a wide variety of problems including traffic engineering, traffic matrix estimation, capacity planning, forecasting and anomaly detection. However, to date, OD flows have not been closely studied, and there is very little known about their properties. We present the first analysis of complete sets of OD flow timeseries, taken from two different backbone networks (Abilene and Sprint-Europe). Using Principal Component Analysis (PCA), we find that the set of OD flows has small intrinsic dimension. In fact, even in a network with over a hundred OD flows, these flows can be accurately modeled in time using a small number (10 or less) of independent components or dimensions. We also show how to use PCA to systematically decompose the structure of OD flow timeseries into three main constituents: common periodic trends, short-lived bursts, and noise. We provide insight into how the various constituents contribute to the overall structure of OD flows and explore the extent to which this decomposition varies over time.
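A minimal sketch of the kind of PCA analysis described: stack the OD-flow time series as columns of a matrix, apply PCA via the SVD, and examine how much variance the leading components explain. The synthetic data below merely stand in for real Abilene/Sprint measurements; sizes and periodicities are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in: 121 OD flows over 1008 time bins (one week at 10 min),
# generated from a handful of shared periodic patterns plus noise.
T, n_flows, k_true = 1008, 121, 5
t = np.arange(T)
basis = np.column_stack([np.sin(2 * np.pi * (h + 1) * t / 144) for h in range(k_true)])
X = basis @ rng.normal(0, 1, (k_true, n_flows)) + 0.1 * rng.normal(size=(T, n_flows))

# PCA via SVD of the mean-centred data matrix (rows = time, columns = OD flows).
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = np.cumsum(s**2) / np.sum(s**2)

k95 = np.searchsorted(explained, 0.95) + 1
print(f"components needed for 95% of the energy: {k95} (out of {n_flows} flows)")
print("cumulative energy of the first 10 PCs:", explained[:10].round(3))
```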
Abstract:
This paper analyses multivariate statistical techniques for identifying and isolating abnormal process behaviour. These techniques include contribution charts and variable reconstructions that relate to the application of principal component analysis (PCA). The analysis reveals, firstly, that contribution charts produce variable contributions which are linearly dependent and may lead to an incorrect diagnosis if the number of principal components retained is close to the number of recorded process variables. The analysis yields, secondly, that variable reconstruction affects the geometry of the PCA decomposition. The paper further introduces an improved variable reconstruction method for identifying multiple sensor and process faults and for isolating their influence upon the recorded process variables. It is shown that this method can accommodate the effect of reconstruction, i.e. changes in the covariance matrix of the sensor readings, and correctly re-define the PCA-based monitoring statistics and their confidence limits. (c) 2006 Elsevier Ltd. All rights reserved.
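To ground the discussion, the sketch below computes the usual PCA-based monitoring statistics, Hotelling's T² on the retained scores and the SPE/Q residual statistic, together with simple per-variable contributions to SPE. It is a textbook-style illustration with made-up data, not the improved reconstruction scheme proposed in the paper.

```python
import numpy as np

def fit_pca_monitor(X, n_comp):
    """Fit a PCA monitoring model on normal-operation data X (samples x variables)."""
    mu, sd = X.mean(0), X.std(0)
    Z = (X - mu) / sd
    evals, evecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(evals)[::-1]
    P = evecs[:, order[:n_comp]]          # loadings of the retained components
    lam = evals[order[:n_comp]]
    return mu, sd, P, lam

def monitor(x, mu, sd, P, lam):
    """Return T^2, SPE and per-variable SPE contributions for one sample x."""
    z = (x - mu) / sd
    t = P.T @ z                           # scores
    T2 = np.sum(t**2 / lam)               # Hotelling's T^2
    residual = z - P @ t                  # part not explained by the model
    SPE = np.sum(residual**2)             # Q statistic
    return T2, SPE, residual**2           # squared residuals = SPE contributions

rng = np.random.default_rng(4)
X_train = rng.normal(size=(500, 6)) @ rng.normal(size=(6, 6))   # correlated data
mu, sd, P, lam = fit_pca_monitor(X_train, n_comp=3)

x_fault = X_train[0].copy()
x_fault[2] += 5 * sd[2]                   # bias fault on the third sensor
T2, SPE, contrib = monitor(x_fault, mu, sd, P, lam)
print(f"T2={T2:.2f}  SPE={SPE:.2f}  largest SPE contribution: variable index {contrib.argmax()}")
```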
Abstract:
We draw an explicit connection between the statistical properties of an entangled two-mode continuous variable (CV) resource and the amount of entanglement that can be dynamically transferred to a pair of noninteracting two-level systems. More specifically, we rigorously reformulate the entanglement-transfer process by making use of the covariance matrix formalism. When the resource state is Gaussian, our method makes the approach to the transfer of quantum correlations much more flexible than in previously considered schemes and allows the straightforward inclusion of the effects of noise affecting the CV system. Moreover, the proposed method reveals that the use of de-Gaussified two-mode states is almost never advantageous for transferring entanglement with respect to the full Gaussian picture, even though the entanglement in the non-Gaussian resource can be much larger than in its Gaussian counterpart. We can thus conclude that the entanglement-transfer map overthrows the
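For context, a two-mode Gaussian resource is specified (up to local displacements) by its covariance matrix, which local operations bring to the well-known standard form below, in units where the vacuum variance is 1; covariance-matrix treatments of entanglement transfer work directly with such entries.

```latex
\[
\sigma =
\begin{pmatrix}
a & 0 & c_{+} & 0 \\
0 & a & 0 & c_{-} \\
c_{+} & 0 & b & 0 \\
0 & c_{-} & 0 & b
\end{pmatrix},
\qquad a, b \ge 1,
\]
% a and b are the local variances of the two modes, and c_{+}, c_{-} their
% quadrature correlations.
```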
Abstract:
This paper discusses the monitoring of complex nonlinear and time-varying processes. Kernel principal component analysis (KPCA) has gained significant attention as a monitoring tool for nonlinear systems in recent years but relies on a fixed model that cannot be employed for time-varying systems. The contribution of this article is the development of a numerically efficient and memory-saving moving window KPCA (MWKPCA) monitoring approach. The proposed technique incorporates an up- and downdating procedure to (i) adapt the data mean and covariance matrix in the feature space and (ii) approximate the eigenvalues and eigenvectors of the Gram matrix. The article shows that the proposed MWKPCA algorithm has a computational complexity of O(N²), whilst batch techniques, e.g. the Lanczos method, are of O(N³). Including the adaptation of the number of retained components and an l-step-ahead application of the MWKPCA monitoring model, the paper finally demonstrates the utility of the proposed technique using a simulated nonlinear time-varying system and recorded data from an industrial distillation column.
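The sketch below illustrates the basic moving-window KPCA idea with a Gaussian kernel: keep the most recent N samples, centre the Gram matrix, and monitor new samples in the retained kernel subspace. For clarity it simply recomputes the eigendecomposition of the window at every step (an O(N³) batch operation) rather than the O(N²) up-/downdating scheme that is the paper's actual contribution; class and parameter names are illustrative.

```python
import numpy as np
from collections import deque

def rbf_kernel(A, B, gamma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class MovingWindowKPCA:
    """Moving-window KPCA monitor, refitted from scratch at every update."""

    def __init__(self, window=100, n_comp=3, gamma=0.5):
        self.n_comp, self.gamma = n_comp, gamma
        self.buf = deque(maxlen=window)

    def update(self, x):
        self.buf.append(np.asarray(x, float))
        X = np.asarray(self.buf)
        n = len(X)
        K = rbf_kernel(X, X, self.gamma)
        J = np.eye(n) - np.ones((n, n)) / n           # centring matrix
        evals, evecs = np.linalg.eigh(J @ K @ J)
        idx = np.argsort(evals)[::-1][: self.n_comp]
        self.X, self.colmean, self.gmean, self.n = X, K.mean(0), K.mean(), n
        self.evals, self.evecs = evals[idx], evecs[:, idx]

    def t2(self, x):
        """Hotelling's T^2 of a new sample in the retained kernel subspace."""
        k = rbf_kernel(np.asarray([x], float), self.X, self.gamma)[0]
        kc = k - self.colmean - k.mean() + self.gmean  # centre the test kernel vector
        scores = kc @ self.evecs / np.sqrt(self.evals)
        return float(np.sum(scores**2 / (self.evals / self.n)))

rng = np.random.default_rng(5)
mon = MovingWindowKPCA(window=80, n_comp=3, gamma=2.0)
for t in range(200):                                   # slowly drifting toy process
    mon.update(rng.normal(size=2) + 0.01 * t)
print("T2 of a new sample:", round(mon.t2(np.array([2.0, 2.0])), 2))
```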
Abstract:
High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable for revealing systems-related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data, provide links to software implementations and tools, and also address the general problem of multiple hypothesis testing. Further, we provide recommendations for the selection of such analysis methods.
Abstract:
This paper explores the performance of sliding-window-based training, termed semi-batch, using a multilayer perceptron (MLP) neural network in the presence of correlated data. Sliding-window training is a form of higher-order instantaneous learning strategy that does not require a covariance matrix and is usually employed for modeling and tracking purposes. The sliding-window framework is implemented to combine the robustness of offline learning algorithms with the ability to track online the underlying process of a function. This paper adopts sliding-window training with recent advances in conjugate gradient directions, together with data-store management techniques, e.g. a simple distance measure, angle evaluation and a novel prediction error test. The simulation results show that the best convergence performance is gained by using the store-management techniques. © 2012 Springer-Verlag.
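A minimal sketch of the sliding-window (semi-batch) idea: keep a fixed-length buffer of the most recent input/target pairs and refit the MLP on that window as new data arrive. It uses scikit-learn's MLPRegressor with stochastic-gradient updates rather than the conjugate-gradient directions and store-management tests studied in the paper, and all sizes and the target function are illustrative.

```python
import numpy as np
from collections import deque
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)
window = deque(maxlen=200)                 # sliding window of recent samples

mlp = MLPRegressor(hidden_layer_sizes=(20,), learning_rate_init=0.01,
                   random_state=0)         # default 'adam' solver supports partial_fit

def target(t, x):
    """Slowly time-varying process the network has to track."""
    return np.sin(3 * x) + 0.001 * t

errors = []
for t in range(2000):
    x = rng.uniform(-1, 1)
    y = target(t, x) + 0.05 * rng.normal()
    window.append((x, y))
    X = np.array([w[0] for w in window]).reshape(-1, 1)
    Y = np.array([w[1] for w in window])
    mlp.partial_fit(X, Y)                  # one pass over the current window (semi-batch)
    if t > 200:
        errors.append((mlp.predict([[x]])[0] - target(t, x)) ** 2)

print("mean tracking MSE after warm-up:", round(float(np.mean(errors)), 4))
```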
Abstract:
A geostatistical version of the classical Fisher rule (linear discriminant analysis) is presented. This method is applicable when a large dataset of multivariate observations is available within a domain split into several known subdomains, and it assumes that the variograms (or covariance functions) are comparable between subdomains, which differ only in the mean values of the available variables. The method consists of finding the eigen-decomposition of the matrix W⁻¹B, where W is the matrix of sills of all direct and cross-variograms, and B is the covariance matrix of the vectors of weighted means within each subdomain, obtained by generalized least squares. The method is used to map peat blanket occurrence in Northern Ireland, with data from the Tellus survey, which requires a minimal change to the general recipe: to use compositionally compliant variogram tools and models, and to work with log-ratio transformed data.
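The core computation, the eigen-decomposition of W⁻¹B, can be carried out as a generalized symmetric eigenproblem, as the brief sketch below shows; the matrices here are made-up stand-ins for the fitted variogram sills W and the between-subdomain covariance B.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(7)
p = 5                                     # number of (log-ratio transformed) variables

# Stand-ins: W = matrix of direct/cross-variogram sills (symmetric positive definite),
# B = covariance of the GLS-weighted subdomain means (symmetric positive semidefinite).
M = rng.normal(size=(p, p)); W = M @ M.T + p * np.eye(p)
N = rng.normal(size=(p, 2)); B = N @ N.T  # low rank when there are few subdomains

# Eigenvectors of W^{-1}B are the solutions of the generalized problem B v = lambda W v.
evals, evecs = eigh(B, W)
order = np.argsort(evals)[::-1]
discriminant_directions = evecs[:, order] # columns sorted by decreasing eigenvalue

print("leading eigenvalues of W^{-1}B:", np.round(evals[order][:3], 4))
```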
Abstract:
The paper describes the use of radial basis function neural networks with Gaussian basis functions to classify incomplete feature vectors. The method uses the fact that any marginal distribution of a Gaussian distribution can be determined from the mean vector and covariance matrix of the joint distribution.
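The key property used, that any marginal of a Gaussian is obtained by selecting the corresponding entries of the mean vector and covariance matrix, is illustrated below for evaluating a Gaussian basis function on an incomplete feature vector; the function name, sizes and the convention for a fully missing input are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gaussian_basis_incomplete(x, mean, cov):
    """Evaluate a Gaussian basis function at x, where x may contain NaNs for
    missing features: marginalise the Gaussian onto the observed components by
    selecting the matching sub-vector of the mean and sub-matrix of the covariance."""
    obs = ~np.isnan(x)
    if not obs.any():
        return 1.0                            # convention: nothing observed, no information
    mu_m = mean[obs]
    cov_m = cov[np.ix_(obs, obs)]
    return multivariate_normal.pdf(x[obs], mean=mu_m, cov=cov_m)

mean = np.array([0.0, 1.0, -1.0])
cov = np.array([[1.0, 0.3, 0.0],
                [0.3, 2.0, 0.5],
                [0.0, 0.5, 1.5]])

x_full = np.array([0.2, 1.1, -0.8])
x_missing = np.array([0.2, np.nan, -0.8])     # second feature unobserved
print(gaussian_basis_incomplete(x_full, mean, cov))
print(gaussian_basis_incomplete(x_missing, mean, cov))
```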
Abstract:
We study the problem of testing the error distribution in a multivariate linear regression (MLR) model. The tests are functions of appropriately standardized multivariate least squares residuals whose distribution is invariant to the unknown cross-equation error covariance matrix. Empirical multivariate skewness and kurtosis criteria are then compared to simulation-based estimates of their expected values under the hypothesized distribution. Special cases considered include testing multivariate normal, Student t, normal mixture and stable error models. In the Gaussian case, finite-sample versions of the standard multivariate skewness and kurtosis tests are derived. To do this, we exploit simple, double and multi-stage Monte Carlo test methods. For non-Gaussian distribution families involving nuisance parameters, confidence sets are derived for the nuisance parameters and the error distribution. The procedures considered are evaluated in a small simulation experiment. Finally, the tests are applied to an asset pricing model with observable risk-free rates, using monthly returns on New York Stock Exchange (NYSE) portfolios over five-year subperiods from 1926 to 1995.
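As a simplified illustration of the simulation-based approach (not the paper's exact multi-stage Monte Carlo tests on standardized MLR residuals), the sketch below computes Mardia's multivariate skewness and kurtosis statistics and obtains Monte Carlo p-values by simulating the statistics under multivariate normality; function names and sample sizes are illustrative.

```python
import numpy as np

def mardia_stats(X):
    """Mardia's multivariate skewness b_{1,p} and kurtosis b_{2,p}."""
    n, p = X.shape
    Xc = X - X.mean(0)
    S_inv = np.linalg.inv(np.cov(Xc, rowvar=False, bias=True))
    G = Xc @ S_inv @ Xc.T                  # Mahalanobis inner products
    b1 = (G**3).mean()                     # (1/n^2) sum_ij g_ij^3
    b2 = (np.diag(G)**2).mean()            # (1/n)   sum_i  g_ii^2
    return b1, b2

def mc_pvalues(X, n_sim=999, seed=None):
    """Upper-tail Monte Carlo p-values for the observed statistics under normality
    (Mardia's statistics are affine invariant, so simulating N(0, I) suffices)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    b1_obs, b2_obs = mardia_stats(X)
    sims = [mardia_stats(rng.normal(size=(n, p))) for _ in range(n_sim)]
    b1_sim, b2_sim = np.array(sims).T
    p_b1 = (1 + np.sum(b1_sim >= b1_obs)) / (n_sim + 1)
    p_b2 = (1 + np.sum(b2_sim >= b2_obs)) / (n_sim + 1)
    return p_b1, p_b2

rng = np.random.default_rng(8)
residuals = rng.standard_t(df=4, size=(120, 3))   # heavy-tailed toy "residuals"
print(mc_pvalues(residuals, n_sim=499, seed=0))
```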
Abstract:
This thesis considers a set of methods allowing statistical learning algorithms to better handle the sequential nature of financial portfolio management problems. We begin by considering the general problem of composing learning algorithms that must handle sequential tasks, in particular the efficient updating of training sets within a sequential validation framework. We enumerate the desiderata that composition primitives must satisfy, and highlight the difficulty of achieving them rigorously and efficiently. We then present a set of algorithms that meet these objectives and present a case study of a complex financial decision-making system that uses these techniques. Next, we describe a general method for transforming a non-Markovian sequential decision problem into a supervised learning problem by employing a search algorithm based on the K best paths. We treat a portfolio management application in which we train a learning algorithm to directly optimize a Sharpe ratio (or another non-additive criterion incorporating risk aversion). We illustrate the approach with an extensive experimental study, proposing a neural network architecture specialized for portfolio management and comparing it to several alternatives. Finally, we introduce a functional representation of time series allowing forecasts to be made over a variable horizon while using an information set revealed progressively. The approach is based on Gaussian processes, which provide a full covariance matrix between all points for which a forecast is requested. This information is exploited by an algorithm that actively trades price spreads between commodity futures contracts. The proposed approach produces a significant out-of-sample risk-adjusted return, after transaction costs, on a portfolio of 30 assets.
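As a small illustration of the last point, the sketch below computes a Gaussian process posterior over several forecast horizons at once, which yields the full covariance matrix between the forecast points, the quantity the spread-trading algorithm exploits. The kernel, data and horizons are purely illustrative and unrelated to the thesis' actual series.

```python
import numpy as np

def rbf(a, b, length=5.0, var=1.0):
    """Squared-exponential covariance between two sets of 1-D time points."""
    return var * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / length**2)

rng = np.random.default_rng(9)

# Observed history of a toy price spread at times 0..49.
t_obs = np.arange(50.0)
y_obs = np.sin(t_obs / 7.0) + 0.1 * rng.normal(size=t_obs.size)

# Forecast the next 10 steps jointly.
t_new = np.arange(50.0, 60.0)
noise = 0.1**2

K = rbf(t_obs, t_obs) + noise * np.eye(t_obs.size)
K_s = rbf(t_obs, t_new)
K_ss = rbf(t_new, t_new)

# Standard GP regression equations via the Cholesky factor of K.
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
mean_new = K_s.T @ alpha
v = np.linalg.solve(L, K_s)
cov_new = K_ss - v.T @ v                 # full covariance between all forecast points

print("forecast mean:", mean_new.round(3))
print("forecast std devs:", np.sqrt(np.diag(cov_new)).round(3))
print("correlation between 1-step and 10-step forecasts:",
      round(cov_new[0, -1] / np.sqrt(cov_new[0, 0] * cov_new[-1, -1]), 3))
```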