146 resultados para Sparse mixing matrix
em CentAUR: Central Archive University of Reading - UK
Resumo:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion combining local component analysis for the finite mixture model. We start with a Parzen window estimator which has the Gaussian kernels with a common covariance matrix, the local component analysis is initially applied to find the covariance matrix using expectation maximization algorithm. Since the constraint on the mixing coefficients of a finite mixture model is on the multinomial manifold, we then use the well-known Riemannian trust-region algorithm to find the set of sparse mixing coefficients. The first and second order Riemannian geometry of the multinomial manifold are utilized in the Riemannian trust-region algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with competitive accuracy to existing kernel density estimators.
Resumo:
In this paper we introduce a new algorithm, based on the successful work of Fathi and Alexandrov, on hybrid Monte Carlo algorithms for matrix inversion and solving systems of linear algebraic equations. This algorithm consists of two parts, approximate inversion by Monte Carlo and iterative refinement using a deterministic method. Here we present a parallel hybrid Monte Carlo algorithm, which uses Monte Carlo to generate an approximate inverse and that improves the accuracy of the inverse with an iterative refinement. The new algorithm is applied efficiently to sparse non-singular matrices. When we are solving a system of linear algebraic equations, Bx = b, the inverse matrix is used to compute the solution vector x = B(-1)b. We present results that show the efficiency of the parallel hybrid Monte Carlo algorithm in the case of sparse matrices.
Resumo:
Mineral dust is an important aerosol species in the Earth’s atmosphere and has a major source within North Africa, of which the Sahara forms the major part. Aerosol Time of Flight Mass Spectrometry (ATOFMS) is first used to determine the mixing state of dust particles collected from the land surface in the Saharan region, showing low abundance of species such as nitrate and sulphate internally mixed with the dust mineral matrix. These data are then compared with the ATOFMS single particle mass spectra of Saharan dust particles detected in the marine atmosphere in the vicinity of the Cape Verde islands, which are further compared with those from particles with longer atmospheric residence sampled at a coastal station at Mace Head, Ireland. Saharan dust particles collected near the Cape Verde Islands showed increased internally mixed nitrate but no sulphate, whilst Saharan dust particles collected on the coast of Ireland showed a very high degree of internally mixed secondary species including nitrate, sulphate and methanesulphonate. This uptake of secondary species will change the pH and hygroscopic properties of the aerosol dust and thus can influence the budgets of other reactive gases, as well as influencing the radiative properties of the particles and the availability of metals for dissolution.
Resumo:
The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis - the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any sub-set of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system. Copyright © 2004 Royal Meteorological Society
Resumo:
The gas phase reactions Of SiCl4 and Si2Cl6 With CH3OH and C2H5OH have been investigated using both mass spectrometry and matrix isolation techniques. SiCl4 reacts with both CH3OH and C2H5OH upon mixing of the vapours for times in excess of 3 h to generate the HCl-elimination products SiCl3OR (R = CH3 or C2H5). The identity of these products is confirmed by deuteration experiments and by ab initio calculations at the HF/6-31G(d) level. Further products are generated when the mixture is passed through a tube heated to 750degreesC. Si2Cl6 reacts with CH3OH and C2H5OH via a different mechanism in which the Si-Si bond is cleaved to yield SiCl3OR and HCl. Other products of the type SiCl4-n(OCH3)(n) are tentatively identified by a combination of mass spectrometric and matrix isolation measurements. These latter products indicate further replacement of Cl atoms by OR groups as a result of reaction of CH3OH or C2H5OH with the initial product.
Resumo:
An efficient model identification algorithm for a large class of linear-in-the-parameters models is introduced that simultaneously optimises the model approximation ability, sparsity and robustness. The derived model parameters in each forward regression step are initially estimated via the orthogonal least squares (OLS), followed by being tuned with a new gradient-descent learning algorithm based on the basis pursuit that minimises the l(1) norm of the parameter estimate vector. The model subset selection cost function includes a D-optimality design criterion that maximises the determinant of the design matrix of the subset to ensure model robustness and to enable the model selection procedure to automatically terminate at a sparse model. The proposed approach is based on the forward OLS algorithm using the modified Gram-Schmidt procedure. Both the parameter tuning procedure, based on basis pursuit, and the model selection criterion, based on the D-optimality that is effective in ensuring model robustness, are integrated with the forward regression. As a consequence the inherent computational efficiency associated with the conventional forward OLS approach is maintained in the proposed algorithm. Examples demonstrate the effectiveness of the new approach.
Resumo:
A sparse kernel density estimator is derived based on the zero-norm constraint, in which the zero-norm of the kernel weights is incorporated to enhance model sparsity. The classical Parzen window estimate is adopted as the desired response for density estimation, and an approximate function of the zero-norm is used for achieving mathemtical tractability and algorithmic efficiency. Under the mild condition of the positive definite design matrix, the kernel weights of the proposed density estimator based on the zero-norm approximation can be obtained using the multiplicative nonnegative quadratic programming algorithm. Using the -optimality based selection algorithm as the preprocessing to select a small significant subset design matrix, the proposed zero-norm based approach offers an effective means for constructing very sparse kernel density estimates with excellent generalisation performance.
Resumo:
A generalized or tunable-kernel model is proposed for probability density function estimation based on an orthogonal forward regression procedure. Each stage of the density estimation process determines a tunable kernel, namely, its center vector and diagonal covariance matrix, by minimizing a leave-one-out test criterion. The kernel mixing weights of the constructed sparse density estimate are finally updated using the multiplicative nonnegative quadratic programming algorithm to ensure the nonnegative and unity constraints, and this weight-updating process additionally has the desired ability to further reduce the model size. The proposed tunable-kernel model has advantages, in terms of model generalization capability and model sparsity, over the standard fixed-kernel model that restricts kernel centers to the training data points and employs a single common kernel variance for every kernel. On the other hand, it does not optimize all the model parameters together and thus avoids the problems of high-dimensional ill-conditioned nonlinear optimization associated with the conventional finite mixture model. Several examples are included to demonstrate the ability of the proposed novel tunable-kernel model to effectively construct a very compact density estimate accurately.
Resumo:
This paper derives an efficient algorithm for constructing sparse kernel density (SKD) estimates. The algorithm first selects a very small subset of significant kernels using an orthogonal forward regression (OFR) procedure based on the D-optimality experimental design criterion. The weights of the resulting sparse kernel model are then calculated using a modified multiplicative nonnegative quadratic programming algorithm. Unlike most of the SKD estimators, the proposed D-optimality regression approach is an unsupervised construction algorithm and it does not require an empirical desired response for the kernel selection task. The strength of the D-optimality OFR is owing to the fact that the algorithm automatically selects a small subset of the most significant kernels related to the largest eigenvalues of the kernel design matrix, which counts for the most energy of the kernel training data, and this also guarantees the most accurate kernel weight estimate. The proposed method is also computationally attractive, in comparison with many existing SKD construction algorithms. Extensive numerical investigation demonstrates the ability of this regression-based approach to efficiently construct a very sparse kernel density estimate with excellent test accuracy, and our results show that the proposed method compares favourably with other existing sparse methods, in terms of test accuracy, model sparsity and complexity, for constructing kernel density estimates.
Resumo:
We develop a new sparse kernel density estimator using a forward constrained regression framework, within which the nonnegative and summing-to-unity constraints of the mixing weights can easily be satisfied. Our main contribution is to derive a recursive algorithm to select significant kernels one at time based on the minimum integrated square error (MISE) criterion for both the selection of kernels and the estimation of mixing weights. The proposed approach is simple to implement and the associated computational cost is very low. Specifically, the complexity of our algorithm is in the order of the number of training data N, which is much lower than the order of N2 offered by the best existing sparse kernel density estimators. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with comparable accuracy to those of the classical Parzen window estimate and other existing sparse kernel density estimators.
Resumo:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion for the finite mixture model. Since the constraint on the mixing coefficients of the finite mixture model is on the multinomial manifold, we use the well-known Riemannian trust-region (RTR) algorithm for solving this problem. The first- and second-order Riemannian geometry of the multinomial manifold are derived and utilized in the RTR algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with an accuracy competitive with those of existing kernel density estimators.
Resumo:
Subspace clustering groups a set of samples from a union of several linear subspaces into clusters, so that the samples in the same cluster are drawn from the same linear subspace. In the majority of the existing work on subspace clustering, clusters are built based on feature information, while sample correlations in their original spatial structure are simply ignored. Besides, original high-dimensional feature vector contains noisy/redundant information, and the time complexity grows exponentially with the number of dimensions. To address these issues, we propose a tensor low-rank representation (TLRR) and sparse coding-based (TLRRSC) subspace clustering method by simultaneously considering feature information and spatial structures. TLRR seeks the lowest rank representation over original spatial structures along all spatial directions. Sparse coding learns a dictionary along feature spaces, so that each sample can be represented by a few atoms of the learned dictionary. The affinity matrix used for spectral clustering is built from the joint similarities in both spatial and feature spaces. TLRRSC can well capture the global structure and inherent feature information of data, and provide a robust subspace segmentation from corrupted data. Experimental results on both synthetic and real-world data sets show that TLRRSC outperforms several established state-of-the-art methods.
Resumo:
A new sparse kernel density estimator with tunable kernels is introduced within a forward constrained regression framework whereby the nonnegative and summing-to-unity constraints of the mixing weights can easily be satisfied. Based on the minimum integrated square error criterion, a recursive algorithm is developed to select significant kernels one at time, and the kernel width of the selected kernel is then tuned using the gradient descent algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing very sparse kernel density estimators with competitive accuracy to existing kernel density estimators.
Resumo:
In this paper, the available potential energy (APE) framework of Winters et al. (J. Fluid Mech., vol. 289, 1995, p. 115) is extended to the fully compressible Navier– Stokes equations, with the aims of clarifying (i) the nature of the energy conversions taking place in turbulent thermally stratified fluids; and (ii) the role of surface buoyancy fluxes in the Munk & Wunsch (Deep-Sea Res., vol. 45, 1998, p. 1977) constraint on the mechanical energy sources of stirring required to maintain diapycnal mixing in the oceans. The new framework reveals that the observed turbulent rate of increase in the background gravitational potential energy GPEr , commonly thought to occur at the expense of the diffusively dissipated APE, actually occurs at the expense of internal energy, as in the laminar case. The APE dissipated by molecular diffusion, on the other hand, is found to be converted into internal energy (IE), similar to the viscously dissipated kinetic energy KE. Turbulent stirring, therefore, does not introduce a new APE/GPEr mechanical-to-mechanical energy conversion, but simply enhances the existing IE/GPEr conversion rate, in addition to enhancing the viscous dissipation and the entropy production rates. This, in turn, implies that molecular diffusion contributes to the dissipation of the available mechanical energy ME =APE +KE, along with viscous dissipation. This result has important implications for the interpretation of the concepts of mixing efficiency γmixing and flux Richardson number Rf , for which new physically based definitions are proposed and contrasted with previous definitions. The new framework allows for a more rigorous and general re-derivation from the first principles of Munk & Wunsch (1998, hereafter MW98)’s constraint, also valid for a non-Boussinesq ocean: G(KE) ≈ 1 − ξ Rf ξ Rf Wr, forcing = 1 + (1 − ξ )γmixing ξ γmixing Wr, forcing , where G(KE) is the work rate done by the mechanical forcing, Wr, forcing is the rate of loss of GPEr due to high-latitude cooling and ξ is a nonlinearity parameter such that ξ =1 for a linear equation of state (as considered by MW98), but ξ <1 otherwise. The most important result is that G(APE), the work rate done by the surface buoyancy fluxes, must be numerically as large as Wr, forcing and, therefore, as important as the mechanical forcing in stirring and driving the oceans. As a consequence, the overall mixing efficiency of the oceans is likely to be larger than the value γmixing =0.2 presently used, thereby possibly eliminating the apparent shortfall in mechanical stirring energy that results from using γmixing =0.2 in the above formula.
Resumo:
There exist two central measures of turbulent mixing in turbulent stratified fluids that are both caused by molecular diffusion: 1) the dissipation rate D(APE) of available potential energy APE; 2) the turbulent rate of change Wr, turbulent of background gravitational potential energy GPEr. So far, these two quantities have often been regarded as the same energy conversion, namely the irreversible conversion of APE into GPEr, owing to the well known exact equality D(APE)=Wr, turbulent for a Boussinesq fluid with a linear equation of state. Recently, however, Tailleux (2009) pointed out that the above equality no longer holds for a thermally-stratified compressible, with the ratio ξ=Wr, turbulent/D(APE) being generally lower than unity and sometimes even negative for water or seawater, and argued that D(APE) and Wr, turbulent actually represent two distinct types of energy conversion, respectively the dissipation of APE into one particular subcomponent of internal energy called the "dead" internal energy IE0, and the conversion between GPEr and a different subcomponent of internal energy called "exergy" IEexergy. In this paper, the behaviour of the ratio ξ is examined for different stratifications having all the same buoyancy frequency N vertical profile, but different vertical profiles of the parameter Υ=α P/(ρCp), where α is the thermal expansion coefficient, P the hydrostatic pressure, ρ the density, and Cp the specific heat capacity at constant pressure, the equation of state being that for seawater for different particular constant values of salinity. It is found that ξ and Wr, turbulent depend critically on the sign and magnitude of dΥ/dz, in contrast with D(APE), which appears largely unaffected by the latter. These results have important consequences for how the mixing efficiency should be defined and measured in practice, which are discussed.