10 results for orthonormal basis functions (OBF)
at Universitat de Girona, Spain
Abstract:
The simplex, the sample space of compositional data, can be structured as a real Euclidean space. This fact allows us to work with the coefficients with respect to an orthonormal basis. To these coefficients we apply standard real analysis; in particular, we define two different probability laws through their density functions and study their main properties.
Abstract:
The use of orthonormal coordinates in the simplex and, particularly, balance coordinates, has suggested the use of a dendrogram for the exploratory analysis of compositional data. The dendrogram is based on a sequential binary partition of a compositional vector into groups of parts. At each step of a partition, one group of parts is divided into two new groups, and a balancing axis in the simplex between both groups is defined. The set of balancing axes constitutes an orthonormal basis, and the projections of the sample on them are orthogonal coordinates. They can be represented in a dendrogram-like graph showing: (a) the way of grouping parts of the compositional vector; (b) the explanatory role of each subcomposition generated in the partition process; (c) the decomposition of the total variance into balance components associated with each binary partition; (d) a box-plot of each balance. This representation helps to interpret balance coordinates, to identify the most explanatory coordinates, and to describe the whole sample in a single diagram regardless of the number of parts in the sample.
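As an illustration of how balances arise from a sequential binary partition, here is a minimal Python sketch. It assumes the usual balance formula, sqrt(rs/(r+s)) times the log-ratio of the geometric means of the two groups, where r and s are the group sizes; the composition `x` and the partition `sbp` are hypothetical and not taken from the abstract.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def balance(x, plus, minus):
    """Balance between the parts indexed by `plus` and those indexed by `minus`."""
    r, s = len(plus), len(minus)
    g_plus = geometric_mean([x[i] for i in plus])
    g_minus = geometric_mean([x[i] for i in minus])
    return math.sqrt(r * s / (r + s)) * math.log(g_plus / g_minus)

def balances(x, partition):
    """One balance coordinate per step of the sequential binary partition."""
    return [balance(x, plus, minus) for plus, minus in partition]

def clr(x):
    """Centered log-ratio transform, used here only to check the isometry."""
    g = geometric_mean(x)
    return [math.log(v / g) for v in x]

# Hypothetical 4-part composition and one possible sequential binary partition.
x = [0.1, 0.2, 0.3, 0.4]
sbp = [
    ([0, 1], [2, 3]),  # step 1: parts {1, 2} against parts {3, 4}
    ([0], [1]),        # step 2: part 1 against part 2
    ([2], [3]),        # step 3: part 3 against part 4
]
coords = balances(x, sbp)
```

Because the balancing axes form an orthonormal basis, the squared norm of `coords` should match the squared norm of `clr(x)`, which is a quick sanity check on any partition.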
Abstract:
A joint distribution of two discrete random variables with finite support can be displayed as a two-way table of probabilities adding to one. Assume that this table has n rows and m columns and that all probabilities are non-null. This kind of table can be seen as an element of the simplex of n · m parts. In this context, the marginals are identified as compositional amalgams and the conditionals (rows or columns) as subcompositions. Also, simplicial perturbation appears as Bayes' theorem. However, the Euclidean elements of the Aitchison geometry of the simplex can also be translated into the table of probabilities: subspaces, orthogonal projections, distances. Two important questions are addressed: a) given a table of probabilities, which is the nearest independent table to the initial one? b) which is the largest orthogonal projection of a row onto a column? or, equivalently, which is the information in a row explained by a column, thus explaining the interaction? To answer these questions three orthogonal decompositions are presented: (1) by columns and a row-wise geometric marginal, (2) by rows and a column-wise geometric marginal, (3) by independent two-way tables and fully dependent tables representing row-column interaction. An important result is that the nearest independent table is the product of the two (row and column)-wise geometric marginal tables. A corollary is that, in an independent table, the geometric marginals conform with the traditional (arithmetic) marginals. These decompositions can be compared with standard log-linear models. Key words: balance, compositional data, simplex, Aitchison geometry, composition, orthonormal basis, arithmetic and geometric marginals, amalgam, dependence measure, contingency table
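The stated result about the nearest independent table can be sketched numerically. The following Python sketch assumes that a geometric marginal is the closed (renormalized) vector of geometric means taken along rows or along columns; the function names and the example tables are hypothetical, and the abstract does not fix which of the two is called "row-wise".

```python
import math

def closure(values):
    """Rescale positive values so that they add to one."""
    total = sum(values)
    return [v / total for v in values]

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def geometric_marginals(table):
    """Closed geometric means of each row and of each column."""
    row_profile = closure([geometric_mean(row) for row in table])
    col_profile = closure([geometric_mean(col) for col in zip(*table)])
    return row_profile, col_profile

def arithmetic_marginals(table):
    """Traditional marginals: row sums and column sums."""
    return [sum(row) for row in table], [sum(col) for col in zip(*table)]

def independent_table(table):
    """Product of the two geometric marginals (an independent table)."""
    row_profile, col_profile = geometric_marginals(table)
    return [[r * c for c in col_profile] for r in row_profile]

# Hypothetical 2 x 3 tables: one dependent, one independent by construction.
dependent = [[0.10, 0.20, 0.10],
             [0.30, 0.05, 0.25]]
independent = [[r * c for c in (0.5, 0.3, 0.2)] for r in (0.2, 0.8)]

nearest = independent_table(dependent)
```

Applying `independent_table` to a table that is already independent should return it unchanged, and its geometric and arithmetic marginals should coincide, in line with the corollary in the abstract.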
Abstract:
Factor analysis, a frequently used technique for multivariate data inspection, is also widely used for compositional data analysis. The usual way is to use a centered logratio (clr) transformation to obtain the random vector y of dimension D. The factor model is then y = Λf + e (1), with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Using the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as Cov(y) = ΛΛᵀ + ψ (2), where ψ = Cov(e) has a diagonal form. The diagonal elements of ψ as well as the loadings matrix Λ are estimated from an estimate of Cov(y), with the observed clr-transformed data Y taken as realizations of the random vector y. Outliers or deviations from the idealized model assumptions of factor analysis can severely affect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y will lead to robust estimates of Λ and ψ in (2); see Pison et al. (2003). Well-known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g., Maronna et al., 2006), rely on a full-rank data matrix Y, which is not the case for clr-transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr-transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by C(Y) = V C(Z) Vᵀ, where the matrix V with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in model (2) can be estimated (Basilevsky, 1994), and the results have a direct interpretation since the links to the original variables are still preserved. The above procedure will be applied to data from geochemistry. We are especially interested in comparing the results with those of Reimann et al. (2002) for the Kola project data.
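The back-transformation of the covariance matrix follows from a short argument, sketched here under the assumption that V is the D × (D−1) matrix whose columns are the clr images of the chosen orthonormal basis. The symbol z for the ilr coordinates and the identity for V Vᵀ are not spelled out in the abstract.

```latex
\begin{aligned}
z &= V^{T} y, \qquad V^{T} V = I_{D-1}, \qquad
     V V^{T} = I_{D} - \tfrac{1}{D}\,\mathbf{1}\mathbf{1}^{T},\\
y &= V z \quad \text{(since } \mathbf{1}^{T} y = 0 \text{ for clr-transformed data)},\\
\operatorname{Cov}(y) &= V \operatorname{Cov}(z)\, V^{T}.
\end{aligned}
```

The same linear map applies to any estimate of Cov(z), robust or not, which is why a robust C(Z) yields C(Y) = V C(Z) Vᵀ.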
Abstract:
The present work provides a generalization of Mayer's energy decomposition for the density-functional theory (DFT) case. It is shown that one- and two-atom Hartree-Fock energy components in Mayer's approach can be represented as an action of a one-atom potential V_A on a one-atom density ρ_A or ρ_B. To treat the exchange-correlation term in the DFT energy expression in a similar way, the exchange-correlation energy density per electron is expanded into a linear combination of basis functions. Calculations carried out for a number of density functionals demonstrate that the DFT and Hartree-Fock two-atom energies agree to a reasonable extent with each other. The two-atom energies for strong covalent bonds are within the range of typical bond dissociation energies and are therefore a convenient computational tool for assessing individual bond strength in polyatomic molecules. For nonspecific nonbonding interactions, the two-atom energies are low. They can be either repulsive or slightly attractive, but the DFT results more frequently yield small attractive values compared to the Hartree-Fock case. The hydrogen bond in the water dimer is calculated to lie between the strong covalent and nonbonding interactions on the energy scale.
Abstract:
This thesis studies how to estimate the distribution of regionalized variables whose sample space and scale admit a Euclidean space structure. We apply the principle of working on coordinates: we choose an orthonormal basis, do statistics on the coordinates of the data, and apply the output to the basis in order to recover a result in the original space. Applying this to regionalized variables, we obtain a single consistent approach that generalizes the well-known properties of kriging techniques to several sample spaces: real, positive, or compositional data (vectors of positive components with constant sum) are treated as particular cases. In this way, linear geostatistics is generalized, and solutions are offered to well-known problems of nonlinear geostatistics by adapting the measure and the representativeness criteria (i.e., means) to the data being treated. The estimator for positive data coincides with a weighted geometric mean, equivalent to estimating the median, without any of the problems of classical lognormal kriging. The compositional case offers equivalent solutions and, in addition, allows multinomial probability vectors to be estimated. With a preliminary Bayesian approach, kriging of compositions also becomes a consistent alternative to indicator kriging. Indicator kriging is used to estimate probability functions of arbitrary variables, although it often yields negative estimates, which the proposed alternative avoids. The usefulness of this set of techniques is tested by studying ammonia pollution at an automatic water-quality control station in the Tordera basin, and it is concluded that only with the proposed techniques can one detect the instants at which ammonium is transformed into ammonia at a concentration above the legally permitted level.
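The statement that the estimator for positive data coincides with a weighted geometric mean can be illustrated with a small Python sketch of the "work on coordinates" principle. It assumes the logarithm as the coordinate of a positive value and takes the weights as given; the weights below are hypothetical and are not computed from any kriging system.

```python
import math

def weighted_geometric_mean(values, weights):
    """Average the log coordinates with the given weights, then map back."""
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to add to one")
    log_estimate = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_estimate)

# Hypothetical observations and weights, for illustration only.
observations = [1.0, 100.0]
weights = [0.5, 0.5]
estimate = weighted_geometric_mean(observations, weights)
```

The estimate is always positive because the averaging happens on the log scale, unlike an arithmetic combination with weights of mixed sign.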
Abstract:
Compositional data analysis motivated the introduction of a complete Euclidean structure in the simplex of D parts. This was based on the early work of J. Aitchison (1986) and completed recently when the Aitchison distance in the simplex was associated with an inner product and orthonormal bases were identified (Aitchison and others, 2002; Egozcue and others, 2003). A partition of the support of a random variable generates a composition by assigning the probability of each interval to a part of the composition. One can imagine that the partition can be refined and that the probability density would represent a kind of continuous composition of probabilities in a simplex of infinitely many parts. This intuitive idea would lead to a Hilbert space of probability densities by generalizing the Aitchison geometry for compositions in the simplex to the set of probability densities.
Abstract:
A comparison of the local effects of the basis set superposition error (BSSE) on the electron densities and energy components of three representative H-bonded complexes was carried out. The electron densities were obtained with Hartree-Fock and density functional theory versions of the chemical Hamiltonian approach (CHA) methodology. It was shown that the effects of the BSSE were common to all complexes studied. The electron density difference maps and the chemical energy component analysis (CECA) confirmed that the local effects of the BSSE were different when diffuse functions were present in the calculations.
Abstract:
Geometries, vibrational frequencies, and interaction energies of the CNH⋯O3 and HCCH⋯O3 complexes are calculated on a counterpoise-corrected (CP-corrected) potential-energy surface (PES) that corrects for the basis set superposition error (BSSE). Ab initio calculations are performed at the Hartree-Fock (HF) and second-order Møller-Plesset (MP2) levels, using the 6-31G(d,p) and D95++(d,p) basis sets. Interaction energies are presented including corrections for zero-point vibrational energy (ZPVE) and thermal correction to enthalpy at 298 K. The CP-corrected and conventional PES are compared; the uncorrected PES obtained using the larger basis set including diffuse functions exhibits a double-well shape, whereas use of the 6-31G(d,p) basis set leads to a flat single-well profile. The CP-corrected PES always has a multiple-well shape. In particular, it is shown that the CP-corrected PES using the smaller basis set is qualitatively analogous to that obtained with the larger basis sets, so the CP method becomes useful to correctly describe large systems, where the use of small basis sets may be necessary.
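For readers unfamiliar with the correction, the following is a sketch of the usual counterpoise scheme for a dimer AB, assuming the standard Boys-Bernardi form; the abstract does not state the exact expressions used. Superscripts denote the basis set (α for A, β for B, α∪β for the full dimer basis), and all fragment energies in the BSSE term are evaluated at the fragment geometries within the complex.

```latex
\begin{aligned}
\delta^{\mathrm{BSSE}} &= \bigl[E_{A}^{\alpha}(A) - E_{A}^{\alpha\cup\beta}(A)\bigr]
                        + \bigl[E_{B}^{\beta}(B) - E_{B}^{\alpha\cup\beta}(B)\bigr],\\
E^{\mathrm{CP}}(AB) &= E_{AB}^{\alpha\cup\beta}(AB) + \delta^{\mathrm{BSSE}}.
\end{aligned}
```

Optimizing geometries and computing frequencies on E^CP(AB) rather than on the uncorrected energy is what is meant by working on a CP-corrected PES.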
Abstract:
To obtain a state-of-the-art benchmark potential energy surface (PES) for the archetypal oxidative addition of the methane C-H bond to the palladium atom, we have explored this PES using a hierarchical series of ab initio methods (Hartree-Fock, second-order Møller-Plesset perturbation theory, fourth-order Møller-Plesset perturbation theory with single, double and quadruple excitations, coupled cluster theory with single and double excitations (CCSD), and with triple excitations treated perturbatively [CCSD(T)]) and hybrid density functional theory using the B3LYP functional, in combination with a hierarchical series of ten Gaussian-type basis sets, up to g polarization. Relativistic effects are taken into account either through a relativistic effective core potential for palladium or through a full four-component all-electron approach. Counterpoise-corrected relative energies of stationary points are converged to within 0.1-0.2 kcal/mol as a function of the basis-set size. Our best estimate of kinetic and thermodynamic parameters is -8.1 (-8.3) kcal/mol for the formation of the reactant complex, 5.8 (3.1) kcal/mol for the activation energy relative to the separate reactants, and 0.8 (-1.2) kcal/mol for the reaction energy (zero-point vibrational energy-corrected values in parentheses). This agrees well with available experimental data. Our work highlights the importance of sufficient higher angular momentum polarization functions, f and g, for correctly describing metal-d-electron correlation and, thus, for obtaining reliable relative energies. We show that standard basis sets, such as LANL2DZ+1f for palladium, are not sufficiently polarized for this purpose and lead to erroneous CCSD(T) results. B3LYP is associated with smaller basis set superposition errors and shows faster convergence with basis-set size but yields relative energies (in particular, a reaction barrier) that are ca. 3.5 kcal/mol higher than the corresponding CCSD(T) values.