986 results for Approximate-Iterative Method
Abstract:
Our consumption of groundwater, in particular as drinking water or for irrigation, has increased considerably over the years. Many problems have arisen as a result, ranging from prospecting for new resources to the remediation of polluted aquifers. Regardless of the hydrogeological problem considered, the main challenge remains the characterization of subsurface properties. A stochastic approach is therefore necessary to represent this uncertainty by considering multiple geological scenarios and generating a large number of geostatistical realizations. This exposes the main limitation of such approaches: the computational cost of simulating complex flow processes for each realization. In the first part of the thesis, this problem is investigated in the context of uncertainty propagation, where an ensemble of realizations is identified as representing the subsurface properties. To propagate this uncertainty to the quantity of interest while limiting the computational cost, current methods rely on approximate flow models. These allow the identification of a subset of realizations that represents the variability of the initial ensemble. The complex flow model is then evaluated only for this subset, and inference is made on the basis of these complex responses. Our objective is to improve the performance of this approach by using all the available information. To this end, the subset of approximate and exact responses is used to build an error model, which then serves to correct the remaining approximate responses and predict the response of the complex model. This method maximizes the use of the available information without a perceptible increase in computation time. Uncertainty propagation is thus more accurate and more robust.
The strategy explored in the first chapter consists in learning, from a subset of realizations, the relationship between the approximate and complex flow models. In the second part of the thesis, this methodology is formalized mathematically by introducing a regression model between functional responses. As this problem is ill-posed, its dimensionality must be reduced. With this in mind, the innovation of the work lies in the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also makes it possible to diagnose the quality of the error model in this functional space. The proposed methodology is applied to a problem of pollution by a non-aqueous phase liquid, and the results show that the error model allows a strong reduction in computation time while correctly estimating the uncertainty. In addition, for each approximate response, the error model provides a prediction of the complex response. The concept of a functional error model is therefore relevant for uncertainty propagation, but also for Bayesian inference problems. Markov chain Monte Carlo (MCMC) methods are the algorithms most commonly used to generate geostatistical realizations consistent with the observations. However, these methods suffer from a very low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. A two-step approach, "two-stage MCMC", was introduced to avoid unnecessary simulations of the complex model through a preliminary evaluation of the realization. In the third part of the thesis, the approximate flow model coupled with an error model serves as the preliminary evaluation for two-stage MCMC.
We demonstrate an increase in the acceptance rate by a factor of 1.5 to 3 compared with a classical MCMC implementation. One question remains unanswered: how to choose the size of the training set and how to identify the realizations that optimize the construction of the error model. This requires an iterative strategy so that, with each new flow simulation, the error model is improved by incorporating the new information. This is developed in the fourth part of the thesis, where the methodology is applied to a problem of saline intrusion in a coastal aquifer. -- Our consumption of groundwater, in particular as drinking water and for irrigation, has considerably increased over the years, and groundwater is becoming an increasingly scarce and endangered resource. Nowadays, we are facing many problems ranging from water prospection to sustainable management and remediation of polluted aquifers. Independently of the hydrogeological problem, the main challenge remains dealing with the incomplete knowledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of realizations. The main limitation of this approach is the computational cost associated with performing complex flow simulations for each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knowledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the flow response of each realization.
Due to computational constraints, state-of-the-art methods make use of approximate flow simulation to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy flow model is then run for this subset, and inference is made on the basis of its results. Our objective is to increase the performance of this approach by using all of the available information and not solely the subset of exact responses. Two error models are proposed to correct the approximate responses following a machine learning approach. For the subset identified by a classical approach (here the distance kernel method), both the approximate and the exact responses are known. This information is used to construct an error model and correct the ensemble of approximate responses to predict the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational costs and leads to an increase in accuracy and robustness of the uncertainty propagation. The strategy explored in the first chapter consists in learning from a subset of realizations the relationship between proxy and exact curves. In the second part of this thesis, the strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allows a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a problem of pollution by a non-aqueous phase liquid. The error model allows a strong reduction of the computational cost while providing a good estimate of the uncertainty.
The individual correction of the proxy response by the error model leads to an excellent prediction of the exact response, opening the door to many applications. The concept of a functional error model is useful not only in the context of uncertainty propagation, but also, and maybe even more so, to perform Bayesian inference. Markov chain Monte Carlo (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. However, this approach suffers from a low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. This led to the introduction of two-stage MCMC, where the computational cost is decreased by avoiding unnecessary simulations of the exact flow model thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy is coupled to an error model to provide an approximate response for the two-stage MCMC set-up. We demonstrate an increase in acceptance rate by a factor of three with respect to one-stage MCMC results. An open question remains: how do we choose the size of the learning set and identify the realizations that optimize the construction of the error model? This requires devising an iterative strategy to construct the error model, such that, as new flow simulations are performed, the error model is iteratively improved by incorporating the new information. This is discussed in the fourth part of the thesis, in which we apply this methodology to a problem of saline intrusion in a coastal aquifer.
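The two-stage MCMC idea described above can be illustrated with a minimal sketch. It assumes a symmetric random-walk proposal and a one-dimensional toy target, which is not the thesis set-up. The names `two_stage_mcmc`, `log_exact` and `log_approx` are hypothetical: `log_exact` stands in for the expensive flow model and `log_approx` for the proxy corrected by the error model. A proposal reaches the expensive evaluation only if it passes a first screening on the approximate density. The second-stage ratio compensates for the screening so that the chain still targets the exact density.

```python
import math
import random


def two_stage_mcmc(log_exact, log_approx, x0, n_steps, step=1.0, seed=0):
    """Two-stage Metropolis sampler with a symmetric Gaussian proposal.

    log_exact  : expensive log-density (hypothetical stand-in for the exact model)
    log_approx : cheap log-density (hypothetical stand-in for proxy + error model)
    Returns the chain and the number of exact evaluations performed.
    """
    rng = random.Random(seed)
    x = x0
    lx_exact, lx_approx = log_exact(x), log_approx(x)
    samples, exact_calls = [], 1
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        ly_approx = log_approx(y)
        # Stage 1: screen the proposal with the cheap model only.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < ly_approx - lx_approx:
            # Stage 2: run the expensive model and correct for the screening.
            ly_exact = log_exact(y)
            exact_calls += 1
            log_alpha2 = (ly_exact - lx_exact) - (ly_approx - lx_approx)
            if math.log(1.0 - rng.random()) < log_alpha2:
                x, lx_exact, lx_approx = y, ly_exact, ly_approx
        samples.append(x)
    return samples, exact_calls


# Toy example (assumed): exact target N(1, 1), deliberately biased proxy N(1.2, 1.5^2).
def log_exact(x):
    return -0.5 * (x - 1.0) ** 2


def log_approx(x):
    return -0.5 * ((x - 1.2) / 1.5) ** 2


samples, exact_calls = two_stage_mcmc(log_exact, log_approx, x0=0.0, n_steps=20000)
chain_mean = sum(samples) / len(samples)
print(chain_mean, exact_calls)
```

In this toy case the proxy is biased, yet the chain mean should stay close to the exact mean, and the exact model is evaluated for only part of the proposals.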
Abstract:
Very large molecular systems can be calculated with the so-called CNDOL approximate Hamiltonians, which have been developed by avoiding oversimplifications and using only a priori parameters and formulas from the simpler NDO methods. A new diagonal monoelectronic term named CNDOL/21 shows great consistency and easier SCF convergence when used together with an appropriate function for charge repulsion energies that is derived from traditional formulas. A priori molecular orbitals and electron excitation properties can be obtained reliably after configuration interaction of singly excited determinants, and the results remain interpretable even though the Hamiltonian is simplified. Tests with some unequivocal gas-phase maxima of simple molecules (benzene, furfural, acetaldehyde, hexyl alcohol, methyl amine, 2,5-dimethyl-2,4-hexadiene, and ethyl sulfide) confirm the general quality of this approach in comparison with other methods. Calculations of large systems, such as porphine in the gas phase and a model of the complete retinal binding pocket in rhodopsin with 622 basis functions on 280 atoms at the quantum mechanical level, prove reliable and give a first allowed transition at 483 nm, very close to the known experimental value of 500 nm for the "dark state." In this very important case, our model gives a central role in this excitation to a charge transfer from the neighboring Glu(-) counterion to the retinaldehyde polyene chain. Tests with gas-phase maxima of some important molecules corroborate the reliability of CNDOL/2 Hamiltonians.
Abstract:
This Master’s Thesis examines knowledge creation and transfer processes in an iterative project environment. The first aim is to understand how knowledge is created and transferred during an actual iterative implementation project at International Business Machines (IBM). The second aim is to create and develop new working methods that support more effective knowledge creation and transfer in future iterative implementation projects. The research methodology in this thesis is qualitative. Focus group interviews provide qualitative information and capture the experiences of the individuals participating in the project. This study found that the following factors affect knowledge creation and transfer in an iterative, multinational, and multi-organizational implementation project: shared vision and common goal, trust, open communication, social capital, and network density. All of these received both theoretical and empirical support. For future projects, strengthening these factors was found to be the key to more effective knowledge creation and transfer.
Abstract:
The aim of this paper is to investigate the error that results when the method of approximate approximations is applied to functions defined only on compact intervals. This method, which is based on an approximate partition of unity, was introduced by V. Maz'ya in 1991 and has so far mainly been used for functions defined on the whole space. For the treatment of differential equations and boundary integral equations, however, an efficient approximation procedure on compact intervals is needed. In the present paper we apply the method of approximate approximations to functions defined on compact intervals. In contrast to the whole-space case, a truncation error has to be controlled in addition. For the resulting total error, pointwise estimates and L1-estimates are given, with all constants determined explicitly.
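To make the role of the truncation error concrete, here is a small sketch of a one-dimensional quasi-interpolant with a Gaussian generating function, a standard example in the approximate approximations literature. The abstract does not say which generating function the paper uses, so this choice is an assumption. The function name `quasi_interpolant`, the parameter `D` and the node range `[a, b]` are also assumptions made for illustration. With nodes far beyond the evaluation point the error behaves like second order in `h`. Restricting the nodes to a compact interval leaves a large error at the boundary, which is the truncation effect the abstract refers to.

```python
import math


def quasi_interpolant(u, x, h, D=2.0, a=-10.0, b=10.0):
    """Gaussian quasi-interpolant of u at x, using only the nodes m*h in [a, b].

    M u(x) = (pi * D)^(-1/2) * sum_m u(m h) * exp(-(x - m h)^2 / (D h^2))
    """
    m_lo, m_hi = math.ceil(a / h), math.floor(b / h)
    scale = 1.0 / math.sqrt(math.pi * D)
    total = 0.0
    for m in range(m_lo, m_hi + 1):
        node = m * h
        total += u(node) * math.exp(-((x - node) ** 2) / (D * h * h))
    return scale * total


# Interior point, nodes reaching far beyond x: the error shrinks roughly like h^2.
err_coarse = abs(quasi_interpolant(math.cos, 0.3, 0.1) - math.cos(0.3))
err_fine = abs(quasi_interpolant(math.cos, 0.3, 0.05) - math.cos(0.3))

# Nodes restricted to the compact interval [0, 1], evaluated at the end point x = 0:
# about half of the Gaussian mass is missing, so the error is large.
err_edge = abs(quasi_interpolant(math.cos, 0.0, 0.1, a=0.0, b=1.0) - math.cos(0.0))

print(err_coarse, err_fine, err_edge)
```

Halving `h` should reduce the interior error by roughly a factor of four, while the boundary error of the truncated sum does not become small without an additional correction.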
Abstract:
The method of approximate approximations is based on generating functions that represent only an approximate partition of unity. In the present paper this method is used for the numerical solution of the Poisson equation and the Stokes system in R^n (n = 2, 3). In these cases the corresponding approximate volume potentials are computed explicitly and involve only a one-dimensional integral. Numerical simulations show the efficiency of the method and confirm the expected convergence of essentially second order, depending on the smoothness of the data.
Abstract:
The aim of this paper is the numerical treatment of a boundary value problem for the system of Stokes' equations. For this we extend the method of approximate approximations to boundary value problems. This method was introduced by V. Maz'ya in 1991 and has been used until now for the approximation of smooth functions defined on the whole space and for the approximation of volume potentials. In the present paper we develop an approximation procedure for the solution of the interior Dirichlet problem for the system of Stokes' equations in two dimensions. The procedure is based on potential theoretical considerations in connection with a boundary integral equations method and consists of three approximation steps as follows. In a first step the unknown source density in the potential representation of the solution is replaced by approximate approximations. In a second step the decay behavior of the generating functions is used to gain a suitable approximation for the potential kernel, and in a third step Nyström's method leads to a linear algebraic system for the approximate source density. For every step a convergence analysis is established and corresponding error estimates are given.
Abstract:
The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, their relationship with SRM, and their geometrical insight are discussed in this paper. Training an SVM is equivalent to solving a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds a few thousand, the problem is very challenging: the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVMs over large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterative values and to establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm, which uses a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVMs, we present preliminary results obtained by applying SVMs to the problem of detecting frontal human faces in real images.
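For reference, the quadratic program mentioned above is usually written in its standard soft-margin dual form, shown here as a sketch rather than as the paper's own notation. The symbols are assumptions: $\alpha_i$ are the multipliers (one per data point), $y_i \in \{-1, +1\}$ the labels, $K$ the kernel, $C$ the box bound and $\ell$ the number of data points.

```latex
\begin{aligned}
\max_{\alpha \in \mathbb{R}^{\ell}} \quad
  & \sum_{i=1}^{\ell} \alpha_i
    - \frac{1}{2} \sum_{i=1}^{\ell} \sum_{j=1}^{\ell}
      \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j) \\
\text{subject to} \quad
  & \sum_{i=1}^{\ell} \alpha_i y_i = 0
    && \text{(linear constraint)} \\
  & 0 \le \alpha_i \le C, \quad i = 1, \dots, \ell
    && \text{(box constraints)}
\end{aligned}
```

The matrix with entries $y_i y_j K(x_i, x_j)$ has $\ell^2$ entries and is generally dense, which is why storage grows with the square of the number of data points and why a decomposition into smaller sub-problems becomes attractive.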
Abstract:
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, computing the distance (e.g., Euclidean distance) between two points accounts for a dominant portion of the overall query response time for in-memory processing. To reduce the number of distance computations, we first propose a structure (BID) that uses BIt-Difference to answer approximate KNN queries. The BID employs one bit to represent each feature vector of a point, and the number of bit differences is used to prune the more distant points. To handle real datasets, which are typically skewed, we enhance the BID mechanism with clustering, a cluster-adapted bitcoder and dimensional weights, and name the result BID⁺. Extensive experiments show that our proposed method yields significant performance advantages over the existing index structures on both real-life and synthetic high-dimensional datasets.
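The pruning idea can be sketched as follows. This is not the BID structure itself, whose encoding the abstract does not specify. The sketch assumes one bit per dimension, set when a coordinate exceeds the per-dimension mean, and the names `encode` and `approx_knn` are hypothetical. Candidates are ranked by the number of differing bits, and exact Euclidean distances are computed only for the best `n_candidates`.

```python
def encode(point, thresholds):
    """Pack one bit per dimension: 1 if the coordinate exceeds its threshold (assumed rule)."""
    code = 0
    for value, t in zip(point, thresholds):
        code = (code << 1) | (1 if value > t else 0)
    return code


def approx_knn(query, points, k, n_candidates):
    """Approximate KNN: prune by bit difference, then refine with exact distances."""
    dims = len(query)
    # Per-dimension mean used as the bit threshold (an assumption of this sketch).
    thresholds = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    codes = [encode(p, thresholds) for p in points]
    q_code = encode(query, thresholds)

    # Cheap filter: number of differing bits between each code and the query code.
    by_bits = sorted(range(len(points)),
                     key=lambda i: bin(codes[i] ^ q_code).count("1"))
    candidates = by_bits[:n_candidates]

    # Expensive step, restricted to the surviving candidates.
    candidates.sort(key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(points[i], query)))
    return candidates[:k]


points = [(0.0, 0.0), (0.1, 0.1), (10.0, 10.0), (9.9, 10.2), (0.0, 10.0)]
neighbors = approx_knn((9.8, 10.1), points, k=2, n_candidates=3)
print(neighbors)  # indices into `points`
```

The result is approximate because a true neighbor whose code differs in many bits can be pruned before its exact distance is ever computed. A larger `n_candidates` trades speed for accuracy.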
Abstract:
We study the preconditioning of symmetric indefinite linear systems of equations that arise in interior point solution of linear optimization problems. The preconditioning method that we study exploits the block structure of the augmented matrix to design a similar block structure preconditioner to improve the spectral properties of the resulting preconditioned matrix so as to improve the convergence rate of the iterative solution of the system. We also propose a two-phase algorithm that takes advantage of the spectral properties of the transformed matrix to solve for the Newton directions in the interior-point method. Numerical experiments have been performed on some LP test problems in the NETLIB suite to demonstrate the potential of the preconditioning method discussed.
Abstract:
A select-divide-and-conquer variational method to approximate configuration interaction (CI) is presented. Given an orthonormal set made up of occupied orbitals (Hartree-Fock or similar) and suitable correlation orbitals (natural or localized orbitals), a large N-electron target space S is split into subspaces S0, S1, S2, ..., SR. S0, of dimension d0, contains all configurations K with attributes (energy contributions, etc.) above thresholds T0 = {T0^egy, T0^etc.}; the CI coefficients in S0 always remain free to vary. S1 accommodates K's with attributes above T1 ≤ T0. An eigenproblem of dimension d0+d1 for S0+S1 is solved first, after which the last d1 rows and columns are contracted into a single row and column, thus freezing the last d1 CI coefficients hereinafter. The process is repeated with successive Sj (j ≥ 2) chosen so that the corresponding CI matrices fit in random access memory (RAM). Davidson's eigensolver is used R times. The final energy eigenvalue (lowest or excited one) is always above the corresponding exact eigenvalue in S. Threshold values {Tj; j = 0, 1, 2, ..., R} regulate accuracy; for large-dimensional S, high accuracy requires S0+S1 to be solved outside RAM. From there on, however, usually only a few Davidson iterations in RAM are needed for each step, so that Hamiltonian matrix-element evaluation becomes rate determining. One μhartree accuracy is achieved for an eigenproblem of order 24 × 10^6, involving 1.2 × 10^12 nonzero matrix elements and 8.4 × 10^9 Slater determinants.
Abstract:
We present a method for analyzing the curvature (second derivatives) of the conical intersection hyperline at an optimized critical point. Our method uses the projected Hessians of the degenerate states after elimination of the two branching space coordinates, and is equivalent to a frequency calculation on a single Born-Oppenheimer potential-energy surface. Based on the projected Hessians, we develop an equation for the energy as a function of a set of curvilinear coordinates where the degeneracy is preserved to second order (i.e., the conical intersection hyperline). The curvature of the potential-energy surface in these coordinates is the curvature of the conical intersection hyperline itself, and thus determines whether one has a minimum or saddle point on the hyperline. The equation used to classify optimized conical intersection points depends in a simple way on the first- and second-order degeneracy splittings calculated at these points. As an example, for fulvene, we show that the two optimized conical intersection points of C2v symmetry are saddle points on the intersection hyperline. Accordingly, there are further intersection points of lower energy, and one of C2 symmetry - presented here for the first time - is found to be the global minimum in the intersection space
Abstract:
We consider the application of the conjugate gradient method to the solution of large, symmetric indefinite linear systems. Special emphasis is put on the use of constraint preconditioners and a new factorization that can reduce the number of flops required by the preconditioning step. Results concerning the eigenvalues of the preconditioned matrix and its minimum polynomial are given. Numerical experiments validate these conclusions.
Abstract:
Sequential techniques can enhance the efficiency of the approximate Bayesian computation algorithm, as in Sisson et al.'s (2007) partial rejection control version. While this method is based upon the theoretical works of Del Moral et al. (2006), the application to approximate Bayesian computation results in a bias in the approximation to the posterior. An alternative version based on genuine importance sampling arguments bypasses this difficulty, in connection with the population Monte Carlo method of Cappe et al. (2004), and it includes an automatic scaling of the forward kernel. When applied to a population genetics example, it compares favourably with two other versions of the approximate algorithm.
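As background for the abstract above, the plain rejection form of approximate Bayesian computation can be sketched in a few lines. This is the basic algorithm, not the sequential or population Monte Carlo variant discussed in the paper. The function name `abc_rejection`, the toy Gaussian model, the uniform prior and the tolerance `eps` are all assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sample, simulate, summary, eps, n_draws, seed=0):
    """Keep the prior draws whose simulated summary lies within eps of the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                 # draw a parameter from the prior
        s_sim = summary(simulate(theta, rng))     # simulate data and summarize it
        if abs(s_sim - s_obs) <= eps:             # accept if close to the observation
            accepted.append(theta)
    return accepted


# Toy model (assumed): five observations from N(theta, 1), prior theta ~ U(-5, 5).
observed = [1.5, 2.5, 2.0, 1.0, 3.0]
accepted = abc_rejection(
    observed,
    prior_sample=lambda rng: rng.uniform(-5.0, 5.0),
    simulate=lambda theta, rng: [rng.gauss(theta, 1.0) for _ in range(5)],
    summary=statistics.fmean,
    eps=0.2,
    n_draws=20000,
)
posterior_mean = statistics.fmean(accepted)
print(len(accepted), posterior_mean)
```

Most prior draws are rejected here, which is the inefficiency that sequential schemes try to reduce by proposing new parameters from previously accepted ones and reweighting them.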
Abstract:
There is great interest in using amplified fragment length polymorphism (AFLP) markers because they are inexpensive and easy to produce. It is, therefore, possible to generate a large number of markers that have a wide coverage of species genomes. Several statistical methods have been proposed to study genetic structure using AFLPs, but they assume Hardy-Weinberg equilibrium and do not estimate the inbreeding coefficient, F-IS. A Bayesian method proposed by Holsinger and colleagues relaxes these simplifying assumptions, but we have identified two sources of bias that can influence estimates based on these markers: (i) the use of a uniform prior on ancestral allele frequencies and (ii) the ascertainment bias of AFLP markers. We present a new Bayesian method that avoids these biases by using an implementation based on the approximate Bayesian computation (ABC) algorithm. This new method estimates population-specific F-IS and F-ST values and offers users the possibility of taking into account the criteria for selecting the markers that are used in the analyses. The software is available at our web site (http://www-leca.uif-grenoble.fi-/logiciels.htm). Finally, we provide advice on how to avoid the effects of ascertainment bias.
Abstract:
We have favoured the variational (secular equation) method for the determination of the (ro-)vibrational energy levels of polyatomic molecules. We use predominantly the Watson Hamiltonian in normal coordinates and an associated given potential in the variational code 'Multimode'. The dominant cost is the construction and diagonalization of matrices of ever-increasing size. Here we address this problem, using perturbation theory to select dominant expansion terms within the Davidson-Liu iterative diagonalization method. Our chosen example is the twelve-mode molecule methanol, for which we have an ab initio representation of the potential that includes the internal rotational motion of the OH group relative to CH3. Our new algorithm allows us to obtain converged energy levels for matrices of dimensions in excess of 100 000.