998 results for MCMC METHODS


Relevance: 20.00%

Abstract:

Our consumption of groundwater, in particular as drinking water and for irrigation, has considerably increased over the years, and groundwater is becoming an increasingly scarce and endangered resource. Nowadays, we face many problems ranging from water prospection to sustainable management and remediation of polluted aquifers. Independently of the hydrogeological problem, the main challenge remains dealing with the incomplete knowledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of geostatistical realizations. The main limitation of this approach is the computational cost of performing complex flow simulations for each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knowledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the flow response of each realization.
Due to computational constraints, state-of-the-art methods use approximate flow simulations to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy flow model is then run only for this subset, and inference is based on these responses. Our objective is to increase the performance of this approach by using all of the available information, not solely the subset of exact responses. Two error models are proposed to correct the approximate responses following a machine-learning approach. For the subset identified by a classical approach (here the distance kernel method), both the approximate and the exact responses are known. This information is used to construct an error model and correct the ensemble of approximate responses to predict the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational cost and increases the accuracy and robustness of the uncertainty propagation. The strategy explored in the first chapter consists in learning, from a subset of realizations, the relationship between proxy and exact curves. In the second part of this thesis, the strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allows a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a pollution problem involving a non-aqueous phase liquid. The error model allows a strong reduction of the computational cost while providing a good estimate of the uncertainty.
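As a rough illustration of the functional error-model idea, the sketch below discretizes the proxy and exact curves, extracts principal components by SVD (a simplified stand-in for FPCA), and fits a linear map between the two score spaces. All names and the linear form of the score regression are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def fit_functional_error_model(proxy_train, exact_train, n_components=3):
    """Fit a linear map from proxy-curve principal-component scores to
    exact-curve scores. Curves are discretized as rows of (n, n_times)
    arrays; SVD on the centered curves plays the role of FPCA here."""
    px_mean = proxy_train.mean(axis=0)
    ex_mean = exact_train.mean(axis=0)
    _, _, Vp = np.linalg.svd(proxy_train - px_mean, full_matrices=False)
    _, _, Ve = np.linalg.svd(exact_train - ex_mean, full_matrices=False)
    Vp, Ve = Vp[:n_components], Ve[:n_components]
    Sp = (proxy_train - px_mean) @ Vp.T          # proxy scores
    Se = (exact_train - ex_mean) @ Ve.T          # exact scores
    B, *_ = np.linalg.lstsq(Sp, Se, rcond=None)  # score-space regression
    return px_mean, ex_mean, Vp, Ve, B

def correct(proxy_curves, model):
    """Predict exact responses from approximate (proxy) responses."""
    px_mean, ex_mean, Vp, Ve, B = model
    scores = (proxy_curves - px_mean) @ Vp.T
    return ex_mean + (scores @ B) @ Ve
```

Trained on the small subset where both responses are known, `correct` then upgrades every remaining proxy curve at negligible cost.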
The individual correction of the proxy response by the error model leads to an excellent prediction of the exact response, opening the door to many applications. The concept of a functional error model is useful not only in the context of uncertainty propagation, but also, and maybe even more so, for Bayesian inference. Markov chain Monte Carlo (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. However, this approach suffers from a low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. This led to the introduction of two-stage MCMC, where the computational cost is decreased by avoiding unnecessary simulations of the exact flow model thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy is coupled to an error model to provide an approximate response for the two-stage MCMC set-up. We demonstrate an increase in acceptance rate by a factor of 1.5 to 3 with respect to one-stage MCMC. An open question remains: how to choose the size of the learning set and identify the realizations that optimize the construction of the error model. This requires devising an iterative strategy to construct the error model, such that, as new flow simulations are performed, the error model is iteratively improved by incorporating the new information. This is discussed in the fourth part of the thesis, in which we apply this methodology to a problem of saline intrusion in a coastal aquifer.
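The two-stage idea can be sketched generically: a cheap proxy posterior screens each proposal, and the expensive exact posterior is evaluated only for proposals that survive the first stage. This is a textbook-style sketch with a symmetric proposal kernel, not the thesis's code; `log_post_exact` and `log_post_proxy` stand for the exact and proxy-plus-error-model log posteriors.

```python
import numpy as np

def two_stage_mcmc(log_post_exact, log_post_proxy, propose, x0,
                   n_iter=1000, rng=None):
    """Two-stage Metropolis-Hastings with a symmetric proposal:
    stage 1 screens proposals with the cheap proxy posterior, and the
    expensive exact posterior is evaluated only for survivors."""
    if rng is None:
        rng = np.random.default_rng()
    x, lp_exact, lp_proxy = x0, log_post_exact(x0), log_post_proxy(x0)
    chain, exact_calls = [x0], 1
    for _ in range(n_iter):
        y = propose(x, rng)
        lq_proxy = log_post_proxy(y)
        # Stage 1: cheap accept/reject using the proxy only
        if np.log(rng.uniform()) < lq_proxy - lp_proxy:
            lq_exact = log_post_exact(y)   # expensive exact-model run
            exact_calls += 1
            # Stage 2: correct for the proxy's error so the chain
            # still targets the exact posterior
            ratio = (lq_exact - lp_exact) + (lp_proxy - lq_proxy)
            if np.log(rng.uniform()) < ratio:
                x, lp_exact, lp_proxy = y, lq_exact, lq_proxy
        chain.append(x)
    return np.array(chain), exact_calls
```

`exact_calls` counts the expensive simulations; the closer the proxy tracks the exact posterior, the fewer exact runs are wasted on proposals that end up rejected.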

Relevance: 20.00%

Abstract:

Background: Information about the composition of regulatory regions is of great value for designing experiments to functionally characterize gene expression. The multiplicity of available applications for predicting transcription factor binding sites in a particular locus contrasts with the substantial computational expertise demanded to manipulate them, which may constitute a barrier for the experimental community. Results: CBS (Conserved regulatory Binding Sites, http://compfly.bio.ub.es/CBS) is a public platform of evolutionarily conserved binding sites and enhancers predicted in multiple Drosophila genomes, furnished with published chromatin signatures associated with transcriptionally active regions and other experimental sources of information. Rapid access to this novel body of knowledge through a user-friendly web interface enables non-expert users to identify the binding sequences available for any particular gene, transcription factor, or genome region. Conclusions: The CBS platform is a powerful resource providing tools for mining individual sequences and groups of co-expressed genes, together with epigenomics information, to conduct regulatory screenings in Drosophila.

Relevance: 20.00%

Abstract:

Recent advances in machine learning increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious to program by human experts. The tasks for which such tools are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. Machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question, but their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, prior knowledge is often incorporated by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions.
We also design a fast cross-validation algorithm for regularized least-squares types of learning algorithms. Further, an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks is proposed. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and that novel advanced kernels and cost functions can be used in algorithms efficiently.
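The fast cross-validation idea for regularized least-squares can be illustrated with the standard exact leave-one-out identity, which yields all n hold-out residuals from a single fit instead of n refits. This is a generic sketch of that well-known shortcut, not the thesis's algorithm.

```python
import numpy as np

def rls_loo_residuals(K, y, lam):
    """Exact leave-one-out residuals of regularized least-squares
    (kernel ridge regression) from one fit, via the hat matrix
    H = K (K + lam*I)^-1:  loo_i = (y_i - f_i) / (1 - H_ii)."""
    n = len(y)
    H = K @ np.linalg.inv(K + lam * np.eye(n))  # hat (smoother) matrix
    f = H @ y                                   # in-sample predictions
    return (y - f) / (1.0 - np.diag(H))         # held-out residuals
```

Each residual equals exactly what retraining without that point would give, so model selection over `lam` costs one O(n^3) solve per candidate rather than n.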

Relevance: 20.00%

Abstract:

Drying is a major step in the manufacturing process in the pharmaceutical industry, and the selection of a dryer and its operating conditions is sometimes a bottleneck. Despite the difficulties, these bottlenecks are handled with great care because of good manufacturing practice (GMP) requirements and the industry's image in the global market. The purpose of this work is to investigate the use of existing knowledge for selecting a dryer and its operating conditions for drying pharmaceutical materials, using methods such as case-based reasoning and decision trees to reduce the time and cost of research. The work consisted of two major parts: a literature survey on the theories of spray drying, case-based reasoning, and decision trees; and a practical part comprising data acquisition and testing of the models on existing and upgraded data. Testing resulted in a combination of the two models, case-based reasoning and decision trees, leading to more specific results than conventional methods.
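The retrieval step of case-based reasoning can be sketched as nearest-neighbor lookup over stored drying cases. The case base, the three attributes (e.g. feed moisture, heat sensitivity, particle coarseness), and all values below are invented for illustration and are not from the thesis.

```python
import numpy as np

# Hypothetical case base: past drying problems and the dryer chosen.
case_base = [
    (np.array([0.8, 0.9, 0.2]), "spray dryer"),
    (np.array([0.3, 0.1, 0.7]), "fluid-bed dryer"),
    (np.array([0.6, 0.8, 0.5]), "spray dryer"),
    (np.array([0.2, 0.2, 0.9]), "tray dryer"),
]

def retrieve(query, cases, weights=None):
    """Retrieve step of the CBR cycle: return the stored case nearest
    to the query under a weighted Euclidean distance."""
    w = np.ones(len(query)) if weights is None else weights
    dists = [np.sqrt(np.sum(w * (query - f)**2)) for f, _ in cases]
    return cases[int(np.argmin(dists))]

_, dryer = retrieve(np.array([0.7, 0.85, 0.3]), case_base)  # -> "spray dryer"
```

In a full CBR cycle the retrieved case's operating conditions would then be reused and adapted to the new material, and the outcome retained as a new case.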

Relevance: 20.00%

Abstract:

Agile software development methods are currently in vogue, and many software development organizations have already implemented them or are planning to do so. The objective of this thesis is to define how agile software development methods can be implemented in a small organization. The agile methods covered are Scrum and XP. The key practices of both methods are analysed and compared with the waterfall method. The thesis also defines an implementation strategy and the actions by which agile methods are implemented in a small organization. In practice, the organization must prepare well, and all needed metrics must be defined before implementation starts. Three sample projects in which agile methods were implemented are introduced. Experiences from these projects were encouraging, although the sample set of projects was too small to yield trustworthy results.

Relevance: 20.00%

Abstract:

In the very volatile high-technology industry, it is of utmost importance to forecast customer demand accurately. However, statistical forecasting of sales, especially in the heavily competitive electronics business, has always been a challenging task due to very high variation in demand and very short product life cycles. The purpose of this thesis is to validate whether statistical methods can be applied to forecasting sales of short-life-cycle electronics products, and to provide a feasible framework for implementing statistical forecasting in the environment of the case company. Two different approaches have been developed: one for short- and medium-term and one for long-term forecasting horizons. Both are decomposition models, but they differ in the interpretation of the model residuals. For long-term horizons the residuals are assumed to represent white noise, whereas for short- and medium-term horizons the residuals are themselves modeled using statistical forecasting methods. Both approaches are implemented in Matlab. The modeling results show that different markets exhibit different demand patterns, and therefore different analytical approaches are appropriate for modeling demand in these markets. Moreover, the outcomes imply that statistical forecasting cannot be handled separately from judgmental forecasting, but should be perceived only as a basis for judgmental forecasting activities. Based on the modeling results, recommendations for further deployment of statistical methods in the case company's sales forecasting are developed.
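A decomposition model of the kind described can be sketched as a linear trend plus seasonal means, with the residual either forecast by a simple AR(1) (short/medium term) or ignored as white noise (long term). This is an illustrative simplification, not the thesis's Matlab implementation.

```python
import numpy as np

def decompose_and_forecast(y, period, horizon, model_residuals=True):
    """Additive decomposition forecast: linear trend + seasonal means.
    If model_residuals, the residual is extrapolated with an AR(1);
    otherwise it is treated as white noise (long-term mode)."""
    t = np.arange(len(y))
    b1, b0 = np.polyfit(t, y, 1)                 # linear trend
    detrended = y - (b0 + b1*t)
    seas = np.array([detrended[i::period].mean() for i in range(period)])
    resid = detrended - seas[t % period]         # what the model missed
    t_new = np.arange(len(y), len(y) + horizon)
    fc = b0 + b1*t_new + seas[t_new % period]
    if model_residuals:                          # AR(1) on the residual
        phi = np.dot(resid[1:], resid[:-1]) / np.dot(resid[:-1], resid[:-1])
        r = resid[-1]
        for h in range(horizon):
            r *= phi
            fc[h] += r
    return fc
```

Markets with different demand patterns would get different residual models here; the point of the sketch is only the split into trend, seasonality, and residual.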

Relevance: 20.00%

Abstract:

Most current methods for adult skeletal age-at-death estimation are based on American samples comprising individuals of European and African ancestry. Our limited understanding of population variability hampers efforts to apply these techniques to other skeletal populations around the world, especially in global forensic contexts. Further, documented skeletal samples are rare, limiting our ability to test the techniques. The objective of this paper is to test three macroscopic pelvic methods (1. Suchey-Brooks; 2. Lovejoy; 3. Buckberry and Chamberlain) on a documented modern Spanish sample. These methods were selected because they are popular among Spanish anthropologists and because they have never been tested on a Spanish sample. The study sample consists of 80 individuals (55 ♂ and 25 ♀) of known sex and age from the Valladolid collection. Results indicate that in all three methods, levels of bias and inaccuracy increase with age. The Lovejoy method performs poorly (27%) compared with Suchey-Brooks (71%) and Buckberry and Chamberlain (86%). However, the levels of correlation between phases and chronological ages are low and comparable across the three methods (< 0.395). The apparent accuracy of the Suchey-Brooks and Buckberry and Chamberlain methods is largely a product of the broad width of the methods' estimated intervals. This study suggests that before these three methodologies are systematically applied in Spanish populations, further statistical modeling and research into the covariance of chronological age with morphological change is necessary. Future methods should be developed specifically for the various world populations, and should allow for both precision and flexibility in age estimation.
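Bias and inaccuracy as used here are the standard summary measures: the mean signed and the mean absolute difference between estimated and chronological age. A minimal sketch (the example ages are invented):

```python
import numpy as np

def bias_and_inaccuracy(estimated, chronological):
    """Standard aging-method performance measures:
    bias = mean signed error (direction of mis-aging),
    inaccuracy = mean absolute error (overall magnitude)."""
    err = np.asarray(estimated, float) - np.asarray(chronological, float)
    return err.mean(), np.abs(err).mean()

# Hypothetical phase-midpoint estimates vs. documented ages
bias, inaccuracy = bias_and_inaccuracy([35, 42, 58, 61], [30, 45, 70, 66])
```

Computing both per age cohort is what reveals the pattern reported above: errors that grow, and turn systematically negative, in the older groups.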

Relevance: 20.00%

Abstract:

Forensic anthropology and bioarchaeology studies depend critically on the accuracy and reliability of age-estimation techniques. In this study, we evaluated two adult age-estimation methods based on the pubic symphysis (Suchey-Brooks) and the auricular surface (Buckberry-Chamberlain) in a current sample of 139 individuals (67 women and 72 men) from Madrid, in order to verify the accuracy of both methods when applied to a sample of innominate bones from the central Iberian Peninsula. Based on the overall results of this study, the Buckberry-Chamberlain method provides the better estimates in terms of accuracy (percentage of hits) and absolute difference from chronological age over the total sample. The percentages of hits and mean absolute differences of the Buckberry-Chamberlain and Suchey-Brooks methods are 97.3% and 11.24 years, and 85.7% and 14.38 years, respectively. However, this apparently greater applicability of the Buckberry-Chamberlain method is mainly due to the broad age ranges it provides. Results indicate that the Suchey-Brooks method is more appropriate for populations with a majority of young individuals, whereas the Buckberry-Chamberlain method is recommended for populations with a higher percentage of individuals in the 60-70 year range. These different age-estimation methodologies significantly influence the resulting demographic profile, consequently affecting the reconstruction of the biological characteristics of the samples to which they are applied.

Relevance: 20.00%

Abstract:

There is increasing interest in new enzyme preparations for the development of products derived from bioprocesses to obtain alternative bio-based materials. In this context, four non-commercial lipases from Pseudomonas species were prepared, immobilized on different low-cost supports, and examined for potential biotechnological applications. Results: To reduce the cost of eventual scaling-up, the new lipases were obtained directly from crude cell extracts or from growth culture supernatants and immobilized by simple adsorption on Accurel EP100, Accurel MP1000, and Celite® 545. The enzymes evaluated were LipA and LipC from Pseudomonas sp. 42A2, a thermostable mutant of LipC, and LipI.3 from Pseudomonas CR611, which were produced in either homologous or heterologous hosts. The best immobilization results were obtained on Accurel EP100 for LipA and on Accurel MP1000 for LipC and its thermostable variant. LipI.3, requiring a refolding step, was poorly immobilized on all supports tested (best results for Accurel MP1000). To test the behavior of the immobilized lipases, they were assayed in triolein transesterification, where the best results were observed for lipases immobilized on Accurel MP1000. Conclusions: The suggested protocol does not require protein purification and uses crude enzymes immobilized by a fast adsorption technique on low-cost supports, which makes the method suitable for eventual scaling up aimed at biotechnological applications. A fast, simple, and economic method for lipase preparation and immobilization has therefore been set up. The low price of the supports tested and the simplicity of the procedure, skipping tedious and expensive purification steps, will contribute to cost reduction in biotechnological lipase-catalyzed processes.

Relevance: 20.00%

Abstract:

Our inability to adequately treat many patients with refractory epilepsy caused by focal cortical dysplasia (FCD), together with surgical inaccessibility and surgical failures, constitutes a significant clinical drawback. Targeting the physiologic features of epileptogenesis in FCD and colocalizing functionality have enhanced the completeness of surgical resection, the main determinant of outcome. Electroencephalography (EEG)-functional magnetic resonance imaging (fMRI) and magnetoencephalography are helpful in guiding electrode implantation and surgical treatment, and high-frequency oscillations help define the extent of the epileptogenic dysplasia. Ultra-high-field MRI has a role in understanding the laminar organization of the cortex, and fluorodeoxyglucose positron emission tomography (FDG-PET) is highly sensitive for detecting FCD in MRI-negative cases. Multimodal imaging is clinically valuable, either by improving the rate of postoperative seizure freedom or by reducing postoperative deficits; however, there is no level 1 evidence that it improves outcomes. Proof of a specific effect of antiepileptic drugs (AEDs) in FCD is lacking. Pathogenic mutations recently described in mammalian target of rapamycin (mTOR) genes in FCD have yielded important insights into novel treatment options with mTOR inhibitors, which might represent an example of personalized treatment of epilepsy based on the known mechanisms of disease. The ketogenic diet (KD) has been demonstrated to be particularly effective in children with epilepsy caused by structural abnormalities, especially FCD. It attenuates epigenetic chromatin modifications, a master regulator of gene expression and functional adaptation of the cell, thereby modifying disease progression; this could imply a lasting benefit of dietary manipulation. Neurostimulation techniques have produced variable clinical outcomes in FCD.
In widespread dysplasias, vagus nerve stimulation (VNS) has achieved responder rates above 50%; however, the efficacy of noninvasive cranial nerve stimulation modalities such as transcutaneous VNS (tVNS) and noninvasive VNS (nVNS) requires further study. Although a review of current strategies underscores the serious shortcomings of treatment-resistant cases, initial evidence from novel approaches suggests that future success is possible.
