183 resultados para modeling algorithms


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The ability to determine the location and relative strength of all transcription-factor binding sites in a genome is important both for a comprehensive understanding of gene regulation and for effective promoter engineering in biotechnological applications. Here we present a bioinformatically driven experimental method to accurately define the DNA-binding sequence specificity of transcription factors. A generalized profile was used as a predictive quantitative model for binding sites, and its parameters were estimated from in vitro-selected ligands using standard hidden Markov model training algorithms. Computer simulations showed that several thousand low- to medium-affinity sequences are required to generate a profile of desired accuracy. To produce data on this scale, we applied high-throughput genomics methods to the biochemical problem addressed here. A method combining systematic evolution of ligands by exponential enrichment (SELEX) and serial analysis of gene expression (SAGE) protocols was coupled to an automated quality-controlled sequence extraction procedure based on Phred quality scores. This allowed the sequencing of a database of more than 10,000 potential DNA ligands for the CTF/NFI transcription factor. The resulting binding-site model defines the sequence specificity of this protein with a high degree of accuracy not achieved earlier and thereby makes it possible to identify previously unknown regulatory sequences in genomic DNA. A covariance analysis of the selected sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The present research deals with the review of the analysis and modeling of Swiss franc interest rate curves (IRC) by using unsupervised (SOM, Gaussian Mixtures) and supervised machine (MLP) learning algorithms. IRC are considered as objects embedded into different feature spaces: maturities; maturity-date, parameters of Nelson-Siegel model (NSM). Analysis of NSM parameters and their temporal and clustering structures helps to understand the relevance of model and its potential use for the forecasting. Mapping of IRC in a maturity-date feature space is presented and analyzed for the visualization and forecasting purposes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Debris flows and related landslide processes occur in many regions all over Norway and pose a significant hazard to inhabited areas. Within the framework of the development of a national debris flows susceptibility map, we are working on a modeling approach suitable for Norway with a nationwide coverage. The discrimination of source areas is based on an index approach, which includes topographic parameters and hydrological settings. For the runout modeling, we use the Flow-R model (IGAR, University of Lausanne), which is based on combined probabilistic and energetic algorithms for the assessment of the spreading of the flow and maximum runout distances. First results for different test areas have shown that runout distances can be modeled reliably. For the selection of source areas, however, additional factors have to be considered, such as the lithological and quaternary geological setting, in order to accommodate the strong variation in debris flow activity in the different geological, geomorphological and climate regions of Norway.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Pharmacokinetic variability in drug levels represent for some drugs a major determinant of treatment success, since sub-therapeutic concentrations might lead to toxic reactions, treatment discontinuation or inefficacy. This is true for most antiretroviral drugs, which exhibit high inter-patient variability in their pharmacokinetics that has been partially explained by some genetic and non-genetic factors. The population pharmacokinetic approach represents a very useful tool for the description of the dose-concentration relationship, the quantification of variability in the target population of patients and the identification of influencing factors. It can thus be used to make predictions and dosage adjustment optimization based on Bayesian therapeutic drug monitoring (TDM). This approach has been used to characterize the pharmacokinetics of nevirapine (NVP) in 137 HIV-positive patients followed within the frame of a TDM program. Among tested covariates, body weight, co-administration of a cytochrome (CYP) 3A4 inducer or boosted atazanavir as well as elevated aspartate transaminases showed an effect on NVP elimination. In addition, genetic polymorphism in the CYP2B6 was associated with reduced NVP clearance. Altogether, these factors could explain 26% in NVP variability. Model-based simulations were used to compare the adequacy of different dosage regimens in relation to the therapeutic target associated with treatment efficacy. In conclusion, the population approach is very useful to characterize the pharmacokinetic profile of drugs in a population of interest. The quantification and the identification of the sources of variability is a rational approach to making optimal dosage decision for certain drugs administered chronically.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A human in vivo toxicokinetic model was built to allow a better understanding of the toxicokinetics of folpet fungicide and its key ring biomarkers of exposure: phthalimide (PI), phthalamic acid (PAA) and phthalic acid (PA). Both PI and the sum of ring metabolites, expressed as PA equivalents (PAeq), may be used as biomarkers of exposure. The conceptual representation of the model was based on the analysis of the time course of these biomarkers in volunteers orally and dermally exposed to folpet. In the model, compartments were also used to represent the body burden of folpet and experimentally relevant PI, PAA and PA ring metabolites in blood and in key tissues as well as in excreta, hence urinary and feces. The time evolution of these biomarkers in each compartment of the model was then mathematically described by a system of coupled differential equations. The mathematical parameters of the model were then determined from best fits to the time courses of PI and PAeq in blood and urine of five volunteers administered orally 1 mg kg(-1) and dermally 10 mg kg(-1) of folpet. In the case of oral administration, the mean elimination half-life of PI from blood (through feces, urine or metabolism) was found to be 39.9 h as compared with 28.0 h for PAeq. In the case of a dermal application, mean elimination half-life of PI and PAeq was estimated to be 34.3 and 29.3 h, respectively. The average final fractions of administered dose recovered in urine as PI over the 0-96 h period were 0.030 and 0.002%, for oral and dermal exposure, respectively. Corresponding values for PAeq were 24.5 and 1.83%, respectively. Finally, the average clearance rate of PI from blood calculated from the oral and dermal data was 0.09 ± 0.03 and 0.13 ± 0.05 ml h(-1) while the volume of distribution was 4.30 ± 1.12 and 6.05 ± 2.22 l, respectively. It was not possible to obtain the corresponding values from PAeq data owing to the lack of blood time course data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Scientific curiosity, exploration of georesources and environmental concerns are pushing the geoscientific research community toward subsurface investigations of ever-increasing complexity. This review explores various approaches to formulate and solve inverse problems in ways that effectively integrate geological concepts with geophysical and hydrogeological data. Modern geostatistical simulation algorithms can produce multiple subsurface realizations that are in agreement with conceptual geological models and statistical rock physics can be used to map these realizations into physical properties that are sensed by the geophysical or hydrogeological data. The inverse problem consists of finding one or an ensemble of such subsurface realizations that are in agreement with the data. The most general inversion frameworks are presently often computationally intractable when applied to large-scale problems and it is necessary to better understand the implications of simplifying (1) the conceptual geological model (e.g., using model compression); (2) the physical forward problem (e.g., using proxy models); and (3) the algorithm used to solve the inverse problem (e.g., Markov chain Monte Carlo or local optimization methods) to reach practical and robust solutions given today's computer resources and knowledge. We also highlight the need to not only use geophysical and hydrogeological data for parameter estimation purposes, but also to use them to falsify or corroborate alternative geological scenarios.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work describes the ab initio procedure employed to build an activation model for the alpha 1b-adrenergic receptor (alpha 1b-AR). The first version of the model was progressively modified and complicated by means of a many-step iterative procedure characterized by the employment of experimental validations of the model in each upgrading step. A combined simulated (molecular dynamics) and experimental mutagenesis approach was used to determine the structural and dynamic features characterizing the inactive and active states of alpha 1b-AR. The latest version of the model has been successfully challenged with respect to its ability to interpret and predict the functional properties of a large number of mutants. The iterative approach employed to describe alpha 1b-AR activation in terms of molecular structure and dynamics allows further complications of the model to allow prediction and interpretation of an ever-increasing number of experimental data.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Mont Collon mafic complex is one of the best preserved examples of the Early Permian magmatism in the Central Alps, related to the intra-continental collapse of the Variscan belt. It mostly consists (> 95 vol.%) of ol+hy-nonnative plagioclase-wehrlites, olivine- and cpx-gabbros with cumulitic structures, crosscut by acid dikes. Pegmatitic gabbros, troctolites and anorthosites outcrop locally. A well-preserved cumulative, sequence is exposed in the Dents de Bertol area (center of intrusion). PT-calculations indicate that this layered magma chamber emplaced at mid-crustal levels at about 0.5 GPa and 1100 degrees C. The Mont Collon cumulitic rocks record little magmatic differentiation, as illustrated by the restricted range of clinopyroxene mg-number (Mg#(cpx)=83-89). Whole-rock incompatible trace-element contents (e.g. Nb, Zr, Ba) vary largely and without correlation with major-element composition. These features are characteristic of an in-situ crystallization process with variable amounts of interstitial liquid L trapped between the cumulus mineral phases. LA-ICPMS measurements show that trace-element distribution in the latter is homogeneous, pointing to subsolidus re-equilibration between crystals and interstitial melts. A quantitative modeling based on Langmuir's in-situ crystallization equation successfully duplicated the REE concentrations in cumulitic minerals of all rock facies of the intrusion. The calculated amounts of interstitial liquid L vary between 0 and 35% for degrees of differentiation F of 0 to 20%, relative to the least evolved facies of the intrusion. L values are well correlated with the modal proportions of interstitial amphibole and whole-rock incompatible trace-element concentrations (e.g. Zr, Nb) of the tested samples. However, the in-situ crystallization model reaches its limitations with rock containing high modal content of REE-bearing minerals (i.e. zircon), such as pegmatitic gabbros. Dikes of anorthositic composition, locally crosscutting the layered lithologies, evidence that the Mont Collon rocks evolved in open system with mixing of intercumulus liquids of different origins and possibly contrasting compositions. The proposed model is not able to resolve these complex open systems, but migrating liquids could be partly responsible for the observed dispersion of points in some correlation diagrams. Absence of significant differentiation with recurrent lithologies in the cumulitic pile of Dents de Bertol points to an efficiently convective magma chamber, with possible periodic replenishment, (c) 2005 Elsevier B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The algorithmic approach to data modelling has developed rapidly these last years, in particular methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is here learned automatically from data, providing the optimum mixture of short and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide efficient means to model local anomalies that may typically arise in situations at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge on the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The transition from wakefulness to sleep represents the most conspicuous change in behavior and the level of consciousness occurring in the healthy brain. It is accompanied by similarly conspicuous changes in neural dynamics, traditionally exemplified by the change from "desynchronized" electroencephalogram activity in wake to globally synchronized slow wave activity of early sleep. However, unit and local field recordings indicate that the transition is more gradual than it might appear: On one hand, local slow waves already appear during wake; on the other hand, slow sleep waves are only rarely global. Studies with functional magnetic resonance imaging also reveal changes in resting-state functional connectivity (FC) between wake and slow wave sleep. However, it remains unclear how resting-state networks may change during this transition period. Here, we employ large-scale modeling of the human cortico-cortical anatomical connectivity to evaluate changes in resting-state FC when the model "falls asleep" due to the progressive decrease in arousal-promoting neuromodulation. When cholinergic neuromodulation is parametrically decreased, local slow waves appear, while the overall organization of resting-state networks does not change. Furthermore, we show that these local slow waves are structured macroscopically in networks that resemble the resting-state networks. In contrast, when the neuromodulator decrease further to very low levels, slow waves become global and resting-state networks merge into a single undifferentiated, broadly synchronized network.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The goal of the present work was assess the feasibility of using a pseudo-inverse and null-space optimization approach in the modeling of the shoulder biomechanics. The method was applied to a simplified musculoskeletal shoulder model. The mechanical system consisted in the arm, and the external forces were the arm weight, 6 scapulo-humeral muscles and the reaction at the glenohumeral joint, which was considered as a spherical joint. The muscle wrapping was considered around the humeral head assumed spherical. The dynamical equations were solved in a Lagrangian approach. The mathematical redundancy of the mechanical system was solved in two steps: a pseudo-inverse optimization to minimize the square of the muscle stress and a null-space optimization to restrict the muscle force to physiological limits. Several movements were simulated. The mathematical and numerical aspects of the constrained redundancy problem were efficiently solved by the proposed method. The prediction of muscle moment arms was consistent with cadaveric measurements and the joint reaction force was consistent with in vivo measurements. This preliminary work demonstrated that the developed algorithm has a great potential for more complex musculoskeletal modeling of the shoulder joint. In particular it could be further applied to a non-spherical joint model, allowing for the natural translation of the humeral head in the glenoid fossa.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Odds ratios for head and neck cancer increase with greater cigarette and alcohol use and lower body mass index (BMI; weight (kg)/height(2) (m(2))). Using data from the International Head and Neck Cancer Epidemiology Consortium, the authors conducted a formal analysis of BMI as a modifier of smoking- and alcohol-related effects. Analysis of never and current smokers included 6,333 cases, while analysis of never drinkers and consumers of < or =10 drinks/day included 8,452 cases. There were 8,000 or more controls, depending on the analysis. Odds ratios for all sites increased with lower BMI, greater smoking, and greater drinking. In polytomous regression, odds ratios for BMI (P = 0.65), smoking (P = 0.52), and drinking (P = 0.73) were homogeneous for oral cavity and pharyngeal cancers. Odds ratios for BMI and drinking were greater for oral cavity/pharyngeal cancer (P < 0.01), while smoking odds ratios were greater for laryngeal cancer (P < 0.01). Lower BMI enhanced smoking- and drinking-related odds ratios for oral cavity/pharyngeal cancer (P < 0.01), while BMI did not modify smoking and drinking odds ratios for laryngeal cancer. The increased odds ratios for all sites with low BMI may suggest related carcinogenic mechanisms; however, BMI modification of smoking and drinking odds ratios for cancer of the oral cavity/pharynx but not larynx cancer suggests additional factors specific to oral cavity/pharynx cancer.