67 results for Data Modeling
Multimodel inference and multimodel averaging in empirical modeling of occupational exposure levels.
Abstract:
Empirical modeling of exposure levels has been popular for identifying exposure determinants in occupational hygiene. Traditional data-driven methods used to choose a model on which to base inferences have typically not accounted for the uncertainty linked to the process of selecting the final model. Several new approaches propose making statistical inferences from a set of plausible models rather than from a single model regarded as 'best'. This paper introduces the multimodel averaging approach described in the monograph by Burnham and Anderson. In their approach, a set of plausible models is defined a priori by taking into account the sample size and previous knowledge of variables that influence exposure levels. The Akaike information criterion is then calculated to evaluate the relative support of the data for each model, expressed as an Akaike weight, interpreted as the probability that the model is the best approximating model given the model set. The model weights can then be used to rank models, quantify the evidence favoring one over another, perform multimodel prediction, estimate the relative influence of the potential predictors and estimate multimodel-averaged effects of determinants. The whole approach is illustrated with the analysis of a data set of 1500 volatile organic compound exposure levels collected by the Institute for work and health (Lausanne, Switzerland) over 20 years, each concentration having been divided by the relevant Swiss occupational exposure limit and log-transformed before analysis. Multimodel inference represents a promising procedure for modeling exposure levels: it incorporates the notion that several models can be supported by the data and permits evaluating, to some extent, the model selection uncertainty that is seldom mentioned in current practice.
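The Akaike-weight computation at the heart of this approach is compact enough to sketch directly. The AIC values below are invented for illustration; only the weighting formula (each weight proportional to exp(-delta/2)) follows the Burnham and Anderson framework described above.

```python
import math

def akaike_weights(aic_values):
    """Convert a list of AIC values into Akaike weights.

    Each weight is interpreted as the probability that the corresponding
    model is the best approximating model within the candidate set.
    """
    best = min(aic_values)
    deltas = [a - best for a in aic_values]          # AIC differences
    rel_lik = [math.exp(-d / 2.0) for d in deltas]   # relative likelihoods
    total = sum(rel_lik)
    return [r / total for r in rel_lik]

# Hypothetical AIC values for three candidate exposure models
weights = akaike_weights([1002.1, 1004.3, 1010.8])
```

The weights sum to one, so they can be used directly to rank models or as multimodel-averaging coefficients.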
Abstract:
MOTIVATION: In silico modeling of gene regulatory networks has gained momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are most often experimentally testable are perturbations and stable steady states. Although much work has been done on the identification of steady states, little has been reported on in silico modeling of cellular differentiation processes. RESULTS: In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models are proposed and their computational properties analyzed. These algorithms allow users to compute cyclic attractors of large networks, which is currently not feasible with existing software. We thereby provide a framework to analyze the effect of multiple gene perturbation protocols and their impact on cell differentiation processes. The algorithms were validated on the T-helper model, showing correct steady state identification and the Th1-Th2 cellular differentiation process. AVAILABILITY: The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
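For intuition, the attractors (steady states and cycles) of a small synchronous Boolean network can be found by exhaustive state enumeration, as in the sketch below. The three-gene update rules are invented toy examples, not the T-helper model, and this brute-force search is exactly what the ROBDD-based algorithms are designed to replace for large networks.

```python
from itertools import product

# A toy 3-gene Boolean network (hypothetical rules, not the T-helper model).
# A state is a tuple (x0, x1, x2); updates are synchronous.
def update(state):
    x0, x1, x2 = state
    return (x1,              # gene 0 is activated by gene 1
            x0 and not x2,   # gene 1 is activated by gene 0, repressed by gene 2
            x0)              # gene 2 is activated by gene 0

def attractors(n_genes, step):
    """Enumerate all attractors by exhaustive simulation -- feasible only
    for small n; symbolic BDD methods scale much further."""
    found = set()
    for start in product([0, 1], repeat=n_genes):
        seen = {}
        s = start
        while s not in seen:
            seen[s] = len(seen)
            s = tuple(int(bool(v)) for v in step(s))
        # s is the first revisited state; everything from its first visit
        # onward forms the attractor cycle
        cycle = [t for t, i in sorted(seen.items(), key=lambda kv: kv[1])
                 if i >= seen[s]]
        found.add(tuple(sorted(cycle)))
    return found
```

For this toy network the search finds one fixed point, (0,0,0), and one two-state cycle.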
Abstract:
A factor limiting preliminary rockfall hazard mapping at the regional scale is often the lack of knowledge of potential source areas. Nowadays, high-resolution topographic data (LiDAR) can account for realistic landscape details even at large scale. With such fine-scale morphological variability, quantitative geomorphometric analyses become a relevant approach for delineating potential rockfall instabilities. Using the digital elevation model (DEM)-based 'slope families' concept over areas of similar lithology, together with the cliff and scree zones available from the 1:25,000 topographic map, a rockfall hazard susceptibility map was drawn up for the canton of Vaud, Switzerland, in order to provide a relevant hazard overview. Slope surfaces above morphometrically-defined threshold angles were considered as rockfall source zones. 3D modelling (CONEFALL) was then applied to each of the estimated source zones in order to assess the maximum runout length. Comparisons with known events and other rockfall hazard assessments show good agreement, demonstrating that it is possible to assess rockfall activity over large areas from DEM-based parameters and topographical elements.
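The slope-thresholding step can be sketched as follows: compute slope angles from a DEM by finite differences and flag cells above a threshold as potential source cells. The grid values, cell size and 45° threshold below are illustrative only; the actual study derives its thresholds from morphometrically-defined 'slope families'.

```python
import math

def slope_degrees(dem, cell):
    """Central-difference slope angle (degrees) for interior grid cells."""
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            dzdx = (dem[i][j + 1] - dem[i][j - 1]) / (2 * cell)
            dzdy = (dem[i + 1][j] - dem[i - 1][j]) / (2 * cell)
            out[i][j] = math.degrees(math.atan(math.hypot(dzdx, dzdy)))
    return out

def source_cells(dem, cell, threshold_deg):
    """Cells whose slope meets or exceeds the threshold angle."""
    s = slope_degrees(dem, cell)
    return [(i, j) for i, row in enumerate(s)
            for j, v in enumerate(row) if v >= threshold_deg]
```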
Abstract:
Retroelements are important evolutionary forces but can be deleterious if left uncontrolled. Members of the human APOBEC3 family of cytidine deaminases can inhibit a wide range of endogenous, as well as exogenous, retroelements. These enzymes are structurally organized in one or two domains comprising a zinc-coordinating motif. APOBEC3G contains two such domains, only the C terminal of which is endowed with editing activity, while its N-terminal counterpart binds RNA, promotes homo-oligomerization, and is necessary for packaging into human immunodeficiency virus type 1 (HIV-1) virions. Here, we performed a large-scale mutagenesis-based analysis of the APOBEC3G N terminus, testing mutants for (i) inhibition of vif-defective HIV-1 infection and Alu retrotransposition, (ii) RNA binding, and (iii) oligomerization. Furthermore, in the absence of structural information on this domain, we used homology modeling to examine the positions of functionally important residues and of residues found to be under positive selection by phylogenetic analyses of primate APOBEC3G genes. Our results reveal the importance of a predicted RNA binding dimerization interface both for packaging into HIV-1 virions and inhibition of both HIV-1 infection and Alu transposition. We further found that the HIV-1-blocking activity of APOBEC3G N-terminal mutants defective for packaging can be almost entirely rescued if their virion incorporation is forced by fusion with Vpr, indicating that the corresponding region of APOBEC3G plays little role in other aspects of its action against this pathogen. Interestingly, residues forming the APOBEC3G dimer interface are highly conserved, contrasting with the rapid evolution of two neighboring surface-exposed amino acid patches, one targeted by the Vif protein of primate lentiviruses and the other of yet-undefined function.
Abstract:
Summary: Global warming has led to an average earth surface temperature increase of about 0.7 °C in the 20th century, according to the 2007 IPCC report. In Switzerland, the temperature increase over the same period was even higher: 1.3 °C in the Northern Alps and 1.7 °C in the Southern Alps. The impacts of this warming on ecosystems - especially on climatically sensitive systems like the treeline ecotone - are already visible today. Alpine treeline species show increased growth rates, more establishment of young trees in forest gaps is observed in many locations, and treelines are migrating upwards. With the forecasted warming, this globally visible phenomenon is expected to continue. This PhD thesis aimed to develop a set of methods and models to investigate current and future climatic treeline positions and treeline shifts in the Swiss Alps in a spatial context. The focus was therefore on: 1) the quantification of current treeline dynamics and its potential causes, 2) the evaluation and improvement of temperature-based treeline indicators and 3) the spatial analysis and projection of past, current and future climatic treeline positions and their respective elevational shifts. The methods used involved a combination of field temperature measurements, statistical modeling and spatial modeling in a geographical information system. To determine treeline shifts and assign the respective drivers, neighborhood relationships between forest patches were analyzed using moving-window algorithms. Time series regression modeling was used in the development of an air-to-soil temperature transfer model to calculate thermal treeline indicators. The indicators were then applied spatially to delineate the climatic treeline, based on interpolated temperature data. Observation of recent forest dynamics in the Swiss treeline ecotone showed that changes were mainly due to forest in-growth, but also partly to upward altitudinal shifts.
The recent reduction in agricultural land-use was found to be the dominant driver of these changes. Climate-driven changes were identified only at the uppermost limits of the treeline ecotone. Seasonal mean temperature indicators were found to be the best for predicting climatic treelines. Applying dynamic seasonal delimitations and the air-to-soil temperature transfer model improved the indicators' applicability for spatial modeling. Reproducing the climatic treelines of the past 45 years revealed regionally different altitudinal shifts, the largest being located near the highest mountain mass. Modeling climatic treelines based on two IPCC climate warming scenarios predicted major shifts in treeline altitude. However, the currently-observed treeline is not expected to reach this limit easily, due to lagged reaction, possible climate feedback effects and other limiting factors.
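A minimal sketch of an air-to-soil temperature transfer model of the kind described in this summary, reduced to a single ordinary-least-squares regression. The thesis used time series regression, which additionally handles lags and autocorrelation; all temperature values below are invented.

```python
# Fit soil_T ~ a + b * air_T by ordinary least squares (pure Python).
def fit_linear(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

air  = [2.0, 5.0, 8.0, 12.0, 15.0]   # daily mean air temperature, degC (toy)
soil = [3.1, 4.9, 6.8,  9.5, 11.2]   # soil temperature at depth, degC (toy)
a, b = fit_linear(air, soil)
predicted_soil = [a + b * t for t in air]
```

Once fitted, such a transfer function lets interpolated air temperatures be converted to the soil temperatures on which thermal treeline indicators are based.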
Abstract:
Recent progress in the experimental determination of protein structures allows us to understand, at a very detailed level, the molecular recognition mechanisms that underlie living matter. This level of understanding makes it possible to design rational therapeutic approaches, in which effector molecules are adapted or created de novo to perform a given function. An example of such an approach is drug design, where small inhibitory molecules are designed using in silico simulations and tested in vitro. In this article, we present a similar approach to rationally optimize the sequence of killer T lymphocyte receptors to make them more efficient against melanoma cells. The architecture of this translational research project is presented together with its implications both for basic research and for the clinic.
Abstract:
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, for decoding (such as predicting motif occurrences in sequences), and for the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. Useful features include the use of pseudocounts, state tying and the fixing of selected parameters during learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++ and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
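Decoding with an HMM typically means running the Viterbi algorithm. The sketch below is a generic Viterbi implementation, not MAMOT's code; the two-state motif/background model in the usage example, and all its probabilities, are invented.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for an observation sequence."""
    # V[t][s] = (best probability of reaching s at time t, predecessor state)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][r][0] * trans_p[r][s] * emit_p[s][obs[t]], r)
                for r in states)
            V[t][s] = (prob, prev)
    # Backtrack from the best final state
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return path[::-1]

# Toy motif ('M') vs background ('B') model over a two-letter alphabet
states = ['B', 'M']
path = viterbi(['a', 'c', 'c', 'c'], states,
               {'B': 0.9, 'M': 0.1},
               {'B': {'B': 0.8, 'M': 0.2}, 'M': {'B': 0.2, 'M': 0.8}},
               {'B': {'a': 0.7, 'c': 0.3}, 'M': {'a': 0.1, 'c': 0.9}})
```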
Abstract:
Tractography is a class of algorithms aiming to map in vivo the major neuronal pathways of the white matter from diffusion magnetic resonance imaging (MRI) data. These techniques offer a powerful tool to investigate noninvasively, at the macroscopic scale, the architecture of the neuronal connections of the brain. Unfortunately, however, the reconstructions recovered with existing tractography algorithms are not truly quantitative, even though diffusion MRI is a quantitative modality by nature. In fact, several techniques have been proposed in recent years to estimate, at the voxel level, intrinsic microstructural features of the tissue, such as axonal density and diameter, by using multicompartment models. In this paper, we present a novel framework to re-establish the link between tractography and tissue microstructure. Starting from an input set of candidate fiber tracts, estimated from the data using standard fiber-tracking techniques, we model the diffusion MRI signal in each voxel of the image as a linear combination of the restricted and hindered contributions generated in every location of the brain by these candidate tracts. We then seek the global weight of each tract, i.e., its effective contribution or volume, such that together they best fit the measured signal. We demonstrate that these weights can be recovered by solving a global convex optimization problem with efficient algorithms. The effectiveness of our approach has been evaluated both on a realistic phantom with known ground truth and on in vivo brain data. Results clearly demonstrate the benefits of the proposed formulation, opening new perspectives for a more quantitative and biologically plausible assessment of the structural connectivity of the brain.
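The core estimation step can be sketched as a nonnegative least-squares problem: find tract weights w >= 0 such that A.w best fits the measured signal y, where column j of A holds the signal contribution of candidate tract j. The projected-gradient loop below stands in for the efficient convex solvers used by the authors, and the matrices are toy values.

```python
def nnls_pg(A, y, iters=5000, lr=0.01):
    """Nonnegative least squares min ||A w - y||^2, w >= 0, by
    projected gradient descent (pure-Python illustration)."""
    m, n = len(A), len(A[0])
    w = [0.0] * n
    for _ in range(iters):
        # Residual r = A w - y
        r = [sum(A[i][j] * w[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient g = A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step, then projection onto the nonnegative orthant
        w = [max(0.0, w[j] - lr * g[j]) for j in range(n)]
    return w
```

With a well-conditioned dictionary this converges to the unique convex optimum; the authors' formulation solves the same kind of problem at whole-brain scale.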
Abstract:
In the context of the investigation of the use of automated fingerprint identification systems (AFIS) for the evaluation of fingerprint evidence, the current study presents investigations into the variability of scores from an AFIS system when fingermarks from a known donor are compared to fingerprints that are not from the same source. The ultimate goal is to propose a model, based on likelihood ratios, which allows the evaluation of mark-to-print comparisons. In particular, through its use of AFIS technology, this model benefits from the possibility of using a large amount of data, as well as from an already built-in proximity measure, the AFIS score. More precisely, the numerator of the LR is obtained from scores issued from comparisons between impressions from the same source and showing the same minutia configuration. The denominator of the LR is obtained by extracting scores from comparisons of the questioned mark with a database of non-matching sources. This paper focuses solely on the assignment of the denominator of the LR, which we refer to by the generic term of between-finger variability. The issues addressed here in relation to between-finger variability are the required sample size and the influence of the finger number, the general pattern, and the number and configuration of the minutiae included on a given finger. Results show that reliable estimation of between-finger variability is feasible with 10,000 scores. These scores should come from the appropriate finger number/general pattern combination as defined by the mark. Furthermore, strategies are presented for obtaining between-finger variability when these elements cannot be conclusively determined from the mark (or, for the finger number, from its position with respect to other marks). These results immediately allow case-by-case estimation of between-finger variability in an operational setting.
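One simple way to work with a set of non-match scores is an empirical tail-probability estimate, sketched below: compare the questioned mark against many non-matching fingers and estimate how probable a score at least as high as the observed one is. This is an illustration of the general idea only; the scores are simulated, and the paper's actual probabilistic assignment of the LR denominator may differ (e.g. modeling the score density rather than a tail probability).

```python
import random

def tail_probability(observed_score, nonmatch_scores):
    """Empirical probability of a non-match score >= the observed score,
    with add-one smoothing so the estimate is never exactly zero."""
    hits = sum(1 for s in nonmatch_scores if s >= observed_score)
    return (hits + 1) / (len(nonmatch_scores) + 1)

# Simulated between-finger variability: 10,000 scores from comparisons of
# the mark against non-matching sources (distribution parameters invented)
random.seed(1)
between_finger = [random.gauss(100, 15) for _ in range(10000)]
p_nonmatch = tail_probability(160, between_finger)
```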
Abstract:
Empirical literature on the analysis of the efficiency of measures for reducing persistent government deficits has mainly focused on the direct explanation of the deficit. By contrast, this paper aims at modeling government revenue and expenditure within a simultaneous framework and deriving the fiscal balance (surplus or deficit) equation as the difference between the two variables. This setting enables one not only to judge how relevant the explanatory variables are in explaining the fiscal balance, but also to understand their impact on revenue and/or expenditure. Our empirical results, obtained using a panel data set on Swiss cantons for the period 1980-2002, confirm the relevance of the approach followed here by providing unambiguous evidence of a simultaneous relationship between revenue and expenditure. They also reveal strong dynamic components in revenue, expenditure and fiscal balance. Among the significant determinants of the public fiscal balance we find not only the usual business cycle elements, but also, and more importantly, institutional factors such as the number of administrative units and the ease with which people can resort to direct-democracy instruments, such as popular initiatives and referendums.
Abstract:
Objectives: We are interested in the numerical simulation of the anastomotic region comprised between the outflow cannula of an LVAD and the aorta. Segmentation, geometry reconstruction and grid generation from patient-specific data remain an issue because of the variable quality of DICOM images, in particular CT scans (e.g. metallic noise of the device, non-aortic contrast phase). We propose a general framework to overcome this problem and create suitable grids for numerical simulations. Methods: Preliminary treatment of the images is performed by reducing the level window and enhancing the contrast of the greyscale image using contrast-limited adaptive histogram equalization. A gradient anisotropic diffusion filter is applied to reduce the noise. Watershed segmentation algorithms and mathematical morphology filters then allow reconstructing the patient geometry. This is done using the InsightToolKit library (www.itk.org). Finally, the Vascular Modeling ToolKit (www.vmtk.org) and gmsh (www.geuz.org/gmsh) are used to create the meshes for the fluid (blood) and the structure (arterial wall, outflow cannula) and to identify the boundary layers a priori. The method was tested on five patients with left ventricular assist devices who underwent a CT scan exam. Results: The method produced good results in four patients: the anastomosis area is recovered and the generated grids are suitable for numerical simulations. In one patient the method failed to produce a good segmentation because of the small dimension of the aortic arch with respect to the image resolution. Conclusions: The described framework allows the use of data that could not otherwise be segmented by standard automatic segmentation tools. In particular, the computational grids generated are suitable for simulations that take fluid-structure interactions into account. The presented method also features good reproducibility and fast application.
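The contrast-enhancement step can be illustrated with a plain global histogram equalization on a toy 8-bit image, as below. The actual pipeline uses contrast-limited *adaptive* histogram equalization and anisotropic diffusion from the InsightToolKit, which this pure-Python sketch does not reproduce.

```python
def equalize(image, levels=256):
    """Global histogram equalization of a 2D list of integer gray levels."""
    flat = [p for row in image for p in row]
    # Histogram and cumulative distribution function
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, run = [], 0
    for h in hist:
        run += h
        cdf.append(run)
    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    # Map each gray level so the output histogram is as flat as possible
    lut = [round((c - cdf_min) / (total - cdf_min) * (levels - 1))
           if total > cdf_min else 0 for c in cdf]
    return [[lut[p] for p in row] for row in image]
```

A low-contrast image whose pixels occupy a narrow band of gray levels is stretched across the full dynamic range, which helps subsequent segmentation steps.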
Abstract:
BACKGROUND: The prevalence of obesity has increased in societies of all socio-cultural backgrounds. To date, guidelines set forward to prevent obesity have universally emphasized optimal levels of physical activity. However, empirical data supporting the assertion that low levels of activity-related energy expenditure are a causal factor in the current obesity epidemic are very limited. METHODS: The Modeling the Epidemiologic Transition Study (METS) is a cohort study designed to assess the association between physical activity levels and relative weight, weight gain, and diabetes and cardiovascular disease risk in five population-based samples at different stages of economic development. Twenty-five hundred young adults, ages 25-45, were enrolled in the study: 500 from each of the sites in Ghana, South Africa, the Seychelles, Jamaica and the United States. At baseline, physical activity levels were assessed using accelerometry and a questionnaire in all participants, and by doubly labeled water in a subsample of 75 per site. We assessed dietary intake using two separate 24-h recalls, body composition using bioelectrical impedance analysis, and health history and social and economic indicators by questionnaire. Blood pressure was measured and blood samples collected for measurement of lipids, glucose, insulin and adipokines. Full examinations, including physical activity by accelerometry, anthropometric data and fasting glucose, will take place at 12 and 24 months. The distribution of the main variables and the associations between physical activity, independent of energy intake, glucose metabolism and anthropometric measures will be assessed using cross-sectional and longitudinal analyses within and between sites. DISCUSSION: METS will provide insight into the relative contributions of physical activity and diet to excess weight, age-related weight gain and incident glucose impairment in five population samples of young adults at different stages of economic development.
These data should be useful for the development of empirically-based public health policy aimed at the prevention of obesity and associated chronic diseases.
Abstract:
We present models predicting the potential distribution of a threatened ant species, Formica exsecta Nyl., in the Swiss National Park (SNP). The data used to fit the models were collected according to a random-stratified design with an equal number of replicates per stratum. The basic aim of such a sampling strategy is to allow the formal testing of biological hypotheses about the factors most likely to account for the distribution of the modeled species. The stratifying factors used in this study were vegetation, slope angle and slope aspect, the latter two being used as surrogates of solar radiation, considered one of the basic requirements of F. exsecta. Results show that, although the basic stratifying predictors account for more than 50% of the deviance, incorporating additional non-spatially-explicit predictors measured in the field increases model performance (up to nearly 75%). However, this was not corroborated by permutation tests. Implementation on the national scale was carried out for one model only, due to the difficulty of obtaining similar predictors at this scale. The resulting map on the national scale suggests that the species might once have had a broader distribution in Switzerland. Its particular abundance within the SNP might be related to habitat fragmentation and vegetation transformation outside the SNP boundaries.
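The random-stratified design with equal replicates per stratum can be sketched in a few lines. The strata names and plot identifiers below are invented; only the sampling scheme itself follows the description above.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw the same number of plots at random from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(plots, n_per_stratum)
            for name, plots in strata.items()}

# Hypothetical strata: combinations of vegetation, slope-angle class
# and aspect class, each holding candidate plot ids
strata = {
    "meadow/steep/south":  list(range(0, 40)),
    "meadow/gentle/north": list(range(40, 80)),
    "forest/steep/south":  list(range(80, 120)),
}
design = stratified_sample(strata, n_per_stratum=5)
```

Equal replication per stratum is what later allows the stratifying factors to be tested formally as predictors of the species' distribution.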
Abstract:
Building a personalized model to describe the drug concentration inside the human body for each patient is highly important to clinical practice and demanding for modeling tools. Instead of using traditional explicit methods, in this paper we propose a machine learning approach to describe the relation between drug concentration and patients' features. Machine learning has been widely applied to analyze data in various domains, but it is still new to personalized medicine, especially dose individualization. We focus mainly on the prediction of drug concentrations as well as on the analysis of the influence of different features. Models are built based on Support Vector Machines, and the prediction results are compared with those of traditional analytical models.
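As a rough illustration of the approach, the sketch below trains a linear support vector regressor (epsilon-insensitive loss, subgradient descent) to predict a concentration from patient features. A real study would use a tuned library implementation with nonlinear kernels; all data values here are invented.

```python
def svr_fit(X, y, epsilon=0.1, lam=0.01, lr=0.01, epochs=2000):
    """Linear SVR trained by stochastic subgradient descent on the
    regularized epsilon-insensitive loss."""
    n_feat = len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - t
            # Subgradient of the epsilon-insensitive loss: zero inside
            # the epsilon tube, +/-1 outside it
            g = 0.0 if abs(err) <= epsilon else (1.0 if err > 0 else -1.0)
            w = [wi - lr * (lam * wi + g * xi) for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def svr_predict(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b
```

The epsilon tube is what distinguishes SVR from ordinary least squares: small residuals are ignored entirely, which makes the fit robust to measurement noise in the observed concentrations.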