59 results for modeling of data sources
at Université de Lausanne, Switzerland
Abstract:
This paper presents a review of methodology for semi-supervised modeling with kernel methods, when the manifold assumption is guaranteed to be satisfied. It concerns environmental data modeling on natural manifolds, such as the complex topographies of mountainous regions, where environmental processes are highly influenced by the relief. These relations, possibly regionalized and nonlinear, can be modeled from data with machine learning, using digital elevation models in semi-supervised kernel methods. The range of tools and methodological issues discussed in the study includes feature selection and semi-supervised Support Vector algorithms. A real case study devoted to data-driven modeling of meteorological fields illustrates the discussed approach.
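As a hedged sketch of the semi-supervised setting this abstract describes (self-training with an SVM base learner via scikit-learn, not necessarily the authors' exact algorithm), with synthetic coordinates-plus-elevation features standing in for DEM-derived predictors:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))      # (x, y, elevation) - all synthetic
y_true = (X[:, 2] > 0.5).astype(int)          # toy target: above/below an elevation threshold
y = y_true.copy()
y[rng.choice(200, size=150, replace=False)] = -1   # -1 marks unlabeled samples

# self-training: the SVM is refit as confident pseudo-labels are added
model = SelfTrainingClassifier(SVC(probability=True, gamma="scale"))
model.fit(X, y)
acc = float((model.predict(X) == y_true).mean())
```

Only 50 of the 200 points carry labels; the remaining points are pseudo-labeled iteratively, which is one common way the manifold/cluster structure of unlabeled data is exploited.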
Abstract:
Yosemite Valley poses a significant rockfall hazard and related risk due to its glacially steepened walls and the approximately 4 million visitors it receives annually. To assess rockfall hazard, it is necessary to evaluate the geologic structure that contributes to the destabilization of rockfall sources and to locate the most probable future source areas. Coupling new remote sensing techniques (Terrestrial Laser Scanning, Aerial Laser Scanning) with traditional field surveys, we investigated the regional geologic and structural setting, the orientation of the primary discontinuity sets for large areas of Yosemite Valley, and the specific discontinuity sets present at active rockfall sources. This information, combined with a better understanding of the geologic processes that contribute to the progressive destabilization and triggering of granitic rock slabs, contributes to a more accurate rockfall susceptibility assessment for Yosemite Valley and elsewhere.
Abstract:
The present research deals with an important public health threat: the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of the many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology, as it is closely associated with indoor radon. This association was indeed observed for the Swiss data, but it was not proved to be the sole determinant for the spatial modeling. The statistical analysis of the data, at both the univariate and multivariate levels, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving-window methods. The use of the Quantité Morisita Index (QMI) as a procedure to evaluate data clustering as a function of the radon level was proposed. The existing declustering methods were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase came along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-down resolution approach, from regional to local levels, in order to find the appropriate scales for modeling. In this sense, data partitioning was optimized in order to cope with the stationarity conditions of geostatistical models. Common methods of spatial modeling such as K Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied to a particular dataset.
A bottom-to-top approach to method complexity was adopted, and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation (the CVMF) was tested with the purpose of reducing noise at the local scale. At the end of the chapter, a series of tests for data consistency and method robustness was performed. This led to conclusions about the importance of data splitting and the limitations of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the uncertainty of the indoor radon pollution data into consideration. The categorization transform was presented as a solution for modeling extreme values through classification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. Sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Among the classification methods, probabilistic neural networks (PNN) proved better adapted for modeling high-threshold categorization and for automation. Support vector machines (SVM), by contrast, performed well under balanced category conditions. In general, it was concluded that no particular prediction or estimation method is better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to achieve efficient indoor radon decision making.
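Of the exploratory tools named in this abstract, the GRNN is the most compact to sketch: it amounts to a Gaussian-kernel-weighted average of training targets (the Nadaraya-Watson estimator). The coordinates below are hypothetical illustration data, not radon measurements:

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.1):
    """General Regression Neural Network prediction: a Gaussian-kernel-
    weighted average of training targets (Nadaraya-Watson estimator).
    sigma is the kernel bandwidth controlling the neighborhood size."""
    # squared distances between every query point and every training point
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)
```

The single bandwidth parameter plays the same role as the neighborhood definitions discussed above: a small sigma gives very local estimates, a large sigma smooths toward the global mean.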
Abstract:
PECUBE is a three-dimensional thermal-kinematic code capable of solving the heat production-diffusion-advection equation under a temporally varying surface boundary condition. It was initially developed to assess the effects of time-varying surface topography (relief) on low-temperature thermochronological datasets. Thermochronometric ages are predicted by tracking the time-temperature histories of rock-particles ending up at the surface and by combining these with various age-prediction models. In the decade since its inception, the PECUBE code has been under continuous development as its use became wider and addressed different tectonic-geomorphic problems. This paper describes several major recent improvements in the code, including its integration with an inverse-modeling package based on the Neighborhood Algorithm, the incorporation of fault-controlled kinematics, several different ways to address topographic and drainage change through time, the ability to predict subsurface (tunnel or borehole) data, prediction of detrital thermochronology data and a method to compare these with observations, and the coupling with landscape-evolution (or surface-process) models. Each new development is described together with one or several applications, so that the reader and potential user can clearly assess and make use of the capabilities of PECUBE. We end with describing some developments that are currently underway or should take place in the foreseeable future. (C) 2012 Elsevier B.V. All rights reserved.
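For orientation, the heat production-diffusion-advection equation referred to in this abstract has the standard general form (notation here is generic, not taken from the paper):

```latex
\rho c \left( \frac{\partial T}{\partial t} + \mathbf{v} \cdot \nabla T \right)
  = \nabla \cdot \left( k \, \nabla T \right) + \rho H
```

where T is temperature, \mathbf{v} the rock advection velocity, k the thermal conductivity, \rho c the volumetric heat capacity, and H the radiogenic heat production per unit mass; the time-varying topography enters through the surface boundary condition on T.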
Multimodel inference and multimodel averaging in empirical modeling of occupational exposure levels.
Abstract:
Empirical modeling of exposure levels has been popular for identifying exposure determinants in occupational hygiene. Traditional data-driven methods used to choose a model on which to base inferences have typically not accounted for the uncertainty linked to the process of selecting the final model. Several new approaches propose making statistical inferences from a set of plausible models rather than from a single model regarded as 'best'. This paper introduces the multimodel averaging approach described in the monograph by Burnham and Anderson. In their approach, a set of plausible models is defined a priori by taking into account the sample size and previous knowledge of variables influential on exposure levels. The Akaike information criterion is then calculated to evaluate the relative support of the data for each model, expressed as an Akaike weight, to be interpreted as the probability of the model being the best approximating model given the model set. The model weights can then be used to rank models, quantify the evidence favoring one over another, perform multimodel prediction, estimate the relative influence of the potential predictors and estimate multimodel-averaged effects of determinants. The whole approach is illustrated with the analysis of a data set of 1500 volatile organic compound exposure levels collected by the Institute for Work and Health (Lausanne, Switzerland) over 20 years, each concentration having been divided by the relevant Swiss occupational exposure limit and log-transformed before analysis. Multimodel inference represents a promising procedure for modeling exposure levels: it incorporates the notion that several models can be supported by the data and permits evaluating, to a certain extent, the model selection uncertainty, which is seldom mentioned in current practice.
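The Akaike weights described in this abstract follow the standard Burnham-Anderson formula: with delta_i = AIC_i - min(AIC), the weight of model i is exp(-delta_i / 2) normalized over the model set. A minimal sketch (the AIC values below are made up for illustration):

```python
import numpy as np

def akaike_weights(aic_values):
    """Akaike weights: w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2),
    where delta_i = AIC_i - min(AIC). Higher weight = more support."""
    aic = np.asarray(aic_values, dtype=float)
    delta = aic - aic.min()            # rescale so the best model has delta = 0
    rel_likelihood = np.exp(-0.5 * delta)
    return rel_likelihood / rel_likelihood.sum()

# hypothetical AIC values for three candidate exposure models
weights = akaike_weights([100.0, 102.0, 110.0])
```

The weights sum to one and can be read as model probabilities given the set; ratios of weights quantify the evidence for one model over another, and weighted sums of per-model predictions give the multimodel-averaged estimates.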
Abstract:
Within the framework of a retrospective study of the incidence of hip fractures in the canton of Vaud (Switzerland), all cases of hip fracture occurring among the resident population in 1986 and treated in the hospitals of the canton were identified from among five different information sources. Relevant data were then extracted from the medical records. At least two sources of information were used to identify cases in each hospital, among them the statistics of the Swiss Hospital Association (VESKA). These statistics were available for 9 of the 18 hospitals in the canton that participated in the study. The number of cases identified from the VESKA statistics was compared to the total number of cases for each hospital. For the 9 hospitals the number of cases in the VESKA statistics was 407, whereas, after having excluded diagnoses that were actually "status after fracture" and double entries, the total for these hospitals was 392, that is 4% less than the VESKA statistics indicate. It is concluded that the VESKA statistics provide a good approximation of the actual number of cases treated in these hospitals, with a tendency to overestimate this number. In order to use these statistics for calculating incidence figures, however, it is imperative that a greater proportion of all hospitals (50% presently in the canton, 35% nationwide) participate in these statistics.
Abstract:
MOTIVATION: In silico modeling of gene regulatory networks has gained some momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are often experimentally testable are perturbations and stable steady states. Although a lot of work has been done on the identification of steady states, not much work has been reported on in silico modeling of cellular differentiation processes. RESULTS: In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models have been proposed and their corresponding computational properties have been analyzed. These algorithms allow users to compute cyclic attractors of large networks that are currently not feasible using existing software. Hereby we provide a framework to analyze the effect of multiple gene perturbation protocols, and their effect on cell differentiation processes. These algorithms were validated on the T-helper model showing the correct steady state identification and Th1-Th2 cellular differentiation process. AVAILABILITY: The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
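As a toy illustration of attractor computation in a synchronous Boolean network (explicit state-space enumeration, not the symbolic ROBDD algorithms the paper actually proposes; the two-gene network below is hypothetical):

```python
from itertools import product

def synchronous_attractors(update_fns, n):
    """Enumerate attractors (fixed points and cycles) of a synchronous
    Boolean network with n genes by exhaustive state-space search.
    Each update function maps the full state tuple to that gene's next value."""
    attractors = set()
    for state in product((0, 1), repeat=n):
        seen = {}
        s = state
        while s not in seen:                 # iterate until a state repeats
            seen[s] = len(seen)
            s = tuple(f(s) for f in update_fns)
        start = seen[s]                      # states at index >= start form the cycle
        cycle = tuple(sorted(t for t, i in seen.items() if i >= start))
        attractors.add(cycle)
    return attractors

# toy two-gene network where each gene copies the other's current state
attractors = synchronous_attractors([lambda s: s[1], lambda s: s[0]], 2)
```

This toy network has two fixed points, (0,0) and (1,1), plus one cyclic attractor alternating between (0,1) and (1,0). Explicit enumeration is exponential in the number of genes, which is precisely why the paper resorts to ROBDD-based symbolic computation for large networks.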
Abstract:
Retroelements are important evolutionary forces but can be deleterious if left uncontrolled. Members of the human APOBEC3 family of cytidine deaminases can inhibit a wide range of endogenous, as well as exogenous, retroelements. These enzymes are structurally organized in one or two domains comprising a zinc-coordinating motif. APOBEC3G contains two such domains, only the C terminal of which is endowed with editing activity, while its N-terminal counterpart binds RNA, promotes homo-oligomerization, and is necessary for packaging into human immunodeficiency virus type 1 (HIV-1) virions. Here, we performed a large-scale mutagenesis-based analysis of the APOBEC3G N terminus, testing mutants for (i) inhibition of vif-defective HIV-1 infection and Alu retrotransposition, (ii) RNA binding, and (iii) oligomerization. Furthermore, in the absence of structural information on this domain, we used homology modeling to examine the positions of functionally important residues and of residues found to be under positive selection by phylogenetic analyses of primate APOBEC3G genes. Our results reveal the importance of a predicted RNA binding dimerization interface both for packaging into HIV-1 virions and inhibition of both HIV-1 infection and Alu transposition. We further found that the HIV-1-blocking activity of APOBEC3G N-terminal mutants defective for packaging can be almost entirely rescued if their virion incorporation is forced by fusion with Vpr, indicating that the corresponding region of APOBEC3G plays little role in other aspects of its action against this pathogen. Interestingly, residues forming the APOBEC3G dimer interface are highly conserved, contrasting with the rapid evolution of two neighboring surface-exposed amino acid patches, one targeted by the Vif protein of primate lentiviruses and the other of yet-undefined function.
Abstract:
Objectives: After several years of increasing 'normalisation' of cannabis use in Switzerland at the beginning of the new millennium, a reversed tendency set in, marked among other things by more stringent law enforcement. The presentation examines the question of where adolescents and young adults obtained cannabis, within the context of this societal change. In addition, it compares the sources of supply for cannabis with those found in studies of other European countries. Methods: Analyses are based on data from the Swiss Cannabis Monitoring Study. As part of this longitudinal, representative population survey, more than 5000 adolescents and young adults were interviewed by telephone on the topic of cannabis. Within the total sample, 593 (2004) and 554 (2007) current cannabis users, respectively, replied to the questions on sources of supply. Changes in law enforcement and the societal climate concerning cannabis were assessed based on relevant literature, media reports and parliamentary discussions. Results: Whereas 22% of cannabis users stated in 2004 that they bought their cannabis from vendors in hemp shops, this proportion drastically decreased to 6% three years later. At the same time, cannabis was increasingly obtained from friends, while the proportion of users who purchased cannabis from dealers in the alleyway more than doubled, from 6% (2004) to 13% (2007). It was male cannabis users, and in particular young adult and frequent users, who moved into the alleyways. Generally, users who buy cannabis in the alleyway show more cannabis-related problems than those who mainly name other sources of supply, even when adjusted for sex, age and frequency of cannabis use. Discussion: Possible consequences of these changes in cannabis supply, such as the risk of merging a previously cannabis-only market with other 'harder' drug markets, are discussed.
Abstract:
New major and trace element analyses, Sr-Nd isotopic data and 40K-40Ar ages of Neogene and Quaternary lavas from Morocco lead to the conclusion that the observed temporal changes from calc-alkaline to transitional and finally alkaline magmatic activity reflect the contributions of distinct sources. According to our model, magmas originally derived from the melting of a European/Western Mediterranean-type asthenospheric mantle source interact during their ascent with either a subcontinental Ronda-Beni Bousera-type lithospheric mantle (alkaline magmas) or a lithospheric mantle containing a crustal component and the overlying continental crust (calc-alkaline and, to a lesser extent, transitional magmas). (C) Académie des sciences/Elsevier, Paris.
Abstract:
The methodology for generating a homology model of the T1 TCR-PbCS-K(d) major histocompatibility complex (MHC) class I complex is presented. The resulting model provides a qualitative explanation of the effect of over 50 different mutations in the region of the complementarity determining region (CDR) loops of the T cell receptor (TCR), the peptide and the MHC's alpha(1)/alpha(2) helices. The peptide is modified by an azido benzoic acid photoreactive group, which is part of the epitope recognized by the TCR. The construction of the model makes use of closely related homologs (the A6 TCR-Tax-HLA A2 complex, the 2C TCR, the 14.3.d TCR Vbeta chain, the 1934.4 TCR Valpha chain, and the H-2 K(b)-ovalbumin peptide complex), ab initio sampling of CDR loop conformations, and experimental data to select from the set of possibilities. The model shows a complex arrangement of the CDR3alpha, CDR1beta, CDR2beta and CDR3beta loops that leads to the highly specific recognition of the photoreactive group. The protocol can be applied systematically to a series of related sequences, permitting analysis at the structural level of the large TCR repertoire specific for a given peptide-MHC complex.
Abstract:
A remarkable feature of the carcinogenicity of inorganic arsenic is that while human exposures to high concentrations of inorganic arsenic in drinking water are associated with increases in skin, lung, and bladder cancer, inorganic arsenic has not typically caused tumors in standard laboratory animal test protocols. Inorganic arsenic administered for periods of up to 2 yr to various strains of laboratory mice, including the Swiss CD-1, Swiss CR:NIH(S), C57Bl/6p53(+/-), and C57Bl/6p53(+/+), has not resulted in significant increases in tumor incidence. However, Ng et al. (1999) have reported a 40% tumor incidence in C57Bl/6J mice exposed to arsenic in their drinking water throughout their lifetime, with no tumors reported in controls. In order to investigate the potential role of tissue dosimetry in differential susceptibility to arsenic carcinogenicity, a physiologically based pharmacokinetic (PBPK) model for inorganic arsenic in the rat, hamster, monkey, and human (Mann et al., 1996a, 1996b) was extended to describe the kinetics in the mouse. The PBPK model was parameterized in the mouse using published data from acute exposures of B6C3F1 mice to arsenate, arsenite, monomethylarsonic acid (MMA), and dimethylarsinic acid (DMA) and validated using data from acute exposures of C57Black mice. Predictions of the acute model were then compared with data from chronic exposures. There was no evidence of changes in the apparent volume of distribution or in the tissue-plasma concentration ratios between acute and chronic exposure that might support the possibility of inducible arsenite efflux. The PBPK model was also used to project tissue dosimetry in the C57Bl/6J study, in comparison with tissue levels in studies having shorter duration but higher arsenic treatment concentrations. The model evaluation indicates that pharmacokinetic factors do not provide an explanation for the difference in outcomes across the various mouse bioassays. 
Other possible explanations may relate to strain-specific differences, or to the different durations of dosing in each of the mouse studies, given the evidence that inorganic arsenic is likely to be active in the later stages of the carcinogenic process. [Authors]
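A PBPK model couples many tissue compartments; as a heavily simplified, hypothetical stand-in, a one-compartment model with first-order elimination illustrates the kind of kinetics such models integrate (all parameter values below are made up, not taken from the study):

```python
def simulate_concentration(dose_rate, k_e, volume, t_end, dt=0.01):
    """Euler integration of a one-compartment model:
    dC/dt = dose_rate / volume - k_e * C, with C(0) = 0.
    At steady state, C approaches dose_rate / (volume * k_e)."""
    c = 0.0
    for _ in range(int(t_end / dt)):
        c += dt * (dose_rate / volume - k_e * c)
    return c

# toy parameters: constant intake 1.0, elimination rate 0.5, volume 2.0
c_ss = simulate_concentration(dose_rate=1.0, k_e=0.5, volume=2.0, t_end=40.0)
```

In such a model, unchanged steady-state tissue-plasma ratios between acute and chronic dosing correspond to constant rate parameters, which is the kind of check the paper uses to rule out inducible arsenite efflux.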