968 results for Randomization-based Inference
Abstract:
Animal dispersal in a fragmented landscape depends on the complex interaction between landscape structure and animal behavior. To better understand how individuals disperse, it is important to explicitly represent the properties of organisms and the landscape in which they move. A common approach to modelling dispersal includes representing the landscape as a grid of equal-sized cells and then simulating individual movement as a correlated random walk. This approach uses an a priori scale of resolution, which limits the representation of all landscape features and how different dispersal abilities are modelled. We develop a vector-based landscape model coupled with an object-oriented model for animal dispersal. In this spatially explicit dispersal model, landscape features are defined based on their geographic and thematic properties and dispersal is modelled through consideration of an organism's behavior, movement rules and searching strategies (such as visual cues). We present the model's underlying concepts and its ability to adequately represent landscape features, and we provide simulations of dispersal according to different dispersal abilities. We demonstrate the potential of the model by simulating two virtual species in a real Swiss landscape. This illustrates the model's ability to simulate complex dispersal processes and provides information about dispersal such as colonization probability and spatial distribution of the organism's path.
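The grid-based baseline mentioned in this abstract (a correlated random walk over equal-sized cells) can be sketched as follows. This is only an illustration of the common approach, not the vector-based model the authors develop; the function name, the Gaussian turning angle and all parameter values are assumptions.

```python
import math
import random


def correlated_random_walk(n_steps, step_length=1.0, turn_sd=0.3,
                           cell_size=1.0, seed=0):
    """Simulate a correlated random walk and return the grid cells visited.

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)
    cells = [(0, 0)]
    for _ in range(n_steps):
        # The new heading is the previous heading plus a small random turn;
        # this persistence of direction is what makes the walk "correlated".
        heading += rng.gauss(0.0, turn_sd)
        x += step_length * math.cos(heading)
        y += step_length * math.sin(heading)
        # Map the continuous position onto a grid of square cells.
        cells.append((math.floor(x / cell_size), math.floor(y / cell_size)))
    return cells


if __name__ == "__main__":
    path = correlated_random_walk(20)
    print(path[:5])
```

The fixed `cell_size` is the a priori scale of resolution that the abstract criticises: any landscape feature smaller than one cell cannot be represented.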
Abstract:
With the advancement of high-throughput sequencing and the dramatic increase in available genetic data, statistical modeling has become an essential part of the field of molecular evolution. Statistical modeling has led to many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to phylogenetic inference of species' evolutionary history. Among different types of genome sequences, protein-coding regions are particularly interesting due to their impact on proteins. The building blocks of proteins, i.e. amino acids, are coded by triples of nucleotides, known as codons. Accordingly, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. The current codon models can be classified into three principal groups: mechanistic codon models, empirical codon models and hybrid ones. The mechanistic models attract particular attention due to the clarity of their underlying biological assumptions and parameters. However, they suffer from simplifying assumptions that are required to overcome the burden of computational complexity. The main assumptions applied to the current mechanistic codon models are that (a) double and triple substitutions of nucleotides within codons are negligible, (b) there is no mutation variation among nucleotides of a single codon and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level. In this thesis, I develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the mentioned assumptions. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest one that holds all the assumptions to the most general one that relaxes all of them.
The models derived from the proposed framework allow me to investigate the biological plausibility of the three simplifying assumptions on real data sets and to find the model that is best aligned with the underlying characteristics of the data sets. Our experiments show that holding all three assumptions is not realistic for any of the real data sets. This means that using simple models that hold these assumptions can be misleading and can result in inaccurate parameter estimates. A second objective of the thesis is to develop a generalized mechanistic codon model that relaxes all three simplifying assumptions while remaining computationally efficient, using a matrix operation called the Kronecker product. Our experiments show that, on randomly chosen data sets, the proposed generalized mechanistic codon model outperforms the other codon models with respect to the AICc metric in about half of the data sets. In addition, I show through several experiments that the proposed general model is biologically plausible.
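The AICc comparison mentioned in this abstract can be sketched as below, using the standard small-sample correction of the Akaike information criterion. The abstract does not say how the sample size is counted; treating it as the number of alignment sites is an assumption, and the model names and numbers in the usage example are hypothetical.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected Akaike information criterion (lower is better).

    AICc = 2k - 2 ln L + 2k(k + 1) / (n - k - 1)
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def best_model(candidates, n_obs):
    """Return the name of the candidate with the lowest AICc.

    `candidates` maps a model name to (log_likelihood, n_params).
    """
    scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts, for illustration only.
    fits = {"simple": (-5120.4, 3), "general": (-5101.9, 12)}
    print(best_model(fits, n_obs=900))
```

The penalty term grows with the number of parameters, so a more general model is preferred only when its gain in likelihood outweighs its extra parameters.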
Abstract:
Background: Elevated levels of γ-glutamyl transferase (GGT) have been associated with subsequent risk of elevated blood pressure (BP), hypertension and diabetes. However, the causality of these relationships has not been addressed. Mendelian randomization refers to the random allocation of alleles at the time of gamete formation. Such allocation is expected to be independent of any behavioural and environmental factors (known or unknown), allowing the analysis of largely unconfounded risk associations that are not due to reverse causation. Methods: We performed a cross-sectional analysis among 4361 participants in the population-based CoLaus study. Associations of sex-specific GGT quartiles with systolic BP, diastolic BP and insulin levels were assessed using multivariable linear regression analyses. The rs2017869 GGT1 variant, which explained 1.6% of the variance in GGT levels, was used as an instrument to perform a Mendelian randomization analysis. Results: Median age of the study population was 53 years. After age and sex adjustment, GGT quartiles were strongly associated with systolic and diastolic BP (all p for linear trend < 0.0001). After multivariable adjustment, these relationships were significantly attenuated, but remained significant for systolic BP (β (95% CI) = 1.30 (0.32; 2.03), p = 0.007) and diastolic BP (β (95% CI) = 0.57 (0.02; 1.13), p = 0.04). Using Mendelian randomization, we observed no positive association of GGT with either systolic BP (β (95% CI) = -5.68 (-11.51; 0.16), p = 0.06) or diastolic BP (β (95% CI) = -2.24 (-5.98; 1.49), p = 0.24). The association of GGT with insulin was also attenuated after multivariable adjustment. Nevertheless, a strong linear trend persisted in the fully adjusted model (β (95% CI) = 0.07 (0.04; 0.09), p < 0.0001). Using Mendelian randomization, we observed a similar positive association of GGT with insulin (β (95% CI) = 0.19 (0.01; 0.37), p = 0.04).
Conclusion: In this study, we found evidence for a direct causal relationship between GGT and insulin, suggesting that oxidative stress may be causally implicated in the pathogenesis of type 2 diabetes mellitus.
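The logic of using a genetic variant as an instrument can be sketched with the Wald ratio estimator, which divides the variant-outcome slope by the variant-exposure slope. The abstract does not state which estimator the authors used, so this is an assumed stand-in, and the simulated data, effect sizes and function names are all hypothetical.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def wald_ratio(genotype, exposure, outcome):
    """Instrumental-variable estimate: (genotype -> outcome) / (genotype -> exposure)."""
    return ols_slope(genotype, outcome) / ols_slope(genotype, exposure)


def simulate(n=50000, true_effect=0.2, seed=1):
    """Simulate an exposure and outcome that share an unmeasured confounder."""
    rng = random.Random(seed)
    genotype, exposure, outcome = [], [], []
    for _ in range(n):
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele count 0-2
        u = rng.gauss(0.0, 1.0)                                # confounder
        x = 0.5 * g + u + rng.gauss(0.0, 1.0)
        y = true_effect * x + u + rng.gauss(0.0, 1.0)
        genotype.append(g)
        exposure.append(x)
        outcome.append(y)
    return genotype, exposure, outcome


if __name__ == "__main__":
    g, x, y = simulate()
    print("naive regression slope:", round(ols_slope(x, y), 3))
    print("Wald ratio estimate:   ", round(wald_ratio(g, x, y), 3))
```

In this simulation the naive regression of outcome on exposure is inflated by the confounder, while the Wald ratio stays near the simulated effect because the allele count is independent of the confounder.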
Abstract:
Understanding and anticipating biological invasions can focus either on traits that favour species invasiveness or on features of the receiving communities, habitats or landscapes that promote their invasibility. Here, we address invasibility at the regional scale, testing whether some habitats and landscapes are more invasible than others by fitting models that relate alien plant species richness to various environmental predictors. We use a multi-model information-theoretic approach to assess invasibility by modelling spatial and ecological patterns of alien invasion in landscape mosaics and testing competing hypotheses of environmental factors that may control invasibility. Because invasibility may be mediated by particular characteristics of invasiveness, we classified alien species according to their C-S-R plant strategies. We illustrate this approach with a set of 86 alien species in Northern Portugal. We first focus on predictors influencing species richness and expressing invasibility and then evaluate whether distinct plant strategies respond to the same or different groups of environmental predictors. We confirmed climate as a primary determinant of alien invasions and as a primary environmental gradient determining landscape invasibility. The effects of secondary gradients were detected only when the area was sub-sampled according to predictions based on the primary gradient. Then, multiple predictor types influenced patterns of alien species richness, with some types (landscape composition, topography and fire regime) prevailing over others. Alien species richness responded most strongly to extreme land management regimes, suggesting that intermediate disturbance induces biotic resistance by favouring native species richness. Land-use intensification facilitated alien invasion, whereas conservation areas hosted few invaders, highlighting the importance of ecosystem stability in preventing invasions. 
Plants with different strategies exhibited different responses to environmental gradients, particularly when the variations of the primary gradient were narrowed by sub-sampling. Such differential responses of plant strategies suggest using distinct control and eradication approaches for different areas and alien plant groups.
Abstract:
We examined phylogenetic relationships among six species representing three subfamilies, Glirinae, Graphiurinae and Leithiinae, with sequences from three nuclear protein-coding genes (apolipoprotein B, APOB; interphotoreceptor retinoid-binding protein, IRBP; recombination-activating gene 1, RAG1). Phylogenetic trees reconstructed from maximum-parsimony (MP), maximum-likelihood (ML) and Bayesian-inference (BI) analyses showed the monophyly of Glirinae (Glis and Glirulus) and Leithiinae (Dryomys, Eliomys and Muscardinus) with strong support, although the branch length maintaining this relationship was very short, implying rapid diversification among the three subfamilies. Divergence time estimates were calculated from ML (local clock model) and a Bayesian dating method using a calibration point of 25 Myr (million years) ago for the divergence between Glis and Glirulus, and 55 Myr ago for the split between lineages of Gliridae and Sciuridae on the basis of fossil records. The results showed that each lineage of Graphiurus, Glis, Glirulus and Muscardinus dates from the Late Oligocene to the Early Miocene period, which is mostly in agreement with fossil records. Taking into account that a warm climate harbouring a glirid-favoured forest dominated from Europe to Asia during this period, it is considered that this warm environment triggered the prosperity of the glirid species through rapid diversification. Glirulus japonicus is suggested to be a relict of this ancient diversification during the warm period.
Abstract:
Context: Until now, the testosterone/epitestosterone (T/E) ratio is the main marker for detection of testosterone (T) misuse in athletes. As this marker can be influenced by a number of confounding factors, additional steroid profile parameters indicating T misuse can provide substantiating evidence of doping with endogenous steroids. The evaluation of a steroid profile is currently based upon population statistics. Since large inter-individual variations exist, a paradigm shift towards subject-based references is ongoing in doping analysis. Objective: Proposition of new biomarkers for the detection of testosterone in sports using extensive steroid profiling and an adaptive model based upon Bayesian inference. Subjects: 6 healthy male volunteers were administered with testosterone undecanoate. Population statistics were performed upon steroid profiles from 2014 male Caucasian athletes participating in official sport competition. Design: An extended search for new biomarkers in a comprehensive steroid profile combined with Bayesian inference techniques as used in the Athlete Biological Passport resulted in a selection of additional biomarkers that may improve detection of testosterone misuse in sports. Results: Apart from T/E, 4 other steroid ratios (6α-OH-androstenedione/16α-OH-dehydroepiandrostenedione, 4-OH-androstenedione/16α-OH-androstenedione, 7α-OH-testosterone/7β-OH-dehydroepiandrostenedione and dihydrotestosterone/5β-androstane-3α,17β-diol) were identified as sensitive urinary biomarkers for T misuse. These new biomarkers were rated according to relative response, parameter stability, detection time and discriminative power. Conclusion: Newly selected biomarkers were found suitable for individual referencing within the concept of the Athlete's Biological Passport. The parameters showed improved detection time and discriminative power compared to the T/E ratio. Such biomarkers can support the evidence of doping with small oral doses of testosterone.
Abstract:
Both Bayesian networks and probabilistic evaluation are gaining more and more widespread use within many professional branches, including forensic science. Notwithstanding, they constitute subtle topics with definitional details that require careful study. While many sophisticated developments of probabilistic approaches to evaluation of forensic findings may readily be found in published literature, there remains a gap with respect to writings that focus on foundational aspects and on how these may be acquired by interested scientists new to these topics. This paper takes this as a starting point to report on the learning about Bayesian networks for likelihood ratio based, probabilistic inference procedures in a class of master's students in forensic science. The presentation uses an example that relies on a casework scenario drawn from published literature, involving a questioned signature. A complicating aspect of that case study, proposed to students in a teaching scenario, is the need to consider multiple competing propositions, a setting that may not readily be approached within a likelihood ratio based framework without drawing attention to some additional technical details. Using generic Bayesian network fragments from existing literature on the topic, course participants were able to track the probabilistic underpinnings of the proposed scenario correctly both in terms of likelihood ratios and of posterior probabilities. In addition, further study of the example by students allowed them to derive an alternative Bayesian network structure with a computational output that is equivalent to existing probabilistic solutions. This practical experience underlines the potential of Bayesian networks to support and clarify foundational principles of probabilistic procedures for forensic evaluation.
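The technical detail raised by multiple competing propositions can be illustrated with a minimal sketch. With more than two propositions, the likelihood ratio for one proposition against all the others depends on the prior probabilities of those others. The proposition labels and probabilities below are hypothetical and are not taken from the case study.

```python
def posterior(priors, likelihoods):
    """Posterior probability of each proposition given the findings (Bayes' theorem)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: value / total for h, value in joint.items()}


def lr_against_rest(target, priors, likelihoods):
    """Likelihood ratio of `target` versus the union of all other propositions.

    The denominator is a prior-weighted average of the other likelihoods,
    so this ratio changes when the priors of the alternatives change.
    """
    others = [h for h in priors if h != target]
    prior_rest = sum(priors[h] for h in others)
    like_rest = sum(priors[h] * likelihoods[h] for h in others) / prior_rest
    return likelihoods[target] / like_rest


if __name__ == "__main__":
    # Hypothetical values for three propositions about a questioned signature.
    priors = {"H1": 1 / 3, "H2": 1 / 3, "H3": 1 / 3}
    likelihoods = {"H1": 0.6, "H2": 0.3, "H3": 0.1}
    print(posterior(priors, likelihoods))
    print(lr_against_rest("H1", priors, likelihoods))
```

A pairwise ratio such as `likelihoods["H1"] / likelihoods["H2"]` needs no priors, which is why the two-proposition case is simpler to handle.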
Abstract:
Elevated levels of γ-glutamyltransferase (GGT) have been associated with elevated blood pressure (BP) and diabetes. However, the causality of these relations has not been addressed. The authors performed a cross-sectional analysis (2003-2006) among 4,360 participants from the population-based Cohorte Lausannoise (CoLaus) Study (Lausanne, Switzerland). The rs2017869 variant of the γ-glutamyltransferase 1 (GGT1) gene, which explained 1.6% of the variance in GGT levels, was used as an instrument for Mendelian randomization (MR). Sex-specific GGT quartiles were strongly associated with both systolic and diastolic BP (all P's < 0.0001). After multivariable adjustment, these relations were attenuated but remained significant. Using MR, the authors observed no positive association of GGT with BP (systolic: β = -5.68, 95% confidence interval (CI): -11.51, 0.16 (P = 0.06); diastolic: β = -2.24, 95% CI: -5.98, 1.49 (P = 0.24)). The association of GGT with insulin was also attenuated after multivariable adjustment but persisted in the fully adjusted model (β = 0.07, 95% CI: 0.04, 0.09; P < 0.0001). Using MR, the authors also observed a positive association of GGT with insulin (β = 0.19, 95% CI: 0.01, 0.37; P = 0.04). In conclusion, the authors found evidence for a direct causal relation of GGT with fasting insulin but not with BP.
Abstract:
Doping with natural steroids can be detected by evaluating the urinary concentrations and ratios of several endogenous steroids. Since these biomarkers of steroid doping are known to present large inter-individual variations, monitoring of individual steroid profiles over time allows switching from population-based towards subject-based reference ranges for improved detection. In an Athlete Biological Passport (ABP), biomarker data are collated throughout the athlete's sporting career and individual thresholds defined adaptively. For now, this approach has been validated on a limited number of markers of steroid doping, such as the testosterone (T) over epitestosterone (E) ratio to detect T misuse in athletes. Additional markers are required for other endogenous steroids like dihydrotestosterone (DHT) and dehydroepiandrosterone (DHEA). By combining comprehensive steroid profiles composed of 24 steroid concentrations with Bayesian inference techniques for longitudinal profiling, a selection was made for the detection of DHT and DHEA misuse. The biomarkers found were rated according to relative response, parameter stability, discriminative power, and maximal detection time. This analysis revealed DHT/E, DHT/5β-androstane-3α,17β-diol and 5α-androstane-3α,17β-diol/5β-androstane-3α,17β-diol as best biomarkers for DHT administration and DHEA/E, 16α-hydroxydehydroepiandrosterone/E, 7β-hydroxydehydroepiandrosterone/E and 5β-androstane-3α,17β-diol/5α-androstane-3α,17β-diol for DHEA. The selected biomarkers were found suitable for individual referencing. A drastic overall increase in sensitivity was obtained. The use of multiple markers as formalized in an Athlete Steroidal Passport (ASP) can provide firm evidence of doping with endogenous steroids. Copyright © 2010 John Wiley & Sons, Ltd.
Abstract:
With the increasing availability of various 'omics data, high-quality orthology assignment is crucial for evolutionary and functional genomics studies. We here present the fourth version of the eggNOG database (available at http://eggnog.embl.de) that derives nonsupervised orthologous groups (NOGs) from complete genomes, and then applies a comprehensive characterization and analysis pipeline to the resulting gene families. Compared with the previous version, we have more than tripled the underlying species set to cover 3686 organisms, keeping track with genome project completions while prioritizing the inclusion of high-quality genomes to minimize error propagation from incomplete proteome sets. Major technological advances include (i) a robust and scalable procedure for the identification and inclusion of high-quality genomes, (ii) provision of orthologous groups for 107 different taxonomic levels compared with 41 in eggNOGv3, (iii) identification and annotation of particularly closely related orthologous groups, facilitating analysis of related gene families, (iv) improvements of the clustering and functional annotation approach, (v) adoption of a revised tree building procedure based on the multiple alignments generated during the process and (vi) implementation of quality control procedures throughout the entire pipeline. As in previous versions, eggNOGv4 provides multiple sequence alignments and maximum-likelihood trees, as well as broad functional annotation. Users can access the complete database of orthologous groups via a web interface, as well as through bulk download.
Abstract:
A new, quantitative, inference model for environmental reconstruction (transfer function), based for the first time on the simultaneous analysis of multigroup species, has been developed. Quantitative reconstructions based on palaeoecological transfer functions provide a powerful tool for addressing questions of environmental change in a wide range of environments, from oceans to mountain lakes, and over a range of timescales, from decades to millions of years. Much progress has been made in the development of inferences based on multiple proxies but usually these have been considered separately, and the different numeric reconstructions compared and reconciled post-hoc. This paper presents a new method to combine information from multiple biological groups at the reconstruction stage. The aim of the multigroup work was to test the potential of the new approach to making improved inferences of past environmental change by improving upon current reconstruction methodologies. The taxonomic groups analysed include diatoms, chironomids and chrysophyte cysts. We test the new methodology using two cold-environment training-sets, namely mountain lakes from the Pyrenees and the Alps. The use of multiple groups, as opposed to single groupings, was only found to increase the reconstruction skill slightly, as measured by the root mean square error of prediction (leave-one-out cross-validation), in the case of alkalinity, dissolved inorganic carbon and altitude (a surrogate for air-temperature), but not for pH or dissolved CO2. Reasons why the improvement was less than might have been anticipated are discussed. These can include the different life-forms, environmental responses and reaction times of the groups under study.
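The skill measure named in this abstract, the root mean square error of prediction under leave-one-out cross-validation, can be sketched as follows. The abstract does not specify the underlying transfer function; a plain weighted-averaging model without deshrinking is assumed here purely as a stand-in, and all function names and toy data are hypothetical.

```python
import math


def wa_optima(abundances, env):
    """Abundance-weighted mean of the environmental variable for each taxon."""
    optima = []
    for k in range(len(abundances[0])):
        total = sum(row[k] for row in abundances)
        weighted = sum(row[k] * x for row, x in zip(abundances, env))
        # A taxon absent from the training set has no estimable optimum.
        optima.append(weighted / total if total > 0 else None)
    return optima


def wa_predict(sample, optima):
    """Reconstruct the environment as the abundance-weighted mean of taxon optima."""
    pairs = [(a, u) for a, u in zip(sample, optima) if u is not None and a > 0]
    total = sum(a for a, _ in pairs)
    if total == 0:
        raise ValueError("sample shares no taxa with the training set")
    return sum(a * u for a, u in pairs) / total


def loo_rmsep(abundances, env):
    """Leave-one-out RMSEP: refit without each lake, then predict that lake."""
    squared_errors = []
    for i in range(len(env)):
        train_y = abundances[:i] + abundances[i + 1:]
        train_x = env[:i] + env[i + 1:]
        prediction = wa_predict(abundances[i], wa_optima(train_y, train_x))
        squared_errors.append((prediction - env[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Toy training set: six lakes, three taxa (hypothetical counts and values).
    counts = [[10, 0, 0], [5, 5, 0], [0, 10, 0], [0, 5, 5], [0, 0, 10], [5, 0, 5]]
    variable = [1.0, 2.0, 3.0, 4.0, 5.0, 3.0]
    print(loo_rmsep(counts, variable))
```

Under this sketch, a multigroup reconstruction would amount to concatenating the taxon columns of several groups before fitting, although the paper's actual method of combining groups may differ.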
Abstract:
Brain-computer interfaces (BCIs) are becoming more and more popular as an input device for virtual worlds and computer games. Depending on their function, a major drawback is the mental workload associated with their use, and significant effort and training are required to control them effectively. In this paper, we present two studies assessing how the mental workload of a P300-based BCI affects participants' reported sense of presence in a virtual environment (VE). In the first study, we employ a BCI exploiting the P300 event-related potential (ERP) that allows control of over 200 items in a virtual apartment. In the second study, the BCI is replaced by a gaze-based selection method coupled with wand navigation. In both studies, overall performance is measured and individual presence scores are assessed by means of a short questionnaire. The results suggest that there is no immediate benefit for visualizing events in the VE triggered by the BCI and that no learning about the layout of the virtual space takes place. In order to alleviate this, we propose that future P300-based BCIs in VR are set up so as to require users to make some inference about the virtual space so that they become aware of it, which is likely to lead to higher reported presence.
Abstract:
Because natural selection is likely to act on multiple genes underlying a given phenotypic trait, we study here the potential effect of ongoing and past selection on the genetic diversity of human biological pathways. We first show that genes included in gene sets are generally under stronger selective constraints than other genes and that their evolutionary response is correlated. We then introduce a new procedure to detect selection at the pathway level based on a decomposition of the classical McDonald-Kreitman test extended to multiple genes. This new test, called 2DNS, detects outlier gene sets and takes into account past demographic effects and evolutionary constraints specific to gene sets. Selective forces acting on gene sets can be easily identified by a mere visual inspection of the position of the gene sets relative to their two-dimensional null distribution. We thus find several outlier gene sets that show signals of positive, balancing, or purifying selection but also others showing an ancient relaxation of selective constraints. The principle of the 2DNS test can also be applied to other genomic contrasts. For instance, the comparison of patterns of polymorphisms private to African and non-African populations reveals that most pathways show a higher proportion of nonsynonymous mutations in non-Africans than in Africans, potentially due to different demographic histories and selective pressures.
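For orientation, the classical single-gene McDonald-Kreitman test that this abstract builds on can be sketched as a 2x2 contingency analysis. This is not the 2DNS procedure itself, whose details the abstract does not give. The counts in the usage example are hypothetical, and the two-sided Fisher exact test is one common choice of significance test.

```python
from math import comb


def mk_test(dn, ds, pn, ps):
    """Classical McDonald-Kreitman test on a 2x2 table.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence)
    pn, ps: nonsynonymous and synonymous polymorphisms
    Returns (neutrality index, alpha, two-sided Fisher exact p-value).
    Assumes ps > 0 and dn > 0 so that the ratios are defined.
    """
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1.0 - neutrality_index  # estimated share of adaptive substitutions

    row_div, row_poly = dn + ds, pn + ps
    col_nonsyn = dn + pn
    total = row_div + row_poly

    def table_prob(a):
        # Hypergeometric probability of `a` nonsynonymous fixed differences
        # given fixed row and column totals.
        return comb(row_div, a) * comb(row_poly, col_nonsyn - a) / comb(total, col_nonsyn)

    p_observed = table_prob(dn)
    low = max(0, col_nonsyn - row_poly)
    high = min(row_div, col_nonsyn)
    # Two-sided p-value: sum over all tables no more probable than the observed one.
    p_value = sum(p for p in (table_prob(a) for a in range(low, high + 1))
                  if p <= p_observed * (1 + 1e-9))
    return neutrality_index, alpha, min(p_value, 1.0)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(mk_test(dn=8, ds=16, pn=2, ps=40))
```

A neutrality index below 1 points to an excess of nonsynonymous divergence, as expected under positive selection, and a value above 1 to an excess of nonsynonymous polymorphism.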
Abstract:
Monte Carlo simulations were used to generate data for ABAB designs of different lengths. The points of change in phase are randomly determined before gathering behaviour measurements, which allows the use of a randomization test as an analytic technique. Data simulation and analysis can be based either on data-division-specific or on common distributions. Following one method or another affects the results obtained after the randomization test has been applied. Therefore, the goal of the study was to examine these effects in more detail. The discrepancies in these approaches are obvious when data with zero treatment effect are considered and such approaches have implications for statistical power studies. Data-division-specific distributions provide more detailed information about the performance of the statistical technique.
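A randomization test for an ABAB design with randomly chosen phase-change points can be sketched as follows. The test statistic (mean of B phases minus mean of A phases), the minimum phase length and the toy data are assumptions; the abstract does not state the exact choices made in the study.

```python
from itertools import combinations


def admissible_designs(n_points, min_len):
    """All triplets of change points giving four phases of at least min_len points."""
    designs = []
    for c1, c2, c3 in combinations(range(1, n_points), 3):
        phase_lengths = (c1, c2 - c1, c3 - c2, n_points - c3)
        if min(phase_lengths) >= min_len:
            designs.append((c1, c2, c3))
    return designs


def ab_difference(data, change_points):
    """Mean of the two B phases minus mean of the two A phases."""
    c1, c2, c3 = change_points
    a_values = data[:c1] + data[c2:c3]
    b_values = data[c1:c2] + data[c3:]
    return sum(b_values) / len(b_values) - sum(a_values) / len(a_values)


def randomization_p_value(data, actual_change_points, min_len=3):
    """One-sided p-value: share of admissible designs at least as extreme as the actual one."""
    designs = admissible_designs(len(data), min_len)
    observed = ab_difference(data, actual_change_points)
    extreme = sum(1 for d in designs if ab_difference(data, d) >= observed - 1e-12)
    return extreme / len(designs)


if __name__ == "__main__":
    # Toy series of 16 measurements with a clear treatment effect in the B phases.
    series = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
    print(len(admissible_designs(16, 3)))
    print(randomization_p_value(series, (4, 8, 12)))
```

The smallest attainable p-value is one over the number of admissible designs, so series length and minimum phase length directly limit the power of the test.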
Abstract:
BACKGROUND: Smoking is an important cardiovascular disease risk factor, but the mechanisms linking smoking to blood pressure are poorly understood. METHODS AND RESULTS: Data on 141 317 participants (62 666 never, 40 669 former, 37 982 current smokers) from 23 population-based studies were included in observational and Mendelian randomization meta-analyses of the associations of smoking status and smoking heaviness with systolic and diastolic blood pressure, hypertension, and resting heart rate. For the Mendelian randomization analyses, a genetic variant rs16969968/rs1051730 was used as a proxy for smoking heaviness in current smokers. In observational analyses, current as compared with never smoking was associated with lower systolic blood pressure and diastolic blood pressure and lower hypertension risk, but with higher resting heart rate. In observational analyses among current smokers, 1 cigarette/day higher level of smoking heaviness was associated with higher (0.21 bpm; 95% confidence interval 0.19; 0.24) resting heart rate and slightly higher diastolic blood pressure (0.05 mm Hg; 95% confidence interval 0.02; 0.08) and systolic blood pressure (0.08 mm Hg; 95% confidence interval 0.03; 0.13). However, in Mendelian randomization analyses among current smokers, although each smoking increasing allele of rs16969968/rs1051730 was associated with higher resting heart rate (0.36 bpm/allele; 95% confidence interval 0.18; 0.54), there was no strong association with diastolic blood pressure, systolic blood pressure, or hypertension. This would suggest a 7 bpm higher heart rate in those who smoke 20 cigarettes/day. CONCLUSIONS: This Mendelian randomization meta-analysis supports a causal association of smoking heaviness with higher level of resting heart rate, but not with blood pressure. These findings suggest that part of the cardiovascular risk of smoking may operate through increasing resting heart rate.