940 results for data-driven modelling
Abstract:
With the trend in molecular epidemiology towards both genome-wide association studies and complex modelling, the need for large sample sizes to detect small effects and to allow for the estimation of many parameters within a model continues to increase. Unfortunately, most methods of association analysis have been restricted to either a family-based or a case-control design, resulting in a lack of synthesis of data from multiple studies. Transmission disequilibrium-type methods for detecting linkage disequilibrium from family data were developed as an effective way of preventing the detection of association due to population stratification. Because these methods condition on parental genotype, however, they have precluded the joint analysis of family and case-control data, although methods for case-control data may not protect against population stratification and do not allow for familial correlations. We present here an extension of a family-based association analysis method for continuous traits that will simultaneously test for, and if necessary control for, population stratification. We further extend this method to analyse binary traits (and therefore family and case-control data together) and to estimate genetic effects in the population accurately, even when using an ascertained family sample. Finally, we present the power of this binary extension for both family-only and joint family and case-control data, and demonstrate the accuracy of the association parameter and variance components in an ascertained family sample.
Abstract:
Species distribution modelling is central to both fundamental and applied research in biogeography. Despite widespread use of models, there are still important conceptual ambiguities as well as biotic and algorithmic uncertainties that need to be investigated in order to increase confidence in model results. We identify and discuss five areas of enquiry that are of high importance for species distribution modelling: (1) clarification of the niche concept; (2) improved designs for sampling data for building models; (3) improved parameterization; (4) improved model selection and predictor contribution; and (5) improved model evaluation. The challenges discussed in this essay do not preclude the need for developments of other areas of research in this field. However, they are critical for allowing the science of species distribution modelling to move forward.
Abstract:
We consider systems described by nonlinear stochastic differential equations with multiplicative noise. We study the relaxation time of the steady-state correlation function as a function of noise parameters. We consider the white- and nonwhite-noise case for a prototype model for which numerical data are available. We discuss the validity of analytical approximation schemes. For the white-noise case we discuss the results of a projector-operator technique. This discussion allows us to give a generalization of the method to the non-white-noise case. Within this generalization, we account for the growth of the relaxation time as a function of the correlation time of the noise. This behavior is traced back to the existence of a non-Markovian term in the equation for the correlation function.
Abstract:
Summary: Ecotones are sensitive to change because they contain high numbers of species living at the margin of their environmental tolerance. This is equally true of tree-lines, which are determined by altitudinal or latitudinal temperature gradients. In the current context of climate change, they are expected to undergo modifications in position, tree biomass and possibly species composition. Altitudinal and latitudinal tree-lines differ mainly in the steepness of the underlying temperature gradient: distances are larger at latitudinal tree-lines, which could have an impact on the ability of tree species to migrate in response to climate change. Aside from temperature, tree-lines are also affected on a more local level by pressure from human activities. These are also changing as a consequence of modifications in our societies and may interact with the effects of climate change. Forest dynamics models are often used for climate change simulations because of their mechanistic processes. The spatially-explicit model TreeMig was used as a base to develop a model specifically tuned for the northern European and Alpine tree-line ecotones. For the latter, a module for land-use change processes was also added. The temperature response parameters for the species in the model were first calibrated by means of tree-ring data from various species and sites at both tree-lines. This improved the growth response function in the model, but also led to the conclusion that regeneration is probably more important than growth for controlling tree-line position and species' distributions. The second step was to implement the module for abandonment of agricultural land in the Alps, based on an existing spatial statistical model. The sensitivity of its most important variables was tested and the model's performance compared to other modelling approaches.
The probability that agricultural land would be abandoned was strongly influenced by the distance from the nearest forest and the slope, both of which are proxies for cultivation costs. When applied to a case study area, the resulting model, named TreeMig-LAb, gave the most realistic results. These were consistent with observed consequences of land-abandonment such as the expansion of the existing forest and closing up of gaps. This new model was then applied in two case study areas, one in the Swiss Alps and one in Finnish Lapland, under a variety of climate change scenarios. These were based on forecasts of temperature change over the next century by the IPCC and the HadCM3 climate model (ΔT: +1.3, +3.5 and +5.6 °C) and included a post-change stabilisation period of 300 years. The results showed radical disruptions at both tree-lines. With the most conservative climate change scenario, species' distributions simply shifted, but it took several centuries to reach a new equilibrium. With the more extreme scenarios, some species disappeared from our study areas (e.g. Pinus cembra in the Alps) or dwindled to very low numbers, as they ran out of land into which they could migrate. The most striking result was the lag in the response of most species, independently of the climate change scenario or tree-line type considered. Finally, a statistical model of the effect of reindeer (Rangifer tarandus) browsing on the growth of Pinus sylvestris was developed, as a first step towards implementing human impacts at the boreal tree-line. The expected effect was an indirect one, as reindeer deplete the ground lichen cover, thought to protect the trees against adverse climate conditions. The model showed a small but significant effect of browsing, but as the link with the underlying climate variables was unclear and the model was not spatial, it was not usable as such.
Developing the TreeMig-LAb model made it possible to: a) establish a method for deriving species' parameters for the growth equation from tree-rings, b) highlight the importance of regeneration in determining tree-line position and species' distributions and c) improve the integration of social sciences into landscape modelling. Applying the model at the Alpine and northern European tree-lines under different climate change scenarios showed that with most forecasted levels of temperature increase, tree-lines would suffer major disruptions, with shifts in distributions and potential extinction of some tree-line species. However, these responses showed strong lags, so these effects would not become apparent for decades and could take centuries to stabilise. Résumé: Ecotones are sensitive to change because of the high number of species living there at the limit of their environmental tolerance. This also applies to tree-lines defined by altitudinal and latitudinal temperature gradients. In the current context of climate change, they are expected to undergo changes in their position, in tree biomass and possibly in the species that compose them. Altitudinal and latitudinal tree-lines differ mainly in the steepness of the underlying temperature gradients: distances are larger at latitudinal tree-lines, which could affect the ability of species to migrate in response to climate change. In addition to temperature, the tree-line is also influenced at a more local level by pressures from human activities. These are also changing as a result of changes in our societies and may interact with the effects of climate change. Forest dynamics models are often used to simulate the effects of climate change because they are based on process modelling.
The spatially explicit model TreeMig was used as a basis for developing a model specially adapted to the tree-line in northern Europe and in the Alps. For the latter, a module for simulating land-use changes was also added. First, the parameters of the temperature response curve for the species included in the model were calibrated using dendrochronological data for various species and sites in both ecotones. This improved the model's growth curve, but also led to the conclusion that regeneration is probably more decisive than growth for the position of the tree-line and the distribution of species. The second step was to implement the module for abandonment of agricultural land in the Alps, based on an existing spatial statistical model. The sensitivity of the model's most important variables was tested and its performance compared with other modelling approaches. The probability that a parcel would be abandoned was strongly influenced by the distance to the nearest forest and by the slope, both of which are proxies for cultivation costs. When applied to a real case, the new model, named TreeMig-LAb, gave the most realistic results. These were comparable to the already observed consequences of agricultural land abandonment, such as the expansion of existing forests and the closing of clearings. This new model was then applied in two study areas, one in the Swiss Alps and the other in Finnish Lapland, under various climate change scenarios.
These were based on the forecasts of temperature change for the next century established by the IPCC and the HadCM3 climate model (ΔT: +1.3, +3.5 and +5.6 °C) and included a post-change stabilisation period of 300 years. The results showed major disruptions at both types of tree-line. Under the least extreme climate change scenario, species' distributions simply shifted, but it took several centuries for them to reach a new equilibrium. Under the other scenarios, some species disappeared from the study area (e.g. Pinus cembra in the Alps) or saw their populations decline because there was no longer enough land available into which they could migrate. The most striking result was the lag in the response of most species, regardless of the climate change scenario used or the type of tree-line. Finally, a statistical model of the effect of browsing by reindeer (Rangifer tarandus) on the growth of Pinus sylvestris was developed as a first step towards implementing human impacts on the boreal tree-line. The expected effect was indirect, since reindeer reduce the ground lichen cover, which is expected to protect against harsh climatic conditions. The model revealed a modest but significant effect, but since the link with the underlying climate variables was unclear and the model was not applied spatially, it was not usable as such.
Developing the TreeMig-LAb model made it possible to: a) establish a method for deriving the species-specific parameters of the growth equation from dendrochronological data, b) highlight the importance of regeneration for the position of the tree-line and the distribution of species and c) improve the integration of social sciences into landscape models. Applying the model to the Alpine and northern European tree-lines under different climate change scenarios showed that, with most forecast levels of temperature increase, the tree-line would undergo major disruptions, with range shifts and the potential extinction of some species. However, these responses showed substantial lags, so that these effects would not be visible for decades and could take several centuries to stabilise.
Abstract:
An equation for mean first-passage times of non-Markovian processes driven by colored noise is derived through an appropriate backward integro-differential equation. The equation is solved in a Bourret-like approximation. In a weak-noise bistable situation, non-Markovian effects are taken into account by an effective diffusion coefficient. In this situation, our results compare satisfactorily with other approaches and experimental data.
Abstract:
Objective: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations with quantitative traits of NAFLD-related phenotypes. Research Design and Methods: By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression, a protein-protein interaction network was constructed, and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs), which captured all common variation in these genes, were genotyped in 10,196 Danes and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D), central obesity, and WHO-defined metabolic syndrome (MetS). Results: 273 genes were included in the protein-protein interaction analysis, and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominally statistically significant associations (P<0.05) with quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations. Conclusions: Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.
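The multiple-testing step mentioned in the results can be illustrated with a minimal sketch. The p-values below are hypothetical and are not taken from the study; the function only applies the standard Bonferroni rule, in which each p-value is compared with the significance level divided by the number of tests.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that remain significant after Bonferroni correction.

    A test passes only if its p-value is below alpha / (number of tests).
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return [p < threshold for p in p_values]


# Hypothetical p-values (illustrative only): each is nominally
# significant at P < 0.05, but none survives the corrected threshold.
example_p = [0.020, 0.030, 0.044, 0.015]
flags = bonferroni_significant(example_p)
```

With four tests the corrected threshold is 0.0125, so every nominally significant result in this toy example is negated, which mirrors the pattern the abstract describes.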
Abstract:
The present research deals with an application of artificial neural networks to multitask learning from spatial environmental data. The real case study (sediment contamination of Lake Geneva) consists of 8 pollutants. There are different relationships between these variables, from linear correlations to strong nonlinear dependencies. The main idea is to construct subsets of pollutants which can be efficiently modeled together within the multitask framework. The proposed two-step approach is based on: 1) the criterion of nonlinear predictability of each variable k, assessed by analyzing all possible models composed from the rest of the variables, using a General Regression Neural Network (GRNN) as the model; 2) multitask learning of the best model using a multilayer perceptron, followed by spatial predictions. The results of the study are analyzed using both machine learning and geostatistical tools.
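The first step of the approach above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it treats a GRNN as a Gaussian-kernel weighted average of training targets, scores each subset of the remaining variables by leave-one-out mean squared error, and uses a fixed kernel width. The function names, the `sigma` value and the leave-one-out criterion are all assumptions made for the example.

```python
import math
from itertools import combinations


def grnn_predict(train_x, train_y, query, sigma):
    """GRNN prediction: Gaussian-kernel weighted average of the targets."""
    num = 0.0
    den = 0.0
    for x, y in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        num += w * y
        den += w
    # Fall back to the plain mean if all weights underflow to zero.
    return num / den if den > 0.0 else sum(train_y) / len(train_y)


def loo_mse(xs, ys, sigma):
    """Leave-one-out mean squared error (assumed predictability score)."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (grnn_predict(rest_x, rest_y, xs[i], sigma) - ys[i]) ** 2
    return total / len(xs)


def best_predictor_subset(data, target, sigma=0.5):
    """Try every non-empty subset of the other variables as GRNN inputs.

    data maps a variable name to its list of measurements.
    Returns (best_subset, best_loo_mse).
    """
    others = [name for name in data if name != target]
    ys = data[target]
    best = None
    for r in range(1, len(others) + 1):
        for subset in combinations(others, r):
            xs = [tuple(data[k][i] for k in subset) for i in range(len(ys))]
            score = loo_mse(xs, ys, sigma)
            if best is None or score < best[1]:
                best = (subset, score)
    return best


# Toy data (hypothetical): "t" depends on "a" and is unrelated to "b".
toy = {
    "t": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "a": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
}
subset, score = best_predictor_subset(toy, "t")
```

With 8 pollutants, each target has 127 candidate subsets of the other 7 variables, so an exhaustive search of this kind remains cheap; in practice the kernel width would be tuned rather than fixed.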
Abstract:
Aims: To assess the potential distribution of an obligate seeder and active pyrophyte, Cistus salviifolius, a vulnerable species in the Swiss Red List; to derive scenarios by changing the fire return interval; and to discuss the results from a conservation perspective. A more general aim is to assess the impact of fire as a natural factor influencing the vegetation of the southern slopes of the Alps. Locations: Alps, southern Switzerland. Methods: Presence-absence data to fit the model were obtained from the most recent field mapping of C. salviifolius. The quantitative environmental predictors used in this study include topographic, climatic and disturbance (fire) predictors. Models were fitted by logistic regression and evaluated by jackknife and bootstrap approaches. Changes in fire regime were simulated by increasing the time-return interval of fire (simulating longer periods without fire). Two scenarios were considered: no fire in the past 15 years; or in the past 35 years. Results: Rock cover, slope, topographic position, potential evapotranspiration and time elapsed since the last fire were selected in the final model. The Nagelkerke R² of the model for C. salviifolius was 0.57 and the jackknife area under the curve evaluation was 0.89. The bootstrap evaluation revealed model robustness. By increasing the return interval of fire by either up to 15 years, or 35 years, the modelled C. salviifolius population declined by 30-40%, respectively. Main conclusions: Although fire plays a significant role, topography and rock cover appear to be the most important predictors, suggesting that the distribution of C. salviifolius in the southern Swiss Alps is closely related to the availability of supposedly competition-free sites, such as emerging bedrock, ridge locations or steep slopes.
Fire is more likely to play a secondary role in allowing C. salviifolius to extend its occurrence temporarily, by increasing germination rates and reducing the competition from surrounding vegetation. To maintain a viable dormant seed bank for C. salviifolius, conservation managers should consider carrying out vegetation clearing and managing wild fire propagation to reduce competition and ensure sufficient recruitment for this species.
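For readers unfamiliar with the Nagelkerke statistic reported above, the following sketch shows how it is computed from the log-likelihoods of a fitted logistic model and an intercept-only model. The numeric inputs are hypothetical and do not come from the study; only the formula itself is standard.

```python
import math


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke R^2 from two log-likelihoods and the sample size n.

    The Cox-Snell R^2 is 1 - exp(2 * (LL_null - LL_model) / n); it is
    rescaled by its maximum attainable value, 1 - exp(2 * LL_null / n).
    """
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for 200 presence-absence records.
r2 = nagelkerke_r2(loglik_null=-138.0, loglik_model=-85.0, n=200)
```

The rescaling is what allows the statistic to reach 1 for a perfectly fitting model, which the unscaled Cox-Snell version cannot do for binary outcomes.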
Abstract:
Sequencing of pools of individuals (Pool-Seq) represents a reliable and cost-effective approach for estimating genome-wide SNP and transposable element insertion frequencies. However, Pool-Seq does not provide direct information on haplotypes so that, for example, obtaining inversion frequencies has not been possible until now. Here, we have developed a new set of diagnostic marker SNPs for seven cosmopolitan inversions in Drosophila melanogaster that can be used to infer inversion frequencies from Pool-Seq data. We applied our novel marker set to Pool-Seq data from an experimental evolution study and from North American and Australian latitudinal clines. In the experimental evolution data, we find evidence that positive selection has driven the frequencies of In(3R)C and In(3R)Mo to increase over time. In the clinal data, we confirm the existence of frequency clines for In(2L)t, In(3L)P and In(3R)Payne in both North America and Australia and detect a previously unknown latitudinal cline for In(3R)Mo in North America. The inversion markers developed here provide a versatile and robust tool for characterizing inversion frequencies and their dynamics in Pool-Seq data from diverse D. melanogaster populations.
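The idea of reading inversion frequencies from diagnostic marker SNPs can be sketched as below. This is an illustrative assumption rather than the authors' documented procedure: each marker contributes the fraction of reads carrying its inversion-specific allele, and the inversion frequency is taken as the mean over markers with non-zero coverage. The function name and the read counts are hypothetical.

```python
def inversion_frequency(marker_counts):
    """Estimate an inversion frequency from pooled read counts.

    marker_counts is a list of (inversion_allele_reads, total_reads)
    pairs, one per diagnostic marker SNP. Markers with zero coverage
    are skipped; None is returned if no marker is covered.
    """
    freqs = [alt / total for alt, total in marker_counts if total > 0]
    if not freqs:
        return None
    return sum(freqs) / len(freqs)


# Hypothetical counts for three markers of one inversion in one pool.
pool = [(10, 100), (30, 100), (0, 0)]
estimate = inversion_frequency(pool)
```

Averaging over several markers dampens the sampling noise of any single SNP, which matters because each position in a pool is covered by only a limited number of reads.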
Abstract:
Remote sensing using airborne imaging spectroscopy (AIS) is known to retrieve fundamental optical properties of ecosystems. However, the value of these properties for predicting plant species distribution remains unclear. Here, we assess whether such data can add value to topographic variables for predicting plant distributions in French and Swiss alpine grasslands. We fitted statistical models with high spectral and spatial resolution reflectance data and tested four optical indices sensitive to leaf chlorophyll content, leaf water content and leaf area index. We found moderate added-value of AIS data for predicting alpine plant species distribution. Contrary to expectations, differences between species distribution models (SDMs) were not linked to their local abundance or phylogenetic/functional similarity. Moreover, spectral signatures of species were found to be partly site-specific. We discuss current limits of AIS-based SDMs, highlighting issues of scale and informational content of AIS data.
Abstract:
This research describes the process followed to assemble a "Social Accounting Matrix" for Spain corresponding to the year 2000 (SAMSP00). As argued in the paper, this process attempts to reconcile ESA95 conventions with the requirements of applied general equilibrium modelling. In particular, problems related to the level of aggregation of net taxation data, and to the valuation system used for expressing the monetary value of input-output transactions, have received special attention. Since the adoption of ESA95 conventions, input-output transactions have preferably been valued at basic prices, which imposes additional difficulties on modellers interested in computing applied general equilibrium models. This paper addresses these difficulties by developing a procedure that allows SAM-builders to change the valuation system of input-output transactions conveniently. In addition, this procedure produces new data related to net taxation information.
Abstract:
Studies of hybrid zones can inform our understanding of reproductive isolation and speciation. Two species of brown lemur (Eulemur rufifrons and E. cinereiceps) form an apparently stable hybrid zone in the Andringitra region of south-eastern Madagascar. The aim of this study was to identify factors that contribute to this stability. We sampled animals at 11 sites along a 90-km transect through the hybrid zone and examined variation in 26 microsatellites, the D-loop region of mitochondrial DNA, six pelage and nine morphological traits; we also included samples collected in more distant allopatric sites. Clines in these traits were noncoincident, and there was no increase in either inbreeding coefficients or linkage disequilibrium at the centre of the zone. These results could suggest that the hybrid zone is maintained by weak selection against hybrids, conforming to either the tension zone or geographical selection-gradient model. However, a closer examination of clines in pelage and microsatellites indicates that these clines are not sigmoid or stepped in shape but instead plateau at their centre. Sites within the hybrid zone also occur in a distinct habitat, characterized by greater seasonality in precipitation and lower seasonality in temperature. Together, these findings suggest that the hybrid zone may follow the bounded superiority model, with exogenous selection favouring hybrids within the transitional zone. These findings are noteworthy, as examples supporting the bounded superiority model are rare and may indicate a process of ecologically driven speciation without geographical isolation.
Abstract:
Aim: Climatic niche modelling of species and community distributions implicitly assumes strong and constant climatic determinism across geographic space. This assumption, however, had never been tested. We tested it by assessing how stacked-species distribution models (S-SDMs) perform for predicting plant species assemblages along elevation. Location: Western Swiss Alps. Methods: Using robust presence-absence data, we first assessed the ability of topo-climatic S-SDMs to predict plant assemblages in a study area encompassing a 2800 m wide elevation gradient. We then assessed the relationships among several evaluation metrics and trait-based tests of community assembly rules. Results: The standard errors of individual SDMs decreased significantly towards higher elevations. Overall, the S-SDMs overpredicted richness far more than they underpredicted it and could not reproduce the humpback curve along elevation. Overprediction was greater at low and mid-range elevations in absolute values but greater at high elevations when standardised by the actual richness. Looking at species composition, the evaluation metrics accounting for both the presence and absence of species (overall prediction success and kappa) or focusing on correctly predicted absences (specificity) increased with increasing elevation, while the metrics focusing on correctly predicted presences (Jaccard index and sensitivity) decreased. The best overall evaluation, as driven by specificity, occurred at high elevation where species assemblages were shown to be under significant environmental filtering of small plants. In contrast, the decreased overall accuracy in the lowlands was associated with functional patterns representing any type of assembly rule (environmental filtering, limiting similarity or null assembly). Main Conclusions: Our study reveals interesting patterns of change in S-SDM errors with changes in assembly rules along elevation.
Yet, significant levels of assemblage prediction errors occurred throughout the gradient, calling for further improvement of SDMs, e.g., by adding key environmental filters that act at fine scales and developing approaches to account for variations in the influence of predictors along environmental gradients.
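The evaluation metrics named in the results can be made concrete with a short sketch. It computes them for a single site from observed and predicted presence-absence vectors using their standard confusion-matrix definitions; the function name and the example vectors are hypothetical and are not data from the study.

```python
def assemblage_metrics(observed, predicted):
    """Compare observed and predicted presence-absence (1/0) vectors."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # correct presences
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # overpredictions
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # underpredictions
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # correct absences
    n = tp + fp + fn + tn

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    success = ratio(tp + tn, n)
    # Agreement expected by chance, needed for Cohen's kappa.
    chance = ratio((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), n * n)
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, fp + tn),
        "overall_success": success,
        "jaccard": ratio(tp, tp + fp + fn),
        "kappa": ratio(success - chance, 1.0 - chance),
        "richness_error": (tp + fp) - (tp + fn),  # predicted minus observed
    }


# Hypothetical site with 8 candidate species.
obs = [1, 1, 0, 0, 0, 0, 1, 0]
pred = [1, 0, 1, 0, 0, 0, 1, 1]
metrics = assemblage_metrics(obs, pred)
```

Because specificity and overall success reward correctly predicted absences, they can rise in species-poor, high-elevation sites even while sensitivity and the Jaccard index fall, which is the contrast the abstract reports.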
Abstract:
Depth-averaged velocities and unit discharges within a 30 km reach of one of the world's largest rivers, the Rio Parana, Argentina, were simulated using three hydrodynamic models with different process representations: a reduced complexity (RC) model that neglects most of the physics governing fluid flow, a two-dimensional model based on the shallow water equations, and a three-dimensional model based on the Reynolds-averaged Navier-Stokes equations. Flow characteristics simulated using all three models were compared with data obtained by acoustic Doppler current profiler surveys at four cross sections within the study reach. This analysis demonstrates that, surprisingly, the performance of the RC model is generally equal to, and in some instances better than, that of the physics-based models in terms of the statistical agreement between simulated and measured flow properties. In addition, in contrast to previous applications of RC models, the present study demonstrates that the RC model can successfully predict measured flow velocities. The strong performance of the RC model reflects, in part, the simplicity of the depth-averaged mean flow patterns within the study reach and the dominant role of channel-scale topographic features in controlling the flow dynamics. Moreover, the very low water surface slopes that typify large sand-bed rivers enable flow depths to be estimated reliably in the RC model using a simple fixed-lid planar water surface approximation. This approach overcomes a major problem encountered in the application of RC models in environments characterised by shallow flows and steep bed gradients. The RC model is four orders of magnitude faster than the physics-based models when performing steady-state hydrodynamic calculations. However, the iterative nature of the RC model calculations implies a reduction in computational efficiency relative to some other RC models.
A further implication of this is that, if used to simulate channel morphodynamics, the present RC model may offer only a marginal advantage in terms of computational efficiency over approaches based on the shallow water equations. These observations illustrate the trade-off between model realism and efficiency that is a key consideration in RC modelling. Moreover, this outcome highlights a need to rethink the use of RC morphodynamic models in fluvial geomorphology and to move away from existing grid-based approaches, such as the popular cellular automata (CA) models, that remain essentially reductionist in nature. In the case of the world's largest sand-bed rivers, this might be achieved by implementing the RC model outlined here as one element within a hierarchical modelling framework that would enable computationally efficient simulation of the morphodynamics of large rivers over millennial time scales. © 2012 Elsevier B.V. All rights reserved.