946 results for Predictive Models


Abstract:

Understanding adaptive genetic responses to climate change is a main challenge for preserving biological diversity. Successful predictive models for climate-driven range shifts of species depend on the integration of information on adaptation, including that derived from genomic studies. Long-lived forest trees can experience substantial environmental change across generations, which results in a much more prominent adaptation lag than in annual species. Here, we show that candidate-gene SNPs (single nucleotide polymorphisms) can be used as predictors of maladaptation to climate in maritime pine (Pinus pinaster Aiton), an outcrossing long-lived keystone tree. A set of 18 SNPs potentially associated with climate, 5 of them involving amino acid-changing variants, were retained after performing logistic regression, latent factor mixed models, and Bayesian analyses of SNP-climate correlations. These relationships identified temperature as an important adaptive driver in maritime pine and highlighted that selective forces are operating differentially in geographically discrete gene pools. The frequency of the locally advantageous alleles at these selected loci was strongly correlated with survival in a common garden under extreme (hot and dry) climate conditions, which suggests that candidate-gene SNPs can be used to forecast the likely destiny of natural forest ecosystems under climate change scenarios. Differential levels of forest decline are anticipated for distinct maritime pine gene pools. Geographically defined molecular proxies for climate adaptation will thus critically enhance the predictive power of range-shift models and help establish mitigation measures for long-lived keystone forest trees in the face of impending climate change.

Abstract:

Digital information creates the possibility of a high degree of redundancy in the data available for fitting predictive models used for Digital Soil Mapping (DSM). Among these models, the Decision Tree (DT) technique has been increasingly applied because of its capacity to deal with large datasets. The purpose of this study was to evaluate the impact of the data volume used to generate the DT models on the quality of soil maps. An area of 889.33 km² was chosen in the northern region of the State of Rio Grande do Sul. The soil-landscape relationship was obtained from reambulation of the studied area and the alignment of the units in the 1:50,000 scale topographic mapping. Six predictive covariates linked to the soil formation factors of relief and organisms, together with data sets of 1, 3, 5, 10, 15, 20 and 25 % of the total data volume, were used to generate the predictive DT models in the data mining program Waikato Environment for Knowledge Analysis (WEKA). In this study, sample densities below 5 % resulted in models with less power to capture the complexity of the spatial distribution of the soil in the study area. The relation between the data volume to be handled and the predictive capacity of the models was best for samples between 5 and 15 %. For the models based on these sample densities, the collected field data indicated an accuracy of predictive mapping close to 70 %.

Abstract:

Human-induced habitat fragmentation constitutes a major threat to biodiversity. Both genetic and demographic factors combine to drive small and isolated populations into extinction vortices. Nevertheless, the deleterious effects of inbreeding and drift load may depend on population structure, migration patterns, and mating systems and are difficult to predict in the absence of crossing experiments. We performed stochastic individual-based simulations aimed at predicting the effects of deleterious mutations on population fitness (offspring viability and median time to extinction) under a variety of settings (landscape configurations, migration models, and mating systems) on the basis of easy-to-collect demographic and genetic information. Pooling all simulations, a large part (70%) of variance in offspring viability was explained by a combination of genetic structure (F(ST)) and within-deme heterozygosity (H(S)). A similar part of variance in median time to extinction was explained by a combination of local population size (N) and heterozygosity (H(S)). In both cases the predictive power increased above 80% when information on mating systems was available. These results provide robust predictive models to evaluate the viability prospects of fragmented populations.

Abstract:

Activity monitors based on accelerometry are used to predict the speed and energy cost of walking at 0% slope, but not at other inclinations. Parallel measurements of body accelerations and altitude variation were studied to determine whether walking speed prediction could be improved. Fourteen subjects walked twice along a 1.3 km circuit with substantial slope variations (-17% to +17%). The parameters recorded were body acceleration using a uni-axial accelerometer, altitude variation using differential barometry, and walking speed using satellite positioning (DGPS). Linear regressions were calculated between acceleration and walking speed, and between acceleration/altitude and walking speed. These predictive models, calculated using the data from the first circuit run, were used to predict speed during the second circuit. Finally the predicted velocity was compared with the measured one. The result was that acceleration alone failed to predict speed (mean r = 0.4). Adding altitude variation improved the prediction (mean r = 0.7). With regard to the altitude/acceleration-speed relationship, substantial inter-individual variation was found. It is concluded that accelerometry, combined with altitude measurement, can assess position variations of humans provided inter-individual variation is taken into account. It is also confirmed that DGPS can be used for outdoor walking speed measurements, opening up new perspectives in the field of biomechanics.
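The train-on-first-lap, predict-on-second-lap procedure described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the function names `fit_linear` and `predict` are hypothetical, and the abstract does not say how the regressions were actually computed.

```python
def fit_linear(rows, targets):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    X = [[1.0, *r] for r in rows]
    n = len(X[0])
    # Build A = X^T X and b = X^T y.
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - s) / A[i][i]
    return coef


def predict(coef, row):
    """Apply the fitted intercept and slopes to one observation."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Synthetic first-lap data (assumed): (acceleration, altitude variation) -> speed.
lap1_inputs = [(1.0, 0.0), (1.2, 0.1), (0.8, -0.1), (1.1, -0.05), (0.9, 0.15)]
lap1_speed = [1.0 + 0.5 * a - 2.0 * s for a, s in lap1_inputs]

coef = fit_linear(lap1_inputs, lap1_speed)

# The model fitted on lap 1 is then applied to (synthetic) lap-2 measurements.
lap2_inputs = [(1.05, 0.02), (0.95, -0.08)]
predicted = [predict(coef, row) for row in lap2_inputs]
```

Because the abstract reports substantial inter-individual variation, a sketch like this would presumably be fitted per subject rather than on pooled data.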

Abstract:

Multiple sclerosis (MS), a variable and diffuse disease affecting white and gray matter, is known to cause functional connectivity anomalies in patients. However, related studies published to date are post hoc; our hypothesis was that such alterations could discriminate between patients and healthy controls in a predictive setting, laying the groundwork for imaging-based prognosis. Using functional magnetic resonance imaging resting state data of 22 minimally disabled MS patients and 14 controls, we developed a predictive model of connectivity alterations in MS: a whole-brain connectivity matrix was built for each subject from the slow oscillations (<0.11 Hz) of region-averaged time series, and a pattern recognition technique was used to learn a discriminant function indicating which particular functional connections are most affected by disease. Classification performance using strict cross-validation yielded a sensitivity of 82% (above chance at p<0.005) and specificity of 86% (p<0.01) to distinguish between MS patients and controls. The most discriminative connectivity changes were found in subcortical and temporal regions, and contralateral connections were more discriminative than ipsilateral connections. The pattern of decreased discriminative connections can be summarized post hoc in an index that correlates positively (ρ=0.61) with white matter lesion load, possibly indicating functional reorganisation to cope with increasing lesion load. These results are consistent with a subtle but widespread impact of lesions in white matter and in gray matter structures serving as high-level integrative hubs. These findings suggest that predictive models of resting state fMRI can reveal specific anomalies due to MS with high sensitivity and specificity, potentially leading to new non-invasive markers.
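For readers who want to relate the reported percentages to the group sizes, the short sketch below computes sensitivity and specificity from a confusion matrix. The counts are an assumption: the abstract gives only the group sizes (22 patients, 14 controls) and the rounded rates, and 18 correctly classified patients and 12 correctly classified controls are the integer counts that round to 82% and 86%.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of patients correctly classified as patients."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of controls correctly classified as controls."""
    return true_neg / (true_neg + false_pos)


# Assumed counts, inferred from the rounded rates (not stated in the abstract).
patients, controls = 22, 14
true_pos, true_neg = 18, 12

sens = sensitivity(true_pos, patients - true_pos)
spec = specificity(true_neg, controls - true_neg)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```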

Abstract:

This paper asks a simple question: if humans and their actions co-evolve with hydrological systems (Sivapalan et al., 2012), what is the role of hydrological scientists, who are also humans, within this system? To put it more directly, as traditionally there is a supposed separation of scientists and society, can we maintain this separation as socio-hydrologists studying a socio-hydrological world? This paper argues that we cannot, using four linked sections. The first section draws directly upon the concern of science-technology studies to make a case to the (socio-hydrological) community that we need to be sensitive to constructivist accounts of science in general and socio-hydrology in particular. I review three positions taken by such accounts and apply them to hydrological science, supported with specific examples: (a) the ways in which scientific activities frame socio-hydrological research, such that at least some of the knowledge that we obtain is constructed by precisely what we do; (b) the need to attend to how socio-hydrological knowledge is used in decision-making, as evidence suggests that hydrological knowledge does not flow simply from science into policy; and (c) the observation that those who do not normally label themselves as socio-hydrologists may actually have a profound knowledge of socio-hydrology. The second section provides an empirical basis for considering these three issues by detailing the history of the practice of roughness parameterisation, using parameters like Manning's n, in hydrological and hydraulic models for flood inundation mapping. This history sustains the third section that is a more general consideration of one type of socio-hydrological practice: predictive modelling. 
I show that as part of a socio-hydrological analysis, hydrological prediction needs to be thought through much more carefully: not only because hydrological prediction exists to help inform decisions that are made about water management; but also because those predictions contain assumptions, the predictions are only correct in so far as those assumptions hold, and for those assumptions to hold, the socio-hydrological system (i.e. the world) has to be shaped so as to include them. Here, I add to the "normal" view that ideally our models should represent the world around us, to argue that for our models (and hence our predictions) to be valid, we have to make the world look like our models. Decisions over how the world is modelled may transform the world as much as they represent the world. Thus, socio-hydrological modelling has to become a socially accountable process such that the world is transformed, through the implications of modelling, in a fair and just manner. This leads into the final section of the paper where I consider how socio-hydrological research may be made more socially accountable, in a way that is both sensitive to the constructivist critique (Sect. 1), but which retains the contribution that hydrologists might make to socio-hydrological studies. This includes (1) working with conflict and controversy in hydrological science, rather than trying to eliminate them; (2) using hydrological events to avoid becoming locked into our own frames of explanation and prediction; (3) being empirical and experimental but in a socio-hydrological sense; and (4) co-producing socio-hydrological predictions. I will show how this might be done through a project that specifically developed predictive models for making interventions in river catchments to increase high river flow attenuation. 
Therein, I found myself becoming detached from my normal disciplinary networks and attached to the co-production of a predictive hydrological model with communities normally excluded from the practice of hydrological science.

Abstract:

OBJECTIVE: To develop predictive models for early triage of burn patients based on hypersusceptibility to repeated infections. BACKGROUND: Infection remains a major cause of mortality and morbidity after severe trauma, demanding new strategies to combat infections. Models for infection prediction are lacking. METHODS: Secondary analysis of 459 burn patients (≥16 years old) with 20% or more total body surface area burns recruited from 6 US burn centers. We compared blood transcriptomes with a 180-hour cutoff on the injury-to-transcriptome interval of 47 patients (≤1 infection episode) to those of 66 hypersusceptible patients [multiple (≥2) infection episodes (MIE)]. We used LASSO regression to select biomarkers and multivariate logistic regression to build models, the accuracy of which was assessed by area under the receiver operating characteristic curve (AUROC) and cross-validation. RESULTS: Three predictive models were developed using covariates of (1) clinical characteristics; (2) expression profiles of 14 genomic probes; (3) a combination of (1) and (2). The genomic and clinical models were highly predictive of MIE status [AUROC_Genomic = 0.946 (95% CI: 0.906-0.986); AUROC_Clinical = 0.864 (CI: 0.794-0.933); AUROC_Genomic/AUROC_Clinical P = 0.044]. The combined model has an increased AUROC_Combined of 0.967 (CI: 0.940-0.993) compared with the individual models (AUROC_Combined/AUROC_Clinical P = 0.0069). Hypersusceptible patients show early alterations in immune-related signaling pathways, epigenetic modulation, and chromatin remodeling. CONCLUSIONS: Early triage of burn patients more susceptible to infections can be made using clinical characteristics and/or genomic signatures. The genomic signature suggests that new insights into the pathophysiology of hypersusceptibility to infection may lead to novel potential therapeutic or prophylactic targets.
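The AUROC used to compare the three models has a simple pairwise interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way. The scores are invented for illustration and the function name `auroc` is hypothetical; the abstract does not describe the software used.

```python
def auroc(scores_pos, scores_neg):
    """Area under the ROC curve as the fraction of correctly ordered
    (positive, negative) pairs, counting ties as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented model scores: hypersusceptible (MIE) patients vs. the comparison group.
mie_scores = [0.9, 0.8, 0.4]
other_scores = [0.7, 0.3, 0.2]

auc = auroc(mie_scores, other_scores)  # 8 of the 9 pairs are ordered correctly
```

The double loop is quadratic in the number of cases, which is fine for cohorts of this size; rank-based formulas give the same value more efficiently.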

Abstract:

Machine learning provides tools for automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, owing to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. 
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
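The leave-pair-out cross-validation mentioned in this abstract can be sketched in a few lines: each positive-negative pair is held out in turn, a model is trained on the remaining examples, and the estimate is the fraction of held-out pairs that the model orders correctly. The learner below is a deliberately trivial stand-in (a one-dimensional mean-difference scorer), not RankRLS, and the data and function names are assumptions made for illustration.

```python
def leave_pair_out_auc(xs, ys, train):
    """Estimate AUC by holding out every (positive, negative) pair in turn."""
    pos = [i for i, y in enumerate(ys) if y == 1]
    neg = [i for i, y in enumerate(ys) if y == 0]
    total = 0.0
    for i in pos:
        for j in neg:
            keep = [k for k in range(len(xs)) if k not in (i, j)]
            score = train([xs[k] for k in keep], [ys[k] for k in keep])
            s_pos, s_neg = score(xs[i]), score(xs[j])
            if s_pos > s_neg:
                total += 1.0
            elif s_pos == s_neg:
                total += 0.5
    return total / (len(pos) * len(neg))


def train_mean_difference(xs, ys):
    """Toy learner (assumed): score a value along the direction that
    separates the class means of the training data."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    direction = sum(pos) / len(pos) - sum(neg) / len(neg)
    return lambda x: direction * x


# Invented one-dimensional data set with binary labels.
xs = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6]
ys = [0, 0, 1, 1, 1, 0]

estimate = leave_pair_out_auc(xs, ys, train_mean_difference)
```

Retraining for every pair is quadratic in the sample size, which is presumably why the thesis emphasises matrix-algebra shortcuts; this naive version only shows the logic.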

Abstract:

Escherichia coli, as a model microorganism, was treated in phosphate-buffered saline under high hydrostatic pressure between 100 and 300 MPa, and the inactivation dynamics were investigated from the viewpoint of predictive microbiology. Inactivation data were fitted with typical predictive models: the logistic, Gompertz and Weibull functions. The Weibull function described the inactivation curves best. The two parameters of the Weibull function were calculated for each holding pressure, and their dependence on holding pressure was obtained by interpolation. With the interpolated parameters, inactivation curves were simulated and compared with the experimental data sets.
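A minimal sketch of the simulate-from-interpolated-parameters step is given below. Several things are assumed rather than taken from the abstract: the two-parameter Weibull form log10(N/N0) = -(t/delta)^p, which is a common parameterisation in predictive microbiology, linear interpolation between holding pressures, and all numerical parameter values, which are placeholders and not the study's results.

```python
def log10_survival(t, delta, p):
    """Weibull-type survival curve: log10(N/N0) = -(t / delta) ** p.
    delta is the time to the first decimal reduction, p the shape parameter."""
    return -((t / delta) ** p)


def interpolate_params(pressure, table):
    """Linearly interpolate (delta, p) between neighbouring tabulated pressures."""
    for (p_lo, d_lo, s_lo), (p_hi, d_hi, s_hi) in zip(table, table[1:]):
        if p_lo <= pressure <= p_hi:
            w = (pressure - p_lo) / (p_hi - p_lo)
            return d_lo + w * (d_hi - d_lo), s_lo + w * (s_hi - s_lo)
    raise ValueError("pressure outside the tabulated range")


# Placeholder fitted parameters: (pressure in MPa, delta in minutes, shape p).
fitted = [(100.0, 20.0, 0.8), (200.0, 10.0, 0.6), (300.0, 2.0, 0.5)]

# Simulate an inactivation curve at an intermediate holding pressure.
delta_150, p_150 = interpolate_params(150.0, fitted)
curve = [log10_survival(t, delta_150, p_150) for t in (0.0, 5.0, 15.0, 30.0)]
```

In this parameterisation, a shape parameter below 1 gives an upward-concave (tailing) curve and a value above 1 gives a shoulder.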

Abstract:

Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining these data with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing (NGS) studies will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to be able to create effective predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated not only to be effective at predicting the disease phenotypes, but also to do so efficiently through the use of computational shortcuts. While much of the work could be run on high-end desktops, some of it was further extended so that it could be implemented on parallel computers, helping to ensure that the methods will also scale to NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm. 
It was shown that there is no universally optimal algorithm for variant selection in GWAS, but rather methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can result in overly-optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype–phenotype relationships and biological insights from genetic data sets.
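Nested cross-validation, named above as a safeguard against over-optimistic accuracy estimates, keeps model selection inside an inner loop so that each outer test fold is never seen while a feature subset is being chosen. The sketch below shows that structure on a toy problem. The data, the single-feature threshold classifier and all function names are assumptions made for illustration and are not the methods used in the thesis.

```python
def interleaved_folds(n, k):
    """Split positions 0..n-1 into k folds by striding."""
    return [list(range(start, n, k)) for start in range(k)]


def nested_cv(n, candidates, fit_score, outer_k=3, inner_k=2):
    """Return one test score per outer fold; the candidate is chosen
    using only the training part of that fold."""
    outer_scores = []
    for test in interleaved_folds(n, outer_k):
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]

        def inner_mean(param):
            scores = []
            for fold in interleaved_folds(len(train), inner_k):
                val = [train[pos] for pos in fold]
                val_set = set(val)
                fit = [i for i in train if i not in val_set]
                scores.append(fit_score(fit, val, param))
            return sum(scores) / len(scores)

        best = max(candidates, key=inner_mean)  # selection sees train only
        outer_scores.append(fit_score(train, test, best))
    return outer_scores


# Toy data (assumed): feature 0 is informative, feature 1 is constant noise.
labels = [i % 2 for i in range(12)]
features = [(2.0 * y + 0.1 * (i % 3), 0.5) for i, y in enumerate(labels)]


def threshold_accuracy(train_idx, test_idx, feature):
    """Fit a midpoint threshold on one feature, return accuracy on test_idx."""
    def mean(vals):
        return sum(vals) / len(vals)
    m0 = mean([features[i][feature] for i in train_idx if labels[i] == 0])
    m1 = mean([features[i][feature] for i in train_idx if labels[i] == 1])
    cut = (m0 + m1) / 2.0
    hits = sum((features[i][feature] > cut) == (labels[i] == 1) for i in test_idx)
    return hits / len(test_idx)


scores = nested_cv(len(labels), candidates=[0, 1], fit_score=threshold_accuracy)
```

Here the candidates are single feature indices; in a GWAS setting they would more plausibly be feature-subset sizes or regularisation strengths.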

Abstract:

The chemical composition of apple juices may be used to discriminate between the varieties for consumption and those for raw material. Fuji and Gala have a chemical pattern that can be used for this classification. Multivariate methods correlate independent continuous chemical descriptors with the categorical apple variety. Three main descriptors of apple juice were selected: malic acid, total reducing sugar and total phenolic compounds. A chemometric approach, employing PCA and SIMCA, was used to classify apple juice samples. PCA was performed with 24 juices from Fuji and Gala, and SIMCA, with 15 juices. The exploratory and predictive models recognized 88% and 64%, respectively, as belonging to a mixed domain. The apple juice from commercial fruits shows a pattern related to cv. Fuji and Gala with boundaries from 0.18 to 0.389 g/100 mL (malic acid), from 8.65 to 15.18 g/100 mL (total reducing sugar) and from 100 to 400 mg/L (total phenolic compounds), but such boundaries were slightly shorter in the remaining set of commercial apple juices, specifically from 0.16 to 0.36 g/100 mL, from 9.25 to 15.5 g/100 mL and from 180 to 606 mg/L for acidity, reducing sugar and phenolic compounds, respectively, representing the acid, sweet and bitter tastes.

Abstract:

Objective: To evaluate the effectiveness of screening for gestational hypertension using maternal demographic characteristics, serum biomarkers, and uterine artery Doppler in the first and second trimesters of pregnancy, and to develop predictive models of gestational hypertension based on these parameters. Methods: This is a prospective cohort study including 598 nulliparous women. Uterine Doppler was assessed by transabdominal ultrasound between 11+0 and 13+6 weeks (first trimester) and between 17+0 and 21+6 weeks (second trimester). All serum samples for the measurement of several placental biomarkers were collected in the first trimester. Maternal demographic characteristics were recorded at the same time. ROC curves and predictive values were used to analyze the predictive power of the above parameters. Different combinations and their logistic regression models were also analyzed. Results: Among the 598 women, there were 20 cases of pre-eclampsia (3.3%), 7 of early-onset pre-eclampsia (1.2%), 52 of gestational hypertension (8.7%), and 10 of gestational hypertension before 37 weeks (1.7%). The second-trimester uterine artery pulsatility index was the best predictor. In multivariate logistic regression analysis, the best predictive value in the first and second trimesters was obtained for the prediction of early-onset pre-eclampsia. Combined screening gave markedly better results than maternal or Doppler parameters alone. Conclusion: As a single marker, second-trimester uterine Doppler has the best predictive value for hypertension, preterm birth, and growth restriction. Combining maternal demographic characteristics, maternal serum biomarkers, and uterine Doppler improves screening effectiveness, particularly for pre-eclampsia requiring preterm delivery.

Abstract:

Influenza surveillance relies on a wide range of data, including syndromic surveillance data from emergency departments. A growing number of variables are recorded in electronic emergency department records and made available to surveillance teams. The main objective of this thesis is to evaluate the potential usefulness of age, triage category, and disposition at discharge from the emergency department for improving surveillance of morbidity related to severe influenza cases. Data from a subset of Montreal hospitals were used, from April 2006 to January 2011. Hospitalizations with a diagnosis of pneumonia or influenza were used as the measure of morbidity related to severe influenza cases and were modelled by negative binomial regression, accounting for secular and seasonal trends. Compared with total influenza-like illness (ILI) visits, ILI visits stratified by age, triage category, and discharge disposition improved the predictive model of hospitalizations with pneumonia or influenza. Before these variables are integrated into Montreal's surveillance system, additional steps are suggested, including optimizing the definition of influenza-like illness to be used, confirming the value of these predictors with new data, and evaluating their practical usefulness.

Abstract:

An estimate of the quantities of dissolved organic carbon in the millions of boreal lakes is needed to improve our knowledge of the global carbon cycle. Dissolved organic carbon concentrations are correlated with the quantities of coloured dissolved organic matter, which is visible from space. However, current sensors offer radiometry and spatial resolution that are limited relative to the size and opacity of boreal lakes. Landsat 8, launched in February 2013, will offer improved radiometry and spatial resolution and will produce large-scale coverage of boreal regions. Limnologists have accumulated years of field campaigns in boreal regions for which a Landsat 8 image will be available. Yet the possibility of combining existing field data with a recent satellite image has not yet been evaluated. In addition, the different possible strategies for selecting and combining repeated measurements over time, in the field and from the satellite, have not been evaluated. This study presents the possibilities and limits of using existing field data with recent satellite images to develop predictive models of dissolved organic carbon. The methods are based on field data collected in Quebec in 53 boreal lakes and 10 satellite images acquired by the Landsat 8 prototype sensor. The time lags between the field campaigns and the satellite images range from 1 month to 6 years. The resulting predictive model compares favourably with a model based on field campaigns synchronized with the satellite images. Adding repeated measurements in the field or from the satellite, and atmospheric corrections of the images, does not improve the quality of the predictive model. 
Two application images show different distributions of dissolved organic carbon concentrations and volumes, but the quantities of dissolved organic carbon per unit of landscape area remain of the same order at both sites. Additional work to integrate sediments into the estimate is needed to improve the carbon budget of boreal regions.

Abstract:

Boat waves add extra pressure on riverbanks and must be considered in models predicting bank retreat rates. The objective of this study is to examine the role of boat waves in flow and suspended transport along banks in a fluvial environment. To achieve this objective, we use a transect, perpendicular to the bank, of four electromagnetic current meters (ECMs) measuring two dimensions of the flow, and two turbidimeters (OBSs) placed back to back, facing the bank and the open water, to measure the mean and turbulent conditions of the longitudinal and vertical flow as well as the suspended sediment fluxes caused by the waves. A 16-foot motorboat equipped with a 40 hp engine was used to generate waves. We measured the effect of three distances from the bank (5, 10, and 15 m) and three boat speeds (5, 15, and 25 km/h), and five replicates of each combination of distance and speed were carried out, for a total of 45 passes. We characterized the variability of flow, wave, and sediment transport conditions, and we performed spectral analyses to separate the oscillatory and turbulent portions of the flow generated by boat waves. The effect of boat distance and speed on sediment transport is nonlinear, and the sediment response induced by boat passes shows substantial variability between replicates and between the two OBS probes, which suggests a morphological change induced by boat waves. Correlations between flow and transport variables show the importance of the relationships between shear and the power of the turbulent portion of the flow, on the one hand, and sediment transport, on the other. 
This study quantified the relationships between wave dynamics and fluxes of suspended sediment concentrations, which represents an important contribution to the development of mitigation measures in fluvial environments where banks are weakened by recreational boat traffic.
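The experimental design in this abstract is a full factorial: three distances, three speeds and five replicates per combination. A short sketch enumerating it confirms the stated total of 45 passes. The variable names are illustrative only.

```python
from itertools import product

distances_m = (5, 10, 15)   # distance of the boat from the bank, in metres
speeds_kmh = (5, 15, 25)    # boat speed, in km/h
replicates = range(1, 6)    # five repeated passes per combination

# Every (distance, speed, replicate) triple is one boat pass.
passes = list(product(distances_m, speeds_kmh, replicates))
print(len(passes))
```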