981 results for predictive modelling


Relevance: 60.00%

Abstract:

Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information in large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured, and values can be textual, categorical or numerical. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. A number of classification algorithms are now in practical use. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probabilistic methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbours (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
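As a minimal, illustrative sketch of the classification task described above (not tied to any of the cited algorithms' original implementations), the following Python snippet trains a decision-tree classifier with scikit-learn and evaluates its predictive accuracy; the dataset and parameter choices are placeholders.

```python
# Minimal sketch: a decision-tree classifier used for predictive modelling,
# in the spirit of the C4.5-style methods cited above (illustrative only;
# scikit-learn's DecisionTreeClassifier implements CART, not C4.5).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)                      # learn decision rules from labelled data
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```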

Relevance: 60.00%

Abstract:

Promiscuous T-cell epitopes make ideal targets for vaccine development. We report here a computational system, multipred, for the prediction of peptide binding to the HLA-A2 supertype. It combines a novel representation of peptide/MHC interactions with a hidden Markov model as the prediction algorithm. multipred is both sensitive and specific, and demonstrates high accuracy of peptide-binding predictions for the HLA-A*0201, *0204, and *0205 alleles, good accuracy for the *0206 allele, and marginal accuracy for the *0203 allele. multipred replaces earlier requirements for individual prediction models for each HLA allelic variant and simplifies the computational aspects of peptide-binding prediction. Preliminary testing indicates that multipred can predict peptide binding to HLA-A2 supertype molecules with high accuracy, including those allelic variants for which no experimental binding data are currently available.
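As a hedged illustration of how a hidden Markov model can assign a binding score to a peptide, the toy Python sketch below runs the forward algorithm over a left-to-right HMM with one state per position of a 9-mer; all parameters here are invented, whereas multipred's model is trained on experimental HLA-A2 binding data.

```python
# Toy sketch of HMM-based peptide scoring (not the trained multipred model):
# the forward algorithm gives the likelihood of a 9-mer under a profile-like HMM,
# which can be thresholded to call binders vs. non-binders.
import numpy as np

states = 9                                   # one hidden state per peptide position (assumption)
amino_acids = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(0)

# Hypothetical parameters; in practice these are trained on known HLA-A2 binders.
start = np.zeros(states); start[0] = 1.0
trans = np.eye(states, k=1)                  # left-to-right architecture
emit = rng.dirichlet(np.ones(len(amino_acids)), size=states)

def forward_log_likelihood(peptide: str) -> float:
    idx = [amino_acids.index(a) for a in peptide]
    alpha = start * emit[:, idx[0]]
    for obs in idx[1:]:
        alpha = (alpha @ trans) * emit[:, obs]
    return float(np.log(alpha.sum() + 1e-300))

print(forward_log_likelihood("LLFGYPVYV"))   # higher score = more binder-like (toy)
```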

Relevance: 60.00%

Abstract:

1. Cluster analysis of reference sites with similar biota is the initial step in creating the River Invertebrate Prediction and Classification System (RIVPACS) and similar river bioassessment models such as the Australian River Assessment System (AUSRIVAS). This paper describes and tests an alternative prediction method, Assessment by Nearest Neighbour Analysis (ANNA), based on the same philosophy as RIVPACS and AUSRIVAS but without the grouping step that some people view as artificial. 2. The steps in creating ANNA models are: (i) weighting the predictor variables using a multivariate approach analogous to principal axis correlations, (ii) calculating the weighted Euclidean distance from a test site to the reference sites based on the environmental predictors, (iii) predicting the faunal composition based on the nearest reference sites and (iv) calculating an observed/expected (O/E) ratio analogous to that of RIVPACS/AUSRIVAS. 3. The paper compares AUSRIVAS and ANNA models on 17 datasets representing a variety of habitats and seasons. First, it examines each model's regressions of observed versus expected number of taxa, including the r², intercept and slope. Second, the two models' assessments of 79 test sites in New Zealand are compared. Third, the models are compared on test and presumed reference sites along a known trace metal gradient. Fourth, ANNA models are evaluated for Western Australia, a geographically distinct region of Australia. The comparisons demonstrate that ANNA and AUSRIVAS are generally equivalent in performance, although ANNA turns out to be potentially more robust for the O versus E regressions and potentially more accurate on the trace metal gradient sites. 4. The ANNA method is recommended for use in the bioassessment of rivers, at least for corroborating the results of the well-established AUSRIVAS- and RIVPACS-type models, if not to replace them.
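A rough Python sketch of steps (ii)-(iv) is given below under simplifying assumptions (synthetic data, a fixed number of nearest neighbours, and a simple probability threshold for the expected taxa); it illustrates only the nearest-neighbour logic, not the calibrated ANNA procedure.

```python
# Rough sketch of the ANNA idea (steps ii-iv) under simplifying assumptions:
# weighted Euclidean distance in predictor space, expected taxa from the
# nearest reference sites, and the E of an O/E score for a test site.
import numpy as np

def anna_expected(test_env, ref_env, ref_taxa, weights, k=10, p_threshold=0.5):
    """test_env: (n_vars,) predictors for the test site
    ref_env:  (n_ref, n_vars) predictors at reference sites
    ref_taxa: (n_ref, n_taxa) presence/absence at reference sites
    weights:  (n_vars,) predictor weights (step i, estimated separately)
    """
    d = np.sqrt((weights * (ref_env - test_env) ** 2).sum(axis=1))  # step ii
    nearest = np.argsort(d)[:k]                                     # step iii
    prob = ref_taxa[nearest].mean(axis=0)          # per-taxon expected probability
    return prob[prob >= p_threshold].sum()         # E: sum of probabilities of likely taxa

# Usage (synthetic numbers): E for one test site, then O/E = observed count / E.
rng = np.random.default_rng(1)
ref_env = rng.normal(size=(50, 4)); ref_taxa = rng.integers(0, 2, size=(50, 30))
E = anna_expected(rng.normal(size=4), ref_env, ref_taxa, weights=np.ones(4))
print("expected taxa:", E)
```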

Relevance: 60.00%

Abstract:

1. Species distribution modelling is used increasingly in both applied and theoretical research to predict how species are distributed and to understand attributes of species' environmental requirements. In species distribution modelling, various statistical methods are used that combine species occurrence data with environmental spatial data layers to predict the suitability of any site for that species. While the number of data sharing initiatives involving species' occurrences in the scientific community has increased dramatically over the past few years, various data quality and methodological concerns related to using these data for species distribution modelling have not been addressed adequately. 2. We evaluated how uncertainty in georeferences and associated locational error in occurrences influence species distribution modelling using two treatments: (1) a control treatment where models were calibrated with original, accurate data and (2) an error treatment where data were first degraded spatially to simulate locational error. To incorporate error into the coordinates, we moved each coordinate with a random number drawn from the normal distribution with a mean of zero and a standard deviation of 5 km. We evaluated the influence of error on the performance of 10 commonly used distributional modelling techniques applied to 40 species in four distinct geographical regions. 3. Locational error in occurrences reduced model performance in three of these regions; relatively accurate predictions of species distributions were possible for most species, even with degraded occurrences. Two species distribution modelling techniques, boosted regression trees and maximum entropy, were the best performing models in the face of locational errors. The results obtained with boosted regression trees were only slightly degraded by errors in location, and the results obtained with the maximum entropy approach were not affected by such errors. 4. Synthesis and applications. To use the vast array of occurrence data that exists currently for research and management relating to the geographical ranges of species, modellers need to know the influence of locational error on model quality and whether some modelling techniques are particularly robust to error. We show that certain modelling techniques are particularly robust to a moderate level of locational error and that useful predictions of species distributions can be made even when occurrence data include some error.
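The error treatment can be illustrated with a short sketch that perturbs occurrence coordinates with Gaussian noise of mean zero and standard deviation 5 km, assuming the coordinates are in a projected system measured in metres; the coordinate values below are invented.

```python
# Sketch of the error treatment described above: perturb each occurrence
# coordinate with Gaussian noise (mean 0, SD 5 km) to simulate locational error.
# Assumes coordinates are in a projected system with units of metres.
import numpy as np

def degrade_coordinates(xy_m, sd_km=5.0, seed=0):
    """xy_m: (n, 2) array of projected x/y coordinates in metres."""
    rng = np.random.default_rng(seed)
    return xy_m + rng.normal(loc=0.0, scale=sd_km * 1000.0, size=xy_m.shape)

occurrences = np.array([[512_300.0, 6_432_100.0], [498_750.0, 6_440_900.0]])
print(degrade_coordinates(occurrences))
```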

Relevance: 60.00%

Abstract:

A predictive model based on Bayesian networks that identifies the patients at greatest risk of hospital admission from a set of demographic and clinical data attributes.
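By way of illustration only (the abstract gives neither the network structure nor the attributes), the sketch below uses a naive Bayes classifier, the simplest Bayesian-network structure, to rank hypothetical patients by admission risk from binary demographic and clinical attributes.

```python
# Minimal sketch in the spirit of the abstract: a naive Bayes classifier
# (the simplest Bayesian-network structure) ranking patients by admission risk.
# Features and data are hypothetical; the original model's structure is not given.
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Columns (assumed): age>75, diabetes, prior_admission, lives_alone
X = np.array([[1, 1, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 0, 0]])
y = np.array([1, 0, 1, 0])          # 1 = admitted within follow-up (synthetic labels)

model = BernoulliNB().fit(X, y)
risk = model.predict_proba([[1, 1, 0, 1]])[:, 1]   # P(admission) for a new patient
print("admission risk:", float(risk))
```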

Relevance: 60.00%

Abstract:

This paper asks a simple question: if humans and their actions co-evolve with hydrological systems (Sivapalan et al., 2012), what is the role of hydrological scientists, who are also humans, within this system? To put it more directly, as traditionally there is a supposed separation of scientists and society, can we maintain this separation as socio-hydrologists studying a socio-hydrological world? This paper argues that we cannot, using four linked sections. The first section draws directly upon the concern of science-technology studies to make a case to the (socio-hydrological) community that we need to be sensitive to constructivist accounts of science in general and socio-hydrology in particular. I review three positions taken by such accounts and apply them to hydrological science, supported with specific examples: (a) the ways in which scientific activities frame socio-hydrological research, such that at least some of the knowledge that we obtain is constructed by precisely what we do; (b) the need to attend to how socio-hydrological knowledge is used in decision-making, as evidence suggests that hydrological knowledge does not flow simply from science into policy; and (c) the observation that those who do not normally label themselves as socio-hydrologists may actually have a profound knowledge of socio-hydrology. The second section provides an empirical basis for considering these three issues by detailing the history of the practice of roughness parameterisation, using parameters like Manning's n, in hydrological and hydraulic models for flood inundation mapping. This history sustains the third section, a more general consideration of one type of socio-hydrological practice: predictive modelling. I show that as part of a socio-hydrological analysis, hydrological prediction needs to be thought through much more carefully: not only because hydrological prediction exists to help inform decisions that are made about water management; but also because those predictions contain assumptions, the predictions are only correct in so far as those assumptions hold, and for those assumptions to hold, the socio-hydrological system (i.e. the world) has to be shaped so as to include them. Here, I add to the "normal" view that ideally our models should represent the world around us, to argue that for our models (and hence our predictions) to be valid, we have to make the world look like our models. Decisions over how the world is modelled may transform the world as much as they represent the world. Thus, socio-hydrological modelling has to become a socially accountable process such that the world is transformed, through the implications of modelling, in a fair and just manner. This leads into the final section of the paper, where I consider how socio-hydrological research may be made more socially accountable, in a way that is both sensitive to the constructivist critique (Sect. 1), but which retains the contribution that hydrologists might make to socio-hydrological studies. This includes (1) working with conflict and controversy in hydrological science, rather than trying to eliminate them; (2) using hydrological events to avoid becoming locked into our own frames of explanation and prediction; (3) being empirical and experimental but in a socio-hydrological sense; and (4) co-producing socio-hydrological predictions.
I will show how this might be done through a project that specifically developed predictive models for making interventions in river catchments to increase high river flow attenuation. Therein, I found myself becoming detached from my normal disciplinary networks and attached to the co-production of a predictive hydrological model with communities normally excluded from the practice of hydrological science.

Relevance: 60.00%

Abstract:

In ecology, for example in studies of the services provided by ecosystems, descriptive, explanatory and predictive modelling each have their own distinct place. Specific situations call for one or another of these types of modelling, and the right choice must be made so that the model can be used in a way that matches the objectives of the study. In this work, we first explore the explanatory power of the multivariate regression tree (MRT). This modelling method is based on a recursive binary-partitioning algorithm and a resampling method used to prune the final model, which is a tree, in order to obtain the model producing the best predictions. This asymmetric two-table analysis yields homogeneous groups of objects of the response table, with the splits between groups corresponding to cut-points of the variables of the explanatory table that mark the most abrupt changes in the response. We show that, in order to compute the explanatory power of the MRT, one must define an adjusted coefficient of determination in which the degrees of freedom of the model are estimated by an algorithm. This estimate of the population coefficient of determination is practically unbiased. Since the MRT assumes discontinuity whereas canonical redundancy analysis (RDA) models continuous linear gradients, comparing their respective explanatory powers makes it possible, among other things, to distinguish which type of pattern the response follows as a function of the explanatory variables. The comparison of explanatory power between RDA and MRT was motivated by the extensive use of RDA to study beta diversity. Still from an explanatory perspective, we define a new procedure called the cascade multivariate regression tree (CMRT), which builds a model while imposing a hierarchical order on the hypotheses under study. This new procedure makes it possible to study the hierarchical effect of two sets of explanatory variables, a main set and a subordinate set, and then to compute their explanatory power. The final model is interpreted as in a hierarchical MANOVA. The results of this analysis can provide additional information about the links between the response and the explanatory variables, for example interactions between the two explanatory sets that were not revealed by the usual MRT analysis. We also study the predictive power of generalized linear models by modelling the biomass of various tropical tree species as a function of some of their allometric measurements. In particular, we examine the ability of Gaussian and gamma error structures to provide the most accurate predictions. We show that, for one particular species, the predictive power of a model using the gamma error structure is superior. This study is set in a practical context and is intended as an example for managers wishing to estimate precisely the carbon captured by tropical tree plantations. Our conclusions could form an integral part of a programme to reduce carbon emissions through land-use change.
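A minimal sketch of the Gaussian-versus-gamma GLM comparison, using statsmodels with a log link and synthetic allometric data (the real analysis uses measured tropical tree biomass), might look like this:

```python
# Sketch of the GLM comparison described above: biomass ~ allometric predictor
# with Gaussian vs. gamma error structures (statsmodels; synthetic data, log link).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
dbh = rng.uniform(5, 60, size=200)                      # diameter at breast height, cm
mu = np.exp(-2.0 + 2.4 * np.log(dbh))                   # allometric mean biomass, kg
biomass = rng.gamma(shape=5.0, scale=mu / 5.0)          # variance grows with the mean

X = sm.add_constant(np.log(dbh))
gauss = sm.GLM(biomass, X, family=sm.families.Gaussian(sm.families.links.Log())).fit()
gamma = sm.GLM(biomass, X, family=sm.families.Gamma(sm.families.links.Log())).fit()
print("AIC Gaussian:", round(gauss.aic, 1), " AIC gamma:", round(gamma.aic, 1))
```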

Relevance: 60.00%

Abstract:

High resolution descriptions of plant distribution have utility for many ecological applications but are especially useful for predictive modelling of gene flow from transgenic crops. Difficulty lies in the extrapolation errors that occur when limited ground survey data are scaled up to the landscape or national level. This problem is epitomized by the wide confidence limits generated in a previous attempt to describe the national abundance of riverside Brassica rapa (a wild relative of cultivated rapeseed) across the United Kingdom. Here, we assess the value of airborne remote sensing to locate B. rapa over large areas and so reduce the need for extrapolation. We describe results from flights over the river Nene in England acquired using Airborne Thematic Mapper (ATM) and Compact Airborne Spectrographic Imager (CASI) imagery, together with ground truth data. It proved possible to detect 97% of flowering B. rapa on the basis of spectral profiles. This included all stands of plants that occupied >2m square (>5 plants), which were detected using single-pixel classification. It also included very small populations (<5 flowering plants, 1-2m square) that generated mixed pixels, which were detected using spectral unmixing. The high detection accuracy for flowering B. rapa was coupled with a rather large false positive rate (43%). The latter could be reduced by using the image detections to target fieldwork to confirm species identity, or by acquiring additional remote sensing data such as laser altimetry or multitemporal imagery.
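The mixed-pixel step can be illustrated with a small linear spectral unmixing sketch: a pixel spectrum is decomposed into non-negative endmember fractions by least squares. The endmember spectra below are invented; in the study they derive from ground-truth data.

```python
# Sketch of the mixed-pixel step: constrained linear spectral unmixing of a pixel
# into endmember fractions (e.g. flowering B. rapa vs. background vegetation).
# Endmember spectra here are made up; in the study they come from ground-truth sites.
import numpy as np
from scipy.optimize import nnls

endmembers = np.array([[0.08, 0.12, 0.45, 0.60],     # flowering B. rapa (hypothetical)
                       [0.05, 0.09, 0.30, 0.50],     # green vegetation
                       [0.20, 0.25, 0.28, 0.30]]).T  # bare soil; rows = spectral bands

pixel = 0.15 * endmembers[:, 0] + 0.75 * endmembers[:, 1] + 0.10 * endmembers[:, 2]

fractions, _ = nnls(endmembers, pixel)             # non-negative least squares
fractions /= fractions.sum()                       # renormalise to sum to one
print("endmember fractions:", np.round(fractions, 2))
```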

Relevance: 60.00%

Abstract:

Graphical tracking is a technique for crop scheduling in which the actual plant state is plotted against an ideal target curve that encapsulates all crop and environmental characteristics. Management decisions are made on the basis of the position of the actual crop against the ideal position. Because of the simplicity of the approach, graphical tracks can be developed on site without the requirement for controlled experimentation. Growth models and graphical tracks are discussed, and an implementation of the Richards curve for graphical tracking is described. In many cases, the more intuitively desirable growth models perform sub-optimally due to problems with the specification of starting conditions, environmental factors outside the scope of the original model and the introduction of new cultivars. Accurate specification of a biological model requires detailed and usually costly study, and as such is not adaptable to a changing cultivar range and changing cultivation techniques. Fitting of a new graphical track for a new cultivar can be conducted on site and improved over subsequent seasons. Graphical tracking emphasises the current position relative to the objective, and as such does not require the time-consuming or system-specific input of an environmental history, although it does require detailed crop measurement. The approach is flexible and could be applied to a variety of specification metrics, with digital imaging providing a route for added value. For decision-making regarding crop manipulation from the observed current state, there is a role for simple predictive modelling over the short term to indicate the short-term consequences of crop manipulation.
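A brief sketch of fitting a Richards curve as a graphical-track target is given below; the measurements are synthetic and the parameterisation (asymptote A, rate k, inflection time tm, shape nu) is one common form of the curve, assumed here for illustration.

```python
# Sketch of a Richards growth curve as a graphical-track target, fitted to
# observed crop states with scipy (synthetic measurements; parameter names assumed).
import numpy as np
from scipy.optimize import curve_fit

def richards(t, A, k, tm, nu):
    """A: upper asymptote, k: growth rate, tm: time of maximum growth, nu: shape."""
    return A / (1.0 + nu * np.exp(-k * (t - tm))) ** (1.0 / nu)

days = np.arange(0, 70, 7, dtype=float)
height = richards(days, 40.0, 0.15, 30.0, 0.8) + np.random.default_rng(3).normal(0, 0.8, days.size)

params, _ = curve_fit(richards, days, height, p0=[35.0, 0.1, 25.0, 1.0], maxfev=10000)
target_today = richards(42.0, *params)            # ideal state on day 42 of the track
print("fitted A, k, tm, nu:", np.round(params, 3), " target:", round(target_today, 1))
```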

Relevance: 60.00%

Abstract:

Predictive modelling has been applied to analyse the geographical distribution of species by extrapolating the environmental characteristics of known occurrence sites. Interest in this type of modelling stems from the need for fast, well-founded responses to the threats that species have been facing due to habitat loss, invasion by exotic species, climate change, and other pressures. This article offers an overview of recent advances in the field of modelling and aims to encourage the discussion and application of this method, which can help both in acquiring basic knowledge about species' biology and in analysing and formulating policies for their conservation.

Relevance: 60.00%

Abstract:

Stroke poses a massive burden of disease, yet we have few effective therapies. The paucity of therapeutic options stands contrary to intensive research efforts. The failure of these past investments demands a thorough re-examination of the pathophysiology of ischaemic brain injury. Several critical areas hold the key to overcoming the translational roadblock: (1) vascular occlusion: current recanalization strategies have limited effectiveness and may have serious side effects; (2) complexity of stroke pathobiology: therapy must acknowledge the 'Janus-faced' nature of many stroke targets and must identify endogenous neuroprotective and repair mechanisms; (3) inflammation and brain-immune-system interaction: inflammation contributes to lesion expansion, but is also instrumental in lesion containment and repair; stroke outcome is modulated by the interaction of the injured brain with the immune system; (4) regeneration: the potential of the brain for reorganization, plasticity and repair after injury is much greater than previously thought; (5) confounding factors, long-term outcome and predictive modelling. These 5 areas are linked on all levels and therefore need to be tackled by an integrative approach and innovative therapeutic strategies.

Relevance: 60.00%

Abstract:

Shell fluxes of planktonic Foraminifera species vary intra-annually in a pattern that appears to follow the seasonal cycle. However, the variation in the timing and prominence of seasonal flux maxima in space and among species remains poorly constrained. Thus, although changing seasonality may result in a flux-weighted temperature offset of more than 5° C within a species, this effect is often ignored in the interpretation of Foraminifera-based paleoceanographic records. To address this issue we present an analysis of the intra-annual pattern of shell flux variability in 37 globally distributed time series. The existence of a seasonal component in flux variability was objectively characterised using periodic regression. This analysis yielded estimates of the number, timing and prominence of seasonal flux maxima. Over 80% of the flux series across all species showed a statistically significant periodic component, indicating that a considerable part of the intra-annual flux variability is predictable. Temperature appears to be a powerful predictor of flux seasonality, but its effect differs among species. Three different modes of seasonality are distinguishable. Tropical and subtropical species (Globigerinoides ruber (white and pink varieties), Neogloboquadrina dutertrei, Globigerinoides sacculifer, Orbulina universa, Globigerinella siphonifera, Pulleniatina obliquiloculata, Globorotalia menardii, Globoturborotalita rubescens, Globoturborotalita tenella and Globigerinoides conglobatus) appear to have a less predictable flux pattern, with random peak timing in warm waters. In colder waters, seasonality is more prevalent: peak fluxes occur shortly after summer temperature maxima and peak prominence increases. This tendency is stronger in species with a narrower temperature range, implying that warm-adapted species find it increasingly difficult to reproduce outside their optimum temperature range and that, with decreasing mean temperature, their flux is progressively more focussed in the warm season. The second group includes the temperate to cold-water species Globigerina bulloides, Globigerinita glutinata, Turborotalita quinqueloba, Neogloboquadrina incompta, Neogloboquadrina pachyderma, Globorotalia scitula, Globigerinella calida, Globigerina falconensis, Globorotalia theyeri and Globigerinita uvula. These species show a highly predictable seasonal pattern, with one to two peaks a year, which occur earlier in warmer waters. Peak prominence in this group is independent of temperature. The earlier-when-warmer pattern in this group is related to the timing of productivity maxima. Finally, the deep-dwelling Globorotalia truncatulinoides and Globorotalia inflata show a regular and pronounced peak in winter and spring. The remarkably low flux outside the main pulse may indicate a long reproductive cycle of these species. Overall, our analysis indicates that the seasonality of planktonic Foraminifera shell flux is predictable and reveals the existence of distinct modes of phenology among species. We evaluate the effect of changing seasonality on paleoceanographic reconstructions and find that, irrespective of the seasonality mode, the actual magnitude of environmental change will be underestimated. The observed constraints on flux seasonality can serve as the basis for predictive modelling of flux pattern. 
As long as the diversity of species seasonality is accounted for in such models, the results can be used to improve reconstructions of the magnitude of environmental change in paleoceanographic records.
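The periodic-regression step can be sketched as a harmonic regression of flux on annual and semi-annual sine/cosine terms, from which the amplitude (prominence) and phase (peak timing) of the seasonal cycle are recovered; the series below is synthetic rather than one of the 37 sediment-trap records.

```python
# Sketch of the periodic-regression step: fit annual (and semi-annual) harmonics to a
# shell-flux time series and read off the timing and prominence of the seasonal peak.
import numpy as np

rng = np.random.default_rng(7)
day = np.arange(0, 2 * 365, 14, dtype=float)                 # fortnightly flux samples
flux = 50 + 40 * np.cos(2 * np.pi * (day - 220) / 365) + rng.normal(0, 5, day.size)

w = 2 * np.pi * day / 365.0
X = np.column_stack([np.ones_like(day), np.cos(w), np.sin(w),
                     np.cos(2 * w), np.sin(2 * w)])          # annual + semi-annual harmonics
coef, *_ = np.linalg.lstsq(X, flux, rcond=None)

amplitude = np.hypot(coef[1], coef[2])                       # prominence of the annual cycle
peak_day = (np.arctan2(coef[2], coef[1]) * 365 / (2 * np.pi)) % 365
print("annual amplitude:", round(amplitude, 1), " peak day of year:", round(peak_day))
```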

Relevance: 60.00%

Abstract:

Proper maintenance of plant items is crucial for the safe and profitable operation of process plants. The relevant maintenance policies fall into the following four categories: (i) preventive/opportunistic/breakdown replacement policies, (ii) inspection/inspection-repair-replacement policies, (iii) restorative maintenance policies, and (iv) condition-based maintenance policies. For correlating failure times of component equipment and complete systems, the Weibull failure distribution has been used. A new, powerful method, SEQLIM, has been proposed for the estimation of the Weibull parameters, particularly when maintenance records contain very few failures and many successful operation times. When a system consists of a number of replaceable, ageing components, an opportunistic replacement policy has been found to be cost-effective, and a simple opportunistic model has been developed. Inspection models with various objective functions have been investigated. It was found that, on the assumption of a negative exponential failure distribution, all models converge to the same optimal inspection interval, provided the safety components are very reliable and the demand rate is low. When deterioration becomes a contributory factor to some failures, periodic inspections calculated from the above models are too frequent; a case of safety trip systems has been studied. A highly effective restorative maintenance policy can be developed if the performance of the equipment in this category can be related to some predictive modelling. A novel fouling model has been proposed to determine cleaning strategies for condensers. Condition-based maintenance policies have been investigated, and a simple gauge has been designed for condition monitoring of relief-valve springs. A typical case of an exothermic inert-gas generation plant has been studied to demonstrate how the various policies can be applied to devise overall maintenance actions.
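The following sketch shows the general estimation problem that SEQLIM addresses, namely fitting Weibull parameters by maximum likelihood from a handful of failure times together with many right-censored successful operation times; it is a standard censored-data fit, not the SEQLIM method itself, and the data are synthetic.

```python
# Sketch of the general estimation problem SEQLIM addresses (not SEQLIM itself):
# maximum-likelihood Weibull parameters from a few failure times plus many
# right-censored "successful operation" times.
import numpy as np
from scipy.optimize import minimize

failures = np.array([1800.0, 2600.0, 3100.0])          # hours to failure (synthetic)
survivors = np.array([3500.0, 3500.0, 4000.0, 4200.0,
                      4200.0, 4500.0, 5000.0, 5000.0])  # still running at these hours

def neg_log_likelihood(params):
    beta, eta = np.exp(params)                          # log-parameterised for positivity
    log_pdf = (np.log(beta / eta) + (beta - 1) * np.log(failures / eta)
               - (failures / eta) ** beta)
    log_surv = -(survivors / eta) ** beta               # log of the survival function
    return -(log_pdf.sum() + log_surv.sum())

result = minimize(neg_log_likelihood, x0=np.log([1.5, 4000.0]), method="Nelder-Mead")
beta_hat, eta_hat = np.exp(result.x)
print("shape beta:", round(beta_hat, 2), " scale eta (h):", round(eta_hat))
```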

Relevance: 60.00%

Abstract:

Particle breakage due to fluid flow through various geometries can have a major influence on the performance of particle/fluid processes and on the product quality characteristics of particle/fluid products. In this study, whey protein precipitate dispersions were used as a case study to investigate the effect of flow intensity and exposure time on the breakage of these precipitate particles. Computational fluid dynamic (CFD) simulations were performed to evaluate the turbulent eddy dissipation rate (TED) and associated exposure time along various flow geometries. The focus of this work is on the predictive modelling of particle breakage in particle/fluid systems. A number of breakage models were developed to relate TED and exposure time to particle breakage. The suitability of these breakage models was evaluated for their ability to predict the experimentally determined breakage of the whey protein precipitate particles. A "power-law threshold" breakage model was found to provide a satisfactory capability for predicting the breakage of the whey protein precipitate particles. The whey protein precipitate dispersions were propelled through a number of different geometries such as bends, tees and elbows, and the model accurately predicted the mean particle size attained after flow through these geometries. © 2005 Elsevier Ltd. All rights reserved.
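One plausible form of a "power-law threshold" breakage relation is sketched below for illustration: size reduction occurs only where the turbulent eddy dissipation rate exceeds a critical value and scales as a power law in TED and exposure time. The functional form and parameter values are assumed, not taken from the study.

```python
# Illustrative sketch of a "power-law threshold" breakage relation: particle size is
# reduced only where the turbulent eddy dissipation rate (TED) exceeds a threshold,
# with the reduction scaling as a power law in TED and exposure time. The functional
# form and parameter values here are assumed, not taken from the study.
import numpy as np

def broken_size(d0, epsilon, t_exp, eps_crit=1e3, k=0.02, a=0.3, b=0.2):
    """d0: initial mean size (um); epsilon: TED (W/kg); t_exp: exposure time (s)."""
    excess = np.maximum(epsilon - eps_crit, 0.0)        # no breakage below the threshold
    return d0 / (1.0 + k * excess**a * t_exp**b)

# TED and exposure time for a few flow elements (e.g. bend, tee, elbow) from CFD.
epsilon = np.array([5e2, 2e3, 8e3])
t_exp = np.array([0.05, 0.02, 0.01])
print(np.round(broken_size(12.0, epsilon, t_exp), 2))   # mean size after each element
```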