888 results for Atheoretical regression trees


Relevance:

80.00%

Publisher:

Abstract:

Objective: We used demographic and clinical data to design practical classification models for prediction of neurocognitive impairment (NCI) in people with HIV infection. Methods: The study population comprised 331 HIV-infected patients with available demographic, clinical, and neurocognitive data collected using a comprehensive battery of neuropsychological tests. Classification and regression trees (CART) were developed to obtain detailed and reliable models to predict NCI. Following a practical clinical approach, NCI was considered the main variable for study outcomes, and analyses were performed separately in treatment-naïve and treatment-experienced patients. Results: The study sample comprised 52 treatment-naïve and 279 experienced patients. In the first group, the variables identified as better predictors of NCI were CD4 cell count and age (correct classification [CC]: 79.6%, 3 final nodes). In treatment-experienced patients, the variables most closely related to NCI were years of education, nadir CD4 cell count, central nervous system penetration-effectiveness score, age, employment status, and confounding comorbidities (CC: 82.1%, 7 final nodes). In patients with an undetectable viral load and no comorbidities, we obtained a fairly accurate model in which the main variables were nadir CD4 cell count, current CD4 cell count, time on current treatment, and past highest viral load (CC: 88%, 6 final nodes). Conclusion: Practical classification models to predict NCI in HIV infection can be obtained using demographic and clinical variables. An approach based on CART analyses may facilitate screening for HIV-associated neurocognitive disorders and complement clinical information about risk and protective factors for NCI in HIV-infected patients.
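
As a rough illustration of how a classification tree picks a split, the sketch below searches every predictor and cut point for the binary split with the lowest weighted Gini impurity. It is a minimal sketch, not the study's procedure: the abstract does not say which impurity measure or software was used, and the toy data and the names `gini` and `best_split` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the unsplit node, feature_index is None.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, threshold, score)
    return best


# Hypothetical toy data, NOT from the study: (CD4 cell count, age) -> impaired?
rows = [(150, 55), (180, 60), (220, 48), (450, 35), (520, 41), (610, 52)]
labels = [1, 1, 1, 0, 0, 0]
feature, threshold, impurity = best_split(rows, labels)
```

A full tree repeats this search inside each resulting group until a stopping rule is met; the number of "final nodes" reported in the abstract corresponds to the leaves left after that recursion.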

Relevance:

80.00%

Publisher:

Abstract:

In ecology, for example in studies of the services provided by ecosystems, descriptive, explanatory, and predictive modelling each have a distinct place. Certain well-defined situations call for one or another of these types of modelling; the right choice must be made so that the model can be used in a way consistent with the objectives of the study. In this work, we first explore the explanatory power of the multivariate regression tree (MRT). This modelling method is based on a recursive bipartition algorithm and a resampling method that allows the final model, which is a tree, to be pruned in order to obtain the model producing the best predictions. This asymmetric two-table analysis yields homogeneous groups of objects from the response table, with the divisions between groups corresponding to cut points of the variables in the explanatory table that mark the most abrupt changes in the response. We show that, to compute the explanatory power of the MRT, one must define an adjusted coefficient of determination in which the model's degrees of freedom are estimated by an algorithm. This estimate of the population coefficient of determination is practically unbiased. Since the MRT rests on assumptions of discontinuity whereas canonical redundancy analysis (RDA) models continuous linear gradients, comparing their respective explanatory power makes it possible, among other things, to distinguish which type of pattern the response follows as a function of the explanatory variables. The comparison of explanatory power between RDA and MRT was motivated by the extensive use of RDA to study beta diversity.
Still from an explanatory perspective, we define a new procedure called the cascade multivariate regression tree (CMRT), which builds a model while imposing a hierarchical order on the hypotheses under study. This new procedure makes it possible to study the hierarchical effect of two sets of explanatory variables, main and subordinate, and then to compute their explanatory power. The final model is interpreted as in a hierarchical MANOVA. The results of this analysis can provide additional information about the relationships between the response and the explanatory variables, for example interactions between the two explanatory sets that were not revealed by the usual MRT analysis. In addition, we study the predictive power of generalized linear models by modelling the biomass of different tropical tree species as a function of some of their allometric measurements. More specifically, we examine the ability of Gaussian and gamma error structures to provide the most accurate predictions. We show that, for one species in particular, the predictive power of a model using the gamma error structure is superior. This study is set in a practical context and is intended as an example for managers who want to estimate precisely the carbon capture of tropical tree plantations. Our conclusions could form an integral part of a program to reduce carbon emissions through land-use change.
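
The bipartition step of a multivariate regression tree can be sketched as follows: for one explanatory variable, find the cut point that minimizes the within-group sum of squares summed over all response variables, and report the unadjusted proportion of response variation explained by that single split. This is a minimal sketch under stated assumptions. The toy data and the names `sum_sq` and `best_mrt_split` are hypothetical, and the adjusted coefficient of determination described in the abstract is not reproduced because the abstract does not specify the algorithm that estimates its degrees of freedom.

```python
def sum_sq(Y):
    """Sum of squared deviations from the column means of Y (a list of rows)."""
    if not Y:
        return 0.0
    n_cols = len(Y[0])
    means = [sum(row[k] for row in Y) / len(Y) for k in range(n_cols)]
    return sum((row[k] - means[k]) ** 2 for row in Y for k in range(n_cols))


def best_mrt_split(x, Y):
    """Best cut point on one explanatory variable x for the response table Y.

    Returns (cut, r2), where r2 is the UNADJUSTED share of the total
    sum of squares explained by splitting the objects at `cut`.
    """
    total = sum_sq(Y)
    best_cut, best_within = None, total
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        left = [row for xi, row in zip(x, Y) if xi <= cut]
        right = [row for xi, row in zip(x, Y) if xi > cut]
        within = sum_sq(left) + sum_sq(right)
        if within < best_within:
            best_cut, best_within = cut, within
    r2 = 1.0 - best_within / total if total > 0 else 0.0
    return best_cut, r2


# Hypothetical toy data: one environmental variable, two response variables.
x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
Y = [[10, 1], [11, 1], [12, 1], [2, 6], [3, 6], [4, 6]]
cut, r2 = best_mrt_split(x, Y)
```

Here the abrupt change in both response columns between x = 3 and x = 7 is recovered as the cut point, which is the kind of discontinuity the tree is designed to detect.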

Relevance:

80.00%

Publisher:

Abstract:

The aim of this thesis is to explain the prolific offending of certain offenders. We argue that prolific offending is explained by the more frequent formation of criminogenic situations. These situations refer to the moment when an offender interacts with a criminal opportunity in a context favourable to crime. More precisely, it is the moment when the offender faces this opportunity but the crime has not yet been committed. The formation of criminogenic situations is facilitated by the interaction and interdependence of three elements: the person's propensity for offending, their criminalized entourage, and their lifestyle. Thus, prolific offending cannot be adequately explained without taking into account the interaction between individual risk and contextual risk. The general objective of this thesis is to demonstrate the importance of interactional modelling of individual and contextual risk in explaining the more prolific offending of certain offenders. To this end, 155 offenders under the responsibility of two institutions of the Services correctionnels du Québec and four Quebec youth centres completed an assessment protocol of self-administered questionnaires. First (chapter three), we described and compared the nature of the self-reported offending of the offenders in our sample. This first results chapter showed that this pool of offenders is similar to other offender samples with respect to the nature of their offending, in particular the volume, variety, and seriousness of their crimes.
Indeed, the majority of participants report a low volume of crimes against persons and against property, whereas a small group stands out with a very high lambda (13.1% of the offenders in the sample are responsible for 60.3% of all reported crimes). About four in five offenders report having committed at least one crime against a person and one crime against property. Moreover, more than 50% of these report crimes in at least four subcategories. Finally, although the offenders in our sample have a relatively low average IGC (indice de gravité de la criminalité, a crime severity index; median = 77), nearly 40% report having committed at least one of the two most serious crimes recorded in this study (discharging a weapon and robbery). The second specific objective, addressed in chapter four, was to explore the interaction between offenders' personal characteristics, entourage, and lifestyle in the formation of criminogenic situations. People with a higher propensity for offending appear more likely to be surrounded by criminalized people and to have a more idle lifestyle. The criminalized entourage also appears to influence these offenders' lifestyle. Thus, the interdependence of these three elements facilitates the more frequent formation of criminogenic situations and creates conditions conducive to the emergence of prolific offending. The last specific objective of the thesis, covered in chapter five, was to analyse the impact of the formation of criminogenic situations on the nature of offending. Multiple linear regression analyses and regression trees highlighted the contribution of personal characteristics, entourage, and lifestyle in explaining the nature of offending.
On the one hand, the regression analyses (additive models) suggest that all of the elements favouring the formation of criminogenic situations make a unique contribution to explaining offending. On the other hand, the regression trees gave us a better understanding of the interaction between these elements in explaining prolific offending. Indeed, a lower standing on some elements can be offset by a higher standing on others. Moreover, the accumulation of elements favouring the formation of criminogenic situations does not proceed linearly. These conclusions are supported by proportions of explained variance higher than those of the multiple linear regressions. In conclusion, focusing on a single element (the person and their propensity for offending, or the context and its opportunities), or combining them in a simply additive way, does not do justice to the complexity of the emergence of prolific offending. By empirically testing this generally accepted idea, this thesis underscores the importance of considering the interaction between individual risk and contextual risk in explaining prolific offending.

Relevance:

80.00%

Publisher:

Abstract:

Two types of ecological thresholds are now being widely used to develop conservation targets: breakpoint-based thresholds represent tipping points where system properties change dramatically, whereas classification thresholds identify groups of data points with contrasting properties. Both breakpoint-based and classification thresholds are useful tools in evidence-based conservation. However, it is critical that the type of threshold to be estimated corresponds with the question of interest and that appropriate statistical procedures are used to determine its location. On the basis of their statistical properties, we recommend using piecewise regression methods to identify breakpoint-based thresholds and discriminant analysis or classification and regression trees to identify classification thresholds.
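
To make the breakpoint idea concrete, the sketch below grid-searches the split position that minimizes the combined residual sum of squares of two independently fitted least-squares lines. It is a simplified stand-in for piecewise regression: the two segments are not forced to join at the breakpoint, no confidence interval is computed, and the toy data and the names `line_sse` and `find_breakpoint` are hypothetical.

```python
def line_sse(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def find_breakpoint(xs, ys, min_points=2):
    """Grid-search the breakpoint minimizing the total SSE of two segments.

    Each segment must contain at least `min_points` observations.
    Returns (breakpoint, total_sse); the breakpoint is reported as the
    midpoint between the two observations it separates.
    """
    pairs = sorted(zip(xs, ys))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best_bp, best_sse = None, float("inf")
    for i in range(min_points, len(xs) - min_points + 1):
        sse = line_sse(xs[:i], ys[:i]) + line_sse(xs[i:], ys[i:])
        if sse < best_sse:
            best_bp, best_sse = (xs[i - 1] + xs[i]) / 2, sse
    return best_bp, best_sse


# Hypothetical data: a flat response up to x = 5, then a drop and a decline.
xs = list(range(11))
ys = [10, 10, 10, 10, 10, 10, 7, 6, 5, 4, 3]
bp, sse = find_breakpoint(xs, ys)
```

A classification threshold, by contrast, would be estimated with a tree split or discriminant rule on group labels rather than by fitting slopes on either side of a tipping point.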

Relevance:

80.00%

Publisher:

Abstract:

The goal of this paper is to introduce a class of tree-structured models that combines aspects of regression trees and smooth transition regression models. The model is called the Smooth Transition Regression Tree (STR-Tree). The main idea relies on specifying a multiple-regime parametric model through a tree-growing procedure with smooth transitions among different regimes. Decisions about splits are entirely based on a sequence of Lagrange Multiplier (LM) tests of hypotheses.
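
The difference between a hard tree split and a smooth transition can be sketched in a few lines. The example assumes a logistic transition function, a common choice in smooth transition regression that the abstract itself does not specify; the parameter names `gamma` (smoothness) and `c` (location), and the function names, are hypothetical. The LM-test sequence that decides whether to split is not shown.

```python
import math


def transition(x, gamma, c):
    """Logistic transition weight: near 0 well below c, near 1 well above c.

    Written in a numerically stable form so large |gamma * (x - c)|
    does not overflow math.exp.
    """
    z = gamma * (x - c)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def str_node_predict(x, beta_left, beta_right, gamma, c):
    """Blend two regime-specific means according to the transition weight."""
    g = transition(x, gamma, c)
    return (1.0 - g) * beta_left + g * beta_right


# With a small gamma the two regimes mix gradually around c;
# with a very large gamma the node behaves like an ordinary hard split.
smooth = str_node_predict(0.5, beta_left=2.0, beta_right=4.0, gamma=1.0, c=0.0)
sharp = str_node_predict(0.5, beta_left=2.0, beta_right=4.0, gamma=1000.0, c=0.0)
```

As `gamma` grows, the prediction for any x above c approaches `beta_right`, so an ordinary regression tree can be viewed as the limiting case of this blended node.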

Relevance:

80.00%

Publisher:

Abstract:

A study was conducted to evaluate in vitro the effect of root surface conditioning with basic fibroblast growth factor (b-FGF) on morphology and proliferation of fibroblasts. Three experimental groups were used: non-treated, and treated with 50 µg or 125 µg b-FGF/ml. The dentin samples in each group were divided into subgroups according to the chemical treatment received before application of b-FGF: none, or conditioned with tetracycline-HCl or EDTA. After contact with b-FGF for 5 min, the samples were incubated for 24 h with 1 ml of culture medium containing 1 × 10^5 cells/ml plus 1 ml of culture medium alone. The samples were then subjected to routine preparation for SEM, and random fields were photographed. Three calibrated and blinded examiners assessed morphology and density according to two index systems. Classification and regression trees indicated that the root surfaces treated with 125 µg b-FGF and previously conditioned with tetracycline-HCl or EDTA presented a morphology more suggestive of cellular adhesion and viability (P = 0.004). The density of fibroblasts on samples previously conditioned with EDTA, regardless of treatment with b-FGF, was significantly higher than in the other groups (P < 0.001). The present findings suggest that topical application of b-FGF has a positive influence on both the density and morphology of fibroblasts.

Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

80.00%

Publisher:

Abstract:

The present study aimed to compare the relationship between hermit crabs and the shells they use in two populations of Loxopagurus loxochelis. Samples were collected monthly from July 2002 to June 2003, at Caraguatatuba and Ubatuba Bay, São Paulo, Brazil. The animals sampled had their sex identified and were weighed and measured; their shells were identified, measured, and weighed, and their internal volume was determined. To relate the hermit crabs' characteristics to the shell variables, principal component analysis (PCA) and a regression tree were used. According to the PCA, the three gastropod shells most frequently used by L. loxochelis varied in size. The regression tree successfully explained the relationship between the hermit crab's characteristics and the internal volume of the inhabited shell. It can be inferred that the relationship between the morphometry of an individual hermit crab and its shell is not straightforward and cannot be explained solely on the basis of direct correlations between the body's and the shell's attributes. Several factors (such as the morphometry and the availability of the shell, environmental conditions, and inter- and intraspecific competition) interact and seem to be taken into consideration by the hermit crabs when they choose a shell, resulting in the diversified pattern of shell occupancy shown here and elsewhere.

Relevance:

80.00%

Publisher:

Abstract:

Doctoral program: Clinical and Therapeutic Research.

Relevance:

80.00%

Publisher:

Abstract:

Accurate seasonal to interannual streamflow forecasts based on climate information are critical for optimal management and operation of water resources systems. Because most water supply systems are multipurpose, operating them to meet increasing demand under the growing stresses of climate variability and climate change, population and economic growth, and environmental concerns can be very challenging. This study investigated improvements in water resources systems management through the use of seasonal climate forecasts. Hydrological persistence (streamflow and precipitation) and large-scale recurrent oceanic-atmospheric patterns such as the El Niño/Southern Oscillation (ENSO), Pacific Decadal Oscillation (PDO), North Atlantic Oscillation (NAO), the Atlantic Multidecadal Oscillation (AMO), the Pacific North American (PNA) pattern, and customized sea surface temperature (SST) indices were investigated for their potential to improve streamflow forecast accuracy and increase forecast lead time in a river basin in central Texas. First, an ordinal polytomous logistic regression approach is proposed as a means of incorporating multiple predictor variables into a probabilistic forecast model. Forecast performance is assessed through a cross-validation procedure, using distributions-oriented metrics, and implications for decision making are discussed. Results indicate that, of the predictors evaluated, only hydrologic persistence and Pacific Ocean sea surface temperature patterns associated with ENSO and PDO provide forecasts that are statistically better than climatology. Second, a class of data mining techniques, known as tree-structured models, is investigated to address the nonlinear dynamics of climate teleconnections and to screen promising probabilistic streamflow forecast models for river-reservoir systems. Results show that the tree-structured models can effectively capture the nonlinear features hidden in the data.
Skill scores of probabilistic forecasts generated by both classification trees and logistic regression trees indicate that seasonal inflows throughout the system can be predicted with sufficient accuracy to improve water management, especially in the winter and spring seasons in central Texas. Lastly, a simplified two-stage stochastic economic-optimization model was proposed to investigate improvement in water use efficiency and the potential value of using seasonal forecasts, under the assumption of optimal decision making under uncertainty. Model results demonstrate that incorporating the probabilistic inflow forecasts into the optimization model can provide a significant improvement in seasonal water contract benefits over climatology, with lower average deficits (increased reliability) for a given average contract amount, or improved mean contract benefits for a given level of reliability compared to climatology. The results also illustrate the trade-off between the expected contract amount and reliability, i.e., larger contracts can be signed at greater risk.

Relevance:

80.00%

Publisher:

Abstract:

Brain tumors are among the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. To evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival), and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time to death directly to the gene expression profile. We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Our proposed models also provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares the results obtained by the Random Forest and Bayesian ensemble methods from biological and clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Relevance:

80.00%

Publisher:

Abstract:

Conservation and monitoring of forest biodiversity require reliable information about forest structure and composition at multiple spatial scales. However, detailed data about forest habitat characteristics across large areas are often incomplete due to difficulties associated with field sampling methods. To overcome this limitation, we employed a nationally available light detection and ranging (LiDAR) remote sensing dataset to develop variables describing forest landscape structure across a large environmental gradient in Switzerland. Using a model species indicative of structurally rich mountain forests (hazel grouse Bonasa bonasia), we tested the potential of such variables to predict species occurrence and evaluated the additional benefit of LiDAR data when used in combination with traditional, sample plot-based field variables. We calibrated boosted regression tree (BRT) models for both variable sets separately and in combination, and compared the models' accuracies. While both field-based and LiDAR models performed well, combining the two data sources improved the accuracy of the species' habitat model. The variables retained from the two datasets held different types of information: field variables mostly quantified food resources and cover in the field and shrub layer, whereas LiDAR variables characterized heterogeneity of vegetation structure, which correlated with field variables describing the understory and ground vegetation. When combined with data on forest vegetation composition from field surveys, LiDAR provides valuable complementary information for encompassing species niches more comprehensively. Thus, LiDAR bridges the gap between precise, locally restricted field data and coarse digital land cover information by reliably identifying habitat structure and quality across large areas.
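
The boosting idea behind BRT models can be sketched with one-split trees (stumps) fitted in sequence to the residuals of the current ensemble. This is a minimal sketch under simplifying assumptions: it uses squared-error loss on a single continuous predictor, whereas species-occurrence models of the kind described above would normally use a loss suited to presence/absence data, several predictors, and deeper trees. The toy data, the parameter values, and the names `fit_stump`, `boost`, and `predict` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_mean, right_mean)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Fit stumps sequentially, each one to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean response
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Shrink each stump's contribution by the learning rate.
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Prediction = base value + shrunken sum of all stump outputs."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps)


# Hypothetical toy data: a structural index (x) and a habitat score (y).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
model = boost(xs, ys)
```

The small learning rate means no single tree dominates; accuracy comes from the accumulation of many weak corrections, which is why the number of trees and the learning rate are usually tuned together.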