918 results for regression splines


Relevance:

70.00%

Publisher:

Abstract:

This thesis deals with the Bayesian analysis of functional data in a hydrological context. The main objective is to model water flow data parsimoniously while adequately reproducing their statistical characteristics. Functional data analysis leads us to treat water flow time series as functions to be modelled with a nonparametric method. First, the functions are made more homogeneous by synchronizing them. Then, given a sample of homogeneous curves, we model their statistical characteristics using Bayesian regression splines in a fairly general probabilistic framework. More specifically, we study a family of continuous distributions, which includes those of the exponential family, from which the observations may arise. In addition, to obtain a flexible nonparametric modelling tool, we treat the interior knots, which define the elements of the regression spline basis, as random quantities. We then use reversible jump MCMC to explore the posterior distribution of the interior knots. To simplify this procedure in our general modelling context, we consider approximations of the marginal distribution of the observations, namely one based on the Schwarz information criterion and another that relies on the Laplace approximation. Besides modelling the central tendency of a sample of curves, we also propose a methodology for simultaneously modelling the central tendency and the dispersion of these curves, again within our general probabilistic framework.
Finally, since we study a variety of statistical distributions for the observations, we put forward an approach for determining the most adequate distributions for a given sample of curves.
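To make the role of the Schwarz criterion concrete, here is a minimal sketch of how two candidate sets of interior knots could be compared. It assumes Gaussian errors, a truncated power basis and ordinary least squares, none of which is stated in the abstract; the function names and the simulated data are hypothetical, and the thesis itself works in a more general Bayesian setting with reversible jump MCMC.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Design matrix with columns 1, x, ..., x^degree, then (x - k)_+^degree
    for each interior knot k. Intended for degree >= 1."""
    x = np.asarray(x, dtype=float)
    cols = [x ** p for p in range(degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def bic_for_knots(x, y, knots, degree=3):
    """Gaussian BIC up to an additive constant: n*log(RSS/n) + p*log(n),
    where p is the number of basis coefficients."""
    X = truncated_power_basis(x, knots, degree)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, p = X.shape
    return n * np.log(rss / n) + p * np.log(n)

# Simulated curve (illustrative data only, not from the thesis).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

bic_none = bic_for_knots(x, y, knots=[])                  # plain cubic polynomial
bic_three = bic_for_knots(x, y, knots=[0.25, 0.5, 0.75])  # three interior knots
```

The knot configuration with the smaller BIC would be preferred; in a sampler over knot positions, a quantity of this kind can stand in for the log marginal likelihood.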

Relevance:

70.00%

Publisher:

Abstract:

In this master's thesis, we seek to model two-way tables that are monotone in rows and/or columns, with a view to a possible application to mortality tables. We adopt a nonparametric Bayesian approach and represent the functional form of the data with two-dimensional splines. The objective is to condense a mortality table, that is, to reduce the storage space of the table while minimizing the loss of information. We also wish to study the time needed to reconstruct the table. The approximation must preserve the properties of the reference table, in particular the monotonicity of the data. We work with a basis of monotone spline functions in order to impose monotonicity on the model more easily. Indeed, the flexible structure of splines and their easily handled derivatives make it easier to impose constraints on the desired model. After a review of the one-dimensional modelling of monotone functions, we generalize the approach to the two-dimensional case. We describe how the monotonicity constraints are built into the prior model under the hierarchical Bayesian approach. We then show how to obtain a posterior estimator using Markov chain Monte Carlo methods. Finally, we study the behaviour of our estimator by modelling a table of the normal distribution and a table of Student's t distribution. The estimation of our data of interest, the mortality table, follows in order to assess the improvement in their accessibility.

Relevance:

70.00%

Publisher:

Abstract:

Splines with free knots have been extensively studied in regard to calculating the optimal knot positions. The dependence of the accuracy of approximation on the knot distribution is highly nonlinear, and optimisation techniques face a difficult problem of multiple local minima. The domain of the problem is a simplex, which adds to the complexity. We have applied a recently developed cutting angle method of deterministic global optimisation, which allows one to solve a wide class of optimisation problems on a simplex. The results of the cutting angle method are subsequently improved by a local discrete gradient method. The resulting algorithm is sufficiently fast and guarantees that the global minimum has been reached. The results of numerical experiments are presented.


Relevance:

60.00%

Publisher:

Abstract:

This paper proposes the use of empirical modeling techniques for building microarchitecture-sensitive models for compiler optimizations. The models we build relate program performance to settings of compiler optimization flags, associated heuristics and key microarchitectural parameters. Unlike traditional analytical modeling methods, this relationship is learned entirely from data obtained by measuring performance at a small number of carefully selected compiler/microarchitecture configurations. We evaluate three different learning techniques in this context, viz. linear regression, adaptive regression splines and radial basis function networks. We use the generated models to a) predict program performance at arbitrary compiler/microarchitecture configurations, b) quantify the significance of complex interactions between optimizations and the microarchitecture, and c) efficiently search for 'optimal' settings of optimization flags and heuristics for any given microarchitectural configuration. Our evaluation using benchmarks from the SPEC CPU2000 suite suggests that accurate models (< 5% average error in prediction) can be generated using a reasonable number of simulations. We also find that using compiler settings prescribed by a model-based search can improve program performance by as much as 19% (with an average of 9.5%) over highly optimized binaries.

Relevance:

60.00%

Publisher:

Abstract:

Identifying processes that shape species geographical ranges is a prerequisite for understanding environmental change. Currently, species distribution modelling methods do not offer credible statistical tests of the relative influence of climate factors and typically ignore other processes (e.g. biotic interactions and dispersal limitation). We use a hierarchical model fitted with Markov Chain Monte Carlo to combine ecologically plausible niche structures using regression splines to describe unimodal but potentially skewed response terms. We apply spatially explicit error terms that account for (and may help identify) missing variables. Using three example distributions of European bird species, we map model results to show sensitivity to change in each covariate. We show that the overall strength of climatic association differs between species and that each species has considerable spatial variation in both the strength of the climatic association and the sensitivity to climate change. Our methods are widely applicable to many species distribution modelling problems and enable accurate assessment of the statistical importance of biotic and abiotic influences on distributions.

Relevance:

60.00%

Publisher:

Abstract:

The main focus of this research is the validation of a statistical method in pharmacoepidemiology. More precisely, we compare the results of a previous study, carried out with a case-control design nested in the cohort and used to account for average exposure to treatment, with: (i) the results obtained in a cohort design using the time-varying exposure variable, without adjusting for the time elapsed since exposure; (ii) the results obtained using cumulative exposure weighted by recency; and (iii) the results obtained with the Bayesian method. The covariates will be estimated with the classical approach as well as with the nonparametric Bayesian approach. For the latter, Bayesian model averaging will be used to model the uncertainty in model choice. The technique used in the Bayesian approach was proposed in 1997 but, to our knowledge, it has not been used with a time-dependent variable. To model the cumulative effect of the time-varying exposure, in the classical approach the function assigning weights according to recency will be estimated using regression splines. So that the results can be compared with a previously conducted study, a cohort of people diagnosed with hypertension will be built using the RAMQ and Med-Echo databases. The Cox model including two time-varying variables will be used. The time-varying variables considered in this thesis are the dependent variable (first cerebrovascular event) and one of the independent variables, namely exposure

Relevance:

60.00%

Publisher:

Abstract:

Emergency department access block is an urgent problem faced by many public hospitals today. When access block occurs, patients in need of acute care cannot access inpatient wards within an optimal time frame. A widely held belief is that access block is the end product of a long causal chain, which involves poor discharge planning, insufficient bed capacity, and inadequate admission intensity to the wards. This paper studies the last link of the causal chain: the effect of admission intensity on access block, using data from a metropolitan hospital in Australia. We applied several modern statistical methods to analyze the data. First, we modeled the admission events as a nonhomogeneous Poisson process and estimated time-varying admission intensity with penalized regression splines. Next, we established a functional linear model to investigate the effect of the time-varying admission intensity on emergency department access block. Finally, we used functional principal component analysis to explore the variation in the daily time-varying admission intensities. The analyses suggest that improving admission practice during off-peak hours may have most impact on reducing the number of ED access blocks.
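For orientation, the first step described above can be written down in a generic form. The notation is an assumption and is not taken from the paper: events occur at times $t_1,\dots,t_n$ in $[0,T]$, the log-intensity is expanded in a spline basis $B_1,\dots,B_K$, and $\gamma$ is a smoothing parameter. One common way to set up a penalized spline estimate of a nonhomogeneous Poisson intensity is:

```latex
% Log-likelihood of a nonhomogeneous Poisson process with intensity \lambda(t)
\ell(\lambda) = \sum_{i=1}^{n} \log \lambda(t_i) - \int_{0}^{T} \lambda(t)\,dt ,
\qquad
\log \lambda(t) = \sum_{j=1}^{K} \beta_j B_j(t)

% Penalized criterion maximized over the spline coefficients \beta;
% S is a roughness penalty matrix, for example S_{jk} = \int_0^T B_j''(t)\, B_k''(t)\,dt
\ell_p(\beta) = \ell(\beta) - \frac{\gamma}{2}\, \beta^{\top} S \beta
```

Larger values of $\gamma$ give a smoother estimated intensity. The basis, penalty and smoothing-parameter selection actually used in the study may differ from this sketch.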

Relevance:

60.00%

Publisher:

Abstract:

The considerable search for synergistic agents in cancer research is motivated by the therapeutic benefits achieved by combining anti-cancer agents. Synergistic agents make it possible to reduce dosage while maintaining or enhancing a desired effect. Other favorable outcomes of synergistic agents include reduction in toxicity and minimizing or delaying drug resistance. Dose-response assessment and drug-drug interaction analysis play an important part in the drug discovery process; however, analyses are often poorly done. This dissertation is an effort to notably improve dose-response assessment and drug-drug interaction analysis. The most commonly used method in published analyses is the Median-Effect Principle/Combination Index method (Chou and Talalay, 1984). The Median-Effect Principle/Combination Index method leads to inefficiency by ignoring important sources of variation inherent in dose-response data and discarding data points that do not fit the Median-Effect Principle. Previous work has shown that the conventional method yields a high rate of false positives (Boik, Boik, Newman, 2008; Hennessey, Rosner, Bast, Chen, 2010) and, in some cases, low power to detect synergy. There is a great need for improving the current methodology. We developed a Bayesian framework for dose-response modeling and drug-drug interaction analysis. First, we developed a hierarchical meta-regression dose-response model that accounts for various sources of variation and uncertainty and allows one to incorporate knowledge from prior studies into the current analysis, thus offering a more efficient and reliable inference. Second, in the case that parametric dose-response models do not fit the data, we developed a practical and flexible nonparametric regression method for meta-analysis of independently repeated dose-response experiments. 
Third, and lastly, we developed a method based on Loewe additivity that allows one to quantitatively assess interaction between two agents combined at a fixed dose ratio. The proposed method makes a comprehensive and honest account of uncertainty within drug interaction assessment. Extensive simulation studies show that the novel methodology improves the screening process of effective/synergistic agents and reduces the incidence of type I error. We consider an ovarian cancer cell line study that investigates the combined effect of DNA methylation inhibitors and histone deacetylation inhibitors in human ovarian cancer cell lines. The hypothesis is that the combination of DNA methylation inhibitors and histone deacetylation inhibitors will enhance antiproliferative activity in human ovarian cancer cell lines compared to treatment with each inhibitor alone. By applying the proposed Bayesian methodology, in vitro synergy was declared for the DNA methylation inhibitor 5-AZA-2'-deoxycytidine combined with one histone deacetylation inhibitor, suberoylanilide hydroxamic acid or trichostatin A, in the cell lines HEY and SKOV3. This suggests potential new epigenetic therapies in cell growth inhibition of ovarian cancer cells.
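As a point of reference for the Loewe additivity idea, the sketch below computes the classical point-estimate interaction index d1/D1 + d2/D2, where d1 and d2 are the doses used in combination and D1 and D2 are the doses of each agent alone that give the same effect. It assumes each single-agent curve follows the median-effect equation with known parameters (Dm, m); the function names and numbers are hypothetical. It is not the dissertation's Bayesian method, which additionally accounts for uncertainty.

```python
def dose_for_effect(effect, dm, m):
    """Invert the median-effect equation fa / (1 - fa) = (D / Dm) ** m for D.

    effect: fraction affected, strictly between 0 and 1
    dm:     median-effect dose (dose giving effect 0.5)
    m:      slope parameter of the dose-response curve
    """
    if not 0.0 < effect < 1.0:
        raise ValueError("effect must lie strictly between 0 and 1")
    return dm * (effect / (1.0 - effect)) ** (1.0 / m)

def interaction_index(d1, d2, effect, params1, params2):
    """Loewe interaction index for a combination (d1, d2) producing `effect`.

    params1 and params2 are (dm, m) pairs for each agent alone.
    A value of 1 indicates additivity, below 1 synergy, above 1 antagonism.
    """
    big_d1 = dose_for_effect(effect, *params1)
    big_d2 = dose_for_effect(effect, *params2)
    return d1 / big_d1 + d2 / big_d2

# Hypothetical example: the combination reaches 50% effect with a quarter of
# each single-agent median dose, so the index is 0.25 + 0.25 = 0.5.
index = interaction_index(1.0, 2.0, 0.5, (4.0, 1.0), (8.0, 2.0))
```

A point estimate like this carries no measure of uncertainty, which is the gap the abstract says the Bayesian framework is meant to address.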

Relevance:

60.00%

Publisher:

Abstract:

Sediment samples and hydrographic conditions were studied at 28 stations around Iceland. At these sites, Conductivity-Temperature-Depth (CTD) casts were conducted to collect hydrographic data and multicorer casts were conducted to collect data on sediment characteristics including grain size distribution, carbon and nitrogen concentration, and chloroplastic pigment concentration. A total of 14 environmental predictors were used to model sediment characteristics around Iceland on regional geographic space. For these, two approaches were used: Multivariate Adaptive Regression Splines (MARS) and randomForest regression models. RandomForest outperformed MARS in predicting grain size distribution. MARS models had a greater tendency to over- and underpredict sediment values in areas outside the environmental envelope defined by the training dataset. We provide the first GIS layers on sediment characteristics around Iceland, which can be used as predictors in future models. Although the models performed well, more samples, especially from the shelf areas, will be needed to improve them in future.
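For readers unfamiliar with MARS, the fitted model is a weighted sum of products of hinge functions max(0, x - t) and max(0, t - x). The sketch below only evaluates such a model; the coefficients, knots and variable indices are made up for illustration, and the forward/backward term selection that MARS uses to choose them is not shown.

```python
def hinge(x, knot, sign=1):
    """MARS basis function: max(0, x - knot) for sign=+1, max(0, knot - x) for sign=-1."""
    return max(0.0, sign * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate a MARS-style model at the predictor vector x.

    terms is a list of (coefficient, factors) pairs, where factors is a list of
    (variable_index, knot, sign) triples; each term is a product of hinges.
    """
    total = intercept
    for coef, factors in terms:
        product = 1.0
        for j, knot, sign in factors:
            product *= hinge(x[j], knot, sign)
        total += coef * product
    return total

# Hypothetical model with two main-effect hinges and one interaction term.
terms = [
    (2.0, [(0, 1.0, 1)]),                # 2.0 * max(0, x0 - 1)
    (-1.0, [(0, 1.0, -1)]),              # -1.0 * max(0, 1 - x0)
    (0.5, [(0, 1.0, 1), (1, 0.0, 1)]),   # 0.5 * max(0, x0 - 1) * max(0, x1)
]
prediction = mars_predict([3.0, 4.0], intercept=1.0, terms=terms)
```

Because each hinge grows linearly without bound beyond its knot, predictions can drift when inputs leave the range of the training data, which is consistent with the over- and underprediction outside the environmental envelope noted above.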

Relevance:

60.00%

Publisher:

Abstract:

Within the regression framework, we show how different levels of nonlinearity influence the instantaneous firing rate prediction of single neurons. Nonlinearity can be achieved in several ways. In particular, we can enrich the predictor set with basis expansions of the input variables (enlarging the number of inputs) or train a simple but different model for each area of the data domain. Spline-based models are popular within the first category. Kernel smoothing methods fall into the second category. Whereas the first choice is useful for globally characterizing complex functions, the second is very handy for temporal data and is able to include inner-state subject variations. Also, interactions among stimuli are considered. We compare state-of-the-art firing rate prediction methods with some more sophisticated spline-based nonlinear methods: multivariate adaptive regression splines and sparse additive models. We also study the impact of kernel smoothing. Finally, we explore the combination of various local models in an incremental learning procedure. Our goal is to demonstrate that appropriate nonlinearity treatment can greatly improve the results. We test our hypothesis on both synthetic data and real neuronal recordings in cat primary visual cortex, giving a plausible explanation of the results from a biological perspective.
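The second category mentioned above, fitting a simple local model around each query point, can be illustrated with a Nadaraya-Watson kernel smoother. This is a generic sketch with a Gaussian kernel and a fixed bandwidth, both of which are assumptions; it is not the specific kernel smoothing procedure evaluated in the paper.

```python
import math

def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted local average of ys at the query point x0.

    Each observation (x, y) receives a Gaussian weight that decays with the
    distance |x0 - x| measured in units of the bandwidth. If x0 is extremely
    far from every x, all weights can underflow to zero and the division fails.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total

# Illustrative data only: with two observations, the estimate midway between
# them is their plain average because the weights are equal.
midpoint_estimate = nadaraya_watson(0.5, [0.0, 1.0], [0.0, 2.0], bandwidth=0.3)
```

A small bandwidth tracks local variation closely, while a large one approaches the global mean; a spline basis expansion instead fits one global model on an enlarged predictor set.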

Relevance:

60.00%

Publisher:

Abstract:

1. Recent studies demonstrated the sensitivity of northern forest ecosystems to changes in the amount and duration of snow cover at annual to decadal time scales. However, the consequences of snowfall variability remain uncertain for ecological variables operating at longer time scales, especially the distributions of forest communities.
2. The Great Lakes region of North America offers a unique setting to examine the long-term effects of variable snowfall on forest communities. Lake-effect snow produces a three-fold gradient in annual snowfall over tens of kilometres, and dramatic edaphic variations occur among landform types resulting from Quaternary glaciations. We tested the hypothesis that these factors interact to control the distributions of mesic (dominated by Acer saccharum, Tsuga canadensis and Fagus grandifolia) and xeric forests (dominated by Pinus and Quercus spp.) in northern Lower Michigan.
3. We compiled pre-European-settlement vegetation data and overlaid these data with records of climate, water balance and soil, onto Landtype Association polygons in a geographical information system. We then used multivariate adaptive regression splines to model the abundance of mesic vegetation in relation to environmental controls.
4. Snowfall is the most predictive among five variables retained by our model, and it affects model performance 29% more than soil texture, the second most important variable. The abundance of mesic trees is high on fine-textured soils regardless of snowfall, but it increases with snowfall on coarse-textured substrates. Lake-effect snowfall also determines the species composition within mesic forests. The weighted importance of A. saccharum is significantly greater than of T. canadensis or F. grandifolia within the lake-effect snowbelt, whereas T. canadensis is more plentiful outside the snowbelt. These patterns are probably driven by the influence of snowfall on soil moisture, nutrient availability and fire return intervals.
5. Our results imply that a key factor dictating the spatio-temporal patterns of forest communities in the vast region around the Great Lakes is how the lake-effect snowfall regime responds to global change. Snowfall reductions will probably cause a major decrease in the abundance of ecologically and economically important species, such as A. saccharum.

Relevance:

60.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62G08, 62P30.

Relevance:

40.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)