963 results for Hierarchical models


Relevance:

60.00% 60.00%

Publisher:

Abstract:

Microarrays are high-throughput biological assays that allow the screening of thousands of genes for their expression. The main idea behind microarrays is to compute for each gene a unique signal that is directly proportional to the quantity of mRNA that was hybridized on the chip. The large number of steps involved, and the errors associated with each step, make the generated expression signal noisy. As a result, microarray data need to be carefully pre-processed before their analysis can be assumed to lead to reliable and biologically relevant conclusions. This thesis focuses on developing methods for improving gene signal and further utilizing this improved signal for higher level analysis. To achieve this, first, approaches for designing microarray experiments using various optimality criteria, considering both biological and technical replicates, are described. A carefully designed experiment leads to signal with low noise, as the effect of unwanted variations is minimized and the precision of the estimates of the parameters of interest is maximized. Second, a system for improving the gene signal by using three scans at varying scanner sensitivities is developed. A novel Bayesian latent intensity model is then applied to these three sets of expression values, corresponding to the three scans, to estimate the suitably calibrated true signal of genes. Third, a novel image segmentation approach that segregates the fluorescent signal from the undesired noise is developed using an additional dye, SYBR green RNA II. This technique helps to identify signal arising only from the hybridized DNA, so that signal corresponding to dust, scratches, spilled dye and other noise is avoided. Fourth, an integrated statistical model is developed, in which signal correction, systematic array effects, dye effects and differential expression are modelled jointly, as opposed to a sequential application of several methods of analysis.
The methods described here have been tested only for cDNA microarrays but can, with some modifications, also be applied to other high-throughput technologies. Keywords: High-throughput technology, microarray, cDNA, multiple scans, Bayesian hierarchical models, image analysis, experimental design, MCMC, WinBUGS.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Aims/hypothesis: We investigated the association between the incidence of type 1 diabetes mellitus and remoteness (a proxy measure for exposure to infections) using recently developed techniques for statistical analysis of small-area data.

Subjects, materials and methods: New cases in children aged 0 to 14 years in Northern Ireland were prospectively registered from 1989 to 2003. Ecological analysis was conducted using small geographical units (582 electoral wards) and area characteristics including remoteness, deprivation and child population density. Analysis was conducted using Poisson regression models and Bayesian hierarchical models to allow for spatially correlated risks that were potentially caused by unmeasured explanatory variables.

Results: In Northern Ireland between 1989 and 2003, there were 1,433 new cases of type 1 diabetes, giving a directly standardised incidence rate of 24.7 per 100,000 person-years. Areas in the most remote fifth of all areas had a significantly (p=0.0006) higher incidence of type 1 diabetes mellitus (incidence rate ratio=1.27 [95% CI 1.07, 1.50]) than those in the most accessible fifth of all areas. There was also a higher incidence rate in areas that were less deprived (p<0.0001) and less densely populated (p=0.002). After adjustment for deprivation and additional adjustment for child population density, the association between diabetes and remoteness remained significant (p=0.01 and p=0.03, respectively).

Conclusions/interpretation: In Northern Ireland, there is evidence that remote areas experience higher rates of type 1 diabetes mellitus. This could reflect a reduced or delayed exposure to infections, particularly early in life, in these areas.
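
An incidence rate ratio with a confidence interval, as reported in the results above, can be illustrated with a minimal sketch. The function below computes a crude incidence rate ratio between two groups with a Wald interval on the log scale. The function name and the counts in the usage lines are hypothetical and are not taken from the study, which relied on Poisson regression and Bayesian hierarchical models rather than this crude calculation.

```python
import math


def incidence_rate_ratio(cases_a, person_years_a, cases_b, person_years_b, z=1.96):
    """Crude incidence rate ratio (group A vs. group B) with a Wald interval.

    Illustrative only: the interval is built on the log scale using the
    large-sample standard error sqrt(1/cases_a + 1/cases_b).
    """
    rate_a = cases_a / person_years_a
    rate_b = cases_b / person_years_b
    irr = rate_a / rate_b
    se_log = math.sqrt(1.0 / cases_a + 1.0 / cases_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts (not from the study): 300 cases over 1,000,000
# person-years versus 250 cases over 1,050,000 person-years.
irr, lower, upper = incidence_rate_ratio(300, 1_000_000, 250, 1_050_000)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

A regression-based estimate would additionally adjust for covariates such as deprivation and population density, which this crude ratio does not.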

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Motivated by the need to solve ecological problems (climate change, habitat fragmentation and biological invasions), there has been increasing interest in species distribution models (SDMs). Predictions from these models inform conservation policy, invasive species management and disease-control measures. However, predictions are subject to uncertainty, the degree and source of which are often unrecognized. Here, we review the SDM literature in the context of uncertainty, focusing on three main classes of SDM: niche-based models, demographic models and process-based models. We identify sources of uncertainty for each class and discuss how uncertainty can be minimized or included in the modelling process to give realistic measures of confidence around predictions. Because this has typically not been performed, we conclude that uncertainty in SDMs has often been underestimated and a false precision assigned to predictions of geographical distribution. We identify areas where development of new statistical tools will improve predictions from distribution models, notably the development of hierarchical models that link different types of distribution model and their attendant uncertainties across spatial scales. Finally, we discuss the need to develop more defensible methods for assessing predictive performance, quantifying model goodness-of-fit and assessing the significance of model covariates.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Doctoral thesis, Statistics and Operational Research (Probability and Statistics), Universidade de Lisboa, Faculdade de Ciências, 2014

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The aim of this dissertation was to study a set of companies listed on the Lisbon stock exchange in order to identify those that behave similarly over time. To do so, we used clustering algorithms such as K-Means, PAM, hierarchical models, Funny and C-Means, with both the Euclidean distance and the Manhattan distance. To select the best number of clusters identified by each of the algorithms tested, we used several cluster evaluation/validation indices, such as Davies-Bouldin and Calinski-Harabasz, among others.
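
As a rough illustration of how a validation index such as Calinski-Harabasz ranks competing partitions, the sketch below computes the index from scratch: the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. The function name and the toy points are hypothetical, and the dissertation's actual data and software are not stated in the abstract.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: higher values indicate better-separated clusters."""
    n = len(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    # Between-cluster dispersion: size-weighted squared distance of centroids.
    between = sum(len(m) * sqdist(mean(m), overall) for m in clusters.values())
    # Within-cluster dispersion: squared distance of points to their centroid.
    within = sum(sqdist(p, mean(m)) for m in clusters.values() for p in m)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Toy data (hypothetical): two obvious groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = calinski_harabasz(points, [0, 0, 0, 1, 1, 1])
bad = calinski_harabasz(points, [0, 1, 0, 1, 0, 1])
print(good > bad)
```

To choose a number of clusters, one would compute the index for the partition produced at each candidate value and prefer the value with the highest score.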

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Supervised learning of large-scale hierarchical networks is currently enjoying tremendous success. Despite this surge of activity, unsupervised learning is still regarded by many researchers as a key element of Artificial Intelligence, where agents must learn from a potentially limited amount of data. This thesis follows that line of thought and addresses several research topics related to the problem of density estimation with Boltzmann machines (BMs), probabilistic graphical models at the heart of deep learning. Our contributions span sampling, partition function estimation, optimization and the learning of invariant representations. The thesis begins by presenting a new adaptive sampling algorithm that automatically adjusts the temperature of the simulated Markov chains in order to maintain a high convergence speed throughout learning. When used in the context of stochastic maximum likelihood (SML) learning, our algorithm yields increased robustness to the choice of learning rate, as well as faster convergence. Our results are presented for BMs, but the method is general and applicable to the learning of any probabilistic model that relies on Markov chain sampling. While the maximum likelihood gradient can be approximated by sampling, evaluating the log-likelihood requires an estimate of the partition function. In contrast to traditional approaches, which treat a given model as a black box, we propose instead to exploit the learning dynamics by estimating the successive changes in log-partition incurred at each parameter update.
The estimation problem is reformulated as an inference problem similar to the Kalman filter, but on a two-dimensional graph whose dimensions correspond to the time axis and the temperature parameter. On the topic of optimization, we also present an algorithm that makes it possible to apply the natural gradient efficiently to Boltzmann machines with thousands of units. Until now, its adoption had been limited by its high computational cost and memory requirements. Our algorithm, Metric-Free Natural Gradient (MFNG), avoids explicitly computing the Fisher information matrix (and its inverse) by using a linear solver combined with an efficient matrix-vector product. The algorithm is promising: in terms of the number of function evaluations, MFNG converges faster than SML. Unfortunately, its implementation remains inefficient in terms of computation time. This work also explores the mechanisms underlying the learning of invariant representations. To this end, we use the family of "spike & slab" restricted Boltzmann machines (ssRBM), which we modify so that they can model binary and sparse distributions. The binary latent variables of the ssRBM can be made invariant to a vector subspace by associating with each of them a vector of continuous latent variables (called "slabs"). This results in increased invariance at the representation level and a better classification rate when few labelled data are available. We conclude the thesis with an ambitious topic: learning representations that can separate the factors of variation present in the input signal. We propose a solution based on a bilinear ssRBM (with two groups of latent factors) and formulate the problem as one of "pooling" in complementary vector subspaces.
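
The general idea of obtaining a natural-gradient direction without ever forming the Fisher matrix can be sketched with a matrix-free linear solve. The example below is not the thesis's MFNG implementation: it assumes a damped empirical Fisher matrix built from per-sample score vectors and uses plain conjugate gradient as the linear solver. All names (`make_fisher_matvec`, `conjugate_gradient`, `scores`, `damping`) and the toy numbers are hypothetical.

```python
def make_fisher_matvec(scores, damping=0.1):
    """Return v -> (F + damping * I) v, where F is the mean of s s^T over scores.

    The product is accumulated sample by sample, so F itself is never stored.
    """
    n = len(scores)

    def matvec(v):
        out = [damping * vi for vi in v]
        for s in scores:
            dot = sum(si * vi for si, vi in zip(s, v))
            for j, sj in enumerate(s):
                out[j] += sj * dot / n
        return out

    return matvec


def conjugate_gradient(matvec, b, max_iter=100, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with x = 0 initially
    p = list(r)
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old < tol:
            break
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Toy example (hypothetical numbers): three score vectors and a gradient.
scores = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
gradient = [1.0, 1.0]
fisher_matvec = make_fisher_matvec(scores)
natural_gradient = conjugate_gradient(fisher_matvec, gradient)
print(natural_gradient)
```

Each solver iteration costs one matrix-vector product, so memory grows with the number of parameters rather than with its square.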

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Understanding how the human visual system recognizes objects is one of the key challenges in neuroscience. Inspired by a large body of physiological evidence (Felleman and Van Essen, 1991; Hubel and Wiesel, 1962; Livingstone and Hubel, 1988; Tso et al., 2001; Zeki, 1993), a general class of recognition models has emerged which is based on a hierarchical organization of visual processing, with succeeding stages being sensitive to image features of increasing complexity (Hummel and Biederman, 1992; Riesenhuber and Poggio, 1999; Selfridge, 1959). However, these models appear to be incompatible with some well-known psychophysical results. Prominent among these are experiments investigating recognition impairments caused by vertical inversion of images, especially those of faces. It has been reported that faces that differ "featurally" are much easier to distinguish when inverted than those that differ "configurally" (Freire et al., 2000; Le Grand et al., 2001; Mondloch et al., 2002), a finding that is difficult to reconcile with the aforementioned models. Here we show that after controlling for subjects' expectations, there is no difference between "featurally" and "configurally" transformed faces in terms of inversion effect. This result reinforces the plausibility of simple hierarchical models of object representation and recognition in cortex.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Conservation planning requires identifying pertinent habitat factors and locating geographic areas where land management may improve habitat conditions for high priority species. I derived habitat models and mapped predicted abundance for the Golden-winged Warbler (Vermivora chrysoptera), a species of high conservation concern, using bird counts, environmental variables, and hierarchical models applied at multiple spatial scales. My aim was to understand habitat associations at multiple spatial scales and create a predictive abundance map for purposes of conservation planning for the Golden-winged Warbler. My models indicated a substantial influence of landscape conditions, including strong positive associations with total forest composition within the landscape. However, many of the associations I observed were counter to reported associations at finer spatial extents; for instance, I found Golden-winged Warblers negatively associated with several measures of edge habitat. No single spatial scale dominated, indicating that this species is responding to factors at multiple spatial scales. I found that Golden-winged Warbler abundance was negatively related to Blue-winged Warbler (Vermivora cyanoptera) abundance. I also observed a north-south spatial trend suggestive of a regional climate effect that was not previously noted for this species. The map of predicted abundance indicated a large area of concentrated abundance in west-central Wisconsin, with smaller areas of high abundance along the northern periphery of the Prairie Hardwood Transition. This map of predicted abundance compared favorably with independent evaluation data sets and can thus be used to inform regional planning efforts devoted to conserving this species.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Predictors of random effects are usually based on the popular mixed effects (ME) model developed under the assumption that the sample is obtained from a conceptual infinite population; such predictors are employed even when the actual population is finite. Two alternatives that incorporate the finite nature of the population are obtained from the superpopulation model proposed by Scott and Smith (1969. Estimation in multi-stage surveys. J. Amer. Statist. Assoc. 64, 830-840) or from the finite population mixed model recently proposed by Stanek and Singer (2004. Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 1119-1130). Predictors derived under the latter model, with the additional assumptions that all variance components are known and that within-cluster variances are equal, have smaller mean squared error (MSE) than the competitors based on either the ME or Scott and Smith's models. As population variances are rarely known, we propose method-of-moments estimators to obtain empirical predictors and conduct a simulation study to evaluate their performance. The results suggest that the finite population mixed model empirical predictor is more stable than its competitors since, in terms of MSE, it is either the best or the second best and, when second best, its performance lies within acceptable limits. When both cluster and unit intra-class correlation coefficients are very high (e.g., 0.95 or more), the performance of the empirical predictors derived under the three models is similar. (c) 2007 Elsevier B.V. All rights reserved.
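
For orientation, the familiar predictor under the standard (infinite-population) ME model can be sketched as a shrinkage of the observed cluster mean toward the overall mean. The sketch assumes a balanced one-way random-effects model with known overall mean and known variance components. It is not the finite population mixed model predictor of Stanek and Singer, and the function name and numbers are hypothetical.

```python
def shrinkage_predictor(cluster_mean, overall_mean, var_between, var_within, n):
    """Predict a cluster's latent mean under a one-way random-effects model.

    The shrinkage factor k = var_between / (var_between + var_within / n)
    lies in [0, 1]: noisier or smaller clusters are pulled harder toward
    the overall mean.
    """
    k = var_between / (var_between + var_within / n)
    return overall_mean + k * (cluster_mean - overall_mean)


# Hypothetical numbers: a cluster of 4 units with observed mean 12,
# overall mean 10, between-cluster variance 4, within-cluster variance 16.
print(shrinkage_predictor(12.0, 10.0, 4.0, 16.0, 4))
```

An empirical predictor replaces the known variances with estimates, for example method-of-moments estimates as in the abstract above.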

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The private health insurance market has been marked by rising health care costs, expanded coverage of procedures, restrictions on premium adjustments and increased solvency guarantees required by the Agência Nacional de Saúde Suplementar (ANS), all of which affect the economic and financial performance of health plan operators. This dissertation aims to analyse the economic and financial performance of operators in the self-managed, medical cooperative, group medicine and insurer categories over the period 2001 to 2012. An operational and accounting database available on the ANS website, with 5,775 observations, was used, and economic and financial performance was assessed through five indicators: Return on Assets, Operating Return on Assets, Return on Equity, Current Ratio and Loss Ratio. Two hierarchical models were adopted to estimate the effects of operator, category and size on performance. Among these, the study found that the operator effect accounts for most of the explained variability in performance. The investigation made it possible to identify the operators with the best performance over the period, guiding future qualitative studies aimed at understanding the main factors that explain superior performance.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

We propose models to analyze animal growth data with the aim of estimating and predicting quantities of biological and economic interest such as the maturing rate and asymptotic weight. We also study the effect of environmental factors with a relevant influence on the growth process. The models considered in this paper are based on an extension and specialization of the dynamic hierarchical model (Gamerman & Migon, 1993) to a non-linear growth curve setting, where some of the growth curve parameters are considered exchangeable among the units. Inference for these models is an approximate conjugate analysis based on Taylor series expansions and linear Bayes procedures.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Although most recent publications focus on Ventilator-associated Pneumonia, Non-Ventilator-associated Hospital-acquired pneumonia (NVHAP) is still worrisome. We studied risk factors for NVHAP among patients admitted to a small teaching hospital. Sixty-six NVHAP case patients and 66 controls admitted to the hospital from November 2005 through November 2006 were enrolled in a case-control study. Variables under investigation included demographic characteristics, comorbidities, procedures, invasive devices and use of medications (Sedatives, Antacids, Steroids and Antimicrobials). Univariate and multivariable analyses (hierarchical models of logistic regression) were performed. The incidence of NVHAP in our hospital was 0.68% (1.02 per 1,000 patient-days). Results from multivariable analysis identified risk factors for NVHAP: age (Odds Ratio[OR]=1.03, 95% Confidence Interval[CI]=1.01-1.05, p=0.002), use of Antacids (OR=5.29, 95%CI=1.89-4.79, p=0.001) and Central Nervous System disease (OR=3.13, 95%CI=1.24-7.93, p=0.02). Although our findings are coherent with previous reports, the association of Antacids with NVHAP recalls a controversial issue in the physiopathology of Hospital-Acquired Pneumonia, with possible implications for preventive strategies.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This study aimed to analyze the spatial distribution of dengue risk and its association with socio-environmental conditions. This was an ecological study of the counts of autochthonous dengue cases in the municipality of Campinas, São Paulo State, Brazil, in the year 2007, aggregated according to 47 coverage areas of municipal health centers. Spatial models for mapping diseases were constructed with Bayesian hierarchical models, based on Integrated Nested Laplace Approximation (INLA). The analyses were stratified according to two age groups, 0 to 14 years and above 14 years. The results indicate that the spatial distribution of dengue risk is not associated with socio-environmental conditions in the 0 to 14 year age group. In the age group older than 14 years, the relative risk of dengue increases significantly as the level of socio-environmental deprivation increases. Mapping of socio-environmental deprivation and dengue cases proved to be a useful tool for data analysis in dengue surveillance systems.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Graduate program in Animal Genetics and Breeding - FCAV

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Graduate program in Applied and Computational Mathematics - FCT