952 resultados para Doubly robust estimation
Resumo:
PURPOSE: The aim of this study was to develop models based on kernel regression and probability estimation in order to predict and map IRC in Switzerland by taking into account all of the following: architectural factors, spatial relationships between the measurements, as well as geological information. METHODS: We looked at about 240,000 IRC measurements carried out in about 150,000 houses. As predictor variables we included: building type, foundation type, year of construction, detector type, geographical coordinates, altitude, temperature and lithology into the kernel estimation models. We developed predictive maps as well as a map of the local probability to exceed 300 Bq/m(3). Additionally, we developed a map of a confidence index in order to estimate the reliability of the probability map. RESULTS: Our models were able to explain 28% of the variations of IRC data. All variables added information to the model. The model estimation revealed a bandwidth for each variable, making it possible to characterize the influence of each variable on the IRC estimation. Furthermore, we assessed the mapping characteristics of kernel estimation overall as well as by municipality. Overall, our model reproduces spatial IRC patterns which were already obtained earlier. On the municipal level, we could show that our model accounts well for IRC trends within municipal boundaries. Finally, we found that different building characteristics result in different IRC maps. Maps corresponding to detached houses with concrete foundations indicate systematically smaller IRC than maps corresponding to farms with earth foundation. CONCLUSIONS: IRC mapping based on kernel estimation is a powerful tool to predict and analyze IRC on a large-scale as well as on a local level. This approach enables to develop tailor-made maps for different architectural elements and measurement conditions and to account at the same time for geological information and spatial relations between IRC measurements.
Resumo:
Image registration has been proposed as an automatic method for recovering cardiac displacement fields from Tagged Magnetic Resonance Imaging (tMRI) sequences. Initially performed as a set of pairwise registrations, these techniques have evolved to the use of 3D+t deformation models, requiring metrics of joint image alignment (JA). However, only linear combinations of cost functions defined with respect to the first frame have been used. In this paper, we have applied k-Nearest Neighbors Graphs (kNNG) estimators of the -entropy (H ) to measure the joint similarity between frames, and to combine the information provided by different cardiac views in an unified metric. Experiments performed on six subjects showed a significantly higher accuracy (p < 0.05) with respect to a standard pairwise alignment (PA) approach in terms of mean positional error and variance with respect to manually placed landmarks. The developed method was used to study strains in patients with myocardial infarction, showing a consistency between strain, infarction location, and coronary occlusion. This paper also presentsan interesting clinical application of graph-based metric estimators, showing their value for solving practical problems found in medical imaging.
Resumo:
This paper presents a new and original variational framework for atlas-based segmentation. The proposed framework integrates both the active contour framework, and the dense deformation fields of optical flow framework. This framework is quite general and encompasses many of the state-of-the-art atlas-based segmentation methods. It also allows to perform the registration of atlas and target images based on only selected structures of interest. The versatility and potentiality of the proposed framework are demonstrated by presenting three diverse applications: In the first application, we show how the proposed framework can be used to simulate the growth of inconsistent structures like a tumor in an atlas. In the second application, we estimate the position of nonvisible brain structures based on the surrounding structures and validate the results by comparing with other methods. In the final application, we present the segmentation of lymph nodes in the Head and Neck CT images, and demonstrate how multiple registration forces can be used in this framework in an hierarchical manner.
Resumo:
We consider the problem of estimating the mean hospital cost of stays of a class of patients (e.g., a diagnosis-related group) as a function of patient characteristics. The statistical analysis is complicated by the asymmetry of the cost distribution, the possibility of censoring on the cost variable, and the occurrence of outliers. These problems have often been treated separately in the literature, and a method offering a joint solution to all of them is still missing. Indirect procedures have been proposed, combining an estimate of the duration distribution with an estimate of the conditional cost for a given duration. We propose a parametric version of this approach, allowing for asymmetry and censoring in the cost distribution and providing a mean cost estimator that is robust in the presence of extreme values. In addition, the new method takes covariate information into account.
Resumo:
The comparison of radiotherapy techniques regarding secondary cancer risk has yielded contradictory results possibly stemming from the many different approaches used to estimate risk. The purpose of this study was to make a comprehensive evaluation of different available risk models applied to detailed whole-body dose distributions computed by Monte Carlo for various breast radiotherapy techniques including conventional open tangents, 3D conformal wedged tangents and hybrid intensity modulated radiation therapy (IMRT). First, organ-specific linear risk models developed by the International Commission on Radiological Protection (ICRP) and the Biological Effects of Ionizing Radiation (BEIR) VII committee were applied to mean doses for remote organs only and all solid organs. Then, different general non-linear risk models were applied to the whole body dose distribution. Finally, organ-specific non-linear risk models for the lung and breast were used to assess the secondary cancer risk for these two specific organs. A total of 32 different calculated absolute risks resulted in a broad range of values (between 0.1% and 48.5%) underlying the large uncertainties in absolute risk calculation. The ratio of risk between two techniques has often been proposed as a more robust assessment of risk than the absolute risk. We found that the ratio of risk between two techniques could also vary substantially considering the different approaches to risk estimation. Sometimes the ratio of risk between two techniques would range between values smaller and larger than one, which then translates into inconsistent results on the potential higher risk of one technique compared to another. We found however that the hybrid IMRT technique resulted in a systematic reduction of risk compared to the other techniques investigated even though the magnitude of this reduction varied substantially with the different approaches investigated. Based on the epidemiological data available, a reasonable approach to risk estimation would be to use organ-specific non-linear risk models applied to the dose distributions of organs within or near the treatment fields (lungs and contralateral breast in the case of breast radiotherapy) as the majority of radiation-induced secondary cancers are found in the beam-bordering regions.
Resumo:
Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.
Resumo:
Many transportation agencies maintain grade as an attribute in roadway inventory databases; however, the information is often in an aggregated format. Cross slope is rarely included in large roadway inventories. Accurate methods available to collect grade and cross slope include global positioning systems, traditional surveying, and mobile mapping systems. However, most agencies do not have the resources to utilize these methods to collect grade and cross slope on a large scale. This report discusses the use of LIDAR to extract roadway grade and cross slope for large-scale inventories. Current data collection methods and their advantages and disadvantages are discussed. A pilot study to extract grade and cross slope from a LIDAR data set, including methodology, results, and conclusions, is presented. This report describes the regression methodology used to extract and evaluate the accuracy of grade and cross slope from three dimensional surfaces created from LIDAR data. The use of LIDAR data to extract grade and cross slope on tangent highway segments was evaluated and compared against grade and cross slope collected using an automatic level for 10 test segments along Iowa Highway 1. Grade and cross slope were measured from a surface model created from LIDAR data points collected for the study area. While grade could be estimated to within 1%, study results indicate that cross slope cannot practically be estimated using a LIDAR derived surface model.
Resumo:
The HIV vaccine strategy that, to date, generated immune protection consisted of a prime-boost regimen using a canarypox vector and an HIV envelope protein with alum, as shown in the RV144 trial. Since the efficacy was weak, and previous HIV vaccine trials designed to generate antibody responses failed, we hypothesized that generation of T cell responses would result in improved protection. Thus, we tested the immunogenicity of a similar envelope-based vaccine using a mouse model, with two modifications: a clade C CN54gp140 HIV envelope protein was adjuvanted by the TLR9 agonist IC31®, and the viral vector was the vaccinia strain NYVAC-CN54 expressing HIV envelope gp120. The use of IC31® facilitated immunoglobulin isotype switching, leading to the production of Env-specific IgG2a, as compared to protein with alum alone. Boosting with NYVAC-CN54 resulted in the generation of more robust Th1 T cell responses. Moreover, gp140 prime with IC31® and alum followed by NYVAC-CN54 boost resulted in the formation and persistence of central and effector memory populations in the spleen and an effector memory population in the gut. Our data suggest that this regimen is promising and could improve the protection rate by eliciting strong and long-lasting humoral and cellular immune responses.
Resumo:
Abstract
Resumo:
Breast milk transmission of HIV remains an important mode of infant HIV acquisition. Enhancement of mucosal HIV-specific immune responses in milk of HIV-infected mothers through vaccination may reduce milk virus load or protect against virus transmission in the infant gastrointestinal tract. However, the ability of HIV/SIV strategies to induce virus-specific immune responses in milk has not been studied. In this study, five uninfected, hormone-induced lactating, Mamu A*01(+) female rhesus monkey were systemically primed and boosted with rDNA and the attenuated poxvirus vector, NYVAC, containing the SIVmac239 gag-pol and envelope genes. The monkeys were boosted a second time with a recombinant Adenovirus serotype 5 vector containing matching immunogens. The vaccine-elicited immunodominant epitope-specific CD8(+) T lymphocyte response in milk was of similar or greater magnitude than that in blood and the vaginal tract but higher than that in the colon. Furthermore, the vaccine-elicited SIV Gag-specific CD4(+) and CD8(+) T lymphocyte polyfunctional cytokine responses were more robust in milk than in blood after each virus vector boost. Finally, SIV envelope-specific IgG responses were detected in milk of all monkeys after vaccination, whereas an SIV envelope-specific IgA response was only detected in one vaccinated monkey. Importantly, only limited and transient increases in the proportion of activated or CCR5-expressing CD4(+) T lymphocytes in milk occurred after vaccination. Therefore, systemic DNA prime and virus vector boost of lactating rhesus monkeys elicits potent virus-specific cellular and humoral immune responses in milk and may warrant further investigation as a strategy to impede breast milk transmission of HIV.
Resumo:
The objective of this work was to develop a procedure to estimate soybean crop areas in Rio Grande do Sul state, Brazil. Estimations were made based on the temporal profiles of the enhanced vegetation index (Evi) calculated from moderate resolution imaging spectroradiometer (Modis) images. The methodology developed for soybean classification was named Modis crop detection algorithm (MCDA). The MCDA provides soybean area estimates in December (first forecast), using images from the sowing period, and March (second forecast), using images from the sowing and maximum crop development periods. The results obtained by the MCDA were compared with the official estimates on soybean area of the Instituto Brasileiro de Geografia e Estatística. The coefficients of determination ranged from 0.91 to 0.95, indicating good agreement between the estimates. For the 2000/2001 crop year, the MCDA soybean crop map was evaluated using a soybean crop map derived from Landsat images, and the overall map accuracy was approximately 82%, with similar commission and omission errors. The MCDA was able to estimate soybean crop areas in Rio Grande do Sul State and to generate an annual thematic map with the geographic position of the soybean fields. The soybean crop area estimates by the MCDA are in good agreement with the official agricultural statistics.