852 results for Initial data problem
Abstract:
This article describes a method for determining the polydispersity index Ip2 = Mz/Mw of the molecular weight distribution (MWD) of linear polymeric materials from linear viscoelastic data. The method uses the Mellin transform of the relaxation modulus of a simple molecular rheological model. One of the main features of this technique is that it yields MWD information directly from dynamic shear experiments. The relaxation spectrum does not need to be computed, so the ill-posed problem is avoided. Furthermore, no particular shape of the continuous MWD has to be assumed in order to obtain the polydispersity index. The technique has been developed for entangled linear polymers, whatever the form of the MWD. The rheological information required to obtain the polydispersity index is the storage G′(ω) and loss G″(ω) moduli, extending from the terminal zone to the plateau region. The method gives good agreement between the theoretical predictions and the experimental polydispersity indices of several linear polymers over a wide range of average molecular weights and polydispersity indices. It is also applicable to binary blends.
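As a reminder of what the index Ip2 = Mz/Mw measures, the sketch below computes Mw, Mz and their ratio for a discrete MWD given as weight fractions. It only illustrates the standard definitions of the averages; it is not the Mellin-transform method of the abstract, and the function name and the example blend are assumptions made for illustration.

```python
def molecular_weight_averages(masses, weight_fractions):
    """Return (Mw, Mz, Ip2) for a discrete MWD given as weight fractions.

    Mw = sum(w_i * M_i), Mz = sum(w_i * M_i**2) / sum(w_i * M_i), Ip2 = Mz / Mw.
    """
    total = sum(weight_fractions)
    w = [x / total for x in weight_fractions]  # normalise so fractions sum to 1
    first_moment = sum(wi * mi for wi, mi in zip(w, masses))
    second_moment = sum(wi * mi ** 2 for wi, mi in zip(w, masses))
    mw = first_moment
    mz = second_moment / first_moment
    return mw, mz, mz / mw


# Hypothetical binary blend: 50 wt% of 100 kg/mol and 50 wt% of 400 kg/mol.
mw, mz, ip2 = molecular_weight_averages([100e3, 400e3], [0.5, 0.5])
print(mw, mz, ip2)
```

A monodisperse sample gives Ip2 = 1, and the index grows as the high-mass tail of the distribution becomes more pronounced.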
Abstract:
This paper investigates the use of ensembles of predictors in order to improve the performance of spatial prediction methods. Support vector regression (SVR), a popular method from the field of statistical machine learning, is used. Several instances of SVR are combined using different data sampling schemes (bagging and boosting). Bagging shows good performance, and proves to be more computationally efficient than training a single SVR model while reducing error. Boosting, however, does not improve results on this specific problem.
Abstract:
A 7-year-old right-handed girl developed partial complex seizures with a left-sided onset. A brief period of post-ictal aphasia of the conduction type was documented before seizure control and complete normalization of oral language were obtained. We also found that she had a history of previous unexplained difficulty with written language acquisition that had occurred prior to the clinically recognized epilepsy and a subsequent loss of this ability. This rapidly improved with control of the epilepsy. The evolution of written language has been followed for 3 years, and continued improvement has occurred with fluctuations related to her epilepsy. This observation adds support to the growing body of data indicating that specific cognitive disturbances can be due to epilepsy in young children. It shows the vulnerability of skills which are in a period of active development, and the possibility that oral and written language can be differentially affected by cerebral dysfunction in the young child.
Abstract:
U-Pb dating of zircons by laser ablation inductively coupled plasma mass spectrometry (LA-ICPMS) is a widely used analytical technique in Earth Sciences. For U-Pb ages below 1 billion years (1 Ga), Pb-206/U-238 dates are usually used, as they show the least bias from external parameters such as the presence of initial lead and its isotopic composition in the analysed mineral. Precision and accuracy of the Pb/U ratio are thus of highest importance in LA-ICPMS geochronology. We consider the evaluation of the statistical distribution of the sweep intensities based on goodness-of-fit tests in order to find a model probability distribution fitting the data and to apply an appropriate formulation for the standard deviation. We then discuss three main methods to calculate the Pb/U intensity ratio and its uncertainty in LA-ICPMS: (1) ratio-of-the-mean intensities method, (2) mean-of-the-intensity-ratios method and (3) intercept method. These methods apply different functions to the same raw intensity vs. time data to calculate the mean Pb/U intensity ratio. Thus, the calculated intensity ratio and its uncertainty depend on the method applied. We demonstrate that the accuracy and, conditionally, the precision of the ratio-of-the-mean intensities method are invariant to the intensity fluctuations and averaging related to the dwell time selection and off-line data transformation (averaging of several sweeps); we present a statistical approach for calculating the uncertainty of this method for transient signals. We also show that the accuracy of methods (2) and (3) is influenced by the intensity fluctuations and averaging, and the extent of this influence can amount to tens of percentage points; we show that the uncertainty of these methods also depends on how the signal is averaged. Each of the above methods imposes requirements on the instrumentation.
The ratio-of-the-mean intensities method is sufficiently accurate provided the laser-induced fractionation between the beginning and the end of the signal is kept low and linear. We show, based on a comprehensive series of analyses with different ablation pit sizes, energy densities and repetition rates for a 193 nm ns-ablation system, that such a fractionation behaviour requires using a low ablation speed (low energy density and low repetition rate). Overall, we conclude that the ratio-of-the-mean intensities method combined with low sampling rates is the most mathematically accurate among the existing data treatment methods for U-Pb zircon dating by sensitive sector field ICPMS.
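The three estimators named in the abstract can be contrasted on the same per-sweep intensities with a minimal sketch. The function names and the sample numbers are assumptions for illustration, and the intercept method is rendered here as an ordinary least-squares line through the per-sweep ratios evaluated at t = 0, which may differ in detail from the authors' formulation.

```python
from statistics import mean


def ratio_of_means(pb, u):
    """(1) Ratio of the mean intensities: mean(Pb) / mean(U)."""
    return mean(pb) / mean(u)


def mean_of_ratios(pb, u):
    """(2) Mean of the per-sweep intensity ratios: mean(Pb_i / U_i)."""
    return mean([p / q for p, q in zip(pb, u)])


def intercept_ratio(times, pb, u):
    """(3) Least-squares line through per-sweep ratios, evaluated at t = 0."""
    ratios = [p / q for p, q in zip(pb, u)]
    t_bar, r_bar = mean(times), mean(ratios)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (r - r_bar) for t, r in zip(times, ratios))
    slope = sxy / sxx
    return r_bar - slope * t_bar


# Hypothetical sweep data (arbitrary intensity units), not from the paper.
times = [1.0, 2.0, 3.0, 4.0]
u = [1000.0, 2000.0, 500.0, 1500.0]
pb = [100.0, 220.0, 45.0, 160.0]
print(ratio_of_means(pb, u), mean_of_ratios(pb, u), intercept_ratio(times, pb, u))
```

With fluctuating intensities, (1) and (2) give different values for the same raw data, which is the method dependence the abstract describes.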
Abstract:
Background: Conventional magnetic resonance imaging (MRI) techniques are highly sensitive in detecting multiple sclerosis (MS) plaques, enabling a quantitative assessment of inflammatory activity and lesion load. In quantitative analyses of focal lesions, manual or semi-automated segmentations have been widely used to compute the total number of lesions and the total lesion volume. These techniques, however, are both challenging and time-consuming, and are also prone to intra-observer and inter-observer variability. Aim: To develop an automated approach to segment brain tissues and MS lesions from brain MRI images. The goal is to reduce user interaction and to provide an objective tool that eliminates the inter- and intra-observer variability. Methods: Based on the recent methods developed by Souplet et al. and de Boer et al., we propose a novel pipeline which includes the following steps: bias correction, skull stripping, atlas registration, tissue classification, and lesion segmentation. After the initial pre-processing steps, an MRI scan is automatically segmented into 4 classes: white matter (WM), grey matter (GM), cerebrospinal fluid (CSF) and partial volume. An expectation maximisation method which fits a multivariate Gaussian mixture model to T1-w, T2-w and PD-w images is used for this purpose. Based on the obtained tissue masks and using the estimated GM mean and variance, we apply an intensity threshold to the FLAIR image, which provides the lesion segmentation. To improve this initial result, spatial information from the neighbouring tissue labels is used to refine the final lesion segmentation. Results: The experimental evaluation was performed using real 1.5T data sets and the corresponding ground truth annotations provided by expert radiologists. The following values were obtained: a true positive (TP) fraction of 64%, a false positive (FP) fraction of 80%, and an average surface distance of 7.89 mm.
The results of our approach were quantitatively compared to our implementations of the works of Souplet et al. and de Boer et al., obtaining higher TP and lower FP values. Conclusion: Promising MS lesion segmentation results have been obtained in terms of TP. However, the high number of FP, which is still a well-known problem of all automated MS lesion segmentation approaches, has to be reduced before these approaches can be used in standard clinical practice. Our future work will focus on tackling this issue.
Abstract:
Geophysical techniques can help to bridge the inherent gap with regard to spatial resolution and the range of coverage that plagues classical hydrological methods. This has led to the emergence of the new and rapidly growing field of hydrogeophysics. Given the differing sensitivities of various geophysical techniques to hydrologically relevant parameters and their inherent trade-off between resolution and range, the fundamental usefulness of multi-method hydrogeophysical surveys for reducing uncertainties in data analysis and interpretation is widely accepted. A major challenge arising from such endeavors is the quantitative integration of the resulting vast and diverse database in order to obtain a unified model of the probed subsurface region that is internally consistent with all available data. To address this problem, we have developed a strategy towards hydrogeophysical data integration based on Monte-Carlo-type conditional stochastic simulation that we consider to be particularly suitable for local-scale studies characterized by high-resolution and high-quality datasets. Monte-Carlo-based optimization techniques are flexible and versatile, can account for a wide variety of data and constraints of differing resolution and hardness, and thus have the potential of providing, in a geostatistical sense, highly detailed and realistic models of the pertinent target parameter distributions. Compared to more conventional approaches of this kind, our approach provides significant advancements in the way that the larger-scale deterministic information resolved by the hydrogeophysical data can be accounted for, which represents an inherently problematic, and as yet unresolved, aspect of Monte-Carlo-type conditional simulation techniques. We present the results of applying our algorithm to the integration of porosity log and tomographic crosshole georadar data to generate stochastic realizations of the local-scale porosity structure.
Our procedure is first tested on pertinent synthetic data and then applied to corresponding field data collected at the Boise Hydrogeophysical Research Site near Boise, Idaho, USA.
Abstract:
Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we present a method for computing the weights in a weighted sum fusion for score combinations, by means of a likelihood model. The maximum likelihood estimation is set as a linear programming problem. The scores are derived from a GMM classifier working on a different feature extractor. Our experimental results assessed the robustness of the system against changes over time (different sessions) and against a change of microphone. The improvements obtained were significantly better (error bars of two standard deviations) than a uniform weighted sum, a uniform weighted product or the best single classifier. The computational cost of the proposed method scales with the number of scores to be fused in the same way as the simplex method for linear programming.
Abstract:
This investigation is the final phase of a three-part study whose overall objectives were to determine if a restraining force is required to prevent inlet uplift failures in corrugated metal pipe (CMP) installations, and to develop a procedure for calculating the required force when restraint is required. In the initial phase of the study (HR-306), the extent of the uplift problem in Iowa was determined and the forces acting on a CMP were quantified. In the second phase of the study (HR-332), laboratory and field tests were conducted. Laboratory tests measured the longitudinal stiffness of CMP, and a full-scale field test on a 3.05 m (10 ft) diameter CMP with 0.61 m (2 ft) of cover determined the soil-structure interaction in response to uplift forces. Reported herein are the tasks that were completed in the final phase of the study. In this phase, a buried 2.44 m (8 ft) CMP was tested with and without end-restraint and with various configurations of soil at the inlet end of the pipe. A total of four different soil configurations were tested; in all tests the soil cover was constant at 0.61 m (2 ft). Data from these tests were used to verify the finite element analysis (FEA) model that was developed in this phase of the research. Both experiments and analyses indicate that the primary soil contribution to uplift resistance occurs in the foreslope and that depth of soil cover does not affect the required tiedown force. Using the FEA, design charts were developed with which engineers can determine for a given situation if a restraint force is required to prevent an uplift failure. If an engineer determines restraint is needed, the design charts provide the magnitude of the required force. The design charts are applicable to six gages of CMP for four flow conditions and two types of soil.
Abstract:
Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending the corresponding approaches to the regional scale represents a major, and as yet largely unresolved, challenge. To address this problem, we have developed an upscaling procedure based on a Bayesian sequential simulation approach. This method is then applied to the stochastic integration of low-resolution, regional-scale electrical resistivity tomography (ERT) data in combination with high-resolution, local-scale downhole measurements of the hydraulic and electrical conductivities. Finally, the overall viability of this upscaling approach is tested and verified by performing and comparing flow and transport simulations through the original and the upscaled hydraulic conductivity fields. Our results indicate that the proposed procedure does indeed allow for obtaining remarkably faithful estimates of the regional-scale hydraulic conductivity structure and correspondingly reliable predictions of the transport characteristics over relatively long distances.
Abstract:
Quantifying the spatial configuration of hydraulic conductivity (K) in heterogeneous geological environments is essential for accurate predictions of contaminant transport, but is difficult because of the inherent limitations in resolution and coverage associated with traditional hydrological measurements. To address this issue, we consider crosshole and surface-based electrical resistivity geophysical measurements, collected in time during a saline tracer experiment. We use a Bayesian Markov-chain-Monte-Carlo (McMC) methodology to jointly invert the dynamic resistivity data, together with borehole tracer concentration data, to generate multiple posterior realizations of K that are consistent with all available information. We do this within a coupled inversion framework, whereby the geophysical and hydrological forward models are linked through an uncertain relationship between electrical resistivity and concentration. To minimize computational expense, a facies-based subsurface parameterization is developed. The Bayesian-McMC methodology allows us to explore the potential benefits of including the geophysical data into the inverse problem by examining their effect on our ability to identify fast flowpaths in the subsurface, and their impact on hydrological prediction uncertainty. Using a complex, geostatistically generated, two-dimensional numerical example representative of a fluvial environment, we demonstrate that flow model calibration is improved and prediction error is decreased when the electrical resistivity data are included. The worth of the geophysical data is found to be greatest for long spatial correlation lengths of subsurface heterogeneity with respect to wellbore separation, where flow and transport are largely controlled by highly connected flowpaths.
Abstract:
Purpose: To examine the efficacy and safety of repeat deep sclerectomy (DS) versus Baerveldt shunt (BS) implantation as second-line surgery following failed primary DS. Methods: Fifty-one patients were prospectively recruited to undergo BS implantation following failed DS and 51 patients underwent repeat DS, for which data were collected retrospectively. All eyes had at least one failed DS. Surgical success was defined as IOP ≤ 21 mmHg and a 20% reduction in IOP from baseline. Success rates, number of glaucoma medications (GMs), IOP, and complication rates were compared between the two groups at year 1 post-operatively. Results: Mean age, sex and the proportion of glaucoma subtypes were similar between groups. Preoperatively, IOP was significantly lower in the DS group than in the BS group (18.8 mmHg vs 23.8 mmHg, p<0.01, two-sample t-test). Postoperatively, IOP was significantly higher in the DS group than in the BS group (14.6 mmHg vs 12.0 mmHg, p<0.01, two-sample t-test). In the DS group, 47% of eyes did not achieve a 20% reduction in IOP from baseline; as a result, the success rates were significantly lower in eyes with DS (51%) than in eyes with BS (88%) (p=0.02, log-rank test). Preoperatively, the number of GMs used in the DS and BS groups was similar (2.2 vs 2.7, p=0.02, two-sample t-test). Postoperatively, there remained no significant difference in GMs between groups (0.9 vs 1.1, p=0.58, two-sample t-test). Complication rates were similar between the two groups (12% vs 10%). Conclusions: Baerveldt tube implantation was more effective in lowering IOP than repeat deep sclerectomy in eyes with failed primary DS at year one. Complications were minor and infrequent in both groups.
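The success criterion stated in the Methods can be written as a small predicate. This is a sketch under one assumption: "20% reduction" is read as "at least 20%". The function and parameter names are hypothetical, and the values passed in are single illustrative readings, not patient data.

```python
def is_surgical_success(baseline_iop, postop_iop, max_iop=21.0, min_reduction=0.20):
    """Return True if post-operative IOP is <= max_iop (mmHg) AND the relative
    reduction from baseline is at least min_reduction (assumed reading of '20%')."""
    reduction = (baseline_iop - postop_iop) / baseline_iop
    return postop_iop <= max_iop and reduction >= min_reduction


# Illustrative readings in mmHg (hypothetical single eyes).
print(is_surgical_success(23.8, 12.0))  # large reduction, low final IOP
print(is_surgical_success(18.0, 15.0))  # final IOP is low, but reduction is under 20%
```

Because both conditions must hold, an eye with a low baseline IOP can end below 21 mmHg and still count as a failure, which is consistent with the abstract attributing the lower DS success rate to eyes that missed the 20% reduction.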
Abstract:
We consider the problem of estimating the mean hospital cost of stays of a class of patients (e.g., a diagnosis-related group) as a function of patient characteristics. The statistical analysis is complicated by the asymmetry of the cost distribution, the possibility of censoring on the cost variable, and the occurrence of outliers. These problems have often been treated separately in the literature, and a method offering a joint solution to all of them is still missing. Indirect procedures have been proposed, combining an estimate of the duration distribution with an estimate of the conditional cost for a given duration. We propose a parametric version of this approach, allowing for asymmetry and censoring in the cost distribution and providing a mean cost estimator that is robust in the presence of extreme values. In addition, the new method takes covariate information into account.
Abstract:
Summary This thesis is devoted to the analysis, modelling and visualisation of spatially referenced environmental data using machine learning algorithms. Machine learning can be broadly regarded as a subfield of artificial intelligence concerned in particular with the development of techniques and algorithms that allow a machine to learn from data. In this thesis, machine learning algorithms are adapted for application to environmental data and to spatial prediction. Why machine learning? Because most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient for modelling. They can solve classification, regression and probability density modelling problems in high-dimensional spaces composed of spatially referenced informative variables ("geo-features") in addition to the geographical coordinates. Moreover, they are well suited to implementation as decision-support tools for environmental questions ranging from pattern recognition to modelling and prediction, including automatic mapping. Their efficiency is comparable to that of geostatistical models in the space of geographical coordinates, but they are indispensable for high-dimensional data that include geo-features. The most important and most popular machine learning algorithms are presented theoretically and implemented as software for the environmental sciences.
The main algorithms described are the multilayer perceptron (MLP), the best-known algorithm in artificial intelligence; general regression neural networks (GRNN); probabilistic neural networks (PNN); self-organising maps (SOM); Gaussian mixture models (GMM); radial basis function networks (RBF); and mixture density networks (MDN). This range of algorithms covers varied tasks such as classification, regression and probability density estimation. Exploratory data analysis (EDA) is the first step of any data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are treated both with the traditional geostatistical approach, using experimental variography, and according to the principles of machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations and makes it possible to detect the presence of spatial patterns that can be described by a statistic. The machine learning approach to ESDA is presented through the application of the k-nearest neighbours method, which is very simple and has excellent interpretation and visualisation properties. A significant part of the thesis deals with topical subjects such as the automatic mapping of spatial data. The general regression neural network is proposed to solve this task efficiently.
The performance of the GRNN is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, for which the GRNN significantly outperforms all other methods, particularly in emergency situations. The thesis is composed of four chapters: theory, applications, software tools and guided examples. An important part of the work is a collection of software: Machine Learning Office. This software collection was developed over the last 15 years and has been used for teaching many courses, including international workshops in China, France, Italy, Ireland and Switzerland, as well as in fundamental and applied research projects. The case studies considered cover a wide spectrum of real low- and high-dimensional geo-environmental problems, such as the pollution of air, soil and water by radioactive products and heavy metals, the classification of soil types and hydrogeological units, uncertainty mapping for decision support, and the assessment of natural hazards (landslides, avalanches). Complementary tools for exploratory data analysis and visualisation were also developed, with care taken to create a user-friendly and easy-to-use interface. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered a subfield of artificial intelligence. It is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning?
In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks and mixture density networks. This set of models covers machine learning tasks such as classification, regression and density estimation. Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both the traditional geostatistical approach, such as experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations which helps to detect the presence of spatial patterns, at least those described by two-point statistics.
A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a currently hot topic, namely the automatic mapping of geospatial data. General regression neural networks (GRNN) are proposed as an efficient model for this task. The performance of the GRNN model is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed during the last 15 years and were used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
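A GRNN prediction is, in its basic form, a Gaussian-kernel weighted average of the training values (Nadaraya-Watson regression). The sketch below shows that idea for 2-D coordinates; it is an illustration under stated assumptions, not the Machine Learning Office implementation. The function name, the isotropic kernel and the single bandwidth `sigma` are choices made here for brevity.

```python
import math


def grnn_predict(train_xy, train_values, query, sigma):
    """Gaussian-kernel weighted average of train_values at a 2-D query point.

    train_xy: list of (x, y) coordinates; sigma: kernel bandwidth (same units).
    """
    weights = []
    for x, y in train_xy:
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2  # squared distance
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: the query is too far from the data for this sigma.
        raise ValueError("query too far from training data for the chosen sigma")
    return sum(w * v for w, v in zip(weights, train_values)) / total


# Hypothetical measurements at two locations.
points = [(0.0, 0.0), (10.0, 0.0)]
values = [0.0, 10.0]
print(grnn_predict(points, values, (5.0, 0.0), sigma=3.0))
```

The bandwidth is the only free parameter in this form, and it is typically tuned by cross-validation, which is part of what makes the model attractive for automatic mapping.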
Abstract:
US Geological Survey (USGS) based elevation data are the most commonly used data source for highway hydraulic analysis; however, due to the limited vertical accuracy of USGS-based elevation data, USGS data may be too “coarse” to adequately describe surface profiles of watershed areas or drainage patterns. Additionally, hydraulic design requires delineation of much smaller drainage areas (watersheds) than other hydrologic applications, such as environmental, ecological, and water resource management. This research study investigated whether higher resolution LIDAR based surface models would provide better delineation of watersheds and drainage patterns as compared to surface models created from standard USGS-based elevation data. Differences in runoff values were the metric used to compare the data sets. The two data sets were compared for a pilot study area along the Iowa 1 corridor between Iowa City and Mount Vernon. Given the limited breadth of the analysis corridor, areas of particular emphasis were the location of drainage area boundaries and flow patterns parallel to and intersecting the road cross section. Traditional highway hydrology does not appear to be significantly impacted, or benefited, by the increased terrain detail that LIDAR provided for the study area. In fact, hydrologic outputs, such as streams and watersheds, may be too sensitive to the increased horizontal resolution and/or errors in the data set. However, a true comparison of LIDAR and USGS-based data sets of equal size and encompassing entire drainage areas could not be performed in this study. Differences may also result in areas with much steeper slopes or significant changes in terrain. LIDAR may provide valuable detail in areas of modified terrain, such as roads. Better representations of channel and terrain detail in the vicinity of the roadway may be useful in modeling problem drainage areas and evaluating structural surety during and after significant storm events.
Furthermore, LIDAR may be used to verify the intended/expected drainage patterns at newly constructed highways. LIDAR will likely provide the greatest benefit for highway projects in flood plains and areas with relatively flat terrain where slight changes in terrain may have a significant impact on drainage patterns.
Abstract:
Rural intersections account for 30% of crashes in rural areas and 6% of all fatal crashes, representing a significant but poorly understood safety problem. Transportation agencies have traditionally implemented countermeasures to address rural intersection crashes but frequently do not understand the dynamic interaction between the driver and roadway and the driver factors leading to these types of crashes. The Second Strategic Highway Research Program (SHRP 2) conducted a large-scale naturalistic driving study (NDS) using instrumented vehicles. The study has provided a significant amount of on-road driving data for a range of drivers. The present study utilizes the SHRP 2 NDS data as well as SHRP 2 Roadway Information Database (RID) data to observe driver behavior at rural intersections first hand using video, vehicle kinematics, and roadway data to determine how roadway, driver, environmental, and vehicle factors interact to affect driver safety at rural intersections. A model of driver braking behavior was developed using a dataset of vehicle activity traces for several rural stop-controlled intersections. The model was developed using the point at which a driver reacts to the upcoming intersection by initiating braking as its dependent variable, with the driver’s age, type and direction of turning movement, and countermeasure presence as independent variables. Countermeasures such as on-pavement signing and overhead flashing beacons were found to increase the braking point distance, a finding that provides insight into the countermeasures’ effect on safety at rural intersections. The results of this model can lead to better roadway design, more informed selection of traffic control and countermeasures, and targeted information that can inform policy decisions. Additionally, a model of gap acceptance was attempted but was ultimately not developed due to the small size of the dataset. 
However, a protocol for data reduction for a gap acceptance model was determined. This protocol can be utilized in future studies to develop a gap acceptance model that would provide additional insight into the roadway, vehicle, environmental, and driver factors that play a role in whether a driver accepts or rejects a gap.