960 results for Probabilistic charts
Abstract:
Until recently, the use of biometrics was restricted to high-security environments and criminal identification applications, for economic and technological reasons. In recent years, however, biometric authentication has become part of people's daily lives. The large-scale use of biometrics has shown that users within a system may experience different degrees of accuracy. Some people may have trouble authenticating, while others may be particularly vulnerable to imitation. Recent studies have investigated and identified these types of users, giving them the names of animals: Sheep, Goats, Lambs, Wolves, Doves, Chameleons, Worms and Phantoms. The aim of this study is to evaluate the existence of these user types in a database of fingerprints and to propose a new way of investigating them, based on the verification performance between subjects' samples. After introducing some basic concepts in biometrics and fingerprints, we present the biometric menagerie and how to evaluate it.
Abstract:
Formation of hydrates is one of the major flow assurance problems faced by the oil and gas industry. Hydrates tend to form in natural gas pipelines in the presence of water and under favorable temperature and pressure conditions, generally low temperatures and correspondingly high pressures. Agglomeration of hydrates can result in blockage of flowlines and equipment, which can be time-consuming to remove in subsea equipment and can cause safety issues. Natural gas pipelines are more susceptible to burst and explosion owing to hydrate plugging. Therefore, a rigorous risk assessment related to hydrate formation is required, which assists in preventing hydrate blockage and ensuring equipment integrity. This thesis presents a novel methodology to assess the probability of hydrate formation and a risk-based approach to determine the parameters of winterization schemes to avoid hydrate formation in natural gas pipelines operating in Arctic conditions. It also presents a lab-scale multiphase flow loop to study the effects of geometric and hydrodynamic parameters on hydrate formation, and discusses the effects of these parameters on the multiphase development length of a pipeline. This study therefore contributes substantially to the assessment of the probability of hydrate formation and to the decision-making process for winterization strategies to prevent hydrate formation in Arctic conditions.
Abstract:
In this study, a multi-model ensemble was implemented and verified, following one of the research priorities of the Subseasonal to Seasonal Prediction Project (S2S). A linear regression was applied to a set of ensemble forecasts for past dates, produced by the monthly forecasting centres CNR-ISAC and ECMWF-IFS. Each of these contains one control member and four perturbed members. The variables chosen for the analysis are geopotential height at 500 hPa, temperature at 850 hPa and 2-metre temperature; the spatial grid has a resolution of 1° × 1° lat-lon, and the winters from 1990 to 2010 were used. ERA-Interim reanalyses are used both to fit the regression and to validate the results, by means of non-probabilistic scores such as the root mean square error (RMSE) and the anomaly correlation. Subsequently, Model Output Statistics (MOS) and Direct Model Output (DMO) techniques are applied to the multi-model ensemble to obtain probabilistic forecasts of the weekly mean 2-metre temperature anomalies. The MOS methods used are logistic regression and non-homogeneous Gaussian regression, while the DMO methods are democratic voting and the Tukey plotting position. These techniques are also applied to the individual models so that comparisons can be made using probabilistic scores such as the ranked probability skill score, the discrete ranked probability skill score and the reliability diagram. Both types of scores show that the multi-model performs better than the individual models. Moreover, the highest values of the probabilistic scores are obtained using a logistic regression on the ensemble mean alone. By applying the regression to datasets of reduced size, we produced a learning curve showing that increasing the number of dates in the training phase would not yield further improvements.
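The two DMO estimators named in this abstract can be illustrated with a minimal Python sketch. It assumes the common textbook definitions: democratic voting takes the fraction of ensemble members at or below a threshold, and the Tukey plotting position uses the rank of the threshold within the pooled sample of members plus threshold. The function names and the anomaly values are hypothetical and are not taken from the study.

```python
def democratic_voting(members, threshold):
    """Fraction of ensemble members at or below the threshold."""
    below = sum(1 for m in members if m <= threshold)
    return below / len(members)


def tukey_plotting_position(members, threshold):
    """Estimate Pr{V <= threshold} from the rank of the threshold.

    Assumed form: (rank - 1/3) / ((n + 1) + 1/3), where rank is the
    position of the threshold in the pooled sample of n members plus
    the threshold itself.
    """
    n = len(members)
    rank = 1 + sum(1 for m in members if m <= threshold)
    return (rank - 1.0 / 3.0) / ((n + 1) + 1.0 / 3.0)


# Hypothetical 2 m temperature anomalies (K): one control plus four perturbed members.
anomalies = [-1.2, -0.3, 0.1, 0.8, 1.5]
p_vote = democratic_voting(anomalies, 0.0)
p_tukey = tukey_plotting_position(anomalies, 0.0)
```

Under these definitions, the Tukey estimate never reaches exactly 0 or 1, whereas democratic voting does whenever all members fall on one side of the threshold.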
Abstract:
Peer reviewed
Abstract:
The work presented in this dissertation is focused on applying engineering methods to develop and explore probabilistic survival models for the prediction of decompression sickness in US Navy divers. Mathematical modeling, computational model development, and numerical optimization techniques were employed to formulate and evaluate the predictive quality of models fitted to empirical data. In Chapters 1 and 2 we present general background information relevant to the development of probabilistic models applied to predicting the incidence of decompression sickness. The remainder of the dissertation introduces techniques developed in an effort to improve the predictive quality of probabilistic decompression models and to reduce the difficulty of model parameter optimization.
The first project explored seventeen variations of the hazard function using a well-perfused parallel compartment model. Models were parametrically optimized using the maximum likelihood technique. Model performance was evaluated using both classical statistical methods and model selection techniques based on information theory. Optimized model parameters were overall similar to those previously published. Results pointed to a novel hazard function definition that included both ambient pressure scaling and individually fitted compartment exponent scaling terms.
We developed ten pharmacokinetic compartmental models that included explicit delay mechanics to determine if predictive quality could be improved through the inclusion of material transfer lags. A fitted discrete delay parameter augmented the inflow to the compartment systems from the environment. Based on the observation that symptoms are often reported after risk accumulation begins for many of our models, we hypothesized that the inclusion of delays might improve correlation between the model predictions and observed data. Model selection techniques identified two models as having the best overall performance, but comparison to the best performing model without delay and model selection using our best identified no delay pharmacokinetic model both indicated that the delay mechanism was not statistically justified and did not substantially improve model predictions.
Our final investigation explored parameter bounding techniques to identify parameter regions for which statistical model failure will not occur. Statistical model failure occurs when a model predicts zero probability of a diver experiencing decompression sickness for an exposure that is known to produce symptoms. Using a metric related to the instantaneous risk, we successfully identify regions where model failure will not occur and identify the boundaries of the region using a root bounding technique. Several models are used to demonstrate the techniques, which may be employed to reduce the difficulty of model optimization for future investigations.
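To illustrate why statistical model failure hampers maximum likelihood optimization, here is a minimal Python sketch. It assumes the standard survival-analysis form P(DCS) = 1 - exp(-R), where R is the time-integrated hazard for an exposure. The function names and the numerical values are hypothetical and do not come from the dissertation.

```python
import math


def dcs_probability(integrated_risk):
    """P(DCS) = 1 - exp(-R), with R the time integral of the hazard (assumed form)."""
    return 1.0 - math.exp(-integrated_risk)


def log_likelihood(integrated_risks, outcomes):
    """Bernoulli log-likelihood over exposures.

    Returns -inf when the model assigns zero probability to an exposure
    that produced symptoms, i.e. on statistical model failure.
    """
    total = 0.0
    for risk, had_dcs in zip(integrated_risks, outcomes):
        if had_dcs:
            p = dcs_probability(risk)
            if p <= 0.0:
                return float("-inf")  # model failure: log(0) is undefined
            total += math.log(p)
        else:
            total += -risk  # log(1 - p) = log(exp(-R)) = -R
    return total


# Hypothetical integrated risks for two exposures; the second produced symptoms.
ll_ok = log_likelihood([0.05, 0.2], [0, 1])
ll_failed = log_likelihood([0.05, 0.0], [0, 1])
```

A single failing exposure drives the whole log-likelihood to negative infinity, which is why bounding the parameters to regions free of such failures can make the optimization easier.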
Abstract:
Estimation of absolute risk of cardiovascular disease (CVD), preferably with population-specific risk charts, has become a cornerstone of CVD primary prevention. Regular recalibration of risk charts may be necessary due to decreasing CVD rates and CVD risk factor levels. The SCORE risk charts for fatal CVD risk assessment were first calibrated for Germany with 1998 risk factor level data and 1999 mortality statistics. We present an update of these risk charts based on the SCORE methodology including estimates of relative risks from SCORE, risk factor levels from the German Health Interview and Examination Survey for Adults 2008-11 (DEGS1) and official mortality statistics from 2012. Competing risks methods were applied and estimates were independently validated. Updated risk charts were calculated based on cholesterol, smoking, systolic blood pressure risk factor levels, sex and 5-year age-groups. The absolute 10-year risk estimates of fatal CVD were lower according to the updated risk charts compared to the first calibration for Germany. In a nationwide sample of 3062 adults aged 40-65 years free of major CVD from DEGS1, the mean 10-year risk of fatal CVD estimated by the updated charts was lower by 29% and the estimated proportion of high-risk people (10-year risk ≥ 5%) by 50% compared to the older risk charts. This recalibration shows a need for regular updates of risk charts according to changes in mortality and risk factor levels in order to sustain the identification of people with a high CVD risk.
Abstract:
In Germany the upscaling algorithm is currently the standard approach for evaluating the PV power produced in a region. This method involves spatially interpolating the normalized power of a set of reference PV plants to estimate the power production of another set of unknown plants. As little information on the performance of this method could be found in the literature, the first goal of this thesis is to analyse the uncertainty associated with it. It was found that this method can lead to large errors when the set of reference plants has different characteristics or weather conditions than the set of unknown plants, and when the set of reference plants is small. Based on these preliminary findings, an alternative method is proposed for calculating the aggregate power production of a set of PV plants. A probabilistic approach has been chosen in which the power production is calculated at each PV plant from the corresponding weather data. The probabilistic approach consists of evaluating the power for each frequently occurring value of the parameters and estimating the most probable value by averaging these power values weighted by their frequency of occurrence. The most frequent parameter sets (e.g. module azimuth and tilt angle) and their frequency of occurrence have been assessed on the basis of a statistical analysis of the parameters of approx. 35 000 PV plants. It has been found that the plant parameters are statistically dependent on the size and location of the PV plants. Accordingly, separate statistical values have been assessed for 14 classes of nominal capacity and 95 regions in Germany (two-digit zip-code areas). The performance of the upscaling and probabilistic approaches has been compared on the basis of 15 min power measurements from 715 PV plants provided by the German distribution system operator LEW Verteilnetz.
It was found that the error of the probabilistic method is smaller than that of the upscaling method when the number of reference plants is sufficiently large (>100 reference plants in the case study considered in this chapter). When the number of reference plants is limited (<50 reference plants for the considered case study), it was found that the proposed approach provides a noticeable gain in accuracy with respect to the upscaling method.
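The frequency-weighted averaging at the core of the probabilistic approach can be sketched in a few lines of Python. This is an illustrative sketch only: the `toy_power_model`, the parameter sets and their frequencies are hypothetical placeholders, not the models or statistics used in the thesis.

```python
import math


def expected_power(parameter_sets, power_model, weather):
    """Average the modelled power over parameter sets, weighted by frequency.

    parameter_sets: list of (params, frequency) pairs.
    power_model:    callable (params, weather) -> normalized power.
    """
    total_weight = sum(freq for _, freq in parameter_sets)
    weighted_sum = sum(freq * power_model(params, weather)
                       for params, freq in parameter_sets)
    return weighted_sum / total_weight


def toy_power_model(params, weather):
    """Hypothetical stand-in: irradiance scaled by crude orientation losses."""
    tilt_loss = math.cos(math.radians(params["tilt"] - 30.0))
    azimuth_loss = math.cos(math.radians(params["azimuth"] - 180.0) / 2.0)
    return weather["normalized_irradiance"] * tilt_loss * azimuth_loss


# Hypothetical frequent (azimuth, tilt) combinations and their relative frequencies.
frequent_sets = [
    ({"azimuth": 180.0, "tilt": 30.0}, 0.5),
    ({"azimuth": 180.0, "tilt": 20.0}, 0.3),
    ({"azimuth": 135.0, "tilt": 30.0}, 0.2),
]
power = expected_power(frequent_sets, toy_power_model,
                       {"normalized_irradiance": 0.8})
```

In practice, the power model would be a physical PV simulation driven by local weather data, and the frequencies would come from the per-class, per-region statistics described in the abstract.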
Abstract:
This work provides a holistic investigation into the realm of feature modeling within software product lines. It identifies limitations and challenges within current feature modeling approaches. These limitations include, but are not limited to, the dearth of satisfactory cognitive presentation, poor suitability for scalable systems, inflexibility in adapting to change, the absence of predictability of model behavior, and the lack of probabilistic quantification of a model's implications and of decision support for reasoning under uncertainty. The work in this thesis addresses these challenges by proposing a series of solutions. The first solution is the construction of a Bayesian Belief Feature Model, a novel modeling approach capable of quantifying the uncertainty measures in model parameters by combining probabilistic modeling with a conventional modeling approach. The Bayesian Belief Feature Model presents a new, enhanced feature modeling approach in terms of truth quantification and visual expressiveness. The second solution addresses the unclear support for the process of reasoning under uncertainty and the challenging constraint satisfaction problem in software product lines. This has been done through the development of a mathematical reasoner, which was designed to satisfy the model constraints by considering a probability weight for all involved parameters and to quantify the actual implications of the problem constraints. The developed Uncertain Constraint Satisfaction Problem approach has been tested and validated through a set of designated experiments. In summary, the main contributions of this thesis include the following:
• Develop a framework for probabilistic graphical modeling to build the proposed Bayesian Belief Feature Model.
• Extend the model to enhance visual expressiveness through the integration of colour degree variation, in which the colour varies with respect to the predefined probabilistic weights.
• Enhance the constraint satisfaction problem by measuring the uncertainty of the parameters' truth assumptions.
• Validate the developed approach against different experimental settings to determine its functionality and performance.
Abstract:
In deregulated power markets, it is necessary to have an appropriate transmission pricing methodology that also takes into account “Congestion and Reliability”, in order to ensure an economically viable, equitable, and congestion-free power transfer capability with high reliability and security. This thesis presents the results of research on the development of a Decision Making Framework (DMF) of concepts, data analytics and modelling methods for the Reliability benefit Reflective Optimal evaluation of transmission cost for composite power systems, using probabilistic methods. The methodology within the DMF utilises a full AC Newton-Raphson load flow and a Monte Carlo approach to determine reliability indices, which are then used in the proposed Meta-Analytical Probabilistic Approach (MAPA) for the evaluation and calculation of the Reliability benefit Reflective Optimal Transmission Cost (ROTC) of a transmission system. The DMF includes methods for allocating embedded transmission line costs among transmission transactions, accounting for line capacity use as well as congestion costing, which can be used for pricing by applying the Power Transfer Distribution Factor (PTDF) as well as Bialek’s method. Together these form the series of methods and procedures, explained in detail in the thesis, that make up the proposed MAPA for ROTC. The MAPA utilises the Bus Data, Generator Data, Line Data, Reliability Data and Customer Damage Function (CDF) Data for congestion, transmission and reliability costing studies using the proposed application of PTDF and other established methods, which are then compared, analysed and selected according to the area/state requirements and integrated to develop the ROTC.
Case studies involving standard 7-Bus, IEEE 30-Bus and 146-Bus Indian utility test systems are conducted and reported in the relevant sections of the dissertation. There is close correlation between the results obtained through the proposed application of the PTDF method and those of Bialek’s and different MW-Mile methods. The novel contributions of this research work are as follows. First, the application of the PTDF method developed for the determination of transmission and congestion costing, which is further compared with other proven methods; the viability of the developed method is explained in the methodology, discussion and conclusion chapters. Second, the development of a comprehensive DMF that helps decision makers analyse and select a costing approach according to their requirements, since all the costing approaches have been integrated in the DMF to achieve ROTC. Third, the composite methodology for calculating ROTC has been formed into suites of algorithms and MATLAB programs for each part of the DMF, which are further described in the methodology section. Finally, the dissertation concludes with suggestions for future work.
Abstract:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. 
I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied on a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.