950 results for Probabilistic metrics
Abstract:
This project focuses mainly on the non-coherent detector of a GPS receiver. To characterize a receiver's detection process, the underlying statistics must be known. Conventional non-coherent detectors rely entirely on second-order statistics. The performance obtained with second-order statistics, as captured by the ROC, is reasonably good, although it may not be the best in some situations. This project attempts to reproduce the detection process using first-order statistics as an alternative to the well-known and already implemented second-order statistics. To this end, expressions based on the Central Limit Theorem and on Edgeworth series are used as good approximations. Finally, the conventional statistic and the proposed one are compared in terms of the ROC to determine which non-coherent detector offers better performance in each situation.
Abstract:
BACKGROUND Drugs for inhalation are the cornerstone of therapy in obstructive lung disease. We have observed that up to 75% of patients do not perform a correct inhalation technique. The inability of patients to use their inhaler device correctly may be a direct consequence of insufficient or poor instruction in inhaler technique. The objective of this study is to test the efficacy of two educational interventions to improve inhalation technique in patients with Chronic Obstructive Pulmonary Disease (COPD). METHODS This study uses both a multicenter patients' preference trial and a comprehensive cohort design with 495 COPD-diagnosed patients selected by a non-probabilistic sampling method from seven Primary Care Centers. The participants will be divided into two groups and five arms. The two groups are: 1) the patients' preference group with two arms and 2) the randomized group with three arms. In the preference group, the two arms correspond to the two educational interventions (Intervention A and Intervention B) designed for this study. In the randomized group, the three arms comprise Intervention A, Intervention B and a control arm. Intervention A is written information (a leaflet describing the correct inhalation techniques). Intervention B is written information about inhalation techniques plus training by an instructor. Every patient in each group will be seen six times at the health care center during the year of the study. DISCUSSION Our hypothesis is that applying the two educational interventions to patients with COPD who are treated with inhaled therapy will increase the number of patients who perform a correct inhalation technique by at least 25%. We will evaluate the effectiveness of these interventions in improving patient inhalation technique, on the assumption that they will be adequate and feasible within the context of clinical practice.
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. Recent advances in machine learning offer a novel approach to modelling the spatial distribution of petrophysical properties in complex reservoirs, as an alternative to geostatistics. The approach is based on semi-supervised learning, which handles both "labelled" observed data and "unlabelled" data, which have no measured value but describe prior knowledge and other relevant data in the form of manifolds in the input space where the modelled property is continuous. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe the stochastic variability and non-uniqueness of spatial properties. It is also able to capture and preserve key spatial dependencies such as the connectivity of high-permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR, as a data-driven algorithm, is designed to integrate various kinds of conditioning information and learn dependencies from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in the available data. In this work, a stochastic semi-supervised SVR geomodel is integrated into a Bayesian framework to quantify the uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history-matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate the posterior probability distribution. Uncertainty of the model is described by the posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, and smoothness/variability of the spatial property distribution.
The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.
Abstract:
BACKGROUND Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer's Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) systems. METHODS A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to lie within a predefined brain activation mask. To address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the latter two also analysed with an LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. RESULTS Several experiments were conducted to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) a linear transformation of the PLS- or PCA-reduced data, ii) a feature reduction technique, and iii) a classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation, yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when an NMSE-PLS-LMNN feature extraction method was used in combination with an SVM classifier, thus outperforming recently reported baseline methods. CONCLUSIONS All the proposed methods turned out to be a valid solution for the presented problem.
One of the advances is the robustness of the LMNN algorithm, which not only provides a higher separation rate between the classes but also (in combination with NMSE and PLS) makes the variation of this rate more stable. In addition, the generalization ability of the methods is another advance, since several experiments were performed on two image modalities (SPECT and PET).
Abstract:
Study of the likelihood and prevalence of patients with COPD over one year in a family medicine practice, covering 2012 and the first two months of 2013. In a health center practice of about 1500 patients, the probabilistic evolution was studied every 6 months according to Laplace's theory. The study analyzes COPD itself, its symptoms, etiology, clinical consultation and treatment in family medicine.
Abstract:
In the forensic examination of DNA mixtures, the question of how to set the total number of contributors (N) is a topic of ongoing interest. Part of the discussion gravitates around issues of bias, in particular when assessments of the number of contributors are not made prior to considering the genotypic configuration of potential donors. Further complications may stem from the observation that, in some cases, there may be numbers of contributors that are incompatible with the set of alleles seen in the profile of a mixed crime stain, given the genotype of a potential contributor. In such situations, procedures that take a single, fixed number of contributors as their output can lead to inferential impasses. Assessing the number of contributors within a probabilistic framework can help avoid such complications. Using elements of decision theory, this paper analyses two strategies for inference on the number of contributors. One procedure is deterministic and focuses on the minimum number of contributors required to 'explain' an observed set of alleles. The other procedure is probabilistic, using Bayes' theorem, and provides a probability distribution for a set of numbers of contributors, based on the set of observed alleles as well as their respective rates of occurrence. The discussion concentrates on mixed stains of varying quality (i.e., different numbers of loci for which genotyping information is available). A so-called qualitative interpretation is pursued, since quantitative information such as peak area and height data is not taken into account. The competing procedures are compared using a standard scoring rule that penalizes the degree of divergence between a given agreed value for N, that is, the number of contributors, and the actual value taken by N.
Using only modest assumptions and a discussion with reference to a casework example, this paper reports on analyses using simulation techniques and graphical models (i.e., Bayesian networks) to point out that setting the number of contributors to a mixed crime stain in probabilistic terms is, for the conditions assumed in this study, preferable to a decision policy that uses categoric assumptions about N.
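The two strategies contrasted in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's procedure: it assumes diploid contributors (at most two distinct alleles each per locus), and the profile, likelihood values and flat prior are invented purely for the example. The function names are hypothetical.

```python
from math import ceil


def min_contributors(alleles_per_locus):
    """Deterministic rule: the smallest N able to 'explain' the alleles.

    Assumes each contributor is diploid, so one person accounts for at
    most two distinct alleles at any locus.
    """
    return max(ceil(len(set(alleles)) / 2) for alleles in alleles_per_locus)


def posterior_over_n(likelihoods, priors):
    """Probabilistic rule: P(N | alleles) is proportional to P(alleles | N) * P(N)."""
    joint = {n: likelihoods[n] * priors[n] for n in likelihoods}
    total = sum(joint.values())
    return {n: value / total for n, value in joint.items()}


# Hypothetical mixed profile: distinct alleles observed at three loci.
profile = [["12", "14", "15"], ["8", "9"], ["16", "17", "18", "19", "20"]]
n_min = min_contributors(profile)

# Hypothetical likelihoods P(alleles | N); in practice these would come from
# allele rates of occurrence. A flat prior over N = 3, 4, 5 is assumed.
likelihoods = {3: 0.02, 4: 0.05, 5: 0.03}
priors = {3: 1 / 3, 4: 1 / 3, 5: 1 / 3}
posterior = posterior_over_n(likelihoods, priors)

print("minimum number of contributors:", n_min)
print("posterior over N:", posterior)
```

In this toy case the deterministic rule commits to a single value of N, whereas the posterior spreads probability over several values, which is the property the abstract argues avoids inferential impasses.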
Abstract:
A ubiquitous assessment of swimming velocity (the main performance metric) is essential for the coach to provide tailored feedback to the trainee. We present a probabilistic framework for the data-driven estimation of the swimming velocity at every cycle using a low-cost wearable inertial measurement unit (IMU). The statistical validation of the method on 15 swimmers shows that an average relative error of 0.1 ± 9.6% and a high correlation with the tethered reference system (r_{X,Y} = 0.91) are achievable. In addition, a simple tool to analyze the influence of sacrum kinematics on performance is provided.
Abstract:
The genetic characterization of unbalanced mixed stains remains an important area where improvement is imperative. In fact, with current methods for DNA analysis (Polymerase Chain Reaction with the SGM Plus™ multiplex kit), it is generally not possible to obtain a conventional autosomal DNA profile of the minor contributor if the ratio between the two contributors in a mixture is smaller than 1:10. This is a consequence of the fact that the major contributor's profile 'masks' that of the minor contributor. Besides known remedies to this problem, such as Y-STR analysis, a new compound genetic marker that consists of a Deletion/Insertion Polymorphism (DIP), linked to a Short Tandem Repeat (STR) polymorphism, has recently been developed and proposed elsewhere in literature [1]. The present paper reports on the derivation of an approach for the probabilistic evaluation of DIP-STR profiling results obtained from unbalanced DNA mixtures. The procedure is based on object-oriented Bayesian networks (OOBNs) and uses the likelihood ratio as an expression of the probative value. OOBNs are retained in this paper because they allow one to provide a clear description of the genotypic configuration observed for the mixed stain as well as for the various potential contributors (e.g., victim and suspect). These models also allow one to depict the assumed relevance relationships and perform the necessary probabilistic computations.
Abstract:
Brazil will host the FIFA World Cup™, the biggest single-event competition in the world, from June 12 to July 13, 2014 in 12 cities. This event will draw an estimated 600,000 international visitors. Brazil is endemic for dengue. Hence, attendees of the 2014 event are theoretically at risk for dengue. We calculated the risk of dengue acquisition for non-immune international travellers to Brazil, depending on the football match schedules, considering the locations and dates of the matches in June and July 2014. We estimated the average per-capita risk and the expected number of dengue cases for each host city and each game schedule chosen, based on dengue cases reported to the Brazilian Ministry of Health for the period 2010-2013. On average, the expected number of cases among the 600,000 foreign tourists during the World Cup is 33, ranging from 3 to 59. Such risk estimates will not only help individual travellers make adequate pre-travel preparations, but also provide valuable information for public health professionals and policy makers worldwide. Furthermore, estimates of dengue cases in international travellers during the World Cup can help to anticipate the theoretical risk of exportation of dengue into currently non-infected areas.
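As a quick arithmetic check, the totals quoted in this abstract imply an average per-capita risk that can be recomputed directly. The sketch below only re-derives that average from the stated figures (33 expected cases, range 3 to 59, among 600,000 visitors); it does not reproduce the study's city-by-city and schedule-dependent estimates, and the function name is hypothetical.

```python
# Figures quoted in the abstract above.
visitors = 600_000
expected_cases = 33
case_range = (3, 59)


def per_capita_risk(cases, population):
    """Average probability that a single visitor acquires dengue."""
    return cases / population


avg_risk = per_capita_risk(expected_cases, visitors)
risk_per_100k = avg_risk * 100_000
range_per_100k = tuple(
    per_capita_risk(cases, visitors) * 100_000 for cases in case_range
)

print(f"average risk: {avg_risk:.2e} per visitor ({risk_per_100k:.1f} per 100,000)")
print(f"range: {range_per_100k[0]:.1f} to {range_per_100k[1]:.1f} per 100,000")
```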
Abstract:
In 2000 the European Statistical Office published the guidelines for developing the Harmonized European Time Use Surveys system. Under such a unified framework, the first Time Use Survey of national scope was conducted in Spain during 2002–03. The aim of these surveys is to understand human behavior and the lifestyle of people. Time allocation data are of compositional nature in origin, that is, they are subject to non-negativity and constant-sum constraints. Thus, standard multivariate techniques cannot be directly applied to analyze them. The goal of this work is to identify homogeneous Spanish Autonomous Communities with regard to the typical activity pattern of their respective populations. To this end, a fuzzy clustering approach is followed. Rather than the hard partitioning of classical clustering, where objects are allocated to only a single group, fuzzy methods identify overlapping groups of objects by allowing them to belong to more than one group. Concretely, the probabilistic fuzzy c-means algorithm is conveniently adapted to deal with the Spanish Time Use Survey microdata. As a result, a map distinguishing Autonomous Communities with similar activity patterns is drawn. Key words: Time use data; Fuzzy clustering; FCM; simplex space; Aitchison distance
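One way to picture the adaptation mentioned in this abstract is to replace the Euclidean distance in the fuzzy c-means membership update with the Aitchison distance, computed as the Euclidean distance between centred log-ratio (clr) transforms. The sketch below is a generic illustration under that assumption and is not taken from the paper; the time-use figures (hours in three activities) and the two centres are hypothetical, and all parts must be strictly positive.

```python
import math


def clr(composition):
    """Centred log-ratio transform of a composition with positive parts."""
    logs = [math.log(part) for part in composition]
    mean = sum(logs) / len(logs)
    return [value - mean for value in logs]


def aitchison_distance(x, y):
    """Euclidean distance between clr transforms (invariant to rescaling)."""
    return math.dist(clr(x), clr(y))


def fcm_memberships(data, centres, m=2.0):
    """Probabilistic fuzzy c-means membership update; each row sums to 1."""
    rows = []
    for x in data:
        # Floor the distance so a point sitting on a centre does not divide by zero.
        inv = [
            max(aitchison_distance(x, c), 1e-12) ** (-2.0 / (m - 1.0))
            for c in centres
        ]
        total = sum(inv)
        rows.append([value / total for value in inv])
    return rows


# Hypothetical daily time budgets in hours: (work, leisure, sleep and other).
data = [[8, 8, 8], [10, 8, 6], [4, 6, 14]]
centres = [[8, 8, 8], [4, 6, 14]]
memberships = fcm_memberships(data, centres)

for row in memberships:
    print([round(u, 3) for u in row])
```

Because the clr transform removes the overall scale, the same result is obtained whether the budgets are expressed in hours or as proportions of the day, which is why a log-ratio distance suits constant-sum data.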
Abstract:
In this paper, we perform a societal and economic risk assessment for debris flows at the regional scale, for lower Valtellina, Northern Italy. We apply a simple empirical debris-flow model, FLOW-R, which couples a probabilistic flow routing algorithm with an energy line approach, providing the relative probability of transit and the maximum kinetic energy for each cell. By assessing the vulnerability of people and of other exposed elements (buildings, public facilities, crops, woods, communication lines), and their economic value, we calculated the expected annual losses both in terms of lives (societal risk) and goods (direct economic risk). For societal risk assessment, we distinguish between day and night scenarios. The distribution of people at different moments of the day was considered, accounting for occupational and recreational activities, to provide a more realistic assessment of risk. Market studies were performed in order to assign a realistic economic value to goods, structures, and lifelines. As terrain unit, a 20 m x 20 m cell was used, in accordance with data availability and the spatial resolution required for a risk assessment at this scale. Societal risk for the whole area amounts to 1.98 and 4.22 deaths/year for the day and the night scenarios, respectively, with a maximum of 0.013 deaths/year/cell. Economic risk for goods amounts to 1,760,291 €/year, with a maximum of 13,814 €/year/cell.
Abstract:
This paper focuses on one of the methods for bandwidth allocation in an ATM network: the convolution approach. The convolution approach permits an accurate study of the system load in statistical terms by accumulated calculations, since probabilistic results of the bandwidth allocation can be obtained. Nevertheless, the convolution approach has a high cost in terms of calculation and storage requirements. This makes real-time calculations difficult, so many authors do not consider this approach. With the aim of reducing the cost, we propose to use the multinomial distribution function: the enhanced convolution approach (ECA). This permits direct computation of the associated probabilities of the instantaneous bandwidth requirements and makes a simple deconvolution process possible. The ECA is used in connection acceptance control, and some results are presented.
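The plain convolution approach that this abstract takes as its starting point can be sketched as follows. This is the basic source-by-source convolution, not the enhanced convolution approach (ECA) proposed in the paper. The on/off traffic sources, the link capacity and the function names are assumptions made for the example.

```python
def convolve(p, q):
    """Distribution of the sum of two independent bandwidth demands.

    p[i] and q[j] are the probabilities of requiring i and j bandwidth units.
    """
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def aggregate_demand(source_pmfs):
    """Accumulate the total demand distribution one source at a time."""
    total = [1.0]  # with no sources, zero units are needed with certainty
    for pmf in source_pmfs:
        total = convolve(total, pmf)
    return total


def overflow_probability(pmf, capacity):
    """Probability that instantaneous demand exceeds the link capacity."""
    return sum(pmf[capacity + 1:])


# Hypothetical on/off source: idle with probability 0.7, else needs 2 units.
on_off_source = [0.7, 0.0, 0.3]
demand = aggregate_demand([on_off_source] * 3)

# Hypothetical link capacity of 4 units: overflow only if all three are active.
p_overflow = overflow_probability(demand, capacity=4)
print("demand distribution:", [round(p, 4) for p in demand])
print("overflow probability:", round(p_overflow, 4))
```

Each additional connection costs another full convolution over the accumulated distribution, which illustrates the calculation and storage burden the abstract cites as the motivation for a cheaper formulation.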
Abstract:
Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as "functional data" and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First of all, we adapt the well-known spline smoothing techniques in order to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory: clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for differences in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed in order to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.
Abstract:
Ecosystems provide many resources and ecological services that are useful to the human population. Biodiversity is an essential component of ecosystems and sustains many services. To ensure the permanence of ecosystem services, measures must be taken to conserve biodiversity. To this end, acquiring detailed information on the spatial distribution of biodiversity is essential. Species distribution models (SDMs) are empirical models that link field observations (presences or absences of a species) with environmental descriptors, through statistical response curves that describe the realized niche of the species. These models provide spatial projections indicating the most favourable locations for the species considered. The main objective of this thesis is to provide more realistic projections of the distribution of species and communities in mountains for present and future climate by considering not only abiotic but also biotic variables. Mountain regions and the alpine ecosystem are highly sensitive to global change and at the same time provide many ecosystem services. This thesis is divided into three parts: (i) providing a better understanding of the role of biotic interactions in species distribution and community assembly in mountains (western Swiss Alps), (ii) enabling the development of a new approach to model the spatial distribution of biodiversity, (iii) providing more realistic projections of the future distribution of species and of community composition. Focusing on butterflies, bumblebees and vascular plants, I detected important biotic interactions that link species to one another.
I also identified the signature of environmental filtering on high-elevation communities, confirming the usefulness of SDMs for reproducing this type of process. Building on these studies, I contributed to the methodological improvement of SDMs in order to predict communities by including biotic interactions as well as non-deterministic processes through a probabilistic approach. This approach makes it possible to predict not only the distribution of individual species but also that of entire communities by stacking the projections (S-SDMs). Finally, I used this tool to predict the distribution of species and communities in the past and the future. In particular, I modelled the post-glacial migration of Trollius europaeus, which is at the origin of the intra-specific genetic structure of this species, and assessed the risks of loss under climate change. Lastly, I simulated the distribution of bumblebee communities for the 21st century in order to assess likely changes in this important group of pollinators. The functional diversity of bumblebees will be altered by the loss of high-elevation specialist species, and this will influence the pollination of plants at high elevation. - Ecosystems provide a multitude of resources and ecological services that are useful to humans. Biodiversity is an essential component of those ecosystems and guarantees many services. To assure the permanence of ecosystem services for future generations, measures should be applied to conserve biodiversity. For this purpose, the acquisition of detailed information on how biodiversity implicated in ecosystem function is distributed in space is essential. Species distribution models (SDMs) are empirical models relating field observations to environmental predictors based on statistically-derived response surfaces that fit the realized niche.
These models result in spatial predictions indicating the locations of the most suitable environment for the species and may potentially be applied to predict the composition of communities and their functional properties. The main objective of this thesis was to provide more accurate projections of species and community distributions under current and future climate in mountains by considering not solely abiotic but also biotic drivers of species distribution. Mountain areas and alpine ecosystems are considered particularly sensitive to global changes and are also sources of essential ecosystem services. This thesis had three main goals: (i) a better ecological understanding of biotic interactions and how they shape the distribution of species and communities, (ii) the development of a novel approach to the spatial modeling of biodiversity that can account for biotic interactions, and (iii) ecologically more realistic projections of future species distributions and of the future composition and structure of communities. Focusing on butterflies and bumblebees in interaction with the vegetation, I detected important biotic interactions for species distribution and community composition of both plants and insects along environmental gradients. I identified the signature of environmental filtering processes at high elevation, confirming the suitability of SDMs for reproducing patterns of filtering. Using those case studies, I improved SDMs by incorporating biotic interactions and accounting for non-deterministic processes and uncertainty using a probability-based approach. I used the improved modeling to forecast the distribution of species through past and future climate changes. SDM hindcasting allowed a better understanding of the spatial range dynamics of Trollius europaeus in Europe, which are at the origin of the species' intra-specific genetic diversity, and identified the risk of loss of this genetic diversity caused by climate change.
By simulating the future distribution of all bumblebee species in the western Swiss Alps under nine climate change scenarios for the 21st century, I found that the functional diversity of this pollinator guild will be largely affected by climate change through the loss of high elevation specialists. In turn, this will have important consequences on alpine plant pollination.
Abstract:
Pippenger [Pi77] showed the existence of a (6m,4m,3m,6)-concentrator for each positive integer m using a probabilistic method. We generalize his approach and prove the existence of a (6m,4m,3m,5.05)-concentrator (which is no longer regular, but has fewer edges). We apply this result to improve the constant of approximation of almost additive set functions by additive set functions from 44.5 (established by Kalton and Roberts in [KaRo83]) to 39. We show a more direct connection of the latter problem to the Whitney-type estimate for approximation of continuous functions on a cube in R^d by linear functions, and improve the estimate of this Whitney constant from 802 (proved by Brudnyi and Kalton in [BrKa00]) to 73.