960 results for Probabilistic Projections
Abstract:
The use of orthonormal coordinates in the simplex and, particularly, balance coordinates, has suggested the use of a dendrogram for the exploratory analysis of compositional data. The dendrogram is based on a sequential binary partition of a compositional vector into groups of parts. At each step of a partition, one group of parts is divided into two new groups, and a balancing axis in the simplex between both groups is defined. The set of balancing axes constitutes an orthonormal basis, and the projections of the sample on them are orthogonal coordinates. They can be represented in a dendrogram-like graph showing: (a) the way of grouping parts of the compositional vector; (b) the explanatory role of each subcomposition generated in the partition process; (c) the decomposition of the total variance into balance components associated with each binary partition; (d) a box-plot of each balance. This representation is useful to help the interpretation of balance coordinates; to identify which are the most explanatory coordinates; and to describe the whole sample in a single diagram independently of the number of parts of the sample.
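The balance attached to one step of a sequential binary partition can be sketched in a few lines. This is a minimal illustration, assuming the usual definition of a balance as a normalized log-ratio of the geometric means of the two groups of parts; the function names, the example composition and the example partition are hypothetical and are not taken from the paper.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(composition, group_plus, group_minus):
    """Balance between two groups of parts (given as index lists).

    Assumed definition: sqrt(r*s/(r+s)) * ln(g(plus) / g(minus)),
    where r and s are the group sizes and g is the geometric mean.
    """
    r, s = len(group_plus), len(group_minus)
    g_plus = geometric_mean([composition[i] for i in group_plus])
    g_minus = geometric_mean([composition[i] for i in group_minus])
    return math.sqrt(r * s / (r + s)) * math.log(g_plus / g_minus)


def balances_from_partition(composition, partition):
    """One balance per binary split; partition is a list of (plus, minus) pairs."""
    return [balance(composition, plus, minus) for plus, minus in partition]


# Hypothetical 4-part composition and sequential binary partition:
# {0,1} vs {2,3}, then {0} vs {1}, then {2} vs {3}.
x = [0.1, 0.2, 0.3, 0.4]
partition = [([0, 1], [2, 3]), ([0], [1]), ([2], [3])]
coords = balances_from_partition(x, partition)
```

Because the balances form orthonormal coordinates, the sum of their squares equals the squared norm of the centred log-ratio vector of `x`, which gives a quick sanity check on any partition.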
Abstract:
Background The 'database search problem', that is, the strengthening of a case - in terms of probative value - against an individual who is found as a result of a database search, has been approached during the last two decades with substantial mathematical analyses, accompanied by lively debate and centrally opposing conclusions. This represents a challenging obstacle in teaching but also hinders a balanced and coherent discussion of the topic within the wider scientific and legal community. This paper revisits and tracks the associated mathematical analyses in terms of Bayesian networks. Their derivation and discussion for capturing probabilistic arguments that explain the database search problem are outlined in detail. The resulting Bayesian networks offer a distinct view on the main debated issues, along with further clarity. Methods As a general framework for representing and analyzing formal arguments in probabilistic reasoning about uncertain target propositions (that is, whether or not a given individual is the source of a crime stain), this paper relies on graphical probability models, in particular, Bayesian networks. This graphical probability modeling approach is used to capture, within a single model, a series of key variables, such as the number of individuals in a database, the size of the population of potential crime stain sources, and the rarity of the corresponding analytical characteristics in a relevant population. Results This paper demonstrates the feasibility of deriving Bayesian network structures for analyzing, representing, and tracking the database search problem. The output of the proposed models can be shown to agree with existing but exclusively formulaic approaches. Conclusions The proposed Bayesian networks allow one to capture and analyze the currently most well-supported but reputedly counter-intuitive and difficult solution to the database search problem in a way that goes beyond the traditional, purely formulaic expressions. 
The method's graphical environment, along with its computational and probabilistic architectures, represents a rich package that offers analysts and discussants additional modes of interaction, concise representation, and coherent communication.
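The kind of formulaic result that such networks are compared against can be illustrated with a short sketch. It assumes the commonly cited setting, which the abstract does not spell out: a uniform prior over a population of potential sources, a database in which exactly one person matches and all others are excluded, and no laboratory error. The function name and the numbers are hypothetical.

```python
def posterior_source_probability(population_size, database_size, match_probability):
    """Posterior probability that the single matching database member is the source.

    Assumptions (not from the paper): uniform prior over population_size
    individuals, database_size of them searched, exactly one match, the
    others excluded, and each untested person matching with probability
    match_probability.
    """
    if not 1 <= database_size <= population_size:
        raise ValueError("database_size must lie between 1 and population_size")
    untested = population_size - database_size
    return 1.0 / (1.0 + untested * match_probability)


# Hypothetical numbers: population of 1000, match probability 0.001.
without_search = posterior_source_probability(1000, 1, 0.001)
with_search = posterior_source_probability(1000, 100, 0.001)
```

Under these assumptions `with_search` exceeds `without_search`, because every excluded database member removes one alternative source; this is the sense in which a search can strengthen rather than weaken the case.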
Abstract:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual word representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
Abstract:
We investigated procedural learning in 18 children with basal ganglia (BG) lesions or dysfunctions of various aetiologies, using a visuo-motor learning test, the Serial Reaction Time (SRT) task, and a cognitive learning test, the Probabilistic Classification Learning (PCL) task. We compared patients with early (<1 year old, n=9), later onset (>6 years old, n=7) or progressive disorder (idiopathic dystonia, n=2). All patients showed deficits in both visuo-motor and cognitive domains, except those with idiopathic dystonia, who displayed preserved classification learning skills. Impairments seem to be independent from the age of onset of pathology. As far as we know, this study is the first to investigate motor and cognitive procedural learning in children with BG damage. Procedural impairments were documented whatever the aetiology of the BG damage/dysfunction and time of pathology onset, thus supporting the claim of very early skill learning development and lack of plasticity in case of damage.
Abstract:
A joint distribution of two discrete random variables with finite support can be displayed as a two-way table of probabilities adding to one. Assume that this table has n rows and m columns and all probabilities are non-null. This kind of table can be seen as an element in the simplex of n · m parts. In this context, the marginals are identified as compositional amalgams, conditionals (rows or columns) as subcompositions. Also, simplicial perturbation appears as Bayes theorem. However, the Euclidean elements of the Aitchison geometry of the simplex can also be translated into the table of probabilities: subspaces, orthogonal projections, distances. Two important questions are addressed: a) given a table of probabilities, which is the nearest independent table to the initial one? b) which is the largest orthogonal projection of a row onto a column? or, equivalently, which is the information in a row explained by a column, thus explaining the interaction? To answer these questions three orthogonal decompositions are presented: (1) by columns and a row-wise geometric marginal, (2) by rows and a column-wise geometric marginal, (3) by independent two-way tables and fully dependent tables representing row-column interaction. An important result is that the nearest independent table is the product of the two (row and column)-wise geometric marginal tables. A corollary is that, in an independent table, the geometric marginals conform with the traditional (arithmetic) marginals. These decompositions can be compared with standard log-linear models.
Key words: balance, compositional data, simplex, Aitchison geometry, composition, orthonormal basis, arithmetic and geometric marginals, amalgam, dependence measure, contingency table
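The main result stated above, that the nearest independent table is the product of the row-wise and column-wise geometric marginals, can be sketched as follows. This is a minimal illustration in plain Python, assuming that "product" means the outer product of the two geometric marginals rescaled to sum to one; the function names and the example tables are hypothetical.

```python
import math


def geometric_marginals(table):
    """Row-wise and column-wise geometric means of a strictly positive table."""
    n, m = len(table), len(table[0])
    rows = [math.exp(sum(math.log(p) for p in row) / m) for row in table]
    cols = [
        math.exp(sum(math.log(table[i][j]) for i in range(n)) / n)
        for j in range(m)
    ]
    return rows, cols


def nearest_independent_table(table):
    """Outer product of the geometric marginals, rescaled to sum to one."""
    rows, cols = geometric_marginals(table)
    product = [[r * c for c in cols] for r in rows]
    total = sum(sum(row) for row in product)
    return [[value / total for value in row] for row in product]


# Hypothetical examples: a dependent 2 x 2 table and an independent 2 x 3 table.
dependent = [[0.4, 0.1], [0.1, 0.4]]
independent = [[0.2 * c for c in (0.5, 0.3, 0.2)],
               [0.8 * c for c in (0.5, 0.3, 0.2)]]

projected = nearest_independent_table(dependent)
unchanged = nearest_independent_table(independent)
```

Applying the function to a table that is already independent returns the same table up to rounding, which matches the corollary that geometric and arithmetic marginals agree in the independent case.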
Abstract:
PURPOSE: All kinds of blood manipulations aim to increase the total hemoglobin mass (tHb-mass). To establish tHb-mass as an effective screening parameter for detecting blood doping, the knowledge of its normal variation over time is necessary. The aim of the present study, therefore, was to determine the intraindividual variance of tHb-mass in elite athletes during a training year emphasizing off, training, and race seasons at sea level. METHODS: tHb-mass and hemoglobin concentration ([Hb]) were determined in 24 endurance athletes five times during a year and were compared with a control group (n = 6). An analysis of covariance was used to test the effects of training phases, age, gender, competition level, body mass, and training volume. Three error models, based on 1) a total percentage error of measurement, 2) the combination of a typical percentage error (TE) of analytical origin with an absolute SD of biological origin, and 3) between-subject and within-subject variance components as obtained by an analysis of variance, were tested. RESULTS: In addition to the expected influence of performance status, the main results were that the effects of training volume (P = 0.20) and training phases (P = 0.81) on tHb-mass were not significant. We found that within-subject variations mainly have an analytical origin (TE approximately 1.4%) and a very small SD (7.5 g) of biological origin. CONCLUSION: tHb-mass shows very low individual oscillations during a training year (<6%), and these oscillations are below the expected changes in tHb-mass due to erythropoietin (EPO) application or blood infusion (approximately 10%). The high stability of tHb-mass over a period of 1 year suggests that it should be included in an athlete's biological passport and analyzed by recently developed probabilistic inference techniques that define subject-based reference ranges.
Abstract:
Since 1986, several near-vertical seismic reflection profiles have been recorded in Switzerland in order to map the deep geologic structure of the Alps. One objective of this endeavour has been to determine the geometries of the autochthonous basement and of the external crystalline massifs, important elements for understanding the geodynamics of the Alpine orogeny. The PNR-20 seismic line W1, located in the Rawil depression of the western Swiss Alps, provides important information on this subject. It extends northward from the "Penninic front" across the Helvetic nappes to the Prealps. The crystalline massifs do not outcrop along this profile. Thus, the interpretation of "near-basement" reflections has to be constrained by down-dip projections of surface geology, "true amplitude" processing, rock physical property studies and modelling. 3-D seismic modelling has been used to evaluate the seismic response of two alternative down-dip projection models. To constrain the interpretation in the southern part of the profile, "true amplitude" processing has provided information on the strength of the reflections. Density and velocity measurements on core samples collected up-dip from the region of the seismic line have been used to evaluate reflection coefficients of typical lithologic boundaries in the region. The cover-basement contact itself is not a source of strong reflections, but strong reflections arise from within the overlying metasedimentary cover sequence, allowing the geometry of the top of the basement to be determined on the basis of "near-basement" reflections. The front of the external crystalline massifs is shown to extend beneath the Prealps, about 6 km north of the expected position. A 2-D model whose seismic response shows reflection patterns very similar to those observed is proposed.
Abstract:
The major active retinoid, all-trans retinoic acid, has long been recognized as critical for the development of several organs, including the eye. Mutations in STRA6, the gene encoding the cellular receptor for vitamin A, in patients with Matthew-Wood syndrome and anophthalmia/microphthalmia (A/M), have previously demonstrated the importance of retinol metabolism in human eye disease. We used homozygosity mapping combined with next-generation sequencing to interrogate patients with anophthalmia and microphthalmia for new causative genes. We used whole-exome and whole-genome sequencing to study a family with two affected brothers with bilateral A/M and a simplex case with bilateral anophthalmia and hypoplasia of the optic nerve and optic chiasm. Analysis of novel sequence variants revealed homozygosity for two nonsense mutations in ALDH1A3, c.568A>G, predicting p.Lys190*, in the familial cases, and c.1165A>T, predicting p.Lys389*, in the simplex case. Both mutations predict nonsense-mediated decay and complete loss of function. We performed antisense morpholino (MO) studies in Danio rerio to characterize the developmental effects of loss of Aldh1a3 function. MO-injected larvae showed a significant reduction in eye size, and aberrant axonal projections to the tectum were noted. We conclude that ALDH1A3 loss of function causes anophthalmia and aberrant eye development in humans and in animal model systems.
Abstract:
BACKGROUND: From most recent available data, we projected cancer mortality statistics for 2014, for the European Union (EU) and its six most populous countries. Specific attention was given to pancreatic cancer, the only major neoplasm showing unfavorable trends in both sexes. PATIENTS AND METHODS: Population and death certification data for stomach, colorectum, pancreas, lung, breast, uterus, prostate, leukemias and total cancers were obtained from the World Health Organisation database and Eurostat. Figures were derived for the EU, France, Germany, Italy, Poland, Spain and the UK. Projected 2014 numbers of deaths by age group were obtained by linear regression on estimated numbers of deaths over the most recent time period identified by a joinpoint regression model. RESULTS: In the EU in 2014, 1,323,600 deaths from cancer are predicted (742,500 men and 581,100 women), corresponding to standardized death rates of 138.1/100,000 men and 84.7/100,000 women, falling by 7% and 5%, respectively, since 2009. In men, predicted rates for the three major cancers (lung, colorectum and prostate cancer) are lower than in 2009, falling by 8%, 4% and 10%, respectively. In women, breast and colorectal cancers had favorable trends (-9% and -7%), but female lung cancer rates are predicted to rise 8%. Pancreatic cancer is the only neoplasm with a negative outlook in both sexes. Only in the young (25-49 years), EU trends become more favorable in men, while women keep registering slight predicted rises. CONCLUSIONS: Cancer mortality predictions for 2014 confirm the overall favorable cancer mortality trend in the EU, translating to an overall 26% fall in men since its peak in 1988, and 20% in women, and the avoidance of over 250,000 deaths in 2014 compared with the peak rate. Notable exceptions are female lung cancer and pancreatic cancer in both sexes.
Abstract:
BACKGROUND Drugs for inhalation are the cornerstone of therapy in obstructive lung disease. We have observed that up to 75 % of patients do not perform a correct inhalation technique. The inability of patients to correctly use their inhaler device may be a direct consequence of insufficient or poor inhaler technique instruction. The objective of this study is to test the efficacy of two educational interventions to improve the inhalation techniques in patients with Chronic Obstructive Pulmonary Disease (COPD). METHODS This study uses both a multicenter patients' preference trial and a comprehensive cohort design with 495 COPD-diagnosed patients selected by a non-probabilistic method of sampling from seven Primary Care Centers. The participants will be divided into two groups and five arms. The two groups are: 1) the patients' preference group with two arms and 2) the randomized group with three arms. In the preference group, the two arms correspond to the two educational interventions (Intervention A and Intervention B) designed for this study. In the randomized group the three arms comprise: intervention A, intervention B and a control arm. Intervention A is written information (a leaflet describing the correct inhalation techniques). Intervention B is written information about inhalation techniques plus training by an instructor. Every patient in each group will be visited six times during the year of the study at the health care center. DISCUSSION Our hypothesis is that the application of two educational interventions in patients with COPD who are treated with inhaled therapy will increase the number of patients who perform a correct inhalation technique by at least 25 %. We will evaluate the effectiveness of these interventions on patient inhalation technique improvement, considering that it will be adequate and feasible within the context of clinical practice.
Abstract:
A new ceratomyxid parasite was examined for taxonomic identification, upon being found infecting the gall bladder of Hemiodus microlepis (Teleostei: Hemiodontidae), a freshwater teleost collected from the Amazon River, Brazil. Light and transmission electron microscopy revealed elongated crescent-shaped spores constituted by two asymmetrical shell valves united along a straight sutural line, each possessing a lateral projection. The spore body measured 5.2 ± 0.4 µm (n = 25) in length and 35.5 ± 0.9 µm (n = 25) in total thickness. The lateral projections were asymmetric, one measuring 18.1 ± 0.5 µm (n = 25) in thickness and the other measuring 17.5 ± 0.5 µm (n = 25) in thickness. Two equal-sized subspherical polar capsules measuring 2.2 ± 0.3 µm in diameter were located at the same level, each possessing a polar filament with 5-6 coils. The sporoplasm was binucleate. Considering the morphometric data analyzed from the microscopic observations, as well as the host species and its geographical location, this paper describes a new myxosporean species, herein named Ceratomyxa microlepis sp. nov., which represents the first description of a freshwater ceratomyxid from the South American region.
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. Recent advances in machine learning offer a novel approach to model spatial distribution of petrophysical properties in complex reservoirs as an alternative to geostatistics. The approach is based on semi-supervised learning, which handles both "labelled" observed data and "unlabelled" data, which have no measured value but describe prior knowledge and other relevant data in forms of manifolds in the input space where the modelled property is continuous. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe stochastic variability and non-uniqueness of spatial properties. On the other hand, it is able to capture and preserve key spatial dependencies such as connectivity of high permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR as a data driven algorithm is designed to integrate various kinds of conditioning information and learn dependences from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in available data. In this work, a stochastic semi-supervised SVR geomodel is integrated into a Bayesian framework to quantify uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate posterior probability distribution. Uncertainty of the model is described by posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, smoothness/variability of spatial property distribution.
The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.
Abstract:
Study of the likelihood and prevalence of patients with COPD over a year in a family medicine consultation, covering 2012 and the first two months of 2013. In a health center practice of about 1500 patients, the probabilistic evolution was studied every 6 months according to the theory of Laplace. The study analyzes COPD, its symptoms, etiology, clinical consultation and treatment in family medicine.
Abstract:
In the forensic examination of DNA mixtures, the question of how to set the total number of contributors (N) presents a topic of ongoing interest. Part of the discussion gravitates around issues of bias, in particular when assessments of the number of contributors are not made prior to considering the genotypic configuration of potential donors. Further complication may stem from the observation that, in some cases, there may be numbers of contributors that are incompatible with the set of alleles seen in the profile of a mixed crime stain, given the genotype of a potential contributor. In such situations, procedures that take a single and fixed number of contributors as their output can lead to inferential impasses. Assessing the number of contributors within a probabilistic framework can help avoid such complications. Using elements of decision theory, this paper analyses two strategies for inference on the number of contributors. One procedure is deterministic and focuses on the minimum number of contributors required to 'explain' an observed set of alleles. The other procedure is probabilistic using Bayes' theorem and provides a probability distribution for a set of numbers of contributors, based on the set of observed alleles as well as their respective rates of occurrence. The discussion concentrates on mixed stains of varying quality (i.e., different numbers of loci for which genotyping information is available). A so-called qualitative interpretation is pursued since quantitative information such as peak area and height data are not taken into account. The competing procedures are compared using a standard scoring rule that penalizes the degree of divergence between a given agreed value for N, that is the number of contributors, and the actual value taken by N.
Using only modest assumptions and a discussion with reference to a casework example, this paper reports on analyses using simulation techniques and graphical models (i.e., Bayesian networks) to point out that setting the number of contributors to a mixed crime stain in probabilistic terms is, for the conditions assumed in this study, preferable to a decision policy that uses categoric assumptions about N.
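The two strategies contrasted in this abstract can be sketched side by side. This is a minimal illustration, not the paper's procedure: the deterministic rule assumes diploid contributors who each carry at most two distinct alleles per locus, and the probabilistic rule is a plain application of Bayes' theorem to likelihood values that are hypothetical placeholders (the paper derives them from allele occurrence rates). Locus names, alleles and function names are likewise invented for illustration.

```python
import math


def minimum_contributors(alleles_per_locus):
    """Smallest N able to 'explain' the observed alleles.

    Assumes each contributor carries at most two distinct alleles per locus,
    so a locus showing k distinct alleles needs at least ceil(k / 2) donors.
    """
    return max(
        math.ceil(len(set(alleles)) / 2) for alleles in alleles_per_locus.values()
    )


def posterior_over_n(prior, likelihood):
    """Bayes' theorem over candidate values of N.

    prior and likelihood are dicts keyed by N; likelihood[n] stands for
    Pr(observed alleles | N = n).
    """
    joint = {n: prior[n] * likelihood[n] for n in prior}
    total = sum(joint.values())
    return {n: value / total for n, value in joint.items()}


# Hypothetical mixed profile: locus_B shows five distinct alleles.
profile = {
    "locus_A": ["12", "14", "15"],
    "locus_B": ["8", "9", "10", "11", "13"],
}
n_min = minimum_contributors(profile)

# Hypothetical uniform prior and placeholder likelihoods for N = 3, 4, 5.
prior = {3: 1 / 3, 4: 1 / 3, 5: 1 / 3}
likelihood = {3: 0.02, 4: 0.01, 5: 0.005}
posterior = posterior_over_n(prior, likelihood)
```

The deterministic rule returns a single value, whereas the posterior keeps probability mass on larger values of N, which is the property the abstract argues helps avoid inferential impasses.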
Abstract:
A ubiquitous assessment of swimming velocity (main metric of the performance) is essential for the coach to provide tailored feedback to the trainee. We present a probabilistic framework for the data-driven estimation of the swimming velocity at every cycle using a low-cost wearable inertial measurement unit (IMU). The statistical validation of the method on 15 swimmers shows that an average relative error of 0.1 ± 9.6% and high correlation with the tethered reference system (rX,Y = 0.91) is achievable. Besides, a simple tool to analyze the influence of sacrum kinematics on the performance is provided.