900 results for bayesian inference
Abstract:
The pantropical family Eriocaulaceae includes ten genera and c. 1,400 species, with diversity concentrated in the New World. The last complete revision of the family was published more than 100 years ago, and until recently the generic and infrageneric relationships were poorly resolved. However, a multi-disciplinary approach over the last 30 years, using morphological and anatomical characters, has been supplemented with additional data from palynology, chemistry, embryology, population genetics, cytology and, more recently, molecular phylogenetic studies. This led to a reassessment of phylogenetic relationships within the family. In this paper we present new data for the ITS and trnL-F regions, analysed separately and in combination, using maximum parsimony and Bayesian inference. The data confirm previous results, and show that many characters traditionally used for differentiating and circumscribing the genera within the family are homoplasious. A new generic key with characters from various sources and reflecting the current taxonomic changes is presented.
Abstract:
Automatic identification and extraction of bone contours from X-ray images is an essential first step for further medical image analysis. In this paper we propose a 3D statistical model based framework for proximal femur contour extraction from calibrated X-ray images. The automatic initialization is solved by an estimation of Bayesian network algorithm that fits a multiple component geometrical model to the X-ray data. The contour extraction is accomplished by a non-rigid 2D/3D registration between a 3D statistical model and the X-ray images, in which bone contours are extracted by a graphical model based Bayesian inference. Preliminary experiments on clinical data sets verified its validity.
Abstract:
In this study, we present a novel genotyping scheme to classify German wild-type varicella-zoster virus (VZV) strains and to differentiate them from the Oka vaccine strain (genotype B). This approach is based on analysis of four loci in open reading frames (ORFs) 51 to 58, encompassing a total length of 1,990 bp. The new genotyping scheme produced identical clusters in phylogenetic analyses compared to full-genome sequences from well-characterized VZV strains. Based on genotype A, D, B, and C reference strains, a dichotomous identification key (DIK) was developed and applied for VZV strains obtained from vesicle fluid and liquor samples originating from 42 patients suffering from varicella or zoster between 2003 and 2006. Sequencing of regions in ORFs 51, 52, 53, 56, 57, and 58 identified 18 single-nucleotide polymorphisms (SNPs), including two novel ones, SNP 89727 and SNP 92792 in ORF51 and ORF52, respectively. The DIK as well as phylogenetic analysis by Bayesian inference showed that 14 VZV strains belonged to genotype A, and 28 VZV strains were classified as genotype D. Neither Japanese (vaccine)-like B strains nor recombinant-like C strains were found within the samples from Germany. The novel genotyping scheme and the DIK were demonstrated to be practical and simple and allow the highly efficient replication of phylogenetic patterns in VZV initially derived from full-genome DNA sequence analyses. Therefore, this approach may allow us to draw a more comprehensive picture of wild-type VZV strains circulating in Germany and Central Europe by high-throughput procedures in the future.
Abstract:
In a statistical inference scenario, a target signal or its parameters are estimated by processing data from informative measurements. The estimation performance can be enhanced if we choose the measurements based on criteria that direct our sensing resources so that the measurements are more informative about the parameter we intend to estimate. When taking multiple measurements, the measurements can be chosen online so that more information is extracted from the data in each measurement process. This approach fits well in the Bayesian inference model often used to produce successive posterior distributions of the associated parameter. We explore the sensor array processing scenario for adaptive sensing of a target parameter. The measurement choice is described by a measurement matrix that multiplies the data vector normally associated with array signal processing. The adaptive sensing of both static and dynamic system models is done by the online selection of a proper measurement matrix over time. For the dynamic system model, the target is assumed to move with some distribution and the prior distribution at each time step is changed. The information gained through adaptive sensing of the moving target is lost due to the relative shift of the target. The adaptive sensing paradigm has many similarities with compressive sensing. We have attempted to reconcile the two approaches by modifying the observation model of adaptive sensing to match the compressive sensing model for the estimation of a sparse vector.
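The online measurement selection described in this abstract can be illustrated with a small sketch. It is not the authors' method: it assumes a linear Gaussian observation model y = a·x + noise with a Gaussian prior, a finite set of candidate measurement vectors, and a greedy rule that picks the candidate with the largest mutual information at each step. All function names and numbers are hypothetical.

```python
import math
import random

def matvec(cov, a):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * a[j] for j in range(len(a))) for row in cov]

def info_gain(cov, a, noise_var):
    """Mutual information (nats) between x and y = a.x + noise for Gaussian x."""
    ca = matvec(cov, a)
    signal_var = sum(ai * ci for ai, ci in zip(a, ca))
    return 0.5 * math.log(1.0 + signal_var / noise_var)

def posterior_update(mean, cov, a, y, noise_var):
    """Gaussian posterior after observing one scalar measurement y."""
    ca = matvec(cov, a)
    s = sum(ai * ci for ai, ci in zip(a, ca)) + noise_var  # predictive variance
    residual = y - sum(ai * mi for ai, mi in zip(a, mean))
    new_mean = [m + c * residual / s for m, c in zip(mean, ca)]
    n = len(a)
    new_cov = [[cov[i][j] - ca[i] * ca[j] / s for j in range(n)] for i in range(n)]
    return new_mean, new_cov

def adaptive_sensing(x_true, prior_cov, candidates, n_steps, noise_var, rng):
    """Greedy loop: pick the most informative candidate, measure, update."""
    mean = [0.0] * len(x_true)
    cov = [row[:] for row in prior_cov]
    for _ in range(n_steps):
        a = max(candidates, key=lambda c: info_gain(cov, c, noise_var))
        y = sum(ai * xi for ai, xi in zip(a, x_true))
        y += rng.gauss(0.0, math.sqrt(noise_var))
        mean, cov = posterior_update(mean, cov, a, y, noise_var)
    return mean, cov

rng = random.Random(0)
x_true = [1.0, -2.0, 0.5]                      # hypothetical static target
prior_cov = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.25]]
candidates = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mean, cov = adaptive_sensing(x_true, prior_cov, candidates, 30, 0.1, rng)
```

Each update subtracts a positive semidefinite term from the covariance, so the total posterior variance cannot increase from one step to the next. A dynamic target would add a prediction step that inflates the covariance between measurements.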
Abstract:
Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
Abstract:
Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
Abstract:
Seizure freedom in patients suffering from pharmacoresistant epilepsies is still not achieved in 20–30% of all cases. Hence, current therapies need to be improved, based on a more complete understanding of ictogenesis. In this respect, the analysis of functional networks derived from intracranial electroencephalographic (iEEG) data has recently become a standard tool. Functional networks however are purely descriptive models and thus are conceptually unable to predict fundamental features of iEEG time-series, e.g., in the context of therapeutic brain stimulation. In this paper we present some first steps towards overcoming the limitations of functional network analysis, by showing that its results are implied by a simple predictive model of time-sliced iEEG time-series. More specifically, we learn distinct graphical models (so-called Chow–Liu (CL) trees) as models for the spatial dependencies between iEEG signals. Bayesian inference is then applied to the CL trees, allowing for an analytic derivation/prediction of functional networks, based on thresholding of the absolute value of the Pearson correlation coefficient (CC) matrix. Using various measures, the networks thus obtained are then compared to those which were derived in the classical way from the empirical CC-matrix. In the high threshold limit we find (a) an excellent agreement between the two networks and (b) key features of periictal networks as they have previously been reported in the literature. Apart from functional networks, both matrices are also compared element-wise, showing that the CL approach leads to a sparse representation, by setting small correlations to values close to zero while preserving the larger ones. Overall, this paper shows the validity of CL-trees as simple, spatially predictive models for periictal iEEG data. Moreover, we suggest straightforward generalizations of the CL-approach for also modeling the temporal features of iEEG signals.
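A rough sketch of how a Chow–Liu tree can imply a full correlation matrix is given below. It is an illustration under stated assumptions and not the paper's procedure: the signals are treated as jointly Gaussian, so the mutual information of a pair is -0.5·log(1 - rho²) and the correlation implied between two nodes is the product of the edge correlations along the tree path joining them. The function names and the 4×4 matrix are hypothetical.

```python
import math

def chow_liu_edges(corr):
    """Maximum spanning tree (Prim's algorithm) weighted by Gaussian mutual information."""
    n = len(corr)

    def mutual_info(i, j):
        return -0.5 * math.log(1.0 - corr[i][j] ** 2)

    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Best edge leaving the current tree.
        i, j = max(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: mutual_info(*e),
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

def implied_correlations(corr, edges):
    """Correlations implied by the tree: product of edge correlations along each path."""
    n = len(corr)
    neighbours = {i: [] for i in range(n)}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    implied = [[0.0] * n for _ in range(n)]
    for root in range(n):
        implied[root][root] = 1.0
        stack, seen = [root], {root}
        while stack:  # depth-first walk away from the root
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    implied[root][nxt] = implied[root][node] * corr[node][nxt]
                    stack.append(nxt)
    return implied

# Hypothetical empirical correlation matrix for four channels.
corr = [
    [1.0, 0.8, 0.6, 0.1],
    [0.8, 1.0, 0.7, 0.2],
    [0.6, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
edges = chow_liu_edges(corr)
implied = implied_correlations(corr, edges)
```

Thresholding the absolute values of `implied` and of `corr` at the same level would then give the two networks to compare.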
Abstract:
Introduction: Many marine planktonic crustaceans such as copepods have been considered as widespread organisms. However, the growing evidence for cryptic and pseudo-cryptic speciation has emphasized the need to re-evaluate the status of copepod species complexes in molecular and morphological studies to get a clearer picture of pelagic marine species as evolutionary units and their distributions. This study analyses the molecular diversity of the ecologically important Paracalanus parvus species complex. Its seven currently recognized species are abundant and also often dominant in marine coastal regions worldwide from temperate to tropical oceans. Results: COI and Cytochrome b sequences of 160 specimens of the Paracalanus parvus complex from all oceans were obtained. Furthermore, 42 COI sequences from GenBank were added for the genetic analyses. Thirteen distinct molecular operational taxonomic units (MOTU) and two single sequences were revealed with cladistic analyses (Maximum Likelihood, Bayesian Inference), of which seven were identical with results from species delimitation methods (barcode gaps, ABGD, GMYC, Rosenberg's P(AB)). In total, 10 to 12 putative species were detected and could be placed in three categories: (1) temperate geographically isolated, (2) warm-temperate to tropical wider spread and (3) circumglobal warm-water species. Conclusions: The present study provides evidence of cryptic or pseudocryptic speciation in the Paracalanus parvus complex. One major insight is that the species Paracalanus parvus s.s. is not panmictic, but may be restricted in its distribution to the northeastern Atlantic.
Abstract:
Monte Carlo techniques, which require the generation of samples from some target density, are often the only alternative for performing Bayesian inference. Two classic sampling techniques to draw independent samples are the ratio of uniforms (RoU) and rejection sampling (RS). An efficient sampling algorithm is proposed combining the RoU and polar RS (i.e. RS inside a sector of a circle using polar coordinates). Its efficiency is shown in drawing samples from truncated Cauchy and Gaussian random variables, which have many important applications in signal processing and communications. SUMMARY. An efficient method for generating some random variables commonly used in signal processing and communications (for example, truncated Gaussian or Cauchy variables) by combining two techniques: "ratio of uniforms" and "rejection sampling".
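The two classic building blocks named in this abstract can be sketched for the truncated Cauchy case. This is the textbook ratio-of-uniforms construction with plain rejection from a bounding rectangle, not the combined polar scheme the paper proposes. For the unnormalised Cauchy density 1/(1 + x²), the RoU acceptance region {(u, v): 0 < u ≤ sqrt(f(v/u))} is the half disc u² + v² ≤ 1 with u > 0. The function name and truncation bounds are hypothetical.

```python
import random

def sample_truncated_cauchy(rng, lower=float("-inf"), upper=float("inf")):
    """Draw one standard Cauchy sample restricted to [lower, upper]."""
    while True:
        # Propose uniformly in the rectangle [0, 1] x [-1, 1].
        u = rng.random()
        v = rng.uniform(-1.0, 1.0)
        # Rejection step: keep only points inside the RoU half disc.
        if u == 0.0 or u * u + v * v > 1.0:
            continue
        x = v / u  # the ratio of uniforms is Cauchy distributed
        # Truncation by a second rejection (wasteful for narrow intervals).
        if lower <= x <= upper:
            return x

rng = random.Random(1)
samples = [sample_truncated_cauchy(rng, -2.0, 2.0) for _ in range(20000)]
fraction_inside_unit = sum(abs(x) <= 1.0 for x in samples) / len(samples)
```

For a Cauchy variable truncated to [-2, 2], the probability of |x| ≤ 1 is atan(1)/atan(2), about 0.709, which gives a quick sanity check on `fraction_inside_unit`. Because the truncation interval corresponds to an angular sector of the half disc, sampling directly inside that sector avoids the second rejection step.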
Abstract:
We present a biomolecular probabilistic model driven by the action of a DNA toolbox made of a set of DNA templates and enzymes that is able to perform Bayesian inference. The model will take single-stranded DNA as input data, representing the presence or absence of a specific molecular signal (the evidence). The program logic uses different DNA templates and their relative concentration ratios to encode the prior probability of a disease and the conditional probability of a signal given the disease. When the input and program molecules interact, an enzyme-driven cascade of reactions (DNA polymerase extension, nicking and degradation) is triggered, producing a different pair of single-stranded DNA species. Once the system reaches equilibrium, the ratio between the output species will represent the application of Bayes' law: the conditional probability of the disease given the signal. In other words, it yields a qualitative diagnosis plus a quantitative degree of belief in that diagnosis. Thanks to the inherent amplification capability of this DNA toolbox, the resulting system will be able to scale up (with longer cascades and thus more input signals) a Bayesian biosensor that we designed previously.
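The computation that the molecular system is meant to encode is a single application of Bayes' law, which can be written out numerically. The sketch below is only the arithmetic, not a model of the DNA reactions. The probability of the signal in the absence of disease is an extra quantity the calculation needs, and all numbers are hypothetical.

```python
def posterior_disease(prior, p_signal_given_disease, p_signal_given_healthy,
                      signal_present=True):
    """Return P(disease | evidence) for a binary signal using Bayes' law."""
    for p in (prior, p_signal_given_disease, p_signal_given_healthy):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if signal_present:
        like_disease = p_signal_given_disease
        like_healthy = p_signal_given_healthy
    else:
        # Evidence is the absence of the signal.
        like_disease = 1.0 - p_signal_given_disease
        like_healthy = 1.0 - p_signal_given_healthy
    numerator = prior * like_disease
    evidence = numerator + (1.0 - prior) * like_healthy
    return numerator / evidence

# Hypothetical values: 1% prevalence, 90% sensitivity, 5% false-positive rate.
posterior = posterior_disease(0.01, 0.9, 0.05)
# Ratio of the two hypotheses, the analogue of the output species ratio.
odds = posterior / (1.0 - posterior)
```

With these numbers the posterior is 0.009 / 0.0585, roughly 0.154: a positive signal raises the belief in disease from 1% to about 15%.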
Abstract:
In this work we propose the use of a Bayesian method to estimate the memory parameter of a long-memory stochastic process when its likelihood function is intractable or unavailable. This approach provides an approximation to the posterior distribution over the memory and other parameters and is based on a simple application of the method known as approximate Bayesian computation (ABC). Some popular estimators of the memory parameter are reviewed and compared with this approach. Our proposal makes it feasible to solve complex problems from the Bayesian point of view and, although approximate, performs very satisfactorily when compared with classical methods.
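A minimal ABC sketch for a memory parameter is shown below. The abstract does not specify the model, summary statistics, or acceptance rule, so everything here is an assumption: the process is ARFIMA(0, d, 0) simulated through a truncated moving-average expansion, the summaries are the first three sample autocorrelations, the prior on d is uniform on [0, 0.49], and the draws closest to the observed summaries are kept.

```python
import random

def simulate_arfima(d, n, rng, n_lags=50):
    """Simulate ARFIMA(0, d, 0) with the MA(infinity) expansion cut at n_lags terms."""
    psi = [1.0]
    for k in range(1, n_lags):
        psi.append(psi[-1] * (k - 1 + d) / k)  # weights of (1 - B)^(-d)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + n_lags)]
    return [sum(psi[k] * eps[t + n_lags - k] for k in range(n_lags))
            for t in range(n)]

def autocorr(x, max_lag):
    """Sample autocorrelations at lags 1..max_lag (the summary statistics)."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((xi - m) ** 2 for xi in x)
    return [sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag)) / c0
            for lag in range(1, max_lag + 1)]

def abc_rejection(observed, n_draws, keep, rng):
    """Keep the `keep` prior draws whose simulated summaries are closest to the data."""
    s_obs = autocorr(observed, 3)
    trials = []
    for _ in range(n_draws):
        d = rng.uniform(0.0, 0.49)  # draw from the prior
        s_sim = autocorr(simulate_arfima(d, len(observed), rng), 3)
        distance = sum((a - b) ** 2 for a, b in zip(s_obs, s_sim))
        trials.append((distance, d))
    trials.sort()
    return [d for _, d in trials[:keep]]

rng = random.Random(42)
observed = simulate_arfima(0.3, 200, rng)   # synthetic "data" with d = 0.3
accepted = abc_rejection(observed, n_draws=300, keep=30, rng=rng)
posterior_mean = sum(accepted) / len(accepted)
```

The accepted values approximate a sample from the posterior of d given the summaries. No likelihood evaluation is needed, only the ability to simulate the process.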
Abstract:
In this thesis, the origin of large-scale structures in hot star winds, believed to be responsible for the presence of discrete absorption components (DACs) in the absorption troughs of ultraviolet resonance lines, is constrained using both observations and numerical simulations. These structures are understood as arising from bright regions on the stellar surface, although their physical cause remains unknown. First, we use high quality circular spectropolarimetric observations of 13 well-studied OB stars to evaluate the potential role of dipolar magnetic fields in producing DACs. We perform longitudinal field measurements and place limits on the field strength using Bayesian inference, assuming that it is dipolar. No magnetic field was detected within this sample. The derived constraints statistically refute any significant dynamical influence from a magnetic dipole on the wind for all of these stars, ruling out such fields as a cause for DACs. Second, we perform numerical simulations using bright spots constrained by broadband optical photometric observations. We calculate hydrodynamical wind models using three sets of spot sizes and strengths. Each model yields co-rotating interaction regions, and radiative transfer shows that the properties of the variations in the UV resonance lines synthesized from these models are consistent with those found in observed UV spectra, establishing the first consistent link between UV spectroscopic line profile variability and photometric variations and thus supporting the bright spot paradigm (BSP). Finally, we develop and apply a phenomenological model to quantify the measurable effects co-rotating bright spots would have on broadband optical photometry and on the profiles of photospheric lines in optical spectra. This model can be used to evaluate the existence of these spots, and, in the event of their detection, characterize them. Furthermore, a tentative spot evolution model is presented.
A preliminary analysis of its output, compared to the observed photometric variations of xi Persei, suggests the possible existence of “active longitudes” on the surface of this star. Future work will expand the range of observational diagnostics that can be interpreted within the BSP, and link phenomenology (bright spots) to physical processes (magnetic spots or non-radial pulsations).
Abstract:
Sequences of small-subunit rRNA genes were determined for Dermocystidium percae and a new Dermocystidium species established as D. fennicum sp. n. from perch in Finland. On the basis of alignment and phylogenetic analysis both species were placed in the Dermocystidium-Rhinosporidium clade within Ichthyosporea, D. fennicum as a specific sister taxon to D. salmonis, and D. percae in a clade different from D. fennicum. The ultrastructures of both species agree well with the characteristics approved within Ichthyosporea: walled spores produce uniflagellate zoospores lacking a collar or cortical alveoli. The two Dermocystidium species resemble Rhinosporidium seeberi (as described by light microscopy), a member of the nearest relative genus, but differ in that in R. seeberi plasmodia have thousands of nuclei discernible, endospores are discharged through a pore in the wall of the sporangium, and zoospores have not been revealed. The plasmodial stages of both Dermocystidium species have a most unusual behaviour of nuclei, although we do not actually know how the nuclei transform during the development. Early stages have an ordinary nucleus with double, fenestrated envelope. In middle-aged plasmodia ordinary nuclei seem to be totally absent or are only seldom discernible until prior to sporogony, when rather numerous nuclei reappear. Meanwhile single-membrane vacuoles with coarsely granular content, or complicated membranous systems were discernible. Ordinary nuclei may be re-formed within these vacuoles or systems. In D. percae small canaliculi and in D. fennicum minute vesicles may aid the nucleus-cytoplasm interchange of matter before formation of double-membrane-enveloped nuclei. Dermocystidium represents a unique case in which a stage of the life cycle of a eukaryote lacks a typical nucleus.
Abstract:
The green sea turtle is one of the long-lived species that comprise the charismatic marine megafauna. The green turtle has a long history of human exploitation with some stocks extinct. Here we report on a 30-year study of the nesting abundance of the green turtle stock endemic to the Hawaiian Archipelago. We show that there has been a substantial long-term increase in abundance of this once seriously depleted stock following cessation of harvesting since the 1970s. This population increase has occurred in a far shorter period of time than previously thought possible. There was also a distinct 3-4 year periodicity in annual nesting abundance that might be a function of regional environmental stochasticity that synchronises breeding behaviour throughout the Archipelago. This is one of the few reliable long-term population abundance time series for a large long-lived marine species, which are needed for gaining insights into the recovery process of long-lived marine species and long-term ecological processes. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
Euastacus crayfish are endemic to freshwater ecosystems of the eastern coast of Australia. While recent evolutionary studies have focused on a few of these species, here we provide a comprehensive phylogenetic estimate of relationships among the species within the genus. We sequenced three mitochondrial gene regions (COI, 16S, and 12S) and one nuclear region (28S) from 40 species of the genus Euastacus, as well as one undescribed species. Using these data, we estimated the phylogenetic relationships within the genus using maximum-likelihood, parsimony, and Bayesian Markov Chain Monte Carlo analyses. Using Bayes factors to test different model hypotheses, we found that the best phylogeny supports monophyletic groupings of all but two recognized species and suggests a widespread ancestor that diverged by vicariance. We also show that Euastacus and Astacopsis are most likely monophyletic sister genera. We use the resulting phylogeny as a framework to test biogeographic hypotheses relating to the diversification of the genus. (c) 2005 Elsevier Inc. All rights reserved.