Biblioteca Digital

118 results for Multimodel Inference

Lossless filter for multiple repeats with bounded edit distance

Relevance:

10.00%

Abstract:

Background: Identifying local similarity between two or more sequences, or identifying repeats occurring at least twice in a sequence, is an essential part of the analysis of biological sequences and of their phylogenetic relationships. Finding such fragments while allowing for a certain number of insertions, deletions, and substitutions is, however, known to be a computationally expensive task, and consequently exact methods usually cannot be applied in practice. Results: The filter TUIUIU that we introduce in this paper provides a possible solution to this problem. It can be used as a preprocessing step to any multiple alignment or repeats inference method, eliminating a possibly large fraction of the input that is guaranteed not to contain any approximate repeat. It consists of verifying several strong necessary conditions that can be checked quickly. We implemented three versions of the filter. The first is a straightforward extension, to the case of multiple sequences, of conditions already existing in the literature. The second uses a stronger condition which, as our results show, filters substantially more with negligible (if any) additional time. The third version uses an additional condition and pushes the sensitivity of the filter even further, at a non-negligible additional time cost in many circumstances; our experiments show that it is particularly useful with large error rates. This last version was applied as a preprocessing step for a multiple alignment tool, giving an overall time (filter plus alignment) on average 63 and at best 530 times smaller than before (direct alignment), with, in most cases, a better-quality alignment. Conclusion: To the best of our knowledge, TUIUIU is the first filter designed for multiple repeats and for dealing with error rates greater than 10% of the repeat length.
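To make the idea of a lossless filter based on a necessary condition concrete, here is a minimal sketch of the classical q-gram lemma check for a pair of strings. It is not TUIUIU's own condition, which the abstract does not spell out; the function names and the example strings are assumptions chosen for illustration. The lemma states that two strings within edit distance k share at least max(|a|, |b|) - q + 1 - k*q q-grams, so any pair sharing fewer can be discarded without losing a true match.

```python
from collections import Counter


def qgram_profile(s: str, q: int) -> Counter:
    """Multiset of all length-q substrings of s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def shared_qgrams(a: str, b: str, q: int) -> int:
    """Number of q-grams common to a and b, counted with multiplicity."""
    return sum((qgram_profile(a, q) & qgram_profile(b, q)).values())


def may_be_within_distance(a: str, b: str, q: int, k: int) -> bool:
    """Necessary (not sufficient) condition for edit distance <= k.

    Returning False proves the pair is farther apart than k edits;
    returning True only means the pair survives the filter.
    """
    threshold = max(len(a), len(b)) - q + 1 - k * q
    return shared_qgrams(a, b, q) >= threshold


if __name__ == "__main__":
    # One substitution apart: must survive the filter.
    print(may_be_within_distance("ACGTACGTAC", "ACGTTCGTAC", q=3, k=1))
    # No q-gram in common: safely discarded.
    print(may_be_within_distance("AAAAAAAAAA", "CCCCCCCCCC", q=3, k=1))
```

Pairs that pass still have to be verified by an exact method; the filter only shrinks the input that the expensive step sees.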

Modeling associations between genetic markers using Bayesian networks

Relevance:

10.00%

Abstract:

Motivation: Understanding the patterns of association between polymorphisms at different loci in a population (linkage disequilibrium, LD) is of fundamental importance in various genetic studies. Many coefficients were proposed for measuring the degree of LD, but they provide only a static view of the current LD structure. Generative models (GMs) were proposed to go beyond these measures, giving not only a description of the actual LD structure but also a tool to help understand the process that generated such structure. GMs based on coalescent theory have been the most appealing because they link LD to evolutionary factors. Nevertheless, the inference and parameter estimation of such models is still computationally challenging. Results: We present a more practical method to build GMs that describe LD. The method is based on learning weighted Bayesian network structures from haplotype data, extracting equivalence structure classes and using them to model LD. The results obtained on public data from the HapMap database showed that the method is a promising tool for modeling LD. The associations represented by the learned models are correlated with the traditional measure of LD, D'. The method was able to represent LD blocks found by standard tools. The granularity of the association blocks and the readability of the models can be controlled in the method. The results suggest that the causality information gained by our method can be useful to assess the conservation of the genetic markers and to guide the selection of a subset of representative markers.
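For readers unfamiliar with the D' coefficient mentioned above, the following sketch computes the standard normalized LD measure for two biallelic loci from haplotype and allele frequencies. It illustrates only the traditional coefficient, not the Bayesian network method of the paper; the function name and the example frequencies are assumptions.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Absolute normalized linkage disequilibrium |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # If an allele is fixed, D is necessarily 0 and D' is reported as 0 here.
    return 0.0 if d_max == 0 else abs(d) / d_max


if __name__ == "__main__":
    print(d_prime(p_ab=0.5, p_a=0.5, p_b=0.5))   # complete association
    print(d_prime(p_ab=0.25, p_a=0.5, p_b=0.5))  # independent loci
```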

Using Bayesian networks with rule extraction to infer the risk of weed infestation in a corn-crop

Relevance:

10.00%

Abstract:

This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed-crop competitiveness, using as input categorical variables for the total density of weeds and the corresponding proportions of narrow- and broad-leaved weeds. The inferred categorical variable values for the weed-crop competitiveness, along with three other categorical variables extracted from estimated maps for the weed seed production and weed coverage, are then used as input for a second Bayesian network classifier to infer categorical variable values for the risk of infestation. Weed biomass and yield loss data samples are used to learn the probability relationships among the nodes of the first and second Bayesian classifiers, respectively, in a supervised fashion. For comparison purposes, two types of Bayesian network structures are considered, namely an expert-based Bayesian classifier and a naive Bayes classifier. The inference system focuses on knowledge interpretation by translating a Bayesian classifier into a set of classification rules. The results obtained for the risk inference in a corn-crop field are presented and discussed. (C) 2009 Elsevier Ltd. All rights reserved.

Assembling a consistent set of sentences in relational probabilistic logic with stochastic independence

Relevance:

10.00%

Abstract:

We examine the representation of judgements of stochastic independence in probabilistic logics. We focus on a relational logic where (i) judgements of stochastic independence are encoded by directed acyclic graphs, and (ii) probabilistic assessments are flexible in the sense that they are not required to specify a single probability measure. We discuss issues of knowledge representation and inference that arise from our particular combination of graphs, stochastic independence, logical formulas and probabilistic assessments. (C) 2007 Elsevier B.V. All rights reserved.

Approximate algorithms for credal networks with binary variables

Relevance:

10.00%

Abstract:

This paper presents a family of algorithms for approximate inference in credal networks (that is, models based on directed acyclic graphs and set-valued probabilities) that contain only binary variables. Such networks can represent incomplete or vague beliefs, lack of data, and disagreements among experts; they can also encode models based on belief functions and possibilistic measures. All algorithms for approximate inference in this paper rely on exact inferences in credal networks based on polytrees with binary variables, as these inferences have polynomial complexity. We are inspired by approximate algorithms for Bayesian networks; thus the Loopy 2U algorithm resembles Loopy Belief Propagation, while the Iterated Partial Evaluation and Structured Variational 2U algorithms are, respectively, based on Localized Partial Evaluation and variational techniques. (C) 2007 Elsevier Inc. All rights reserved.
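As a small illustration of what inference with set-valued probabilities means, the sketch below computes lower and upper bounds on a marginal in a two-node binary network A -> B where each probability is only known to lie in an interval. It uses brute-force enumeration of interval endpoints, which is exact for this tiny case because the marginal is multilinear in the parameters; it is not one of the approximate algorithms named in the abstract, and all names and numbers are assumptions.

```python
from itertools import product


def marginal_bounds(p_a, p_b_given_a1, p_b_given_a0):
    """Lower and upper bounds on P(B=1) in the network A -> B.

    Each argument is a (low, high) interval:
      p_a           bounds on P(A=1)
      p_b_given_a1  bounds on P(B=1 | A=1)
      p_b_given_a0  bounds on P(B=1 | A=0)
    The extremes are attained at interval endpoints, so checking all
    2 * 2 * 2 endpoint combinations is enough.
    """
    values = [
        a * b1 + (1.0 - a) * b0
        for a, b1, b0 in product(p_a, p_b_given_a1, p_b_given_a0)
    ]
    return min(values), max(values)


if __name__ == "__main__":
    low, high = marginal_bounds((0.2, 0.4), (0.7, 0.9), (0.1, 0.3))
    print(f"P(B=1) lies in [{low:.2f}, {high:.2f}]")
```

Enumeration grows exponentially with the number of interval-valued parameters, which is the kind of cost that motivates approximate schemes for larger networks.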

Estimating wave spectra from the motions of moored vessels: Experimental validation

Relevance:

10.00%

Abstract:

The practicability of estimating directional wave spectra based on a vessel's 1st order response has been recently addressed by several researchers. Different alternatives regarding statistical inference methods and possible drawbacks that could arise from their application have been extensively discussed, with an apparent preference for estimations based on Bayesian inference algorithms. Most of the results on this matter, however, rely exclusively on numerical simulations or at best on few and sparse full-scale measurements, comprising a questionable basis for validation purposes. This paper discusses several issues that have recently been debated regarding the advantages of Bayesian inference and different alternatives for its implementation. Among those are the definition of the best set of input motions, the number of parameters required for guaranteeing smoothness of the spectrum in frequency and direction and how to determine their optimum values. These subjects are addressed in the light of an extensive experimental campaign performed with a small-scale model of an FPSO platform (VLCC hull), which was conducted in an ocean basin in Brazil. Tests involved long and short crested seas with variable levels of directional spreading and also bimodal conditions. The calibration spectra measured in the tank by means of an array of wave probes configured the paradigm for estimations. Results showed that a wide range of sea conditions could be estimated with good precision, even those with somewhat low peak periods. Some possible drawbacks that have been pointed out in previous works concerning the viability of employing large vessels for such a task are then refuted. Also, it is shown that a second parameter for smoothing the spectrum in frequency may indeed increase the accuracy in some situations, although the criterion usually proposed for estimating the optimum values (ABIC) demands large computational effort and does not seem adequate for practical on-board systems, which require expeditious estimations. (C) 2009 Elsevier Ltd. All rights reserved.

Directional Wave Spectrum Estimation Based on Vessel's 1st-order Motions: Field Results

Relevance:

10.00%

Abstract:

In this paper, two different approaches for estimating the directional wave spectrum based on a vessel's 1st-order motions are discussed, and their predictions are compared to those provided by a wave buoy. The real-scale data were obtained in an extensive monitoring campaign based on an FPSO unit operating at Campos Basin, Brazil. Data included vessel motions, heading and tank loadings. Wave field information was obtained by means of a heave-pitch-roll buoy installed in the vicinity of the unit. Two of the methods most widely used for this kind of analysis are considered, one based on Bayesian statistical inference, the other consisting of a parametric representation of the wave spectrum. The performance of both methods is compared, and their sensitivity to input parameters is discussed. This analysis complements a set of previous validations based on numerical and towing-tank results and allows for a preliminary evaluation of reliability when applying the methodology at full scale.

Chautemsia calcicola: A new genus and species of Gloxinieae (Gesneriaceae) from Minas Gerais, Brazil

Relevance:

10.00%

Abstract:

A new species of Gesneriaceae discovered in remnants of deciduous forests on limestone outcrops in Minas Gerais, Brazil, is described and compared with morphologically related taxa. This plant presents the diagnostic features of the tribe Gloxinieae, but a unique combination of morphological traits distinguishes this taxon from previously described genera. Its phylogenetic position was inferred by analyzing DNA sequence variation at five loci: the rpl1 intron, rps16 intron, trnL-F intron-spacer, a portion of the plastid-expressed glutamine synthetase gene (ncpGS) and the ribosomal DNA internal transcribed spacer (ITS). Molecular phylogenetic analyses confirm the position of this new species in the Gloxinieae, as a sister lineage of a clade including the Brazilian genera Mandirola and Goyazia. However, tests using topological constraints do not reject the alternative relationship that places this taxon with Gloxiniopsis in a monophyletic group. To accommodate this species in the current generic circumscription of Gloxinieae, the new genus Chautemsia A.O. Araujo & V.C. Souza is created.

The generalized log-gamma mixture model with covariates: local influence and residual analysis

Relevance:

10.00%

Abstract:

In a sample of censored survival times, the presence of an immune proportion of individuals who are not subject to death, failure or relapse may be indicated by a relatively high number of individuals with large censored survival times. In this paper the generalized log-gamma model is modified to allow for the possibility that long-term survivors may be present in the data. The model attempts to separately estimate the effects of covariates on the surviving fraction, that is, the proportion of the population for which the event never occurs. The logistic function is used for the regression model of the surviving fraction. Inference for the model parameters is considered via maximum likelihood. Some influence methods, such as the local influence and total local influence of an individual, are derived, analyzed and discussed. Finally, a data set from the medical area is analyzed under the generalized log-gamma mixture model. A residual analysis is performed in order to select an appropriate model.
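The structure of a mixture model with a surviving fraction can be sketched as follows: the population survival function is S_pop(t | x) = pi(x) + (1 - pi(x)) * S_u(t), where pi(x) is the surviving fraction given by a logistic function of the covariate and S_u is the survival function of the susceptible individuals. This is only an illustration of that structure. The exponential S_u used here is a stand-in assumption to keep the example short (the paper uses the generalized log-gamma distribution), and the parameter names b0, b1 and rate are hypothetical.

```python
import math


def surviving_fraction(x: float, b0: float, b1: float) -> float:
    """Logistic regression model for the proportion that never has the event."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def population_survival(t: float, x: float, b0: float, b1: float, rate: float) -> float:
    """Mixture survival: long-term survivors plus susceptible individuals.

    The susceptible part uses an exponential survival function exp(-rate * t)
    purely as a placeholder for the generalized log-gamma model.
    """
    pi = surviving_fraction(x, b0, b1)
    s_susceptible = math.exp(-rate * t)
    return pi + (1.0 - pi) * s_susceptible


if __name__ == "__main__":
    for t in (0.0, 5.0, 50.0, 500.0):
        print(t, round(population_survival(t, x=1.0, b0=-1.0, b1=0.5, rate=0.1), 4))
```

As t grows, the curve levels off at the surviving fraction rather than at zero, which is the plateau that a high number of large censored times suggests in the data.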

Phylogeography of Chelonus insularis (Hymenoptera: Braconidae) and Campoletis sonorensis (Hymenoptera: Ichneumonidae), Two Primary Neotropical Parasitoids of the Fall Armyworm (Lepidoptera: Noctuidae)

Relevance:

10.00%

Abstract:

In a previous study, we observed no spatial genetic structure in Mexican populations of the parasitoids Chelonus insularis Cresson (Hymenoptera: Braconidae) and Campoletis sonorensis Cameron (Hymenoptera: Ichneumonidae) by using microsatellite markers. In the current study, we investigated whether for these important parasitoids of the fall armyworm (Lepidoptera: Noctuidae) there is any genetic structure at a larger scale. Insects of both species were collected across the American continent and their phylogeography was investigated using both nuclear and mitochondrial markers. Our results suggest an ancient north-south migration of C. insularis, whereas no clear pattern could be determined for C. sonorensis. Nonetheless, the resulting topology indicated the existence of a cryptic taxon within this latter species: a few Canadian specimens determined as C. sonorensis branch outside a clade composed of the Argentinean Chelonus grioti Blanchard, the Brazilian Chelonus flavicincta Ashmead, and the rest of the C. sonorensis individuals. The individuals revealing the cryptic taxon were collected from Trichoplusia ni (Hübner) (Lepidoptera: Noctuidae) on tomato (Lycopersicon spp.) and may represent a biotype that has adapted to the early season phenology of its host. Overall, the loosely defined spatial genetic structure previously shown at a local fine scale also was found at the larger scale, for both species. Dispersal of these insects may be partly driven by wind, as suggested by genetic similarities between individuals coming from very distant locations.

Genetic structure and biology of Xylella fastidiosa strains causing disease in citrus and coffee in Brazil

Relevance:

10.00%

Abstract:

Xylella fastidiosa is a vector-borne, plant-pathogenic bacterium that causes disease in citrus (citrus variegated chlorosis [CVC]) and coffee (coffee leaf scorch [CLS]) plants in Brazil. CVC and CLS occur sympatrically and share leafhopper vectors; thus, determining whether X. fastidiosa isolates can be dispersed from one crop to another and cause disease is of epidemiological importance. We sought to clarify the genetic and biological relationships between CVC- and CLS-causing X. fastidiosa isolates. We used cross-inoculation bioassays and microsatellite and multilocus sequence typing (MLST) approaches to determine the host range and genetic structure of 26 CVC and 20 CLS isolates collected from different regions in Brazil. Our results show that citrus and coffee X. fastidiosa isolates are biologically distinct. Cross-inoculation tests showed that isolates causing CVC and CLS in the field were able to colonize citrus and coffee plants, respectively, but not the other host, indicating biological isolation between the strains. The microsatellite analysis separated most X. fastidiosa populations tested on the basis of the host plant from which they were isolated. However, recombination among isolates was detected and a lack of congruency among phylogenetic trees was observed for the loci used in the MLST scheme. Altogether, our study indicates that CVC and CLS are caused by two biologically distinct strains of X. fastidiosa that have diverged but are genetically homogenized by frequent recombination.

Does using stepwise variable selection to build sequential path analysis models make sense?

Relevance:

10.00%

Abstract:

Causal inference methods - mainly path analysis and structural equation modeling - offer plant physiologists information about cause-and-effect relationships among plant traits. Recently, an unusual approach to causal inference through stepwise variable selection has been proposed and used in various works on plant physiology. The approach should not be considered correct from a biological point of view. Here, it is explained why stepwise variable selection should not be used for causal inference, and shown what strange conclusions can be drawn based upon the former analysis when one aims to interpret cause-and-effect relationships among plant traits.

Tests for the hysteresis hypothesis in Brazilian industrialized exports: A threshold cointegration analysis

Relevance:

10.00%

Abstract:

This paper examines the hysteresis hypothesis in Brazilian industrialized exports using a time series analysis. This hypothesis finds an empirical representation in the nonlinear adjustment of the exported quantity to relative price changes. Thus, the threshold cointegration analysis proposed by Balke and Fomby [Balke, N.S. and Fomby, T.B. Threshold Cointegration. International Economic Review, 1997; 38; 627-645.] was used for estimating models with asymmetric adjustment of the error correction term. Among the sixteen industrial sectors selected, there was evidence of nonlinearities in the residuals of long-run relationships of supply or demand for exports in nine of them. These nonlinearities represent asymmetric and/or discontinuous responses of exports to different representative measures of real exchange rates, in addition to other components of long-run demand or supply equations. (C) 2007 Elsevier B.V. All rights reserved.

Expectation Propagation with Factorizing Distributions: A Gaussian Approximation and Performance Results for Simple Models

Relevance:

10.00%

Abstract:

We discuss the expectation propagation (EP) algorithm for approximate Bayesian inference using a factorizing posterior approximation. For neural network models, we use a central limit theorem argument to make EP tractable when the number of parameters is large. For two types of models, we show that EP can achieve optimal generalization performance when data are drawn from a simple distribution.

Dynamics of Hepatitis D (delta) virus genotype 3 in the Amazon region of South America

Relevance:

10.00%

Abstract:

Hepatitis delta virus (HDV) is widely distributed and associated with fulminant hepatitis epidemics in areas with high prevalence of HBV. Several studies performed in the 1980s showed data on HDV infection in South America, but there are no studies on the viral dynamics of this virus. The aim of this study was to conduct an evolutionary analysis of hepatitis delta genotype 3 (HDV/3) prevalent in South America: estimate its nucleotide substitution rate, determine the time to the most recent common ancestor (TMRCA) and characterize the epidemic history and evolutionary dynamics. Furthermore, we characterized the presence of HBV/HDV infection in seven samples collected from patients who died due to fulminant hepatitis in the Amazon region of Colombia and included them in the evolutionary analysis. This is the first study reporting HBV and HDV sequences from the Amazon region of Colombia. Of the seven Colombian patients, five were positive for HBV-DNA and HDV-RNA. Of them, two samples were successfully sequenced for HBV (subgenotypes F3 and F1b) and the five HDV-positive samples were classified as HDV/3. By using all available HDV/3 reference sequences with sampling dates (n = 36), we estimated the HDV/3 substitution rate at 1.07 x 10^-3 substitutions per site per year (s/s/y), which resulted in a TMRCA of 85 years. Also, it was determined that HDV/3 spread exponentially from the early 1950s to the 1970s in South America. This work discusses for the first time the viral dynamics of the HDV/3 circulating in South America. We suggest that the measures implemented to control HBV transmission resulted in the control of HDV/3 spread in South America, especially after the marked rise in this infection, associated with high mortality, from the 1950s to the 1970s. The differences in diversity found between HDV/3 and the other HDV genotypes raise the hypothesis of a different origin and/or a different transmission route. (C) 2011 Elsevier B.V. All rights reserved.
