912 results for sequencing error
Abstract:
Agitation rate is an important parameter in the operation of Anaerobic Sequencing Biofilm Batch Reactors (ASBBRs): a proper agitation rate guarantees good mixing, improves mass transfer, and enhances the solubility of the particulate organic matter. Dairy effluents have a high amount of particulate organic matter, and their anaerobic digestion produces inhibitory intermediates (e.g., long-chain fatty acids). The importance of studying agitation in such batch systems is therefore clear. The present study aimed to evaluate how agitation frequency influences the anaerobic treatment of dairy effluents. The ASBBR was fed with wastewater from the milk pasteurisation process and cheese manufacture, with no whey segregation. The organic matter concentration, measured as chemical oxygen demand (COD), was maintained at approximately 8,000 mg/L. The reactor was operated under four agitation conditions: 500 rpm, 350 rpm, 200 rpm, and no agitation. In terms of COD removal efficiency, similar results were observed for 500 rpm and 350 rpm (around 90%) and for 200 rpm and no agitation (around 80%). Increasing the system's agitation thus not only improved the global efficiency of organic matter removal but also influenced volatile acid production and consumption, clearly modifying this balance in each experimental condition.
Abstract:
We estimate the conditions for detectability of two planets in a 2/1 mean-motion resonance from radial velocity data, as a function of their masses, the number of observations and the signal-to-noise ratio. Even for a data set of the order of 100 observations and standard deviations of the order of a few meters per second, we find that Jovian-size resonant planets are difficult to detect if the masses of the planets differ by a factor larger than approximately 4. This is consistent with the present population of real exosystems in the 2/1 commensurability, most of which have resonant pairs with similar minimum masses, and could indicate that many other resonant systems exist but are currently beyond the detectability limit. Furthermore, we analyze the error distribution in masses and orbital elements of orbital fits from synthetic data sets for resonant planets in the 2/1 commensurability. For various mass ratios and numbers of data points we find that the eccentricity of the outer planet is systematically overestimated, although the inner planet's eccentricity suffers a much smaller effect. If the initial conditions correspond to small-amplitude oscillations around stable apsidal corotation resonances, the amplitudes estimated from the orbital fits are biased toward larger amplitudes, in accordance with results found in real resonant extrasolar systems.
Abstract:
The gene SNRNP200 is composed of 45 exons and encodes a protein essential for pre-mRNA splicing, the 200 kDa helicase hBrr2. Two mutations in SNRNP200 have recently been associated with autosomal dominant retinitis pigmentosa (adRP), a retinal degenerative disease, in two families from China. In this work we analyzed the entire 35-Kb SNRNP200 genomic region in a cohort of 96 unrelated North American patients with adRP. To complete this large-scale sequencing project, we performed ultra high-throughput sequencing of pooled, untagged PCR products. We then validated the detected DNA changes by Sanger sequencing of individual samples from this cohort and from an additional one of 95 patients. One of the two previously known mutations (p.S1087L) was identified in 3 patients, while 4 new missense changes (p.R681C, p.R681H, p.V683L, p.Y689C) affecting highly conserved codons were identified in 6 unrelated individuals, indicating that the prevalence of SNRNP200-associated adRP is relatively high. We also took advantage of this research to evaluate the pool-and-sequence method, especially with respect to the generation of false positive and negative results. We conclude that, although this strategy can be adopted for rapid discovery of new disease-associated variants, it still requires extensive validation to be used in routine DNA screenings. (C) 2011 Wiley-Liss, Inc.
Abstract:
Introduction: The characterization of microbial communities infecting the endodontic system in each clinical condition may help establish a correct prognosis and distinct treatment strategies. The purpose of this study was to determine the bacterial diversity in primary endodontic infections by 16S ribosomal RNA (rRNA) sequence analysis. Methods: Samples from root canals of untreated asymptomatic teeth (n = 12) exhibiting periapical lesions were obtained; 16S rRNA bacterial genomic libraries were constructed and sequenced, and bacterial diversity was estimated. Results: A total of 489 clones were analyzed (mean, 40.7 ± 8.0 clones per sample). Seventy phylotypes were identified, of which six were novel phylotypes belonging to the family Ruminococcaceae. The mean number of taxa per canal was 10.0, ranging from 3 to 21 per sample; 65.7% of the cloned sequences represented phylotypes for which no cultivated isolates have been reported. The most prevalent taxa were Atopobium rimae (50.0%), Dialister invisus, Prevotella oris, Pseudoramibacter alactolyticus, and Tannerella forsythia (33.3%). Conclusions: Although several key species predominate in endodontic samples of asymptomatic cases with periapical lesions, the primary endodontic infection is characterized by a wide bacterial diversity, which is mostly represented by members of the phylum Firmicutes belonging to the class Clostridia, followed by the phylum Bacteroidetes. (J Endod 2011;37:922-926)
Abstract:
Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
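The cycle described in this abstract (similarity search, read selection, assembly, repeated until the seed stops growing) can be illustrated with a toy sketch. This is not GenSeed's implementation: the abstract says only that GenSeed is a Perl program, while the sketch below is Python, substitutes exact suffix-prefix string matching for a real similarity search and assembler, and extends the seed in one direction only. The function name `extend_seed` and its parameters are hypothetical.

```python
def extend_seed(seed, reads, min_overlap=4, max_cycles=100):
    """Toy seed-driven extension: grow `seed` rightward, one read per cycle.

    Assumption: exact suffix/prefix matches stand in for the similarity
    search and assembly steps of a real seed-driven assembler.
    """
    contig = seed
    unused = list(reads)
    for _ in range(max_cycles):
        best_read, best_overlap = None, 0
        # Search and selection: pick the unused read whose prefix overlaps
        # the current contig suffix by the largest number of characters.
        for read in unused:
            # Cap at len(read) - 1 so a selected read always adds >= 1 base.
            longest = min(len(read) - 1, len(contig))
            for k in range(longest, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    if k > best_overlap:
                        best_read, best_overlap = read, k
                    break  # longest overlap for this read found
        if best_read is None:
            break  # no read extends the contig, so the recursion stops
        # "Assembly": append the non-overlapping tail of the selected read.
        contig += best_read[best_overlap:]
        unused.remove(best_read)
    return contig


# Hypothetical example data: three reads tiling a short target sequence.
EXAMPLE_READS = ["TTGCAAGG", "AAGGCTTACC", "TACCGATAG"]
EXAMPLE_SEED = "ACGTTGCA"
```

Calling `extend_seed(EXAMPLE_SEED, EXAMPLE_READS)` walks through three cycles and returns the 22-base sequence `ACGTTGCAAGGCTTACCGATAG`; a fourth cycle finds no usable read and the loop ends, mirroring the progressive extension of the original seed.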
Abstract:
Mycoplasma synoviae (MS) is an important avian pathogen that may cause both respiratory disease and joint inflammation (synovitis) in poultry, causing economic losses to the Brazilian poultry industry. The genotypic variation in the 16S rRNA gene is unknown. Partial sequences of the 16S rRNA gene of 19 strains of M. synoviae were sequenced and analyzed in order to characterize the strains molecularly and to evaluate the genetic variability of strains from distinct Brazilian areas of poultry production. Different polymorphic patterns were observed. The number of polymorphic alterations in the studied strains ranged from 0 to 6. The nucleotide variations, including deletions, insertions and substitutions, ranged from 3 to 5. The genotypic diversity observed in this study may be explained by spontaneous mutations that may occur when a lineage remains in the same flock for long periods. Culling and replacement in poultry flocks may be responsible for the entry of new strains in different areas. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
In this paper we deal with robust inference in heteroscedastic measurement error models. Rather than the normal distribution, we postulate a Student t distribution for the observed variables. Maximum likelihood estimates are computed numerically. Consistent estimation of the asymptotic covariance matrices of the maximum likelihood and generalized least squares estimators is also discussed. Three test statistics are proposed for testing hypotheses of interest; each has an asymptotic chi-square distribution, which guarantees correct asymptotic significance levels. Results of simulations and an application to a real data set are also reported. (C) 2009 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.
Abstract:
The multivariate skew-t distribution (J Multivar Anal 79:93-113, 2001; J R Stat Soc, Ser B 65:367-389, 2003; Statistics 37:359-363, 2003) includes the Student t, skew-Cauchy and Cauchy distributions as special cases and the normal and skew-normal ones as limiting cases. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis of repeated measures, pretest/post-test data, under a multivariate null intercept measurement error model (J Biopharm Stat 13(4):763-771, 2003) where the random errors and the unobserved value of the covariate (latent variable) follow a Student t and a skew-t distribution, respectively. The results and methods are numerically illustrated with an example in the field of dentistry.
Abstract:
The skew-normal distribution is a class of distributions that includes the normal distribution as a special case. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis in a multivariate, null intercept, measurement error model [R. Aoki, H. Bolfarine, J.A. Achcar, and D. Leao Pinto Jr, Bayesian analysis of a multivariate null intercept error-in-variables regression model, J. Biopharm. Stat. 13(4) (2003b), pp. 763-771] where the unobserved value of the covariate (latent variable) follows a skew-normal distribution. The results and methods are applied to a real dental clinical trial presented in [A. Hadgu and G. Koch, Application of generalized estimating equations to a dental randomized clinical trial, J. Biopharm. Stat. 9 (1999), pp. 161-178].
Abstract:
In this article, we discuss inferential aspects of measurement error regression models with null intercepts when the unknown quantity x (latent variable) follows a skew-normal distribution. We first examine the maximum-likelihood approach to estimation via the EM algorithm by exploring statistical properties of the model considered. Then, the marginal likelihood, the score function and the observed information matrix of the observed quantities are presented, allowing direct implementation of inference. In order to discuss some diagnostic techniques for this type of model, we derive the appropriate matrices for assessing the local influence on the parameter estimates under different perturbation schemes. The results and methods developed in this paper are illustrated using part of a real data set used by Hadgu and Koch [1999, Application of generalized estimating equations to a dental randomized clinical trial. Journal of Biopharmaceutical Statistics, 9, 161-178].
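For orientation, a null intercept measurement error model with a skew-normal latent covariate can be written, in its simplest univariate form, roughly as below. The notation is an assumption made for illustration (the abstract gives no formulas), and the model in the paper may be multivariate or parameterized differently.

```latex
% Assumed notation: X_i is the error-prone observed covariate, Y_i the
% response, x_i the unobserved (latent) true covariate.
\begin{align*}
  X_i &= x_i + u_i, \\
  Y_i &= \beta\, x_i + e_i, \qquad i = 1, \dots, n, \\
  x_i &\sim \mathrm{SN}(\mu_x, \sigma_x^2, \lambda_x), \qquad
  u_i \sim \mathrm{N}(0, \sigma_u^2), \qquad
  e_i \sim \mathrm{N}(0, \sigma_e^2),
\end{align*}
% with x_i, u_i and e_i mutually independent. "Null intercept" means the
% regression line Y = \beta x passes through the origin; setting
% \lambda_x = 0 recovers the usual normal latent covariate.
```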
Abstract:
This paper deals with asymptotic results on a multivariate ultrastructural errors-in-variables regression model with equation errors. Sufficient conditions for attaining consistent estimators for model parameters are presented. Asymptotic distributions for the line regression estimators are derived. Applications to the elliptical class of distributions with two error assumptions are presented. The model generalizes previous results aimed at univariate scenarios. (C) 2010 Elsevier Inc. All rights reserved.
Abstract:
The main objective of this paper is to discuss the Bayes estimation of the regression coefficients in the elliptically distributed simple regression model with measurement errors. The posterior distribution for the line parameters is obtained in closed form under the following assumptions: the ratio of the error variances is known, the error variance has an informative prior distribution, and the regression coefficients and the incidental parameters have non-informative prior distributions. We prove that the posterior distribution of the regression coefficients has at most two real modes. Situations with a single mode are more likely than those with two modes, especially in large samples. The precision of the modal estimators is studied by deriving the Hessian matrix, which, although complicated, can be computed numerically. The posterior mean is estimated by using the Gibbs sampling algorithm and approximations by normal distributions. The results are applied to a real data set and connections with results in the literature are reported. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
This work presents a Bayesian semiparametric approach for dealing with regression models where the covariate is measured with error. Given that (1) the error normality assumption is very restrictive and (2) assuming a specific elliptical distribution for the errors (Student-t, for example) may be somewhat presumptuous, there is a need for more flexible methods that assume only symmetry of the errors (admitting unknown kurtosis). In this sense, the main advantage of this extended Bayesian approach is the possibility of considering generalizations of the elliptical family of models by using Dirichlet process priors in dependent and independent situations. Conditional posterior distributions are implemented, allowing the use of Markov Chain Monte Carlo (MCMC) to generate the posterior distributions. An interesting result is that the Dirichlet process prior is not updated in the case of the dependent elliptical model. Furthermore, an analysis of a real data set is reported to illustrate the usefulness of our approach in dealing with outliers. Finally, the proposed semiparametric models and the parametric normal model are compared graphically through the posterior densities of the coefficients. (C) 2009 Elsevier Inc. All rights reserved.
Abstract:
Scale mixtures of the skew-normal (SMSN) distribution form a class of asymmetric thick-tailed distributions that includes the skew-normal (SN) distribution as a special case. The main advantage of this class of distributions is that its members are easy to simulate and have a convenient hierarchical representation, which facilitates implementation of the expectation-maximization algorithm for maximum-likelihood estimation. In this paper, we assume an SMSN distribution for the unobserved value of the covariates and a symmetric scale mixture of normal distributions for the error term of the model. This provides a robust alternative for parameter estimation in multivariate measurement error models. Specific distributions examined include univariate and multivariate versions of the SN, skew-t, skew-slash and skew-contaminated normal distributions. The results and methods are applied to a real data set.
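The claim that these distributions are easy to simulate can be illustrated with the standard stochastic representation of the skew-normal: if T0 and T1 are independent standard normals and delta = lambda / sqrt(1 + lambda^2), then delta * |T0| + sqrt(1 - delta^2) * T1 follows SN(0, 1, lambda). Dividing such a draw by the square root of an independent Gamma(nu/2, rate nu/2) variable gives a skew-t draw. The sketch below is a univariate illustration under those textbook facts, not code from the paper; the function names are hypothetical.

```python
import math
import random


def sample_skew_normal(lam, rng):
    """One draw from the standard skew-normal SN(0, 1, lam)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    t0 = abs(rng.gauss(0.0, 1.0))  # half-normal component carries the skewness
    t1 = rng.gauss(0.0, 1.0)       # symmetric normal component
    return delta * t0 + math.sqrt(1.0 - delta * delta) * t1


def sample_skew_t(lam, nu, rng):
    """One draw from a standard skew-t, as a scale mixture of skew-normals."""
    # random.gammavariate takes (shape, scale); scale 2/nu is rate nu/2,
    # so the mixing variable u has mean 1.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return sample_skew_normal(lam, rng) / math.sqrt(u)


def skew_normal_mean(lam):
    """Theoretical mean of SN(0, 1, lam): delta * sqrt(2 / pi)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    return delta * math.sqrt(2.0 / math.pi)
```

With a seeded generator such as `rng = random.Random(0)`, averaging a few tens of thousands of `sample_skew_normal(3.0, rng)` draws should land close to `skew_normal_mean(3.0)`. The same two-level structure (a mixing variable, then a conditionally skew-normal draw) is what makes an EM-type algorithm convenient, since the mixing variable can be treated as missing data.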
Abstract:
In general, the normal distribution is assumed for the surrogate of the true covariates in the classical error model. This paper considers a class of distributions, which includes the normal one, for the variables subject to error. An estimation approach yielding consistent estimators is developed and simulation studies are reported.