90 resultados para Short homologous sequences


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Time series regression models are especially suitable in epidemiology for evaluating short-term effects of time-varying exposures on health. The problem is that potential for confounding in time series regression is very high. Thus, it is important that trend and seasonality are properly accounted for. Our paper reviews the statistical models commonly used in time-series regression methods, specially allowing for serial correlation, make them potentially useful for selected epidemiological purposes. In particular, we discuss the use of time-series regression for counts using a wide range Generalised Linear Models as well as Generalised Additive Models. In addition, recently critical points in using statistical software for GAM were stressed, and reanalyses of time series data on air pollution and health were performed in order to update already published. Applications are offered through an example on the relationship between asthma emergency admissions and photochemical air pollutants

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Detecting changes between images of the same scene taken at different times is of great interest for monitoring and understanding the environment. It is widely used for on-land application but suffers from different constraints. Unfortunately, Change detection algorithms require highly accurate geometric and photometric registration. This requirement has precluded their use in underwater imagery in the past. In this paper, the change detection techniques available nowadays for on-land application were analyzed and a method to automatically detect the changes in sequences of underwater images is proposed. Target application scenarios are habitat restoration sites, or area monitoring after sudden impacts from hurricanes or ship groundings. The method is based on the creation of a 3D terrain model from one image sequence over an area of interest. This model allows for synthesizing textured views that correspond to the same viewpoints of a second image sequence. The generated views are photometrically matched and corrected against the corresponding frames from the second sequence. Standard change detection techniques are then applied to find areas of difference. Additionally, the paper shows that it is possible to detect false positives, resulting from non-rigid objects, by applying the same change detection method to the first sequence exclusively. The developed method was able to correctly find the changes between two challenging sequences of images from a coral reef taken one year apart and acquired with two different cameras

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The statistical analysis of literary style is the part of stylometry that compares measurable characteristicsin a text that are rarely controlled by the author, with those in other texts. When thegoal is to settle authorship questions, these characteristics should relate to the author’s style andnot to the genre, epoch or editor, and they should be such that their variation between authors islarger than the variation within comparable texts from the same author.For an overview of the literature on stylometry and some of the techniques involved, see for exampleMosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) orLebart, Salem and Berry (1998).Tirant lo Blanc, a chivalry book, is the main work in catalan literature and it was hailed to be“the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writterslike Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translatedseveral times into Spanish, Italian and French, with modern English translations by Rosenthal(1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465,but it was not printed until 1490.There is an intense and long lasting debate around its authorship sprouting from its first edition,where its introduction states that the whole book is the work of Martorell (1413?-1468), while atthe end it is stated that the last one fourth of the book is by Galba (?-1490), after the death ofMartorell. Some of the authors that support the theory of single authorship are Riquer (1990),Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer(1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).Neither of the two candidate authors left any text comparable to the one under study, and thereforediscriminant analysis can not be used to help classify chapters by author. By using sample textsencompassing about ten percent of the book, and looking at word length and at the use of 44conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that mightindicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba andGinebra (2000) estimates that stylistic boundary to be near chapter 383.Following the lead of the extensive literature, this paper looks into word length, the use of the mostfrequent words and into the use of vowels in each chapter of the book. Given that the featuresselected are categorical, that leads to three contingency tables of ordered rows and therefore tothree sequences of multinomial observations.Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3describes the problem of the estimation of a suden change-point in those sequences, in the followingsections we propose various ways to estimate change-points in multinomial sequences; the methodin section 4 involves fitting models for polytomous data, the one in Section 5 fits gamma modelsonto the sequence of Chi-square distances between each row profiles and the average profile, theone in Section 6 fits models onto the sequence of values taken by the first component of thecorrespondence analysis as well as onto sequences of other summary measures like the averageword length. In Section 7 we fit models onto the marginal binomial sequences to identify thefeatures that distinguish the chapters before and after that boundary. Most methods rely heavilyon the use of generalized linear models

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this article we review first some of the possibilities in which the notions of Fo lner sequences and quasidiagonality have been applied to spectral approximation problems. We construct then a canonical Fo lner sequence for the crossed product of a concrete C* -algebra and a discrete amenable group. We apply our results to the rotation algebra (which contains interesting operators like almost Mathieu operators or periodic magnetic Schrödinger operators on graphs) and the C* -algebra generated by bounded Jacobi operators.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article analyzes Folner sequences of projections for bounded linear operators and their relationship to the class of finite operators introduced by Williams in the 70ies. We prove that each essentially hyponormal operator has a proper Folner sequence (i.e. a Folner sequence of projections strongly converging to 1). In particular, any quasinormal, any subnormal, any hyponormal and any essentially normal operator has a proper Folner sequence. Moreover, we show that an operator is finite if and only if it has a proper Folner sequence or if it has a non-trivial finite dimensional reducing subspace. We also analyze the structure of operators which have no Folner sequence and give examples of them. For this analysis we introduce the notion of strongly non-Folner operators, which are far from finite block reducible operators, in some uniform sense, and show that this class coincides with the class of non-finite operators.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Realistic rendering animation is known to be an expensive processing task when physically-based global illumination methods are used in order to improve illumination details. This paper presents an acceleration technique to compute animations in radiosity environments. The technique is based on an interpolated approach that exploits temporal coherence in radiosity. A fast global Monte Carlo pre-processing step is introduced to the whole computation of the animated sequence to select important frames. These are fully computed and used as a base for the interpolation of all the sequence. The approach is completely view-independent. Once the illumination is computed, it can be visualized by any animated camera. Results present significant high speed-ups showing that the technique could be an interesting alternative to deterministic methods for computing non-interactive radiosity animations for moderately complex scenarios

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10,010 RSRs from a metagenomic library of microorganisms found in human faecal samples. Then searched them using the program BLASTN against a prokaryotic sequence database to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C.elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Selenoproteins contain the amino acid selenocysteine which is encoded by a UGA Sec codon. Recoding UGA Sec requires a complex mechanism, comprising the cis-acting SECIS RNA hairpin in the 3′UTR of selenoprotein mRNAs, and trans-acting factors. Among these, the SECIS Binding Protein 2 (SBP2) is central to the mechanism. SBP2 has been so far functionally characterized only in rats and humans. In this work, we report the characterization of the Drosophila melanogaster SBP2 (dSBP2). Despite its shorter length, it retained the same selenoprotein synthesis-promoting capabilities as the mammalian counterpart. However, a major difference resides in the SECIS recognition pattern: while human SBP2 (hSBP2) binds the distinct form 1 and 2 SECIS RNAs with similar affinities, dSBP2 exhibits high affinity toward form 2 only. In addition, we report the identification of a K (lysine)-rich domain in all SBP2s, essential for SECIS and 60S ribosomal subunit binding, differing from the well-characterized L7Ae RNA-binding domain. Swapping only five amino acids between dSBP2 and hSBP2 in the K-rich domain conferred reversed SECIS-binding properties to the proteins, thus unveiling an important sequence for form 1 binding.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper estimates the effect of judicial institutions on governance at the local level in Brazil. Our estimation strategy exploits a unique institutional feature of state judiciary branches which assigns prosecutors and judges to the most populous among contiguous counties forming a judiciary district. As a result of this assignment mechanism there are counties with nearly identical populations, some with and some without local judicial presence, which we exploit to impute counterfactual outcomes. Conditional on observable county characteristics, offenses per civil servant are about 35% lower in counties that have a local seat of the state judiciary. The lower incidence of infractions stems mostly from fewer violations of financial management regulations by local administrators, fewer instances of problems in project execution and project managment, fewer cases of non-existent or ineffective civil society oversight and fewer cases of improper handling of remittances to local residents.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Sequential randomized prediction of an arbitrary binary sequence isinvestigated. No assumption is made on the mechanism of generating the bit sequence. The goal of the predictor is to minimize its relative loss, i.e., to make (almost) as few mistakes as the best ``expert'' in a fixed, possibly infinite, set of experts. We point out a surprising connection between this prediction problem and empirical process theory. First, in the special case of static (memoryless) experts, we completely characterize the minimax relative loss in terms of the maximum of an associated Rademacher process. Then we show general upper and lower bounds on the minimaxrelative loss in terms of the geometry of the class of experts. As main examples, we determine the exact order of magnitude of the minimax relative loss for the class of autoregressive linear predictors and for the class of Markov experts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we propose a general technique to develop first and second order closed-form approximation formulas for short-time options withrandom strikes. Our method is based on Malliavin calculus techniques andallows us to obtain simple closed-form approximation formulas dependingon the derivative operator. The numerical analysis shows that these formulas are extremely accurate and improve some previous approaches ontwo-assets and three-assets spread options as Kirk's formula or the decomposition mehod presented in Alòs, Eydeland and Laurence (2011).