901 results for SEQUENCE VARIANT
Abstract:
Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
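The cycle described above (search for reads similar to the current sequence, select them, assemble, repeat) can be sketched as a toy loop. This is a minimal illustration under stated assumptions, not GenSeed's implementation: exact suffix/prefix overlap stands in for the similarity search, greedy merging stands in for the assembler, and the names `extend_right`, `extend_left`, `seed_assembly`, `min_overlap` and `max_cycles` are hypothetical.

```python
def extend_right(contig, reads, min_overlap):
    """Extend the contig at its right end with the read that overlaps it most."""
    best, best_k = contig, 0
    for read in reads:
        # Stand-in for the similarity search: longest read prefix that
        # exactly matches the contig's end and still adds at least one base.
        for k in range(min(len(contig), len(read) - 1), min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                if k > best_k:
                    best, best_k = contig + read[k:], k
                break
    return best


def extend_left(contig, reads, min_overlap):
    """Extend the left end by running the right-end routine on reversed strings."""
    reversed_reads = [read[::-1] for read in reads]
    return extend_right(contig[::-1], reversed_reads, min_overlap)[::-1]


def seed_assembly(seed, reads, min_overlap=4, max_cycles=100):
    """Grow the seed cycle by cycle until no read extends it any further."""
    contig = seed
    for _ in range(max_cycles):
        extended = extend_right(contig, reads, min_overlap)
        extended = extend_left(extended, reads, min_overlap)
        if extended == contig:  # no progress in this cycle: stop
            break
        contig = extended
    return contig
```

A real tool would have to tolerate sequencing errors, reverse-complement reads and repeats, none of which this sketch handles.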
Abstract:
The Plasmodium falciparum var gene family encodes large variant antigens, which are important virulence factors and also targets of the humoral host response. The frequently observed mild outcomes of falciparum malaria in many places of the Amazon area prompted us to ask whether a globally restricted variant (var) gene repertoire is present in currently circulating and older isolates of this area. By exhaustive analysis of var gene tags from 89 isolates and clones taken during many years from all over the Brazilian Amazon, we estimate that there are probably no more than 350-430 distinct sequence types, fewer than for any similar-sized area studied so far. Detailed analysis of the var tags from genetically distinct clones obtained from single isolates revealed restricted and redundant repertoires, suggesting either a low incidence of infective bites or restricted variant gene diversity in inoculated parasites. Additionally, we found a structuring of var gene repertoires, observed as a higher pairwise type sharing in isolates from the same microregion compared to isolates from different regions. Fine analysis of translated var tags revealed that certain Distinct Sequence Identifiers (DSIDs) were differently represented in Brazilian/South American isolates when compared to datasets from other continents. By global alignment of worldwide var DBL alpha sequences and sorting in groups with more than 76% identity, 125 clusters were formed and more than half of all genes were found in nine clusters with 50 or more sequences. While Brazilian/South American sequences were represented in only 64 groups, African sequences were found in the majority of clusters. DSID type 1 related sequences accumulated almost completely in one single cluster, indicating that limited recombination occurs in these specific var gene types. These data demonstrate the highest pairwise type sharing values reported so far for the var gene family in isolates from all over an entire subcontinent. The apparent lack of specific sequence types suggests that the P. falciparum transmission dynamics in the whole Amazon are probably different from any other endemic region studied and possibly interfere with the parasite's ability to efficiently diversify its variant gene repertoires. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Ureaplasma diversum infection in bulls may result in seminal vesiculitis, balanoposthitis and alterations in spermatozoa. In cows, it can cause placentitis, fetal alveolitis, abortion and the birth of weak calves. U. diversum ATCC 49782 (serogroup A), ATCC 49783 (serogroup C) and 34 field isolates were used for this study. These microorganisms were subjected to Polymerase Chain Reaction for 16S gene sequence determination using Taq High Fidelity, and the products were purified and bi-directionally sequenced. Using the sequence obtained, a fragment containing four hypervariable regions was selected and nucleotide polymorphisms were identified based on their position within the 16S rRNA gene. Forty-four single nucleotide polymorphisms (SNPs) were detected. The genotypic variability of the 16S rRNA gene of U. diversum isolates shows that the taxonomic classification of these organisms is likely much more complex than previously described and that 16S rRNA gene sequencing may be used to suggest an epidemiologic pattern for strains of different origin. (c) 2011 Elsevier B.V. All rights reserved.
Abstract:
A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index. (C) 2008 Elsevier Inc. All rights reserved.
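To make the distinction between rank sensitivity and magnitude sensitivity concrete, here is a minimal sketch of two standard indexes: the Product-Moment (Pearson) coefficient and Kendall's tau in its simplest "tau-a" form. These are the textbook definitions only; the variant and weighted formulations proposed in the paper are not reproduced, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Product-Moment correlation: sensitive to the magnitudes of the values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def kendall_tau_a(a, b):
    """Kendall's tau-a: depends only on the relative order (ranks) of the values."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(a, b), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs
```

For the pair `[1, 2, 3, 4]` and `[1, 2, 3, 100]` the ordering agrees perfectly, so the rank-based index is at its maximum, while the outlying magnitude pulls the Product-Moment value below 1.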
Abstract:
We present a minor but essential modification to the CODEX 1D-MAS exchange experiment. The new CONTRA method, which requires only minor changes to the original sequence, has advantages over the previously introduced S-CODEX, since it is less sensitive to artefacts caused by finite pulse lengths. The performance of this variant, including the finite pulse effect, was confirmed by SIMPSON calculations and demonstrated on a number of dynamic systems. (C) 2007 Elsevier Inc. All rights reserved.
Abstract:
Hepatitis C virus (HCV) exhibits considerable genetic diversity but presents a relatively well conserved 5' noncoding region (5' NCR) among all genotypes. In this study, the structural features and translational efficiency of the HCV 5' NCR sequences were analyzed using the programs RNAfold, RNAshapes and RNApdist and a bicistronic dual luciferase expression system, respectively. RNA structure prediction software indicated that base substitutions can potentially alter the 5' NCR structure. The sequence heterogeneity observed in the 5' NCR led to important changes in translation efficiency in different cell culture lines. Interactions of the viral RNA with cellular trans-acting factors may vary according to the cell type and viral genome polymorphisms, which may account for the translational efficiencies observed. J. Med. Virol. 81: 1212-1219, 2009. (C) 2009 Wiley-Liss, Inc.
Abstract:
Let $M$ be a compact, connected non-orientable surface without boundary and of genus $g \geq 3$. We investigate the pure braid groups $P_n(M)$ of $M$, and in particular the possible splitting of the Fadell-Neuwirth short exact sequence $1 \to P_m(M \setminus \{x_1, \ldots, x_n\}) \hookrightarrow P_{n+m}(M) \xrightarrow{p_*} P_n(M) \to 1$, where $m, n \geq 1$, and $p_*$ is the homomorphism which corresponds geometrically to forgetting the last $m$ strings. This problem is equivalent to that of the existence of a section for the associated fibration $p\colon F_{n+m}(M) \to F_n(M)$ of configuration spaces, defined by $p((x_1, \ldots, x_n, x_{n+1}, \ldots, x_{n+m})) = (x_1, \ldots, x_n)$. We show that $p$ and $p_*$ admit a section if and only if $n = 1$. Together with previous results, this completes the resolution of the splitting problem for surface pure braid groups. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
Glycine-rich proteins (GRPs) serve a variety of biological functions. Acanthoscurrin is an antimicrobial GRP isolated from hemocytes of the Brazilian spider Acanthoscurria gomesiana. Aiming to contribute to the knowledge of the secondary structure and stepwise solid-phase synthesis of GRPs' glycine-rich domains, we attempted to prepare G(101)GGLGGGRGGGYG(113)GGGGYGGGYG(123)GGY(126)GGGKYK(132)-NH(2), the acanthoscurrin C-terminal amidated fragment. Although a theoretical prediction did not indicate high aggregation potential for this peptide, repetitive incomplete aminoacylations were observed after incorporating Tyr(126) into the growing peptide-MBHA resin (Boc chemistry) at 60 degrees C. The problem was not solved by varying the coupling reagents or solvents, adding chaotropic salts to the reaction media or changing the resin/chemistry (Rink amide resin/Fmoc chemistry). Some improvement was made when CLEAR amide resin (Fmoc chemistry) was used, as it allowed fragment G(113)-K(132) to be obtained. NIR-FT-Raman spectra collected for samples of the growing peptide-MBHA, -Rink amide resin and -CLEAR amide resin revealed the presence of beta-sheet structures. Only the combination of CLEAR amide resin, 60 degrees C, Fmoc-(Fmoc-Hmb)Gly-OH and LiCl (the last two used alternately) was able to inhibit the phenomenon, as proven by NIR-FT-Raman analysis of the growing peptide-resin, allowing the total synthesis of the desired fragment Gly(101)-K(132). In summary, this work describes a new difficult sequence, contributes to understanding stepwise solid-phase synthesis of this type of peptide and shows that, at least while protected and linked to a resin, this GRP's glycine-rich motif presents an early tendency to assume beta-sheet structures. (c) 2008 Wiley Periodicals, Inc. Biopolymers (Pept Sci) 92: 65-75, 2009.
Abstract:
Lithium n-butylchalcogenolates are generated in situ by reacting the elements (S, Se, and Te) with n-butyllithium at 0 degrees C. Reaction of the lithium alkylchalcogenolates with activated alkenes and aldehydes gives the corresponding aldol adducts. The selenium-containing products give Morita-Baylis-Hillman adducts after the oxidation/elimination of the selenoxide. The whole sequence can be performed in a one-pot procedure. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
Several studies indicate that molecular variants of HPV-16 differ in geographic distribution and in the risk associated with persistent infection and development of high-grade cervical lesions. In the present study, the frequency of HPV-16 variants was determined in 81 biopsies from women with cervical intraepithelial neoplasia grade III or invasive cervical cancer from the city of Belem, Northern Brazil. Host DNAs were also genotyped in order to analyze the ethnicity-related distribution of these variants. Nine different HPV-16 LCR variants belonging to four phylogenetic branches were identified. Among these, two new isolates were characterized. The most prevalent HPV-16 variant detected was the Asian-American B-2, followed by the European B-12 and the European prototype. Infections by multiple variants were observed in both invasive cervical cancer and cervical intraepithelial neoplasia grade III cases. The analysis of a specific polymorphism within the E6 viral gene was performed in a subset of 76 isolates. The E6-350G polymorphism was significantly more frequent in Asian-American variants. The HPV-16 variability detected followed the same pattern as the genetic ancestry observed in Northern Brazil, with European, Amerindian and African roots. Although African ancestry was higher among women infected by the prototype, no correlation between ethnic origin and HPV-16 variants was found. These results corroborate previous data showing a high frequency of Asian-American variants in cervical neoplasia among women of multiethnic origin.
Abstract:
In this paper, we show that the steady-state free precession sequence can be used to acquire (13)C high-resolution nuclear magnetic resonance spectra and can be applied to qualitative analysis. The analysis of a brucine sample using this sequence with a 60 degrees flip angle and a time interval between pulses of 300 ms (acquisition time, 299.7 ms; recycle delay, 300 ms) resulted in a spectrum with a twofold enhancement in signal-to-noise ratio when compared to the standard (13)C sequence. The gain was greater when a much shorter time interval between pulses (100 ms) was applied: the result was a more than fivefold enhancement in signal-to-noise ratio, equivalent to a more than 20-fold reduction in total data recording time. However, this short time interval between pulses produces a spectrum with severe phase and truncation anomalies. We demonstrate that these anomalies can be minimized by applying an appropriate apodization function and plotting the spectrum in the magnitude mode.
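The step from the signal-to-noise gain to the saving in recording time follows from the usual signal-averaging rule. A minimal sketch, assuming that signal-to-noise grows with the square root of the number of accumulated scans (uncorrelated noise); this assumption and the function name are not stated in the abstract.

```python
def time_reduction_factor(snr_gain):
    """Recording-time saving at equal signal-to-noise, assuming SNR ~ sqrt(scans).

    Reaching a given SNR with the standard sequence needs snr_gain**2 times
    as many scans as with the enhanced sequence.
    """
    return snr_gain ** 2
```

Under this assumption a fivefold gain corresponds to a 25-fold shorter recording time, consistent with the "more than 20-fold" figure quoted above, and a twofold gain corresponds to a fourfold saving.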
Abstract:
This paper proposes an efficient pattern extraction algorithm that can be applied to melodic sequences represented as strings of abstract intervallic symbols; the melodic representation introduces special “binary don’t care” symbols for intervals that may belong to two partially overlapping intervallic categories. As a special case, the well-established “step–leap” representation is examined. In the step–leap representation, each melodic diatonic interval is classified as a step (±s), a leap (±l) or a unison (u). Binary don’t care symbols are used to represent the possible overlap between the various abstract categories, e.g. *=s, *=l and #=-s, #=-l. We propose an O(n+d(n-d)+z)-time algorithm for computing all maximal-pairs in a given sequence x=x[1..n], where x contains d occurrences of binary don’t cares and z is the number of reported maximal-pairs.
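The step-leap representation with binary don't care symbols can be illustrated with a small encoder and a naive matcher. This is a sketch of the representation only, not the O(n+d(n-d)+z) maximal-pairs algorithm. It assumes intervals are given in signed diatonic steps and that an interval of exactly two steps is the ambiguous one; that threshold and the names `EXPANSIONS`, `encode`, `symbols_match` and `occurrences` are all hypothetical.

```python
# Concrete categories each symbol may stand for; '*' and '#' are the
# binary don't cares (* = s or l, # = -s or -l).
EXPANSIONS = {
    "u": {"u"},
    "+s": {"+s"}, "+l": {"+l"},
    "-s": {"-s"}, "-l": {"-l"},
    "*": {"+s", "+l"},
    "#": {"-s", "-l"},
}


def encode(intervals, ambiguous=2):
    """Map signed diatonic intervals to step-leap symbols.

    Assumption: an interval of exactly `ambiguous` steps could be heard as a
    step or as a leap, so it is encoded as a don't care.
    """
    symbols = []
    for interval in intervals:
        size = abs(interval)
        if size == 0:
            symbols.append("u")
        elif size == ambiguous:
            symbols.append("*" if interval > 0 else "#")
        elif size < ambiguous:
            symbols.append("+s" if interval > 0 else "-s")
        else:
            symbols.append("+l" if interval > 0 else "-l")
    return symbols


def symbols_match(a, b):
    """Two symbols match if they can stand for a common concrete category."""
    return bool(EXPANSIONS[a] & EXPANSIONS[b])


def occurrences(pattern, sequence):
    """Naive scan: start positions where the pattern matches the sequence."""
    m = len(pattern)
    return [
        i
        for i in range(len(sequence) - m + 1)
        if all(symbols_match(p, s) for p, s in zip(pattern, sequence[i:i + m]))
    ]
```

The naive scan costs time proportional to the pattern length times the sequence length per pattern, which is the kind of cost a dedicated maximal-pairs algorithm is meant to avoid.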