892 results for Consensus Sequence
Abstract:
It has been previously observed that the intrinsically weak variant GC donor sites, in order to be recognized by the U2-type spliceosome, possess strong consensus sequences maximized for base pair formation with U1 and U5/U6 snRNAs. However, variability in signal strength is a fundamental mechanism for splice site selection in alternative splicing. Here we report human alternative GC-AG introns (for the first time from any species), and show that while constitutive GC-AG introns do possess strong signals at their donor sites, a large subset of alternative GC-AG introns possess weak consensus sequences at their donor sites. Surprisingly, this subset of alternative isoforms shows strong consensus at acceptor exon positions 1 and 2. The improved consensus at the acceptor exon can facilitate a strong interaction with U5 snRNA, which tethers the two exons for ligation during the second step of splicing. Further, these isoforms nearly always possess alternative acceptor sites and exhibit particularly weak polypyrimidine tracts characteristic of AG-dependent introns. The acceptor exon nucleotides are part of the consensus required for the U2AF(35)-mediated recognition of AG in such introns. Such improved consensus at acceptor exons is not found in either normal or alternative GT-AG introns having weak donor sites or weak polypyrimidine tracts. The changes probably reflect mechanisms that allow GC-AG alternative intron isoforms to cope with two conflicting requirements, namely an apparent need for differential splice strength to direct the choice of alternative sites and a need for improved donor signals to compensate for the central mismatch base pair (C-A) in the RNA duplex of U1 snRNA and the pre-mRNA. The other important findings include (i) one in every twenty alternative introns is a GC-AG intron, and (ii) three of every five observed GC-AG introns are alternative isoforms.
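As a rough illustration of what a "strong" or "weak" consensus at a donor site means, the sketch below derives a per-column majority consensus from a few aligned sequences and scores each sequence by how closely it agrees with that consensus. The toy sequences, the function names and the simple match-fraction score are all assumptions made for illustration; they are not the data or the scoring method used in the study.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most frequent base in each column of equal-length sequences."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must have equal length")
    result = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        # Ties are broken alphabetically so the output is deterministic.
        best = min(counts, key=lambda base: (-counts[base], base))
        result.append(best)
    return "".join(result)


def match_score(seq, cons):
    """Fraction of positions at which seq agrees with the consensus."""
    return sum(a == b for a, b in zip(seq, cons)) / len(cons)


# Hypothetical toy GC donor sites: 3 exon bases followed by 6 intron bases.
toy_sites = ["CAGGCAAGT", "AAGGCAAGT", "CAGGCGAGT", "CAGGCAAGA"]
toy_consensus = consensus(toy_sites)

for site in toy_sites:
    print(site, round(match_score(site, toy_consensus), 2))
```

A position weight matrix with log-odds scoring would be a more realistic measure of signal strength; the match fraction is used here only because it is the simplest quantity that separates sites near the consensus from sites far from it.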
Abstract:
Introduction: Testing for HIV tropism is recommended before prescribing a chemokine receptor blocker. To date, in most European countries HIV tropism is determined using a phenotypic test. Recently, new data have emerged supporting the use of a genotypic HIV V3-loop sequence analysis as the basis for tropism determination. The European guidelines group on clinical management of HIV-1 tropism testing was established to make recommendations to clinicians and virologists. Methods: We searched online databases for articles from January 2006 until March 2010 with the terms: tropism or CCR5-antagonist or CCR5 antagonist or maraviroc or vicriviroc. Additional articles and/or conference abstracts were identified by hand searching. This strategy identified 712 potential articles and 1240 abstracts. All were reviewed and finally 57 papers and 42 abstracts were included and used by the panel to reach a consensus statement. Results: The panel recommends HIV-tropism testing for the following indications: i) drug-naïve patients in whom toxicity or limited therapeutic options are foreseen; ii) patients experiencing therapy failure whenever a treatment change is considered. Both the phenotypic Enhanced Trofile assay (ESTA) and genotypic population sequencing of the V3-loop are recommended for use in clinical practice. Although the panel does not recommend one methodology over another, it is anticipated that genotypic testing will be used more frequently because of its greater accessibility, lower cost and shorter turnaround time. The panel also provides guidance on technical aspects and interpretation issues. If using genotypic methods, triplicate PCR amplification and sequencing testing is advised using the G2P interpretation tool (clonal model) with an FPR of 10%. If the viral load is below the level of reliable amplification, proviral DNA can be used, and the panel recommends performing triplicate testing and use of an FPR of 10%. 
If genotypic DNA testing is not performed in triplicate the FPR should be increased to 20%. Conclusions: The European guidelines on clinical management of HIV-1 tropism testing provide an overview of current literature, evidence-based recommendations for the clinical use of tropism testing and expert guidance on unresolved issues and current developments. Current data support both the use of genotypic population sequencing and ESTA for co-receptor tropism determination. For practical reasons genotypic population sequencing is the preferred method in Europe.
Abstract:
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are “genomic fossils” valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome’s structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (∼80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.
Abstract:
A vaccinia virus late gene coding for a major structural polypeptide of 11 kDa was sequenced. Although the 5' flanking gene region is very A+T rich, it shows little homology either to the corresponding region of vaccinia early genes or to consensus sequences characteristic of most eukaryotic genes. Three DNA fragments (100, 200, and 500 base pairs, respectively), derived from the flanking region and including the late gene mRNA start site, were inserted into the coding sequence of the vaccinia virus thymidine kinase (TK) early gene by homologous in vivo recombination. Recombinants were selected on the basis of their TK- phenotype. Cells were infected with the recombinant viruses and RNA was isolated at 1-hr intervals. Transcripts initiating either from the TK early promoter, or from the late gene promoter at its authentic position, or from the translocated late gene promoters within the early gene were detected by nuclease S1 mapping. Early after infection, only transcripts from the TK early promoter were detected. Later in infection, however, transcripts were also initiated from the translocated late promoters. This RNA appeared at the same time and in similar quantities as the RNA from the late promoter at its authentic position. No quantitative differences in promoter efficiency between the 100-, 200-, and 500-base-pair insertions were observed. We conclude that all necessary signals for correct regulation of late-gene expression reside within only 100 base pairs of 5' flanking sequence.
Abstract:
The M-Coffee server is a web server that makes it possible to compute multiple sequence alignments (MSAs) by running several MSA methods and combining their output into one single model. This allows users to run all their methods of choice simultaneously without having to arbitrarily choose one of them. The MSA is delivered along with a local estimation of its consistency with the individual MSAs it was derived from. The computation of the consensus multiple alignment is carried out using a special mode of the T-Coffee package [Notredame, Higgins and Heringa (T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302: 205-217); Wallace, O'Sullivan, Higgins and Notredame (M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006; 34: 1692-1699)]. Given a set of sequences (DNA or proteins) in FASTA format, M-Coffee delivers a multiple alignment in the most common formats. M-Coffee is a freeware open source package distributed under a GPL license and it is available either as a standalone package or as a web service from www.tcoffee.org.
Abstract:
An effective human immunodeficiency virus type 1 (HIV-1) vaccine must induce protective antibody responses, as well as CD4(+) and CD8(+) T cell responses, that can be effective despite extraordinary diversity of HIV-1. The consensus and mosaic immunogens are complete but artificial proteins, computationally designed to elicit immune responses with improved cross-reactive breadth, to attempt to overcome the challenge of global HIV diversity. In this study, we have compared the immunogenicity of a transmitted-founder (T/F) B clade Env (B.1059), a global group M consensus Env (Con-S), and a global trivalent mosaic Env protein in rhesus macaques. These antigens were delivered using a DNA prime-recombinant NYVAC (rNYVAC) vector and Env protein boost vaccination strategy. While Con-S Env was a single sequence, mosaic immunogens were a set of three Envs optimized to include the most common forms of potential T cell epitopes. Both Con-S and mosaic sequences retained common amino acids encompassed by both antibody and T cell epitopes and were central to globally circulating strains. Mosaics and Con-S Envs expressed as full-length proteins bound well to a number of neutralizing antibodies with discontinuous epitopes. Also, both consensus and mosaic immunogens induced significantly higher gamma interferon (IFN-γ) enzyme-linked immunosorbent spot assay (ELISpot) responses than B.1059 immunogen. Immunization with these proteins, particularly Con-S, also induced significantly higher neutralizing antibodies to viruses than B.1059 Env, primarily to tier 1 viruses. Both Con-S and mosaics stimulated more potent CD8-T cell responses against heterologous Envs than did B.1059. Both antibody and cellular data from this study strengthen the concept of using in silico-designed centralized immunogens for global HIV-1 vaccine development strategies. 
IMPORTANCE: There is an increasing appreciation for the importance of vaccine-induced anti-Env antibody responses for preventing HIV-1 acquisition. This nonhuman primate study demonstrates that in silico-designed global HIV-1 immunogens, designed for a human clinical trial, are capable of eliciting not only T lymphocyte responses but also potent anti-Env antibody responses.
Abstract:
We experimentally identified the activities of six predicted heptosyltransferases in the genomes of Actinobacillus pleuropneumoniae serotype 5b strain L20 and serotype 3 strain JL03. The initial identification was based on a bioinformatic analysis of the amino acid similarity between these putative heptosyltransferases and others of known function from enteric bacteria and Aeromonas. The putative functions of all the Actinobacillus pleuropneumoniae heptosyltransferases were determined by using surrogate LPS acceptor molecules from well-defined A. hydrophila AH-3 and A. salmonicida A450 mutants. Our results show that heptosyltransferases APL_0981 and APJL_1001 are responsible for the transfer of the terminal outer core D-glycero-D-manno-heptose (D,D-Hep) residue, although they are not currently included in the CAZy glycosyltransferase 9 family. The signature sequence of the WahF heptosyltransferase group, [S(T/S)(GA)XXH], is unique: it differs from the heptosyltransferase consensus signature sequence, [D(TS)(GA)XXH], in having S(261) in place of D(261).
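The two signature sequences quoted above can be written as regular expressions to scan a protein sequence. The sketch below is only an illustration: it assumes that a parenthesized group such as (TS) means "T or S" and that X stands for any residue, and the two toy peptides and all names in the code are hypothetical, not sequences from the study.

```python
import re

# Assumed reading of the signatures: (TS) = T or S, (GA) = G or A, X = any residue.
SIGNATURES = {
    "consensus D(TS)(GA)XXH": re.compile(r"D[TS][GA]..H"),
    "WahF S(T/S)(GA)XXH": re.compile(r"S[TS][GA]..H"),
}


def find_signatures(protein_seq):
    """Return (signature name, 0-based start, matched text) for every hit."""
    hits = []
    for name, pattern in SIGNATURES.items():
        for match in pattern.finditer(protein_seq):
            hits.append((name, match.start(), match.group()))
    return hits


# Hypothetical toy peptides, one carrying each variant of the motif.
general_like = "MKKDTGLLHAA"
wahf_like = "MKKSSALLHAA"

print(find_signatures(general_like))
print(find_signatures(wahf_like))
```

Note that `finditer` reports only non-overlapping matches for each pattern, which is sufficient for a six-residue motif that is expected once per protein.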
Abstract:
The median problem is a classical problem in Location Theory: one searches for a location that minimizes the average distance to the sites of the clients. This models desirable facilities, such as a distribution center for a set of warehouses. More recently, the antimedian was studied for obnoxious facilities. Here one maximizes the average distance to the clients. In this paper the mixed case is studied. Clients are represented by a profile, which is a sequence of vertices with repetitions allowed. In a signed profile each element is provided with a sign from {+, −}. Thus one can take into account whether the client prefers the facility (with a + sign) or rejects it (with a − sign). The graphs for which all median sets, or all antimedian sets, are connected are characterized. Various consensus strategies for signed profiles are studied, amongst which Majority, Plurality and Scarcity. Hypercubes are the only graphs on which Majority produces the median set for all signed profiles. Finally, the antimedian sets are found by the Scarcity Strategy on e.g. Hamming graphs, Johnson graphs and halfcubes.
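A brute-force computation makes the signed-profile setting concrete. The sketch below assumes that the cost of a vertex is the sum of its distances to the + clients minus the sum of its distances to the − clients, and that the median set consists of the vertices of minimum cost; this is one natural reading of the abstract, not a definition taken from the paper. The function names and the example profile are hypothetical, and the code does not implement the Majority, Plurality or Scarcity strategies themselves.

```python
from collections import deque


def hypercube(n):
    """Adjacency lists of the n-cube; vertices are the integers 0 .. 2**n - 1."""
    return {v: [v ^ (1 << i) for i in range(n)] for v in range(2 ** n)}


def bfs_distances(graph, source):
    """Shortest-path distances from source in an unweighted connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def signed_median_set(graph, signed_profile):
    """Vertices minimizing the signed distance sum (assumed cost function)."""
    cost = {v: 0 for v in graph}
    for client, sign in signed_profile:
        dist = bfs_distances(graph, client)
        for v in graph:
            cost[v] += sign * dist[v]
    best = min(cost.values())
    return sorted(v for v in graph if cost[v] == best)


# Hypothetical profile on the 3-cube: clients 000 and 001 want the facility
# nearby (+1), client 111 wants it far away (-1).
q3 = hypercube(3)
profile = [(0b000, +1), (0b001, +1), (0b111, -1)]
print(signed_median_set(q3, profile))
```

With all signs set to +1 the function returns the ordinary median set, and with all signs set to −1 it returns the antimedian set, so the same routine covers the two classical cases described above.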
Abstract:
Faba bean (Vicia faba L.) is a globally important nitrogen-fixing legume, which is widely grown in a diverse range of environments. In this work, we mine and validate a set of 845 SNPs from the aligned transcriptomes of two contrasting inbred lines. Each V. faba SNP is assigned by BLAST analysis to a single Medicago orthologue. This set of syntenically anchored polymorphisms was then validated as individual KASP assays, classified according to their informativeness and performance on a panel of 37 inbred lines, and the best performing 757 markers were used to genotype six mapping populations. The six resulting linkage maps were merged into a single consensus map on which 687 SNPs were placed on six linkage groups, each presumed to correspond to one of the six V. faba chromosomes. This sequence-based consensus map was used to explore synteny with the most closely related crop species, lentil, and the most closely related fully sequenced genome, Medicago. Large tracts of uninterrupted colinearity were found between faba bean and Medicago, making it relatively straightforward to predict gene content and order in a mapped genetic interval. As a demonstration of this, we mapped a flower colour gene to a 2 cM interval of Vf chromosome 2 which was highly collinear with Mt3. The obvious candidate gene from 77 gene models in the collinear Medicago chromosome segment was the previously characterized MtWD40-1 gene (Mt3g092830, Mt3g092840) controlling anthocyanin production in Medicago, and re-sequencing of the Vf orthologue showed a putative causative deletion of the entire 5' end of the gene.
Abstract:
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembly. This indicated that possibly 33,620 unique genes had been identified and that >90% of the sugarcane expressed genes were tagged.
Abstract:
Although the retrotransposon copia has been studied in the melanogaster group of Drosophila species, very little is known about copia dynamism and evolution in other groups. We analyzed the occurrence and heterogeneity of the copia 5' LTR-ULR partial sequence and their phylogenetic relationships in 24 species of the repleta group of Drosophila. PCR showed that copia occurs in 18 out of the 24 species evaluated. Sequencing was possible in only eight species. The sequences showed a low nucleotide diversity, which suggests selective constraints maintaining this regulatory region over evolutionary time. On the contrary, the low nucleotide divergence and the phylogenetic relationships between the D. willistoni/Zaprionus tuberculatus/melanogaster species subgroup suggest horizontal transfer. Sixteen transcription factor binding sites were identified in the LTR-ULR repleta and melanogaster consensus sequences. However, these motifs are not homologous, neither according to their position in the LTR-ULR sequences, nor according to their sequences. Taken together, the low motif homologies, the phylogenetic relationship and the great nucleotide divergence between the melanogaster and repleta copia sequences reinforce the hypothesis that there are two copia families.
Abstract:
The 3D NMR structures of six octapeptide agonist analogues of somatostatin (SRIF) in the free form are described. These analogues, with the basic sequence H-DPhe/Phe2-c[Cys3-Xxx7-DTrp8-Lys9-Thr10-Cys14]-Thr-NH2 (the numbering refers to the position in native SRIF), with Xxx7 being Ala/Aph, exhibit potent and highly selective binding to human SRIF type 2 (sst2) receptors. The backbones of these sst2-selective analogues have the usual type-II' beta-turn reported in the literature for sst2/3/5-subtype-selective analogues. Correlating the biological results and NMR studies led to the identification of the side chains of DPhe2, DTrp8, and Lys9 as the necessary components of the sst2 pharmacophore. This is the first study to show that the aromatic ring at position 7 (Phe7) is not critical for sst2 binding and that it plays an important role in sst3 and sst5 binding. This pharmacophore is, therefore, different from that proposed by others for sst2/3/5 analogues.
Abstract:
The three-dimensional NMR structures of seven octapeptide analogs of somatostatin (SRIF), based on octreotide, with the basic sequence H-Cpa/Phe2-c[DCys3-Xxx7-DTrp/DAph(Cbm)8-Lys9-Thr10-Cys14]-Yyy-NH2 (the numbering refers to the position in native SRIF), with Xxx7 being Aph(Cbm)/Tyr/Agl(NMe,benzoyl) and Yyy being Nal/DTyr/Thr, are presented here. Most of these analogs exhibit potent and highly selective binding to sst2 receptors, and all of the analogs are antagonists inhibiting receptor signaling. Based on their consensus 3D structure, the pharmacophore of the sst2-selective antagonist has been defined. The pharmacophore involves the side chains of Cpa2, DTrp/DAph(Cbm)8, and Lys9, with the backbone for most of the sst2-selective antagonists comprising a type-II' beta-turn. Hence, the sst2-selective antagonist pharmacophore is very similar to the sst2-selective agonist pharmacophore previously described.