887 results for Sequence alignment


Relevance: 70.00%

Abstract:

BACKGROUND: The availability of the P. falciparum genome has led to novel ways to identify potential vaccine candidates. A new approach for antigen discovery based on the bioinformatic selection of heptad repeat motifs corresponding to alpha-helical coiled coil structures yielded promising results. To clarify the relationship between the coiled coil motifs and their sequence conservation, we have assessed the extent of polymorphism in putative alpha-helical coiled coil domains in culture strains, in natural populations and in the single nucleotide polymorphism data available at PlasmoDB. METHODOLOGY/PRINCIPAL FINDINGS: 14 alpha-helical coiled coil domains were selected based on preclinical experimental evaluation. They were tested by PCR amplification and sequencing of different P. falciparum culture strains and field isolates. We found that only 3 out of 14 alpha-helical coiled coils showed point mutations and/or length polymorphisms. Based on promising immunological results, 5 of these peptides were selected for further analysis. Direct sequencing of field samples from Papua New Guinea and Tanzania showed that 3 out of these 5 peptides were completely conserved. An in silico analysis of polymorphism was performed for all 166 putative alpha-helical coiled coil domains originally identified in the P. falciparum genome. We found that 82% (137/166) of these peptides were conserved, and for only one peptide did the detected SNPs substantially decrease the probability score for alpha-helical coiled coil formation. More SNPs were found in arrays of almost perfect tandem repeats. In summary, the coiled coil structure prediction was rarely modified by SNPs. The analysis revealed a number of peptides with strictly conserved alpha-helical coiled coil motifs. CONCLUSION/SIGNIFICANCE: We conclude that the selection of alpha-helical coiled coil structural motifs is a valuable approach to identify potential vaccine targets showing a high degree of conservation.

Relevance: 70.00%

Abstract:

Errors in the inferred multiple sequence alignment may lead to false prediction of positive selection. Recently, methods for detecting unreliable alignment regions were developed and were shown to accurately identify incorrectly aligned regions. While removing unreliable alignment regions is expected to increase the accuracy of positive selection inference, such filtering may also significantly decrease the power of the test, as positively selected regions are fast evolving, and those same regions are often those that are difficult to align. Here, we used realistic simulations that mimic sequence evolution of HIV-1 genes to test the hypothesis that the performance of positive selection inference using codon models can be improved by removing unreliable alignment regions. Our study shows that the benefit of removing unreliable regions exceeds the loss of power due to the removal of some of the true positively selected sites.

Relevance: 70.00%

Abstract:

The sequence profile method (Gribskov M, McLachlan AD, Eisenberg D, 1987, Proc Natl Acad Sci USA 84:4355-4358) is a powerful tool to detect distant relationships between amino acid sequences. A profile is a table of position-specific scores and gap penalties, providing a generalized description of a protein motif, which can be used for sequence alignments and database searches instead of an individual sequence. A sequence profile is derived from a multiple sequence alignment. We have found 2 ways to improve the sensitivity of sequence profiles: (1) Sequence weights: Usage of individual weights for each sequence avoids bias toward closely related sequences. These weights are automatically assigned based on the distance of the sequences using a published procedure (Sibbald PR, Argos P, 1990, J Mol Biol 216:813-818). (2) Amino acid substitution table: In addition to the alignment, the construction of a profile also needs an amino acid substitution table. We have found that in some cases a new table, the BLOSUM45 table (Henikoff S, Henikoff JG, 1992, Proc Natl Acad Sci USA 89:10915-10919), is more sensitive than the original Dayhoff table or the modified Dayhoff table used in the current implementation. Profiles derived by the improved method are more sensitive and selective in a number of cases where previous methods have failed to completely separate true members from false positives.
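
The idea of a weighted, position-specific table can be illustrated with a minimal Python sketch. It is not the published profile method: it only computes weighted residue frequencies per alignment column and leaves out the substitution table and gap penalties that a full profile combines with them. The function name `build_profile` and the example weights are hypothetical; the weights are supplied by hand here and are not derived with the Sibbald and Argos procedure.

```python
def build_profile(alignment, weights=None, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    """Return weighted residue frequencies for each column of an alignment.

    `alignment` is a list of equal-length aligned sequences.
    `weights` holds one weight per sequence (assumed to be given).
    """
    if weights is None:
        weights = [1.0] * len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        counts = {aa: 0.0 for aa in alphabet}
        total = 0.0
        for seq, weight in zip(alignment, weights):
            residue = seq[col]
            if residue in counts:  # gaps and unknown symbols are skipped
                counts[residue] += weight
                total += weight
        # Normalise the weighted counts to frequencies for this column.
        profile.append(
            {aa: (count / total if total else 0.0) for aa, count in counts.items()}
        )
    return profile


# Two near-identical sequences share the weight that one divergent sequence gets alone.
example = build_profile(["AC", "AC", "GC"], weights=[0.25, 0.25, 0.5])
```

With these illustrative weights the two identical sequences together count as much as the single divergent one, which is the bias correction the abstract describes.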

Relevance: 70.00%

Abstract:

The amino acid sequence of mouse brain beta spectrin (beta fodrin), deduced from the nucleotide sequence of complementary DNA clones, reveals that this non-erythroid beta spectrin comprises 2363 residues, with a molecular weight of 274,449 Da. Brain beta spectrin contains three structural domains, and we suggest the positions of several functional domains, including F-actin, synapsin I, ankyrin and spectrin self-association sites. Analysis of deduced amino acid sequences indicated striking homology and similar structural characteristics of brain beta spectrin repeats beta 11 and beta 12 to globins. In vitro analysis has demonstrated that heme is capable of specific attachment to brain spectrin, suggesting possible new functions in electron transfer, oxygen binding, nitric oxide binding or heme scavenging.

Relevance: 70.00%

Abstract:

During the last 2 years, several novel genes that encode glucose transporter-like proteins have been identified and characterized. Because of their sequence similarity with GLUT1, these genes appear to belong to the family of solute carriers 2A (SLC2A, protein symbol GLUT). Sequence comparisons of all 13 family members allow the definition of characteristic sugar/polyol transporter signatures: (1) the presence of 12 membrane-spanning helices, (2) seven conserved glycine residues in the helices, (3) several basic and acidic residues at the intracellular surface of the proteins, (4) two conserved tryptophan residues, and (5) two conserved tyrosine residues. On the basis of sequence similarities and characteristic elements, the extended GLUT family can be divided into three subfamilies, namely class I (the previously known glucose transporters GLUT1-4), class II (the previously known fructose transporter GLUT5, the GLUT7, GLUT9 and GLUT11), and class III (GLUT6, 8, 10, 12, and the myo-inositol transporter HMIT1). Functional characteristics have been reported for some of the novel GLUTs. Like GLUT1-4, they exhibit a tissue/cell-specific expression (GLUT6, leukocytes, brain; GLUT8, testis, blastocysts, brain, muscle, adipocytes; GLUT9, liver, kidney; GLUT10, liver, pancreas; GLUT11, heart, skeletal muscle). GLUT6 and GLUT8 appear to be regulated by sub-cellular redistribution, because they are targeted to intra-cellular compartments by dileucine motifs in a dynamin dependent manner. Sugar transport has been reported for GLUT6, 8, and 11; HMIT1 has been shown to be a H+/myo-inositol co-transporter. Thus, the members of the extended GLUT family exhibit a surprisingly diverse substrate specificity, and the definition of sequence elements determining this substrate specificity will require a full functional characterization of all members.

Relevance: 70.00%

Abstract:

A novel member of the tumor necrosis factor (TNF) receptor family, designated TRAMP, has been identified. The structural organization of the 393 amino acid long human TRAMP is most homologous to TNF receptor 1. TRAMP is abundantly expressed on thymocytes and lymphocytes. Its extracellular domain is composed of four cysteine-rich domains, and the cytoplasmic region contains a death domain known to signal apoptosis. Overexpression of TRAMP leads to two major responses, NF-kappaB activation and apoptosis. TRAMP-induced cell death is inhibited by an inhibitor of ICE-like proteases, but not by Bcl-2. In addition, TRAMP does not appear to interact with any of the known apoptosis-inducing ligands of the TNF family.

Relevance: 70.00%

Abstract:

The malic enzyme (ME) gene is a target for both thyroid hormone receptors and peroxisome proliferator-activated receptors (PPAR). Within the ME promoter, two direct repeat (DR)-1-like elements, MEp and MEd, have been identified as putative PPAR response elements (PPRE). We demonstrate that only MEp and not MEd is able to bind PPAR/retinoid X receptor (RXR) heterodimers and mediate peroxisome proliferator signaling. Taking advantage of the close sequence resemblance of MEp and MEd, we have identified crucial determinants of a PPRE. Using reciprocal mutation analyses of these two elements, we show the preference for adenine as the spacing nucleotide between the two half-sites of the PPRE and demonstrate the importance of the two first bases flanking the core DR1 in 5'. This latter feature of the PPRE led us to consider the polarity of the PPAR/RXR heterodimer bound to its cognate element. We demonstrate that, in contrast to the polarity of RXR/TR and RXR/RAR bound to DR4 and DR5 elements respectively, PPAR binds to the 5' extended half-site of the response element, while RXR occupies the 3' half-site. Consistent with this polarity is our finding that formation and binding of the PPAR/RXR heterodimer requires an intact hinge T region in RXR, while its integrity is not required for binding of the RXR/TR heterodimer to a DR4.

Relevance: 70.00%

Abstract:

In order to contribute to the debate about southern glacial refugia used by temperate species and more northern refugia used by boreal or cold-temperate species, we examined the phylogeography of a widespread snake species (Vipera berus) inhabiting Europe up to the Arctic Circle. The analysis of the mitochondrial DNA (mtDNA) sequence variation in 1043 bp of the cytochrome b gene and in 918 bp of the noncoding control region was performed with phylogenetic approaches. Our results suggest that both the duplicated control region and cytochrome b evolve at a similar rate in this species. Phylogenetic analysis showed that V. berus is divided into three major mitochondrial lineages, probably resulting from an Italian, a Balkan and a Northern (from France to Russia) refugial area in Eastern Europe, near the Carpathian Mountains. In addition, the Northern clade presents an important substructure, suggesting two sequential colonization events in Europe. First, the continent was colonized from the three main refugial areas mentioned above during the Lower-Mid Pleistocene. Second, recolonization of most of Europe most likely originated from several refugia located outside of the Mediterranean peninsulas (Carpathian region, east of the Carpathians, France and possibly Hungary) during the Mid-Late Pleistocene, while populations within the Italian and Balkan Peninsulas fluctuated only slightly in distribution range, with larger lowland populations during glacial times and with refugial mountain populations during interglacials, as in the present time. The phylogeographical structure revealed in our study suggests complex recolonization dynamics of the European continent by V. berus, characterized by latitudinal as well as altitudinal range shifts, driven by both climatic changes and competition with related species.

Relevance: 70.00%

Abstract:

Homology modeling is the most commonly used technique to build a three-dimensional model for a protein sequence. It relies heavily on the quality of the sequence alignment between the protein to model and related proteins with a known three-dimensional structure. Alignment quality can be assessed according to the physico-chemical properties of the three-dimensional models it produces. In this work, we introduce fifteen predictors designed to evaluate the properties of the models obtained for various alignments. They consist of an energy value obtained from different force fields (CHARMM, ProsaII or ANOLEA) computed on residues selected around misaligned regions. These predictors were evaluated on ten challenging test cases. For each target, all possible ungapped alignments are generated and their corresponding models are computed and evaluated. The best predictor, retrieving the structural alignment for 9 out of 10 test cases, is based on the ANOLEA atomistic mean force potential and takes into account residues around misaligned secondary structure elements. The performance of the other predictors is significantly lower. This work shows that substantial improvement in local alignments can be obtained by careful assessment of the local structure of the resulting models.
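
The step of generating all possible ungapped alignments can be sketched in Python as sliding one sequence along the other. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, and a simple identity count stands in for the force-field energy predictors (CHARMM, ProsaII, ANOLEA), which would require building and scoring a three-dimensional model for each placement.

```python
def ungapped_alignments(target, template):
    """Yield (offset, pairs) for every ungapped placement with at least one overlap.

    `offset` is the template index aligned to target[0]; it may be negative
    when the target overhangs the start of the template.
    """
    for offset in range(-(len(target) - 1), len(template)):
        pairs = [
            (target[i], template[i + offset])
            for i in range(len(target))
            if 0 <= i + offset < len(template)
        ]
        yield offset, pairs


def best_offset(target, template):
    """Pick the placement with the most identical pairs (placeholder score only)."""
    def identity(item):
        _, pairs = item
        return sum(a == b for a, b in pairs)

    return max(ungapped_alignments(target, template), key=identity)[0]


placements = list(ungapped_alignments("AB", "XYZ"))  # 2 + 3 - 1 = 4 placements
```

For sequences of lengths m and n the enumeration yields m + n - 1 placements, so exhaustive evaluation stays cheap on the alignment side; the cost in the described approach lies in evaluating a model for each one.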

Relevance: 70.00%

Abstract:

Construction of multiple sequence alignments is a fundamental task in Bioinformatics. Multiple sequence alignments are used as a prerequisite in many Bioinformatics methods, and subsequently the quality of such methods can be critically dependent on the quality of the alignment. However, automatic construction of a multiple sequence alignment for a set of remotely related sequences does not always provide biologically relevant alignments. Therefore, there is a need for an objective approach for evaluating the quality of automatically aligned sequences. The profile hidden Markov model is a powerful approach in comparative genomics. In the profile hidden Markov model, the symbol probabilities are estimated at each conserved alignment position. This can increase the dimension of parameter space and cause an overfitting problem. These two research problems are both related to conservation. We have developed statistical measures for quantifying the conservation of multiple sequence alignments. Two types of methods are considered: those identifying conserved residues in an alignment position, and those calculating positional conservation scores. The positional conservation score was exploited in a statistical prediction model for assessing the quality of multiple sequence alignments. The residue conservation score was used as part of the emission probability estimation method proposed for profile hidden Markov models. The predicted alignment quality scores correlated highly with the correct alignment quality scores, indicating that our method is reliable for assessing the quality of any multiple sequence alignment. The comparison of the emission probability estimation method with the maximum likelihood method showed that the number of estimated parameters in the model was dramatically decreased, while the same level of accuracy was maintained. To conclude, we have shown that conservation can be successfully used in the statistical model for alignment quality assessment and in the estimation of emission probabilities in the profile hidden Markov models.
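
A positional conservation score can be illustrated with a short Python sketch. The abstract does not say which statistical measure was used, so the choice here is an assumption: one minus the Shannon entropy of the column, normalised by the maximum entropy of a 20-letter amino acid alphabet. The function name `column_conservation` is hypothetical.

```python
import math
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Return one score per alignment column: 1.0 is fully conserved, lower is more variable.

    Assumed measure: 1 - H / log2(20), where H is the Shannon entropy of the
    non-gap residues in the column. Columns containing only gaps score 0.0.
    """
    max_entropy = math.log2(20)
    scores = []
    for col in range(len(alignment[0])):
        residues = [seq[col] for seq in alignment if seq[col] != gap]
        if not residues:
            scores.append(0.0)
            continue
        n = len(residues)
        entropy = -sum(
            (count / n) * math.log2(count / n)
            for count in Counter(residues).values()
        )
        scores.append(1.0 - entropy / max_entropy)
    return scores


# First column is invariant, second column has four different residues.
scores = column_conservation(["AC", "AD", "AE", "AF"])
```

Scores of this kind could feed a downstream quality model, but how the positional score enters the prediction model in the described work is not specified in the abstract.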

Relevance: 70.00%

Abstract:

Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.

Relevance: 70.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance: 70.00%

Abstract:

Sao Paulo State Research Foundation-FAPESP

Relevance: 70.00%

Abstract:

Historically, morphological features were used as the primary means to classify organisms. However, the age of molecular genetics has allowed us to approach this field from the perspective of the organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in the public data repositories provides the opportunity to look not only at a single gene, but at organisms' entire parts list. Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods to compare proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs which had a reasonable expectation of being related. From these alignments a whole proteome phylogenetic tree was constructed. This method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome. As part of this work, the Database of Related Local Alignments (DaRLA) was created and contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole proteome trees, can also be applied to shared gene content analysis, gene order analysis, and creating individual protein trees. Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using 4 spirochetes. The SCI system performed flawlessly, finding all proteins from one organism against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins from its respective organism and failed to detect small ribosomal proteins between organisms.
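
For readers unfamiliar with the Smith-Waterman alignments mentioned above, the following Python sketch computes the optimal local alignment score for two sequences. It is a simplified illustration, not the SCI implementation: the match, mismatch and linear gap values are assumed parameters, whereas protein comparisons normally use a substitution matrix and often affine gap penalties. How SCI and OCI are derived from such scores is not described in the abstract and is not attempted here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the standard Smith-Waterman recurrence with a linear gap penalty.
    Only two rows of the dynamic programming matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which lets a local alignment restart.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score = smith_waterman_score("TTACGTT", "GGACGGG")  # shared core "ACG"
```

The score-only form runs in time proportional to the product of the sequence lengths, which gives a sense of why the described work restricted full alignments to pairs with a reasonable expectation of being related.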