951 results for sequence based alignments
Abstract:
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaptation of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Abstract:
Conventionally, protein structure prediction via threading relies on some nonoptimal method to align a protein sequence to each member of a library of known structures. We show how a score function (force field) can be modified so as to allow the direct application of a dynamic programming algorithm to the problem. This involves an approximation whose damage can be minimized by an optimization process during score function parameter determination. The method is compared to sequence to structure alignments using a more conventional pair-wise score function and the frozen approximation. The new method produces results comparable to the frozen approximation, but is faster and has fewer adjustable parameters. It is also free of memory of the template's original amino acid sequence, and does not suffer from a problem of nonconvergence, which can be shown to occur with the frozen approximation. Alignments generated by the simplified score function can then be ranked using a second score function with the approximations removed. (C) 1999 John Wiley & Sons, Inc.
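The point about dynamic programming can be illustrated with a minimal sketch. It assumes the score of placing a residue at a template site depends only on that residue and that site, which is the property that makes dynamic programming directly applicable. The function name, the `site_scores` matrix, and the linear `gap_penalty` are hypothetical choices for illustration and are not the score function of the paper.

```python
def align_to_template(site_scores, gap_penalty=-1.0):
    """Global alignment of a sequence to a template by dynamic programming.

    site_scores[i][j] is the (assumed, position-specific) score for placing
    residue i of the sequence at site j of the template.
    Returns the best total score and the list of aligned (residue, site) pairs.
    """
    n = len(site_scores)
    m = len(site_scores[0]) if n else 0

    # best[i][j]: best score for the first i residues against the first j sites
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        best[0][j] = j * gap_penalty

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + site_scores[i - 1][j - 1],  # match
                best[i - 1][j] + gap_penalty,  # residue left unaligned
                best[i][j - 1] + gap_penalty,  # template site skipped
            )

    # Trace back from the corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + site_scores[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap_penalty:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


# Toy example: two residues, three template sites.
score, pairs = align_to_template([[2, 0, 0], [0, 0, 2]])
print(score, pairs)
```

A pair-wise score function, by contrast, makes the score at one site depend on which residues occupy other sites, so the recurrence above no longer holds without an approximation such as the frozen approximation mentioned in the abstract.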
Abstract:
Work presented within the European Master in Computational Logics, as a partial requirement for obtaining the degree of Master in Computational Logics
Abstract:
The sequence profile method (Gribskov M, McLachlan AD, Eisenberg D, 1987, Proc Natl Acad Sci USA 84:4355-4358) is a powerful tool to detect distant relationships between amino acid sequences. A profile is a table of position-specific scores and gap penalties, providing a generalized description of a protein motif, which can be used for sequence alignments and database searches instead of an individual sequence. A sequence profile is derived from a multiple sequence alignment. We have found 2 ways to improve the sensitivity of sequence profiles: (1) Sequence weights: Usage of individual weights for each sequence avoids bias toward closely related sequences. These weights are automatically assigned based on the distance of the sequences using a published procedure (Sibbald PR, Argos P, 1990, J Mol Biol 216:813-818). (2) Amino acid substitution table: In addition to the alignment, the construction of a profile also needs an amino acid substitution table. We have found that in some cases a new table, the BLOSUM45 table (Henikoff S, Henikoff JG, 1992, Proc Natl Acad Sci USA 89:10915-10919), is more sensitive than the original Dayhoff table or the modified Dayhoff table used in the current implementation. Profiles derived by the improved method are more sensitive and selective in a number of cases where previous methods have failed to completely separate true members from false positives.
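As a rough illustration of how sequence weights and a substitution table enter a profile, the sketch below scores each alignment column as the weighted residue frequency multiplied by a substitution score, in the spirit of the "average" profile construction. The function name, the two-letter alphabet, and the toy substitution values are assumptions made for the example; they are not BLOSUM45 or Dayhoff values, and gap penalties are omitted.

```python
from collections import defaultdict


def build_profile(alignment, weights, subst, alphabet):
    """Build position-specific scores from a weighted multiple alignment.

    alignment: list of equal-length aligned sequences ('-' marks a gap)
    weights:   one weight per sequence (normalised internally)
    subst:     subst[a][b] is the substitution score between residues a and b
    Returns one dict per column, mapping each residue in `alphabet` to a score.
    """
    total = sum(weights)
    profile = []
    for col in range(len(alignment[0])):
        # Weighted frequency of each residue observed in this column.
        freq = defaultdict(float)
        for seq, w in zip(alignment, weights):
            if seq[col] != "-":
                freq[seq[col]] += w / total
        # Score of residue a = sum over observed b of freq(b) * subst[a][b].
        profile.append(
            {a: sum(f * subst[a][b] for b, f in freq.items()) for a in alphabet}
        )
    return profile


# Hypothetical toy table and alignment (two near-identical sequences, one divergent).
toy_subst = {"A": {"A": 2, "G": -1}, "G": {"A": -1, "G": 2}}
toy_alignment = ["AG", "AG", "GG"]

equal = build_profile(toy_alignment, [1, 1, 1], toy_subst, "AG")
weighted = build_profile(toy_alignment, [0.25, 0.25, 0.5], toy_subst, "AG")
print(equal[0], weighted[0])
```

With equal weights the first column is biased toward the residue carried by the two closely related sequences; down-weighting them evens out the scores, which is the effect the weighting scheme is meant to achieve.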
Abstract:
The M-Coffee server is a web server that makes it possible to compute multiple sequence alignments (MSAs) by running several MSA methods and combining their output into one single model. This allows the user to simultaneously run all their methods of choice without having to arbitrarily choose one of them. The MSA is delivered along with a local estimation of its consistency with the individual MSAs it was derived from. The computation of the consensus multiple alignment is carried out using a special mode of the T-Coffee package [Notredame, Higgins and Heringa (T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302: 205-217); Wallace, O'Sullivan, Higgins and Notredame (M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006; 34: 1692-1699)]. Given a set of sequences (DNA or proteins) in FASTA format, M-Coffee delivers a multiple alignment in the most common formats. M-Coffee is a freeware open source package distributed under a GPL license and it is available either as a standalone package or as a web service from www.tcoffee.org.
Abstract:
This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides easy and intuitive access to the most popular functionality of the package. This includes the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high accuracy alignments while using structural or homology derived templates. The three available template modes are Expresso for the alignment of proteins with a known 3D structure, R-Coffee to align RNA sequences with conserved secondary structures, and PSI-Coffee to accurately align distantly related sequences using homology extension. The new server benefits from recent improvements of the T-Coffee algorithm, can align up to 150 sequences as long as 10,000 residues, and is available from both http://www.tcoffee.org and its main mirror http://tcoffee.crg.cat.
Abstract:
Comparative analysis of gene fragments of six housekeeping loci, distributed around the two chromosomes of Vibrio cholerae, has been carried out for a collection of 29 V. cholerae O139 Bengal strains isolated from India during the first epidemic period (1992 to 1993). A toxigenic O1 ElTor strain from the seventh pandemic and an environmental non-O1/non-O139 strain were also included in this study. All loci studied were polymorphic, with a small number of polymorphic sites in the sequenced fragments. The genetic diversity determined for our O139 population is concordant with a previous multilocus enzyme electrophoresis study in which we analyzed the same V. cholerae O139 strains. In both studies we have found a higher genetic diversity than reported previously in other molecular studies. The results of the present work showed that O139 strains clustered in several lineages of the dendrogram generated from the matrix of allelic mismatches between the different genotypes, a finding which does not support the previously reported hypothesis that the O139 serogroup is a unique clone. The statistical analysis performed on the V. cholerae O139 isolates suggested a clonal population structure. Moreover, the application of Sawyer's test and split decomposition to detect intragenic recombination in the sequenced gene fragments did not indicate the existence of recombination in our O139 population.
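The matrix of allelic mismatches mentioned above can be sketched as follows, assuming each genotype is recorded as a tuple of allele numbers, one per locus, and that the distance between two genotypes is the number of loci at which their alleles differ. The function name, the genotype labels, and the allele numbers are hypothetical and are not data from the study.

```python
def allelic_mismatch_matrix(profiles):
    """Count, for every pair of genotypes, the loci at which alleles differ.

    profiles: dict mapping a genotype label to a tuple of allele numbers,
              one entry per locus, in the same locus order for all genotypes.
    Returns the sorted labels and a square matrix of mismatch counts.
    """
    names = sorted(profiles)
    matrix = [
        [sum(a != b for a, b in zip(profiles[p], profiles[q])) for q in names]
        for p in names
    ]
    return names, matrix


# Hypothetical six-locus allelic profiles for three genotypes.
toy_profiles = {
    "G1": (1, 1, 1, 1, 1, 1),
    "G2": (1, 2, 1, 1, 3, 1),
    "G3": (2, 2, 1, 1, 3, 1),
}
names, matrix = allelic_mismatch_matrix(toy_profiles)
for name, row in zip(names, matrix):
    print(name, row)
```

A matrix of this kind can then be passed to a clustering method to produce a dendrogram; the abstract does not state which clustering method was used.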
Abstract:
Affiliation: Département de biochimie, Faculté de médecine, Université de Montréal
Abstract:
Considerable research effort has been devoted to predicting the exon regions of genes. The binary indicator (BI), the electron-ion interaction pseudopotential (EIIP), and the filter method are some of the approaches in use. All of these methods exploit the period-three behavior of exon regions. Although the method suggested in this paper is similar to those mentioned above, it introduces a set of numerical sequences for mapping the nucleotides, selected by applying a genetic algorithm, and is found to be more promising.
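The period-three idea can be sketched by mapping each nucleotide to a number and measuring the spectral power at frequency 1/3. The sketch below uses the commonly cited EIIP values as the mapping; the genetic-algorithm-derived mapping proposed in the paper is not given in the abstract and is not reproduced here. The function name and the test sequences are assumptions for illustration.

```python
import cmath

# Commonly cited EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def period3_power(dna, mapping=EIIP):
    """Spectral power of a numerically mapped DNA sequence at frequency 1/3."""
    x = [mapping[base] for base in dna.upper()]
    n = len(x)
    mean = sum(x) / n
    # Single DFT coefficient at frequency 1/3 of the mean-removed signal.
    coeff = sum(
        (value - mean) * cmath.exp(-2j * cmath.pi * k / 3)
        for k, value in enumerate(x)
    )
    return abs(coeff) ** 2 / n


# A codon-like repeat shows a strong period-three component;
# a period-four repeat of the same length shows essentially none.
print(period3_power("ATG" * 20))
print(period3_power("ACGT" * 15))
```

In practice the power is usually computed in a sliding window along the sequence, and windows with a pronounced peak are taken as candidate exon regions. Passing a different `mapping` dictionary is one way to compare alternative nucleotide encodings.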
Abstract:
The monophyly of the Peltophorum group, one of nine informal groups recognized by Polhill in the Caesalpinieae, was tested using sequence data from the trnL-F, rbcL, and rps16 regions of the chloroplast genome. Exemplars were included from all 16 genera of the Peltophorum group, and from 15 genera representing seven of the other eight informal groups in the tribe. The data were analyzed separately and in combined analyses using parsimony and Bayesian methods. The analysis method had little effect on the topology of well-supported relationships. The molecular data recovered a generally well-supported phylogeny with many intergeneric relationships resolved. Results show that the Peltophorum group as currently delimited is polyphyletic, but that eight genera plus one undescribed genus form a core Peltophorum group, which is referred to here as the Peltophorum group sensu stricto. These genera are Bussea, Conzattia, Colvillea, Delonix, Heteroflorum (inedit.), Lemuropisum, Parkinsonia, Peltophorum, and Schizolobium. The remaining eight genera of the Peltophorum group s.l. are distributed across the Caesalpinieae. Morphological support for the redelimited Peltophorum group and the other recovered clades was assessed, and no unique synapomorphy was found for the Peltophorum group s.s. A proposal for the reclassification of the Peltophorum group s.l. is presented.
Abstract:
Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (~30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.
Abstract:
The phylogenetics of Sternbergia (Amaryllidaceae) were studied using DNA sequences of the plastid ndhF and matK genes and nuclear internal transcribed spacer (ITS) ribosomal region for 38, 37 and 32 ingroup and outgroup accessions, respectively. All members of Sternbergia were represented by at least one accession, except S. minoica and S. schubertii, with additional taxa from Narcissus and Pancratium serving as principal outgroups. Sternbergia was resolved and supported as sister to Narcissus and composed of two primary subclades: S. colchiciflora sister to S. vernalis, S. candida and S. clusiana, with this clade in turn sister to S. lutea and its allies in both Bayesian and bootstrap analyses. A clear relationship between the two vernal flowering members of the genus was recovered, supporting the hypothesis of a single origin of vernal flowering in Sternbergia. However, in the S. lutea complex, the DNA markers examined did not offer sufficient resolving power to separate taxa, providing some support for the idea that S. sicula and S. greuteriana are conspecific with S. lutea.