259 results for Alignments.


Relevance: 10.00%

Abstract:

Sequence analysis of Leishmania (Viannia) kDNA minicircles and analysis of multiple sequence alignments of the conserved region (minirepeats) of five distinct minicircles from L. (V.) braziliensis species with corresponding sequences derived from other dermotropic leishmanias indicated the presence of a sub-genus specific sequence. An oligonucleotide bearing this sequence was designed and used as a molecular probe, being able to recognize solely the sub-genus Viannia species in hybridization experiments. A dendrogram reflecting the homologies among the minirepeat sequences was constructed. Sequence clustering was obtained corresponding to the traditional classification based on similarity of biochemical, biological and parasitological characteristics of these Leishmania species, distinguishing the Old World dermotropic leishmanias, the New World dermotropic leishmanias of the sub-genus Leishmania and of the sub-genus Viannia.

Relevance: 10.00%

Abstract:

MOTIVATION: The anatomy of model species is described in ontologies, which are used to standardize the annotations of experimental data, such as gene expression patterns. To compare such data between species, we need to establish relations between ontologies describing different species. RESULTS: We present a new algorithm, and its implementation in the software Homolonto, to create new relationships between anatomical ontologies, based on the homology concept. Homolonto uses a supervised ontology alignment approach. Several alignments can be merged, forming homology groups. We also present an algorithm to generate relationships between these homology groups. This has been used to build a multi-species ontology, for the database of gene expression evolution Bgee. AVAILABILITY: download section of the Bgee website http://bgee.unil.ch/

Relevance: 10.00%

Abstract:

The RsmA family of RNA-binding proteins are global post-transcriptional regulators that mediate extensive changes in gene expression in bacteria. They bind to, and affect the translation rate of target mRNAs, a function that is further modulated by one or more, small, untranslated competitive regulatory RNAs. To gain new insights into the nature of this protein/RNA interaction, we used X-ray crystallography to solve the structure of the Yersinia enterocolitica RsmA homologue. RsmA consists of a dimeric beta barrel from which two alpha helices are projected. From structure-based alignments of the RsmA protein family from diverse bacteria, we identified key amino acid residues likely to be involved in RNA-binding. Site-specific mutagenesis revealed that arginine at position 44, located at the N terminus of the alpha helix is essential for biological activity in vivo and RNA-binding in vitro. Mutation of this site affects swarming motility, exoenzyme and secondary metabolite production in the human pathogen Pseudomonas aeruginosa, carbon metabolism in Escherichia coli, and hydrogen cyanide production in the plant beneficial strain Pseudomonas fluorescens CHA0. R44A mutants are also unable to interact with the small untranslated RNA, RsmZ. Thus, although possessing a motif similar to the KH domain of some eukaryotic RNA-binding proteins, RsmA differs substantially and incorporates a novel class of RNA-binding site.

Relevance: 10.00%

Abstract:

The MyHits web server (http://myhits.isb-sib.ch) is a new integrated service dedicated to the annotation of protein sequences and to the analysis of their domains and signatures. Guest users can use the system anonymously, with full access to (i) standard bioinformatics programs (e.g. PSI-BLAST, ClustalW, T-Coffee, Jalview); (ii) a large number of protein sequence databases, including standard (Swiss-Prot, TrEMBL) and locally developed databases (splice variants); (iii) databases of protein motifs (Prosite, Interpro); (iv) a precomputed list of matches ('hits') between the sequence and motif databases. All databases are updated on a weekly basis and the hit list is kept up to date incrementally. The MyHits server also includes a new collection of tools to generate graphical representations of pairwise and multiple sequence alignments including their annotated features. Free registration enables users to upload their own sequences and motifs to private databases. These are then made available through the same web interface and the same set of analytical tools. Registered users can manage their own sequences and annotations using only web tools and freeze their data in their private database for publication purposes.

Relevance: 10.00%

Abstract:

The number of sequences generated by genome projects has increased exponentially, but gene characterization has not followed at the same rate. Sequencing and analysis of full-length cDNAs is an important step in gene characterization that has been used nowadays by several research groups. In this work, we have selected Schistosoma mansoni clones for full-length sequencing, using an algorithm that investigates the presence of the initial methionine in the parasite sequence based on the positions of alignment start between two sequences. BLAST searches to produce such alignments have been performed using parasite expressed sequence tags produced by Minas Gerais Genome Network against sequences from the database Eukaryotic Cluster of Orthologous Groups (KOG). This procedure has allowed the selection of clones representing 398 proteins which have not been deposited as S. mansoni complete CDS in any public database. Dedicated sequencing of 96 of such clones with reads from both 5' and 3' ends has been performed. These reads have been assembled using PHRAP, resulting in the production of 33 full-length sequences that represent novel S. mansoni proteins. These results shall contribute to construct a more complete view of the biology of this important parasite.
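
The idea of checking for the initial methionine from alignment start positions can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a translated alignment in which the EST is the nucleotide query and the KOG protein is the subject, with 1-based start coordinates on the forward strand. The function names `predicted_start_codon` and `likely_full_length` are hypothetical.

```python
def predicted_start_codon(est_seq, q_start, s_start):
    """Project the protein's first residue back onto the EST.

    q_start: 1-based nucleotide position in the EST where the alignment begins.
    s_start: 1-based residue position in the protein where the alignment begins.
    Returns (position, codon), or None if the projected position falls
    before the start of the EST (the clone is probably 5'-truncated).
    """
    # Each protein residue upstream of the alignment start costs three nucleotides.
    position = q_start - 3 * (s_start - 1)
    if position < 1 or position + 2 > len(est_seq):
        return None
    codon = est_seq[position - 1:position + 2].upper()
    return position, codon


def likely_full_length(est_seq, q_start, s_start):
    """True when the projected start codon exists in the EST and is ATG."""
    hit = predicted_start_codon(est_seq, q_start, s_start)
    return hit is not None and hit[1] == "ATG"


# Hypothetical EST: two untranslated bases, then ATG and further codons.
est = "GGATGGCTAAAGGTCC"
print(likely_full_length(est, q_start=9, s_start=3))
print(likely_full_length(est, q_start=2, s_start=5))
```

A real pipeline would also need to handle reverse-strand hits, frame shifts and gaps near the alignment start, which this sketch ignores.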

Relevance: 10.00%

Abstract:

We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments.
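
To illustrate what aligning sequences of factor labels means, here is a small sketch of a global pairwise alignment score over two label lists. It uses a plain Needleman-Wunsch recurrence, not the restriction-map algorithm the authors adapted, and it ignores site positions and prediction scores. The scoring values and the function name `label_alignment_score` are assumptions chosen for the example.

```python
def label_alignment_score(map_a, map_b, match=2, mismatch=-1, gap=-1):
    """Global alignment score of two lists of transcription factor labels."""
    rows, cols = len(map_a) + 1, len(map_b) + 1
    # dp[i][j] holds the best score for aligning map_a[:i] with map_b[:j].
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if map_a[i - 1] == map_b[j - 1] else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair,  # align the two labels
                dp[i - 1][j] + gap,       # label in map_a left unaligned
                dp[i][j - 1] + gap,       # label in map_b left unaligned
            )
    return dp[-1][-1]


# Hypothetical predicted-site labels for two orthologous promoters.
human = ["SP1", "TBP", "CREB"]
mouse = ["SP1", "CREB"]
print(label_alignment_score(human, mouse))
```

The point of the sketch is only that two promoters with no detectable nucleotide conservation can still share an ordered series of labels, which a label-level alignment rewards.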

Relevance: 10.00%

Abstract:

Tumor necrosis factor (TNF) ligand and receptor superfamily members play critical roles in diverse developmental and pathological settings. In search for novel TNF superfamily members, we identified a murine chromosomal locus that contains three new TNF receptor-related genes. Sequence alignments suggest that the ligand binding regions of these murine TNF receptor homologues, mTNFRH1, -2 and -3, are most homologous to those of the tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) receptors. By using a number of in vitro ligand-receptor binding assays, we demonstrate that mTNFRH1 and -2, but not mTNFRH3, bind murine TRAIL, suggesting that they are indeed TRAIL receptors. This notion is further supported by our demonstration that both mTNFRH1:Fc and mTNFRH2:Fc fusion proteins inhibited mTRAIL-induced apoptosis of Jurkat cells. Unlike the only other known murine TRAIL receptor mTRAILR2, however, neither mTNFRH2 nor mTNFRH3 has a cytoplasmic region containing the well characterized death domain motif. Coupled with our observation that overexpression of mTNFRH1 and -2 in 293T cells neither induces apoptosis nor triggers NFkappaB activation, we propose that the mTnfrh1 and mTnfrh2 genes encode the first described murine decoy receptors for TRAIL, and we renamed them mDcTrailr1 and -r2, respectively. Interestingly, the overall sequence structures of mDcTRAILR1 and -R2 are quite distinct from those of the known human decoy TRAIL receptors, suggesting that the presence of TRAIL decoy receptors represents a more recent evolutionary event.

Relevance: 10.00%

Abstract:

The sequence profile method (Gribskov M, McLachlan AD, Eisenberg D, 1987, Proc Natl Acad Sci USA 84:4355-4358) is a powerful tool to detect distant relationships between amino acid sequences. A profile is a table of position-specific scores and gap penalties, providing a generalized description of a protein motif, which can be used for sequence alignments and database searches instead of an individual sequence. A sequence profile is derived from a multiple sequence alignment. We have found 2 ways to improve the sensitivity of sequence profiles: (1) Sequence weights: Usage of individual weights for each sequence avoids bias toward closely related sequences. These weights are automatically assigned based on the distance of the sequences using a published procedure (Sibbald PR, Argos P, 1990, J Mol Biol 216:813-818). (2) Amino acid substitution table: In addition to the alignment, the construction of a profile also needs an amino acid substitution table. We have found that in some cases a new table, the BLOSUM45 table (Henikoff S, Henikoff JG, 1992, Proc Natl Acad Sci USA 89:10915-10919), is more sensitive than the original Dayhoff table or the modified Dayhoff table used in the current implementation. Profiles derived by the improved method are more sensitive and selective in a number of cases where previous methods have failed to completely separate true members from false positives.
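
How sequence weights and a substitution table combine into position-specific scores can be sketched as below. This is an illustrative simplification: each profile score is the weighted column frequency of every residue multiplied by its substitution score, with gap penalties omitted. The toy identity table, the three-letter alphabet and the function name `weighted_profile` are assumptions; a real profile would use the full amino acid alphabet and a table such as BLOSUM45, and weights derived by the cited procedure.

```python
def weighted_profile(alignment, weights, alphabet, subst):
    """Build position-specific scores from a gap-free multiple alignment.

    alignment: list of equal-length sequences.
    weights:   one non-negative weight per sequence.
    subst:     function (a, b) -> substitution score.
    Returns one {residue: score} dict per alignment column.
    """
    total = sum(weights)
    profile = []
    for pos in range(len(alignment[0])):
        # Weighted frequency of each residue in this column.
        freq = {b: 0.0 for b in alphabet}
        for seq, w in zip(alignment, weights):
            if seq[pos] in freq:
                freq[seq[pos]] += w / total
        # Score of residue a: expected substitution score against the column.
        profile.append(
            {a: sum(freq[b] * subst(a, b) for b in alphabet) for a in alphabet}
        )
    return profile


def identity(a, b):
    """Toy substitution table: 1 for a match, 0 otherwise."""
    return 1.0 if a == b else 0.0


# Two near-duplicate sequences are down-weighted so they do not dominate.
aln = ["AC", "AC", "GC"]
prof = weighted_profile(aln, [0.25, 0.25, 0.5], "ACG", identity)
print(prof[0])
```

With equal weights the two identical sequences would give residue A a score of 2/3 in the first column; the down-weighting above brings A and G to the same score, which is the bias correction the abstract describes.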

Relevance: 10.00%

Abstract:

Recent technological progress has greatly facilitated de novo genome sequencing. However, de novo assemblies consist of many pieces of contiguous sequence (contigs) arranged in thousands of scaffolds instead of small numbers of chromosomes. Confirming and improving the quality of such assemblies is critical for subsequent analysis. We present a method to evaluate genome scaffolding by aligning independently obtained transcriptome sequences to the genome and visually summarizing the alignments using the Cytoscape software. Applying this method to the genome of the red fire ant Solenopsis invicta allowed us to identify inconsistencies in 7%, confirm contig order in 20% and extend 16% of scaffolds. Scripts that generate tables for visualization in Cytoscape from FASTA sequence and scaffolding information files are publicly available at https://github.com/ksanao/TGNet.

Relevance: 10.00%

Abstract:

FHWA and the Iowa Department of Transportation are proposing geometric and capacity improvements to the Interstate 29 and Interstate 80 mainline in Segment 3 and the I-80/I-29 East System interchange, the South Expressway interchange, the U.S. Highway 275 interchange, and the Madison Avenue interchange to support safe and efficient transportation in the City of Council Bluffs. The Iowa DOT is also proposing to eliminate several railroad alignments and to develop new, consolidated tracks in Segment 3.

Relevance: 10.00%

Abstract:

Selectome (http://selectome.unil.ch/) is a database of positive selection, based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality control procedure on multiple sequence alignments, aiming to provide minimum false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees have some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing.
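
As a rough illustration of how a likelihood test of this kind yields a decision, the sketch below computes a likelihood-ratio p-value from the log-likelihoods of a null model and an alternative model that allows dN/dS > 1. The abstract does not state the test details, so the chi-square reference distribution with one degree of freedom is an assumption based on common practice, and the function name `lrt_pvalue` and the example log-likelihoods are hypothetical.

```python
import math


def lrt_pvalue(lnl_null, lnl_alt):
    """P-value of a likelihood-ratio test, assuming chi-square with 1 df.

    lnl_null: log-likelihood under the null model (no positive selection).
    lnl_alt:  log-likelihood under the alternative model.
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    if statistic <= 0.0:
        # The alternative fits no better than the null.
        return 1.0
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods for one branch of one gene tree.
print(lrt_pvalue(lnl_null=-1000.0, lnl_alt=-995.0))
```

A database-scale analysis would also correct the resulting p-values for multiple testing across branches and gene trees, which is outside this sketch.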

Relevance: 10.00%

Abstract:

We previously introduced two new protein databases (trEST and trGEN) of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Here, we present the updates made on these two databases plus a new database (trome), which uses alignments of EST data to HTG or full genomes to generate virtual transcripts and coding sequences. This new database is of higher quality and since it contains the information in a much denser format it is of much smaller size. These new databases are in a Swiss-Prot-like format and are updated on a weekly basis (trEST and trGEN) or every 3 months (trome). They can be downloaded by anonymous ftp from ftp://ftp.isrec.isb-sib.ch/pub/databases.

Relevance: 10.00%

Abstract:

BACKGROUND: Pseudogenes have long been considered as nonfunctional genomic sequences. However, recent evidence suggests that many of them might have some form of biological activity, and the possibility of functionality has increased interest in their accurate annotation and integration with functional genomics data. RESULTS: As part of the GENCODE annotation of the human genome, we present the first genome-wide pseudogene assignment for protein-coding genes, based on both large-scale manual annotation and in silico pipelines. A key aspect of this coupled approach is that it allows us to identify pseudogenes in an unbiased fashion as well as untangle complex events through manual evaluation. We integrate the pseudogene annotations with the extensive ENCODE functional genomics information. In particular, we determine the expression level, transcription-factor and RNA polymerase II binding, and chromatin marks associated with each pseudogene. Based on their distribution, we develop simple statistical models for each type of activity, which we validate with large-scale RT-PCR-Seq experiments. Finally, we compare our pseudogenes with conservation and variation data from primate alignments and the 1000 Genomes project, producing lists of pseudogenes potentially under selection. CONCLUSIONS: At one extreme, some pseudogenes possess conventional characteristics of functionality; these may represent genes that have recently died. On the other hand, we find interesting patterns of partial activity, which may suggest that dead genes are being resurrected as functioning non-coding RNAs. The activity data of each pseudogene are stored in an associated resource, psiDR, which will be useful for the initial identification of potentially functional pseudogenes.

Relevance: 10.00%

Abstract:

Positive selection is widely estimated from protein coding sequence alignments by the nonsynonymous-to-synonymous ratio omega. Increasingly elaborate codon models are used in a likelihood framework for this estimation. Although there is widespread concern about the robustness of the estimation of the omega ratio, more efforts are needed to estimate this robustness, especially in the context of complex models. Here, we focused on the branch-site codon model. We investigated its robustness on a large set of simulated data. First, we investigated the impact of sequence divergence. We found evidence of underestimation of the synonymous substitution rate for values as small as 0.5, with a slight increase in false positives for the branch-site test. When dS increases further, underestimation of dS is worse, but false positives decrease. Interestingly, the detection of true positives follows a similar distribution, with a maximum for intermediary values of dS. Thus, high dS is more of a concern for a loss of power (false negatives) than for false positives of the test. Second, we investigated the impact of GC content. We showed that there is no significant difference of false positives between high GC (up to ~80%) and low GC (~30%) genes. Moreover, neither shifts of GC content on a specific branch nor major shifts in GC along the gene sequence generate many false positives. Our results confirm that the branch-site is a very conservative test.

Relevance: 10.00%

Abstract:

This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides an easy and intuitive access to the most popular functionality of the package. These include the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high accuracy alignments while using structural or homology derived templates. These three available template modes are Expresso for the alignment of protein with a known 3D-Structure, R-Coffee to align RNA sequences with conserved secondary structures and PSI-Coffee to accurately align distantly related sequences using homology extension. The new server benefits from recent improvements of the T-Coffee algorithm and can align up to 150 sequences as long as 10,000 residues and is available from both http://www.tcoffee.org and its main mirror http://tcoffee.crg.cat.