67 results for Sequence Alignment

in University of Queensland eSpace - Australia


Relevance:

70.00%

Abstract:

We present a novel maximum-likelihood-based algorithm for estimating the distribution of alignment scores from the scores of unrelated sequences in a database search. Using a new method for measuring the accuracy of p-values, we show that our maximum-likelihood-based algorithm is more accurate than existing regression-based and lookup table methods. We explore a more sophisticated way of modeling and estimating the score distributions (using a two-component mixture model and expectation maximization), but conclude that this does not improve significantly over simply ignoring scores with small E-values during estimation. Finally, we measure the classification accuracy of p-values estimated in different ways and observe that inaccurate p-values can, somewhat paradoxically, lead to higher classification accuracy. We explain this paradox and argue that statistical accuracy, not classification accuracy, should be the primary criterion in comparisons of similarity search methods that return p-values that adjust for target sequence length.
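The estimation idea in this abstract can be illustrated with a minimal sketch. The abstract does not name the score distribution or the fitting procedure, so the following are assumptions: scores are modelled with a Gumbel (extreme value) distribution, the likelihood equation for the scale is solved by bisection, and "ignoring scores with small E-values" is approximated by repeatedly dropping such scores and refitting. The function names and the `e_cutoff` parameter are hypothetical.

```python
import math

def fit_gumbel_ml(scores, iterations=100):
    """ML fit of a Gumbel(mu, beta); the Gumbel model itself is an assumption."""
    n = len(scores)
    mean = sum(scores) / n
    lo_score = min(scores)
    if mean == lo_score:
        raise ValueError("scores must not all be equal")

    def weights(beta):
        # Shift by the minimum so the exponent is never positive (no overflow).
        return [math.exp(-(s - lo_score) / beta) for s in scores]

    def g(beta):
        # ML condition for the scale: beta = mean - weighted mean of scores.
        w = weights(beta)
        return beta - mean + sum(s * wi for s, wi in zip(scores, w)) / sum(w)

    # g is increasing in beta, negative near 0 and non-negative at mean - min.
    low, high = 1e-9 * (mean - lo_score), mean - lo_score
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if g(mid) > 0:
            high = mid
        else:
            low = mid
    beta = 0.5 * (low + high)
    mu = lo_score - beta * math.log(sum(weights(beta)) / n)
    return mu, beta

def gumbel_pvalue(score, mu, beta):
    """P(S >= score) under the fitted Gumbel; expm1 keeps tiny p-values accurate."""
    return -math.expm1(-math.exp(-(score - mu) / beta))

def fit_ignoring_small_evalues(scores, e_cutoff=0.01, rounds=5):
    """Hypothetical trimming loop: drop scores whose E-value is below e_cutoff."""
    kept = list(scores)
    mu, beta = fit_gumbel_ml(kept)
    for _ in range(rounds):
        n = len(kept)
        trimmed = [s for s in kept if n * gumbel_pvalue(s, mu, beta) >= e_cutoff]
        if len(trimmed) == len(kept) or len(trimmed) < 2:
            break
        kept = trimmed
        mu, beta = fit_gumbel_ml(kept)
    return mu, beta
```

Here the E-value of a score is taken as the number of scores times its p-value, which is a simplification; the paper's own treatment of sequence length and of the censored scores may differ.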

Relevance:

60.00%

Abstract:

The Alzheimer's disease amyloid protein precursor (APP) gene is part of a multi-gene super-family from which sixteen homologous amyloid precursor-like proteins (APLP) and APP species homologues have been isolated and characterised. Comparison of exon structure (including the uncharacterised APL-1 gene), construction of phylogenetic trees, and analysis of the protein sequence alignment of known homologues of the APP super-family were performed to reconstruct the evolution of the family and to assess the functional significance of conserved protein sequences between homologues. This analysis supports an adhesion function for all members of the APP super-family, with specificity determined by those sequences which are not conserved between APLP lineages, and provides evidence for an increasingly complex APP super-family during evolution. The analysis also suggests that Drosophila APPL and Caenorhabditis elegans APL-1 may be a fourth APLP lineage, indicating that these proteins, while not functional homologues of human APP, are similarly likely to regulate cell adhesion. Furthermore, the beta A4 sequence is highly conserved only in APP orthologues, strongly suggesting this sequence is of significant functional importance in this lineage. (C) 2000 Elsevier Science Ltd. All rights reserved.

Relevance:

60.00%

Abstract:

Allergies are a major cause of chronic ill health in industrialised countries, with the incidence of reported cases steadily increasing. This Research Focus details how bioinformatics is transforming the field of allergy by providing databases for management of allergen data, algorithms for characterisation of allergic cross-reactivity, structural motifs and B- and T-cell epitopes, tools for prediction of allergenicity and techniques for genomic and proteomic analysis of allergens.

Relevance:

60.00%

Abstract:

Allergy is a major cause of morbidity worldwide. The number of characterized allergens and related information is increasing rapidly creating demands for advanced information storage, retrieval and analysis. Bioinformatics provides useful tools for analysing allergens and these are complementary to traditional laboratory techniques for the study of allergens. Specific applications include structural analysis of allergens, identification of B- and T-cell epitopes, assessment of allergenicity and cross-reactivity, and genome analysis. In this paper, the most important bioinformatic tools and methods with relevance to the study of allergy have been reviewed.

Relevance:

60.00%

Abstract:

Motivation: A consensus sequence for a family of related sequences is, as the name suggests, a sequence that captures the features common to most members of the family. Consensus sequences are important in various DNA sequencing applications and are a convenient way to characterize a family of molecules. Results: This paper describes a new algorithm for finding a consensus sequence, using the popular optimization method known as simulated annealing. Unlike the conventional approach of finding a consensus sequence by first forming a multiple sequence alignment, this algorithm searches for a sequence that minimises the sum of pairwise distances to each of the input sequences. The resulting consensus sequence can then be used to induce a multiple sequence alignment. The time required by the algorithm scales linearly with the number of input sequences and quadratically with the length of the consensus sequence. We present results demonstrating the high quality of the consensus sequences and alignments produced by the new algorithm. For comparison, we also present similar results obtained using ClustalW. The new algorithm outperforms ClustalW in many cases.
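The search described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the distance is taken to be plain Levenshtein edit distance, and the move set (substitute, insert, delete one character), the linear cooling schedule and the parameter names `steps`, `t0` and `seed` are all hypothetical choices.

```python
import math
import random

def edit_distance(a, b):
    """Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def total_distance(candidate, seqs):
    """Objective: sum of distances from the candidate to every input sequence."""
    return sum(edit_distance(candidate, s) for s in seqs)

def mutate(seq, alphabet, rng):
    """Hypothetical move set: one random substitution, insertion or deletion."""
    pos = rng.randrange(len(seq) + 1)
    move = rng.choice(("sub", "ins", "del"))
    if move == "ins" or not seq:
        return seq[:pos] + rng.choice(alphabet) + seq[pos:]
    pos = min(pos, len(seq) - 1)
    if move == "sub":
        return seq[:pos] + rng.choice(alphabet) + seq[pos + 1:]
    return seq[:pos] + seq[pos + 1:]

def anneal_consensus(seqs, alphabet="ACGT", steps=2000, t0=2.0, seed=0):
    """Simulated annealing over candidate consensus strings."""
    rng = random.Random(seed)
    # Start from the input sequence closest to all others.
    current = min(seqs, key=lambda s: total_distance(s, seqs))
    cost = total_distance(current, seqs)
    best, best_cost = current, cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling (assumed)
        cand = mutate(current, alphabet, rng)
        cand_cost = total_distance(cand, seqs)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_cost <= cost or rng.random() < math.exp((cost - cand_cost) / temp):
            current, cost = cand, cand_cost
            if cost < best_cost:
                best, best_cost = current, cost
    return best, best_cost
```

Each evaluation of the objective runs one dynamic-programming comparison per input sequence, which matches the scaling stated in the abstract: linear in the number of sequences and, for inputs of similar length, quadratic in the consensus length.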

Relevance:

60.00%

Abstract:

The three-dimensional structures of leucine-rich repeat (LRR)-containing proteins from five different families were previously predicted based on the crystal structure of the ribonuclease inhibitor, using an approach that combined homology-based modeling, structure-based sequence alignment of LRRs, and several rational assumptions. The structural models have been produced based on very limited sequence similarity, which, in general, cannot yield trustworthy predictions. Recently, the protein structures from three of these five families have been determined. In this report we estimate the quality of the modeling approach by comparing the models with the experimentally determined structures. The comparison suggests that the general architecture, curvature, interior/exterior orientations of side chains, and backbone conformation of the LRR structures can be predicted correctly. On the other hand, the analysis revealed that, in some cases, it is difficult to predict correctly the twist of the overall super-helical structure. Taking into consideration the conclusions from these comparisons, we identified a new family of bacterial LRR proteins and present its structural model. The reliability of the LRR protein modeling suggests that it would be informative to apply similar modeling approaches to other classes of solenoid proteins.

Relevance:

60.00%

Abstract:

The recent discovery of isotrichid-like ciliates occurring as endosymbionts in macropodid marsupials posed interesting questions in regard to both their phyletic origin (all previous records confined to eutherian mammals) and their morphological evolution (Australian forms possibly representing missing links between previously described genera). The SSU rRNA gene was sequenced for three species (Dasytricha dehorityi, D. dogieli, and Batricha tasmaniensis) and aligned against representatives of all major ciliate classes. The Australian species did not group with the other isotrichid species but instead formed an independent radiation. Discrepancies between recent global phylogenies of the phylum Ciliophora were examined by manipulation of the aligned sequence data set. Sources of conflict between these studies did not stem from differences in outgroup choice or phylogenetic reconstruction methods. Differences in the application of confidence limits and primary sequence alignment have probably resulted in the reporting of spurious associations which are not supported by more conservative confidence or alignment methodology. At present, the ciliate subphylum Intramacronucleata is an unresolved polytomy, which may be due to deficiencies in the SSU rRNA gene sequence dataset or may indicate that the ciliates radiated into their extant classes by rapid burst-like evolution. (C) 2001 Academic Press.

Relevance:

60.00%

Abstract:

Five ripening-related ACC synthase cDNA isoforms were cloned from 80% ripe papaya cv. 'Sinta' by reverse transcription-PCR using gene-specific primers. Clone 2 had the longest transcript and contained all common exons and three alternative exons. Clones 3 and 4 contained common exons and one alternative exon each, while clone 1, the most common transcript, contained only the common exons. Clone 5 could be due to cloning artifacts and might not be a unique cDNA fragment. Thus, there are only four isoforms of ACC synthase mRNA. Southern blot analysis indicates that all five clones came from only one gene existing as a single copy in the 'Sinta' papaya genome. Multiple sequence alignment indicates that the four isoforms arise from a single gene, possibly through alternative splicing mechanisms. All the putative alternative exons were present at the 5'-end of the gene comprising the N-terminal region of the protein. 'Sinta' ACC synthase cDNAs were of the capacs 1 type and are most closely related to a 1.4 kb capacs 1-type DNA (AJ277160) from Eksotika papaya. No capacs 2-type cDNAs were cloned from 'Sinta' by RT-PCR. This is the first report of a possible alternative splicing mechanism in ripening-related ACC synthase genes in hybrid papaya, possibly to modulate or fine-tune gene expression relevant to fruit ripening.

Relevance:

60.00%

Abstract:

Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They also were observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We represent proteins as points in a generalized hydropathy space, represented by vectors of specifically defined features. The features are derived from hydropathy of the individual amino acids. Projection of this space onto principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by the analysis of hydropathy distribution, without the need for sequence alignment. (C) 2005 Wiley-Liss, Inc.
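A rough sketch of the pipeline in this abstract, alignment-free features followed by projection onto a principal axis, is given below. The abstract does not define its features, so the ones used here (mean Kyte-Doolittle hydropathy over equal-length segments) are purely hypothetical stand-ins, as are the function names and the `n_segments` parameter. The principal axis is found by power iteration to keep the example dependency-free.

```python
import math

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_features(seq, n_segments=4):
    """Hypothetical features: mean hydropathy of n equal-length segments."""
    values = [KD[a] for a in seq if a in KD]
    feats = []
    for k in range(n_segments):
        start = k * len(values) // n_segments
        end = (k + 1) * len(values) // n_segments
        chunk = values[start:end]
        feats.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return feats

def first_principal_axis(rows, iterations=200):
    """Leading eigenvector of the covariance matrix, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / n for j in range(d)]
           for i in range(d)]
    axis = [1.0 + 0.1 * j for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        axis = [v / norm for v in nxt]
    # Coordinate of every protein along the leading axis.
    scores = [sum(c[j] * axis[j] for j in range(d)) for c in centred]
    return axis, scores
```

Proteins with similar hydropathy profiles receive nearby coordinates on the leading axis, so groups can be read off without aligning the sequences. A real analysis would presumably use more features and more than one axis.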

Relevance:

60.00%

Abstract:

A 16S rRNA gene database (http://greengenes.bl.gov) addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies. It was found that there is incongruent taxonomic nomenclature among curators even at the phylum level. Putative chimeras were identified in 3% of environmental sequences and in 0.2% of records derived from isolates. Environmental sequences were classified into 100 phylum-level lineages in the Archaea and Bacteria.

Relevance:

60.00%

Abstract:

At present, little is known about signal transduction mechanisms in schistosomes, which cause schistosomiasis. The mitogen-activated protein kinase (MAPK) signaling pathways, which are evolutionarily conserved from yeast to Homo sapiens, play key roles in multiple cellular processes. Here, we reconstructed the hypothetical MAPK signaling pathways in Schistosoma japonicum and compared the schistosome pathways with those of model eukaryote species. We identified 60 homologous components in the S. japonicum MAPK signaling pathways. Among these, 27 were predicted to be full-length sequences. Phylogenetic analysis of these proteins confirmed the evolutionary conservation of the MAPK signaling pathways. Remarkably, we identified S. japonicum homologues of GTP-binding protein beta and alpha-I subunits in the yeast mating pathway, which might also be involved in the regulation of different life stages and female sexual maturation processes in schistosomes. In addition, several pathway member genes, including ERK, JNK, Sja-DSP, MRAS and RAS, were determined through quantitative PCR analysis to be expressed in a stage-specific manner, with ERK, JNK and their inhibitor Sja-DSP markedly upregulated in adult female schistosomes. (c) 2006 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

Relevance:

60.00%

Abstract:

Scorpion toxins are important physiological probes for characterizing ion channels. Molecular databases have limited functional annotation of scorpion toxins. Their function can be inferred by searching for conserved motifs in sequence signature databases that are derived statistically but are not necessarily biologically relevant. Mutation studies provide biological information on residues and positions important for structure-function relationship but are not normally used for extraction of binding motifs. 3D structure analyses also aid in the extraction of peptide motifs in which non-contiguous residues are clustered spatially. Here we present new, functionally relevant peptide motifs for ion channels, derived from the analyses of scorpion toxin native and mutant peptides. Copyright (c) 2006 European Peptide Society and John Wiley & Sons, Ltd.