971 results for SEQUENCE ALIGNMENT


Relevance:

70.00%

Publisher:

Abstract:

Historically, morphological features were used as the primary means to classify organisms. However, the age of molecular genetics has allowed us to approach this field from the perspective of the organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in the public data repositories provides the opportunity to look not only at a single gene, but at organisms' entire parts list. Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods to compare proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs which had a reasonable expectation of being related. From these alignments a whole proteome phylogenetic tree was constructed. This method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome. As part of this work, the Database of Related Local Alignments (DaRLA) was created and contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole proteome trees, can also be applied to shared gene content analysis, gene order analysis, and creating individual protein trees. Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using 4 spirochetes. The SCI system performed flawlessly, finding all proteins from one organism against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins from its respective organism and failed to detect small ribosomal proteins between organisms.
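
For readers unfamiliar with the Smith-Waterman step named in this abstract, the following is a minimal sketch of local alignment scoring. It is not the SCI/OCI implementation: the function name and the match, mismatch and linear gap scores are illustrative assumptions, whereas protein comparisons would normally use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Scores are illustrative assumptions (simple match/mismatch, linear gap).
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so cells never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only two rows of the matrix are kept because the sketch reports the score alone; recovering the aligned residues would require storing the full matrix and tracing back from the best cell.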

Relevance:

70.00%

Publisher:

Abstract:

Subunit vaccine discovery is an accepted clinical priority. The empirical approach is time- and labor-consuming and can often end in failure. Rational information-driven approaches can overcome these limitations in a fast and efficient manner. However, informatics solutions require reliable algorithms for antigen identification. All known algorithms use sequence similarity to identify antigens. However, antigenicity may be encoded subtly in a sequence and may not be directly identifiable by sequence alignment. We propose a new alignment-independent method for antigen recognition based on the principal chemical properties of protein amino acid sequences. The method is tested by cross-validation on a training set of bacterial antigens and external validation on a test set of known antigens. The prediction accuracy is 83% for the cross-validation and 80% for the external test set. Our approach is accurate and robust, and provides a potent tool for the in silico discovery of medically relevant subunit vaccines.
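
The abstract does not say how chemical properties are converted into an alignment-independent representation. One possible approach, shown here purely as an assumption, is an auto-covariance transform, which maps a protein of any length to a fixed-length vector that a classifier could consume. The Kyte-Doolittle hydropathy scale is used only as a stand-in property, and the function name and `max_lag` parameter are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def auto_covariance(sequence, max_lag=5, scale=HYDROPATHY):
    """Return auto-covariance of a property along the sequence for lags 1..max_lag.

    Residues missing from the scale are skipped (an assumption of this sketch).
    """
    values = [scale[aa] for aa in sequence if aa in scale]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


if __name__ == "__main__":
    # Sequences of different lengths yield vectors of the same length.
    print(len(auto_covariance("MKTAYIAKQRQISFVK")))
    print(len(auto_covariance("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")))
```

Because the output length depends only on `max_lag`, proteins of different lengths can be compared or classified without aligning them first.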

Relevance:

60.00%

Publisher:

Abstract:

Perez-Losada et al. [1] analyzed 72 complete genomes corresponding to nine mammalian (67 strains) and two avian (5 strains) polyomavirus species using maximum likelihood and Bayesian methods of phylogenetic inference. Because data for 2 of the genomes in their work are no longer available in GenBank, in this work we analyze the phylogenetic relationship of the remaining 70 complete genomes, corresponding to nine mammalian (65 strains) and two avian (5 strains) polyomavirus species, using a dynamical language model approach developed by our group (Yu et al. [26]). This distance method does not require sequence alignment for deriving species phylogeny based on overall similarities of the complete genomes. Our best tree separates the bird polyomaviruses (avian polyomaviruses and goose hemorrhagic polyomaviruses) from the mammalian polyomaviruses, which supports the idea of splitting the genus into two subgenera. Such a split is consistent with the different viral life strategies of each group. In the mammalian polyomavirus subgenus, mouse polyomaviruses (MPV), simian viruses 40 (SV40), BK viruses (BKV) and JC viruses (JCV) are grouped as different branches, as expected. The topology of our best tree is quite similar to that of the tree constructed by Perez-Losada et al.

Relevance:

60.00%

Publisher:

Abstract:

Background: The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. Results: In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using a support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignments obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach were performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-ray diffraction, using 5-fold cross-validation. Selecting a window size of 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance: the prediction accuracy increased from 62.8% with single sequence to 69.8%, and the Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40. Furthermore, if coupled with the predicted secondary structure information by PSIPRED, our method yielded a prediction accuracy of 71.5% and MCC of 0.43, 9% and 0.17 higher, respectively, than those achieved based on the single sequence information. Conclusion: A new method has been developed to predict the proline cis/trans isomerization in proteins based on a support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of the SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and in biological sequence analysis.
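
Two ingredients of this abstract can be made concrete: the local window centered on each proline, and the accuracy and MCC measures used to report performance. The sketch below is illustrative only. The function names, the `X` padding character for windows that run past the sequence ends, and the odd window size are assumptions, and the confusion-matrix counts in the usage lines are invented inputs rather than results from the study.

```python
import math


def proline_windows(sequence, size=11, pad="X"):
    """Yield (position, window) for every proline; size is assumed to be odd."""
    half = size // 2
    padded = pad * half + sequence + pad * half
    for i, residue in enumerate(sequence):
        if residue == "P":
            # In the padded string the proline sits at the window's center.
            yield i, padded[i:i + size]


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    for position, window in proline_windows("MKPLLPAG"):
        print(position, window)
    # Hypothetical counts, chosen only to exercise the functions.
    print(accuracy(40, 30, 20, 10), round(matthews_cc(40, 30, 20, 10), 3))
```

MCC is reported alongside accuracy because cis prolines are the minority class, and accuracy alone can look good for a predictor that rarely predicts cis.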

Relevance:

60.00%

Publisher:

Abstract:

miRDeep and its variants are widely used to quantify known and novel microRNA (miRNA) from small RNA sequencing (RNAseq). This article describes miRDeep*, our integrated miRNA identification tool, which is modeled on miRDeep but improves the precision of detecting novel miRNAs by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphic interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface, which shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to the confidence score. miRDeep* is an integrated standalone application in which sequence alignment, pre-miRNA secondary structure calculation and graphical display are coded purely in Java. The application can be run on a normal personal computer with 1.5 GB of memory. Further, we show that miRDeep* outperformed existing miRNA prediction tools using our LNCaP and other small RNAseq datasets. miRDeep* is freely available online at http://www.australianprostatecentre.org/research/software/mirdeep-star
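
To illustrate what "expression levels, as measured by the number of reads" can mean for SAM input, here is a minimal sketch that counts mapped primary reads per reference name. It is not miRDeep* code (which is written in Java). The function name and the reference names in the sample lines are assumptions, and only standard SAM conventions are relied on: header lines start with `@`, fields are tab-separated, FLAG is field 2, RNAME is field 3, and FLAG bits 0x4, 0x100 and 0x800 mark unmapped, secondary and supplementary records.

```python
from collections import Counter


def count_mapped_reads(sam_lines):
    """Count mapped primary alignments per reference sequence name."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header records
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":
            continue  # unmapped read
        if flag & 0x900:
            continue  # secondary (0x100) or supplementary (0x800) alignment
        counts[rname] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical records; reference names are placeholders.
    sample = [
        "@SQ\tSN:precursor_A\tLN:72",
        "r1\t0\tprecursor_A\t8\t255\t22M\t*\t0\t0\tTAGCTTATCAGACTGATGTTGA\t*",
        "r2\t16\tprecursor_A\t8\t255\t22M\t*\t0\t0\tTAGCTTATCAGACTGATGTTGA\t*",
        "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTACGTACGTACGTAC\t*",
    ]
    print(count_mapped_reads(sample))
```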

Relevance:

60.00%

Publisher:

Abstract:

The P0 protein of poleroviruses and the P1 protein of sobemoviruses suppress the plant's RNA silencing machinery. Here we identified a silencing suppressor protein (SSP), P0PE, in the Enamovirus Pea enation mosaic virus-1 (PEMV-1) and showed that it and the P0s of the poleroviruses Potato leaf roll virus and Cereal yellow dwarf virus have strong local and systemic SSP activity, while the P1 of the Sobemovirus Southern bean mosaic virus suppresses systemic silencing. The nuclear-localized P0PE has no discernible sequence conservation with known SSPs, but proved to be a strong suppressor of local silencing and a moderate suppressor of systemic silencing. Like the P0s from poleroviruses, P0PE destabilizes AGO1, and this action is mediated by an F-box-like domain. Therefore, despite the lack of any sequence similarity, the poleroviral and enamoviral SSPs have a conserved mode of action upon the RNA silencing machinery. © 2012 Elsevier Inc.

Relevance:

60.00%

Publisher:

Abstract:

Two BRCA2-like sequences are present in the Arabidopsis genome. Both genes are expressed in flower buds and encode nearly identical proteins, which contain four BRC motifs. In a yeast two-hybrid assay, the Arabidopsis Brca2 proteins interact with Rad51 and Dmc1. RNAi constructs aimed at silencing the BRCA2 genes at meiosis triggered a reproducible sterility phenotype, which was associated with dramatic meiosis alterations. We obtained the same phenotype upon introduction of RNAi constructs aimed at silencing the RAD51 gene at meiosis in dmc1 mutant plants. The meiotic figures we observed strongly suggest that homologous recombination is highly disturbed in these meiotic cells, leaving aberrant recombination events to repair the meiotic double-strand breaks. The 'brca2' meiotic phenotype was eliminated in spo11 mutant plants. Our experiments point to an essential role of Brca2 at meiosis in Arabidopsis. We also propose a role for Rad51 in the dmc1 context.

Relevance:

60.00%

Publisher:

Abstract:

Coleoptera is the most diverse group of insects with over 360,000 described species divided into four suborders: Adephaga, Archostemata, Myxophaga, and Polyphaga. In this study, we present six new complete mitochondrial genome (mtgenome) descriptions, including a representative of each suborder, and analyze the evolution of mtgenomes from a comparative framework using all available coleopteran mtgenomes. We propose a modification of atypical cox1 start codons based on sequence alignment to better reflect the conservation observed across species as well as findings of TTG start codons in other genes. We also analyze tRNA-Ser(AGN) anticodons, usually GCU in arthropods, and report a conserved UCU anticodon as a possible synapomorphy across Polyphaga. We further analyze the secondary structure of tRNA-Ser(AGN) and present a consensus structure and an updated covariance model that allows tRNAscan-SE (via the COVE software package) to locate and fold these atypical tRNAs with much greater consistency. We also report secondary structure predictions for both rRNA genes based on conserved stems. All six species of beetle have the same gene order as the ancestral insect. We report noncoding DNA regions, including a small gap region of about 20 bp between tRNA-Ser(UCN) and nad1 that is present in all six genomes, and present results of a base composition analysis.

Relevance:

60.00%

Publisher:

Abstract:

Bahia grass, Paspalum notatum, is an important pollen allergen source with a long season of pollination and wide distribution in subtropical and temperate regions. We aimed to characterize the 55 kDa allergen of Bahia grass pollen (BaGP) and ascertain its clinical importance. BaGP extract was separated by 2D-PAGE and immunoblotted with serum IgE of a grass pollen-allergic patient. The amino-terminal protein sequence of the predominant allergen isoform at 55 kDa had similarity with the group 13 allergens of Timothy grass and maize pollen, Phl p 13 and Zea m 13. Four sequences obtained by rapid amplification of the allergen cDNA ends represented multiple isoforms of Pas n 13. The predicted full length cDNA for Pas n 13 encoded a 423 amino acid glycoprotein including a signal peptide of 28 residues and with a predicted pI of 7.0. Tandem mass spectrometry of tryptic peptides of 2D gel spots identified peptides specific to the deduced amino acid sequence for each of the four Pas n 13 cDNAs, representing 47% of the predicted mature protein sequence of Pas n 13. There was 80.6% and 72.6% amino acid identity with Zea m 13 and Phl p 13, respectively. Reactivity with a Phl p 13-specific monoclonal antibody AF6 supported designation of this allergen as Pas n 13. The allergen was purified from BaGP extract by ammonium sulphate precipitation, hydrophobic interaction and size exclusion chromatography. Purified Pas n 13 reacted with serum IgE of 34 of 71 (48%) grass pollen-allergic patients and specifically inhibited IgE reactivity with the 55 kDa band of BaGP for two grass pollen-allergic donors. Four isoforms of Pas n 13 from pI 6.3-7.8 had IgE-reactivity with grass pollen allergic sera. The allergenic activity of purified Pas n 13 was demonstrated by activation of basophils from whole blood of three grass pollen-allergic donors tested but not control donors. 
Pas n 13 is thus a clinically relevant pollen allergen of the subtropical Bahia grass likely to be important in eliciting seasonal allergic rhinitis and asthma in grass pollen-allergic patients.

Relevance:

60.00%

Publisher:

Abstract:

Background: IgE is the pivotal specific effector molecule of allergic reactions, yet it remains unclear whether the elevated production of IgE in atopic individuals is due to superantigen activation of B cell populations, increased antibody class switching to IgE or oligoclonal allergen-driven IgE responses. Objectives: To increase our understanding of the mechanisms driving IgE responses in allergic disease, we examined immunoglobulin variable regions of IgE heavy chain transcripts from three patients with seasonal rhinitis due to grass pollen allergy. Methods: Variable domain of heavy chain-epsilon constant domain 1 cDNAs were amplified from peripheral blood using a two-step semi-nested PCR, cloned and sequenced. Results: The VH gene family usage in subject A was broadly based, but there were two clusters of sequences using genes VH 3-9 and 3-11 with unusually low levels of somatic mutations, 0-3%. Subject B repeatedly used VH 1-69 and subject C repeatedly used VH 1-02, 1-46 and 5a genes. Most clones were highly mutated, being only 86-95% homologous to their germline VH gene counterparts, and somatic mutations were more abundant in the complementarity determining regions than in the framework regions. Multiple sequence alignment revealed both repeated use of particular VH genes and clonal relatedness among clusters of IgE transcripts. Conclusion: In contrast to previous studies, we observed no preferred VH gene common to IgE transcripts of the three subjects allergic to grass pollen. Moreover, most of the VH gene characteristics of the IgE transcripts were consistent with oligoclonal antigen-driven IgE responses.

Relevance:

60.00%

Publisher:

Abstract:

The major diabetes autoantigen, glutamic acid decarboxylase (GAD65), contains a region of sequence similarity, including six identical residues PEVKEK, to the P2C protein of coxsackie B virus, suggesting that cross-reactivity between coxsackie B virus and GAD65 can initiate autoimmune diabetes. We used the human islet cell mAbs MICA3 and MICA4 to identify the Ab epitopes of GAD65 by screening phage-displayed random peptide libraries. The identified peptide sequences could be mapped to a homology model of the pyridoxal phosphate (PLP) binding domain of GAD65. For MICA3, a surface loop containing the sequence PEVKEK and two adjacent exposed helices were identified in the PLP binding domain, as well as a region of the C terminus of GAD65 that has previously been identified as critical for MICA3 binding. To confirm that the loop containing the PEVKEK sequence contributes to the MICA3 epitope, this loop was deleted by mutagenesis. This reduced binding of MICA3 by 70%. Peptide sequences selected using MICA4 were rich in basic or hydroxyl-containing amino acids, and the surface of the GAD65 PLP-binding domain surrounding Lys358, which is known to be critical for MICA4 binding, was likewise rich in these amino acids. Also, the two phage most reactive with MICA4 encoded the motif VALxG, and the reverse of this sequence, LAV, was located in this same region. Thus, we have defined the MICA3 and MICA4 epitopes on GAD65 using the combination of phage display, molecular modeling, and mutagenesis and have provided compelling evidence for the involvement of the PEVKEK loop in the MICA3 epitope.

Relevance:

60.00%

Publisher:

Abstract:

Ross River virus (RRV) is the predominant cause of epidemic polyarthritis in Australia, yet the antigenic determinants are not well defined. We aimed to characterize epitope(s) on RRV-E2 for a panel of monoclonal antibodies (MAbs) that recognize overlapping conformational epitopes on the E2 envelope protein of RRV and that neutralize virus infection of cells in vitro. Phage-displayed random peptide libraries were probed with the MAbs T1E7, NB3C4, and T10C9 using solution-phase and solid-phase biopanning methods. The peptides VSIFPPA and KTAISPT were selected 15 and 6 times, respectively, by all three of the MAbs using solution-phase biopanning. The peptide LRLPPAP was selected 8 times by NB3C4 using solid-phase biopanning; this peptide shares a trio of amino acids with the peptide VSIFPPA. Phage that expressed the peptides VSIFPPA and LRLPPAP were reactive with T1E7 and/or NB3C4, and phage that expressed the peptides VSIFPPA, LRLPPAP, and KTAISPT partially inhibited the reactivity of T1E7 with RRV. The selected peptides resemble regions of RRV-E2 adjacent to sites mutated in neutralization escape variants of RRV derived by culture in the presence of these MAbs (E2 210-219 and 238-245) and an additional region of E2 172-182. Together these sites represent a conformational epitope of E2 that is informative of cellular contact sites on RRV.

Relevance:

60.00%

Publisher:

Abstract:

Antibody screening of phage-displayed random peptide libraries to identify mimotopes of conformational epitopes is promising. However, because interpretations can be difficult, an exemplary system has been used in the present study to investigate whether variation in the peptide sequences of selected phagotopes corresponded with variation in immunoreactivity. The phagotopes, derived using a well-characterized monoclonal antibody, CII-C1, to a known conformational epitope on type II collagen, C1, were tested by direct and inhibition ELISA for reactivity with CII-C1. A multiple sequence alignment algorithm, PILEUP, was used to sort the peptides expressed by the phagotopes into clusters. A model was prepared of the C1 epitope on type II collagen. The 12 selected phagotopes reacted with CII-C1 by both direct ELISA (titres from < 100-11 200) and inhibition ELISA (20-100% inhibition); the reactivity varied according to the peptide sequence and assay format. The differences in reactivity between the phagotopes were mostly in accord with the alignment, by PILEUP, of the peptide sequences. The finding that the phagotopes functionally mimicked the C1 epitope on collagen was validated in that amino acids RRL at the amino terminal of many of the peptides were topographically demonstrable on the model of the C1 epitope. Notably, one phagotope that expressed the widely divergent peptide C-IAPKRHNSA-C also mimicked the C1 epitope, as judged by reactivity in each of the assays used: these included cross-inhibition of CII-C1 reactivity with each of the other phagotopes and inhibition by a synthetic peptide corresponding to that expressed by the most frequently selected phagotope, RRLPFGSQM. Thus, it has been demonstrated that multiple phage-displayed peptides can mimic the same epitope and that observed immunoreactivity of selected phagotopes with the selecting mAb can depend on the primary sequence of the expressed peptide and also on the assay format used.

Relevance:

60.00%

Publisher:

Abstract:

Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern, and it has been criticized for its local optimization. More accurate current software, however, requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: user-friendly software with a graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce the search space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/.
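
The two-stage idea described in this abstract can be sketched as follows. This is not the VIP Barcoding implementation. The first stage here uses plain k-mer frequency profiles with cosine similarity as a simplified stand-in for the composition vector method, and the second stage applies the standard Kimura two-parameter (K2P) formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), to sequences assumed to be already aligned and of equal length. All function names and the `k` and `shortlist` parameters are hypothetical.

```python
import math
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kmer_profile(seq, k=3):
    """Relative k-mer frequencies (simplified stand-in for a composition vector)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


def cosine_similarity(p, q):
    dot = sum(v * q.get(key, 0.0) for key, v in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def k2p_distance(a, b):
    """K2P distance for two pre-aligned sequences; non-ACGT columns are skipped."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for x, y in zip(a, b):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / compared, transversions / compared
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        return math.inf  # too divergent for the formula to apply
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def identify(query, references, k=3, shortlist=2):
    """Stage 1: shortlist by k-mer similarity. Stage 2: nearest neighbour by K2P."""
    qp = kmer_profile(query, k)
    ranked = sorted(
        references,
        key=lambda name: cosine_similarity(qp, kmer_profile(references[name], k)),
        reverse=True,
    )
    return min(ranked[:shortlist], key=lambda name: k2p_distance(query, references[name]))


if __name__ == "__main__":
    # Toy reference set with placeholder names.
    refs = {"sp1": "ACGTACGTAC", "sp2": "TTTTGGGGCC", "sp3": "ACGTACGAAC"}
    print(identify("ACGTACGTAC", refs))
```

The cheap first stage limits how many of the costlier alignment-based comparisons are needed, which is the scalability argument the abstract makes.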

Relevance:

60.00%

Publisher:

Abstract:

The membrane-bound ceruloplasmin homolog hephaestin plays a critical role in intestinal iron absorption. The aims of this study were to clone the rat hephaestin gene and to examine its expression in the gastrointestinal tract in relation to other genes encoding iron transport proteins. The rat hephaestin gene was isolated from intestinal mRNA and was found to encode a protein 96% identical to mouse hephaestin. Analysis by ribonuclease protection assay and Western blotting showed that hephaestin was expressed at high levels throughout the small intestine and colon. Immunofluorescence localized the hephaestin protein to the mature villus enterocytes with little or no expression in the crypts. Variations in iron status had a small but nonsignificant effect on hephaestin expression in the duodenum. The high sequence conservation between rat and mouse hephaestin is consistent with this protein playing a central role in intestinal iron absorption, although its precise function remains to be determined.