54 results for Protein Sequence
Abstract:
Using six lattice types (4×4, 5×5, and 6×6 square lattices; a 3×3×3 cubic lattice; and 2+3+4+3+2 and 4+5+6+5+4 triangular lattices), three alphabet sizes (HP, HNUP, and 20 letters), and two energy functions, the designability of protein structures is calculated based on random sampling of structures and common biased sampling (CBS) of protein sequence space. Three quantities defined to elucidate the designability, namely the stability (average energy gap), foldability, and partnum of the structure, are then calculated. The authors find that whatever the type of lattice, alphabet size, and energy function used, highly designable (preferred) structures emerge. For all cases considered, the local interactions reduce degeneracy and make the designability higher. The designability is sensitive to the lattice type, alphabet size, energy function, and sampling method of the sequence space. Compared with the random sampling method, both the CBS and the Metropolis Monte Carlo sampling methods make the designability higher. The correlation coefficients between the designability, stability, and foldability are mostly larger than 0.5, which demonstrates that these quantities are strongly correlated. The correlation between the designability and the partnum is weaker because the partnum is independent of the energy. The results are useful in practical applications of the designability principle, such as predicting protein tertiary structure.
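The designability calculation summarized above can be illustrated on a toy case. The Python sketch below is not the authors' procedure: it assumes a 3×3 square lattice (smaller than any lattice in the study), the two-letter HP alphabet, exhaustive enumeration instead of random or CBS sampling, and an assumed energy of -1 per non-bonded H-H contact. It takes the designability of a structure, as is common in lattice-model work, to be the number of sequences that have that structure as their unique lowest-energy state. All function names are illustrative.

```python
from itertools import product

def compact_walks(n):
    """Yield every self-avoiding walk that fills an n x n square lattice."""
    def extend(path, seen):
        if len(path) == n * n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            p = (x + dx, y + dy)
            if 0 <= p[0] < n and 0 <= p[1] < n and p not in seen:
                path.append(p)
                seen.add(p)
                yield from extend(path, seen)
                path.pop()
                seen.remove(p)
    for start in product(range(n), repeat=2):
        yield from extend([start], {start})

def contact_map(walk):
    """Return the set of non-bonded lattice neighbours (i, j), i < j."""
    index = {site: i for i, site in enumerate(walk)}
    contacts = set()
    for (x, y), i in index.items():
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbour)
            if j is not None and abs(i - j) > 1:
                contacts.add((min(i, j), max(i, j)))
    return frozenset(contacts)

def designability(n=3):
    """Count, per structure, the HP sequences with that unique ground state.

    Structures are identified by their contact maps, so rotated and
    mirrored copies of a walk are merged automatically.
    """
    structures = list({contact_map(w) for w in compact_walks(n)})
    counts = {s: 0 for s in structures}
    for seq in product("HP", repeat=n * n):
        # Assumed energy function: -1 for each H-H contact, 0 otherwise.
        energies = [-sum(seq[i] == "H" and seq[j] == "H" for i, j in s)
                    for s in structures]
        best = min(energies)
        if energies.count(best) == 1:  # skip degenerate ground states
            counts[structures[energies.index(best)]] += 1
    return counts
```

Calling `designability(3)` returns a dictionary mapping each contact map to its count; sequences whose lowest energy is shared by several structures contribute to none of them, which is why the counts sum to less than the 512 possible sequences.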
Massively parallel sequencing and analysis of expressed sequence tags in a successful invasive plant
Abstract:
Background Invasive species pose a significant threat to global economies, agriculture and biodiversity. Despite progress towards understanding the ecological factors associated with plant invasions, limited genomic resources have made it difficult to elucidate the evolutionary and genetic factors responsible for invasiveness. This study presents the first expressed sequence tag (EST) collection for Senecio madagascariensis, a globally invasive plant species. Methods We used pyrosequencing of one normalized and two subtractive libraries, derived from one native and one invasive population, to generate an EST collection. ESTs were assembled into contigs, annotated by BLAST comparison with the NCBI non-redundant protein database and assigned gene ontology (GO) terms from the Plant GO Slim ontologies. Key Results Assembly of the 221 746 sequence reads resulted in 12 442 contigs. Over 50 % (6183) of 12 442 contigs showed significant homology to proteins in the NCBI database, representing approx. 4800 independent transcripts. The molecular transducer GO term was significantly over-represented in the native (South African) subtractive library compared with the invasive (Australian) library. Based on NCBI BLAST hits and literature searches, 40 % of the molecular transducer genes identified in the South African subtractive library are likely to be involved in response to biotic stimuli, such as fungal, bacterial and viral pathogens. Conclusions This EST collection is the first representation of the S. madagascariensis transcriptome and provides an important resource for the discovery of candidate genes associated with plant invasiveness. The over-representation of molecular transducer genes associated with defence responses in the native subtractive library provides preliminary support for aspects of the enemy release and evolution of increased competitive ability hypotheses in this successful invasive. 
This study highlights the contribution of next-generation sequencing to better understanding the molecular mechanisms underlying ecological hypotheses that are important in successful plant invasions.
Abstract:
Ratites are large, flightless birds and include the ostrich, rheas, kiwi, emu, and cassowaries, along with extinct members, such as moa and elephant birds. Previous phylogenetic analyses of complete mitochondrial genome sequences have reinforced the traditional belief that ratites are monophyletic and tinamous are their sister group. However, in these studies ratite monophyly was enforced in the analyses that modeled rate heterogeneity among variable sites. Relaxing this topological constraint results in strong support for the tinamous (which fly) nesting within ratites. Furthermore, upon reducing base compositional bias and partitioning models of sequence evolution among protein codon positions and RNA structures, the tinamou–moa clade grouped with kiwi, emu, and cassowaries to the exclusion of the successively more divergent rheas and ostrich. These relationships are consistent with recent results from a large nuclear data set, whereas our strongly supported finding of a tinamou–moa grouping further resolves palaeognath phylogeny. We infer flight to have been lost among ratites multiple times in temporally close association with the Cretaceous–Tertiary extinction event. This circumvents requirements for transient microcontinents and island chains to explain discordance between ratite phylogeny and patterns of continental breakup. Ostriches may have dispersed to Africa from Eurasia, putting in question the status of ratites as an iconic Gondwanan relict taxon. [Base composition; flightless; Gondwana; mitochondrial genome; Palaeognathae; phylogeny; ratites.]
Abstract:
We have previously reported the presence of a 70 kDa insulin-like growth factor (IGF)-II-specific binding protein in chicken serum using Western ligand blotting approaches. In order to ascertain the identity of this 70 kDa IGF-II binding species, the protein has been purified from chicken serum using a combination of ion-exchange and gel-permeation chromatography. Interestingly, amino acid sequencing of the purified protein revealed that it has the same N-terminal sequence as chicken vitronectin (VN). The protein has the ability to specifically bind IGF-II and not IGF-I as determined by ligand blotting, cross-linking and competitive binding assay approaches. In addition, the protein binds 125I-des(1-6)-IGF-II, suggesting that the interaction with IGF-II is different from those with other characterized IGF-binding proteins. Importantly, we have ascertained that both human and bovine VN also specifically bind IGF-II. These results are particularly relevant in the light of the recent report that the urokinase-type plasminogen activator receptor, a protein that also binds VN, has been shown to associate with the cation-independent mannose-6-phosphate/IGF-II receptor and suggest a possible role for IGF-II in cell adhesion and invasion.
Abstract:
Background Predicting protein subnuclear localization is a challenging problem. Previous approaches based on non-sequence information, including Gene Ontology annotations and kernel fusion, each have limitations. The aim of this work is twofold: first, to propose a novel individual feature extraction method; second, to develop an ensemble method that improves prediction performance using comprehensive information represented as a high-dimensional feature vector obtained by 11 feature extraction methods. Methodology/Principal Findings A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It considers only feature extraction methods based on amino acid classifications and physicochemical properties. To speed up the system, an automatic search method for the kernel parameter is used. The prediction performance of the method is evaluated on four datasets: the Lei dataset, a multi-localization dataset, the SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on the Lei dataset is 75.2% and that for 9 localizations on the SNL9 dataset is 72.1% in leave-one-out cross validation; the accuracy is 71.7% for the multi-localization dataset and 69.8% for the new independent dataset. Comparisons with existing methods show that the method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of the classification model are further confirmed by permutation analysis. Conclusions The method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method.
It is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
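To make the idea of features "based on amino acid classifications" concrete, here is a minimal Python sketch of two simple sequence encodings: the 20-dimensional amino acid composition and a grouped composition over residue classes. The three-class grouping, the function names and the way the vectors are concatenated are assumptions chosen for illustration; they are not the 11 feature extraction methods or the two-stage SVM described in the abstract.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative three-class grouping (an assumption, not the paper's scheme).
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGPH",
    "charged": "DEKR",
}

def _residues(seq):
    """Keep only the 20 standard residues; reject empty input."""
    residues = [a for a in seq.upper() if a in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return residues

def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    residues = _residues(seq)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]

def class_composition(seq):
    """Fraction of residues falling in each class of CLASSES, in order."""
    residues = _residues(seq)
    return [sum(r in members for r in residues) / len(residues)
            for members in CLASSES.values()]

def feature_vector(seq):
    """Concatenate both encodings into one 23-dimensional vector."""
    return composition(seq) + class_composition(seq)
```

A vector such as `feature_vector("MKVLAAGDE")` could then be passed to any multiclass classifier; combining several such encodings into one long vector is the general pattern the abstract refers to as a high-dimensional feature vector.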
Abstract:
The Schizosaccharomyces pombe Mei2 gene encodes an RNA recognition motif (RRM) protein that stimulates meiosis upon binding a specific non-coding RNA and subsequent accumulation in a “mei2-dot” in the nucleus. We present here the first systematic characterization of the family of proteins with characteristic Mei2-like amino acid sequences. Mei2-like proteins are an ancient eukaryotic protein family with three identifiable RRMs. The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is the most highly conserved of the three RRMs. RRM3 also contains conserved sequence elements at its C-terminus not found in other RRM domains. Single copy Mei2-like genes are present in some fungi, in alveolates such as Paramecium and in the early branching eukaryote Entamoeba histolytica, while plants contain small families of Mei2-like genes. While the C-terminal RRM is highly conserved between plants and fungi, indicating conservation of molecular mechanisms, plant Mei2-like genes have changed biological context to regulate various aspects of developmental pattern formation.
Abstract:
Reactive oxygen species (ROS) are a primary cause of cellular damage that leads to cell death. In cells, protection from ROS-induced damage and maintenance of the redox balance is mediated to a large extent by selenoproteins, a distinct family of proteins that contain selenium in the form of selenocysteine (Sec) within their active site. Incorporation of Sec requires the Sec-insertion sequence element (SECIS) in the 3'-untranslated region of selenoprotein mRNAs and the SECIS-binding protein 2 (SBP2). Previous studies have shown that SBP2 is required for the Sec-incorporation mechanism; however, additional roles of SBP2 in the cell have remained undefined. We herein show that depletion of SBP2 by using antisense oligonucleotides (ASOs) causes oxidative stress and induction of caspase- and cytochrome c-dependent apoptosis. Cells depleted of SBP2 have increased levels of ROS, which lead to cellular stress manifested as 8-oxo-7,8-dihydroguanine (8-oxo-dG) DNA lesions, stress granules, and lipid peroxidation. Small-molecule antioxidants N-acetylcysteine, glutathione, and α-tocopherol only marginally reduced ROS and were unable to rescue cells fully from apoptosis, indicating that apoptosis might be directly mediated by selenoproteins. Our results demonstrate that SBP2 is required for protection against ROS-induced cellular damage and cell survival. Antioxid. Redox Signal. 12, 797–808.
Abstract:
Prior to the completion of the human genome project, the human genome was thought to contain a greater number of genes because it seemed structurally and functionally more complex than those of simpler organisms. This assumption, along with the belief in “one gene, one protein”, was shown to be incorrect. The inequality in the ratio of gene to protein formation gave rise to the theory of alternative splicing (AS). AS is a mechanism by which one gene gives rise to multiple protein products. Numerous databases and online bioinformatic tools are available for the detection and analysis of AS. Bioinformatics provides an important approach to study mRNA and protein diversity by various tools such as expressed sequence tag (EST) sequences obtained from completely processed mRNA. Microarrays and deep sequencing approaches also aid in the detection of splicing events. Initially it was postulated that AS occurred in only about 5% of all genes, but it was later found to be more abundant. Using bioinformatic approaches, the level of AS in human genes was found to be fairly high, with 35-59% of genes having at least one AS form. Our ability to determine and predict AS is important as disorders in splicing patterns may lead to abnormal splice variants resulting in genetic diseases. In addition, the diversity of proteins produced by AS poses a challenge for successful drug discovery and therefore a greater understanding of AS would be beneficial.
Abstract:
The transient leaf assay in Nicotiana benthamiana is widely used in plant sciences, with one application being the rapid assembly of complex multigene pathways that produce new fatty acid profiles. This rapid and facile assay would be further improved if it were possible to simultaneously overexpress transgenes while accurately silencing endogenes. Here, we report a draft genome resource for N. benthamiana spanning over 75% of the 3.1 Gb haploid genome. This resource revealed a two-member NbFAD2 family, NbFAD2.1 and NbFAD2.2, and quantitative RT-PCR (qRT-PCR) confirmed their expression in leaves. FAD2 activities were silenced using hairpin RNAi as monitored by qRT-PCR and biochemical assays. Silencing of endogenous FAD2 activities was combined with overexpression of transgenes via the use of the alternative viral silencing-suppressor protein, V2, from Tomato yellow leaf curl virus. We show that V2 permits maximal overexpression of transgenes but, crucially, also allows hairpin RNAi to operate unimpeded. To illustrate the efficacy of the V2-based leaf assay system, endogenous lipids were shunted from the desaturation of 18:1 to elongation reactions beginning with 18:1 as substrate. These V2-based leaf assays produced ~50% more elongated fatty acid products than p19-based assays. Analyses of small RNA populations generated from hairpin RNAi against NbFAD2 confirm that the siRNA population is dominated by 21 and 22 nt species derived from the hairpin. Collectively, these new tools expand the range of uses and possibilities for metabolic engineering in transient leaf assays. © 2012 Naim et al.
Abstract:
In Arabidopsis thaliana (Arabidopsis), DICER-LIKE1 (DCL1) functions together with the double-stranded RNA binding protein (dsRBP), DRB1, to process microRNAs (miRNAs) from their precursor transcripts prior to their transfer to the RNA-induced silencing complex (RISC). miRNA-loaded RISC directs RNA silencing of cognate mRNAs via ARGONAUTE1 (AGO1)-catalyzed cleavage. Short interfering RNAs (siRNAs) are processed from viral-derived or transgene-encoded molecules of double-stranded RNA (dsRNA) by the DCL/dsRBP partnership, DCL4/DRB4, and are also loaded to AGO1-catalyzed RISC for cleavage of complementary mRNAs. Here, we use an artificial miRNA (amiRNA) technology, transiently expressed in Nicotiana benthamiana, to produce a series of amiRNA duplexes with differing intermolecular thermostabilities at the 5′ end of duplex strands. Analyses of amiRNA duplex strand accumulation and target transcript expression revealed that strand selection (amiRNA and amiRNA*) is directed by asymmetric thermostability of the duplex termini. The duplex strand possessing a lower 5′ thermostability was preferentially retained by RISC to guide mRNA cleavage of the corresponding target transgene. In addition, analysis of endogenous miRNA duplex strand accumulation in Arabidopsis drb1 and drb2345 mutant plants revealed that DRB1 dictates strand selection, presumably by directional loading of the miRNA duplex onto RISC for passenger strand degradation. Bioinformatic and Northern blot analyses of DCL4/DRB4-dependent small RNAs (miRNAs and siRNAs) revealed that small RNAs produced by this DCL/dsRBP combination do not conform to the same terminal thermostability rules as those governing DCL1/DRB1-processed miRNAs. This suggests that small RNA processing in the DCL1/DRB1-directed miRNA and DCL4/DRB4-directed sRNA biogenesis pathways operates via different mechanisms.
Abstract:
The nucleotide sequence of the genomic RNA of barley yellow dwarf virus, PAV serotype was determined except for the 5′-terminal base, and its genome organization deduced. The 5,677 nucleotide genome contains five large open reading frames (ORFs). The genes for the coat protein (1) and the putative viral RNA-dependent RNA polymerase were identified. The latter shows a striking degree of similarity to that of carnation mottle virus (CarMV). By comparison with corona- and retrovirus RNAs, it is proposed that a translational frameshift is involved in expression of the polymerase. An ORF encoding an Mr 49,797 protein (50K ORF) may be translated by in-frame readthrough of the coat protein stop codon. The coat protein, an overlapping 17K ORF, and a 3′ 6.7K ORF are likely to be expressed via subgenomic mRNAs. © 1988 IRL Press Limited.
Abstract:
The P0 protein of poleroviruses and P1 protein of sobemoviruses suppress the plant's RNA silencing machinery. Here we identified a silencing suppressor protein (SSP), P0PE, in the Enamovirus Pea enation mosaic virus-1 (PEMV-1) and showed that it and the P0s of poleroviruses Potato leaf roll virus and Cereal yellow dwarf virus have strong local and systemic SSP activity, while the P1 of Sobemovirus Southern bean mosaic virus suppresses systemic silencing. The nuclear localized P0PE has no discernible sequence conservation with known SSPs, but proved to be a strong suppressor of local silencing and a moderate suppressor of systemic silencing. Like the P0s from poleroviruses, P0PE destabilizes AGO1 and this action is mediated by an F-box-like domain. Therefore, despite the lack of any sequence similarity, the poleroviral and enamoviral SSPs have a conserved mode of action upon the RNA silencing machinery. © 2012 Elsevier Inc.
Abstract:
The complete nucleotide sequence of Subterranean clover mottle virus (SCMoV) genomic RNA has been determined. The SCMoV genome is 4,258 nucleotides in length. It shares most nucleotide and amino acid sequence identity with the genome of Lucerne transient streak virus (LTSV). SCMoV RNA encodes four overlapping open reading frames and has a genome organisation similar to that of Cocksfoot mottle virus (CfMV). ORF1 and ORF4 are predicted to encode single proteins. ORF2 is predicted to encode two proteins that are derived from a -1 translational frameshift between two overlapping reading frames (ORF2a and ORF2b). A search of amino acid databases did not find a significant match for ORF1 and the function of this protein remains unclear. ORF2a contains a motif typical of chymotrypsin-like serine proteases and ORF2b has motifs characteristically present in positive-stranded RNA-dependent RNA polymerases. ORF4 is likely to be expressed from a subgenomic RNA and encodes the viral coat protein. The ORF2a/ORF2b overlapping gene expression strategy used by SCMoV and CfMV is similar to that of the poleroviruses and differs from that of other published sobemoviruses. These results suggest that the sobemoviruses could now be divided into two distinct subgroups based on those that express the RNA-dependent RNA polymerase from a single, in-frame polyprotein, and those that express it via a -1 translational frameshifting mechanism.
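The -1 translational frameshift mentioned above can be illustrated with a short Python sketch. It translates a sequence in the original frame up to a chosen slip point, then steps back one nucleotide and continues in the -1 frame, producing a fusion product. The example sequence, the slip position and the function names are hypothetical; the sequence is a made-up string containing the slippery-type motif TTTAAAC, not SCMoV sequence, and no attempt is made to predict where frameshifting occurs.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

def _normalise(seq):
    """Upper-case the sequence and treat RNA (U) as DNA (T)."""
    return seq.upper().replace("U", "T")

def translate(seq, start=0):
    """Translate from index `start` until a stop codon or the sequence end."""
    seq = _normalise(seq)
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def translate_with_minus1_frameshift(seq, slip_end):
    """Read frame 0 up to `slip_end`, then continue from `slip_end - 1`.

    `slip_end` is the index just past the last codon decoded in frame 0;
    the final nucleotide of that codon is read a second time.
    """
    seq = _normalise(seq)
    if slip_end % 3 or not 3 <= slip_end <= len(seq):
        raise ValueError("slip_end must be a codon boundary inside seq")
    upstream = translate(seq[:slip_end])
    if len(upstream) * 3 != slip_end:
        raise ValueError("stop codon found before the slip point")
    return upstream + translate(seq, start=slip_end - 1)
```

With the hypothetical sequence `ATGGCTTTAAACTGAGGCTATAA`, `translate` stops at the in-frame TGA and yields `MALN`, whereas `translate_with_minus1_frameshift(..., 12)` bypasses that stop and yields the longer fusion `MALNLRL`. This mirrors how a downstream ORF such as a polymerase can be expressed as an extension of the upstream ORF.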
Abstract:
Barley yellow dwarf luteovirus-GPV (BYDV-GPV) is a common problem in Chinese wheat crops but is unrecorded elsewhere. A defining characteristic of GPV is its capacity to be transmitted efficiently by both Schizaphis graminum and Rhopalosiphum padi. This dual aphid species transmission contrasts with those of BYDV-RPV and BYDV-SGV, globally distributed viruses, which are efficiently transmitted only by Rhopalosiphum padi and Schizaphis graminum respectively. The viral RNA sequences encoding the coat protein (22K) gene, the movement protein (17K) gene, the region surrounding the conserved GDD motif of the polymerase gene and the intergenic sequences between these genes were determined for GPV and an Australian isolate of BYDV-RPV (RPVa). In all three genes, the sequences of GPV and RPVa were more similar to those of an American isolate of BYDV-RPV (RPVu) than to any other luteovirus for which data are available. RPVa and RPVu were very similar, especially their coat proteins, which had 97% identity at the amino acid level. The coat protein of GPV had 76% and 78% amino acid identity with RPVa and RPVu respectively. The data suggest that RPVu and RPVa are correctly named as strains of the same serotype and that GPV is sufficiently different from either RPV strain to be considered a distinct BYDV type. The coat protein and movement protein genes of GPV are very dissimilar to SGV. The polymerase sequences of RPVu, RPVa and GPV show close affinities with those of the sobemo-like luteoviruses and little similarity with those of the carmo-like luteoviruses. The sequences of the coat proteins, movement proteins and the polymerase segments of BYDV serotypes, other than RPV and GPV, form a cluster that is separate from their counterpart sequences from dicot-infecting luteoviruses. The RPV and GPV isolates consistently fall within a dicot-infecting cluster. This suggests that RPV and GPV evolved from within this group of viruses.
Since these other viruses all infect dicots it seems likely that their common ancestor infected a dicot and that RPV and GPV evolved from a virus that switched hosts from a dicot to a monocot.