89 results for Protein Sequence Analysis
Abstract:
We completed the genome sequence of Lettuce necrotic yellows virus (LNYV) by determining the nucleotide sequences of the 4a (putative phosphoprotein), 4b, M (matrix protein), G (glycoprotein) and L (polymerase) genes. The genome consists of 12,807 nucleotides and encodes six genes in the order 3' leader-N-4a(P)-4b-M-G-L-5' trailer. Sequences were derived from clones of a cDNA library from LNYV genomic RNA and from fragments amplified using reverse transcription-polymerase chain reaction. The 4a protein has a low isoelectric point characteristic of rhabdovirus phosphoproteins. The 4b protein has significant sequence similarities with the movement proteins of capillo- and trichoviruses and may be involved in cell-to-cell movement. The putative G protein sequence contains a predicted 25-amino-acid signal peptide and endopeptidase cleavage site, three predicted glycosylation sites and a putative transmembrane domain. The deduced L protein sequence shows similarities with the L proteins of other plant rhabdoviruses and contains polymerase module motifs characteristic of RNA-dependent RNA polymerases of negative-strand RNA viruses. Phylogenetic analysis of this motif among rhabdoviruses placed LNYV in a group with other sequenced cytorhabdoviruses, most closely related to Strawberry crinkle virus. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Full-length genome sequences of five virulent and five avirulent strains of Newcastle disease virus (NDV) isolated between 1998 and 2002 in Victoria and New South Wales, Australia were determined. Comparisons between these strains revealed that the coding sequences of the haemagglutinin-neuraminidase (HN), matrix (M) and phosphoprotein (P) genes appeared to be more variable than those of the fusion (F), nucleocapsid (N) and RNA-dependent RNA replicase (L) genes. Sequence analysis of a number of other isolates made during the recent virulent NDV outbreaks also identified a number of variants with altered F gene cleavage sites, which resulted in altered biological properties of those viruses. Quasispecies analysis of a number of field isolates indicated the presence of virulent virus in one particular isolate. Gene sequence analysis of the progenitor virus isolated in 1998 showed very little sequence variation when compared to that of a progenitor-like virus isolated in 2001, demonstrating that, in the field, viral genome sequence variation appears to be biologically restricted to that of a consensus sequence. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Large-scale gene discovery has been performed for the grass fungal endophytes Neotyphodium coenophialum, Neotyphodium lolii, and Epichloe festucae. The resulting sequences have been annotated by comparison with public DNA and protein sequence databases and using intermediate gene ontology annotation tools. Endophyte sequences have also been analysed for the presence of simple sequence repeat and single nucleotide polymorphism molecular genetic markers. Sequences and annotation are maintained within a MySQL database that may be queried using a custom web interface. Two cDNA-based microarrays have been generated from this genome resource. They permit the interrogation of 3806 Neotyphodium genes (Nchip(TM) microarray), and 4195 Neotyphodium and 920 Epichloe genes (EndoChip(TM) microarray), respectively. These microarrays provide tools for high-throughput transcriptome analysis, including genome-specific gene expression studies, profiling of novel endophyte genes, and investigation of the host grass-symbiont interaction. Comparative transcriptome analysis in Neotyphodium and Epichloe was performed. (c) 2006 Elsevier
Abstract:
A single-tube RT-PCR technique generated a 387 bp or 300 bp cDNA amplicon covering the F-0 cleavage site or the carboxyl (C)-terminus of the HN gene, respectively, of Newcastle disease virus (NDV) strain 1-2. Sequence analysis was used to deduce the amino acid sequences of the cleavage site of the F protein and the C-terminus of the HN protein, which were then compared with sequences for other NDV strains. The cleavage site of NDV strain 1-2 had a sequence motif of (112)RKQGRLIG(119), consistent with an avirulent phenotype. Nucleotide sequencing and deduction of amino acids at the C-terminus of HN revealed that strain 1-2 had a 7-amino-acid extension (VEILKDGVREARSSR). This differs from the virulent viruses that caused outbreaks of Newcastle disease in Australia in the 1930s and 1990s, which have HN extensions of 0 and 9 amino acids, respectively. Amino acid sequence analyses of the F and HN genes of strain 1-2 confirmed its avirulent nature and its Australian origin.
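The link between the cleavage-site motif and an avirulent phenotype can be illustrated with a small check. This is a minimal sketch, assuming the commonly cited rule of thumb that virulent NDV strains carry at least three basic residues (R or K) at F protein positions 113-116 together with phenylalanine at position 117. That rule and the function name `looks_virulent` are assumptions for illustration and are not taken from the abstract.

```python
BASIC_RESIDUES = {"R", "K"}  # arginine and lysine


def looks_virulent(motif: str, first_position: int = 112) -> bool:
    """Apply the assumed rule of thumb to an F cleavage-site motif.

    `motif` is read as consecutive residues starting at `first_position`
    (112 for the motif quoted in the abstract).
    """
    # Map each F protein position to its residue.
    residue_at = {first_position + i: aa for i, aa in enumerate(motif)}
    # Count basic residues at positions 113 to 116 inclusive.
    basic_count = sum(residue_at[pos] in BASIC_RESIDUES for pos in range(113, 117))
    # Assumed criterion: three or more basic residues plus F at position 117.
    return basic_count >= 3 and residue_at[117] == "F"


# Motif reported in the abstract for positions 112-119.
print(looks_virulent("RKQGRLIG"))
```

Under this assumed rule the quoted motif is not flagged as virulent, because positions 113-116 hold only two basic residues and position 117 is leucine rather than phenylalanine.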
Abstract:
We present a fast method for finding optimal parameters for a low-resolution (threading) force field intended to distinguish correct from incorrect folds for a given protein sequence. In contrast to other methods, the parameterization uses information from more than 10^7 misfolded structures as well as a set of native sequence-structure pairs. In addition to testing the resulting force field's performance on the protein sequence threading problem, results are shown that characterize the number of parameters necessary for effective structure recognition.
Abstract:
The complete nucleotide sequence of the genomic RNA from the insect picorna-like virus Drosophila C virus (DCV) was determined. The DCV sequence predicts a genome organization different to that of other RNA virus families whose sequences are known. The single-stranded positive-sense genomic RNA is 9264 nucleotides in length and contains two large open reading frames (ORFs) which are separated by 191 nucleotides. The 5' ORF contains regions of similarities with the RNA-dependent RNA polymerase, helicase and protease domains of viruses from the picornavirus, comovirus and sequivirus families. The 3' ORF encodes the capsid proteins as confirmed by N-terminal sequence analysis of these proteins. The capsid protein coding region is unusual in two ways: firstly the cistron appears to lack an initiating methionine and secondly no subgenomic RNA is produced, suggesting that the proteins may be translated through internal initiation of translation from the genomic length RNA. The finding of this novel genome organization for DCV shows that this virus is not a member of the Picornaviridae as previously thought, but belongs to a distinct and hitherto unrecognized virus family.
Abstract:
MHCPEP (http://wehih.wehi.edu.au/mhcpep/) is a curated database comprising over 13 000 peptide sequences known to bind MHC molecules. Entries are compiled from published reports as well as from direct submissions of experimental data. Each entry contains the peptide sequence, its MHC specificity and, where available, experimental method, observed activity, binding affinity, source protein and anchor positions, as well as publication references. The present format of the database allows text string matching searches but can easily be converted for use in conjunction with sequence analysis packages. The database can be accessed via Internet using WWW or FTP.
Abstract:
We describe two ways of optimizing score functions for protein sequence to structure threading. The first method adjusts parameters to improve sequence to structure alignment. The second adjusts parameters so as to improve a score function's ability to rank alignments calculated with the first score function. Unlike those functions known as knowledge-based force fields, the resulting parameter sets do not rely on Boltzmann statistics, have no claim to representing free energies and are purely constructions for recognizing protein folds. The methods give a small improvement, but suggest that functions can be profitably optimized for very specific aspects of protein fold recognition. Proteins 1999;36:454-461. (C) 1999 Wiley-Liss, Inc.
Abstract:
Sausage is a protein sequence threading program with remarkable run-time flexibility. Using different scripts, it can calculate protein sequence-structure alignments, search structure libraries, swap force fields, create models from alignments, convert file formats and analyse results. There are several different force fields which might be classed as knowledge-based, although they do not rely on Boltzmann statistics. Different force fields are used for alignment calculations and subsequent ranking of calculated models.
Abstract:
Over-expression of the c-myb gene and expression of activated forms of myb are known to transform haemopoietic cells, particularly cells of the myeloid lineage. Truncations or mutations that disrupt the negative regulatory domain (NRD) of the Myb protein confer an increased ability to transform cells. Although it has proved difficult to link mutations in c-MYB to human leukaemia, no studies investigating the presence of mutations within the c-MYB NRD have been reported. Therefore, we have performed mutational analysis of this region, using polymerase chain reaction-single-stranded conformation polymorphism and sequence analysis, in 26 patients with acute or chronic myeloid leukaemia. No mutations were detected, indicating that mutation of this region of the Myb protein is not common in the pathogenesis or progression of these diseases.
Abstract:
To identify novel cytokine-related genes, we searched the set of 60,770 annotated RIKEN mouse cDNA clones (FANTOM2 clones), using keywords such as cytokine itself or cytokine names (such as interferon, interleukin, epidermal growth factor, fibroblast growth factor, and transforming growth factor). This search produced 108 known cytokines and cytokine-related products such as cytokine receptors, cytokine-associated genes, or their products (enhancers, accessory proteins, cytokine-induced genes). We found 15 clusters of FANTOM2 clones that are candidates for novel cytokine-related genes. These encoded products with strong sequence similarity to guanylate-binding protein (GBP-5), interleukin-1 receptor-associated kinase 2 (IRAK-2), interleukin 20 receptor alpha isoform 3, a member of the interferon-inducible proteins of the Ifi 200 cluster, four members of the membrane-associated family 1-8 of interferon-inducible proteins, one p27-like protein, and a hypothetical protein containing a Toll/Interleukin receptor domain. All four clones representing novel candidates of gene products from the family contain a novel highly conserved cross-species domain. Clones similar to growth factor-related products included transforming growth factor beta-inducible early growth response protein 2 (TIEG-2), TGFbeta-induced factor 2, integrin beta-like 1, latent TGF-binding protein 4S, and FGF receptor 4B. We performed a detailed sequence analysis of the candidate novel genes to elucidate their likely functional properties.
Abstract:
MHCPEP is a curated database comprising over 9000 peptide sequences known to bind MHC molecules. Entries are compiled from published reports as well as from direct submissions of experimental data. Each entry contains the peptide sequence, its MHC specificity and, when available, experimental method, observed activity, binding affinity, source protein, anchor positions and publication references. The present format of the database allows text string matching searches but can easily be converted for use in conjunction with sequence analysis packages. The database can be accessed via Internet using WWW, FTP or Gopher.
Abstract:
The SH3 domains of src and other nonreceptor tyrosine kinases have been shown to associate with the motif PXXP, where P and X stand for proline and an unspecified amino acid, but a motif that binds to the SH3 domain of myosin has thus far not been characterized. We previously showed that the SH3 domain of Acanthamoeba myosin-IC interacts with the protein Acan125. We now report that the Acan125 protein sequence contains two tandem consensus PXXP motifs near the C terminus. To test for binding, we expressed a polypeptide, AD3p, which includes 344 residues of native C-terminal sequence and a mutant polypeptide, AD3 Delta 977-994p, which lacks the sequence RPKPVPPPRGAKPAPPPR containing both PXXP motifs. The SH3 domain of Acanthamoeba myosin-IC bound AD3p and not AD3 Delta 977-994p, showing that the PXXP motifs are required for SH3 binding. The sequence of Acan125 is related overall to a protein of unknown function coded by Caenorhabditis elegans gene K07G5.1. The K07G5.1 gene product contains a proline-rich segment similar to the SH3 binding motif found in Acan125. The aligned sequences show considerable conservation of leucines and other hydrophobic residues, including the spacing of these residues, which matches a motif for leucine-rich repeats (LRRs). LRR domains have been demonstrated to be sites for ligand binding. Having an LRR domain and an SH3-binding domain, Acan125 and the C. elegans homologue define a novel family of bifunctional binding proteins.
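The PXXP motifs in the deleted segment can be located with a short pattern search. This is a minimal sketch: the sequence is the one quoted in the abstract, while the residue numbering (first residue taken as 977, from the deletion label 977-994) and the helper name `find_pxxp` are assumptions made for illustration.

```python
import re

# Segment quoted in the abstract as missing from the deletion mutant.
SEGMENT = "RPKPVPPPRGAKPAPPPR"
# Assumed numbering: the deletion label 977-994 spans these 18 residues.
FIRST_RESIDUE = 977


def find_pxxp(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based offset, matched text) for every PXXP occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    return [(m.start(), m.group(1)) for m in re.finditer(r"(?=(P..P))", sequence)]


for offset, motif in find_pxxp(SEGMENT):
    print(FIRST_RESIDUE + offset, motif)
```

The search reports two occurrences in the segment, PVPP and PAPP, which agrees with the two tandem PXXP motifs described in the abstract.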
Abstract:
RAD51 colocalizes with both BRCA1 and BRCA2, and genetic variants in RAD51 would be candidate BRCA1/2 modifiers. We searched for RAD51 polymorphisms by sequencing 20 individuals. We compared the polymorphism allele frequencies between female BRCA1/2 mutation carriers with and without breast or ovarian cancer and between population-based ovarian cancer cases with BRCA1/2 mutations to cases and controls without mutations. We discovered two single nucleotide polymorphisms (SNPs) at positions 135 g-->c and 172 g-->t of the 5' untranslated region. In an initial group of BRCA1/2 mutation carriers, 14 (21%) of 67 breast cancer cases carried a c allele at RAD51:135 g-->c, whereas 8 (7%) of 119 women without breast cancer carried this allele. In a second set of 466 mutation carriers from three centers, the association of RAD51:135 g-->c with breast cancer risk was not confirmed. Analyses restricted to the 216 BRCA2 mutation carriers, however, showed a statistically significant association of the 135 c allele with the risk of breast cancer (adjusted odds ratio, 3.2; 95% confidence limit, 1.4-40). BRCA1/2 mutation carriers with ovarian cancer were only about one half as likely to carry the RAD51:135 g-->c SNP. Analysis of the RAD51:135 g-->c SNP in 738 subjects from an Israeli ovarian cancer case-control study was consistent with a lower risk of ovarian cancer among BRCA1/2 mutation carriers with the c allele. We have identified a RAD51 5' untranslated region SNP that may be associated with an increased risk of breast cancer and a lower risk of ovarian cancer among BRCA2 mutation carriers. The biochemical basis of this risk modifier is currently unknown.
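The carrier counts quoted for the initial group can be turned into a crude odds ratio as a quick arithmetic check. This is only a sketch: it computes an unadjusted odds ratio from the 14 of 67 and 8 of 119 counts, which is not the same quantity as the adjusted odds ratio the abstract reports for the BRCA2 subgroup, and the function name `crude_odds_ratio` is hypothetical.

```python
def crude_odds_ratio(carrier_cases: int, total_cases: int,
                     carrier_controls: int, total_controls: int) -> float:
    """Unadjusted odds ratio from a 2x2 table given carrier counts and group sizes."""
    a = carrier_cases                       # cases carrying the allele
    b = total_cases - carrier_cases         # cases without the allele
    c = carrier_controls                    # non-cases carrying the allele
    d = total_controls - carrier_controls   # non-cases without the allele
    return (a * d) / (b * c)


# Counts from the initial group of BRCA1/2 mutation carriers in the abstract.
initial_or = crude_odds_ratio(14, 67, 8, 119)
print(f"carrier frequency in cases:     {14 / 67:.0%}")
print(f"carrier frequency in non-cases: {8 / 119:.0%}")
print(f"crude odds ratio:               {initial_or:.2f}")
```

The carrier frequencies reproduce the 21% and 7% given in the abstract. The crude ratio carries no adjustment for covariates and no confidence interval, so it should be read only as an illustration of how the counts relate.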
Abstract:
Epithelial ovarian carcinoma is often diagnosed at an advanced stage of disease and is the leading cause of death from gynaecological neoplasia. The genetic changes that occur during the development of this carcinoma are poorly understood. It has been proposed that IGFIIR, TGF beta1 and TGF beta RII act as a functional unit in the TGF beta growth inhibitory pathway, and that somatic loss-of-function mutations in any one of these genes could lead to disruption of the pathway and subsequent loss of cell cycle control. We have examined these 3 genes in 25 epithelial ovarian carcinomas using single-stranded conformational polymorphism analysis and DNA sequence analysis. A total of 3 somatic missense mutations were found in the TGF beta RII gene, but none in IGFIIR or TGF beta1. An association was found between TGF beta RII mutations and histology, with 2 out of 3 clear cell carcinomas having TGF beta RII mutations. These data support other evidence from mutational analysis of the PTEN and beta-catenin genes that there are distinct developmental pathways responsible for the progression of different epithelial ovarian cancer histologic subtypes. (C) 2001 Cancer Research Campaign.