77 results for sequence identity
in National Center for Biotechnology Information - NCBI
Abstract:
Sequence divergence acts as a potent barrier to homologous recombination; much of this barrier derives from an antirecombination activity exerted by mismatch repair proteins. An inverted repeat assay system with recombination substrates ranging in identity from 74% to 100% has been used to define the relationship between sequence divergence and the rate of mitotic crossing-over in yeast. To elucidate the role of the mismatch repair machinery in regulating recombination between mismatched substrates, we performed experiments in both wild-type and mismatch repair defective strains. We find that a single mismatch is sufficient to inhibit recombination between otherwise identical sequences, and that this inhibition is dependent on the mismatch repair system. Additional mismatches have a cumulative negative effect on the recombination rate. With sequence divergence of up to approximately 10%, the inhibitory effect of mismatches results mainly from antirecombination activity of the mismatch repair system. With greater levels of divergence, recombination is inefficient even in the absence of mismatch repair activity. In both wild-type and mismatch repair defective strains, an approximate log-linear relationship is observed between the recombination rate and the level of sequence divergence.
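The approximate log-linear relationship reported in this abstract can be written compactly. This is only a notational sketch: the symbols R, d, and k are assumptions introduced here, and the abstract gives no fitted values.

```latex
\log_{10} R(d) \approx \log_{10} R(0) - k\,d, \qquad k > 0
```

Here R(d) would be the recombination rate at fractional sequence divergence d, and k a slope that may differ between wild-type and mismatch repair defective strains.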
Abstract:
This report documents the error rate in a commercially distributed subset of the IMAGE Consortium mouse cDNA clone collection. After isolation of plasmid DNA from 1189 bacterial stock cultures, only 62.2% were uncontaminated and contained cDNA inserts that had significant sequence identity to published data for the ordered clones. An agarose gel electrophoresis pre-screening strategy identified 361 stock cultures that appeared to contain two or more plasmid species. Isolation of individual colonies from these stocks demonstrated that 7.1% of the original 1189 stocks contained both a correct and an incorrect plasmid. 5.9% of the original 1189 stocks contained multiple, distinct, incorrect plasmids, indicating the likelihood of multiple contaminating events. While only 739 of the stocks purchased contained the desired cDNA clone, agarose gel pre-screening, colony isolation and similarity searching of dbEST allowed for the identification of an additional 420 clones that would have otherwise been discarded. Considering the high error rate in this subset of the IMAGE cDNA clone set, the use of sequence verified clones for cDNA microarray construction is warranted. When this is not possible, pre-screening non-sequence verified clones with agarose gel electrophoresis provides an inexpensive and efficient method to eliminate contaminated clones from the probe set.
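The headline percentage in this abstract follows directly from the two counts it reports. A minimal arithmetic check, assuming a helper named `percent` (a hypothetical name, not from the source):

```python
def percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


TOTAL_STOCKS = 1189    # stock cultures examined (from the abstract)
CORRECT_STOCKS = 739   # stocks containing the desired cDNA clone (from the abstract)

# Share of purchased stocks that held the desired clone.
print(f"{percent(CORRECT_STOCKS, TOTAL_STOCKS):.1f}%")  # prints 62.2%
```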
Abstract:
Adenosine kinase catalyzes the phosphorylation of adenosine to AMP and hence is a potentially important regulator of extracellular adenosine concentrations. Despite extensive characterization of the kinetic properties of the enzyme, its primary structure has never been elucidated. Full-length cDNA clones encoding catalytically active adenosine kinase were obtained from lymphocyte, placental, and liver cDNA libraries. Corresponding mRNA species of 1.3 and 1.8 kb were noted on Northern blots of all tissues examined and were attributable to alternative polyadenylylation sites at the 3' end of the gene. The encoded protein consists of 345 amino acids with a calculated molecular size of 38.7 kDa and does not contain any sequence similarities to other well-characterized mammalian nucleoside kinases, setting it apart from this family of structurally and functionally related proteins. In contrast, two regions were identified with significant sequence identity to microbial ribokinase and fructokinases and a bacterial inosine/guanosine kinase. Thus, adenosine kinase is a structurally distinct mammalian nucleoside kinase that appears to be akin to sugar kinases of microbial origin.
Abstract:
Expansins are unusual proteins discovered by virtue of their ability to mediate cell wall extension in plants. We identified cDNA clones for two cucumber expansins on the basis of peptide sequences of proteins purified from cucumber hypocotyls. The expansin cDNAs encode related proteins with signal peptides predicted to direct protein secretion to the cell wall. Northern blot analysis showed moderate transcript abundance in the growing region of the hypocotyl and no detectable transcripts in the nongrowing region. Rice and Arabidopsis expansin cDNAs were identified from collections of anonymous cDNAs (expressed sequence tags). Sequence comparisons indicate at least four distinct expansin cDNAs in rice and at least six in Arabidopsis. Expansins are highly conserved in size and sequence (60-87% amino acid sequence identity and 75-95% similarity between any pairwise comparison), and phylogenetic trees indicate that this multigene family formed before the evolutionary divergence of monocotyledons and dicotyledons. Sequence and motif analyses show no similarities to known functional domains that might account for expansin action on wall extension. A series of highly conserved tryptophans may function in expansin binding to cellulose or other glycans. The high conservation of this multigene family indicates that the mechanism by which expansins promote wall extension tolerates little variation in protein structure.
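Several abstracts in this listing quote both percent identity and percent similarity. The sketch below illustrates the difference on an already aligned pair of sequences. It is not the procedure used in any of these papers. The function name, the similarity groups, and the choice to skip gapped columns are all assumptions made for illustration; published analyses typically score similarity with a substitution matrix.

```python
# Hypothetical conservative-substitution groups (an assumption for this sketch).
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumed convention: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment: 3 identical and 2 conservative columns out of 5 compared.
print(identity_and_similarity("MKV-LD", "MRVALE"))  # (60.0, 100.0)
```

Because every identical column also counts as similar, percent similarity is never lower than percent identity under this convention, which matches the pattern of the figures quoted in the abstracts.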
Abstract:
A serpin was identified in normal mammary gland by differential cDNA sequencing. In situ hybridization has detected this serpin exclusively in the myoepithelial cells on the normal and noninvasive mammary epithelial side of the basement membrane and thus was named myoepithelium-derived serine proteinase inhibitor (MEPI). No MEPI expression was detected in the malignant breast carcinomas. MEPI encodes a 405-aa precursor, including an 18-residue secretion signal with a calculated molecular mass of 46 kDa. The predicted sequence of the new protein shares 33% sequence identity and 58% sequence similarity to plasminogen activator inhibitor (PAI)-1 and PAI-2. To determine whether MEPI can modulate the in vivo growth and progression of human breast cancers, we transfected a full-length MEPI cDNA into human breast cancer cells and studied the orthotopic growth of MEPI-transfected vs. control clones in the mammary fat pad of athymic nude mice. Overexpression of MEPI inhibited the invasion of the cells in the in vitro invasion assay. When injected orthotopically into nude mice, the primary tumor volumes, axillary lymph node metastasis, and lung metastasis were significantly inhibited in MEPI-transfected clones as compared with controls. The expression of MEPI in myoepithelial cells may prevent breast cancer malignant progression leading to metastasis.
Abstract:
Protease-activated receptors 1–3 (PAR1, PAR2, and PAR3) are members of a unique G protein-coupled receptor family. They are characterized by a tethered peptide ligand at the extracellular amino terminus that is generated by minor proteolysis. A partial cDNA sequence of a fourth member of this family (PAR4) was identified in an expressed sequence tag database, and the full-length cDNA clone has been isolated from a lymphoma Daudi cell cDNA library. The ORF codes for a seven transmembrane domain protein of 385 amino acids with 33% amino acid sequence identity with PAR1, PAR2, and PAR3. A putative protease cleavage site (Arg-47/Gly-48) was identified within the extracellular amino terminus. COS cells transiently transfected with PAR4 resulted in the formation of intracellular inositol triphosphate when treated with either thrombin or trypsin. A PAR4 mutant in which the Arg-47 was replaced with Ala did not respond to thrombin or trypsin. A hexapeptide (GYPGQV) representing the newly exposed tethered ligand from the amino terminus of PAR4 after proteolysis by thrombin activated COS cells transfected with either wild-type or the mutant PAR4. Northern blot showed that PAR4 mRNA was expressed in a number of human tissues, with high levels being present in lung, pancreas, thyroid, testis, and small intestine. By fluorescence in situ hybridization, the human PAR4 gene was mapped to chromosome 19p12.
Abstract:
Carbon catabolite repression (CCR) of several Bacillus subtilis catabolic genes is mediated by ATP-dependent phosphorylation of histidine-containing protein (HPr), a phosphocarrier protein of the phosphoenolpyruvate (PEP): sugar phosphotransferase system. In this study, we report the discovery of a new B. subtilis gene encoding a HPr-like protein, Crh (for catabolite repression HPr), composed of 85 amino acids. Crh exhibits 45% sequence identity with HPr, but the active site His-15 of HPr is replaced with a glutamine in Crh. Crh is therefore not phosphorylated by PEP and enzyme I, but is phosphorylated by ATP and the HPr kinase in the presence of fructose-1,6-bisphosphate. We determined Ser-46 as the site of phosphorylation in Crh by carrying out mass spectrometry with peptides obtained by tryptic digestion or CNBr cleavage. In a B. subtilis ptsH1 mutant strain, synthesis of β-xylosidase, inositol dehydrogenase, and levanase was only partially relieved from CCR. Additional disruption of the crh gene caused almost complete relief from CCR. In a ptsH1 crh1 mutant, producing HPr and Crh in which Ser-46 is replaced with a nonphosphorylatable alanyl residue, expression of β-xylosidase was also completely relieved from glucose repression. These results suggest that CCR of certain catabolic operons requires, in addition to CcpA, ATP-dependent phosphorylation of Crh, and HPr at Ser-46.
Abstract:
Fractionation of the abundant small ribonucleoproteins (RNPs) of the trypanosomatid Leptomonas collosoma revealed the existence of a group of unidentified small RNPs that were shown to fractionate differently than the well-characterized trans-spliceosomal RNPs. One of these RNAs, an 80-nt RNA, did not possess a trimethylguanosine (TMG) cap structure but did possess a 5′ phosphate terminus and an invariant consensus U5 snRNA loop 1. The gene coding for the RNA was cloned, and the coding region showed 55% sequence identity to the recently described U5 homologue of Trypanosoma brucei [Dungan, J. D., Watkins, K. P. & Agabian, N. (1996) EMBO J. 15, 4016–4029]. The L. collosoma U5 homologue exists in multiple forms of RNP complexes, a 10S monoparticle, and two subgroups of 18S particles that either contain or lack the U4 and U6 small nuclear RNAs, suggesting the existence of a U4/U6⋅U5 tri-small nuclear RNP complex. In contrast to T. brucei U5 RNA (62 nt), the L. collosoma homologue is longer (80 nt) and possesses a second stem–loop. Like the trypanosome U3, U6, and 7SL RNA genes, a tRNA gene coding for tRNACys was found 98 nt upstream of the U5 gene. A potential for base pair interaction between U5 and SL RNA in the 5′ splice site region (positions −1 and +1) and downstream from it is proposed. The presence of a U5-like RNA in trypanosomes suggests that the most essential small nuclear RNPs are ubiquitous for both cis- and trans-splicing, yet even among the trypanosomatids the U5 RNA is highly divergent.
Abstract:
A novel virus, designated swine hepatitis E virus (swine HEV), was identified in pigs. Swine HEV crossreacts with antibody to the human HEV capsid antigen. Swine HEV is a ubiquitous agent and the majority of swine ≥3 months of age in herds from the midwestern United States were seropositive. Young pigs naturally infected by swine HEV were clinically normal but had microscopic evidence of hepatitis, and developed viremia prior to seroconversion. The entire ORFs 2 and 3 were amplified by reverse transcription–PCR from sera of naturally infected pigs. The putative capsid gene (ORF2) of swine HEV shared about 79–80% sequence identity at the nucleotide level and 90–92% identity at the amino acid level with human HEV strains. The small ORF3 of swine HEV had 83–85% nucleotide sequence identity and 77–82% amino acid identity with human HEV strains. Phylogenetic analyses showed that swine HEV is closely related to, but distinct from, human HEV strains. The discovery of swine HEV not only has implications for HEV vaccine development, diagnosis, and biology, but also raises a potential public health concern for zoonosis or xenozoonosis following xenotransplantation with pig organs.
Abstract:
The X and Y chromosomes of the mouse, like those of other mammals, are heteromorphic over most of their length, but at the distal ends of the chromosomes is a region of sequence identity, the pseudoautosomal region (PAR), where the chromosomes pair and recombine during male meiosis. The point at which the PAR diverges into X- and Y-specific sequences is called the pseudoautosomal boundary. We have completed a genomic walk from the X-specific Amelogenin gene to the PAR. Analysis of this region revealed that the pseudoautosomal boundary of mice is located within an intron of a transcribed gene that encodes a novel RING finger protein. The first three of the exons of the gene are located on the X chromosome whereas the 3′ exons of the gene are located on both X and Y chromosomes. This unusual arrangement may indicate that the gene is in a state of transition from pseudoautosomal to X-unique and provides evidence for a process of attrition of the pseudoautosomal region on the Y chromosome.
Abstract:
Mutant presenilins have been found to cause Alzheimer disease. Here, we describe the identification and characterization of HOP-1, a Caenorhabditis elegans presenilin that displays much lower sequence identity with human presenilins than does the other C. elegans presenilin, SEL-12. Despite considerable divergence, HOP-1 appears to be a bona fide presenilin, because HOP-1 can rescue the egg-laying defect caused by mutations in sel-12 when hop-1 is expressed under the control of sel-12 regulatory sequences. HOP-1 also has the essential topological characteristics of the other presenilins. Reducing hop-1 activity in a sel-12 mutant background causes synthetic lethality and terminal phenotypes associated with reducing the function of the C. elegans lin-12 and glp-1 genes. These observations suggest that hop-1 is functionally redundant with sel-12 and underscore the intimate connection between presenilin activity and LIN-12/Notch activity inferred from genetic studies in C. elegans and mammals.
Abstract:
The structural basis of species specificity of transmissible spongiform encephalopathies, such as bovine spongiform encephalopathy or “mad cow disease” and Creutzfeldt–Jakob disease in humans, has been investigated using the refined NMR structure of the C-terminal domain of the mouse prion protein with residues 121–231. A database search for mammalian prion proteins yielded 23 different sequences for the fragment 124–226, which display a high degree of sequence identity and show relevant amino acid substitutions in only 18 of the 103 positions. Except for a unique isolated negative surface charge in the bovine protein, the amino acid differences are clustered in three distinct regions of the three-dimensional structure of the cellular form of the prion protein. Two of these regions represent potential species-dependent surface recognition sites for protein–protein interactions, which have independently been implicated from in vitro and in vivo studies of prion protein transformation. The third region consists of a cluster of interior hydrophobic side chains that may affect prion protein transformation at later stages, after initial conformational changes in the cellular protein.
Abstract:
Caenorhabditis elegans should soon be the first multicellular organism whose complete genomic sequence has been determined. This achievement provides a unique opportunity for a comprehensive assessment of the signal transduction molecules required for the existence of a multicellular animal. Although the worm C. elegans may not much resemble humans, the molecules that regulate signal transduction in these two organisms prove to be quite similar. We focus here on the content and diversity of protein kinases present in worms, together with an assessment of other classes of proteins that regulate protein phosphorylation. By systematic analysis of the 19,099 predicted C. elegans proteins, and thorough analysis of the finished and unfinished genomic sequences, we have identified 411 full length protein kinases and 21 partial kinase fragments. We also describe 82 additional proteins that are predicted to be structurally similar to conventional protein kinases even though they share minimal primary sequence identity. Finally, the richness of phosphorylation-dependent signaling pathways in worms is further supported with the identification of 185 protein phosphatases and 128 phosphoprotein-binding domains (SH2, PTB, STYX, SBF, 14-3-3, FHA, and WW) in the worm genome.
Abstract:
The insulin-like growth factor (IGF) binding proteins (IGFBPs) modulate the actions of the insulin-like growth factors in endocrine, paracrine, and autocrine settings. Additionally, some IGFBPs appear to exhibit biological effects that are IGF independent. The six high-affinity IGFBPs that have been characterized to date exhibit 40–60% amino acid sequence identity overall, with the most conserved sequences in their NH2 and COOH termini. We have recently demonstrated that the product of the mac25/IGFBP-7 gene, which shows significant conservation in the NH2 terminus, including an “IGFBP motif” (GCGCCXXC), exhibits low-affinity IGF binding. The closely related mammalian genes connective tissue growth factor (CTGF) gene, nov, and cyr61 encode secreted proteins that also contain the conserved sequences and IGFBP motifs in their NH2 termini. To ascertain if these genes, along with mac25/IGFBP-7, encode a family of low-affinity IGFBPs, we assessed the IGF binding characteristics of recombinant human CTGF (rhCTGF). The ability of baculovirus-synthesized rhCTGF to bind IGFs was demonstrated by Western ligand blotting, affinity cross-linking, and competitive affinity binding assays using 125I-labeled IGF-I or IGF-II and unlabeled IGFs. CTGF, like mac25/IGFBP-7, specifically binds IGFs, although with relatively low affinity. On the basis of these data, we propose that CTGF represents another member of the IGFBP family (IGFBP-8) and that the CTGF gene, mac25/IGFBP-7, nov, and cyr61 are members of a family of low-affinity IGFBP genes. These genes, along with those encoding the high-affinity IGFBPs 1–6, together constitute an IGFBP superfamily whose products function in IGF-dependent or IGF-independent modes to regulate normal and neoplastic cell growth.
Abstract:
An increasing number of proteins with weak sequence similarity have been found to assume a similar three-dimensional fold and often have similar or related biochemical or biophysical functions. We propose a method for detecting the fold similarity between two proteins with low sequence similarity based on their amino acid properties alone. The method, the proximity correlation matrix (PCM) method, is built on the observation that the physical properties of neighboring amino acid residues in sequence at structurally equivalent positions of two proteins of similar fold are often correlated even when amino acid sequences are different. Hydrophobicity is shown to be the most strongly correlated property for all protein fold classes. The PCM method was tested on 420 proteins belonging to 64 different known folds, each having at least three proteins with little sequence similarity. The method was able to detect fold similarities for 40% of the 420 sequences. Compared with sequence comparison and several fold-recognition methods, the method demonstrates good performance in detecting fold similarities among the proteins with low sequence identity. Applied to the complete genome of Methanococcus jannaschii, the method recognized the folds for 22 hypothetical proteins.
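The observation behind this abstract, that hydrophobicity can stay correlated between sequences that share few identical residues, can be illustrated with a much simpler calculation. The sketch below is not the PCM method, whose details the abstract does not give. As a stand-in, it correlates window-averaged Kyte-Doolittle hydropathy profiles of two equal-length segments. The function names, the window size, and the toy sequences are all assumptions.

```python
import math

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 3) -> list:
    """Sliding-window mean hydropathy; raises KeyError on non-standard residues."""
    values = [KYTE_DOOLITTLE[res] for res in seq.upper()]
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the sequence length")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def profile_correlation(seq_a: str, seq_b: str, window: int = 3) -> float:
    """Correlate the hydropathy profiles of two equal-length segments."""
    if len(seq_a) != len(seq_b):
        raise ValueError("segments must have equal length")
    return pearson(hydropathy_profile(seq_a, window),
                   hydropathy_profile(seq_b, window))


# Toy segments with no identical residue at any position but a similar
# hydrophobic/polar pattern (made-up sequences, not from the paper).
print(profile_correlation("LIVKDEAL", "VLIRENGV"))
```

The two toy segments have 0% identity position by position, yet their profiles correlate strongly, which is the kind of signal the abstract describes at structurally equivalent positions.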