78 results for Multiple Sequence Alignment
Abstract:
Two major pathways of recombination-dependent DNA replication, “join-copy” and “join-cut-copy,” can be distinguished in phage T4: join-copy requires only early and middle genes, but two late proteins, endonuclease VII and terminase, are uniquely important in the join-cut-copy pathway. In wild-type T4, timing of these pathways is integrated with the developmental program and related to transcription and packaging of DNA. In primase mutants, which are defective in origin-dependent lagging-strand DNA synthesis, the late pathway can bypass the lack of primers for lagging-strand DNA synthesis. The exquisitely regulated synthesis of endo VII, and of two proteins from its gene, explains the delay of recombination-dependent DNA replication in primase (as well as topoisomerase) mutants, and the temperature-dependence of the delay. Other proteins (e.g., the single-stranded DNA binding protein and the products of genes 46 and 47) are important in all recombination pathways, but they interact differently with other proteins in different pathways. These homologous recombination pathways contribute to evolution because they facilitate acquisition of any foreign DNA with limited sequence homology during horizontal gene transfer, without requiring transposition or site-specific recombination functions. Partial heteroduplex repair can generate what appears to be multiple mutations from a single recombinational intermediate. The resulting sequence divergence generates barriers to formation of viable recombinants. The multiple sequence changes can also lead to erroneous estimates in phylogenetic analyses.
Abstract:
The guinea pig estrogen sulfotransferase gene has been cloned and compared to three other cloned steroid and phenol sulfotransferase genes (human estrogen sulfotransferase, human phenol sulfotransferase, and guinea pig 3 alpha-hydroxysteroid sulfotransferase). The four sulfotransferase genes demonstrate a common outstanding feature: the splice sites for their 3'-terminal exons are identically located. That is, the 3'-terminal exon splice sites involve a glycine that constitutes the N-terminal glycine of an invariably conserved GXXGXXK motif present in all steroid and phenol sulfotransferases for which primary structures are known. This consistency strongly suggests that all steroid and phenol sulfotransferase genes will be similarly spliced. The GXXGXXK motif forms the active binding site for the universal sulfonate donor 3'-phosphoadenosine 5'-phosphosulfate. Amino acid sequence alignment of 19 cloned steroid and phenol sulfotransferases starting with the GXXGXXK motif indicates that the 3'-terminal exon for each steroid and phenol sulfotransferase gene encodes a similarly sized C-terminal fragment of the protein. Interestingly, on further analysis of the alignment, three distinct amino acid sequence patterns emerge. The presence of the conserved functional GXXGXXK motif suggests that the protein domains encoded by steroid and phenol sulfotransferase 3'-terminal exons have evolved from a common ancestor. Furthermore, it is hypothesized that during the course of evolution, the 3'-terminal exon further diverged into at least three sulfotransferase subdivisions: a phenol or aryl group, an estrogen or phenolic steroid group, and a neutral steroid group.
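The alignment step described above, anchoring each sequence at its GXXGXXK motif and comparing the C-terminal fragment that follows, can be sketched in a few lines. This is only an illustrative sketch: the function name `c_terminal_fragment` and the toy sequences are hypothetical and are not taken from the study, and the regular expression simply encodes the motif as glycine, any two residues, glycine, any two residues, lysine.

```python
import re

# G-x-x-G-x-x-K: glycine, any two residues, glycine, any two residues, lysine.
MOTIF = re.compile(r"G..G..K")


def c_terminal_fragment(protein_seq):
    """Return (motif_start, fragment) for the first GXXGXXK match, or None.

    The fragment runs from the first glycine of the motif to the end of the
    sequence, mirroring an alignment that starts at the motif.
    """
    match = MOTIF.search(protein_seq)
    if match is None:
        return None
    return match.start(), protein_seq[match.start():]


# Hypothetical toy sequences for illustration only (not real sulfotransferases).
toy_sequences = {
    "toy_a": "MSEELRKGTTGWLKEILDA",
    "toy_b": "MAAPLLGHSGSNKVVFQ",
    "toy_c": "MKKLLPPAAQ",  # contains no motif
}

for name, seq in toy_sequences.items():
    print(name, c_terminal_fragment(seq))
```

Comparing the lengths of the returned fragments across sequences is one simple way to check whether the motif-to-terminus region is similarly sized.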
Abstract:
Multiple-complete-digest mapping is a DNA mapping technique based on complete-restriction-digest fingerprints of a set of clones that provides highly redundant coverage of the mapping target. The maps assembled from these fingerprints order both the clones and the restriction fragments. Maps are coordinated across three enzymes in the examples presented. Starting with yeast artificial chromosome contigs from the 7q31.3 and 7p14 regions of the human genome, we have produced cosmid-based maps spanning more than one million base pairs. Each yeast artificial chromosome is first subcloned into cosmids at a redundancy of ×15–30. Complete-digest fragments are electrophoresed on agarose gels, poststained, and imaged on a fluorescent scanner. Aberrant clones that are not representative of the underlying genome are rejected in the map construction process. Almost every restriction fragment is ordered, allowing selection of minimal tiling paths with clone-to-clone overlaps of only a few thousand base pairs. These maps demonstrate the practicality of applying the experimental and software-based steps in multiple-complete-digest mapping to a target of significant size and complexity. We present evidence that the maps are sufficiently accurate to validate both the clones selected for sequencing and the sequence assemblies obtained once these clones have been sequenced by a “shotgun” method.
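Once clones have been ordered on a map, selecting a minimal tiling path can be treated as an interval-cover problem. The sketch below is a generic greedy interval cover, not the authors' software: the function name, the clone coordinates, and the `min_overlap` parameter are all assumptions made for illustration.

```python
def minimal_tiling_path(clones, target_start, target_end, min_overlap=0):
    """Greedy interval cover over mapped clones.

    clones: dict mapping clone name -> (start, end) map coordinates in bp.
    Returns the list of chosen clone names in map order, or None when the
    target interval cannot be covered with the required overlap.
    """
    path = []
    covered_to = target_start
    first = True
    while covered_to < target_end:
        # The first clone must reach back to target_start; every later clone
        # must overlap the previously chosen clone by at least min_overlap.
        limit = covered_to if first else covered_to - min_overlap
        candidates = [
            (end, name)
            for name, (start, end) in clones.items()
            if start <= limit and end > covered_to
        ]
        if not candidates:
            return None
        # Take the eligible clone that extends coverage the farthest.
        end, name = max(candidates)
        path.append(name)
        covered_to = end
        first = False
    return path


# Hypothetical cosmid-sized clones (coordinates in base pairs).
clones = {
    "c1": (0, 40000),
    "c2": (10000, 52000),
    "c3": (37000, 80000),
    "c4": (50000, 90000),
    "c5": (76000, 120000),
}
print(minimal_tiling_path(clones, 0, 120000, min_overlap=2000))
```

Taking the farthest-reaching eligible clone at each step keeps the number of clones low while still enforcing the requested clone-to-clone overlap.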
Abstract:
PALI (release 1.2) contains three-dimensional (3-D) structure-dependent sequence alignments as well as structure-based phylogenetic trees of homologous protein domains in various families. The data set of homologous protein structures has been derived by consulting the SCOP database (release 1.50) and the data set comprises 604 families of homologous proteins involving 2739 protein domain structures with each family made up of at least two members. Each member in a family has been structurally aligned with every other member in the same family (pairwise alignment) and all the members in the family are also aligned using simultaneous superposition (multiple alignment). The structural alignments are performed largely automatically, with manual interventions especially in the cases of distantly related proteins, using the program STAMP (version 4.2). Every family is also associated with two dendrograms, calculated using PHYLIP (version 3.5), one based on a structural dissimilarity metric defined for every pairwise alignment and the other based on similarity of topologically equivalent residues. These dendrograms enable easy comparison of sequence and structure-based relationships among the members in a family. Structure-based alignments with the details of structural and sequence similarities, superposed coordinate sets and dendrograms can be accessed conveniently using a web interface. The database can be queried for protein pairs with sequence or structural similarities falling within a specified range. Thus PALI forms a useful resource to help in analysing the relationship between sequence and structure variation at a given level of sequence similarity. PALI also contains over 653 ‘orphans’ (single member families). Using the web interface involving PSI_BLAST and PHYLIP it is possible to associate the sequence of a new protein with one of the families in PALI and generate a phylogenetic tree combining the query sequence and proteins of known 3-D structure. 
The database with the web interfaced search and dendrogram generation tools can be accessed at http://pauling.mbu.iisc.ernet.in/~pali.
Abstract:
STACK is a tool for detection and visualisation of expressed transcript variation in the context of developmental and pathological states. The datasystem organises and reconstructs human transcripts from available public data in the context of expression state. The expression state of a transcript can include developmental state, pathological association, site of expression and isoform of expressed transcript. STACK consensus transcripts are reconstructed from clusters that capture and reflect the growing evidence of transcript diversity. The comprehensive capture of transcript variants is achieved by the use of a novel clustering approach that is tolerant of sub-sequence diversity and does not rely on pairwise alignment. This is in contrast with other gene indexing projects. STACK is generated at least four times a year and represents the exhaustive processing of all publicly available human EST data extracted from GenBank. This processed information can be explored through 15 tissue-specific categories, a disease-related category and a whole-body index and is accessible via WWW at http://www.sanbi.ac.za/Dbases.html. STACK represents a broadly applicable resource, as it is the only reconstructed transcript database for which the tools for its generation are also broadly available (http://www.sanbi.ac.za/CODES).
Abstract:
By detailed NMR analysis of a human telomere repeating unit, d(CCCTAA), we have found that three distinct tetramers, each of which consists of four symmetric single-strands, slowly exchange in a slightly acidic solution. Our new finding is a novel i-motif topology (T-form) where T4 is intercalated between C1 and C2 of the other duplex. The other two tetramers have a topology where C1 is intercalated between C2 and C3 of the other parallel duplex, resulting in the non-stacking T4 residues (R-form), and a topology where C1 is stacked between C3 and T4 of the other duplex (S-form). From the NMR denaturation profile, the R-form is the most stable of the three structures in the temperature range of 15–50°C, the S-form the second and the T-form the least stable. The thermodynamic parameters indicate that the T-form is the most enthalpically driven and entropically opposed, and its population is increased with decreasing temperature. The T-form structure determined by restrained molecular dynamics calculation suggests that inter-strand van der Waals contacts in the narrow grooves should contribute to the enthalpic stabilization of the T-form.
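The link between an enthalpically driven, entropically opposed form and its growing population at low temperature can be made explicit with the van't Hoff relation. The following is a sketch under assumptions not stated in the abstract: a two-state equilibrium between the R-form and the T-form, and temperature-independent $\Delta H$ and $\Delta S$, defined here as T-form minus R-form values. In the equations, $T$ is the absolute temperature and $R$ the gas constant, not the tetramer forms.

```latex
% Assumed two-state equilibrium between the R-form and T-form tetramers.
% \Delta H and \Delta S are T-form minus R-form values (notation assumed here).
\begin{align}
K(T) &= \frac{[\text{T-form}]}{[\text{R-form}]}
      = \exp\!\left(-\frac{\Delta H - T\,\Delta S}{R\,T}\right), \\
\frac{\mathrm{d}\ln K}{\mathrm{d}T} &= \frac{\Delta H}{R\,T^{2}}.
\end{align}
```

With $\Delta H < 0$ the derivative is negative, so $K$ rises as the temperature falls, while $\Delta S < 0$ lowers $K$ most strongly at high temperature. Both signs are consistent with a T-form population that increases on cooling.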
Abstract:
Cancer/testis (CT) antigens—immunogenic protein antigens that are expressed in testis and a proportion of diverse human cancer types—are promising targets for cancer vaccines. To identify new CT antigens, we constructed an expression cDNA library from a melanoma cell line that expresses a wide range of CT antigens and screened the library with an allogeneic melanoma patient serum known to contain antibodies against two CT antigens, MAGE-1 and NY-ESO-1. cDNA clones isolated from this library identified four CT antigen genes: MAGE-4a, NY-ESO-1, LAGE-1, and CT7. Of these four, only MAGE-4a and NY-ESO-1 proteins had been shown to be immunogenic. LAGE-1 is a member of the NY-ESO-1 gene family, and CT7 is a newly defined gene with partial sequence homology to the MAGE family at its carboxyl terminus. The predicted CT7 protein, however, contains a distinct repetitive sequence at the 5′ end and is much larger than MAGE proteins. Our findings document the immunogenicity of LAGE-1 and CT7 and emphasize the power of serological analysis of cDNA expression libraries in identifying new human tumor antigens.
Abstract:
The partial molecular characterization of multiple sclerosis (MS)-associated retrovirus (MSRV), a novel retrovirus previously called LM7, is reported. MSRV has been isolated repeatedly from leptomeningeal cells, choroid plexus cells, and Epstein–Barr virus-immortalized B cells of MS patients. A strategy based on reverse transcriptase PCR with RNA-purified extracellular virions yielded an initial pol fragment from which other regions of the retroviral genome were subsequently obtained by sequence extension. MSRV-specific PCR primers amplified a pol region from RNA present at the peak of reverse transcriptase activity, coinciding with extracellular viral particles in sucrose density gradients. The same sequence was detected in noncellular RNA from MS patient plasma and in cerebrospinal fluid from untreated MS patients. MSRV is related to, but distinct from, the endogenous retroviral sequence ERV9. Whether MSRV represents an exogenous retrovirus with closely related endogenous elements or a replication-competent, virion-producing, endogenous provirus is as yet unknown. Further molecular epidemiological studies are required to determine precisely the apparent association of virions containing MSRV RNA with MS.
Abstract:
The bovine papillomavirus type 1 (BPV-1) exonic splicing suppressor (ESS) is juxtaposed immediately downstream of BPV-1 splicing enhancer 1 and negatively modulates selection of a suboptimal 3′ splice site at nucleotide 3225. The present study demonstrates that this pyrimidine-rich ESS inhibits utilization of upstream 3′ splice sites by blocking early steps in spliceosome assembly. Analysis of the proteins that bind to the ESS showed that the U-rich 5′ region binds U2AF65 and polypyrimidine tract binding protein, the C-rich central part binds 35- and 54–55-kDa serine/arginine-rich (SR) proteins, and the AG-rich 3′ end binds alternative splicing factor/splicing factor 2. Mutational and functional studies indicated that the most critical region of the ESS maps to the central C-rich core (GGCUCCCCC). This core sequence, along with additional nonspecific downstream nucleotides, is sufficient for partial suppression of spliceosome assembly and splicing of BPV-1 pre-mRNAs. The inhibition of splicing by the ESS can be partially relieved by excess purified HeLa SR proteins, suggesting that the ESS suppresses pre-mRNA splicing by interfering with normal bridging and recruitment activities of SR proteins.
Abstract:
The function of repressor activator protein 1 (Rap1p) at glycolytic enzyme gene upstream activating sequence (UAS) elements in Saccharomyces cerevisiae is to facilitate binding of glycolysis regulatory protein 1 (Gcr1p) at adjacent sites. Rap1p has a modular domain structure. In its amino terminus there is an asymmetric DNA-bending domain, which is distinct from its DNA-binding domain, which resides in the middle of the protein. In the carboxyl terminus of Rap1p lie its silencing and putative activation domains. We carried out a molecular dissection of Rap1p to identify domains contributing to its ability to facilitate binding of Gcr1p. We prepared full-length and three truncated versions of Rap1p and tested their ability to facilitate binding of Gcr1p by gel shift assay. The ability to detect ternary complexes containing Rap1p⋅DNA⋅Gcr1p depended on the presence of binding sites for both proteins in the probe DNA. The DNA-binding domain of Rap1p, although competent to bind DNA, was unable to facilitate binding of Gcr1p. Full-length Rap1p and the amino- and carboxyl-truncated versions of Rap1p were each able to facilitate binding of Gcr1p at an appropriately spaced binding site. Under these conditions, Gcr1p displayed an approximately 4-fold greater affinity for Rap1p-bound DNA than for otherwise identical free DNA. When spacing between Rap1p- and Gcr1p-binding sites was altered by insertion of five nucleotides, the ability to form ternary Rap1p⋅DNA⋅Gcr1p complexes was inhibited by all but the DNA-binding domain of Rap1p itself; however, the ability of each individual protein to bind the DNA probe was unaffected.
Abstract:
A multiple protein–DNA complex formed at a human α-globin locus-specific regulatory element, HS-40, confers an appropriate developmental expression pattern on human embryonic ζ-globin promoter activity in humans and transgenic mice. We show here that introduction of a 1-bp mutation in an NF-E2/AP1 sequence motif converts HS-40 into an erythroid-specific locus-control region. Cis-linkage with this locus-control region, in contrast to the wild-type HS-40, allows erythroid lineage-specific derepression of the silenced human ζ-globin promoter in fetal and adult transgenic mice. Furthermore, ζ-globin promoter activities in adult mice increase in proportion to the number of integrated DNA fragments even at 19 copies/genome. The mutant HS-40 in conjunction with the human ζ-globin promoter thus can be used to direct position-independent and copy number-dependent expression of transgenes in adult erythroid cells. The data also support a model in which competitive DNA binding of different members of the NF-E2/AP1 transcription factor family modulates the developmental stage specificity of an erythroid enhancer. The feasibility of switching embryonic/fetal globin genes back on through the manipulation of nuclear factor binding at a single regulatory DNA motif is discussed.
Abstract:
In mammals, one of the major actions of insulin-like growth factor I (IGF-I) is to increase skeletal growth by stimulating new cartilage formation. IGF-I stimulates chondrocytes in vitro to synthesize new cartilage matrix, measured by enhanced uptake of 35S-sulfate, but the addition of insulin does not produce a similar effect except when added at high concentrations. However, recent studies have shown that, in teleosts, both insulin and IGF-I are potent activators of 35S-sulfate uptake in gill cartilage. To further characterize the growth-promoting activities of these hormones in fish, we have used reverse transcriptase-linked PCR to analyze the expression of insulin receptor (IR) family genes in salmon gill cartilage. Partial cDNA sequences encoding the tyrosine kinase domains from six distinct members of the IR gene family were obtained, and sequence comparisons revealed that four of the cDNAs encoded amino acid sequences that were highly homologous to human IR whereas the encoded sequences from two of the cDNAs were more similar to the human type I IGF receptor (IGF-R). Furthermore, a comparative reverse transcriptase-linked PCR assay revealed that the total expression of the four putative IR mRNAs in gill cartilage was 56% of that found in liver, whereas the expressed amount of the two IGF-R mRNAs was 9-fold higher than in liver. These results suggest that the chondrogenic actions of insulin and IGF-I in fish are mediated by the ligands binding to their cognate receptors. However, further studies will be required to characterize the binding properties and relative contribution of the individual IR and IGF-R genes.
Abstract:
The fundamental process of nucleocytoplasmic transport takes place through the nuclear pore. Peripheral pore structures are presumably poised to interact with transport receptors and their cargo as these receptor complexes first encounter the pore. One such peripheral structure likely to play an important role in nuclear export is the basket structure located on the nuclear side of the pore. At present, Nup153 is the only nucleoporin known to localize to the surface of this basket, suggesting that Nup153 is potentially one of the first pore components an RNA or protein encounters during export. In this study, anti-Nup153 antibodies were used to probe the role of Nup153 in nuclear export in Xenopus oocytes. We found that Nup153 antibodies block three major classes of RNA export, that of snRNA, mRNA, and 5S rRNA. Nup153 antibodies also block the NES protein export pathway, specifically the export of the HIV Rev protein, as well as Rev-dependent RNA export. Not all export was blocked; Nup153 antibodies did not impede the export of tRNA or the recycling of importin β to the cytoplasm. The specific antibodies used here also did not affect nuclear import, whether mediated by importin α/β or by transportin. Overall, the results indicate that Nup153 is crucial to multiple classes of RNA and protein export, being involved at a vital juncture point in their export pathways. This juncture point appears to be one that is bypassed by tRNA during its export. We asked whether a physical interaction between RNA and Nup153 could be observed, using homoribopolymers as sequence-independent probes for interaction. Nup153, unlike four other nucleoporins including Nup98, associated strongly with poly(G) and significantly with poly(U). Thus, Nup153 is unique among the nucleoporins tested in its ability to interact with RNA and must do so either directly or indirectly through an adaptor protein. These results suggest a unique mechanistic role for Nup153 in the export of multiple cargos.
Abstract:
We report automated DNA sequencing in 16-channel microchips. A microchip prefilled with sieving matrix is aligned on a heating plate affixed to a movable platform. Samples are loaded into sample reservoirs by using an eight-tip pipetting device, and the chip is docked with an array of electrodes in the focal plane of a four-color scanning detection system. Under computer control, high voltage is applied to the appropriate reservoirs in a programmed sequence that injects and separates the DNA samples. An integrated four-color confocal fluorescent detector automatically scans all 16 channels. The system routinely yields more than 450 bases in 15 min in all 16 channels. In the best case using an automated base-calling program, 543 bases have been called at an accuracy of >99%. Separations, including automated chip loading and sample injection, normally are completed in less than 18 min. The advantages of DNA sequencing on capillary electrophoresis chips include uniform signal intensity and tolerance of high DNA template concentration. To understand the fundamentals of these unique features we developed a theoretical treatment of cross-channel chip injection that we call the differential concentration effect. We present experimental evidence consistent with the predictions of the theory.