13 results for Single sequence repeat
in DigitalCommons@The Texas Medical Center
Abstract:
The molecular mechanisms responsible for the expansion and deletion of trinucleotide repeat sequences (TRS) are the focus of our studies. Several hereditary neurological diseases including Huntington's disease, myotonic dystrophy, and fragile X syndrome are associated with the instability of TRS. Using the well-defined and controllable model system of Escherichia coli, the influences of three types of DNA incisions on genetic instability of CTG•CAG repeats were studied: DNA double-strand breaks (DSB), single-strand nicks, and single-strand gaps. The DNA incisions were generated in pUC19 derivatives by in vitro cleavage with restriction endonucleases. The cleaved DNA was then transformed into E. coli parental and mutant strains. Double-strand breaks induced deletions throughout the TRS region in an orientation-dependent manner relative to the origin of replication. The extent of instability was enhanced by the repeat length and sequence (CTG•CAG vs. CGG•CCG). Mutations in recA and recBC increased deletions, mutations in recF stabilized the TRS, whereas mutations in ruvA had no effect. DSB were repaired by intramolecular recombination rather than by an intermolecular gene conversion or crossover mechanism. Gaps of 30 nt formed a distinct 30 nt deletion product, whereas single-strand nicks and gaps of 15 nt did not induce expansions or deletions. Formation of this deletion product required the CTG•CAG repeats to be present in the single-stranded region and was stimulated by E. coli DNA ligase, but was not dependent upon the RecFOR pathway. Models are presented to explain the DSB-induced instabilities and formation of the 30 nt deletion product. In addition to the in vitro creation of DSBs, several attempts were made to generate this incision in vivo using EcoRI restriction-modification systems.
Abstract:
The LIM domain-binding protein Ldb1 is an essential cofactor of LIM-homeodomain (LIM-HD) and LIM-only (LMO) proteins in development. The stoichiometry of Ldb1, LIM-HD, and LMO proteins is tightly controlled in the cell and is likely a critical determinant of their biological actions. Single-stranded DNA-binding proteins (SSBPs) were recently shown to interact with Ldb1 and are also important in developmental programs. We establish here that two mammalian SSBPs, SSBP2 and SSBP3, contribute to an erythroid DNA-binding complex that contains the transcription factors Tal1 and GATA-1, the LIM domain protein Lmo2, and Ldb1 and binds a bipartite E-box-GATA DNA sequence motif. In addition, SSBP2 was found to augment transcription of the Protein 4.2 (P4.2) gene, a direct target of the E-box-GATA-binding complex, in an Ldb1-dependent manner and to increase endogenous Ldb1 and Lmo2 protein levels, E-box-GATA DNA-binding activity, and P4.2 and beta-globin expression in erythroid progenitors. Finally, SSBP2 was demonstrated to inhibit Ldb1 and Lmo2 interaction with the E3 ubiquitin ligase RLIM, prevent RLIM-mediated Ldb1 ubiquitination, and protect Ldb1 and Lmo2 from proteasomal degradation. These results define a novel biochemical function for SSBPs in regulating the abundance of LIM domain and LIM domain-binding proteins.
Abstract:
Lyme disease Borrelia can infect humans and animals for months to years, despite the presence of an active host immune response. The vls antigenic variation system, which expresses the surface-exposed lipoprotein VlsE, plays a major role in B. burgdorferi immune evasion. Gene conversion between vls silent cassettes and the vlsE expression site occurs at high frequency during mammalian infection, resulting in sequence variation in the VlsE product. In this study, we examined vlsE sequence variation in B. burgdorferi B31 during mouse infection by analyzing 1,399 clones isolated from bladder, heart, joint, ear, and skin tissues of mice infected for 4 to 365 days. The median number of codon changes increased progressively in C3H/HeN mice from 4 to 28 days post infection, and no clones retained the parental vlsE sequence at 28 days. In contrast, the decrease in the number of clones with the parental vlsE sequence and the increase in the number of sequence changes occurred more gradually in severe combined immunodeficiency (SCID) mice. Clones containing a stop codon were isolated, indicating that continuous expression of full-length VlsE is not required for survival in vivo; also, these clones continued to undergo vlsE recombination. Analysis of clones with apparent single recombination events indicated that recombinations into vlsE are nonselective with regard to the silent cassette utilized, as well as the length and location of the recombination event. Sequence changes as small as one base pair were common. Fifteen percent of recovered vlsE variants contained "template-independent" sequence changes, which clustered in the variable regions of vlsE. We hypothesize that the increased frequency and complexity of vlsE sequence changes observed in clones recovered from immunocompetent mice (as compared with SCID mice) is due to rapid clearance of relatively invariant clones by variable region-specific anti-VlsE antibody responses.
Abstract:
In this study, we present a trilocus sequence typing (TLST) scheme based on intragenic regions of two antigenic genes, ace and salA (encoding a collagen/laminin adhesin and a cell wall-associated antigen, respectively), and a gene associated with antibiotic resistance, lsa (encoding a putative ABC transporter), for subspecies differentiation of Enterococcus faecalis. Each of the alleles was analyzed using 50 E. faecalis isolates representing 42 diverse multilocus sequence types (ST(M); based on seven housekeeping genes) and four groups of clonally linked (by pulsed-field gel electrophoresis [PFGE]) isolates. The allelic profiles and/or concatenated sequences of the three genes agreed with multilocus sequence typing (MLST) results for typing of 49 of the 50 isolates; in addition to the one exception, two isolates were found to have identical TLST types but were single-locus variants (differing by a single nucleotide) by MLST and were therefore also classified as clonally related by MLST. TLST was also comparable to PFGE for establishing short-term epidemiological relationships, typing all isolates classified as clonally related by PFGE with the same type. TLST was then applied to representative isolates (of each PFGE subtype and isolation year) of a collection of 48 hospital isolates and demonstrated the same relationships between isolates of an outbreak strain as those found by MLST and PFGE. In conclusion, the TLST scheme described here was shown to be successful for investigating short-term epidemiology in a hospital setting and may provide an alternative to MLST for discriminating isolates.
Abstract:
I studied the apolipoprotein (apo) B 3′ variable number tandem repeat (VNTR) and ran computer simulations of the stepwise mutation model to address four questions: (1) How did the apo B VNTR originate? (2) What is the mutational mechanism of repeat number change at the apo B VNTR? (3) To what extent are population and molecular level events responsible for the determination of the contemporary apo B allele frequency distribution? (4) Can VNTR allele frequency distributions be explained by a simple and conservative mutation-drift model? I used three general approaches to address these questions: (1) I characterized the apo B VNTR region in non-human primate species; (2) I constructed haplotypes of polymorphic markers flanking the apo B VNTR in a sample of individuals from Lorraine, France, and studied the associations between the flanking-marker haplotypes and apo B VNTR size; (3) I ran computer simulations of the one-step stepwise mutation model and compared the results to real data in terms of four allele frequency distribution characteristics.
The results of this work have allowed me to conclude that the apo B VNTR originated after an initial duplication of a sequence which is still present as a single-copy sequence in New World monkey species. I conclude that this locus did not originate by the transposition of an array of repeats from somewhere else in the genome. It is unlikely that recombination is the primary mutational mechanism. Furthermore, the clustered nature of these associations implicates a stepwise mutational mechanism. From the high frequencies of certain haplotype-allele size combinations, it is evident that population level events have also been important in the determination of the apo B VNTR allele frequency distribution.
Results from computer simulations of the one-step stepwise mutation model have allowed me to conclude that bimodal and multimodal allele frequency distributions are not unexpected at loci evolving via stepwise mutation mechanisms. Short tandem repeat loci fit the stepwise mutation model best, followed by microsatellite loci. I therefore conclude that there are differences in the mutational mechanisms of VNTR loci as classed by repeat unit size. (Abstract shortened by UMI.)
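The one-step stepwise mutation model mentioned in this abstract can be illustrated with a minimal sketch. It is not the author's simulation: the Wright-Fisher resampling scheme, the population size, the mutation rate, the starting repeat count, and the function names (`simulate_stepwise`, `heterozygosity`, `count_modes`) are all assumptions chosen for illustration, and the abstract does not say which four distribution characteristics were compared.

```python
import random
from collections import Counter


def simulate_stepwise(pop_size=200, generations=500, mu=0.01,
                      start_repeats=30, min_repeats=1, seed=0):
    """Drift plus one-step stepwise mutation (illustrative parameters only)."""
    rng = random.Random(seed)
    population = [start_repeats] * pop_size
    for _ in range(generations):
        # Drift: each allele in the next generation copies a random parent.
        parents = rng.choices(population, k=pop_size)
        population = []
        for repeats in parents:
            # Mutation: gain or lose exactly one repeat unit with probability mu.
            if rng.random() < mu:
                repeats = max(min_repeats, repeats + rng.choice((-1, 1)))
            population.append(repeats)
    return Counter(population)  # repeat count -> number of allele copies


def heterozygosity(counts):
    """Expected heterozygosity, 1 minus the sum of squared allele frequencies."""
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def count_modes(counts):
    """Count local peaks in the allele-size distribution (ties count once)."""
    modes = 0
    for size in range(min(counts), max(counts) + 1):
        here = counts.get(size, 0)
        if here > counts.get(size - 1, 0) and here >= counts.get(size + 1, 0):
            modes += 1
    return modes
```

Calling `count_modes(simulate_stepwise(seed=s))` over many seeds gives a rough sense of how often a single mutation-drift process yields more than one peak, which is the kind of question the abstract raises about bimodal and multimodal distributions.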
Abstract:
Myotonic dystrophy (DM), an autosomal dominant disorder mapping to human chromosome 19q13.3, is the most common neuromuscular disease in human adults.
Following the identification of the mutation underlying the DM phenotype, an unstable (CTG)n trinucleotide repeat in the 3′ untranslated region (UTR) of a gene encoding a ser/thr protein kinase named DM protein kinase (DMPK), the study was targeted at two questions: (1) the identification of the disease-causing mechanism(s) of the unstable repeat, and at a more basic level, (2) the identification of the origin and the mechanism(s) involved in repeat instability. The first goal was to identify the pathophysiological mechanisms of the (CTG)n repeat.
The normal repeat is transcribed but not translated; therefore, initial studies centered on the effect on RNA transcript levels. The vast majority of individuals affected by DM are heterozygous for the mutant expansion, so that the normal allele interferes with the analysis of the mutant allele. A quantitative allele-specific RT-PCR procedure was developed and applied to a spectrum of patient tissue samples and cell lines. Equal levels of unprocessed pre-mRNA were determined for the wild type (+) and disease (DM) alleles in skeletal muscle and cell lines of heterozygous DM patients, indicating that any nucleosome binding has no effect at the level of transcriptional initiation and transcription of the mutant DMPK locus. In contrast, processed mRNA levels from the DM allele were reduced relative to the + allele as the size of the expansion increased. The unstable repeat, therefore, impairs post-transcriptional processing of DM allele transcripts.
This phenomenon has profound effects on overall DMPK locus steady-state transcript levels in cells missing a wild type allele and does not appear to be mediated by imprinting, decreased mRNA stability, generation of aberrant splice forms, or absence of polyadenylation of the mutant allele.
In Caucasian DM subjects, the unstable repeat is in complete linkage disequilibrium with a single haplotype composed of nine alleles within and flanking DMPK over a physical distance of 30 kb. A detailed haplotype analysis of the DM region was conducted on a Nigerian (Yoruba) DM family, the only indigenous sub-Saharan DM case reported to date. Each affected member of this family had an expanded (CTG)n repeat in one of their DMPK alleles. However, unlike all other DM populations studied thus far, disassociation of the (CTG)n repeat expansion from other alleles of the putative predisposing haplotype was found. Thus, the expanded (CTG)n repeat in this family was the result of an independent mutational event. Consequently, the origin of DM is unlikely to be the result of a single mutational event, and the hypothesis that a single ancestral haplotype predisposes to repeat expansion is not compelling. (Abstract shortened by UMI.)
Abstract:
We developed a novel combinatorial method termed restriction endonuclease protection selection and amplification (REPSA) to identify consensus binding sites of DNA-binding ligands. REPSA uses a unique enzymatic selection based on the inhibition of cleavage by a type IIS restriction endonuclease, an enzyme that cleaves DNA at a site distal from its recognition sequence. Sequences bound by a ligand are protected from cleavage while unprotected sequences are cleaved. This enzymatic selection occurs in solution under mild conditions and is dependent only on the DNA-binding ability of the ligand. Thus, REPSA is useful for a broad range of ligands including all classes of DNA-binding ligands, weakly binding ligands, mixed populations of ligands, and unknown ligands. Here I describe REPSA and the application of this method to select the consensus DNA-binding sequences of three representative DNA-binding ligands: a nucleic acid (triplex-forming single-stranded DNA), a protein (the TATA-binding protein), and a small molecule (Distamycin A). These studies generated new information regarding the specificity of these ligands in addition to establishing their DNA-binding sequences.
Abstract:
Historically, morphological features were used as the primary means to classify organisms. However, the age of molecular genetics has allowed us to approach this field from the perspective of the organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in the public data repositories provides the opportunity to look not only at a single gene, but at organisms' entire parts list.
Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods to compare proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs which had a reasonable expectation of being related. From these alignments a whole proteome phylogenetic tree was constructed. This method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome.
As part of this work, the Database of Related Local Alignments (DaRLA) was created and contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole proteome trees, can also be applied to shared gene content analysis, gene order analysis, and the creation of individual protein trees.
Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using 4 spirochetes.
The SCI system performed flawlessly, finding all proteins from one organism against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins from its respective organism and failed to detect small ribosomal proteins between organisms.
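For readers unfamiliar with the Smith-Waterman alignments this abstract relies on, the following is a minimal sketch of the local-alignment scoring recurrence. It is not the SCI or OCI computation, which the abstract does not define. The match, mismatch, and linear gap values are placeholder assumptions, and the function name `smith_waterman_score` is hypothetical; protein comparisons normally use a substitution matrix and often affine gap penalties instead.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not those of the thesis.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at zero lets an alignment restart anywhere, which is
            # what makes the alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Only two rows of the table are kept because the sketch returns the score alone; recovering the aligned residues would require storing the full table and tracing back from the best cell.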
Abstract:
Friedreich's ataxia is caused by the expansion of the GAA•TTC trinucleotide repeat sequence located in intron 1 of the frataxin gene. The long GAA•TTC repeats are known to form several non-B DNA structures including hairpins, triplexes, parallel DNA and sticky DNA. Therefore, it is believed that alternative DNA structures play a role in the loss of mRNA transcript and functional frataxin protein in FRDA patients. We wanted to further elucidate the characteristics for formation and stability of sticky DNA by evaluating the structure in a plasmid-based system in vitro and in vivo in Escherichia coli. The negative supercoil densities of plasmids harboring different lengths of GAA•TTC repeats, as well as either one or two repeat tracts, were studied in E. coli to determine if plasmids containing two long tracts (≥60 repeats) in a direct repeat orientation would have a different topological effect in vivo compared to plasmids that harbored only one GAA•TTC tract or two tracts of < 60 repeats. The experiments revealed that, in fact, sticky DNA forming plasmids had a lower average negative supercoil density (-σ) compared to all other control plasmids used that had the potential to form other non-B DNA structures such as triplexes or Z-DNA. Also, the requirements for in vitro dissociation and reconstitution of the DNA•DNA associated region of sticky DNA were evaluated. The results indicate that the two repeat tracts associate in the presence of negative supercoiling and MgCl2 or MnCl2 in a time- and concentration-dependent manner. Interaction of the repeat sequences was not observed in the absence of negative supercoiling and/or MgCl2 or in the presence of other monovalent or divalent cations, indicating that supercoiling and quite specific cations are needed for the association of sticky DNA. These are the first experiments studying a more specific role of supercoiling and cation influence on this DNA conformation.
To support our model of the topological effects of sticky DNA in plasmids, changes in sticky DNA band migration were measured with reference to the linear DNA after treatment with increasing concentrations of ethidium bromide (EtBr). The presence of independent negative supercoil domains was confirmed by this method and found to be segregated by the DNA-DNA associated region. Sequence-specific polyamide molecules were used to test the effect of binding of the ligands to the GAA•TTC repeats on the inhibition of sticky DNA. The destabilization of the sticky DNA conformation in vitro through this binding of the polyamides demonstrated the first conceptual therapeutic approach for the treatment of FRDA at the DNA molecular level.
Thus, examining the properties of sticky DNA formed by these long repeat tracts is important in the elucidation of the possible role of sticky DNA in Friedreich's ataxia.
Abstract:
Among Mexican Americans, the second largest minority group in the United States, the prevalence of gallbladder disease is markedly elevated. Previous data from both genetic admixture and family studies indicate that there is a genetic component to the occurrence of gallbladder disease in Mexican Americans. However, prior to this thesis no formal genetic analysis of gallbladder disease had been carried out nor had any contributing genes been identified.
The results of complex segregation analysis in a sample of 232 Mexican American pedigrees documented the existence of a major gene having two alleles with age- and gender-specific effects influencing the occurrence of gallbladder disease. The estimated frequency of the allele increasing susceptibility was 0.39. The lifetime probabilities that an individual will be affected by gallbladder disease were 1.0, 0.54, and 0.00 for females of genotypes "AA", "Aa", and "aa", respectively, and 0.68, 0.30, and 0.00 for males, respectively. This analysis provided the first conclusive evidence for the existence of a common single gene having a large effect on the occurrence of gallbladder disease.
Human cholesterol 7α-hydroxylase is the rate-limiting enzyme in bile acid synthesis. The results of an association study in both a random sample and a matched case/control sample showed that there is a significant association between cholesterol 7α-hydroxylase gene variation and the occurrence of gallbladder disease in Mexican American males but not in females. These data have implicated a specific gene, 7α-hydroxylase, in the etiology of gallbladder disease in this population.
Finally, I asked whether the inferred major gene from complex segregation analysis is genetically linked to the cholesterol 7α-hydroxylase gene.
Three pedigrees predicted to be informative for linkage analysis by virtue of supporting the major gene hypothesis and having parents with informative genotypes and multiple offspring were selected for this linkage analysis. In each of these pedigrees, the recombination fractions maximized at 0 with a positive, albeit low, LOD score. The results of this linkage analysis provide preliminary and suggestive evidence that the cholesterol 7α-hydroxylase gene and the inferred gallbladder disease susceptibility gene are genetically linked.
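The allele frequency and genotype-specific lifetime probabilities reported in this abstract can be combined into an implied population-level lifetime risk. The sketch below is a worked calculation, not part of the thesis. It assumes Hardy-Weinberg genotype proportions and assumes that "A" denotes the susceptibility allele with frequency 0.39 (the abstract implies but does not state this); the function names are hypothetical.

```python
def genotype_frequencies(q):
    """Hardy-Weinberg proportions, with q the frequency of allele "A" (assumed)."""
    p = 1.0 - q
    return {"AA": q * q, "Aa": 2.0 * p * q, "aa": p * p}


def lifetime_risk(q, penetrance):
    """Population lifetime risk: genotype frequency times penetrance, summed."""
    freqs = genotype_frequencies(q)
    return sum(freqs[g] * penetrance[g] for g in freqs)


# Values taken from the abstract; the Hardy-Weinberg step is an assumption.
q = 0.39
female = lifetime_risk(q, {"AA": 1.0, "Aa": 0.54, "aa": 0.0})
male = lifetime_risk(q, {"AA": 0.68, "Aa": 0.30, "aa": 0.0})
print(round(female, 3), round(male, 3))  # prints 0.409 0.246
```

Under these assumptions the model implies that roughly 41% of females and 25% of males would be affected over a lifetime, which gives a sense of the scale of effect the abstract attributes to the inferred major gene.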
Abstract:
The small leucine-rich repeat proteoglycans (or SLRPs) are a group of extracellular matrix (ECM) proteins that belong to the leucine-rich repeat (LRR) superfamily of proteins. The LRR is a protein folding motif composed of 20–30 amino acids with leucines in conserved positions. LRR-containing proteins are present in a broad spectrum of organisms and possess diverse cellular functions and localization. In mammals, the SLRPs are abundant in connective tissues, such as bones, cartilage, tendons, skin, and blood vessels. We have discovered a new member of the class I small leucine-rich repeat proteoglycan (SLRP) family which is distinct from the other class I SLRPs since it possesses a unique stretch of aspartate residues at its N-terminus. For this reason, we called the molecule asporin. The deduced amino acid sequence is about 50% identical (and 70% similar) to decorin and biglycan. However, asporin does not contain a serine/glycine dipeptide sequence required for the assembly of O-linked glycosaminoglycans and is probably not a proteoglycan. The tissue expression of asporin partially overlaps with the expression of decorin and biglycan. During mouse embryonic development, asporin mRNA expression was detected primarily in the skeleton and other specialized connective tissues; very little asporin message was detected in the major parenchymal organs. The mouse asporin gene structure is similar to that of biglycan and decorin with 8 exons. The asporin gene is localized to human chromosome 9q22-9q21.3 where asporin is part of a SLRP gene cluster that includes ECM2, osteoadherin, and osteoglycin. This gene cluster of four LRR-encoding genes is embedded in a 238 kilobase intron of another novel gene named Tes9orf that is expressed primarily in the testes of the adult mouse. The SLRP genes are not present in Drosophila or C. elegans, but reside in three separate gene clusters in the puffer fish, mice and humans.
Targeted disruptions of individual mouse SLRP genes produce minor connective tissue defects such as skin fragility, tendon laxity, minor growth plate defects, and mild osteoporosis. However, double and triple knockouts of SLRP genes exacerbate these phenotypes. Both the double epiphycan/biglycan and the triple PRELP/fibromodulin/biglycan knockout mice exhibit premature osteoarthritis.
Abstract:
The creation, preservation, and degeneration of cis-regulatory elements controlling developmental gene expression are fundamental genome-level evolutionary processes about which little is known. In this study, critical differences in cis-regulatory elements controlling the expression of the sea urchin aboral ectoderm-specific spec genes were identified and explored. In genomes of species within the Strongylocentrotidae family, multiple copies of a repetitive sequence element termed RSR were present, but RSRs were not detected in genomes of species outside Strongylocentrotidae. RSRs are invariably associated with spec genes, and in Strongylocentrotus purpuratus, the spec2a RSR functioned as a transcriptional enhancer displaying greater activity than RSRs from the spec1 or spec2c paralogs. Single base-pair differences at two cis-regulatory elements within the spec2a RSR greatly increased the binding affinities of four transcription factors: SpCCAAT-binding factor at one element and SpOtx, SpGoosecoid, and SpGATA-E at another. The cis-regulatory elements to which SpCCAAT-binding factor, SpOtx, SpGoosecoid, and SpGATA-E bound were recent evolutionary acquisitions that could act either to activate or repress transcription, depending on the cell type. These elements were found in the spec2a RSR ortholog in Strongylocentrotus pallidus but not in the RSR orthologs of Strongylocentrotus droebachiensis or Hemicentrotus pulcherrimus. These results indicate that spec genes exhibit a dynamic pattern of cis-regulatory element evolution while stabilizing selection preserves their aboral ectoderm expression domain.