930 results for DNA Sequences
Abstract:
Based on our current knowledge about population genetics, phylogeography and speciation, we begin to understand that the deep sea harbours more species than suggested in the past. The deep-sea soft-sediment environment in particular hosts a diverse and highly endemic invertebrate fauna. Very little is known about the evolutionary processes that generate this remarkable species richness, or about the genetic variability and spatial distribution of deep-sea animals. In this study, phylogeographic patterns and the genetic variability among eight populations of the abundant and widespread deep-sea isopod morphospecies Betamorpha fusiformis [Barnard, K.H., 1920. Contributions to the crustacean fauna of South Africa. 6. Further additions to the list of marine isopods. Annals of the South African Museum 17, 319-438] were examined. A fragment of the mitochondrial 16S rRNA gene of 50 specimens and the complete nuclear 18S rRNA gene of 7 specimens were sequenced. The molecular data reveal high levels of genetic variability of both genes between populations, giving evidence for distinct monophyletic groups of haplotypes with average p-distances ranging from 0.0470 to 0.1440 (d-distances: 0.0592-0.2850) of the 16S rDNA, and 18S rDNA p-distances ranging between 0.0032 and 0.0174 (d-distances: 0.0033-0.0195). Intermediate values are absent. Our results show that widely distributed benthic deep-sea organisms of a homogeneous phenotype can be differentiated into genetically highly divergent populations. Sympatry of some genotypes indicates the existence of cryptic speciation. Flocks of closely related but genetically distinct species probably exist in other widespread benthic deep-sea asellotes and other Peracarida. Based on existing data we hypothesize that many widespread morphospecies are complexes of cryptic biological species (patchwork hypothesis).
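The p-distances quoted in this abstract are uncorrected proportions of differing sites between aligned sequences. The following is a minimal sketch of that calculation, assuming the sequences are already aligned to equal length and that positions with gaps or ambiguity codes are skipped. The function name `p_distance` and the toy fragments are hypothetical and are not data from the study. The abstract does not state which substitution model underlies the d-distances, so they are not reproduced here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: skip sites where either sequence has a gap or ambiguity code.
        if a not in valid or b not in valid:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy, made-up aligned fragments (hypothetical, for illustration only).
hap_1 = "ACGTACGTAC"
hap_2 = "ACGTTCGTAA"
print(p_distance(hap_1, hap_2))  # 2 differing sites out of 10 compared
```

Applied to every pair of haplotypes, a function like this would yield the pairwise matrix from which between-population averages are taken.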
Abstract:
DNA methyltransferases (MTases) are a group of enzymes that catalyze the methyl group transfer from S-adenosyl-L-methionine in a sequence-specific manner. Orthodox Type II DNA MTases usually recognize palindromic DNA sequences and add a methyl group to the target base (either adenine or cytosine) on both strands. However, there are a number of MTases that recognize asymmetric target sequences and differ in their subunit organization. In a bacterial cell, after each round of replication, the substrate for any MTase is hemimethylated DNA, and it therefore needs only a single methylation event to restore the fully methylated state. This is consistent with the fact that most of the DNA MTases studied exist as monomers in solution. Multiple lines of evidence suggest that some DNA MTases function as dimers. Further, functional analysis of many restriction-modification systems showed the presence of more than one or fused MTase genes. It was proposed that the presence of two MTases responsible for the recognition and methylation of asymmetric sequences would protect the nascent strands generated during DNA replication from the cognate restriction endonuclease. In this review, MTases recognizing asymmetric sequences have been grouped into different subgroups based on their unique properties. Detailed characterization of these unusual MTases would help in better understanding their specific biological roles and mechanisms of action. The rapid progress made by the genome sequencing of bacteria and archaea may accelerate the identification and study of species- and strain-specific MTases of host-adapted bacteria and their roles in pathogenic mechanisms.
Abstract:
Hairpin pyrrole-imidazole polyamides are cell-permeable, sequence-programmable oligomers that bind in the minor groove of DNA. This thesis describes studies of Py-Im polyamides targeted to biologically important DNA repeat sequences for the purpose of modulating disease states. Design of a hairpin polyamide that binds the CG dyad, a site of DNA methylation that can become dysregulated in cancer, is described. We report the synthesis of a DNA methylation antagonist, its sequence specificity and affinity informed by Bind-n-Seq and iteratively designed, which improves inhibitory activity in a cell-free assay by 1000-fold to low nanomolar IC50. Additionally, a hairpin polyamide targeted to the telomeric sequence is found to trigger a slow necrotic-type cell death with the release of inflammatory molecules in a model of B cell lymphoma. The effects of the polyamide are unique in this class of oligomers; its effects are characterized and a functional assay of phagocytosis by macrophages is described. Additionally, hairpin polyamides targeted to pathologically expanded CTG•CAG triplet repeat DNA sequences, the molecular cause of myotonic dystrophy type 1, are synthesized and assessed for toxicity. Lastly, ChIP-seq of Hypoxia-Inducible Factor is performed under hypoxia-induced conditions. The study results show that ChIP-seq can be employed to understand the genome-wide perturbation of Hypoxia-Inducible Factor occupancy by a Py-Im polyamide.
Abstract:
Phenylamidine cationic groups linked by a furan ring (furamidine) and related compounds bind as monomers to AT sequences of DNA. An unsymmetric derivative (DB293) with one of the phenyl rings of furamidine replaced with a benzimidazole has been found by quantitative footprinting analyses to bind to GC-containing sites on DNA more strongly than to pure AT sequences. NMR structural analysis and surface plasmon resonance binding results clearly demonstrate that DB293 binds in the minor groove at specific GC-containing sequences of DNA in a highly cooperative manner as a stacked dimer. Neither the symmetric bisphenyl nor bisbenzimidazole analogs of DB293 bind significantly to the GC containing sequences. DB293 provides a paradigm for design of compounds for specific recognition of mixed DNA sequences and extends the boundaries for small molecule-DNA recognition.
Abstract:
A DNA-binding factor with high affinity and specificity for the [Leu5]enkephalin-encoding sequences in the prodynorphin and proenkephalin genes has been characterized. The factor has the highest affinity for the [Leu5]-enkephalin-encoding sequence in the dynorphin B-encoding region of the prodynorphin gene, has relatively high affinity for other [Leu5]enkephalin-encoding sequences in the prodynorphin and proenkephalin genes, but has no apparent affinity for similar DNA sequences coding for [Met5]-enkephalin in the prodynorphin or proopiomelanocortin genes. The factor has been named [Leu5]enkephalin-encoding sequence DNA-binding factor (LEF). LEF has a nuclear localization and is composed of three subunits of about 60, 70, and 95 kDa, respectively. The highest levels were observed in rat testis, cerebellum, and spleen and were generally higher in late embryonal compared to newborn or adult animals. LEF activity was also recorded in human clonal tumor cell lines. LEF inhibited the transcription of reporter genes in artificial gene constructs where a [Leu5]enkephalin-encoding DNA fragment had been inserted between the transcription initiation site and the coding region of the reporter genes. These observations suggest that the [Leu5]enkephalin-encoding sequences in the prodynorphin and proenkephalin genes also have regulatory functions realized through interaction with a specific DNA-binding factor.
Abstract:
Studies continue to report ancient DNA sequences and viable microbial cells that are many millions of years old. In this paper we evaluate some of the most extravagant claims of geologically ancient DNA. We conclude that although exciting, the reports suffer from inadequate experimental setup and insufficient authentication of results. Consequently, it remains doubtful whether amplifiable DNA sequences and viable bacteria can survive over geological timescales. To enhance the credibility of future studies and assist in discarding false-positive results, we propose a rigorous set of authentication criteria for work with geologically ancient DNA.
Abstract:
The effect of two different DNA minor groove binding molecules, Hoechst 33258 and distamycin A, on the binding kinetics of NF-κB p50 to three different specific DNA sequences was studied at various salt concentrations. Distamycin A was shown to significantly increase the dissociation rate constant of p50 from the sequences PRDII (5′-GGGAAATTCC-3′) and Ig-κB (5′-GGGACTTTCC-3′) but had a negligible effect on the dissociation from the palindromic target-κB binding site (5′-GGGAATTCCC-3′). By comparison, the effect of Hoechst 33258 on binding of p50 to each sequence was found to be minimal. The dissociation rates for the protein–DNA complexes increased at higher potassium chloride concentrations for the PRDII and Ig-κB binding motifs and this effect was magnified by distamycin A. In contrast, p50 bound to the palindromic target-κB site with a much higher intrinsic affinity and exhibited a significantly reduced salt dependence of binding over the ionic strength range studied, retaining a KD of less than 10 pM at 150 mM KCl. Our results demonstrate that the DNA binding kinetics of p50 and their salt dependence are strongly sequence-dependent and, in addition, that the binding of p50 to DNA can be influenced by the addition of minor groove-binding drugs in a sequence-dependent manner.
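A DNA site is called palindromic when it reads the same 5′ to 3′ on both strands, that is, when it equals its own reverse complement. The sketch below checks the three sites quoted in this abstract; the helper names `reverse_complement` and `is_palindromic` are illustrative and not taken from the paper.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when a site equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


# The three binding sites as printed in the abstract.
sites = {
    "PRDII": "GGGAAATTCC",
    "Ig-κB": "GGGACTTTCC",
    "target-κB": "GGGAATTCCC",
}

for name, seq in sites.items():
    print(name, seq, reverse_complement(seq), is_palindromic(seq))
```

Only the target-κB site passes the check, which matches the abstract's description of it as the palindromic site.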
Abstract:
Certain recent models of sex determination in mammals, Drosophila melanogaster, Caenorhabditis elegans, and snakes are examined in the light of the hypothesis that the relevant genetic regulatory mechanisms are similar and interrelated. The proposed key element in each of these instances is a noncoding DNA sequence, which serves as a high-affinity binding site for a repressor-like molecule regulating the activity of a major "sex-determining" gene. On this basis it is argued that, in several eukaryotes, (i) certain DNA sequences that are sex-determining are noncoding, in the sense that they are not the structural genes of a sex-determining protein; (ii) in some species these noncoding sequences are present in one sex and absent in the other, while in others their copy number or accessibility to regulatory molecules is significantly unequal between the two sexes; and (iii) this inequality determines whether the embryo develops into a male or a female.
Abstract:
This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved with the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA on the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovery of putative cis-regulatory elements could help their experimental analysis, which would result in a more detailed view of the cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between the DNA sequences with evolutionarily conserved cis-regulatory elements and the DNA sequences that have evolved neutrally. In further inquiry, the set of highest scoring putative cis-regulatory elements were found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two Component Extreme Value Distribution. The p-values grade the conservation of the cis-regulatory elements above the neutral expectation. The parameter values for the distribution are estimated by simulating the neutral DNA evolution. 
The conservation of the transcription factor binding sites can be used in the upstream analysis of regulatory interactions. This approach may provide mechanistic insight into the transcription level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements. The software facilitates both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species including mouse. The data from these genome-wide analyses are stored in a relational database which is used in the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
Abstract:
Genome sequence information has generated increasing evidence for the claim that repetitive DNA sequences present within and around genes could play an important role in the regulation of gene expression. Polypurine/polypyrimidine sequences [poly(Pu/Py)] have been observed in the vicinity of promoters and within the transcribed regions of many genes. To understand whether such sequences influence the level of gene expression, we constructed several prokaryotic and eukaryotic expression vectors incorporating poly(Pu/Py) repeats both within and upstream of a reporter gene, lacZ (encoding β-galactosidase), and studied its expression in vivo. We find that, in contrast to the situation in Escherichia coli, the presence of poly(Pu/Py) sequences within the gene does not significantly inhibit gene expression in mammalian cells. On the other hand, the presence of such sequences upstream of lacZ leads to a several-fold reduction of gene expression in mammalian cells. Similar down-regulation was observed when a structural cassette containing poly(Pu/Py) sequences upstream of lacZ was integrated into yeast chromosome V. Sequence analysis of the nine totally sequenced yeast chromosomes shows that a large number of such sequences occur upstream of ORFs. On the basis of our experimental results and DNA sequence analysis, we propose that these sequences can function as cis-acting transcriptional regulators.
Abstract:
Extraintestinal pathogenic Escherichia coli (ExPEC) represent a diverse group of strains of E. coli, which infect extraintestinal sites, such as the urinary tract, the bloodstream, the meninges, the peritoneal cavity, and the lungs. Urinary tract infections (UTIs) caused by uropathogenic E. coli (UPEC), the major subgroup of ExPEC, are among the most prevalent microbial diseases worldwide and a substantial burden for public health care systems. UTIs are responsible for serious morbidity and mortality in the elderly, in young children, and in immune-compromised and hospitalized patients. ExPEC strains are different, both from genetic and clinical perspectives, from commensal E. coli strains belonging to the normal intestinal flora and from intestinal pathogenic E. coli strains causing diarrhea. ExPEC strains are characterized by a broad range of alternate virulence factors, such as adhesins, toxins, and iron accumulation systems. Unlike diarrheagenic E. coli, whose distinctive virulence determinants evoke characteristic diarrheagenic symptoms and signs, ExPEC strains are exceedingly heterogeneous and are known to possess no specific virulence factors or a set of factors, which are obligatory for the infection of a certain extraintestinal site (e.g. the urinary tract). The ExPEC genomes are highly diverse mosaic structures in permanent flux. These strains have obtained a significant amount of DNA (predictably up to 25% of the genomes) through acquisition of foreign DNA from diverse related or non-related donor species by lateral transfer of mobile genetic elements, including pathogenicity islands (PAIs), plasmids, phages, transposons, and insertion elements. The ability of ExPEC strains to cause disease is mainly derived from this horizontally acquired gene pool; the extragenous DNA facilitates rapid adaptation of the pathogen to changing conditions and hence the extent of the spectrum of sites that can be infected.
However, neither the amount of unique DNA in different ExPEC strains (or UPEC strains) nor the mechanisms lying behind the observed genomic mobility are known. Due to this extreme heterogeneity of the UPEC and ExPEC populations in general, the routine surveillance of ExPEC is exceedingly difficult. In this project, we presented a novel virulence gene algorithm (VGA) for the estimation of the extraintestinal virulence potential (VP, pathogenicity risk) of clinically relevant ExPECs and fecal E. coli isolates. The VGA was based on a DNA microarray specific for the ExPEC phenotype (ExPEC pathoarray). This array contained 77 DNA probes homologous with known (e.g. adhesion factors, iron accumulation systems, and toxins) and putative (e.g. genes predictably involved in adhesion, iron uptake, or in metabolic functions) ExPEC virulence determinants. In total, 25 of the DNA probes homologous with known virulence factors and 36 of the DNA probes representing putative extraintestinal virulence determinants were found at significantly higher frequency in virulent ExPEC isolates than in commensal E. coli strains. We showed that the ExPEC pathoarray and the VGA could be readily used for the differentiation of highly virulent ExPECs both from less virulent ExPEC clones and from commensal E. coli strains as well. Implementing the VGA in a group of unknown ExPECs (n=53) and fecal E. coli isolates (n=37), 83% of strains were correctly identified as extraintestinal virulent or commensal E. coli. Conversely, 15% of clinical ExPECs and 19% of fecal E. coli strains failed to fall into their respective pathogenic and non-pathogenic groups. Clinical data and virulence gene profiles of these strains warranted the estimated VPs; UPEC strains with atypically low risk-ratios were largely isolated from patients with a certain medical history, including diabetes mellitus or catheterization, or from elderly patients. In addition, fecal E. coli strains with VPs characteristic for ExPEC were shown to represent the diagnostically important fraction of resident strains of the gut flora with a high potential of causing extraintestinal infections. Interestingly, a large fraction of DNA probes associated with the ExPEC phenotype corresponded to novel DNA sequences without any known function in UTIs and thus represented new genetic markers for the extraintestinal virulence. These DNA probes included unknown DNA sequences originating from the genomic subtractions of four clinical ExPEC isolates as well as from five novel cosmid sequences identified in the UPEC strains HE300 and JS299. The characterized cosmid sequences (pJS332, pJS448, pJS666, pJS700, and pJS706) revealed complex modular DNA structures with known and unknown DNA fragments arranged in a puzzle-like manner and integrated into the common E. coli genomic backbone. Furthermore, cosmid pJS332 of the UPEC strain HE300, which carried a chromosomal virulence gene cluster (iroBCDEN) encoding the salmochelin siderophore system, was shown to be part of a transmissible plasmid of Salmonella enterica. Taken together, the results of this project pointed towards the assumptions that (i) homologous recombination, even within coding genes, contributes to the observed mosaicism of ExPEC genomes and (ii) besides en bloc transfer of large DNA regions (e.g. chromosomal PAIs), rearrangements of small DNA modules also provide a means of genomic plasticity. The data presented in this project supplemented previous whole genome sequencing projects of E. coli and indicated that each E. coli genome displays a unique assemblage of individual mosaic structures, which enable these strains to successfully colonize and infect different anatomical sites.
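The percentages reported for the VGA evaluation can be cross-checked with a few lines of arithmetic. This sketch assumes the misclassified head counts are the nearest whole numbers to 15% of 53 and 19% of 37; those counts are inferred here by rounding and are not stated in the abstract.

```python
# Group sizes and misclassification rates as reported in the abstract.
n_expec, n_fecal = 53, 37
frac_expec_missed, frac_fecal_missed = 0.15, 0.19

# Assumption: head counts are the nearest integers to the reported percentages.
missed_expec = round(n_expec * frac_expec_missed)
missed_fecal = round(n_fecal * frac_fecal_missed)

total = n_expec + n_fecal
correct = total - missed_expec - missed_fecal
overall_accuracy = correct / total

print(missed_expec, missed_fecal, f"{overall_accuracy:.1%}")
```

Under that rounding assumption, the two per-group error rates are consistent with the overall figure of 83% correctly identified strains.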
Abstract:
The current explosion of DNA sequence information has generated increasing evidence for the claim that noncoding repetitive DNA sequences present within and around different genes could play an important role in genetic control processes, although the precise role and mechanism by which these sequences function are poorly understood. Several of the simple repetitive sequences which occur in a large number of loci throughout the human and other eukaryotic genomes satisfy the sequence criteria for forming non-B DNA structures in vitro. We have summarized some of the features of three different types of simple repeats that highlight the importance of repetitive DNA in the control of gene expression and chromatin organization. (i) (TG/CA)n repeats are widespread and conserved in many loci. These sequences are associated with nucleosomes of varying linker length and may play a role in chromatin organization. These Z-potential sequences can help absorb superhelical stress during transcription and aid in recombination. (ii) Human telomeric repeat (TTAGGG)n adopts a novel quadruplex structure and exhibits unusual chromatin organization. This unusual structural motif could explain chromosome pairing and stability. (iii) Intragenic amplification of (CTG)n/(CAG)n trinucleotide repeat, which is now known to be associated with several genetic disorders, could down-regulate gene expression in vivo. The overall implications of these findings vis-à-vis repetitive sequences in the genome are summarized.
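The three repeat classes named in this abstract are simple tandem repeats, so candidate tracts can be located with a regular expression. The sketch below is an illustration only: the function name `find_tandem_repeats`, the minimum copy numbers, and the toy sequence are assumptions, and only the strand as given is scanned (the complementary units, such as CA or CAG, would be searched separately).

```python
import re
from typing import List, Tuple


def find_tandem_repeats(seq: str, unit: str, min_copies: int) -> List[Tuple[int, int, int]]:
    """Return (start, end, copies) for each run of `unit` repeated at least `min_copies` times."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    hits = []
    for match in pattern.finditer(seq.upper()):
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start(), match.end(), copies))
    return hits


# Toy sequence built from known pieces so the expected tracts are easy to see.
toy = "AACC" + "TG" * 4 + "TTAGGG" * 3 + "A" + "CTG" * 4 + "A"

print(find_tandem_repeats(toy, "TG", 3))       # (TG)n tract
print(find_tandem_repeats(toy, "TTAGGG", 2))   # telomeric repeat
print(find_tandem_repeats(toy, "CTG", 3))      # trinucleotide repeat
```

The coordinates are 0-based and half-open, following Python slicing, so `toy[start:end]` returns the repeat tract itself.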
Abstract:
Mycobacterium leprae recA harbors an in-frame insertion sequence that encodes an intein homing endonuclease (PI-MleI). Most inteins (intein endonucleases) possess two conserved LAGLIDADG (DOD) motifs at their active center. A common feature of LAGLIDADG-type homing endonucleases is that they recognize and cleave the same or very similar DNA sequences. However, PI-MleI is distinct from other members of the family of LAGLIDADG-type HEases in its modular structure, with functionally separable domains for DNA-binding and cleavage, each with distinct sequence preferences. Sequence alignment analyses of PI-MleI revealed three putative LAGLIDADG motifs; however, there are conflicting bioinformatics data in regard to their identity and specific location within the intein polypeptide. To resolve this conflict and to determine the active-site residues essential for DNA target site recognition and double-stranded DNA cleavage, we performed site-directed mutagenesis of presumptive catalytic residues in the LAGLIDADG motifs. Analysis of target DNA recognition and kinetic parameters of the wild-type PI-MleI and its variants disclosed that the two amino acid residues, Asp(122) (in Block C) and Asp(193) (in functional Block E), are crucial to the double-stranded DNA endonuclease activity, whereas Asp(218) (in pseudo-Block E) is not. However, despite the reduced catalytic activity, the PI-MleI variants, like the wild-type PI-MleI, generated a footprint of the same length around the insertion site. The D122T variant showed significantly reduced catalytic activity, and the D122A and D193A mutations, although they did not affect DNA-binding affinity, abolished the double-stranded DNA cleavage activity. On the other hand, the D122C variant showed approximately twofold higher double-stranded DNA cleavage activity, compared with the wild-type PI-MleI.
These results provide compelling evidence that Asp(122) and Asp(193) in DOD motif I and II, respectively, are bona fide active-site residues essential for DNA cleavage activity. The implications of these results are discussed in this report.
Abstract:
Study of the evolution of species or organisms is essential for various biological applications. Evolution is typically studied at the molecular level by analyzing the mutations of DNA sequences of organisms. Techniques have been developed for building phylogenetic or evolutionary trees for a set of sequences. Though phylogenetic trees capture the overall evolutionary relationships among the sequences, they do not reveal fine-level details of the evolution. In this work, we attempt to resolve various fine-level sequence transformation details associated with a phylogenetic tree using cellular automata. In particular, our work tries to determine the cellular automata rules for neighbor-dependent mutations of segments of DNA sequences. We also determine the number of time steps needed for evolution of a progeny from an ancestor and the unknown segments of the intermediate sequences in the phylogenetic tree. Due to the existence of a vast number of cellular automata rules, we have developed a grid system that performs parallel guided explorations of the rules on grid resources. We demonstrate our techniques by conducting experiments on a grid comprising machines in three countries and obtaining potentially useful statistics regarding evolutions in three HIV sequences. In particular, our work is able to verify the phenomenon of neighbor-dependent mutations and find that certain combinations of neighbor-dependent mutations, defined by a cellular automata rule, occur with greater than 90% probability. We also find the average number of time steps for mutations for some branches of the phylogenetic tree over a large number of possible transformations with standard deviations less than 2.
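To make the idea of a neighbor-dependent mutation rule concrete, here is a minimal sketch of a one-dimensional cellular automaton over DNA. It is not the authors' rule encoding or search procedure. The rule format (a dictionary keyed by left neighbor, base, right neighbor), the names `step` and `steps_to_reach`, and the toy rule and sequences are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# A rule maps (left neighbour, base, right neighbour) to the mutated base.
Rule = Dict[Tuple[str, str, str], str]


def step(seq: str, rule: Rule) -> str:
    """Apply one synchronous update; bases with no matching rule entry stay unchanged."""
    out = []
    for i, base in enumerate(seq):
        left = seq[i - 1] if i > 0 else "-"  # "-" marks a missing neighbour at the ends
        right = seq[i + 1] if i < len(seq) - 1 else "-"
        out.append(rule.get((left, base, right), base))
    return "".join(out)


def steps_to_reach(ancestor: str, progeny: str, rule: Rule, max_steps: int = 50) -> Optional[int]:
    """Smallest number of updates turning ancestor into progeny, or None if not reached."""
    current = ancestor
    for t in range(max_steps + 1):
        if current == progeny:
            return t
        current = step(current, rule)
    return None


# Hypothetical rule: C flanked by A and G becomes T; T flanked by A and G becomes A.
toy_rule: Rule = {("A", "C", "G"): "T", ("A", "T", "G"): "A"}
ancestor = "ACGTACG"

print(step(ancestor, toy_rule))
print(steps_to_reach(ancestor, "AAGTAAG", toy_rule))
```

A search over many candidate rules, such as the grid-based exploration the abstract describes, would evaluate a function like `steps_to_reach` for each rule and each ancestor-progeny pair in the tree.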